Distributed Cluster Tree Elimination*

Ismel Brito and Pedro Meseguer
IIIA, Artificial Intelligence Research Institute
CSIC, Spanish National Research Council
Campus UAB, 08193 Bellaterra, Spain
ismel|pedro@iiia.csic.es

Proceedings of the 15th International RCRA workshop (RCRA 2008): Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion. Udine, Italy, 12-13 December 2008.

* This work has been partially supported by the project TIN2006-15387-C03-01.

Abstract

Cluster and mini-cluster tree elimination are well-known methods for solving constrained optimization problems, developed for the centralized case. These methods, based on cost function combination, can easily be reformulated as synchronous algorithms that solve the distributed versions of those problems. During solving they exchange a linear number of messages, but these messages can be of exponential size, which is their main drawback and often limits their practical application. Filtering is a general technique that uses upper and lower bounds to decrease the size of the cost functions produced by combination. We combine this technique with the previous algorithms, obtaining a significant decrease in message size and improving their practical memory usage. As a result, the improved algorithm is able to solve larger problems while keeping memory consumption under control.

1 Introduction

Most constraint reasoning methods have been developed under the implicit assumption that the constraint network resides in the memory of a single agent, which performs the solving task. This is called centralized solving. In recent years, there has been increasing interest in solving these problems in a distributed form, where different parts of the problem are held in the memory of different agents and cannot be joined into a single one (because of incompatible formats, privacy, etc.). New solving algorithms have been developed for this distributed model, where communication between agents is done by message passing. As examples, we mention ABT [10], ADOPT [5] and DPOP [8].

In the centralized case, there are different ways to decompose a problem instance [2]. In particular, several algorithms work on a special structure: the cluster tree. These algorithms can be extended to distributed constraint solving, assuming that the distributed instance is arranged in a cluster tree. Interestingly, there are distributed algorithms able to build a cluster tree from a distribution of constraints among agents, so extending cluster tree solving algorithms to the distributed case seems feasible. The first goal of the paper is to show that a new set of distributed synchronous algorithms, inspired by the centralized algorithms that work on the cluster tree, can be developed to solve distributed constraint problems.

In the centralized case, these algorithms exhibit exponential complexity in time and memory. Exponential memory is often the most restrictive limitation for their practical applicability. Some versions limit memory usage, at the cost of computing approximate solutions. Function filtering is a strategy to overcome this limitation, allowing a better use of memory and achieving, in many cases, the exact solution. The second goal of the paper is to show that this idea can be applied to distributed constraint solving, bringing clear benefits.

The paper is organized as follows. In section 2, we provide a precise definition of the problems we consider.
To make the paper self-contained, we summarize some centralized solving algorithms in section 3, and the idea of function filtering in section 4. Moving to a distributed context, we present new solving algorithms based on these ideas in sections 5 and 6, including distributed function filtering. The distributed algorithm to build cluster trees appears in section 7, and an example that clarifies the new algorithms appears in section 8. Finally, section 9 contains some conclusions.

2 Preliminaries

In a centralized setting, a Constraint Optimization Problem (COP) involves a finite set of variables, each taking a value in a finite domain. Variables are related by cost functions that specify the cost of value tuples on some variable subsets. Costs are natural numbers (including zero and ∞). Formally, a finite COP is defined by a triple (X, D, C), where

• X = {x1, ..., xn} is a set of n variables;
• D = {D(x1), ..., D(xn)} is a collection of finite domains; D(xi) is the initial set of possible values for xi;
• C is a set of cost functions; fi ∈ C, on the ordered set of variables var(fi) = (xi1, ..., xir(i)), specifies the cost of every combination of values for the variables in var(fi), that is, fi : D(xi1) × ... × D(xir(i)) → N+ (where N+ is the set of natural numbers including 0 and ∞). The arity of fi is the cardinality of the set var(fi).

The overall cost of a complete tuple (involving all variables) is the sum of all individual cost functions evaluated on that tuple. A solution is a complete tuple whose overall cost is acceptable. A solution is optimal if its overall cost is minimal.

The previous COP definition does not make explicit that there is an upper bound on the cost of acceptable value tuples, so value tuples whose cost exceeds this upper bound can be safely removed. In addition, this upper bound may change during problem resolution. To make these ideas explicit, the COP definition is refined into the so-called Weighted Constraint Satisfaction Problem (WCSP). Formally, a WCSP is defined as a 4-tuple (X, D, C, S(k)), where X and D are as in the previous definition, C is a set of cost functions and S(k) is a valuation structure [4]. While in a COP a cost function maps value combinations into natural numbers, in a WCSP a cost function maps value combinations into the special set {0, 1, ..., k}. That is, fi : D(xi1) × ... × D(xir(i)) → {0, 1, ..., k}. Costs are elements of the set {0, 1, ..., k}, where 0 is the minimum cost and k is the minimum unacceptable cost. All costs lower than k are acceptable, while all costs higher than or equal to k are equally unacceptable. Costs are combined with the ⊕ operation: a ⊕ b = min{a + b, k}, meaning that if the addition of two costs exceeds k, it automatically equals k. Costs are totally ordered with the standard order among naturals. We store a cost function f as a set Sf containing all pairs (t, f(t)) with cost less than k. The size of f, denoted |f|, is the cardinality of Sf. Observe that this definition includes purely satisfaction instances (classical CSPs), where tuples are either permitted or forbidden: a permitted tuple costs 0, a forbidden tuple costs 1, and k must be 1.

This definition can be extended to a distributed context. A Distributed Weighted Constraint Satisfaction Problem (DWCSP) is a WCSP whose variables, domains and cost functions are distributed among automated agents. Formally, we define a variable-based (resp. cost-function-based) DWCSP as a 6-tuple (X, D, C, S(k), A, α) (resp. (X, D, C, S(k), A, β)), where X, D, C and S(k) define a WCSP, A is a set of p agents and α (resp. β) maps each variable (resp. cost function) to one agent.
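As an illustration only (this is not code from the paper), the following Python sketch shows the bounded cost addition ⊕ and the sparse storage Sf just described; the class and parameter names are ours.

def oplus(a, b, k):
    # Bounded addition of costs: a ⊕ b = min(a + b, k)
    return min(a + b, k)

class CostFunction:
    """A cost function over an ordered scope of variables, stored sparsely
    as Sf: only tuples with cost lower than k are kept; missing tuples cost k."""
    def __init__(self, scope, k, entries):
        self.scope = tuple(scope)                      # ordered variable names
        self.k = k                                     # minimum unacceptable cost
        self.S = {t: c for t, c in entries.items() if c < k}

    def cost(self, t):
        # t is a value tuple ordered as self.scope
        return self.S.get(t, self.k)

# illustrative binary function over (x1, x2) with domains {a, b} and k = 5
f = CostFunction(('x1', 'x2'), 5, {('a', 'a'): 0, ('a', 'b'): 2,
                                   ('b', 'a'): 7, ('b', 'b'): 1})
# the tuple (b, a) has cost 7 >= k, so it is not stored and f.cost(('b', 'a')) == 5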
Here we assume the DWCSP model: it is a refined version of distributed constraint optimization in which the notion of unacceptable cost is explicitly handled. In the rest of the paper, we assume the cost-function-based definition of DWCSP.

Next, we introduce some terminology used in the rest of the paper. An assignment or tuple tS with scope S is an ordered sequence of values, each corresponding to a variable of S ⊆ X. The projection of tS on a subset of variables T ⊂ S, written tS[T], is formed from tS by removing the values of the variables that do not appear in T. This idea extends to cost functions: the projection of f on T ⊂ var(f) is a new cost function f[T] formed by the tuples of f restricted to the variables of T, removing duplicates and keeping the minimum cost of the original tuples of f. The join of two tuples tS and t'T, written tS ⋈ t'T, is a new tuple with scope S ∪ T, formed by the values appearing in tS and t'T; it is only defined when common variables take the same values in tS and t'T. The cost of a tuple tX (involving all variables) is ⊕_{f∈C} f(tX), that is, the ⊕-sum of the individual cost functions evaluated on tX (implicitly, it is assumed that, for each f ∈ C, f(tX) = f(tX[var(f)])). A solution is a tuple with cost lower than k. A solution is optimal if its cost is minimal.

3 Cluster Tree Elimination

Centralized WCSPs can be solved using tree decomposition methods. A tree decomposition of a WCSP (X, D, C, S(k)) is a triple (T, χ, ψ), where T = (V, E) is a tree and χ and ψ are labeling functions which associate with each vertex v ∈ V two sets, χ(v) ⊆ X and ψ(v) ⊆ C, such that

• for each cost function f ∈ C, there is exactly one vertex v ∈ V such that f ∈ ψ(v); in addition, var(f) ⊆ χ(v);
• for each variable x ∈ X, the set {v ∈ V | x ∈ χ(v)} induces a connected subtree of T.

The tree-width of (T, χ, ψ) is tw = max_{v∈V} |χ(v)|. If u and v are adjacent vertices, (u, v) ∈ E, their separator is sep(u, v) = χ(u) ∩ χ(v) [1].

Summing two functions f and g gives a new function f + g with scope var(f) ∪ var(g) such that, for all t ∈ ∏_{xi∈var(f)} D(xi) and all t' ∈ ∏_{xj∈var(g)} D(xj) with t ⋈ t' defined (t and t' agree on the values of common variables), (f + g)(t ⋈ t') = f(t) ⊕ g(t'). Function g is a lower bound of f, denoted g ≤ f, if var(g) ⊆ var(f) and, for all tuples t of f, g(t) ≤ f(t). A set of functions G is a lower bound of f iff (∑_{g∈G} g) ≤ f; var(G) = ∪_{g∈G} var(g). It is easy to check that, for any f and any U ⊂ var(f), f[U] is a lower bound of f, and that ∑_{f∈F} f[U] ≤ (∑_{f∈F} f)[U].

The Cluster-Tree Elimination algorithm (CTE) solves a WCSP by sending messages along tree decomposition edges [1]. Edge (u, v) ∈ E has two associated CTE messages: m(u,v), from u to v, and m(v,u), from v to u. Message m(u,v) is a function computed by summing all functions in ψ(u) with all incoming CTE messages except m(v,u), and projecting the result on sep(u, v). CTE appears in Figure 1. It is correct, with exponential complexity in time and space [3].

procedure CTE((X, D, C, S(k)), ((V, E), χ, ψ))
1  for each (u, v) ∈ E s.t. all m(i,u), i ≠ v, have arrived do
2      B ← ψ(u) ∪ {m(i,u) | (i, u) ∈ E, i ≠ v};
3      m(u,v) ← (∑_{f∈B} f)[sep(u, v)];
4      send m(u,v);

Figure 1: The CTE algorithm.
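To make the sum and projection operations concrete, here is a small Python sketch (our illustration, reusing the CostFunction class and the oplus helper assumed in the previous sketch); line 3 of CTE then becomes a sum over B followed by a projection on the separator.

def sum_functions(f, g):
    """f + g: scope var(f) ∪ var(g); (f + g)(t ⋈ t') = f(t) ⊕ g(t') on joinable tuples."""
    scope = f.scope + tuple(x for x in g.scope if x not in f.scope)
    shared = [x for x in f.scope if x in g.scope]
    out = {}
    for t, cf in f.S.items():
        vf = dict(zip(f.scope, t))
        for tg, cg in g.S.items():
            vg = dict(zip(g.scope, tg))
            if all(vf[x] == vg[x] for x in shared):            # is the join defined?
                joined = tuple(vf.get(x, vg.get(x)) for x in scope)
                c = oplus(cf, cg, f.k)
                if c < f.k:                                    # nogoods are not stored
                    out[joined] = c
    return CostFunction(scope, f.k, out)

def project(f, T):
    """f[T]: restrict tuples to the variables of T, keeping the minimum cost."""
    idx = [i for i, x in enumerate(f.scope) if x in T]
    out = {}
    for t, c in f.S.items():
        tt = tuple(t[i] for i in idx)
        out[tt] = min(c, out.get(tt, f.k))
    return CostFunction([f.scope[i] for i in idx], f.k, out)

def cte_message(B, separator):
    """Line 3 of CTE: m(u,v) = (sum of all f in B)[sep(u,v)], with B a non-empty list."""
    total = B[0]
    for f in B[1:]:
        total = sum_functions(total, f)
    return project(total, set(separator))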
Mini-Cluster-Tree Elimination (MCTE(r)) approximates CTE. If the number of variables in a vertex u is high, it may be impossible to compute m(u,v) due to memory limitations. MCTE(r) computes a lower bound by limiting to r the maximum arity of the functions sent in messages. An MCTE(r) message, M(u,v), is a set of functions that approximates the corresponding CTE message m(u,v) (M(u,v) ≤ m(u,v)). It is computed as m(u,v), but instead of summing all functions of the set B (see Figure 1), it computes a partition P = {P1, ..., Pq} of B such that the arity of the sum of the functions in every Pi does not exceed r. The MCTE(r) algorithm is obtained by replacing line 3 of CTE with the following lines (where the projection is done on the variables of the separator that appear in the scope of the functions of the partition class),

3.1  {P1, ..., Pq} ← partition(B, r);
3.2  M(u,v) ← {(∑_{f∈Pi} f)[sep(u, v) ∩ (∪_{f∈Pi} var(f))] | i : 1...q};

4 Filtering Cost Functions

Filtering cost functions is a clever strategy to decrease the size of the messages sent by CTE and MCTE(r). It was introduced for the centralized case in [9]. The basic idea is to detect tuples that, although having acceptable cost in their vertex, will always generate tuples with unacceptable cost when combined with cost functions coming from other vertices. These tuples are removed before they are sent, decreasing the size of the exchanged cost functions. This idea generates IMCTEf(r), an iterative version of MCTEf(r), which nicely decreases the size of the exchanged functions at each iteration. In the best case, this strategy allows an exact computation of the optimal solution. If not, it computes an approximate solution closer to the optimum than the one computed by MCTE(r).

A nogood is a tuple t that cannot be extended into a complete assignment with acceptable cost. Nogoods are useless for solution generation, so they can be eliminated as soon as they are detected. To compute the sum f + g, we iterate over all combinations (t, f(t)) ∈ Sf and (t', g(t')) ∈ Sg and, if they match, compute (t ⋈ t', f(t) ⊕ g(t')). If f(t) ⊕ g(t') ≥ k, the tuple t ⋈ t' is a nogood, so it is not stored in Sf+g.

Filtering cost functions consists of anticipating the nogoods of cost functions, removing them before the actual operation. Imagine that we know that cost function f will be added (in the future) to cost function g, and that the set of functions H is a lower bound of g. We define the filtering of f with H, noted f^H, as

    f^H(t) = f(t)   if (⊕_{h∈H} h(t)) ⊕ f(t) < k
    f^H(t) = k      otherwise

Tuples reaching the upper bound k are removed because they would generate unacceptable tuples when f is added to g (recall that H is a lower bound of g). This reduces |f| before operating with it.
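The filtering operation just defined can be sketched in a few lines of Python (again an illustration reusing the helpers assumed above; we additionally assume var(h) ⊆ var(f) for every h ∈ H, which holds in the paper's use of filters).

def filter_with(f, H):
    """f^H: drop every tuple t of f such that (⊕ over h in H of h(t)) ⊕ f(t) reaches k."""
    kept = {}
    for t, c in f.S.items():
        values = dict(zip(f.scope, t))
        bound = c
        for h in H:
            th = tuple(values[x] for x in h.scope)     # t restricted to var(h)
            bound = oplus(bound, h.cost(th), f.k)
        if bound < f.k:                                # otherwise t is an anticipated nogood
            kept[t] = c
    return CostFunction(f.scope, f.k, kept)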
Let f and g be two cost functions, and let G be a set of functions that is a lower bound of g. Filtering f with G before adding it to g leaves the sum unchanged:

    f^G + g = f + g

To see this result, it is enough to decompose the function f, stored as the set of tuples that do not reach the upper bound k, into the following partition,

    Sf = {(t1, f(t1)) | t1 ∈ P} ∪ {(t2, f(t2)) | t2 ∈ Q}

where P = {t | t ∈ ∏_{xi∈var(f)} D(xi), ∃ t' ∈ ∏_{xj∈var(G)} D(xj) with t ⋈ t' defined, such that (⊕_{h∈G} h(t')) ⊕ f(t) < k}, and Q = {t | t ∈ ∏_{xi∈var(f)} D(xi), ∀ t' ∈ ∏_{xj∈var(G)} D(xj) with t ⋈ t' defined, (⊕_{h∈G} h(t')) ⊕ f(t) ≥ k}.

Assuming that var(G) = var(g), the function f + g is stored as

    Sf+g = {(t'' = t ⋈ t', f(t) ⊕ g(t'))}
         = {(t''1 = t1 ⋈ t', f(t1) ⊕ g(t')) | t1 ∈ P} ∪ {(t''2 = t2 ⋈ t', f(t2) ⊕ g(t')) | t2 ∈ Q}

but the second set is empty: for any t', f(t2) ⊕ (⊕_{h∈G} h(t')) ≤ f(t2) ⊕ g(t'), since G is a lower bound of g, so f(t2) ⊕ g(t') ≥ k for all t2 ∈ Q. Then, f + g is stored as

    {(t''1 = t1 ⋈ t', f(t1) ⊕ g(t')) | t1 ∈ P}

which is exactly f^G + g.

How often do we know that function f will be added to function g? In vertex u, function m(u,v) summarizes the effect of the part of the cluster tree rooted at u, while m(v,u) summarizes the effect of the rest of the cluster tree. To compute the solution in vertex u, these two functions must be added. Therefore, if we are able to compute a lower bound of m(v,u) before m(u,v) is sent, we can filter m(u,v) with that lower bound and reduce its size.

With this idea, function filtering easily integrates into CTE. A filtering tree decomposition is a tuple (T, χ, ψ, φ), where φ(u, v) is a set of functions associated with edge (u, v) ∈ E whose scope is included in sep(u, v). φ(u, v) must be a lower bound of the corresponding m(v,u) (namely, φ(u, v) ≤ m(v,u)). The algorithms CTEf and MCTEf(r) use a filtering tree decomposition. They are equivalent to CTE and MCTE(r), except that they use φ(u, v) to filter functions before computing m(u,v) or M(u,v). For CTEf, we replace line 3 by,

3    m(u,v) ← (∑_{f∈B} f^{φ(u,v)})[sep(u, v)];

Similarly, for MCTEf(r) we replace line 3 by two lines,

3.1  {P1, ..., Pq} ← partition(B, r);
3.2  M(u,v) ← {(∑_{f∈Pi} f^{φ(u,v)})[sep(u, v) ∩ (∪_{f∈Pi} var(f))] | i : 1...q};

An option for CTEf is to include in φ(u, v) a message M(v,u) from a previous execution of MCTE(r). Applying this idea to MCTEf, we obtain a recursive algorithm which naturally produces an elegant iterative approximating method called IMCTEf (Figure 2). It executes MCTEf(r) using as lower bounds φ(u, v) the messages M^{r-1}(v,u) computed by MCTEf(r-1) which, recursively, used the messages M^{r-2}(v,u) computed by MCTEf(r-2), and so on.

procedure IMCTEf((X, D, C, k), ((V, E), χ, ψ))
    for each (u, v) ∈ E do φ(u, v) := {∅};
    r := 1;
    repeat
        MCTEf(r);
        r := r + 1;
        for each (u, v) ∈ E do φ(u, v) := M(u,v);
    until exact solution or exhausted resources

Figure 2: The IMCTEf algorithm.
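Combining the previous sketches, the modified line 3 of CTEf can be illustrated as follows (our sketch, not the authors' code). For simplicity the filter φ(u, v) is applied to the running sum as soon as its scope is covered, a slight variation of the per-summand filtering written above (which needs var(φ(u, v)) to be included in each summand's scope); in both cases the pruned tuples are anticipated nogoods with respect to the lower bound φ(u, v).

def ctef_message(B, phi, separator):
    """Sketch of line 3 of CTEf: build the sum over B, filtering the running sum
    with the lower bound phi = φ(u,v) whenever phi's scope is already covered,
    then project the result on the separator."""
    def covers(g):
        return all(x in g.scope for h in phi for x in h.scope)
    total = B[0]
    for f in B[1:]:
        total = sum_functions(total, f)
        if covers(total):
            total = filter_with(total, phi)    # prune tuples that can never stay below k
    if covers(total):
        total = filter_with(total, phi)        # ensure the final sum is filtered
    return project(total, set(separator))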
5 Distributed Cluster Tree Elimination

The CTE algorithm can easily be adapted to the distributed case, producing the Distributed CTE (DCTE) algorithm. We assume that the DWCSP instance (X, D, C, S(k), A, β) to solve is arranged in a cluster tree (T, χ, ψ), where each vertex is a different agent (distributed algorithms to build a cluster tree exist; see section 7). Let us consider self, a generic agent. It owns a specific vertex in the cluster tree, so self knows its position in the tree: it knows its neighboring agents and the separators with them. Besides, self also knows the variables of χ(self) and the cost functions of ψ(self).

DCTE exchanges messages among agents. There is one message type, CF, to exchange cost functions. When self has received function messages from all its neighbors except perhaps i, it sums the received cost functions (excluding the cost function from i) with the cost functions of ψ(self), producing a new cost function, which is projected on sep(self, i) and sent to agent i. This process is repeated for all neighbors. Agents having a single neighbor are the first to compute and send CF messages.

CF messages play the same role as function messages in centralized CTE. For each edge (i, j) in the tree (i and j are neighboring agents) there are two CF messages: one from i to j and another from j to i. Once these two messages have been exchanged for every edge in the tree, agent self has in its cluster (formed by ψ(self) and the CF messages received from its neighbors) enough information to solve the particular WCSP instance, with the guarantee that local optimal solutions at different vertices are part of a global optimal solution. To break ties, a total ordering on the values of each variable is needed, common to all agents.

The DCTE algorithm appears in Figure 3. It takes as input the tree decomposition (T, χ, ψ) restricted to the vertex self (as explained in the first paragraph of this section), and returns the vertex self augmented with a number of cost functions, one per neighbor. These new cost functions, plus the cost functions of ψ(self), are enough to compute the local optimal solutions of each vertex in such a way that they are compatible with the solutions of other vertices and share the global optimum (they form the minimal subproblem, see [1]). DCTE records the cost functions exchanged among agents. When a cost function is received from neighbor j, it is stored in recvFunction[j] and the boolean recvCF[j] is set to true. When a cost function is sent to neighbor j, the boolean sentCF[j] is set to true.

procedure DCTE(T, χ, ψ)
    compute neighbors(self), χ(self), ψ(self);
    for each j ∈ neighbors(self) do
        compute separator(self, j);
        recvCF[j] ← sentCF[j] ← false;
    end ← false;
    if neighbors(self) = {other} then ComputeSendFunction(self, other);
    while (¬end) do
        msg ← getMsg();
        if (msg.type = CF) then NewCostFunction(msg);
        end ← ∧_{j∈neighbors(self)} (sentCF[j] ∧ recvCF[j]);
    compute solution from ψ(self) ∪ {recvFunction[j] | j ∈ neighbors(self)};

procedure NewCostFunction(msg)
    recvFunction[msg.sender] ← msg.function;
    recvCF[msg.sender] ← true;
    for each j ∈ neighbors(self) s.t. sentCF[j] = false do
        if ∧_{i∈neighbors(self), i≠j} recvCF[i] then ComputeSendFunction(self, j);

procedure ComputeSendFunction(self, dest)
    function ← ∑_{i∈neighbors(self), i≠dest} recvFunction[i] + ∑_{f∈ψ(self)} f;
    sendMsg(self, dest, function[separator(self, dest)]);
    sentCF[dest] ← true;

Figure 3: The Distributed Cluster Tree Elimination algorithm.

The main procedure is DCTE, which works as follows. First, some elements of the cluster tree required by self are computed (neighbors(self), χ(self), ψ(self), and the separators between self and its neighbors), and some variables (the vectors recvCF and sentCF, and the variable end) are initialized. If self has a single neighbor, other, self does not have to wait for any incoming cost function, so the procedure ComputeSendFunction is called, computing the corresponding summation of functions projected on the separator, (∑_{f∈ψ(self)} f)[separator(self, other)], which is sent to agent other. Next, there is a loop that reads a message, processes it and checks the termination condition, reached when self has received/sent a cost function from/to each neighbor. Finally, the solution is computed using the cost functions in ψ(self) and the received cost functions.

Procedure NewCostFunction processes CF messages. It records the cost function contained in the message and, if this function allows a cost function to be computed and sent to another neighbor, this is done by the ComputeSendFunction procedure.
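As a complement to Figure 3, the following Python sketch (our illustration, not the authors' implementation) simulates the DCTE message flow on a single machine, with a queue standing in for sendMsg/getMsg; it reuses the CostFunction, sum_functions and project helpers sketched earlier, and the names (run_dcte, mailbox, etc.) are ours.

from collections import deque

def run_dcte(agents, edges, psi, separators):
    """agents: list of vertex/agent ids; edges: set of frozenset({u, v});
    psi[u]: list of CostFunction owned by u;
    separators[frozenset({u, v})]: separator variables of edge (u, v).
    Returns recv[u][j], the CF message agent u received from neighbor j."""
    neighbors = {u: [v for v in agents if frozenset({u, v}) in edges] for u in agents}
    sent = {u: set() for u in agents}
    recv = {u: {} for u in agents}
    mailbox = deque()                           # (sender, dest, function): stand-in for the network

    def try_send(u):
        # send to every neighbor j once all messages m(i,u), i ≠ j, have arrived
        for j in neighbors[u]:
            if j in sent[u]:
                continue
            others = [i for i in neighbors[u] if i != j]
            if all(i in recv[u] for i in others):
                fs = list(psi[u]) + [recv[u][i] for i in others]
                total = fs[0]
                for f in fs[1:]:
                    total = sum_functions(total, f)
                mailbox.append((u, j, project(total, set(separators[frozenset({u, j})]))))
                sent[u].add(j)

    for u in agents:                            # agents with a single neighbor send first
        try_send(u)
    while mailbox:                              # message loop: the NewCostFunction step
        sender, dest, f = mailbox.popleft()
        recv[dest][sender] = f
        try_send(dest)
    return recv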
DCTE is a distributed synchronous algorithm, since self has to wait to receive CF messages from all its neighbors but j before it can compute (and send) its CF message to j.

6 Distributed Mini-Cluster Tree Elimination

DCTE can easily be modified to produce the mini-cluster version (DMCTE). There are two new parameters here: r, the maximum arity of the cost functions that can be sent to neighbors, and ub, the initial upper bound. DMCTE is conceptually close to DCTE, but its practical implementation is more involved. While DCTE adds all the cost functions of an agent (no matter the resulting arity) and sends the projection on the separator, DMCTE limits the arity of the resulting cost functions. In consequence, DMCTE exchanges cost functions (via CF messages) which are approximations of the exact cost functions (those exchanged by DCTE). When each agent has sent a CF message to, and received one from, each of its neighbors, these approximate cost functions have been propagated. While this approximation allows an approximate solution to be computed at each agent, there is no guarantee that these solutions will be compatible with each other (that is, that they assign the same values to common variables in the separators). To assure this, once CF messages have been exchanged, the values of common variables in the separators are exchanged via SS messages. If a discrepancy exists, the value coming from the agent with the lower id prevails. Finally, a global upper bound is computed by exchanging UB messages, containing the cost of the current solution on the original cost functions of each agent.

DMCTE uses three message types:

• CF: cost function messages. They work as the CF messages of the DCTE algorithm, with the following exception. The set of cost functions to be added is partitioned such that the cost function resulting from the addition of the cost functions of each class does not exceed arity r (one possible grouping heuristic is sketched at the end of this section). Each resulting function is projected on the separator and sent to the corresponding agent. A CF message contains not a single cost function (as in DCTE) but a set of cost functions.

• SS: solution separator messages. When self has received all approximate cost functions and computed an approximate solution, it exchanges the values of the variables in the separators with its neighbors. The value sent by the agent with the lower id prevails. These messages are sent/received following the same strategy as CF messages.

• UB: upper bound messages. Once a compatible solution has been found, agents exchange the cost of this solution via UB messages. They are sent/received following the same strategy as CF messages.

We do not provide the DMCTE code due to space limitations.

The very same idea of filtering cost functions can be applied to the distributed case, and the iterative version of mini-cluster tree elimination can be extended to it, where the messages of the previous iteration are used as filters for the messages of the current iteration. The DMCTEf(r) algorithm performs filtering when adding cost functions. Its only novelty with respect to DMCTE(r) is that the new cost functions to be sent to other agents are filtered as they are computed. The Distributed IMCTEf (DIMCTEf) algorithm is a direct extension of IMCTEf (Figure 2) to the distributed case, where previous messages are recorded and used as filters. The upper bound computed at the end of iteration i is used as the parameter ub in the next iteration. We do not provide the code of DIMCTEf due to space limitations.
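The partition(B, r) step used by MCTE(r) and by the CF messages of DMCTE is not spelled out in the paper; as an illustration only, one plausible greedy heuristic is to place each function in the first class whose combined scope still has at most r variables, opening a new class otherwise.

def partition_by_arity(B, r):
    """One possible partition(B, r): greedily group cost functions so that the
    union of scopes in each class has at most r variables (a function whose own
    arity exceeds r simply gets a class of its own)."""
    classes, scopes = [], []
    for f in B:
        for cls, sc in zip(classes, scopes):
            if len(sc | set(f.scope)) <= r:
                cls.append(f)
                sc |= set(f.scope)
                break
        else:                                   # no existing class can host f
            classes.append([f])
            scopes.append(set(f.scope))
    return classes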
7 Distributed Cluster Tree Formation

Throughout this paper we have assumed the existence of a cluster tree in which the problem instance is arranged. In the centralized case, algorithms to build such a tree are well known [1]. In the distributed case, of interest here, there are also algorithms able to build a cluster tree in a distributed form. In the following, we provide a short description of the ERP algorithm [6], able to compute the cluster tree from a DWCSP instance.

Initially, we have (X, D, C, S(k), A, β), where X is the set of variables, D is the collection of corresponding domains, C is the set of cost functions, S(k) is a valuation structure, A is a set of agents and β is a function that maps cost functions into agents. We assume that β covers the whole set of agents (if some agent has no cost function, it is removed from A). The construction of the cluster tree (T(V, E), χ, ψ) has the following steps (where χ0(v) denotes the initial value of χ(v)):

1. The set of vertices is the set of agents.
2. For each vertex u, ψ(u) = {f ∈ C | β(f) = u}; χ0(u) = ∪_{f∈ψ(u)} var(f).
3. Two vertices u and v are considered adjacent if they share one or more variables, that is, χ0(u) ∩ χ0(v) ≠ ∅. This criterion defines a graph G.
4. Using a distributed spanning tree algorithm on G [7], we obtain a spanning tree T(V = A, E). It does not necessarily satisfy the connectedness (also called running intersection) property.
5. The connectedness property is assured as follows. Each agent i sends to its neighbors in the tree the variables that initially appear in its χ0(i). Once agent i has received all these messages from its neighbors, it updates χ(i) as follows,

    χ(i) ← χ0(i) ∪ ⋃_{j,k ∈ neighbors(i), j≠k} ( χ0(j) ∩ χ0(k) )

which has a clear meaning: if a variable x appears in two neighbors j and k, it must also appear in the vertex itself, to assure connectedness.

8 Example

On the instance depicted in Figure 4, with the indicated tree decomposition, we will detail the execution of DCTE, DMCTE and DIMCTEf.

    fXY:          fYT:          fZT:          fTU:          fUV:          fZV:
    X Y           Y T           Z T           T U           U V           Z V
    a a  20       a a   0       a a  14       a a   3       a a   0       a a   3
    a b  10       a b   4       a b  10       a b   2       a b  10       a b   2
    b a  10       b a   4       b a  14       b a   2       b a  10       b a   2
    b b   0       b b  12       b b  10       b b   0       b b   0       b b   0

Vertex a1 holds χ(a1) = {X, Y, Z, T} and ψ(a1) = {fXY, fYT, fZT}; vertex a2 holds χ(a2) = {U, V, Z, T} and ψ(a2) = {fTU, fUV, fZV}; sep(a1, a2) = {Z, T}.

Figure 4: Problem instance, involving 6 variables and 6 cost functions. Its tree decomposition, formed by the two vertices a1 and a2, appears on the right. The separator between them is {Z, T}.

DCTE. Agent a1 computes f1 ← fXY + fYT + fZT, projects this function on the separator between a1 and a2, and sends the result, f2 = f1[ZT], to agent a2 in a CF message,

    f1:                                  f2:
    X Y T Z            X Y T Z           Z T
    a a a a  34        b a a a  24       a a  18
    a a a b  34        b a a b  24       a b  22
    a a b a  34        b a b a  24       b a  18
    a a b b  34        b a b b  24       b b  22
    a b a a  28        b b a a  18
    a b a b  28        b b a b  18
    a b b a  32        b b b a  22
    a b b b  32        b b b b  22

Analogously, a2 computes f3 ← fTU + fUV + fZV, projects it on the separator and sends the result, f4 = f3[ZT], to a1 in a CF message,

    f3:                                  f4:
    U V Z T            U V Z T           Z T
    a a a a   6        b a a a  15       a a  4
    a a a b   5        b a a b  13       a b  2
    a a b a   5        b a b a  14       b a  2
    a a b b   4        b a b b  12       b b  0
    a b a a  15        b b a a   4
    a b a b  14        b b a b   2
    a b b a  13        b b b a   2
    a b b b  12        b b b b   0

Agent a2 receives the CF message sent by a1 and executes the procedure NewCostFunction, storing the received function and calling ComputeSendFunction.
The very same process happens in a1 when it receives the CF message from a2: NewCostFunction is executed and the received cost function is stored. Now, a1 computes its local optimum using the cost functions {fXY, fYT, fZT, f4}, obtaining XYZT ← bbba. Analogously, a2 computes its local optimum using the cost functions {fTU, fUV, fZV, f2}, obtaining UVZT ← bbba.

DMCTE(r = 2). If r = 2, agents cannot compute cost functions of arity greater than 2. Since the original cost functions are binary, agents do not perform any addition on them; they just project on the separator and send the resulting cost functions. Thus, a1 computes g1 = fYT[T] and sends a CF message with g1 and fZT to a2,

    g1:          fZT:
    T            Z T
    a   0        a a  14
    b   4        a b  10
                 b a  14
                 b b  10

Analogously, a2 computes g2 = fTU[T] and g3 = fZV[Z], builds a CF message containing g2 and g3, and sends it to a1,

    g2:          g3:
    T            Z
    a  2         a  2
    b  0         b  0

Agent a1 computes its local optimum, XYZT ← bbba, while a2 does the same, obtaining UVZT ← bbbb. Since there is a discrepancy in the value of T, the values of T and Z are exchanged between the agents. Assuming that a2 has a smaller identifier than a1, its value for T prevails, so a1 changes its optimum to XYZT ← bbbb. Now the agents exchange their actual costs on the initial cost functions using UB messages. At the end, each agent knows that a true upper bound on the cost of the exact optimum is 22, the cost of the all-b's solution on the whole instance.

DIMCTEf. We take 9 as the limit on the number of tuples computed per agent (roughly 2^3, the memory used by DMCTE(r = 3)). Let us start with r = 2 (r = 1 makes no sense here because all initial cost functions are binary). This is exactly the DMCTE execution indicated above. Then, we move to r = 3, taking as the new ub the value 22, the upper bound computed in the previous iteration. Agent a1 computes g4 = fXY + fYT, filtering its construction with g2. Then, it computes g5 = g4^{g2}[T], builds a CF message with g5 and fZT, and sends it to a2. It is worth noting that three tuples are eliminated, since they are nogoods (their cost is higher than or equal to the current ub).

    g4:                        g5:          fZT:
    X Y T                      T            Z T
    a a a  20 + 2  ≥ ub        a   0        a a  14
    a a b  24 + 0  ≥ ub        b  12        a b  10
    a b a  14 + 2                           b a  14
    a b b  22 + 0  ≥ ub                     b b  10
    b a a  10 + 2
    b a b  14 + 0
    b b a   0 + 2
    b b b  12 + 0

Agent a2 computes g6 = fTU + fUV, filtering with g1 and fZT[T]. Four tuples are eliminated because their cost reaches the upper bound. It then computes g7 = g6^{{g1, fZT[T]}}[T].

    g6:                              g7:
    U V T                            T
    a a a   3 +  0 + 14              a  2
    a a b   2 +  3 + 10              b  0
    a b a  13 +  0 + 14  ≥ ub
    a b b  12 +  3 + 10  ≥ ub
    b a a  12 +  0 + 14  ≥ ub
    b a b  10 +  3 + 10  ≥ ub
    b b a   2 +  0 + 14
    b b b   0 +  3 + 10

Agent a2 also filters fZV with fZT[Z], but no tuple is eliminated. Its projection on Z generates g8 = fZV^{fZT[Z]}[Z].

    g8:
    Z
    a  2
    b  0

At this point, a1 and a2 compute the approximate optimum in their respective vertices. It happens that a1 computes XYZT ← bbba, while a2 computes UVZT ← bbba. No discrepancy exists on the values of the variables in the separators, so they exchange UB messages with the cost of the optimum, obtaining 20 as the upper bound (in fact, this is the exact optimum). Then, we move to r = 4, expecting to have enough memory to compute the exact optimum. Agent a1 computes f1 = fXY + fYT + fZT, filtering it with g7 and g8. It turns out that all tuples are removed, because their cost reaches the upper bound.
    f1:
    X Y T Z                           X Y T Z
    a a a a  34 + 2 + 2  ≥ ub         b a a a  24 + 2 + 2  ≥ ub
    a a a b  34 + 2 + 0  ≥ ub         b a a b  24 + 2 + 0  ≥ ub
    a a b a  34 + 0 + 2  ≥ ub         b a b a  24 + 0 + 2  ≥ ub
    a a b b  34 + 0 + 0  ≥ ub         b a b b  24 + 0 + 0  ≥ ub
    a b a a  28 + 2 + 2  ≥ ub         b b a a  18 + 2 + 2  ≥ ub
    a b a b  28 + 2 + 0  ≥ ub         b b a b  18 + 2 + 0  ≥ ub
    a b b a  32 + 0 + 2  ≥ ub         b b b a  22 + 0 + 2  ≥ ub
    a b b b  32 + 0 + 0  ≥ ub         b b b b  22 + 0 + 0  ≥ ub

This means that the previous approximate solution is in fact the true optimum, and its cost is the optimum cost. We observe that in the r = 4 iteration DIMCTEf uses no extra memory, since each tuple is discarded as soon as it is generated. To satisfy the communication protocol, a1 builds a CF message with only the tuple ba with cost 20 for {Z T}, and sends it to a2.

Agent a2 computes f3 = fTU + fUV + fZV, filtering with fZT and g5. Then, it computes f5 = f3^{{fZT, g5}}[ZT].

    f3:
    U V Z T                           U V Z T
    a a a a   6 + 14 +  0  ≥ ub       b a a a  15 + 14 +  0  ≥ ub
    a a a b   5 + 10 + 12  ≥ ub       b a a b  13 + 10 + 12  ≥ ub
    a a b a   5 + 14 +  0             b a b a  14 + 14 +  0  ≥ ub
    a a b b   4 + 10 + 12  ≥ ub       b a b b  12 + 10 + 12  ≥ ub
    a b a a  15 + 14 +  0  ≥ ub       b b a a   4 + 14 +  0
    a b a b  14 + 10 + 12  ≥ ub       b b a b   2 + 10 + 12  ≥ ub
    a b b a  13 + 14 +  0  ≥ ub       b b b a   2 + 14 +  0
    a b b b  12 + 10 + 12  ≥ ub       b b b b   0 + 10 + 12  ≥ ub

    f5:
    Z T
    a a  4
    b a  2

All but three tuples are removed because they reach the upper bound. Agent a2 builds a CF message with f5 and sends it to a1. At this point, a1 computes its local optimum, XYZT ← bbba, and a2 does the same, UVZT ← bbba. There is no need to exchange SS and UB messages, because the problem has been solved exactly: the solution inside each agent is the optimal one. In this case, DIMCTEf exchanges messages of the same size as DMCTE(r = 3) (a maximum of 2^3 tuples), but it is able to solve this instance exactly, while the exact algorithm DCTE requires larger messages (a maximum of 2^4 tuples).

9 Conclusions

We have presented DCTE, DMCTE and DIMCTEf, distributed synchronous algorithms for solving distributed WCSPs (a more precise version of the well-known distributed COPs). The DCTE algorithm solves the problem exactly, but requires messages of exponential size. DMCTE and DIMCTEf limit the memory used, at the cost of computing approximate solutions. Using the function filtering strategy, which also holds in the distributed context, DIMCTEf makes better use of memory than DMCTE, which increases its practical applicability.

These algorithms are inspired by their centralized counterparts, but their extension requires some care. This is especially true for the DMCTE algorithm which, in addition to the CF message type used by DCTE, requires two new message types, SS and UB, to deal with the subtleties of approximate solving. More work is needed, especially of an experimental nature, to assess the practical applicability of the presented algorithms on different benchmarks.

Acknowledgments

The authors thank the reviewers for their constructive comments and criticisms.

References

[1] Dechter R. Constraint Processing. Morgan Kaufmann, 2003.
[2] Gottlob G., Leone N., Scarcello F. A comparison of structural CSP decomposition methods. Artificial Intelligence, 124, 243–282, 2000.
[3] Kask K., Dechter R., Larrosa J., Dechter A. Unifying cluster-tree decompositions for reasoning in graphical models. Artificial Intelligence, 166(1-2), 2005.
[4] Larrosa J. Node and arc consistency in weighted CSP. Proc. of AAAI-02, 2002.
[5] Modi P. J., Shen W. M., Tambe M., Yokoo M. Adopt: asynchronous distributed constraint optimization with quality guarantees. Artificial Intelligence, 161, 149–180, 2005.
[6] Paskin M., Guestrin C., McFadden J. A robust architecture for distributed inference in sensor networks. Proc. of IPSN, 55–62, 2005.
[7] Perlman R. An algorithm for distributed computation of a spanning tree in an extended LAN. ACM SIGCOMM Computer Communication Review, 44–53, 1985.
[8] Petcu A., Faltings B. A scalable method for multiagent constraint optimization. Proc. of IJCAI-05, 266–271, 2005.
[9] Sanchez M., Larrosa J., Meseguer P. Improving Tree Decomposition Methods with Function Filtering. Proc. of IJCAI-05, 2005.
[10] Yokoo M., Durfee E., Ishida T., Kuwabara K. The Distributed Constraint Satisfaction Problem: Formalization and Algorithms. IEEE Trans. Knowledge and Data Engineering, 10, 673–685, 1998.