Multiple Constraint Acquisition ∗ Robin Arcangioli, Nadjib Lazaar LIRMM, University of Montpellier, France {arcangioli,lazaar}@lirmm.fr Abstract says yes, Q UACQ reduces the search space by removing all constraints violated by the positive example. In the case of Q UACQ is a constraint acquisition system that as- a negative example, Q UACQ focuses onto one, and only one, sists a non-expert user to model her problem as a constraint of the target network in a number of queries log- constraint network by classifying (partial) exam- arithmic in the size of the example. This key component of ples as positive or negative. For each negative ex- Q UACQ allows it to always converge on the target set of con- ample, Q UACQ focuses onto a constraint of the tar- straints in a polynomial number of queries. However, even get network. The drawback is that the user may that good theoretical bound can be hard to put in practice. need to answer a great number of such examples to For instance, Q UACQ requires the user to classify more than learn all the constraints. In this paper, we provide 9000 examples to get the complete Sudoku model. a new approach that is able to learn a maximum An example can be classified as negative because of more number of constraints violated by a given negative than one violated target constraint. In this paper, we extend example. Finally we give an experimental evalua- the approach used in Q UACQ to make constraint acquisition tion that shows that our approach improves the state more efficient in practice by learning, not only one constraint of the art. on a negative example, but a maximum number of constraints violated by this negative example. Inspired by the work on 1 Introduction enumerating infeasibility called MUS (Minimal Unsatisfia- bility Subsets) in SAT [Liffiton and Sakallah, 2008], we pro- Constraint programming is a powerful paradigm for model- pose an algorithm that computes all minimal scopes of target ing and solving combinatorial problems. However, it has long constraints violated by a given negative example. been recognized that expressing a combinatorial problem as a constraint network requires significant expertise in constraint 2 Background programming [Freuder, 1999]. To alleviate this issue, several techniques have been proposed to acquire a constraint net- The constraint acquisition process can be seen as an interplay work. For example, the matchmaker agent [Freuder and Wal- between the user and the learner. For that, user and learner lace, 2002] interactively asks the user to provide one of the need to share a same vocabulary to communicate. We sup- constraints of the target problem each time the system pro- pose this vocabulary is a set of n variables X and a domain poses an incorrect (negative) solution. In [Beldiceanu and Si- D = {D(x1 ), . . . , D(xn )}, where D(xi ) ⊂ Z is the finite set monis, 2012], Beldiceanu and Simonis have proposed M OD - of values for xi . A constraint cY is defined as a pair where EL S EEKER , a system devoted to problems with regular struc- Y is a subset of variables of X, called the constraint scope, tures, like matrix models. Based on examples of solutions and c is a relation over D of arity |Y |. A constraint network and non-solutions provided by the user, C ONACQ.1 [Bessiere is a set C of constraints on the vocabulary (X, D). An as- et al., 2005] learns a set of constraints that correctly classi- signment eY on a set of variables Y ⊆ X is rejected by a fies all the examples given so far. As an active learning ver- constraint cS if S ⊆ Y and the projection eS of eY on the sion, C ONACQ.2 [Bessiere et al., 2007] proposes examples to variables in S is not in cS . An assignment on X is a solution the user to classify (i.e., membership queries). In [Shcheko- of C if and only if it is not rejected by any constraint in C. tykhin and Friedrich, 2009], the approach used in C ONACQ.2 sol(C) represents the set of solutions of C. has been extended to allow the user to provide arguments as In addition to the vocabulary, the learner owns a language constraints to converge more rapidly. Γ of relations from which it can build constraints on specified Q UACQ is a recent active learner system that is able to sets of variables. Adapting terms from machine learning, the ask the user to classify partial queries [Bessiere et al., 2013]. constraint bias, denoted by B, is a set of constraints built Q UACQ iteratively computes membership queries. If the user from the constraint language Γ on the vocabulary (X, D) from which the learner builds the constraint network. For- ∗ This work has been funded by the EU project ICON (FP7- mally speaking, B = {cS | (S ⊆ X) ∧ (c ∈ Γ)}. A concept 284715) and the JCJC INS2I 2015 project APOSTROP. is a Boolean function over DX = Πxi ∈X D(xi ), that is, a map that assigns to each example e ∈ DX a value in {0, 1}. Algorithme 1 : M ULTI ACQ- Multiple acquisition via par- We call target concept the concept fT that returns 1 for e tial queries. if and only if e is a solution of the problem the user has in 1 CL ← ∅ mind. In a constraint programming context, the target con- 2 while true do cept is represented by a target network denoted by CT . A membership query ASK(e) is a classification question 3 if sol(CL ) = ∅ then return “collapse” asked to the user, where e is a complete assignment in DX . 4 choose e in DX accepted by CL and rejected by B The answer to ASK(e) is “yes” if and only if e ∈ sol(CT ). 5 if e = nil then return “convergence on CL ” A partial query ASK(eY ), with Y ⊆ X, is a classification 6 else question asked to the user, where eY is a partial assignment in 7 M N Ses ← ∅ DY = Πxi ∈Y D(xi ). A set of constraints C accepts a partial 8 findAllScopes (e, X) assignment eY if and only if there does not exist any con- 9 foreach Y ∈ M N Ses do straint c ∈ C rejecting eY . The answer to ASK(eY ) is “yes” 10 cY ← findC(e, Y ) if and only if CT accepts eY . A classified assignment eY is 11 if cY = nil then return “collapse” called a positive or negative example depending on whether 12 else CL ← CL ∪ {cY } ASK(eY ) is “yes” or “no”. For any assignment eY on Y , κB (eY ) denotes the set of all constraints in B rejecting eY . We now define convergence, which is the constraint acqui- Y ∈ M N Ses represents the scope of a violated constraint sition problem we are interested in. We are given a set E of that must be learned. The function findC is called for each (partial) examples labelled by the user as positive or negative. computed MNS (line 10). It takes as parameter the negative We say that a constraint network C agrees with E if C ac- example e and an MNS. For each MNS, it returns a constraint cepts all the examples labelled as positive in E and does not from CT with the given MNS that rejects e. We do not give accept those labelled as negative. The learning process has the code of findC function since it is implemented as it ap- converged on the learned network CL ⊆ B if CL agrees with pear in [Bessiere et al., 2013]. If no constraint is returned E and for every other network C 0 ⊆ B agreeing with E, we (line 11), this is a second condition for collapsing as we could have sol(C 0 ) = sol(CL ). If there does not exist any CL ⊆ B not find in the bias B a constraint rejecting the according neg- such that CL agrees with E, we say that we have collapsed. ative example. Otherwise, the constraint returned by findC This happens when CT 6⊆ B. is added to the learned network CL (line 12). Finally, we introduce the notion of Minimal No Scope, 3.2 Description of the function findAllScopes noted MNS, which is similar to the notion of MUS (Minimal Unsatisfiable Subset) in SAT [Liffiton and Sakallah, 2008]. Definition 1 (Minimal No Scope). Given a negative example Function 1 : findAllScopes (e, Y ) : boolean e, an MNS is a subset of variables U ⊆ X s.t. ASK(eU ) = Output : The set M N Ses contains all MNS of e no and ∀xi ∈ U : ASK(eU \xi ) = yes. 1 if Y ∈ M N Ses then return f alse 3 Multiple constraint acquisition approach 2 if κB (eY ) = ∅ then return true 3 if @M ∈ M N Ses : M ⊂ Y ∧ ASK(eY ) = yes then In this section, we propose M ULTI ACQ for Multiple Acqui- 4 B ← B \ κB (eY ) sition. M ULTI ACQ takes as input a bias B on a vocabulary 5 return true (X,D) and returns a constraint network CL equivalent to the 6 isM N S ← true target network CT by asking (partial) queries. The main dif- 7 P ← subSets(Y, |Y | − 1) ference between Q UACQ [Bessiere et al., 2013] and M ULTI - 8 foreach Pi ∈ P do ACQ is the fact that Q UACQ learns and focuses on one con- 9 isM N S ← findAllScopes(Pi , e) ∧ isM N S straint each time we have a negative example, where M ULTI - 10 if isM N S then M N Ses ← M N Ses ∪ {Y } ACQ tries to learn more than one explanation (constraint) of 11 return f alse why the user classifies a given example as negative. 3.1 Description of M ULTI ACQ The recursive function findAllScopes (see function 1) M ULTI ACQ (see algorithm 1) starts by initializing the CL takes as input a complete example e and a subset of variables network to the empty set (line 1). If CL is unsatisfiable (line Y (X for the first call). The function starts by checking if 3), the acquisition process will reach a collapse state. At line the subset Y is already reported as an MNS (line 1). If this 4, we compute a complete assignment e satisfying the cur- occurs, we return f alse. As we assume that the bias is ex- rent learned network CL and violating at least a constraint in pressive enough to learn CT , when κB (eY ) = ∅ (i.e. there B. If such an example does not exist (line 5), then all the is no violated constraint in B to learn on Y ), it implies the constraints in B are implied by CL , and we have converged. fact that ASK(eY ) = yes and we return true (line 2). As Otherwise, we call the function findAllScopes on the ex- a third check (line 3), we verify if we have not reported a ample e (line 8). If e is a positive example, the bias B will subset of Y as an MNS. If that is the case, we are sure that be reduced. However, when e is negative, findAllScopes ASK(eY ) = no and we get into the main part of the al- will compute a set of MNS (i.e., the set M N Ses). Each gorithm. If not, at line 3, we ask the user to classify the (partial) example eY . If it is positive, we remove from B findAllScopes is called on all the subsets of X of size all the constraints rejecting eY and we return true (lines 4- |X| − 1. As M ⊆ X is an MNS of e, ASK(eM ) = no 5). If ASK(eY ) = no, this means that there is an MNS and so ∀Y ⊆ X\M : ASK(eM ∪Y ) = no (Lemma in Y and in what follows, we try to report it. For that, we 2). Hence, any call of findAllScopes on a superset compute all subsets of Y of size |Y | − 1 (line 7) and we of M will behave exactly as on X. That is, by recur- recall findAllScopes on each subset. If any sub-call to rence, there will be a call of findAllScopes on M . By findAllScopes returns false, the boolean isM N S will be assumption, M is an MNS and for any subset M 0 we set to f alse, which means that Y itself is not an MNS (line have ASK(eM 0 ) = yes (def.1). Thus, M is added to 11). Otherwise, when all the sub-calls of findAllScopes on M N Ses by findAllScopes as an MNS at line 10. Y subsets return true, this means that Y is an MNS (line 10). ii) We have a complete positive assignment eX Notice that the returned boolean corresponds to the classifi- (ASK(eX ) = yes). Here, findAllScopes is called cation of the partial example eY (positive or negative). only once and only on X. That is, the third condition at line 3 is satisfied and findAllScopes returns true with 3.3 Analysis M N Ses = ∅. We analyze the soundness and completeness of findAllScopes. We also give a complexity study of M ULTI ACQ in terms of queries it can ask of the user. Theorem 2. Given a bias B built from a language Γ of bounded arity constraints, and a target network CT , M UL - Lemma 1. If ASK(eY ) = yes then for any Y 0 ⊆ Y we have TI ACQ uses O(|CT | · (|X| + |Γ|) + |B|) queries to prove ASK(eY 0 ) = yes. convergence on the target network or to collapse. Proof. Assume that ASK(eY ) = yes. Hence, there exists Proof. i) We will show first that the maximal number of no constraint from CT violated by eY . For any Y 0 subset of queries asked by findAllScopes is bounded above by Y , the projection eY 0 also does not violate any CT constraint |CT | · (|X| − 1) + |B|. (i.e., ASK(eY 0 ) = yes). – Let us start with tablethe number of queries asked Lemma 2. If ASK(eY ) = no then for any Y 0 ⊇ Y we have by findAllScopes and classified as negative by ASK(eY 0 ) = no. the user. Given an MNS M ⊂ X, we need to ask at most |X| − 1 queries that are classified as negative Proof. Assume that ASK(eY ) = no. Hence, there exists by the user. That is, knowing that findAllScopes at least one constraint from CT violated by eY . For any Y 0 uses a depth first search, we need to ask queries superset of Y , eY 0 also violates at least the constraint violated at each level from the root (i.e., |X|) to the node by eY (i.e., ASK(eY 0 ) = no). M . Since all visited sets contain M , the queries are classified as negative. Here, the worst case Given a bias B and a target network CT , we say that CT is when |M | = 1 where we will have |X| − 1 is learnable if it is representable by B. Thus, ASK(eY ) = negative queries. Then, the use of Lemma 2 at no ⇒ κB (eY ) 6= ∅ , which is equivalent to κB (eY ) = ∅ ⇒ line 1 and 3 ensures the fact that we will never ASK(eY ) = yes. ask a query on a superset of M . With the fact that Theorem 1. Given any bias B and any target network CT the total number of MNS is bounded by |CT |, we representable by B, the findAllScopes function is sound can say that the total number of negative queries and complete. asked by findAllScopes is bounded above by |CT | · (|X| − 1). Proof. (Sketch.) • Soundness : Assume that we have a set of variables – Let us now see the number of queries asked X = {x1 , . . . , xn } and a complete assignment eX . by findAllScopes and classified as positive We show that any subset Y = {xj , . . . , xk } added to by the user. This number is bounded above by M N Ses by findAllScopes is an MNS. If Y is added |B|. Let us take two subsets Y and Y 0 where to M N Ses (line 10), this means that ASK(eY ) = no κB (eY 0 ) ⊆ κB (eY ). If an ask on Y is classified and the isM N S = true. As isM N S = true, it means as positive, we remove κB (eY ) from B (line 4) that all sub-calls of findAllScopes (line 9) on Y sub- and therefore eY 0 will be classified as positive sets of size |Y | − 1 return true. Hence, as we sup- without a query where κB (eY 0 ) becomes empty posed the bias B to be expressive enough, ∀Y 0 ⊂ Y (line 2). The worst case is when we remove, at of |Y | − 1 size, we have ASK(eY 0 ) = yes and knowing each time, one constraint from B (|κB (eY )| = 1) that ASK(eY ) = no imply that Y is an MNS (def. 1). and therefore findAllScopes asks |B| queries • Completeness : Assume that we have a set of variables classified as positive. X = {x1 , . . . , xn } : i) We have a complete negative assignment eX As has been shown, the maximal number of queries (ASK(eX ) = no) with an MNS M = {xj , . . . , xk }. asked by the findAllScopes function is bounded The three conditions at lines 1, 2 and 3 are not sat- above by |CT | · (|X| − 1) + |B|. isfied (ASK(eX ) = no and κB (eX ) 6= ∅). Then, ii) findC function uses O(|Γ|) queries to return a con- Table 1 displays a comparative performance of M ULTI - straint from CT [Bessiere et al., 2013] and therefore ACQ and Q UACQ. We report the size |CL | of the learned O(|CT | · |Γ|) queries to return all the constraints. network, the total number of queries #q, the average size q̄ of all queries, the number of complete (resp. partial) negative From i) and ii), the total number of queries asked by M UL - queries #qc− (resp. #qp− ), and the average time Tmns (resp. TI ACQ to converge to the target network is bounded above by Tq ) needed to compute an MNS (resp. a query) in seconds. |CT | · (|X| − 1) + |B| + |CT | · |Γ|. If we compare our M ULTI ACQ to Q UACQ, the main ob- servation is that the use of findAllScopes to find all MNS of a negative example reduces significantly the number of Before addressing our experimental evaluation, it is im- queries required for convergence. For 4-queens instance and portant to stress here that Q UACQ converges on a constraint using M ULTI ACQ we reduce the number of queries by 44%. of the target network in a logarithmic number of queries Here, M ULTI ACQ needs only 7 complete negative examples [Bessiere et al., 2013]. Whereas M ULTI ACQ converges on to learn 12 constraints that are equivalent to the target net- a set of constraints of the target network but on a linear num- work. Whereas Q UACQ needs twice as many negatives (14 ber of queries in the size of the example. However, we believe examples) to converge to the target network. that M ULTI ACQ can be more efficient than Q UACQ in prac- Let us take Golomb rulers instance. Here the gain in terms tice. of queries is of 31%. The same observation can be made on the number of complete (partial) negative examples. How- ever, the time needed to compute a query is twice the time 4 Experimental results needed by Q UACQ (0.54s instead of 0.20s). We made some experiments to evaluate the performance of The Sudoku instance is quite interesting where the problem our M ULTI ACQ algorithm and its findAllScopes function. is expressed on 81 variables and we have to learn a network Our tests were conducted on an Intel Core 2 Duo E4400 equivalent to 810 6= constraints. Despite of the fact that our 2x2.0GHz with 2.0GB of RAM. We first present the bench- algorithm learns more constraints to converge than the basic mark problems we used for our experiments. Q UACQ (810 instead of 622), M ULTI ACQ needs 60% less N-queens. (prob054 in [Gent and Walsh, 1999]) The problem queries than Q UACQ to converge. Here, M ULTI ACQ needs is to place n queens on a n × n chessboard so that none can only one complete negative example to learn the whole net- attack each other. The target network is encoded with n vari- work, where Q UACQ needs 622 complete negative examples. ables corresponding to the n queens, and binary constraints. Another result relates to the average size of queries. Here, For our experiments, we selected the 4-queens instance with q̄ = 3.5 is significantly smaller than the number of vari- a CT of 18 constraints (6 constraints 6= on rows and 12 con- ables (i.e., 81 variables) and the average size for Q UACQ (i.e., straints N otInDiagonal, 6 for each direction) and a bias of 37.61). This is an interesting result where short queries are 108 constraints. probably easier to answer by the user. These good results in Golomb Rulers. (prob006 in [Gent and Walsh, 1999]) The terms of queries are somewhat to the detriment of time where problem is to find a ruler where the distance between any two using M ULTI ACQ, we need more than 0.38s to compute a marks is different from that between any other two marks. query (instead of 0.20s using Q UACQ). The target network is encoded with n variables correspond- ing to the n marks, and constraints of varying arity. For our 5 Conclusion experiments, we selected the 8-marks ruler instance with a We have proposed a new approach to make constraint acqui- CT of 385 constraints and a bias of 798 constraints. sition more efficient in practice by learning a maximum num- Sudoku. We used the Sudoku logic puzzle with 9 × 9 grid. ber of constraints from a negative example. As a constraint The grid must be filled with numbers from 1 to 9 in such a acquisition system, Q UACQ focuses on the scope of one tar- way that all the rows, all the columns and the 9 non overlap- get constraint each time we give it a negative example. We ping 3 × 3 squares contain the numbers 1 to 9. The target have proposed M ULTI ACQ with a findAllScopes function network of the Sudoku has 81 variables with domains of size that aims to report all minimal scopes (MNS) of violated con- 9 and 810 binary 6= constraints on rows, columns and squares. straints. We have experimentally evaluated the benefit of our We use a bias of 6480 binary constraints. approach on some benchmark problems. The results show that i) M ULTI ACQ dramatically improves the basic Q UACQ Algorithm |CL | #q q̄ #qc− #qp− Tmns Tq in terms of number of queries ii) the queries are often much shorter than membership queries, so are easier to handle for Sudoku Golomb Queens Q UACQ 14 104 2.64 14 77 0.06 0.01 M ULTI ACQ 12 58 2.41 7 37 0.04 0.01 the user iii) and M ULTI ACQ could be time-consuming. An interesting direction would be to use a dichotomic search of Q UACQ 95 976 4.78 95 847 2.20 0.21 MNS and the use of cutoffs during search in order to obtain a M ULTI ACQ 146 666 4.69 53 498 2.18 0.54 good tradeoff (#queries/time). Q UACQ 622 9732 37.61 622 8003 3.15 0.20 M ULTI ACQ 810 3821 3.50 1 1380 1.77 0.38 Table 1: M ULTI ACQ vs Q UACQ References [Beldiceanu and Simonis, 2012] Nicolas Beldiceanu and Helmut Simonis. A model seeker: Extracting global constraint models from positive examples. In CP, pages 141–157, 2012. [Bessiere et al., 2005] Christian Bessiere, Remi Coletta, Frédéric Koriche, and Barry O’Sullivan. A sat-based ver- sion space algorithm for acquiring constraint satisfaction problems. In ECML, pages 23–34, 2005. [Bessiere et al., 2007] Christian Bessiere, Remi Coletta, Barry O’Sullivan, and Mathias Paulin. Query-driven con- straint acquisition. In IJCAI, pages 50–55, 2007. [Bessiere et al., 2013] Christian Bessiere, Remi Coletta, Emmanuel Hebrard, George Katsirelos, Nadjib Lazaar, Nina Narodytska, Claude-Guy Quimper, and Toby Walsh. Constraint acquisition via partial queries. In IJCAI 2013, Proceedings of the 23rd International Joint Conference on Artificial Intelligence, Beijing, China, August 3-9, 2013, 2013. [Freuder and Wallace, 2002] Eugene C. Freuder and Richard J. Wallace. Suggestion strategies for constraint- based matchmaker agents. International Journal on Artificial Intelligence Tools, 11(1):3–18, 2002. [Freuder, 1999] E. Freuder. Modeling: The final frontier. In 1st International Conference on the Practical Applica- tions of Constraint Technologies and Logic Programming, pages 15–21, London, UK, 1999. Invited Talk. [Gent and Walsh, 1999] I.P. Gent and T. Walsh. Csplib: a benchmark library for constraints. http://www.csplib.org/, 1999. [Liffiton and Sakallah, 2008] Mark H. Liffiton and Karem A. Sakallah. Algorithms for computing minimal unsatisfiable subsets of constraints. J. Autom. Reasoning, 40(1):1–33, 2008. [Shchekotykhin and Friedrich, 2009] Kostyantyn M. Shchekotykhin and Gerhard Friedrich. Argumenta- tion based constraint acquisition. In ICDM 2009, The Ninth IEEE International Conference on Data Min- ing, Miami, Florida, USA, 6-9 December 2009, pages 476–482, 2009.