An Improved Constraint Ordering Heuristics for Compiling Configuration Problems Benjamin Matthes and Christoph Zengler and Wolfgang Küchlin1 Abstract. This paper is a case study on generating BDDs (binary this paper we present an ordering of the constraints based on some decision diagrams) for propositional encodings of industrial configu- structure knowledge of the problem which is not always deducible ration problems. As a testbed we use product configuration formulas from the constraint graph. In most cases our test instances could only arising in the automotive industry. Our main contribution is the in- be compiled into BDDs with this new constraint ordering. With an troduction of a new improved constraint ordering heuristics incorpo- arbitrary ordering we exceeded space or time limits. rating structure-specific knowledge of the problem at hand. With the In section 2 we will introduce propositional configuration prob- help of this constraint ordering, we were able to compile all formu- lems and present the reader an overview of binary decision diagrams las of our testbed to BDDs which was not possible with an arbitrary and some important properties. Section 3 shortly describes our test constraint order. instances from the automotive industry. Our main contribution lies in section 4. We present an ordering of the constraints of the configura- tion problem with the help of which we could compile all formulas 1 INTRODUCTION to BDDs. Since the early 80s, product configuration systems have been among the most prominent and successful applications of AI methods in 2 PRELIMINARIES practice [15]. As a result computer aided configuration systems have been used in managing complex software, hardware or network set- 2.1 Configuration problems tings. Another application area of these configuration systems is the 2.1.1 Propositional configuration problems automotive industry. Here they helped to realize the transition from the mass production paradigm to present-day mass customization. We use the definition of a configuration problem as given in [7, Def- Besides CSP encodings [1] also propositional encodings [10] of inition 1]: a configuration problem is a triple (V, D, Ψ) where V is configuration problems proved to be a viable alternative in the auto- a set of variables x1 , x2 , . . . , xn , D is a set of their finite domains motive industry. Specific queries to the configuration base can then D1 , D2 , . . . , Dn and Ψ = {ψ1 , ψ2 , . . . , ψm } is a set of proposi- be answered by a decision procedure for propositional logic, e.g. tional formulas (constraints) over atomic propositions xi = v where in many cases SAT solvers. Although modern SAT solvers prove v ∈ Di , specifying conditions that the variable assignments have to to be very efficient in answering such queries, there are two major satisfy. A validV configuration is an assignment α with dom(α) = V drawbacks: (1) Since decidability of a propositional formula is NP- such that α |= ψ∈Ψ ψ, i.e. all constraints hold. hard, SAT solvers cannot guarantee certain runtime requirements re- In this paper we consider the special case where we have only quired in online configuration applications; (2) there are some types propositional variables in V and hence Di = {1, 0} for all 1 ≤ i ≤ of queries that cannot be handled by a SAT solver efficiently, e.g. n. The set O is the finite set of all configuration options for a product. restriction, model enumeration, or model counting. One approach to Each variable xo ∈ V represents a configuration option o ∈ O. The circumvent these limitations is the use of knowledge compilation. variable xo is assigned to 1 if the option o is chosen, otherwise it is The basic idea of knowledge compilation is to distinguish two assigned to 0. Following this course, the resulting V formulas ψ ∈ Ψ phases: (1) an offline phase in which a given formula is compiled into are propositional formulas and hence ϕ = ψ∈Ψ ψ is a propositional the respective compilation format and (2) an online phase in which formula describing all valid configurations. We will also refer to ϕ we query the compilation. Usually the offline phase is still NP-hard, as product overview formula (POF) [10]. but once compiled, there are a number of interesting polynomial time Remark. The restriction of the variables x ∈ V to propositional operations on the compilation. Well-known knowledge compilation variables does not limit the expressiveness of our problem descrip- formats for propositional logic are e.g. BDDs [3] or DNNFs [5]. tion. Since the domains Di are finite and we only allow atomic For this paper we chose BDD as compilation format. Its use in propositions of the form x = v, we can use a reduction [4] from configuration problems is well-studied [7, 11]. Hadzic et al. [7] fo- equality logic to propositional logic. cus on minimizing the final BDD for shorter response times; Nar- odytska et al. [11] try to establish a good static variable ordering 2.1.2 Structure of configuration problems for BDD compilation of configuration problems. They also present a constraint ordering based on the constraint graph. In contrast, in In many application domains (including the automative product con- figuration), we can divide the set of constraints Ψ in three parts: 1 Symbolic Computation Group, WSI Informatics, Universität Tübingen, Germany, email: [matthesb, zengler, kuechlin]@informatik.uni- Unit Constraints ΨU constraints concerning only a single variable. tuebingen.de These constraints enforce or forbid the selection of a single option. Cardinality Constraints ΨCC Constraints enforcing the selection compilation phase. Finding an optimal variable ordering is an NP- of a certain number of options. In most cases the selection of ex- complete problem [2]. Different reordering heuristics for BDDs will actly one option or at most one option is enforced. be reviewed in section 2.2.1. Dependencies ΨD Constraints describing the dependencies be- Since our input formulas are in CNF, the usual procedure of com- tween two or more options. These constraints are used to describe piling the BDD is to generate BDDs for each clause and conjoin complex domain specific configuration knowledge. them. Here, the order in which the clauses are conjoined plays an important role. We will discuss the impact of this clause/constraint Example. In the automotive industry we have the following exam- ordering in section 2.2.2. ples for the aforementioned constraint sets: • ΨU : Necessary or forbidden options in a production series of cars. 2.2.1 Reordering heuristics E.g. ’EPS must be chosen in this series’ or ’automatic transmis- As already mentioned, finding the optimal variable order for a BDD sion is not available for this series’. is NP-complete. Modern BDD compilers use different heuristics to • ΨCC : Enforcement that only one option from a certain option- find a good variable ordering while compiling. We will present some family can be chosen at the same time. E.g. ’only one steering of these heuristics which proved to be of interest for our real world wheel in a car’, or ’at most one navigation system’. applications. • ΨD : Description of complex dependencies in a car. E.g. ’naviga- The sifting algorithm by Rudell [14] is the foundation of various tion system enforces also board computer and forbids radio’. reordering heuristics. It is based on finding an optimum for each vari- able assuming all other variables remain fixed. Each variable is con- 2.2 Ordered binary decision diagrams sidered in sequence, beginning with the variable with most occur- rences. The currently selected variable is sifted (moved) sequentially A binary decision diagram [3] is a directed acyclic graph which rep- to both ends of the variable ordering and is finally fixed to the opti- resents a propositional formula. Each inner node is labeled with a mum position wrt. the size of the BDD. All variable movement can propositional variable and has two outgoing edges for negative and be done by a series of adjacent variable swaps. Swapping a variable positive assignment of the respective variable. The leaves are labeled with its direct predecessor or successor does not affect levels other with 1 and 0 representing true and false. An assignment is repre- than those of these two variables and therefore depends only propor- sented by a path from the root node to a leaf and its evaluation is tionally on the size of the respective levels. This sifting process is the respective value of the leaf. Therefore all paths to a 1-leaf are repeated for each variable, in order of their occurrences. It is notable valid models for the formula. Ordered reduced BDDs (ROBDDs) are that the BDD size can increase heavily during sifting. a subset with additional restrictions for the BDDs. Ordering guaran- The sifting algorithm can be extended to a symmetric sifting [13], tees the same variable ordering on all paths through the BDD; Reduc- where symmetric variables (variables, that can be interchanged with- tion guarantees that equivalent subtrees of the BDD are compactified out changing the Boolean function) are kept close together. Sym- and redundant nodes are deleted. A ROBDD is a canonical represen- metric sifting again can be generalized to group sifting [12]. Here, tation of a propositional formula wrt. to a variable ordering, meaning symmetry situations that go beyond the symmetry of two variables the ROBDD of a formula is unique. From now on we will refer to can be treated specially. ROBDDs simply as BDDs. Figure 1 presents the BDD for the for- A different approach was suggested by Fujita et al. [6] and Ishiura mula (x1 ↔ x2 ) ∨ x3 with the variable ordering x1 < x2 < x3 . et al. [8]. Instead of searching the optimal position of a variable in the Solid edges represent the positive assignment, dashed edges the neg- whole variable ordering, the search space is restricted to a small win- ative assignment. dow. Each variable is considered in sequence and permuted inside a window of size k. If xi is considered and window size is 3, xi , xi+1 x1 and xi+2 have to be permuted. All k! possibilities of arranging vari- ables are exhaustively searched. After testing all permutations, the x2 x2 best one wrt. BDD size is used. The process is repeated for each vari- able. Due to the rapid growth of the faculty function, this approach is x3 only practical for window sizes up to 5. Generally it performs better than sifting, but may not be able to overcome local minima. 1 0 For further comparison two random based algorithms have been used. The random variant randomly selects pairs of variables and Figure 1. BDD for (x1 ↔ x2 ) ∨ x3 with ordering x1 < x2 < x3 transposes them with adjacent swaps. The best position wrt. BDD size is used. This step is repeated n times for n variables. The ran- dom pivot takes the same approach but requires that the first variable selected has a smaller index than a pivot element. This pivot element Once compiled, BDDs allow a large number of polynomial time is the variable with most nodes in the BDD. Accordingly the second operations on the represented formula. Among them are: satisfiabil- selected variable has to have a larger index than the pivot element. ity, general entailment, restriction or equivalence. Since satisfiability is a polynomial time operation on BDDs, it is obvious, that it is NP- 2.2.2 Clause orderings hard to transform a given Boolean formula into a BDD. The size (number of nodes) of a BDD is strongly dependent on the variable Given a CNF as input formula for the BDD compilation, the order in ordering. There are many examples where bad orderings produce ex- which clauses are added to the BDD is crucial. Consider a formula ponential size BDDs, whereas a good ordering produces a linear size ψ ∧ x ∧ ¬x where ψ is an arbitrary satisfiable CNF. If first all clauses BDD. So finding a good variable ordering is a crucial task in the of ψ are added to the BDD, the resulting BDD can be of large size 2 (depending on ψ). If then at the end x and ¬x are added, the formula 4 RESULTS turns unsatisfiable and the BDD degenerates to the 0-leaf. If on the other hand x and ¬x are added to the BDD as first two clauses, all A framework for automated testing and evaluating of both static and other clauses will have no more impact on the BDD. Obviously the dynamic variable orderings has been implemented. We used CUDD 2 second approach would perform much better in this case for a large as a foundation of our implementation. The framework can han- ψ. In general we can not determine such a ’good’ clause ordering dle Dimacs CNF files and produces BDDs with different reordering but in our application we have specific structure knowledge which heuristics and also a static variable ordering. It then evaluates the we can utilize. Since clauses in our application stem from various resulting BDDs wrt. compilation time and BDD size and generates constraints, we will refer to this also as constraint ordering. comparison graphs for the different heuristics. For the static variable ordering we solved the formula with a state- of-the-art SAT solver and used its assignment stack as ordering. 3 TEST CASES 4.1 The impact of the constraint ordering As a testbed we used product configuration formulas for a series of As mentioned in section 2.2.2 the clause/constraint ordering can play cars of a major German car manufacturer. The series consists of 25 an important role in the compilation phase. In the first tests most of different car models, each with about 300 customer-selectable op- our instances could not be compiled into BDDs with an arbitrary tions in O and between 300 and 400 constraints in Ψ. Looking at the constraint ordering. We can observe this effect in the first diagram of distinction in section 2.1.2 we have the following numbers: figure 2 for a test instance (IB2). Without structuring the constraints, we reached > 50 million internal nodes after adding 100 clauses. Due to memory restrictions (4 GB) we could not add more than 200 • between 20 and 30 unit constraints in ΨU clauses. • between 40 and 60 cardinality constrains in ΨCC • about 300 dependencies in ΨD a) arbitrary clause ordering 5e+07 The corresponding CNF translations of these formulas range be- 4e+07 tween 200 and 350 variables and 500 and 3000 clauses. We distinguish two different flavors of formulas: the first set rep- nodes 3e+07 resents a restriction of the formula to technical aspects, meaning only 2e+07 options are considered, which are really choosable by the customer; 1e+07 the second set models the full configuration space including some 0 steering codes in the set of options which are used to guide certain 0 20 40 60 80 10 12 processes during the manufacturing of the car. Table 1 presents an 0 0 clauses overview over all instances. b) structured clause ordering Table 1. Automotive product configuration instances 600000 nodes 400000 technical full configuration space # variables # clauses # variables # clauses 200000 IA1 270 979 352 2796 IA2 262 895 344 2712 0 IA3 268 942 350 2759 0 10 20 30 40 50 60 70 IA4 262 898 344 2715 0 0 0 0 0 0 0 IB1 242 704 322 2519 clauses IB2 236 667 316 2482 IC1 251 768 331 2583 IC2 242 704 322 2519 Figure 2. Impact of clause orderings (IB2) IC3 220 594 240 638 IC4 267 952 349 2769 IC5 257 853 339 2670 ID1 246 760 325 2575 ID2 237 696 317 2511 To bypass this problem, we grouped our set of constraints accord- ID3 216 597 236 641 ing to section 2.1.2 in ΨU , ΨCC and ΨD and used the constraint ID4 246 765 326 2580 ordering ΨU < ΨCC < ΨD , meaning first we add all the unit con- ID5 238 669 318 2514 straints, then we add all the cardinality constraints, and at last we ID6 216 597 236 641 IE1 240 700 319 2514 add the dependencies. We will now take a closer look at the resulting IE2 236 662 315 2476 BDD. IE3 247 745 327 2560 The conjoin of all the constraints in ΨU represents exactly one IE4 241 695 321 2510 satisfying assignment. Therefore the resulting BDD is—independent IE5 246 736 326 2551 of the variable ordering—a chain of n nodes for n constraints in ΨU . IE6 241 697 321 2512 IE7 267 946 349 2763 For each variable representing a necessary option, the negative edge IE8 257 859 339 2676 goes to the 0-leaf and the positive edge goes to the next variable, and 2 ftp://vlsi.colorado.edu/pub/cudd-2.5.0.tar.gz 3 vice versa for each variable representing a forbidden option. Figure 3 knowledge already present due to the translation of the unit con- illustrates the BDD after adding all unit constraints for necessary op- straints and the cardinality constraints helps to a great extent to sim- tions n1 , . . . , nk and forbidden options f1 , . . . , fl . plify the remaining constraints. In the second diagram of figure 2 we can observe this effect: n1 Adding the clauses representing unit and cardinality constraints (the first 500 clauses) goes smoothly and the resulting BDD is very small. ... First on adding the dependencies, the BDD size grows faster. But taking into account that we could not compile over 200 clauses with nk an arbitrary constraint ordering, this is a large improvement. With the help of this new constraint ordering, we were able for the first 0 f1 time, to compile all our industrial instances into BDDs with under two minutes per instance (most of them taking only a few seconds to ... compile). 1 fl 4.2 Comparison of the reordering heuristics Figure 3. BDD after adding all unit constraints We compared all reordering heuristics wrt. compilation time (exe- cution time in user mode) and BDD size (total number of nodes). Our test system was a 64-Bit Linux running on an AMD Athlon 64 X2 Dual Core 4600+ with 4 GB of RAM. For each instance all 16 Next we add all the cardinality constraints. In our context we only heuristics were tested. The results are denoted as follows: have to deal with ’exactly one’ and ’at most one’ constraints. The propositional translation of ’at most one option of x1 , . . . , xn is cho- • var: static variable ordering (assignment stack of SAT solver) sen’ is • none: ascending variable order x1 . . . xn • sifting: basic sifting algorithm ^ ^ (¬xi ∨ ¬xj ). (1) i∈{1,...,n} j∈{i+1,...,n} • symsift: symmetric sifting • gsift: group sifting For the encoding of ’exactly one variable of x1 , . . . , xn is chosen’ W  • windowX: window permutation with window size X we simply conjoin i∈{1,...,n} x i to (1). The resulting BDD has • random: random selection algorithm (independent of the variable ordering) 2n − 1 nodes for an ’exactly • rpivot: random selection with pivot element one of n’ constraint and 2n − 2 nodes for an ’at most one of n’ constraint. Figure 4 illustrates such a BDD for an ’exactly one of A ‘-c’ identifies the convergent variant of a heuristics, which means x1 , . . . , xn ’ constraint. In our application domain all constraints in it is applied until no further improvement can be observed. ΨCC have disjoint variable sets (an option can only belong to one Figure 5 presents an evaluation for one test instance as automati- option-family). Therefore compiling all cardinality constraints to a cally produced by our tool. Here you can observe a typical pattern we BDD results in a chain of sub-BDDs as represented in figure 4. If identified: the static variable ordering often has a short compilation one of the options in a cardinality constraint was also present in the time, but produces large BDDs. The windowing algorithms perform unit constraints, the reduction property of the BDD guarantees im- better than the sifting-based algorithms in most cases. The sifting mediate simplification. After adding the cardinality constraints and algorithms yield by far the smallest BDDs. the unit constraints, the BDD size is still linear in the number of unit constraints and cardinality constraints and their respective variables. 1.6e+06 var random rpivot 1.4e+06 sifting x1 sifting-c symsift 1.2e+06 symsift-c gsift x2 x2 gsift-c window2 1e+06 window3 ... ... window4 window2-c nodes 800000 window3-c window4-c xn xn none 600000 1 400000 0 200000 0 Figure 4. BDD for an ’exactly one’ constraint 0 20 40 60 80 100 120 140 160 180 200 time [sec] As a last step, the dependencies between the options ΨD are con- Figure 5. Heuristics comparison for ID5 (full configuration space) joined to the BDD. This step can enlarge the BDD significantly (ex- ponential size in the worst-case). But our experiments show, that the 4 Table 2 presents an overview what algorithm yielded the smallest Table 3. Summary of the winning heuristics BDD size for each test instance and what algorithm had the best com- pilation time. Times are noted in seconds. The table presents only instances of the full configuration space. The instances reduced to # smallest size # best time technical aspects showed similar results and are omitted here. var 0 26 random 0 0 Table 2. Comparison of reordering algorithms (full configuration space) randompivot 0 0 sifting 4 1 sifting-c 2 0 symsift 11 3 nodes winner time (in s) winner symsift-c 21 0 IA1 6133 gsift-c 36.41 none gsift 2 2 IA2 3837 symsift-c 62.2 symsift gsift-c 10 0 IA3 1974 symsift-c 28.3 window3-c window2 0 0 IA4 12016 gsift-c 48.61 window3-c window3 0 0 IB1 1820 gsift-c 3.37 var window4 0 0 IB2 2878 gsift-c 1.03 var window2-c 0 0 IC1 2539 symsift 7.6 window3-c window3-c 0 15 IC2 1411 gsift-c 8.17 var window4-c 0 1 IC3 844 symsift 0.99 var none 0 2 IC4 4229 symsift-c 45.46 window3-c IC5 2883 symsift-c 33.35 gsift ID1 1781 symsift-c 18.18 window3-c REFERENCES ID2 2702 gsift-c 10.87 var [1] Jean Marc Astesana, Yves Bossu, Laurent Cosserat, and Helene ID3 1345 symsift-c 0.58 var Fargier, ‘Constraint-based modeling and exploitation of a vehicle range ID4 1343 symsift-c 3.54 var at renault’s: Requirement analysis and complexity study’, in Proceed- ID5 2407 gsift 9.01 var ings of the 13th Workshop on Configuration, pp. 33–39, (2010). ID6 1345 symsift-c 0.6 var [2] Beate Bollig and Ingo Wegener, ‘Improving the variable ordering of IE1 1165 gsift-c 2.59 var OBDDs is NP-complete’, IEEE Transactions on Computers, 45(9), IE2 1313 symsift 0.93 var 993–1002, (1996). IE3 3587 sifting-c 13.15 window3-c [3] Randal E. Bryant, ‘Graph-based algorithms for boolean function ma- IE4 2029 gsift 3.08 var nipulation’, IEEE Transactions on Computers, 35(8), 677–691, (1986). IE5 1853 symsift-c 10.89 var [4] Randal E. Bryant and Miroslav N. Velev, ‘Boolean satisfiability with IE6 1898 symsift-c 6.15 var transitivity constraints’, in Proceedings of the CAD 2000, volume 1855 IE7 2308 symsift-c 37.91 window3-c of Lecture Notes in Computer Science, 85–98, Springer, Berlin, Heidel- IE8 2233 gsift-c 34.66 window3-c berg, Germany, (2000). [5] Adnan Darwiche, ‘Decomposable negation normal form’, Journal of These results are summarized in table 33 . Here we show how many the ACM, 48(4), 608–647, (2001). [6] Masahiro Fujita, Yusuke Matsunaga, and Taeko Kakuda, ‘On variable times (out of 50 instances—full and technical configuration space) ordering of binary decision diagrams for the application of multi-level each algorithm yielded the best result wrt. to size and time respec- logic synthesis’, in Proceedings of the European Conference on Design tively. The aforementioned observations are justified: a static vari- Automation, 50–54, IEEE Computer Society, (1991). able ordering or reordering algorithms based on windowing (espe- [7] Tarik Hadzic, Subbarayan Sathiamoorthy, Rune M. Jensen, Henrik R. Andersen, Jesper Møller, and Henrik Hulgaard, ‘Fast backtrack free cially with window size 3) have often the best compilation times at product configuration using precompiled solution space representa- the expense of large BDD sizes. The various reorderings based on tions’, in Proceedings of the PETO 2004, (2004). sifting, especially the convergent symmetric sifting variant, produce [8] Nagisa Ishiura, Hiroshi Sawada, and Shuzo Yajima, ‘Minimization of the smallest BDDs. Since performance in the offline phase is not too binary decision diagrams based on exchanges of variables’, in Proceed- critical in knowledge compilation, sifting seems to be a viable choice ings of the ICCAD 1991, 472–475, IEEE Computer Society, (1991). [9] Andreas Kübler, Christoph Zengler, and Wolfgang Küchlin, ‘Model for a reordering heuristics for our test instances in order to compile counting in product configuration’, in Proceedings of LoCoCo 2010, small BDDs with good query times. volume 29, 44–53, EPTCS, (2010). [10] Wolfgang Küchlin and Carsten Sinz, ‘Proving consistency assertions for automotive product data management’, Journal of Automated Rea- 5 CONCLUSION soning, 24(1-2), 145–163, (2000). In this paper we introduced a new constraint ordering for BDD com- [11] Nina Narodytska and Toby Walsh, ‘Constraint and variable ordering heuristics for compiling configuration problems’, in Proceedings of the pilation of industrial configuration instances. This constraint order- 20th International Joint Conference on Artifical Intelligence, IJCAI’07, ing uses structure-specific knowledge of the constraints at hand in or- 149–154, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, der to optimize compilation time. With this ordering we were able for (2007). the first time to compile all configuration instances of our testbed— [12] Shipra Panda and Fabio Somenzi, ‘Who are the variables in your neigh- bourhood’, in Proceedings of the ICCAD 1995, 74–77, IEEE Com- product configuration data of a major German car manufacturer— pututer Society, (1995). into BDDs. Most of these BDDs could be compiled in a few seconds [13] Shipra Panda, Fabio Somenzi, and Barbard F Plessier, ‘Symmetry and have surprisingly small representations (850 - 12.000 nodes). detection and dynamic variable ordering of decision diagrams’, in These results look very promising. There are some interesting ques- Proceedings of the ICCAD 1994, 628–631, IEEE Computer Society, tions like counting all constructible variants of a single car [9] or enu- (1994). [14] Richard Rudell, ‘Dynamic variable ordering for ordered binary decision merating a certain number of cars with special features, that could be diagrams’, in Proceedings of the ICCAD 1993, 42–47, IEEE Computer solved efficiently once we have a BDD representation. Society, (1993). 3 [15] Daniel Sabin and Rainer Weigel, ‘Product configuration frameworks-a Complete benchmark results can be found at http:// survey’, IEEE Intelligent Systems, 13(4), 42–49, (1998). www-sr.informatik.uni-tuebingen.de/research/ confws2012-results.pdf 5