A structural approach to reasoning with quantified Boolean formulas ? Luca Pulina and Armando Tacchella DIST, Università di Genova, Viale Causa, 13 – 16145 Genova, Italy Luca.Pulina@unige.it Armando.Tacchella@unige.it Abstract In this paper we approach the problem of reasoning with quantified Boolean formulas (QBFs) by combining search and resolution, and by switching be- tween them according to structural properties of QBFs. We provide empiri- cal evidence that QBFs which cannot be solved by search or resolution alone, can be solved by combining them, and that our approach makes a proof-of- concept implementation competitive with current QBF solvers. 1 Introduction The development of the first practically efficient QBF solver can be traced back to [4], wherein a search-based solver is proposed. Since then, researchers have proposed improvements to the basic search algorithm including, e.g., backjumping and learning [7], as well as different approaches to the problem including, e.g., skolemization [1], and resolution [2, 12]. However, in spite of such efforts, current QBF reasoners still cannot handle many instances of practical interest. This is witnessed, e.g., by the results of the 2008 QBF solver competition (QBFEVAL’08) where even the most sophisticated solvers failed to decide encodings obtained from real-world applications [14]. In this work, we consider search and resolution for QBFs, and we propose a new approach that combines them dynamically. The key point of our approach is to implicitly leverage graph abstractions of QBFs to yield structural features which support the decision between search and resolution. As for the choice of a partic- ular feature, it is known, see, e.g., [16], that resolution on plain Boolean formulas may require computational resources that are exponential in the treewidth.1 This connection has been studied also for quantified formulas in [5, 8, 13]. In particular, in [5], an extension of treewidth is shown to be related to the efficiency of rea- soning about quantified Boolean constraints, of which QBFs are a subclass. This result tells us that treewidth is a promising structural parameter to gauge resolution and search: Resolution is the choice when treewidth is relatively small, and search is the alternative when treewidth is relatively large. Switching between the two ? The results herein presented are also detailed in a paper in proceedings of the Twenty-First International Joint Conference on Artificial Intelligence (IJCAI’09). 1 Induced width, as defined in [16], is shown to be equivalent to treewidth in [9]. Proceedings of the 16th International RCRA workshop (RCRA 2009): Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion Reggio Emilia, Italy, 11–12 December 2009 options may occur as long as search is able to obtain subproblems whose treewidth is smaller than the one of the original problem. In practice, QBFs of some interest for applications may have thousands of variables, and thus even approximations of treewidth – a well-known NP-complete problem – are too expensive to compute on-the-fly [15]. Also, while deciding if treewidth is bounded by a constant can be done in asymptotic linear time [3], an actual implementation of such algorithm would be too slow to be of practical use. In this paper we show how these issues can be overcome by considering alternative parameters indicating whether resolu- tion is bound to increase the size of the initial set of clauses beyond “reasonable” limits. To this purpose, we compute estimates of the number of new clauses gener- ated during resolution, and we require that the number of such clauses is less than the number of old clauses used to generate them. Also, we require that the expected number of clauses is less than a predefined threshold. Both these conditions can be checked without a prohibitive overhead, and respecting them can be as informative as tracking treewidth in many situations. To test our approach, we implemented it in S TRU QS (for “Stru”ctural “Q”BF “S”olver), a proof-of-concept tool consisting of less than 2K lines of C++ code. We have experimented with S TRU QS on the QBFEVAL’08 dataset [14], which is comprised of more than three thousands QBF encodings in various application domains – the largest publicly available collection of QBFs to date. Here, we show that exploiting dynamic combination of search and resolution enables S TRU QS to solve QBFs which, all other things being equal, cannot be solved by the search and resolution components of S TRU QS alone. Moreover, S TRU QS is competitive with respect to current QBF solvers, as it would have ranked third best among QBFEVAL’08 competitors if considering the number of problems solved within a given amount of resources. While approaches similar to ours have been proposed for plain Boolean satisfi- ability in [16], and for non-quantified constraint satisfaction in [11], we notice that this is the first time that a structural approach combining search and resolution is designed, implemented and empirically tested for QBFs. Learning in search-based QBF solvers such as, e.g., Q U BE [7] and Y Q UAFFLE [18], can also be seen as a way to integrate resolution and search. However, as shown in [7], this is true of a particular form of control strategy for general resolution known as tree resolution, while here we focus on variable elimination instead. Moreover, since S TRU QS features backjumping, and backjumping is essentially space-bounded tree resolu- tion, we could say that our work extends the above contributions in this sense. The solver 2 CLS Q [17] is another example of integration between search and res- olution. Also in this case, the contribution is limited to a specific form of res- olution, namely hyper-binary resolution, which is seen as a useful complement to the standard unit clause resolution step performed by most search-based QBF solvers, and not as an alternative resolution-based decision procedure as we do in S TRU QS. Also QUANTOR [2] provides some kind of integration between resolu- tion and search, but with two fundamental differences with respect to S TRU QS. The first one is that QUANTOR intertwines resolution with the expansion of univer- 2 sally quantified variables, whereas in S TRU QS no such expansion is performed. The second one is that QUANTOR uses search only at the very end of the decision process, when the input QBF has been reduced to an equi-satisfiable propositional formula. The paper is structured as follows. In Section 2 we lay down the foundations of our work considering relevant definitions and known results. In Section 3 we define the basic S TRU QS algorithms, including search, resolution and their com- bination. In Section 4 we describe our experiments and in Section 5 we summarize our current results and future research directions. 2 Groundwork A variable is an element of a set P of propositional letters and a literal is a variable or the negation thereof. We denote with |l| the variable occurring in the literal l, and with l the complement of l, i.e., ¬l if l is a variable and |l| otherwise. A clause C is an n-ary (n ≥ 0) disjunction of literals such that, for any two distinct disjuncts l, l0 of C, it is not the case that |l| = |l0 |. A propositional formula is a k-ary (k ≥ 0) conjunction of clauses. A quantified Boolean formula is an expression of the form Q1 z1 . . . Qn zn Φ where, for each 1 ≤ i ≤ n, zi is a variable, Qi is either an existential quantifier Qi = ∃ or a universal one Qi = ∀, and Φ is a propositional formula in the variables Z = {z1 , . . . , zn }; the expression Q1 z1 . . . Qn zn is the prefix and Φ is the matrix; a literal l is existential if |l| = zi for some 1 ≤ i ≤ n and ∃zi belongs to the prefix, and it is universal otherwise. A prefix p = Q1 z1 . . . Qn zn can be viewed as the concatenation of h quantifier blocks, i.e., p = Q1 Z1 . . . Qh Zh , where the sets Zi with 1 ≤ i ≤ h are a partition of Z, and consecutive blocks have different quantifiers. To each variable z we can associate a level lvl(z) which is the index of the corresponding block, i.e., lvl(z) = i for all the variables z ∈ Zi . We also say that variable z comes after a variable z 0 in p if lvl(z) ≥ lvl(z 0 ). The semantics of a QBF ϕ can be defined recursively as follows. A QBF clause is contradictory exactly when it does not contain existential literals. If ϕ contains a contradictory clause then ϕ is false. If ϕ has no conjuncts then ϕ is true. If ϕ = Qzψ is a QBF and l is a literal, we define ϕl as the QBF obtained from ψ by removing all the conjuncts in which l occurs and removing l from the others. Then we have two cases: • If ϕ is ∃zψ, then ϕ is true exactly when ϕz or ϕ¬z are true. • If ϕ is ∀zψ, then ϕ is true exactly when ϕz and ϕ¬z are true. The QBF decision problem (QSAT) can be stated as the problem of deciding whether a given QBF is true or false. 3 Given a QBF ϕ on the set of variables Z = {z1 , . . . , zn }, its Gaifman graph has a vertex set equal to Z with an edge (z, z 0 ) for every pair of different elements z, z 0 ∈ Z that occur together in some clause of ϕ. A scheme for a QBF ϕ having prefix p is a supergraph (Z, E) of the Gaifman graph of ϕ along with and ordering z10 , . . . , zn0 of the elements of Z such that 1. the ordering z10 , . . . , zn0 preserves the order of p, i.e., if i < j then zj0 comes after zi0 in p, and 2. for any zk0 , its lower numbered neighbors form a clique, that is, for all k, if i < k, j < k, (zi0 , zk0 ) ∈ E and (zj0 , zk0 ) ∈ E, then (zi0 , zj0 ) ∈ E. The width wp (ϕ) of a scheme is the maximum, over all vertices zk , of the size of the set {i : i < k, (zi , zk ) ∈ E}, i.e., the set containing all lower numbered neighbors of zk . The treewidth twp (ϕ) of a QBF ϕ is the minimum width over all schemes for ϕ. Clause resolution [10] for a QBF ϕ is an operation whereby given two clauses Q ∨ x and R ∨ ¬x of ϕ, the clause min(Q ∨ R) can be derived, where 1. Q and R are disjunctions of literals, 2. x is an existential variable, 3. Q and R do not share any literal l such that l occurs in Q and l occurs in R, and 4. min(C) is obtained from C by removing the universal literals coming after all the existential literals in C. Variable elimination is a control strategy for resolution defined as follows. Given a QBF ϕ on the set of variables Z = {z1 , . . . , zn }, we consider a scheme for ϕ and the associated elimination ordering Z 0 = {z10 , . . . , zn0 }. Starting from z = zn0 and scanning the elimination ordering in reverse, we eliminate the vari- ables as follows. If z is existential, and assuming that ϕ contains k > 0 clauses in the form Q ∨ z and h > 0 clauses in the form R ∨ ¬z, then we add O(kh) clauses to ϕ obtained by performing all the resolutions on z. In the case where either k = 0 or h = 0, no operation is performed. If z is universal, we simply skip to the next variable. If all the variables in ϕ can be eliminated without generating a contradictory clause, then ϕ is true, else it is false. Backtracking search [4] for a QBF ϕ is a depth-first exploration of an AND-OR tree defined as follows. Initially, the current QBF is ϕ. If the current QBF is of the form ∃xψ then we create an OR-node, whose children are obtained by checking recursively whether ϕx is true or ϕ¬x is true. If the current QBF is of the form ∀yψ then we create an AND-node, whose children are obtained by checking recursively whether both ϕy and ϕ¬y are true. We call ϕl a branch (also, an assignment), and since we explore the tree in a depth-first fashion, a node wherein only the first branch was taken is said to be open, and closed otherwise. The leaves are of two 4 kinds: If the current QBF contains a contradictory clause we have a conflict, while if the current QBF contains no conjuncts we have a solution. Backtracking from a conflict amounts to reaching back to the deepest open OR-node: if there is no such node, then ϕ is false; backtracking from a solution amounts to reaching back to the deepest open AND-node: if there is no such node, then ϕ is true. Unit propagation [4] is an optimization that can be added on top of basic vari- able elimination and search. A clause C is unit in a QBF ϕ exactly when (i) C contains only one existential literal l (unit literal) and, (ii) all the universal literals in C come after l in the prefix of ϕ. If ϕ contains a unit literal l, then ϕ is true if and only if ϕl is true. Unit propagation is the process whereby we keep assigning unit literals until no more such literals can be found. Backjumping is an optimization that can be added on top of search. According to [7] the computation performed by a QBF search algorithm corresponds to a par- ticular kind of deductive proof, i.e., a tree wherein clause resolution and term reso- lution alternate, where a term is a conjunction of literals, and term resolution is the “symmetric” operation of clause resolution – see [7] for details. The clause/term tree resolution proof corresponding to search for a QBF ϕ can be reconstructed as follows. In case of a conflict, there must be two clauses, say Q ∨ x and R ∨ ¬x such that all the literals in Q and R have been deleted by previous assignments. We can always resolve such clauses to obtain the working reason min(Q ∨ R) – where Q ∨ x and R ∨ ¬x are the initial working reasons. The other clauses can be derived from the working reason when backtracking over any existential literal l such that |l| occurs in the working reason. There are three cases: 1. If l is a unit literal, then there is a clause C where l occurs – the reason of assigning l – and we obtain a new working reason by resolving the current one with C. 2. If l is an open branch, then proof reconstruction stops, because we must branch on l, and the reason of this assignment is the current working reason. 3. If l is a closed branch, then its reason was computed before, and it can be treated as in (1). In case of a solution, the initial working reason is a term which is the conjunction of (a subset of) all the literals assigned from the root of the search tree, down to the current leaf, i.e., a Boolean implicant of the matrix of ϕ. The other terms can be derived when backtracking over any universal literal l such that l is in the working reason, considering cases (ii) and (iii) above, and using term instead of clause resolution.2 Given the above, it is easy to see that closing branches over existential (resp. universal) literals that do not appear in the current working reason is useless, i.e., they are not responsible for the current conflict (resp. solution). Backjumping can thus be described as the process whereby useless branches are skipped while backtracking. 2 Case (1) will not apply unless terms are learned, and thus unit propagation may involve universal variables as well – see [7] for the technical details. 5 PROPAGATE(ϕ, Γ) SEARCH (ϕ, Γ) hl, ri ← FIND U NIT(ϕ) l ← CHOOSE L IT(ϕ, Γ) while l 6= NIL do push(Γ, hl, LS, NIL i) push(Γ, hl, FL, ri) DOA SSIGN (ϕ, l) DOA SSIGN (ϕ, l) hl, ri ← FIND U NIT(ϕ) ELIMINATE (ϕ, Γ) v ← CHOOSE VAR(ϕ, Γ) BACKJUMP(ϕ, Γ, q) push(Γ, hv, VE, NIL i) wr ← INIT R EASON(ϕ, Γ, q) DO E LIMINATE(ϕ, v) while Γ 6= EMPTY do hl, m, ri ← top(Γ) BACKTRACK (ϕ, Γ, q) if m = VE then while Γ 6= EMPTY do UNDO E LIMINATE(ϕ, l) hl, m, i ← top(Γ) else if m = VE then UNDOA SSIGN (ϕ, l) UNDO E LIMINATE (ϕ, l) pop(Γ) else if IS B OUND(q, l, ϕ) then UNDOA SSIGN (ϕ, l) if OCCURS(|l|, wr) then pop(Γ) if m = LS then if IS B OUND(q, l, ϕ) then push(Γ, hl, FL, wri) if m = LS then DOA SSIGN (ϕ, l) push(Γ, hl, FL, NIL i) return FALSE DOA SSIGN (ϕ, l) else if m = FL then return FALSE UPDATE R EASON (wr, r) return TRUE return TRUE Figure 1: Pseudo-code of S TRU QS basic routines. 3 Implementing the structural approach In Figure 1 we present S TRU QS basic routines in pseudo-code format. All the routines take as by-reference parameters the data structure ϕ which encodes the input QBF, and a stack Γ which keeps track of the steps performed. We consider the following primitives for ϕ: • DOA SSIGN(ϕ, l) implements ϕl as per Section 2, where clauses and literals are disabled rather than deleted. • DO E LIMINATE(ϕ, v) implements the elimination of a single variable as de- scribed in Section 2, with the addition that clauses subsumed by some clause already in ϕ are not added – a process named forward subsumption in [2]; as in the case of DOA SSIGN, the eliminated variables and the clauses where they occur are disabled. • FIND U NIT(ϕ), returns a pair hl, ri if l is a unit literal in ϕ and r is a unit clause in which l occurs, or a pair hNIL , NIL i if no such literal exists in ϕ. • CHOOSE L IT(ϕ, Γ) returns a literal l subject to the constraint that all the remaining variables of ϕ which do not appear in Γ come after |l|. • CHOOSE VAR(ϕ, Γ) returns a variable v subject to the constraint that v comes after all the remaining variables of ϕ which do not appear in Γ. 6 Because updates in DOA SSIGN and DO E LIMINATE are not destructive, we can further assume that their effects can by “undo” functions, i.e., UNDOA SSIGN and UNDO E LIMINATE. The routines SEARCH and ELIMINATE (Figure 1, top-left) perform one step of search and variable elimination, respectively. SEARCH (resp. ELIMINATE), asks CHOOSE L IT (resp. CHOOSE VAR ) for a literal l (resp. a variable v) and then it (i) pushes l (resp. v) in the stack and (ii) it updates ϕ. In both cases step (i) amounts to push a triple hu, m, ri in Γ where r is always NIL , u is the literal being assigned or the variable being eliminated, and m is the mode of the assignment, where LS (“L”eft “S”plit) indicates an open branch in the search tree, and VE (“V”ariable “E”limination) indicates the corresponding resolution step. The task of PROPAGATE (Figure 1, top-right) is to perform unit propagation in ϕ. For each unit literal l, PROPAGATE pushes a record hl, FL, ri in Γ, where l and r are the results of FIND U NIT, and the mode is always FL (“F”orced “L”iteral). BACKTRACK (Figure 1 bottom-left) and BACKJUMP (Figure 1, bottom-right) are alternative backtracking procedures to be invoked at the leaves of the search tree with an additional parameter q, where q = ∃ if the current leaf is a conflict, and q = ∀ otherwise. BACKTRACK goes back in the stack Γ, popping records for search (LS) or deduction steps (VE, FL), and undoing the corresponding effects on ϕ. Notice that, if BACKTRACK is a component of a plain search solver, then VE-type records will not show up in Γ. BACKTRACK stops when an open branch (LS) corresponding to a literal l bound by the quantifier q is found (IS B OUND(q, l, ϕ)=TRUE). In this case, BACKTRACK closes the branch by assigning l with mode FL and returning FALSE to indicate that search should not stop. On the other hand, if all the records are popped from Γ without finding an open branch then search is over and BACKTRACK returns TRUE. BACKJUMP implements backjumping as per Section 2 on top of the following primitives: • INIT R EASON(ϕ, Γ, q) initializes the working reason wr. If q = ∃, then we have two cases: (i) If top(Γ) is a FL-type record, then wr is any clause in ϕ which is contradictory because all of its existential literals have been disabled. (ii) If top(Γ) is a VE-type record, this means that a contradictory clause C has been derived when eliminating a variable, and thus wr is C. If q = ∀, then wr is a term obtained considering a subset of the literals in Γ such that the number of universal literals in the term is minimized – VE-type records are disregarded in this process. • UPDATE R EASON(wr, r) updates wr by resolving it with r, i.e., the reason of the assignment which is being undone. When BACKJUMP is invoked in lieu of BACKTRACK, it first initializes the working reason, and then it pops records from Γ until either the stack is empty, or some open branch is found – the return values in these cases are the same as BACKTRACK. BACKJUMP stops at open branches ( LS) only if they correspond to some literal l 7 S TRU QS[ BJ , VE ](ϕ, Γ) stop ← FALSE while ¬stop do PROPAGATE(ϕ, Γ) if IS T RUE(ϕ) then result ← TRUE stop ← BACKJUMP(ϕ, Γ, ∀) else if IS FALSE(ϕ) then result ← FALSE stop ← BACKJUMP(ϕ, Γ, ∃) else if PREFER S EARCH(ϕ) then SEARCH(ϕ, Γ) else ELIMINATE(ϕ, Γ) return result Figure 2: Pseudo-code of S TRU QS[ BJ , VE ]. which is bound by the quantifier q and such that |l| occurs in the current working reason wr (OCCURS(|l|, wr)=TRUE). Also, when popping records hl, FL, ri such that l occurs in the working reason, the procedure UPDATE R EASON is invoked to update the current working reason wr. As described in Section 2, all the literals bound by q that are not in the current working reason are irrelevant to the current conflict/solution and are simply popped from Γ without further ado. On top of the basic algorithms of Figure 1, we define five different versions of S TRU QS: Three “single” versions, namely S TRU QS[ BJ ], S TRU QS[ BT ], and S TRU QS[ VE ], and two “blended” ones, namely S TRU QS[ BT, VE ] and S TRU QS[ BJ , VE ]. The single solvers implement basic reasoning algorithms and are provided for ref- erence purposes, while the blended ones implement our structural approach.3 The pseudo-code of S TRU QS[ BJ , VE ] is detailed in Figure 2 where we consider some additional primitives on ϕ: IS T RUE(ϕ) returns TRUE exactly when all clauses in ϕ have been disabled; IS FALSE(ϕ) returns TRUE exactly when there is at least one contradictory clause in ϕ; finally, PREFER S EARCH(ϕ) is some heuristic to choose between search and variable elimination. The other versions of S TRU QS can be obtained from the code in Figure 2 as follows. If PREFER S EARCH always re- turns TRUE, then we obtain the search solver S TRU QS[ BJ ], and if it always returns FALSE , then we obtain the variable elimination solver S TRU QS[ VE ]. Replacing BACKJUMP with BACKTRACK , we obtain S TRU QS[ BT, VE ], and if we further as- sume that PREFER S EARCH always returns TRUE, then we obtain S TRU QS[ BT ]. We now consider the implementation of PREFER S EARCH for the blended ver- sions of S TRU QS. As discussed in Section 1, PREFER S EARCH needs to be based on parameters which are informative, yet computable without a prohibitive over- head. To this end, in PREFER S EARCH we resort to the following criteria: 1. Given the current ϕ and Γ, check if there is any existential variable x which respects CHOOSE VAR’s constraints, and such that the number of clauses po- tentially generated by DO E LIMINATE(ϕ, x) is less than the number of clauses 3 The source code of S TRU QS is available at http://www.mind-lab.it/projects. 8 containing x in ϕ. 2. If condition (1) is met for some x, then consider the number of clauses where x occurs, say nx , and the number of clauses where ¬x occurs, say n¬x , and compute the diversity of x as the product nx n¬x ; check if the diversity of x is less than a predefined threshold div. Notice that, when checking for conditions (1) we compute an upper bound on the number of clauses generated as nx n¬x , and we stop checking when the first vari- able respecting both conditions is found. If both conditions above are met for some x, then PREFER S EARCH returns FALSE, and CHOOSE VAR is instructed to return x. If either condition is not met, then PREFER S EARCH returns TRUE. In this case, CHOOSE L IT tries to pick a literal l which respects CHOOSE L IT ’s constraints, and such that |l| is a node in the neighborhood of some variable x tested during the last call to PREFER S EARCH. If no such literal can be found, then CHOOSE L IT returns a literal l such that |l| has the maximum degree among all the candidate literals of ϕ. We mention here that condition (1) is proposed in [6] to control variable elim- ination during preprocessing of plain Boolean encodings, and diversity is intro- duced in [16] to efficiently account for structure of Boolean formulas. Intuitively, in our implementation condition (1) provides the main feedback that implicitly considers the treewidth of the current formula: If variable elimination becomes less efficient, then this means that the structure of the current QBF is probably less amenable to this kind of approach. Condition (2) fine tunes (1): Whenever some variable x meets condition (1), but nx n¬x is too large, eliminating x could be overwhelmingly slow. Finally, the search heuristic in CHOOSE L IT is modified in order to use search for breaking large clusters of tightly connected variables. If this cannot be done, then CHOOSE L IT falls back to assigning the variable which is most constrained. 4 Putting the structural approach to the test All the experimental results that we present are obtained on a family of identical Linux workstations comprised of 10 Intel Core 2 Duo 2.13 GHz PCs with 4GB of RAM. The resources granted to the solvers are always 600s of CPU time and 3GB of memory. To compare the solvers we use the QBFEVAL’08 dataset, which is comprised of 3326 formulas divided into 17 suites and 71 families (suites may be comprised of more than one family). All the formulas considered are QBF encodings of some automated reasoning task [14]. In the following, we say that solver A dominates solver B whenever the set of problems solved by A is a superset of the problems solved by B; we say that two solvers are incomparable when neither one dominates the other. Our first experiment aims to provide an empirical explanation on why alternat- ing between search and resolution can be effective. In Figure 3 (a,b) we present two 9 (a) (b) (c) Figure 3: (a) Number of reasoning steps (x axis) vs. approximated treewidth value (y axis) in S TRU QS[ VE ] (triangles) and S TRU QS[ BJ ] (dots). (b) As (a), but describing the alternation of search (dots) and variable elimination (triangles) steps inside S TRU QS[ BJ , VE ]. The number of reasoning steps corresponds to the number of LS- and VE-type records pushed on Γ. In particular, we consider the sequence of QBFs ϕ1 , ϕ2 , . . . where ϕ1 is the input QBF, and ϕi+1 is obtained from ϕi as follows. If ϕi does not correspond to a conflict or a solution, then ϕi+1 is the result of a destructive update of ϕi corresponding to (i) unit propagation and (ii) variable elimination or search. If ϕi is a conflict or a solution, then ϕi+1 is the formula corresponding to the node where S TRU QS backtracks to. For all ϕ1 , ϕ2 , . . ., approximated treewidth twˆ p (ϕi ) is such that twp (ϕi ) ≤ tw ˆ p (ϕi ). (c) Number of formulas solved (y axis) vs. div parameter value (x axis) in S TRU QS[ BJ , VE ]. plots showing the connection between the behaviour of S TRU QS – as dictated by PREFER S EARCH – and the treewidth4 of the input formula as it is modified during the solution process. The plots have been obtained by running various versions of S TRU QS on the instance test5 quant5 – family jmc quant, suite Katz in the QBFEVAL’08 dataset. We consider test5 quant5 because it is the small- est instance (358 variables, 941 clauses, 3 quantifier blocks) which is solved by S TRU QS[ BJ , VE ], and it is not solved by S TRU QS[ BJ ] and S TRU QS[ VE ]. Look- ing at Figure 3 (a), we can see that S TRU QS[ VE ] (triangle dots) is able to decrease slightly treewidth (from an initial estimate of 96 to a final estimate of 86), but it can go as far as a few hundred steps before exhausting resources. On the other hand, S TRU QS[ BJ ] (circle dots) can decrease substantially treewidth, but it gets trapped into structurally simpler subproblems, without managing to solve the input QBF after about 9 × 106 steps, when it reaches the time limit.5 From Figure 3 (b) we can see that S TRU QS[ BJ , VE ] changes its behavior according to the cur- rent value of treewidth, i.e., PREFER S EARCH implicitly takes into account struc- ture. In the first 206 steps, only variable elimination is performed (triangle dots). Then, as this approach becomes untenable, search (circle dots) is exploited to ob- tain structurally simpler subproblems. Comparing the two plots, it is clear that PREFER S EARCH prevents search to go “too deep”, because as soon as tw ˆ p (ϕi ) becomes small, PREFER S EARCH reckons that variable elimination alone can deal with ϕi . Overall, S TRU QS[ BJ , VE ] solves test5 quant5 in less than 4 × 105 4 More precisely, we consider an approximation of treewidth, computed as described in [15]. 5 Only the first 103 steps are shown in plots of Figure 3 (a,b). The hidden portions of S TRU QS[ BJ ] and S TRU QS[ BJ , VE ] behave as they do in the last few hundred steps shown. 10 Solver Performance # Time #LS (×105 ) #VE (×105 ) S TRU QS[ BJ , VE ] 1304 63141 656 1407 S TRU QS[ BT, VE ] 1019 41900 1744 1731 S TRU QS[ BJ ] 614 31543 11478 – S TRU QS[ VE ] 528 12834 – 3 S TRU QS[ BT ] 380 10855 35594 – Table 1: Comparison of S TRU QS versions considering the number of problems solved (“#”), the cumulative CPU seconds spent (“Time”), the cumulative number of search (“#LS”) and variable elimination (“#VE”) steps performed on solved problems. steps, of which about 25% are search steps. Another interesting phenomenon is that even when search is prevailing – for twˆ p (ϕ) values roughly in between 75 and 25 – PREFER S EARCH still gives a chance to variable elimination. In these cases, variable elimination plays essentially the same role of unit propagation, i.e., it stops in advance the exploration of irrelevant portions of the search space. In the above experiment, PREFER S EARCH turns out to be effective in switching between variable elimination and search with a specific value of the threshold div (2000). Since different values of div influence the behaviour of PREFER S EARCH, it is reasonable to ask how much a particular setting of div will influence the perfor- mance of S TRU QS, if at all, and which is the best setting, at least in the dataset that we consider. In figure 3 (c) we present the result of running S TRU QS[ BJ , VE ] with various settings of the parameter div on the formulas in the QBFEVAL’08 dataset which cannot be solved by either S TRU QS[ VE ] or S TRU QS[ BJ ] alone (660 for- mulas). In particular, we consider an initial value div = 125 and we increase it by a ×2 factor until div = 16000, each time running S TRU QS[ BJ , VE ] on the above dataset. The plot in Figure 3 clearly shows that (i) div ≈ 2000 appears to be the best setting in the QBFEVAL’08 dataset, and that (ii) only 15 instances separate the performance of the best S TRU QS[ BJ , VE ] setting (div = 2000) with respect to the worst one (div = 125). In all the experiments that we show hereafter div is fixed to 2000, but we conjecture that while tuning div helps in making S TRU QS faster in some cases, relatively small changes in the number of problems solved are to be expected. As we anticipated, there is a number of instances that can be solved by S TRU QS[ BJ , VE ], but cannot be solved by the corresponding single solvers. Our next experiment is to compare all S TRU QS versions to see whether blended solvers have an edge over single ones. Indeed, a quick glance at the results shown in Table 1 reveals that blended solvers are always superior to single ones. In more detail, S TRU QS[ BT ] can solve 11% of the dataset, S TRU QS[ VE ] can deal with 16% of it, and S TRU QS[ BJ ] tops at 18%. Only 273 problems (8% of the dataset) are solved by both S TRU QS[ BT ] and S TRU QS[ VE ], which are incomparable. On the other hand, S TRU QS[ BJ ] dominates S TRU QS[ BT ] and it is incomparable with S TRU QS[ VE ]. As for the blended versions, we see that S TRU QS[ BT, VE ] is more efficient than all the single versions: It can solve 31% of the dataset, which is more than the set of problems solved by at least one of S TRU QS[ BT ] and S TRU QS[ VE ] (635). S TRU QS[ BJ , VE ] 11 turns out to be the most efficient among S TRU QS versions: It can solve 39% of the dataset, which is also more than the set of problems solved by at least one of S TRU QS[ BJ ] and S TRU QS[ VE ] (800). If we normalize the number of search steps performed by the solvers considering the problems that are solved, we see a 10× gap between S TRU QS[ BT, VE ] and S TRU QS[ BJ ], while the same gap be- tween S TRU QS[ BJ ] and S TRU QS[ BT ] is only 5×. This means that combining search and variable elimination turns to be efficient per se, and even more ef- ficient than boosting search with backjumping. If we look at variable elimina- tion steps, we can see a substantial increase in the normalized number of steps: nearly 300× (resp. 200×) gap between S TRU QS[ VE ] and S TRU QS[ BT, VE ] (resp. S TRU QS[ BJ , VE ]). Indeed, variable elimination is used more in S TRU QS[ BT, VE ] and S TRU QS[ BJ , VE ] than in S TRU QS[ VE ], but this happens on subproblems that are structurally simpler, as there is only a 1.7× (resp. 2×) gap in normalized CPU time between S TRU QS[ VE ] and S TRU QS[ BT, VE ] (resp. S TRU QS[ BJ , VE ]). No- tice also that S TRU QS[ BJ , VE ] performs less search and variable elimination steps than S TRU QS[ BT, VE ], and this appears to be the net effect of backjumping. One last set of considerations about Table 1 regards dominance relationships. In particular, S TRU QS[ BT, VE ] is incomparable with both S TRU QS[ BT ] and S TRU QS[ VE ]. In the case of S TRU QS[ BT ], there are 6 problems in the suite Ansotegui which can be solved by S TRU QS[ BT ], but not by S TRU QS[ BT, VE ]. Indeed, the size of these problems is such that variable elimination becomes prohibitive too quickly for PREFER S EARCH to notice. S TRU QS[ VE ] can solve 162 problems that S TRU QS[ BT, VE ] cannot solve, where the bulk of such problems (121) is from the suite Pan: in these formulas, condition (1) of PREFER S EARCH is almost always unmet, and yet they turn out to be very difficult for search. Relaxing condition (1) of PREFER S EARCH can have positive effects on these formulas, but it can also severely degrade the performance of S TRU QS[ BT, VE ] on other suites. S TRU QS[ BJ , VE ] dominates S TRU QS[ BT, VE ] and S TRU QS[ BT ], but is incomparable with S TRU QS[ BJ ] and S TRU QS[ VE ]. In particular, there are 52 problems solved by S TRU QS[ BJ ] that S TRU QS[ BJ , VE ] cannot solve, whereof 40 are from the family blackbox-01X-QBF. These instances are characterized by the largest number of quantifier blocks among formulas in the dataset, and search is already quite efficient on them. S TRU QS[ VE ] can solve 68 formulas on which S TRU QS[ BJ , VE ] fails, of which 67 are from the suite Pan and have the same peculiarities mentioned about S TRU QS[ BT, VE ]. In Table 2 we conclude this section by comparing S TRU QS[ BJ , VE ] with state- of-the-art QBF solvers. We have considered all the competitors of QBFEVAL’08, namely AQME, Q U BE6.1, NENOFEX, QUANTOR, and SSOLVE. To them, we have added the solvers S K IZZO [1] which is representative of skolemization-based ap- proaches, Q U BE3.0 [7] which is the same as Q U BE6.1 but without preprocessing, Y Q UAFFLE [18] and 2 CLS Q [17] which are also search-based solvers, and QM- R ES [12] which features symbolic variable elimination. Although S TRU QS[ BJ , VE ] is just a proof-of-concept solver, Table 2 reveals that it would have ranked third best considering only the participants to QBFEVAL’08, and it ranks as fourth best in Table 2. Moreover, there are 24 instances that can be solved only by 12 Solver Solved True False # Time # Time # Time AQME 2434 43987 977 19747 1457 24240 Q U BE6.1 2144 32414 828 18248 1316 14166 S K IZZO 1887 40864 631 17550 1256 23314 S TRU QS[ BJ , VE ] 1304 63141 576 32685 728 30456 Q U BE3.0 1077 16700 406 6536 671 10164 NENOFEX 985 22360 459 13853 526 8507 QUANTOR 972 15718 485 10418 487 5300 SSOLVE 965 23059 450 9866 515 13193 Y Q UAFFLE 948 16708 389 9058 559 7650 2 CLS Q 780 21287 391 13234 389 8053 QMR ES 704 13576 360 7722 344 5853 Table 2: S TRU QS[ BJ , VE ] vs. state-of-the-art QBF solvers considering the number of problems solved and the cumulative CPU seconds considering overall results (“Solved”), as well as “True” and “False” formulas separately. S TRU QS[ BJ , VE ]. If we consider that AQME is a multi-engine solver combining the ability of several engines (including Q U BE6.1 and S K IZZO), and that both Q U BE6.1 and S K IZZO are fairly sophisticated pieces of software, we can con- clude that S TRU QS[ BJ , VE ] performances are very satisfactory. Indeed, S TRU QS is somewhat limited by the straigthforward implementation of operations such as, e.g., .forward subsumption. On one hand, disabling forward subsumption brings to 1114 the number of problems solved by S TRU QS[ BJ , VE ]. On the other hand, forward subsumption accounts for 8% of the total time spent by S TRU QS[ BJ , VE ] on the formulas that it can solve, but for 20% of the total time on the formulas that it cannot solve. Increasing the time limit to 1200 seconds allows S TRU QS[ BJ , VE ] to solve about one hundred additional problems, indicating that the efficiency of basic operations is currently a limiting factor for S TRU QS. 5 Conclusions and future work Summing up, the combination of search and resolution featured by S TRU QS seems to offer a gateway to effective reasoning in QBFs. The two key results of our work are (i) showing that the blended solvers S TRU QS[ BT, VE ] and S TRU QS[ BJ , VE ] outperform the single ones, and (ii) showing that S TRU QS[ BJ , VE ] is competitive with other state-of-the-art solvers. We believe that these results can be improved by pursuing an efficient implementation of S TRU QS. The fact that S TRU QS, how- ever unsophisticated, is able to solve 24 problems that cannot be solved by other state-of-the-art QBF solvers is promising in this sense. Also, S TRU QS[ BJ , VE ] used as a SAT solver on a subset of 790 QBFEVAL’08 formulas can solve 668 problems within one hour of CPU time and it is competitive with, e.g., MINI SAT which can solve 787 of them, while S TRU QS[ BJ ] and S TRU QS[ VE ] top at 553 and 145 problems solved, respectively. Other improvements would be leveraging dif- ferent structural features and approximations thereof, and/or integrating inductive techniques to synthesize (sub)instance-based PREFER S EARCH policies. 13 References [1] M. Benedetti. sKizzo: a Suite to Evaluate and Certify QBFs. In 20th Int.l. Conf. on Automated Deduction, volume 3632 of LNCS, pages 369–376, 2005. [2] A. Biere. Resolve and Expand. In 7th Intl. Conf. on Theory and Applications of Satisfiability Testing, volume 3542 of LNCS, pages 59–70, 2004. [3] H. Bodlaender. A Linear-Time Algorithm for Finding Tree-Decompositions of Small Treewidth. SIAM J. of Computation, 25(6):1305–1317, 1996. [4] M. Cadoli, A. Giovanardi, and M. Schaerf. An algorithm to evaluate quan- tified boolean formulae. In 15th Nat.l Conf. on Artificial Intelligence, pages 262–267, 1998. [5] H. Chen and V. Dalmau. From Pebble Games to Tractability: An Ambidex- trous Consistency Algorithm for Quantified Constraint Satisfaction. In In proc. of 19th Int.l workshop on Computer Science Logic, volume 3634 of LNCS, pages 232–247, 2005. [6] N. Eén and A. Biere. Effective Preprocessing in SAT through Variable and Clause Elimination. In 8th Intl. Conf. on Theory and Applications of Satisfi- ability Testing, volume 3569 of LNCS, pages 61–75, 2005. [7] E. Giunchiglia, M. Narizzano, and A. Tacchella. Clause-Term Resolution and Learning in Quantified Boolean Logic Satisfiability. Artificial Intelligence Research, 26:371–416, 2006. [8] G. Gottlob, G. Greco, and F. Scarcello. The Complexity of Quantified Con- straint Satisfaction Problems under Structural Restrictions. In 19th Int.l Joint Conference on Artificial Intelligence, pages 150–155, 2005. [9] G. Gottlob, N. Leone, and F. Scarcello. A comparison of structural CSP decomposition methods. Artificial Intelligence, 124:243–282, 2000. [10] H. Kleine-Büning, M. Karpinski, and A. Flögel. Resolution for Quantified Boolean Formulas. Information and Computation, 117(1):12–18, 1995. [11] J. Larrosa and R. Dechter. Boosting Search with Variable Elimination in Constraint Optimization and Constraint Satisfaction Problems. Constraints, 8(3):303–326, 2003. [12] G. Pan and M. Vardi. Symbolic Decision Procedures for QBF. In 10th Int.l Conf. on Principles and Practice of Constraint Programming, pages 453– 467, 2004. [13] G. Pan and M. Vardi. Fixed-Parameter Hierarchies inside PSPACE. In 21st IEEE Symposium on Logic in Computer Science, pages 27–36, 2006. 14 [14] C. Peschiera, L. Pulina, and A. Taccchella. Sixth QBF solvers evaluation (QBFEVAL), 2008. http://www.qbfeval.org/2008. [15] L. Pulina and A. Tacchella. Treewidth: a useful marker of empirical hard- ness in quantified Boolean logic encodings. In 15th Int.l Conf. on Logic for Programming, Artificial Intelligence, and Reasoning, volume 5330 of LNCS, 2008. [16] I. Rish and R. Dechter. Resolution versus search: Two strategies for sat. Journal of Automated Reasoning, 24(1/2):225–275, 2000. [17] H. Samulowitz and F. Bacchus. Binary Clause Reasoning in QBF. In In proc. of Ninth Int.l Conference on Theory and Applications of Satisfiability Testing, volume 4121 of LNCS, pages 353–367, 2006. [18] Y. Yu and S. Malik. Verifying the Correctness of Quantified Boolean For- mula(QBF) Solvers: Theory and Practice. In Conference on Asia South- Pacific Design Automation, pages 1047–1051, 2005. 15