=Paper=
{{Paper
|id=None
|storemode=property
|title=A Note for Parametric Complexity of #2SAT
|pdfUrl=https://ceur-ws.org/Vol-804/09_LANMR11.pdf
|volume=Vol-804
|dblpUrl=https://dblp.org/rec/conf/lanmr/LunaFR11
}}
==A Note for Parametric Complexity of #2SAT==
A Note for Parametric Complexity of #2SAT Guillermo De Ita Luna, Fernando Zacarı́as Flóres, Alejandro Rangel Huerta Faculty of Computer Sciences, BUAP deita@cs.buap.mx, fzflores@yahoo.com.mx, arangel@cs.buap.mx Abstract. We present some results about the parametric complexity for counting the number of truth assignments for two Conjunctive Forms (2- CF’s), such problem is denoted as #2SAT. It is common to analyze the computational complexity for #2SAT regarding the number of variables or the number of clauses on the input formula F We consider here, the time complexity analysis for #2SAT based on a positive integer parameter k, where k represents the number of satisfy assignments of F . We establish that for some values of k, the question: Is #2SAT (F ) = k?, can be answered in polynomial time with respect to the size of the formula F . Keywords: #SAT, Efficient Algorithms, Enumerative Combinatorics, Parametric Complexity. 1 Introduction To count combinatorial objects on graphs is an interesting and important area of research in Mathematics, Physics, and Computer Sciences. Counting problems, being mathematically relevant by themselves, are closely related to practical problems. For instance, reliability network issues are often equivalent to counting problems. Computing the probability that a graph remains connected, given the probabilities of failure over each edge, is essentially equivalent to counting the number of ways that the edges could fail without losing connectivity [11, 12]. The problem of counting models for a Boolean formula (#SAT problem) can be reduced to several problems in approximate reasoning. For example, in the cases of: estimating the degree of belief in propositional theories, the genera- tion of explanations to propositional queries, repairing inconsistent databases, Bayesian inference and a truth maintenance systems [1, 2, 8, 10–12]. The previ- ous problems come from several AI applications such as planning, expert systems, approximate reasoning, etc. The combinatory problem that we are going to address is the computation of the number of satisfy assignments for Boolean formulas in two Conjunctive Forms, this problem is denoted as #2SAT problem. #2SAT is a classical #P - complete problem even for the restricted cases of monotone and Horn formulas. Among the class of #P-complete problems, #2SAT is considered a funda- mental instance because of its relevance application in deduction issues and also, because it would allow to establish a boundary between efficient and intractable counting problems. As a result, given a 2-CF F with n variables and m clauses, 95 several methods with lower time upper bound than the trivial O(2n ) or O(2m ) have been proposed to solve #2SAT. Alternatively, given a positive integer k and a 2-CF F , we want to know if the number of models in F is k. We show that for some specific values of the parameter k, the main question: is #2SAT (F ) = k?, can be solved in polynomial time with respect to the size of the formula F . 2 Preliminaries Let X = {x1 , . . . , xn } be a set of n boolean variables. A literal is either a variable xi or a negated variable xi . As usual, for each xi ∈ X, x0i = xi and x1i = xi . A clause is a disjunction of different literals (sometimes, we also consider a clause as a set of literals). For k ∈ IN , a k-clause is a clause consisting of exactly k literals and, a (≤ k)-clause is a clause with at most k literals. A variable x ∈ X appears in a clause c if either x or x is an element of c. A conjunctive form (CF) F is a conjunction of clauses (we also consider a CF as a set of clauses). We say that F is a monotone CF if all of its variables appear in unnegated form. A k-CF is a CF containing only k-clauses and, (≤ k)-CF denotes a CF containing clauses with at most k literals. A kµ-CF is a formula in which no variable occurs more than k times. A (k, jµ)-CF ((≤ k, jµ)-CF) is a k-CF ((≤ k)-CF) such that each variable appears no more than j times. We use υ(X) to express the variables involved in the object X, where X could be a literal, a clause or a CF. For instance, for the clause c = {x1 , x2 }, υ(c) = {x1 , x2 }. Lit(F ) is the set of literals appearing in F , i.e. if X = υ(F ), then Lit(F ) = X ∪ X = {x1 , x1 , ..., xn , xn }. We denote {1, 2, ..., n} by [[n]] and the cardinality of a set A by |A|. An assignment s for F is a boolean function s : υ(F ) → {0, 1}. An assignment can also be considered a set of non complementary pairs of literals. If l ∈ s, being s an assignment, then s turns l true and l false. Considering a clause c and assignment s as a set of literals, c is satisfied by s if and only if c ∩ s = ∅, and if for all l ∈ c, l ∈ s, then s falsifies c. Assuming n = |υ(F )|, then there are 2n possible assignments defined over υ(F ). We denote the set of all 2n assignments defined over υ(F ) by S(F ). If F1 ⊂ F is a formula consisting of some clauses from F , and υ(F1 ) ⊂ υ(F ), an assignment over υ(F1 ) is a partial assignment over υ(F ). Assuming n =| υ(F ) | and n1 =| υ(F1 ) |, any assignment over υ(F1 ) has 2n−n1 extensions as assignments over υ(F ). Let F be a CF, F is satisfied by an assignment s if each clause in F is satisfied by s. F is contradicted by s if any clause in F is contradicted by s. A model M of F is an assignment for υ(F ) that satisfies F . The SAT problem consists of determining if F has a model. SAT(F ) denotes the set of models of F , then SAT(F ) ⊆ S(F ). Let UNSAT(F ) = S(F ) − SAT (F ), i.e. UNSAT(F ) is the set of assignments from S(F ) that falsify F . 96 The #SAT problem (or #SAT(F )) consists of counting the number of models of F defined over υ(F ), #2SAT denotes #SAT for formulas in 2-CF. And let #UNSAT(F ) = 2n − #SAT (F ). A 2-CF formula F can be represented by an undirected graph. Let F be a 2-CF, the constrained graph of F is the undirected graph GF = (V (F ), E(F )), where V (F ) = υ(F ) and E(F ) = {{υ(x), υ(y)} : {x, y} ∈ F }. I.e. the vertices of GF are the variables of F , and for each clause {x, y} in F there is an edge {υ(x), υ(y)} ∈ E(F ). The neighborhood for x ∈ V (F ) is N (x) = {y ∈ V (F ) : {x, y} ∈ E(F )} and its closed neighborhood is N (x)∪{x} denoted as N [x]. The degree of a variable x, denoted by δ(x), is |N (x)|, and the degree of F is ∆(F ) = max{δ(x) : x ∈ V (F )}. 3 Parametric Time Complexity for #2SAT An alternative to compute #SAT (F ), for a given 2-CF F , is to determine the set of connected components of its constrained graph GF . In [10], it has been proved that the set of connected components of a constrained graph can be determined in linear time with respect to the number of clauses in the formula. Thus, k #2SAT (F ) = #2SAT (GF ) = #2SAT (Gi ) (1) i=1 where {G1 , . . . , Gk } is the set of connected components of GF [10]. The set of different connected components of GF conforms a partition of F . Of course, if the number of models in any connected component of F is zero, then #SAT (F ) = 0. So, from now on, we will design an algorithm which solves #2SAT just for one connected component. And its application to 2CF- formulas with several connected components is inferred from (1). We say that a 2-CF F is a path, a cycle, or a tree if its corresponding constrained graph GF is a path, a cycle, or a tree, respectively. Let F = {C1 , C2 , . . . , Cm } be a strict 2-CF (all clause has length 2) and let n = |υ(F )|. We consider here, the size of F as (n + m). If F is not a empty formula then any clause Ci ∈ F, i = 1, . . . , m is falsified by 2n−2 assignments from the 2n possible assignments of S(F ), in fact, by those assignments where both literals from Ci take the false value. Thus, #SAT(F ) ≤ 2n−2 . Let k be a positive integer parameter such that k ≤ 2n−2 . Our main concern is to determine if #SAT(F ) = k. For this question there is an immediate answer for some values of k. For example, if k = 0 or k = 1 the Transitive Closure procedure could be applied to answer such question, and its is known that such procedure has a linear time complexity over the size of the 2-CF F (see [7]). In fact, if the value k can be upper bounded by a polynomial function on n, e.g. k ≤ n3 , then in [3], an exact algorithm, called Count Models, is shown for determining if #SAT(F ) ≤ n3 . Such algorithm runs on polynomial time complexity on the size of F . Thus, Count Models can be applied for answering if #SAT(F ) = k when k ≤ n3 . 97 Then, the hard cases to answer if #SAT(F ) = k, appear when the parameter k is a higher value than n3 . However, #SAT(F ) could not be so close to 2n−2 while m > 1. A way that the value #SAT(F ) might be close to 2n−2 is when F has few clauses. But in this case, almost all exact procedures can compute #SAT(F ) in polynomial time. For example, if m ≤ n then GF is an acyclic graph, or a simple cycle. In the both previous cases, the procedures shown in [4] compute #SAT(F ) in linear time on the size of F . From now on, we assume m > n. Let nc = m − n + 1 be the number of fundamental cycles in the graph GF . Let vr be any node of GF chosen to start a depth-first search. We obtain a spanning tree TF with vr as the root node and a set of fundamental cycles C = {C1 , C2 , ..., Cnc }, where each back edge ci ∈ E marks the beginning and the end of a fundamental cycle. Given any pair of fundamental cycles Ci and Cj of C, i = j, If Ci and Cj share edges, we call them intersecting cycles; otherwise, they are called independent cycles. Let AF be the depth-first search graph formed by the spanning tree TF and the set of fundamental cycles C. If the fundamental cycles appearing in AF are independent one from the other. Then there is a procedure in [4], called Count Models for Disjoint cycles, which computes #SAT(F ) in polynomial time on the size of F , without regarding the value of nc. Finally, the hard cases for determining if #SAT(F ) = k are the cases when there is intersected cycles in GF . Let ic be the number of intersected cycles in GF . Notice that ic ≤ nc = m − n + 1 and that ic is therefore smaller than m. Definition 1. Let F be a 2-CF and x a literal of F . The reduction of F by x, also called f orcing x and denoted by F [x], is the formula generated from F by the following two rules a) removing from F the clauses containing x (subsumption rule), b) removing x from the remaining clauses (unit resolution rule). A reduction is also sometimes called a unit reduction. Now, the reduction by a set of literals can be inductively established as follows: let s = {l1 , l2 , . . . , lt } be a set of literals, the reduction of F by s is defined as successively applying definition ( 1) for li , i = 1, . . . , t. That is, the reduction of F by l1 gives the formula F [l1 ], following a reduction of F [l1 ] by l2 , giving as a result the formula F [l1 , l2 ] and so on. The process continues until F [s] = F [l1 , ..., lk ] is reached. If s = ∅ then F [s] = F . Example 1. Let F = {{x1 , x2 }, {x1 , x2 }, {x1 , x3 }, {x1 , x3 }, {x2 , x4 }, {x2 , x4 }, {x2 , x5 }, {x3 , x5 }}. Then, F [x2 ] = {{x1 }, {x1 , x3 }, {x1 , x3 }, {x5 }, {x3 , x5 }}. And if s = {x2 , x3 } we have that F [s] = {{x1 }, {x1 }, {x1 }, {x4 }, {x4 }, {x5 }}. Furthermore, during the computation of F [s] new unitary clauses can be generated. The partial assignment s could be extended by adding the literals coming from the new unitary clauses found, that is, s = s ∪ {u} where {u} is 98 Algorithm 1 Unit Propagation(F, s) Input: F a 2-CF formula and s - a partial assignment of F . Let s be a global variable initiated with the initial value s. F = F [s] /* forcing s */ for each new unitary clause {u} in F do {s = s ∪ {u} F = U nit P ropagation(F , {u})} end for return F a unitary clause. So, F [s] can be reduced again using the new unitary clauses. The above mentioned iterative process is generalized in the following procedure. For simplicity, we will abbreviate U nit P ropagation(F, s) as U P (F, s). Given a 2-CF F and an initial assignment s, we obtain after the application of U P (F, s) a subformula F and an extended assignment s from s. s is s extended by the literals appearing in the unitary clauses found during the iterative process U P (F, s). Let T (s) = s be such extended assignment built by the procedure U P (F, s). Let F be a 2-CF and let s be a partial assignment of F . If a pair of contradic- tory unitary clauses is obtained while F [s] is being computed then #SAT (F [s]) = 0, because under no circumstances a pair of complementary unit clauses can be set to true at the same time, and then, F [s] does not have models. The application of definition ( 1 a) in a 2-CF formula F , could remove varia- bles which have to be considered in the models of F . Let us present an example. Example 2. Let F = {{x}, {x, y}}, in other words F = x ∧ (x ∨ y), it follows that F [x] = ∅. It can be noticed that {y} ∈ F [x], however y can take any logical value in the models of F . The variables which are removed from F during the application of U P (F, s) form a set, which will be denoted by Elim V ars(s). This set is constructed du- ring the computation of U P (F, s). In fact, it can be reviewed that |Elim V ars(s)| = |υ(F )| − |υ(U P (F, s)| − |T (s)|. It is time to show that counting the number of models of a 2-CF formula F can be simplified by counting the number of models of sub-formulas of F . Theorem 1. Let F be a 2-CF and x a variable in υ(F ). Let F1 = U P (F, x) and F2 = U P (F, x) then #SAT (F ) = #SAT (F1 ) ∗ 2|Elim V ars(x)| + #SAT (F2 ) ∗ 2|Elim V ars(x)| Proof. We can split the number #SAT(F ) between the models where x ∈ υ(F ) is true and the models where x is false. The number of models where x is true are counted by #SAT(F1 ) ∗ 2|Elim V ars(x)| where F1 = U P (F, x). While the number of models where x is false are counted by #SAT(F2 ) ∗ 2|Elim V ars(x)| , F2 being the subformula F2 = U P (F, x). As T (x) and T (x) haven’t common elements since x ∈ T (x) and x ∈ T (x) then in the above sum we don’t consider common models between SAT (F1 ) and SAT (F2 ). 99 Let us assume GF has ic intersected cycles. Let x ∈ υ(F ) be the variable appearing in a maximum number of intersected cycles of GF . Let F1 = U P (F, x) and F2 = U P (F, x), then the application of the theorem ( 1) defines a splitting rule for computing #SAT(F ). And the iterative application of that splitting rule over the resulting subformulas determines a branch and bound algorithm whose base case is when the subformula does not have intersected cycles. Let us show how this algorithm works with an example. Example 3. Let F be a monotone 2-CF, F = {{x1 , x2 }, {x1 , x9 }, {x2 , x3 }, {x2 , x8 }, {x3 , x4 }, {x3 , x10 }, {x4 , x5 }, {x4 , x7 }, {x5 , x6 }, {x6 , x7 }, {x6 , x9 }, {x8 , x7 }, {x8 , x9 }}. F contains 4 intersected fundamental cycles. Then, F1 = U P (F, x7 ) = {{x1 , x2 }, {x1 , x9 }, {x2 , x3 }, {x2 , x8 }, {x3 , x4 }, {x3 , x10 }, {x4 , x5 }, {x5 , x6 }, {x6 , x9 }, {x8 , x9 }} which includes two intersected cycles and F2 = U P (F, x7 ) = {{x1 , x2 }, {x1 , x9 }, {x2 , x3 }, {x3 , x10 } which is an acyclic graph. We can see the processing of F in figure 1. Notice that the splitting rule has to be performed again to process F1 , since F1 has a pair of intersected cycles. While F2 represents a basic instance because it has not intersected cycles and to count its models is done by the Count M odels f or Disjoint cycles procedure in polynomial time on the size of F2 . Notice that all clauses of F where the variable x appears, are not more clauses neither F1 nor F2 . And as x appears in at least one pair of intersected cycles from GF then GF1 and GF2 have at least two intersected cycles less than GF . Consequently, the following recurrence equation holds: r(ic) = 2 ∗ r(ic − 2) = 2t ∗ r(ic − 2 ∗ t) . Such recurrence ends when ic − 2 ∗ t = 1, that is, when t = (ic − 1)/2. Therefore, the time complexity will be O(2(ic−1)/2 ). In the case that GF contains a small number of intersected cycles, e.g. ic ≤ n/2, then the above branch and bound procedure would have a time complexity upper bounded by O(2n/4 ) = O(1.1892n ). To check for the conditions to apply the previous algorithm is easy to do, since we apply a depth first search over GF and we count the number of intersected cycles ic in GF in order to determine if ic ≤ n/2. All those previous tasks can be performed in linear time on the size of F . While the search for the variable x appearing in a maximum number of intersected cycles in GF that require a time of order ic2 , that is polynomial on the size of F . In fact, we can accelerate the computation of #SAT(F ), if for example, all variable with one occurrence in the formula is processed before applying the splitting rule. Another way to accelerate the computation of #SAT(F ) is when we consider the application of two sequential splitting rules at the same time. For example to select a clause c = {l1 , l2 } ∈ F where l1 is the literal which appears in a maximum number of intersected cycles and l2 has a maximum degree into the set N (υ(l1 )). 100 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 F2=UP(F, Not(X7)) F1= UP(F, X7) X1 X9 X2 X1 X2 X3 X4 X5 X6 X9 X8 X3 X10 X10 Fig.1 Applying the splitting rule on a monotone 2-CF. The application of two consecutive splitting rules, first with l1 and next with l2 , generate three new branches in the enumerative tree instead of four. As the branch for U P (F, [l1 , l2 ]) generates an unsatisfiable subformula (since it falsify to c) then such branch is discarded from the enumerative tree. 3.1 The General case If there are a great number of intersected cycles, that means that m >> n, then there is a bigger number of clauses than the number of variables. Let F = {C1 , C2 , . . . , Cm } be a stricter 2-CF involving n variables. For any clause Ci ∈ F , there are exactly 2n−|Ci | assignments from S(F ) falsifying Ci . Let Ci , Cj ∈ F, i = j, i, j = 1, . . . , m. Each clause Ci , Cj ∈ F is falsified by 2n−2 assignments from the 2n possible assignments from S(F ). Both clauses suppress generally fewer than 2n−2 + 2n−2 of assignments from S(F ) because they have common assignments falsifying both clauses. I.e. #U N SAT (Ci ∨Cj ) = #U N SAT (Ci ) + #U N SAT (Cj ) − #U N SAT (Ci ∧ Cj ). For example, if Ci ∩Cj = {l} and there are not complementary pair of literals in Ci ∪ Cj then there are a common partial assignment formed by Ci ∪ Cj which falsifies both clauses. As, |Ci ∪ Cj | = 3, the clauses Ci and Cj are falsified by a total number of 2n−2 + 2n−2 − 2n−3 assignments from S(F ). 101 If Ci and Cj contain a common variable in opposite forms, that is direct in Ci and complemented in Cj , the unsatisfied assignments of each clause form disjoint sets of assignments. Consequently, both clauses suppress exactly 2n−2 + 2n−2 assignments from S(F ). The property between two clauses having at least one common variable in opposite form is called independence. Two clauses are independent if they have at least one common variable in opposite form [5]. Otherwise, we say that both clauses are dependent. In general, the inclusion-exclusion formula can be applied for computing #UNSAT(F ). Several algorithms have been designed as finer or shorter ver- sions of the application of the inclusion-exclusion formula in order to compute #UNSAT(F ) [5, 9]. Each clause is falsified by a determined number of assignments from S(F ), but while more and more clauses are considered, then more and more assign- ments are discarded from S(F ). Therefore, it is expected that #SAT(F ) = 2n − #U N SAT (F ) would be a smaller value than 2n−2 , in which case, it might be that #SAT(F ) could be upper bounded by a polynomial value on n. And then, as we have seen, #SAT(F ) can be computed in polynomial time on the size of F . Thus, the real hard cases for computing #SAT(F ) are when there are a considerable number of intersected cycles in GF , e.g. ic ≥ n/2, and when it is expected that #SAT(F ) would be greater than, for example, n3 . In that latter case, we can apply the algorithm which provides the leader upper bound on the time complexity for computing #2SAT. For instance, in ( [13]) a branch and bound procedure for computing #2SAT with an upper bound of O(1.2377n) was shown. In fact, one of the important tasks when branch and bound procedures are performed for computing #2SAT, is how to select a pivotal variable (or pivotal clause) for applying the branching rule. We have shown, in the previous subsection, an adequate way to select the pivotal variable x for applying the splitting rule. With the intention of reduc- ing the number of intersected cycles in each application of the splitting rule, it is adequate to select the variable x that appears in the maximum number of intersected cycles. This guarantees that all cycles where x is part of them, will be reduced in the subformulas associated to the new branches. And then, this will accelerate the processing of intersected cycles appearing in the current subformulas. The results presented here, allow us to establish in a more precise way the boundary between efficient and intractable instances for the #2SAT Problem. 4 Conclusions Given a 2-CF F with n variables, m clauses, and a positive integer parameter k, the question: Is #SAT(F ) = k?, can be answered in an efficient way for values of k; k ≤ n3 or k ≥ 2n−2 . When k is outside of those interval values, is convenient 102 to apply a depth first search on GF (the constrained graph of F ), and to count the number of intersected fundamental cycles (ic) appearing on GF . If ic ≤ n/2 then the branch and bound algorithm presented here, computes #SAT (F ) with a time complexity of O(2ic/2 ), which in a worst case has an upper bound of O(1.1892n ). Otherwise, we can apply the leader upper bound time complexity algorithm which computes #SAT(F ) with a running time of O(1.2377n ). Therefore, in any case, to answer the question: Is #SAT(F ) = k?, requires until now, an exponential time complexity on the number of intersected cycles in GF or an exponential time complexity on the number of variables in F. References 1. Darwiche A., On the Tractability of Counting Theory Models and its Application to Belief Revision and Truth Maintenance, Jour. of Applied Non-classical Logics, (2001), pp. 11-34. 2. Dahllöf V., Jonsson P., Wahlström M., Counting models for 2SAT and 3SAT for- mulae., Theoretical Computer Sciences, 332(1-3), (2005), pp. 265-291. 3. De Ita G., Marcial-Romero R., Hernández J.A., A threshold for a Polynomial Solution of #2SAT, Fundamenta Informaticae, to appear. 4. De Ita G., Bello P., Contreras M., New Polynomial Classes for #2SAT Established Via Graph-Topological Structure, Engineering Letters, Vol. 15;2, Int. Assoc. of Engineers, www.engineeringletters.com, (2007), pp. 250-258. 5. Dubois Olivier, Counting the number of solutions for instances of satisfiability, Theor. Comp. Sc. 81, (1991), 49-64. 6. Fernau H., On Parameterized Enumeration, LNCS 2387, (2002), pp. 564-573. 7. Gusfield D., Pitt L., A bounded approximation for the minimum cost 2-Sat pro- blem, Algorithmica 8, (1992), 103-117. 8. Khardon R., Roth D., Reasoning with Models, Artificial Intelligence, Vol. 87, No. 1, (1996), pp. 187-213. 9. Lozinskii E., Counting propositional models, Inf. Proc. Letters 41, (1992), pp. 327- 332. 10. Roth D., On the hardness of approximate reasoning, Artificial Intelligence 82, (1996), pp. 273-302. 11. Russ B., Randomized Algorithms: Approximation, Generation, and Counting, Dis- tinguished dissertations Springer, 2001. 12. Vadhan Salil P., The Complexity of Counting in Sparse, Regular, and Planar Graphs, SIAM Journal on Computing, Vol. 31, No.2, (2001), pp. 398-427. 13. Wahlström M., A Tighter Bound for Counting Max-Weight Solutions to 2SAT Instances, LNCS 5018, (2008), pp. 202-213. 103 104