<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Generic CDCL - A Formalization of Modern Propositional Satisfiability Solvers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Steffen Ho¨lldobler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Norbert Manthey</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tobias Philipp</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peter Steinke</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>International Center for Computational Logic Technische Universita ̈t Dresden</institution>
        </aff>
      </contrib-group>
      <fpage>25</fpage>
      <lpage>34</lpage>
      <abstract>
        <p>Modern propositional satisfiability (or SAT) solvers are very powerful due to recent developments on the underlying data structures, the used heuristics to guide the search, the deduction techniques to infer knowledge, and the formula simplification techniques that are used during pre- and inprocessing. However, when all these techniques are put together, the soundness of the combined algorithm is not guaranteed any more. In this paper we present a small set of rules that allows to model modern SAT solvers in terms of a state transition system. With these rules all techniques which are applied in modern SAT solvers can be adequately modeled. Finally, we compare Generic CDCL with related systems.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Many practical problems of computer science are in the complexity class NP.
There are many well studied formalisms that can handle problems of this class,
among them are constraint satisfaction [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], answer set programming [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], or
satisfiability checking [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Although the two former formalisms admit a richer
language, the latter approach is still very competitive even if the expressivity of
its language is comparatively low.
      </p>
      <p>
        The propositional satisfiability problem (SAT) consists of a propositional
formula and asks whether there is a satisfying assignment for the Boolean variables
occurring in the formula. From a complexity theory point of view SAT is
NPcomplete [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and, thus, intractable. Still, there are many industrial and academic
applications that can be solved nicely with modern SAT solvers. For instance a
SAT-based railway scheduling software outperformed the native version [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
Likewise, haplotype matching [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] can be solved nicely with modern SAT solvers.
      </p>
      <p>
        The success of the SAT approach lies in the strength of today’s SAT solvers.
SAT solvers do not operate on testing all possible variable assignments, but
on constructing an assignment by successively interleaving two processes, viz.,
guessing and propagating the assignment of literals. The main inference rule is
unit propagation, an efficient form of resolution. Combined with a decision rule
it is the core of the basic algorithm known as the DPLL algorithm [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In the
case that a contradiction is found in the formula with respect to the current
variable assignment, advanced SAT solvers backtrack and learn a conflict clause
which prevents the current and similar conflicts. With the addition of so-called
learned clauses the basic algorithm is known as CDCL algorithm [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>
        Modern systematic SAT solvers are highly tuned and complex proof
procedures employing many advanced techniques like clause learning [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ],
nonchronological backtracking, restarts [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], clause removal [
        <xref ref-type="bibr" rid="ref2 ref6">2, 6</xref>
        ], decision
heuristics [
        <xref ref-type="bibr" rid="ref17 ref2">2,17</xref>
        ], and formula simplification techniques [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Specialized, cache-conscious
data structures [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] further improve the performance. This way, today’s solvers
like Riss,1 MiniSAT or Lingeling can handle formulas with millions of variables
and millions of clauses.
      </p>
      <p>However, the success of modern solvers carries a price tag: increased code
complexity. Successful SAT solvers like the above mentioned ones consist of
multiple thousand lines of code and are written in programming languages with
side effects like C or C++. Due to the code complexity, the behavior of SAT
solvers is hard to understand and state-of-the-art SAT solver internals are hard
to teach. Moreover, finding additional techniques and integrating them into a
SAT solver is getting more complex, as we have to consider the interplay with
all the remaining techniques. Consequently, abstracting from specific algorithms,
data structures, and heuristics is extremely important in order to discover and
prove properties of a modern SAT solver as well as to understand the principles
of SAT solving.</p>
      <p>
        This problem was tackled by different formalizations, notably Linearized
DPLL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], Rule-based SAT Solver Descriptions [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], and Abstract
DPLL [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. However, these systems do not appropriately model modern SAT
solvers anymore. In particular, preprocessing and applying preprocessing
techniques interleaved with search, known as inprocessing, became a crucial part in
SAT solving. Applying formula simplification techniques also during search is an
attractive idea since it allows to use valuable formula simplifications while taking
learned clauses into account. For example, the SAT solver Lingeling benefits
considerably from this approach.
      </p>
      <p>The contribution of this paper is the formalism Generic CDCL that models
the computation of modern SAT solvers. Equipped with a small set of simple
state transition rules, we can model all well-established techniques like
preprocessing, inprocessing, restarts, clause sharing, as well as clause learning and
forgetting. This formalism allows us to reason about the behavior of SAT solvers
independently of the specific implementation. Additionally, the framework is a
first step to explain how modern SAT solvers are working in a compact and easy
way. Besides the presentation of Generic CDCL, the main result of this paper is
the proof that Generic CDCL and, consequently, all its instances are sound.</p>
      <p>The paper is structured as follows: In Section 2 we describe basic concepts of
satisfiability testing. We present Generic CDCL in Section 3, where we also prove
that Generic CDCL correctly solves the satisfiability problems. Afterwards, we
compare Generic CDCL with related formalism in Section 4 and we conclude
the paper in Section 5.
1 The SAT solver Riss is freely available at tools.computational-logic.org.</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <sec id="sec-2-1">
        <title>The Satisfiability Problem</title>
        <p>We assume a fixed infinite set V of Boolean variables. A literal is a variable v
(positive literal ) or a negated variable v (negative literal ). The complement x of
a positive (negative, resp.) literal x is the negative (positive, resp.) literal with
the same variable as x. The complement of a set S of literals, denoted with S,
is defined as S = {x | x ∈ S}. Finite sets of clauses are called formulas, where
a clause is a finite set of literals. Sometimes, we write a clause {x1, . . . , xn} also
as the disjunction (x1 ∨ . . . ∨ xn) and a formula {C1, . . . , Cn} as the conjunction
(C1 ∧ . . . ∧ Cn). The empty clause is denoted by ⊥, the empty formula by &gt;. The
formula obtained from F by replacing all occurrences of the variable v by the
variable w is denoted by F [v 7→ w]. The set of all variables occurring in a formula
F (in positive or negative literals) is denoted by vars(F ); the set of all literals
occurring in F by lits(F ). For instance, if x, y ∈ V, then F = {{x, y}, {y}} is
a formula, its alternative representation using logical connectives is (x ∨ y) ∧ y,
vars(F ) = {x, y}, and lits(F ) = {x, y}.</p>
        <p>The semantics of formulas is based on the notion of an interpretation. An
interpretation I is a set of literals which does not contain a complementary pair
x, x of literals. An interpretation I is total iff for each v ∈ V either v ∈ I or v ∈ I.
The satisfaction relation |= is defined as follows: Let I be an interpretation, then
I |= &gt;, I 6|= ⊥, I |= (x1 ∨ . . . ∨ xn) iff I |= xi for some i ∈ {1, . . . , n}, and
I |= (C1 ∧ . . . ∧ Cn) iff I |= Ci for all i ∈ {1, . . . , n}. Interpretation I is a model
for the formula F iff I |= F . In the case that a formula F has a model, then F
is satisfiable, otherwise it is unsatisfiable.</p>
        <p>We relate formulas by three relations: the entailment, the equivalence and
the equisatisfiability relation: Formula F entails formula F 0 iff every total model
of F is a model of F 0. Two formulas F and F 0 are equivalent, in symbols F ≡ F 0,
iff F entails F 0 and F 0 entails F . Two formulas F and F 0 are equisatisfiable, in
symbols F ≡sat F 0, iff either both are satisfiable or both are unsatisfiable.</p>
        <p>For instance, the interpretation I = {x, ¬z} is a model of the formula F1 =
(x ∨ y) ∧ (x ∨ z) and, therefore, F is satisfiable. The formula F2 = x ∧ x has
no model and, therefore, is unsatisfiable. The formula F3 = x ∧ z is satisfiable
and, therefore, the formulas F1 and F3 are equisatisfiable, but the formulas F1
and F2 are not equisatisfiable. In fact, F3 |= F1 since every total model I of the
formula F3 must contain x and z and, hence, the two clauses of F1 are satisfied
by I. Finally, we find for all clauses C and formulas F that C ∨ &gt; ≡ &gt; ∨ C ≡ &gt;,
C ∨ ⊥ ≡ ⊥ ∨ C ≡ C, F ∧ &gt; ≡ &gt; ∧ F ≡ F , and F ∧ ⊥ ≡ ⊥ ∧ F ≡ ⊥.</p>
        <p>Let x be a literal, C = (x ∨ C0) and D = (x ∨ D0) be two clauses. Then the
clause (C0 ∨ D0) is the resolvent of the clauses C and D upon the literal x. A
linear resolution derivation from the clause C to the clause D in the formula F
is a finite sequence of clauses (Ci | 1 ≤ i ≤ n) such that C1 = C, Cn = D and
Ci is a resolvent of the clause Ci−1 and some clause in the formula F for all
i ∈ {2, . . . , n − 1}; one should observe that F entails D and that the addition of
entailed clauses to a formula preserves equivalence.
2.2</p>
        <p>Variable Assignments and the Reduct Operator
Let J be a finite sequence of literals. In J each literal may be marked as a
decision literal by placing a dot on top like in x˙ ; if a literal x is not marked, then
it is a propagation literal. Let J be a sequence of literals of length n. We say that
literal x ∈ J iff there is a k ∈ {1, . . . , n} such that x = xk. Let J1 = (x1, . . . , xn)
and J2 = (y1, . . . , ym) be two sequences of literals; their concatenation J1J2 is
the sequence (x1, . . . , xn, y1, . . . , ym). If a finite sequence J of literals does not
contain a complementary pair of literals, then J represents an interpretation. As
this condition is always met in this paper, we identify sequences of literals with
interpretations whenever appropriate.</p>
        <p>The reduct of a formula F w.r.t. an interpretation J , in symbols F |J , is
defined as F |J := {C|J | C ∈ F and for every literal x ∈ C we find that x 6∈ J },
where C|J = C \ {x | x ∈ J }. Intuitively, the reduct operator expresses the state
of a SAT solver, where the formula F is the working formula and J is the working
assignment. For instance, let F = {{x, y}, {z}}, then F |x = {{y}, {z}}, F |z =
{{x, y}, ⊥} and F |y z = &gt;, where the interpretations are written as sequences
of literals. One should observe that the reduct operator does not distinguish
between propagation and decision literals.</p>
        <p>Lemma 1 below summarizes the properties of the reduct operator: (i) The
reduct is monotone. (ii) A formula F is satisfiable iff there exists an
interpretation J such that the reduct of a formula w.r.t. J is the empty formula. (iii)
If the reduct of a formula F w.r.t. the interpretation {x} is unsatisfiable, and
the formula F entails the literal x, then the formula F is unsatisfiable. (iv) The
reduct operator is a semantic operator in the sense that it cannot distinguish
equivalent formulas.</p>
        <p>Lemma 1 (Reduct Operator). Let F, F 0 be formulas and x a literal.
(i) F |J ⊆ (F ∪ F 0)|J for every interpretation J .
(ii) F is satisfiable iff there exists a J such that F |J = &gt;.
(iii) If F |x is unsatisfiable and F |= x, then F is unsatisfiable.
(iv) If F ≡ F 0, then F |J ≡ F 0|J for every interpretation J .</p>
        <p>Proof. See [19, pp.10–12].
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Generic CDCL</title>
      <p>
        Modern SAT solvers are based on the linearized DPLL [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] algorithm and consists
of the following components: termination criteria, a decision component that
picks the branching literals, an inference component that adds propagation
literals to the working sequence of literals, a backtracking component that rolls
back wrong decisions and a formula management component that simplifies the
working formula. We maintain two data structures during the execution of
modern SAT solvers: the working formula, and the working set of literals. Together
they define the state. The components are modelled as a transition relation over
the set of states; the union of the rules in Fig. 1 is then the transition relation
tu
SAT-rule:
      </p>
      <p>F</p>
      <p>J ;SAT SAT
iff</p>
      <p>F |J = &gt;.</p>
      <p>UNSAT-rule: F J ;UNSAT UNSAT iff</p>
      <p>⊥ ∈ F |J and J contains only propagation literals.</p>
      <p>DEC-rule:
INF-rule:
LEARN-rule: F
FORGET-rule: F
BACK-rule:
INP-rule:</p>
      <p>F J ;DEC F J x˙ iff
x ∈ vars(F ) ∪ vars(F ) and {x, x} ∩ J = ∅.</p>
      <p>F J ;INF F J x iff
F |J ≡sat F |J x and {x, x} ∩ J = ∅.</p>
      <p>J ;LEARN F ∪ {C}</p>
      <sec id="sec-3-1">
        <title>J ;FORGET F \ {C}</title>
        <p>J
J
iff
iff</p>
        <p>F |= C.</p>
        <p>F \ {C} |= C.
F
F</p>
      </sec>
      <sec id="sec-3-2">
        <title>J J 0 ;BACK F</title>
        <p>ε ;INP F 0 ε</p>
        <p>J .
iff</p>
        <p>F ≡sat F 0.
of Generic CDCL: Formally, we model the computation of modern SAT solvers
by means of state transition systems as follows:
Definition 2 (Generic CDCL). Generic CDCL is a state transition system
whose sets of states is
{F</p>
        <p>J | F is a formula and J is a sequence of literals} ∪ {SAT, UNSAT},
whose initial state for the input formula F is init(F ) = F ε, whose set of
terminal states is {SAT, UNSAT}, and whose transition relation ; is defined as:
; := {;SAT, ;UNSAT, ;DEC, ;INF, ;LEARN, ;FORGET, ;BACK, ;INP}.
The SAT-rule terminates the computation with the output SAT, if the reduct of
the working formula w.r.t. the working set of literals is the empty formula. This
condition can be decided in linear time w.r.t. the size of the working formula F .
By Lemma 1 (ii) the working formula is then satisfiable.</p>
        <p>The UNSAT-rule terminates the computation with the output UNSAT, if no
model of the working formula exists. This is the case when a conflict occurs in
the top level, i.e. ⊥ ∈ F |J and the sequence J of literals contains only propagation
literals. These conditions can be decided in polynomial time.</p>
        <p>The DEC-rule extends the working sequence of literals by an unassigned literal
x˙ as a decision literal.
The INF-rule extends the working list of literals by a propagation literal x, if
the reducts of the working formula w.r.t. the working sequence of literals and its
extension are equisatisfiable.</p>
        <p>
          The BACK-rule models backtracking, as well as backjumping and restarts, by
deleting outermost right literals in the working sequence of literals.
The LEARN-rule adds a clause C to the working formula, if it is entailed by
the working formula F . Deciding whether F |= C holds, is coNP-complete.
Similarly to the INF-rule, SAT solvers avoid this check by using techniques for
creating the clause C that ensure this property, as for example resolution.
The FORGET-rule deletes a clause C of the working formula F , if F \ {C} |= C.
The question whether F \ {C} |= C holds is coNP-complete. Typically, we use
tractable algorithms to identify redundant clauses. For instance, clauses that
were introduced by the LEARN-rule but have turned out to be useless and did
not participate in the elimination of other clauses in the formula can be removed.
For more details on the deletion of clauses see [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
        <p>The INP-rule models formula simplifications that are used in pre- and
inprocessing. It replaces the working formula with an equisatisfiable formula when
the working sequence of literals is empty.</p>
        <p>Let ;∗ be the reflexive and transitive closure of ;. Let x ;0 x for all states x,
and x ;n z for all natural numbers n ∈ N if and only if there exists a state y such
that x n−1 y ; z. In the next subsection we investigate the question whether
;
Generic CDCL correctly solves the SAT problem. Formally, we define Generic
CDCL to be sound iff for all formulas F0 we have that init(F0) ;∗ SAT implies
that F0 is satisfiable and init(F0) ;∗ UNSAT implies that F0 is unsatisfiable.
3.1</p>
        <p>Generic CDCL is Sound
Before proceeding to the soundness proof of Generic CDCL, it will be necessary
to study two invariants of Generic CDCL that are presented in the proposition
below: (i) states that the rules of Generic CDCL do not change the satisfiability
of the working formula, and (ii) states that whenever the working sequence of
literals is of the form J1 x J2, where x is a propagation literal, the reducts of the
working formula w.r.t. J1 and J1 x are equisatisfiable.</p>
        <p>Proposition 3 (Invariants). Let F0, F be formulas, J be a sequence of literals,
and n ∈ N. If init(F0) ;n F J , then
1. F0 ≡sat F , and
2. F |J1 ≡sat F |J1 x, for all sequences of literals J1, J2 and propagation literals x
with J = J1 x J2.</p>
        <p>Proof. The claims are proven by induction on n. For the base case n = 0, 1.
follows from F0 = F and 2. holds since the J is empty. For the induction step,
assume that the claim holds for the state F J and suppose that F J ;R F 0 J 0
where R ∈ {DEC, INF, LEARN, FORGET, BACK, INP}:
– DEC-rule: In this case, F 0 = F and J 0 = J x˙ for some decision literal x˙
with {x, x} ∩ J = ∅. 1. follows since F0 ≡sat F holds by induction. 2. holds
because the appended literal is a decision literal. Formally, let J10 , J20 be literal
sequences, y be a propagation literal such that J 0 = J10 y J20 x˙ . By induction,
we conclude that F |J10 ≡sat F |J10 y. Hence, F 0|J10 ≡sat F 0|J1 y.
– BACK-rule: In this case, F 0 = F and J = J 0 J 00. 1. follows since F0 ≡sat F by
induction. 2. holds because the literal sequence J 0 is a prefix of J . Formally,
let J10 , J20 be literal sequences and y be a propagation literal such that J 0 =
J10 y J 0 . By induction, we conclude that F |J10 ≡sat F |J10 y, and consequently
2
we know that F 0|J10 ≡sat F 0|J10 y.
– LEARN-rule: In this case, F 0 = F ∪ {C} where F |= C and J 0 = J . 1. follows
since F ≡ F 0 and the addition of the entailed clause C preserves equivalence
of the working formula. 2. follows from the reduct operator being a semantic
operator by Lemma 1.iv and therefore F 0|J10 ≡sat F 0|J10 y holds by induction
for every literal sequences J10 , J20 and propagation literals y with J 0 = J10 y J20 .
– FORGET-rule: This case can be treated as in the LEARN-rule.
– INF-rule: In this case, F 0 = F and J 0 = J x for some propagation literal x
with {x, x} ∩ J = ∅. 1. follows since F0 ≡sat F holds by induction. 2. follows
from the definition of the INF-rule: Consider the literal sequences J10 , J20 and a
propagation literal y such that J 0 = J10 y J20 . In the case that y = x, we know
that J20 is the empty sequence of literals and consequently F |J10 ≡sat F |J10 y
holds by the definition of the INF-rule. In the case of y 6= x, we can conclude
the claim by induction.
– INP-rule: In this case, F 0 ≡sat F and J 0 is the empty sequence. Consequently,
1. holds by the definition of the INP-rule, and 2. is satisfied as J 00 = ε.
tu
We can now show the main theorem in this paper.</p>
        <sec id="sec-3-2-1">
          <title>Theorem 4 (Soundness). Generic CDCL is sound.</title>
          <p>Proof. We divide the proof in two parts, first proving that the output SAT is
correct, and then proving that the output UNSAT is correct. Let F0, F be formulas,
J be a sequence of literals and suppose that
init(F0) ;∗ F</p>
          <p>J ; SAT(UNSAT, resp.).</p>
          <p>SAT By the definition of the SAT-rule, we know that F |J = &gt;. By Lemma 1.ii,
we know that the formula F is satisfiable. From the result that the formula F is
satisfiable together with the property that the formulas F0 and F are
equisatisfiable given in Prop. 3(1.), we conclude that the input formula F0 is satisfiable.
UNSAT By the definition of the UNSAT-rule, we know that ⊥ ∈ F |J and the
working sequence of literals J = (x1 . . . xn) contains only propagation literals.
Since a conflict occurs, F |J is unsatisfiable. From the result that the formula F |J
is unsatisfiable and the literal sequence J contains only propagation literals we
can repeatably apply Prop. 3(2.) and we obtain that the formula F is
unsatisfiable. Since the formula F is unsatisfiable and the formulas F and F0 are
equisatisfiable by Prop. 3(1.), we conclude that F0 is unsatisfiable. tu
3.2</p>
          <p>Generic CDCL Subsumes Important SAT Solving Techniques
We now describe some important SAT solving techniques, and demonstrate that
Generic CDCL can adequately model these techniques.</p>
          <p>
            Subsumption For a formula F , the clause C ∈ F subsumes the clause D ∈ F
iff C ⊂ D. In this case, D can be deleted because F \ {D} |= D. Consequently,
F J ;FORGET F \{D} J holds for every literal sequence J . Removing subsumed
clauses is done as a preprocessing step in SAT solvers and during clause learning.
Tautologies A clause C is a tautology if it contains a complementary pair of
literals. Every formula F entails a tautology and the LEARN-rule in Generic CDCL
subsumes this techniques. Tautologies are eliminated during preprocessing.
Conflict-Directed Backtracking and Learning [
            <xref ref-type="bibr" rid="ref21">21</xref>
            ] is an improvement of naive
backtracking that takes the reason of the conflict into account. Consider the
state F J x˙ J 0 and a clause C ∈ F where C|J x˙ J0 = ⊥. The clause C is the
conflict clause. If there is a linear resolution derivation from the conflict clause C
to a clause D in the formula F such that D|J is the unit clause y, the technique
rewrites the state F J x˙ J 0 into the state F ∪ {C} J y. Conflict-directed
backtracking and learning can be simulated by the following transition steps:
F J x˙ J 0 ;BACK F J ;LEARN F ∪ {D} J ;INF F ∪ {D} J y.
Blocked Clause Elimination [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ] A clause C is blocked in the formula F if it
contains a literal x such that all resolvents of the clause C and clauses D ∈ F
upon x are tautological. Blocked clauses are removed from a formula during
preand inprocessing. If C is blocked in F , then F ≡sat F \ {C} and, therefore, the
INP-rule subsumes the blocked clause elimination technique.
          </p>
          <p>Unit Propagation A clause that contains a single literal is a unit clause. Unit
propagation adds the propagation literal x to the literal sequence J , whenever
the reduct of the working formula w.r.t. J contains the unit clause (x). Since
F |J |= x, we know that F |J ≡sat F |J x and consequently the INF-rule subsumes
unit resolution.</p>
          <p>Pure Literal A literal x is pure in the formula F , if x 6∈ lits(F ). For pure
literals, it holds that F ≡sat F |x and, therefore, whenever a literal x is pure in
the formula F |J for some literal sequence J , Generic CDCL can add the pure
literal to the working literal sequence: F J x.</p>
          <p>J ;INF F
4</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Related Work</title>
      <p>
        Several attempts have been made to formalize sequential SAT solvers in terms of
transition systems or proofs calculi: Abstract DPLL [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], Linearized DPLL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
and Rule-based SAT solver description [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. In general, these formalization are
based on a notion of state like Generic CDCL.
      </p>
      <p>
        However, these formalizations cannot adequately model recent SAT solvers:
For instance, Linearized DPLL does not model the SAT solver MiniSAT, because
Linearized DPLL restricts decision literals to occur in the working formula, but
the solver MiniSAT can also select the complements of such literals.
Additionally, Linearized DPLL does not model formula simplification techniques such
as blocked clause elimination, or probing-based inference techniques. Similarly,
Abstract DPLL and the Rule-based SAT solver description [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] do not model
formula simplifications that changes the semantics of formulas like blocked clause
elimination. Maric highlighted the implementation of clause learning techniques
in his Rule-based SAT solver description [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], but it does not include recent
developments such as clause strengthening. All these formalizations consider
DPLLbased SAT solvers, but the ancient pure literal rule is not subsumed in these
systems. In contrast, Generic CDCL subsumes all recent SAT techniques to the
best of our knowledge.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] Jarvisalo et. al. developed a formal system to model clause learning,
forgetting and formula simplification techniques to understand the side-effects
of the combination of different rules. They drew our attention to the interplay
of the learned clause database with inprocessing techniques. The interplay of
clause sharing and formula simplification techniques in parallel SAT solvers was
analyzed in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], where the state of a sequential SAT solver was modelled just as
the working formula. We believe that Generic CDCL is an important fragment
to understand sequential SAT solvers with inprocessing and their cooperation in
the parallel-portfolio setting with clause sharing.
5
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>The propositional satisfiability problem is of great practical interest and can be
efficiently answered by modern SAT solvers like Riss, Lingeling or MiniSAT.
Today, modern SAT solvers are highly tuned proof procedures with many advaned
techniques. Therefore, it is desireable to investigate SAT solving techniques in
combination with each other and to abstract from implementations.</p>
      <p>In this paper, we developed Generic CDCL, a formalism that models the
computation of modern SAT solvers in terms of a state transition system, where
each transition rule abstracts a component in a SAT solver. In particular, the
transition rules INF and INP model formula simplification techniques like blocked
clause elimination and inference techniques such as the pure literal rule. We
have examined invariants in Generic CDCL and have shown that Generic CDCL
is sound. In contrast to previous work on formalizations of SAT solvers, we
can model all recent techniques. The findings add to our understanding of the
interplay of inprocessing techniques with the other components of SAT solvers.</p>
      <p>A limitation of Generic CDCL is its lack of details in the learning and
inference component. As future work, we plan to investigate properties such as
completeness, confluence and termination.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Arnold</surname>
          </string-name>
          , H.:
          <article-title>A linearized DPLL calculus with clause learning</article-title>
          .
          <source>Tech. rep., Universita¨t Potsdam</source>
          .
          <source>Institut fu¨r Informatik</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Audemard</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simon</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Predicting learnt clauses quality in modern SAT solvers</article-title>
          . In: Boutilier,
          <string-name>
            <surname>C. (ed.) IJCAI</surname>
          </string-name>
          <year>2009</year>
          . pp.
          <fpage>399</fpage>
          -
          <lpage>404</lpage>
          . Morgan Kaufmann Publishers Inc.,
          <string-name>
            <surname>Pasadena</surname>
          </string-name>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Biere</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heule</surname>
            , M., van Maaren,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walsh</surname>
          </string-name>
          , T. (eds.):
          <article-title>Handbook of Satisfiability</article-title>
          . IOS Press, Amsterdam (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cook</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          :
          <article-title>The complexity of theorem-proving procedures</article-title>
          . In: Harrison,
          <string-name>
            <given-names>M.A.</given-names>
            ,
            <surname>Banerji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.B.</given-names>
            ,
            <surname>Ullman</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.D. (eds.) STOC</surname>
          </string-name>
          <year>1991</year>
          . pp.
          <fpage>151</fpage>
          -
          <lpage>158</lpage>
          . ACM (
          <year>1971</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Logemann</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loveland</surname>
            ,
            <given-names>D.:</given-names>
          </string-name>
          <article-title>A machine program for theorem-proving</article-title>
          .
          <source>Commun. ACM</source>
          <volume>5</volume>
          (
          <issue>7</issue>
          ),
          <fpage>394</fpage>
          -
          <lpage>397</lpage>
          (
          <year>1962</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. E´en, N., So¨rensson, N.:
          <article-title>An extensible SAT-solver</article-title>
          . In: Giunchiglia,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Tacchella</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (eds.)
          <article-title>SAT 2003</article-title>
          .
          <article-title>LNCS</article-title>
          , vol.
          <volume>2919</volume>
          , pp.
          <fpage>502</fpage>
          -
          <lpage>518</lpage>
          . Springer, Heidelberg (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Gelfond</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lifschitz</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>The stable model semantics for logic programming</article-title>
          . In: Kowalski,
          <string-name>
            <given-names>R.A.</given-names>
            ,
            <surname>Bowen</surname>
          </string-name>
          ,
          <string-name>
            <surname>K.A. (eds.) ICLP</surname>
          </string-name>
          <year>1988</year>
          . pp.
          <fpage>1070</fpage>
          -
          <lpage>1080</lpage>
          . MIT Press (
          <year>1988</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Gomes</surname>
            ,
            <given-names>C.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Selman</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crato</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kautz</surname>
          </string-name>
          , H.:
          <article-title>Heavy-tailed phenomena in satisfiability and constraint satisfaction problems</article-title>
          .
          <source>J. Autom. Reason</source>
          .
          <volume>24</volume>
          (
          <issue>1-2</issue>
          ),
          <fpage>67</fpage>
          -
          <lpage>100</lpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Großmann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Ho¨lldobler,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Manthey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Nachtigall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Opitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Steinke</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.</surname>
          </string-name>
          :
          <article-title>Solving periodic event scheduling problems with SAT</article-title>
          . In: Jiang,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            ,
            <surname>Ali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <surname>X</surname>
          </string-name>
          . (eds.) IEA /
          <article-title>AIE 2012</article-title>
          .
          <article-title>LNCS</article-title>
          , vol.
          <volume>7345</volume>
          , pp.
          <fpage>166</fpage>
          -
          <lpage>175</lpage>
          . Springer, Heidelberg (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. Ho¨lldobler,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Manthey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Saptawijaya</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>Improving resource-unaware SAT solvers</article-title>
          . In: Fermu¨ller, C.G.,
          <string-name>
            <surname>Voronkov</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . (eds.)
          <article-title>LPAR 2010</article-title>
          .
          <article-title>LNCS</article-title>
          , vol.
          <volume>6397</volume>
          , pp.
          <fpage>519</fpage>
          -
          <lpage>534</lpage>
          . Springer, Heidelberg (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. Ja¨rvisalo,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Biere</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Heule</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Blocked clause elimination</article-title>
          . In: Esparza,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <surname>R</surname>
          </string-name>
          . (eds.)
          <article-title>TACAS 2010</article-title>
          .
          <article-title>LNCS</article-title>
          , vol.
          <volume>6015</volume>
          , pp.
          <fpage>129</fpage>
          -
          <lpage>144</lpage>
          . Springer, Heidelberg (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. Ja¨rvisalo,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Heule</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.J.H.</given-names>
            ,
            <surname>Biere</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>Inprocessing rules</article-title>
          . In: Gramlich,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Sattler</surname>
          </string-name>
          ,
          <string-name>
            <surname>U</surname>
          </string-name>
          . (eds.)
          <article-title>IJCAR 2012</article-title>
          .
          <article-title>LNCS</article-title>
          , vol.
          <volume>7364</volume>
          , pp.
          <fpage>355</fpage>
          -
          <lpage>370</lpage>
          . Springer, Heidelberg (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Lynce</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marques-Silva</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Efficient haplotype inference with Boolean satisfiability</article-title>
          .
          <source>In: AAAI 2006</source>
          . pp.
          <fpage>104</fpage>
          -
          <lpage>109</lpage>
          . AAAI Press (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Manthey</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Philipp</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wernhard</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Soundness of inprocessing in clause sharing SAT solvers</article-title>
          . In: Ja¨rvisalo, M.,
          <string-name>
            <surname>Van Gelder</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . (eds.)
          <article-title>SAT 2013</article-title>
          .
          <article-title>LNCS</article-title>
          , vol.
          <volume>7962</volume>
          , pp.
          <fpage>22</fpage>
          -
          <lpage>39</lpage>
          . Springer, Heidelberg (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15. Mari´c, F.:
          <article-title>Formalization and implementation of modern SAT solvers</article-title>
          .
          <source>J. Autom. Reason</source>
          .
          <volume>43</volume>
          (
          <issue>1</issue>
          ),
          <fpage>81</fpage>
          -
          <lpage>119</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>Marques</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.P.</given-names>
            ,
            <surname>Sakallah</surname>
          </string-name>
          ,
          <string-name>
            <surname>K.A.</surname>
          </string-name>
          :
          <article-title>GRASP: A search algorithm for propositional satisfiability</article-title>
          .
          <source>IEEE Transactions on Computers</source>
          <volume>48</volume>
          (
          <issue>5</issue>
          ),
          <fpage>506</fpage>
          -
          <lpage>521</lpage>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Moskewicz</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Madigan</surname>
            ,
            <given-names>C.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malik</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          : Chaff:
          <article-title>Engineering an efficient SAT solver</article-title>
          .
          <source>In: DAC 2001</source>
          . pp.
          <fpage>530</fpage>
          -
          <lpage>535</lpage>
          . ACM, New York (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Nieuwenhuis</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliveras</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tinelli</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <string-name>
            <surname>Abstract</surname>
            <given-names>DPLL</given-names>
          </string-name>
          <article-title>and abstract DPLL modulo theories</article-title>
          . In: Baader,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Voronkov</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . (eds.)
          <article-title>LPAR 2004</article-title>
          .
          <article-title>LNCS</article-title>
          , vol.
          <volume>3452</volume>
          , pp.
          <fpage>36</fpage>
          -
          <lpage>50</lpage>
          . Springer, Heidelberg (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Philipp</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Expressive Models for Parallel Satisfiability Solvers</article-title>
          .
          <source>Master thesis</source>
          , Technische Universita¨t
          <string-name>
            <surname>Dresden</surname>
          </string-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Rossi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beek</surname>
            ,
            <given-names>P.v.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walsh</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Handbook of Constraint Programming. Elsevier Science Inc</article-title>
          ., New York (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>J.P.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sakallah</surname>
            ,
            <given-names>K.A.</given-names>
          </string-name>
          :
          <article-title>GRASP - a new search algorithm for satisfiability</article-title>
          .
          <source>In: ICCAD 1996</source>
          . pp.
          <fpage>220</fpage>
          -
          <lpage>227</lpage>
          . IEEE Computer Society, Washington (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>