<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>6th International Workshop on the Implementation of Logics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christoph Benzmüller</string-name>
          <aff>Universität des Saarlandes, Germany</aff>
          <email>chris@ags.uni-sb.de</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bernd Fischer</string-name>
          <aff>University of Southampton, England</aff>
          <email>b.fischer@ecs.soton.ac.uk</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Geoff Sutcliffe</string-name>
          <aff>University of Miami</aff>
          <email>geoff@cs.miami.edu</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Logic for Programming, Artificial Intelligence and Reasoning</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2006</year>
      </pub-date>
      <fpage>111</fpage>
      <lpage>126</lpage>
      <kwd-group>
        <kwd>Held at The 13th International Conference on Logic for Programming, Artificial Intelligence and Reasoning (LPAR)</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>The 6th International Workshop
on the Implementation of Logics</p>
      <p>The IWIL workshop series brings together developers and users of systems that
implement reasoning in logic, to share information about successful implementation
techniques for automated reasoning systems and similar programs. Systems of all types
(automated, interactive, etc), and for all logics (classical, non-classical, all orders, etc)
are of interest to the workshop. Contributions that help the community to understand
how to build useful and powerful reasoning systems in practice are of particular interest.
Two invited papers, and six research papers selected from the submissions (by the
program committee listed below), will be presented. A panel discussion will enable
all attendees to participate in open discussion. Previous IWIL workshops include the
5th IWIL (Montevideo, Uruguay), 4th IWIL (Almaty, Kazakhstan), 3rd IWIL (Tbilisi,
Georgia), 2nd IWIL (Havana, Cuba), and 1st IWIL (Reunion Island).</p>
    </sec>
    <sec id="sec-2">
      <title>-</title>
      <sec id="sec-2-1">
        <title>Journal Publication</title>
        <p>The Journal of Algorithms in Applied Logic, Artificial Intelligence and Computer Science
has agreed to a special issue based around the topic of the IWIL workshop. The
special issue targets IWIL participants, but will also accept submissions from the
broader community.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Program Committee</title>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Wolfgang Ahrendt (Chalmers University of Technology, Sweden)</title>
    </sec>
    <sec id="sec-4">
      <title>Anbulagan (National ICT Australia, Australia)</title>
    </sec>
    <sec id="sec-5">
      <title>Serge Autexier (Universität des Saarlandes, Germany)</title>
    </sec>
    <sec id="sec-6">
      <title>Chad Brown (Universität des Saarlandes, Germany)</title>
    </sec>
    <sec id="sec-7">
      <title>Hans de Nivelle (Max-Planck-Institut für Informatik, Germany)</title>
    </sec>
    <sec id="sec-8">
      <title>Alexander Fuchs (University of Iowa, USA)</title>
    </sec>
    <sec id="sec-9">
      <title>Thomas Hillenbrand (Max-Planck-Institut für Informatik, Germany)</title>
    </sec>
    <sec id="sec-10">
      <title>Boris Konev (University of Liverpool, England)</title>
    </sec>
    <sec id="sec-11">
      <title>Konstantin Korovin (University of Manchester, England)</title>
    </sec>
    <sec id="sec-12">
      <title>Albert Oliveras (Technical University of Catalonia, Spain)</title>
    </sec>
    <sec id="sec-13">
      <title>Brigitte Pientka (McGill University, Canada)</title>
    </sec>
    <sec id="sec-14">
      <title>Stephan Schulz (Technische Universität München, Germany)</title>
    </sec>
    <sec id="sec-15">
      <title>Volker Sorge (University of Birmingham, England)</title>
    </sec>
    <sec id="sec-16">
      <title>Alwen Tiu (Australian National University, Australia)</title>
    </sec>
    <sec id="sec-17">
      <title>Ullrich Hustadt (University of Liverpool, England)</title>
      <sec id="sec-17-1">
        <title>Presentation Schedule</title>
      </sec>
    </sec>
    <sec id="sec-18">
      <title>Session 1: 9:00-10:00</title>
    </sec>
    <sec id="sec-19">
      <title>Session 2: 10:30-11:00 11:00-11:30 11:30-12:00</title>
    </sec>
    <sec id="sec-20">
      <title>Session 3: 14:00-15:00</title>
    </sec>
    <sec id="sec-21">
      <title>Session 4:</title>
      <p>15:30-16:00
16:00-16:30
16:30-17:00
17:00-18:00</p>
    </sec>
    <sec id="sec-22">
      <title>Invited Talk: Algorithms and Data Structures for First-Order Equational Deduction</title>
      <p>Stephan Schulz</p>
    </sec>
    <sec id="sec-24">
      <title>Term Indexing for the LEO-II Prover</title>
      <p>Frank Theiß and Christoph Benzmüller</p>
    </sec>
    <sec id="sec-25">
      <title>Integrating External Deduction Tools with ACL2</title>
      <p>Matt Kaufmann, J Strother Moore, Sandip Ray and Erik Reeber</p>
    </sec>
    <sec id="sec-26">
      <title>Efficiently Checking Propositional Resolution Proofs in Isabelle/HOL</title>
      <p>Tjark Weber</p>
    </sec>
    <sec id="sec-27">
      <title>Invited Talk: Implementing an Instantiation-based Theorem Prover for First-order Logic</title>
      <p>Konstantin Korovin</p>
    </sec>
    <sec id="sec-28">
      <title>Tableau Decision Procedure for Propositional Intuitionistic Logic</title>
      <p>Guido Fiorino, Alessandro Avellone and Ugo Moscato</p>
    </sec>
    <sec id="sec-29">
      <title>LIFT-UP: Lifted First-Order Planning Under Uncertainty</title>
      <p>Steffen Hölldobler and Olga Skvortsova</p>
    </sec>
    <sec id="sec-30">
      <title>Multiple Preprocessing for Systematic SAT Solvers</title>
      <p>Anbulagan and John Slaney</p>
    </sec>
    <sec id="sec-31">
      <title>Panel Interrogation: Experts on the implementation of logics will answer questions from the workshop attendees</title>
      <p>Anbulagan, Martin Giese, Konstantin Korovin, Stephan Schulz</p>
      <sec id="sec-31-1">
        <title>Acknowledgements</title>
        <p>We wish to thank the following for their contribution to the success of this
conference: Air Force Office of Scientific Research, Asian Office of Aerospace Research and
Development.</p>
      </sec>
    </sec>
    <sec id="sec-32">
      <title>Algorithms and Data Structures for First-Order Equational Deduction (extended abstract)</title>
      <p>Stephan Schulz
Technische Universität München
schulz@eprover.org</p>
      <sec id="sec-32-1">
        <title>Introduction</title>
        <p>First-order logic with equality is one of the most widely used logics. While there is a
large number of different approaches to theorem proving in this logic, the field has been
dominated by saturation-based systems using some variant of the superposition
calculus [BG90, BG94, BG98, NR01], i.e. systems that employ paramodulation, restricted
by ordering constraints and possibly literal selection, as the main inference mechanism,
and rewriting and subsumption as the main redundancy elimination techniques. Many
systems complement equational reasoning with explicit resolution for non-equational
literals. Examples of provers based on the combination of paramodulation, rewriting
and (possibly) resolution include SPASS [WBH+02], Vampire [RV01], Otter [MW97],
its successor Prover9, and E [Sch02, Sch04b].</p>
        <p>The power of a saturating prover depends on four different, but interrelated aspects:
• The calculus (What inferences are necessary and possible?)
• The inference engine (How are they implemented?)
• The search organization (How is the proof state organized and which invariants
are maintained?)
• The heuristic control of the search (Which subset of inferences is performed and
in what order?)</p>
        <p>In this talk I will discuss the basic concepts of the given clause saturation algorithm
and its implementation. In particular, I will describe the behaviour of the DISCOUNT
loop version of this algorithm on practical examples, and discuss how this affected the
choice of algorithms and data structures for E. I will try to point out some low-hanging
fruit, where a lot of performance can be gained for relatively modest investment in code
complexity, as well as some more advanced techniques that have a significant pay-off.</p>
      </sec>
      <sec id="sec-32-1a">
        <title>Rewrite-Based Theorem Proving</title>
        <p>Superposition is a refutational calculus. It attempts to make the potential
unsatisfiability of a formula (consisting of axioms and negated conjecture) explicit using saturation.
The search state is represented by a set of first-order clauses (disjunctions of literals
over terms). The proof search employs two different mechanisms. First, new clauses are
added to the proof state by generating inference rules, using existing clauses as premises.
Secondly, simplifying or contracting inference rules are used to remove clauses or to
replace them by simpler ones. The proof is successful if this process eventually produces
the empty clause, i.e. an explicit contradiction.</p>
        <p>While the generating inferences are crucial to establish the theoretical completeness
of the calculus, extensive use of contracting inferences has turned out to be indispensable
for actually finding proofs.</p>
        <p>The practical difficulty of implementing a high-performance theorem prover results
mostly from the fact that the proof state grows extremely fast with the depth of the
proof search. A successful implementation has to be able to handle large data sets,
efficiently find potential inference partners, and in particular, be able to quickly identify
clauses that can be simplified or removed.</p>
        <p>While the number of inference and simplification rules can be much larger, in nearly
all cases only three rules are critical from a performance point of view:
• Superposition (including resolution as a special case) or a similar restricted form
of paramodulation is by far the most prolific generating inference rule. Typically,
between 95% and 99% of clauses in a non-trivial proof search are generated by
a paramodulation inference. Paramodulation can essentially be described as a
combination of instantiation (guided by unification) and lazy conditional rewriting.
For superposition, this inference is further restricted by ordering constraints (a
smaller term cannot be replaced by a larger term) and literal selection (only certain
literals need to be considered as applicable or as targets of the application).
• Unconditional rewriting allows the simplification of a clause by replacing a term
with an equivalent, but simpler term. In contrast to superposition, no instantiation
of the rewritten clause takes place, and a term is always replaced by a smaller
one. As a consequence, this is a simplifying inference, and the original of the
rewritten clause can be discarded. Practical experience has shown that in most
cases rewriting drastically improves the search behaviour of a prover.
• Subsumption allows the elimination of clauses if a more general clause is already
known. As such, it helps to reduce the search space explosion typical for saturating
provers. However, finding subsumption relations between clauses can be very
expensive.</p>
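        <p>To make the rewriting mechanism concrete, here is a toy rewriter in Python (all
names are illustrative; a real prover such as E works on shared, indexed terms and
restricts rewriting by ordering constraints, which this sketch omits):</p>

```python
# Toy first-order terms: a variable is a string starting with an uppercase
# letter; a compound term is a (symbol, (arg, ...)) tuple.
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, term, subst):
    """Syntactic matching: extend subst so that subst(pattern) == term."""
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if is_var(term) or pattern[0] != term[0] or len(pattern[1]) != len(term[1]):
        return None
    for p, t in zip(pattern[1], term[1]):
        if match(p, t, subst) is None:
            return None
    return subst

def apply_subst(subst, t):
    if is_var(t):
        return subst.get(t, t)
    return (t[0], tuple(apply_subst(subst, a) for a in t[1]))

def rewrite_once(term, rules):
    """Apply the first matching rule at the topmost applicable position."""
    for lhs, rhs in rules:
        s = match(lhs, term, {})
        if s is not None:
            return apply_subst(s, rhs)
    if not is_var(term):
        for i, arg in enumerate(term[1]):
            new = rewrite_once(arg, rules)
            if new is not None:
                args = list(term[1])
                args[i] = new
                return (term[0], tuple(args))
    return None

def normal_form(term, rules):
    while True:
        new = rewrite_once(term, rules)
        if new is None:
            return term
        term = new

# Group-style rules f(X, e) -> X and f(X, i(X)) -> e:
rules = [(("f", ("X", ("e", ()))), "X"),
         (("f", ("X", ("i", ("X",)))), ("e", ()))]
t = ("f", (("f", (("a", ()), ("e", ()))), ("i", (("a", ()),))))  # f(f(a,e), i(a))
print(normal_form(t, rules))  # ('e', ()): f(f(a,e), i(a)) rewrites to e
```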
        <p>For completeness, it is necessary to eventually consider all combinations of
nonredundant clauses as premises for generating inferences. To avoid the overhead of
keeping track of each possible combination, the given-clause algorithm splits the proof
state into two distinct subsets, the set P of processed (or active) clauses, and the set
U of unprocessed (or passive) clauses, and maintains the invariant that all necessary
inferences between clauses in P have been performed. On the most abstract level, the
algorithm picks a clause from U , performs all inferences with this clause and clauses
from P (adding the resulting newly deduced clauses to U ), and puts it into P . This
process is repeated until either the empty clause is deduced, the set U runs empty (in
which case there is no proof), or some time or resource limit has been reached (in which
case the prover terminates without a useful result). The major heuristic decision in this
algorithm is in which order clauses from U are picked for processing.</p>
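        <p>The loop just described can be sketched as follows (a schematic outline with the
evaluation, inference and simplification components left abstract; all function names
are illustrative, not E's API):</p>

```python
import heapq

def given_clause_loop(axioms, evaluate, infer, is_empty_clause, max_iters=10000):
    """Schematic given-clause saturation loop.

    P (processed) holds clauses whose mutual inferences are done; U
    (unprocessed) is a priority queue ordered by heuristic evaluation."""
    P = []
    U = [(evaluate(c), i, c) for i, c in enumerate(axioms)]
    heapq.heapify(U)
    counter = len(axioms)
    for _ in range(max_iters):
        if not U:
            return "satisfiable"   # U ran empty: no proof exists
        _, _, given = heapq.heappop(U)
        if is_empty_clause(given):
            return "proof found"   # explicit contradiction derived
        # All inferences between the given clause and P; results go to U.
        for new in infer(given, P):
            counter += 1
            heapq.heappush(U, (evaluate(new), counter, new))
        P.append(given)
    return "resource limit"        # prover gives up without a useful result

# Tiny demonstration with propositional clauses (frozensets of integer
# literals, -n negating n) and binary resolution as the only inference:
def resolvents(c, P):
    return [(c - {lit}) | (d - {-lit}) for d in P for lit in c if -lit in d]

clauses = [frozenset({1, 2}), frozenset({-1}), frozenset({-2})]
print(given_clause_loop(clauses, evaluate=len, infer=resolvents,
                        is_empty_clause=lambda c: not c))  # proof found
```

<p>The choice of the evaluation function is exactly the "major heuristic decision"
mentioned above: it determines the order in which clauses leave U.</p>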
        <p>The two main variants of the given clause algorithm differ in how they integrate
simplification into this basic loop. The Otter loop maintains the whole proof state in
a fully simplified (or interreduced) state. It uses all unit equations for rewriting and
all clauses for subsumption attempts. The DISCOUNT loop, on which E is built, only
maintains this invariant for the set P of processed clauses. It simplifies newly generated
clauses once (to weed out obviously redundant clauses and to aid heuristic evaluation),
but it does not use them for rewriting or subsumption until they are actually selected
for processing. It thus trades reduced simplification for a higher rate of iterations of the
main loop. Opinions differ on which of the two designs is more efficient (see e.g. [RV03]).
</p>
      </sec>
      <sec id="sec-32-1b">
        <title>Representing the Proof State</title>
        <p>Analysis of the behavior of provers over a large set of examples has shown that the size of
U typically grows about quadratically with the size of P . Therefore for non-trivial proof
problems, the vast majority of clauses is in U , and unprocessed clauses are responsible
for most of the resources used by the prover. The overall proof state can easily reach
millions of clauses, with a corresponding number of literals and terms as their constituents.</p>
        <p>Surprisingly, with naive implementations much of the CPU time can be taken up
with seemingly trivial operations. In the first version of DISCOUNT [ADF95, DKS97]
we found to our surprise that supposedly complex operations like first-order unification
were negligible, while linear time operations like the naive insertion of clauses into U
(organized as a linear list sorted by heuristic evaluation) took up around half of the
total search time.</p>
        <p>The first and most basic data type for a first-order prover is the term. Simple,
direct implementations realize terms as ordered trees, with each node labeled by a
function or variable symbol, and with the subterms of a term as successor nodes in
the tree. Alternatives to this basic design are either increasingly optimized for size, as
flat-terms, string terms, and eventually implicit terms (reconstructed on demand) as in
the Waldmeister prover [GHLS03], or they try to store more and more precomputed
information at the term nodes to speed up repetitive operations. This is particularly
successful if sets of terms are not represented as sets of trees, but as a shared directed
acyclic graph, with repeated occurrences of any given term or subterm only represented
once. This approach has been followed in E. We found sharing factors varying from 5–15
in the unit equational case, up to several thousand in the general non-Horn case.</p>
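        <p>Perfect sharing of this kind is commonly achieved by hash-consing, sketched below
(illustrative names, not E's actual term bank API):</p>

```python
class TermBank:
    """Hash-consing term bank: structurally equal terms share one node."""
    def __init__(self):
        self._nodes = {}

    def term(self, symbol, *args):
        key = (symbol, args)            # args are already shared nodes
        node = self._nodes.get(key)
        if node is None:
            node = key                  # the interned key doubles as the node
            self._nodes[key] = node
        return node

bank = TermBank()
a = bank.term("a")
t1 = bank.term("f", bank.term("g", a), a)
t2 = bank.term("f", bank.term("g", bank.term("a")), bank.term("a"))
print(t1 is t2)            # True: f(g(a), a) exists exactly once in the bank
print(len(bank._nodes))    # 3 nodes: a, g(a), f(g(a), a)
```

<p>With physical sharing, equality testing becomes a pointer comparison, and per-term
annotations (such as the rewrite links discussed below) are attached once per shared node.</p>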
      </sec>
      <sec id="sec-32-2">
        <title>Rewriting</title>
        <p>Shared terms not only allow a reduction in memory consumption, they also allow
the sharing of (some) term-related inferences. In particular, they allow the sharing of
rewrite operations and, even more importantly, normal form information among terms.
For any rewritten term, we can add a link pointing to the result of the rewrite operation.
If we encounter the same term again in the future, we can just follow this link instead
of performing a real search for matching and applicable rewrite rules. Less glorious, but
no less effective, is the sharing of information about non-rewritability. In either version
of the given clause algorithm, the set of potential rewrite rules changes over time, and
the strength of the rewrite relation grows monotonically. If a term is in normal form
with respect to all rules at a given time, no older rule has to be ever considered again for
rewriting this term. This criterion can be integrated into indexing techniques to further
speed up normal form computation.
</p>
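        <p>The rewrite links and the age-based criterion might be sketched like this
(hypothetical names; E attaches this information to shared term nodes):</p>

```python
class RewriteCache:
    """Cache rewriting results on shared terms.

    nf_link[t]: a known rewrite result for t (links can be chained).
    nf_date[t]: age of the newest rule when t was last fully normalised;
    rules no newer than that never need to be reconsidered for t."""
    def __init__(self, normalise):
        self.normalise = normalise   # (term, min_rule_age) -> normal form
        self.nf_link = {}
        self.nf_date = {}

    def normal_form(self, t, current_age):
        while t in self.nf_link:     # follow cached rewrite links first
            t = self.nf_link[t]
        start = self.nf_date.get(t, 0)
        if current_age > start:      # only newer rules can still apply
            nf = self.normalise(t, start)
            if nf != t:
                self.nf_link[t] = nf
                return self.normal_form(nf, current_age)
            self.nf_date[t] = current_age
        return t

# Toy demonstration: terms are strings, rules are (age, lhs, rhs) triples.
rules = [(1, "c", "b"), (2, "b", "a")]
def toy_normalise(t, start):
    changed = True
    while changed:
        changed = False
        for age, lhs, rhs in rules:
            if age > start and t == lhs:
                t, changed = rhs, True
    return t

cache = RewriteCache(toy_normalise)
print(cache.normal_form("c", current_age=2))  # a
print(cache.nf_link)   # {'c': 'a'}: future requests just follow the link
```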
      </sec>
      <sec id="sec-32-3">
        <title>Subsumption</title>
        <p>A single rewriting step is usually cheap to perform. The high cost of rewriting comes
from the large number of term positions and rules to consider. For subsumption, on
the other hand, even a single clause-clause check can be very expensive, as the problem
is known to be NP-complete [KN86]. Unless reasonable care is taken, the exponential
worst case can indeed be encountered, and with clauses that are large enough that this
hurts performance significantly1.</p>
        <p>To overcome this problem, a number of strategies can be implemented. First,
presorting of literals with a suitable ordering stable under substitutions can greatly reduce
the number of permutations that need to be considered. Secondly, a number of required
conditions for subsumption can be tested rather cheaply. If any of these tests fail, the
full subsumption test becomes superfluous.</p>
        <p>Feature vector indexing [Sch04a] arranges several of these tests in a way that they
can be performed not only for single clauses, but for sets of clauses at a time.
</p>
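        <p>The principle behind such cheap pre-tests can be illustrated with simple clause
feature vectors (a simplified sketch of the idea from [Sch04a], not E's implementation):</p>

```python
from collections import Counter

def features(clause):
    """clause: list of (sign, symbols) literals, where symbols is the multiset
    of function symbols in the literal. Instantiation can only add symbol
    occurrences, and subsumption maps the subsuming clause onto a subset of
    the subsumed one, so no feature of C may exceed that of D."""
    pos, neg = Counter(), Counter()
    npos = nneg = 0
    for sign, symbols in clause:
        if sign:
            npos += 1
            pos.update(symbols)
        else:
            nneg += 1
            neg.update(symbols)
    return npos, nneg, pos, neg

def may_subsume(c, d):
    """Cheap necessary condition: if this fails, the expensive full
    subsumption test cannot succeed and can be skipped."""
    cp, cn, cpos, cneg = features(c)
    dp, dn, dpos, dneg = features(d)
    return (dp >= cp and dn >= cn
            and all(dpos[s] >= cpos[s] for s in cpos)
            and all(dneg[s] >= cneg[s] for s in cneg))

# p(f(X)) may subsume p(f(a)) | q(a), but can never subsume p(a):
c = [(True, ["p", "f"])]
print(may_subsume(c, [(True, ["p", "f", "a"]), (True, ["q", "a"])]))  # True
print(may_subsume(c, [(True, ["p", "a"])]))                           # False
```

<p>Storing such vectors in a trie-like index lets a candidate be tested against whole
sets of clauses at once, which is the key idea of feature vector indexing.</p>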
      </sec>
      <sec id="sec-32-4">
        <title>Conclusion</title>
        <p>Engineering a modern first-order theorem prover is part science, part craft, and part
art. Unfortunately, there is no single exposition of the necessary knowledge available at
the moment - something that the community should aim to rectify.</p>
      </sec>
      <sec id="sec-32-5">
        <title>References</title>
        <p>[ADF95] J. Avenhaus, J. Denzinger, and M. Fuchs. DISCOUNT: A System for
Distributed Equational Deduction. In J. Hsiang, editor, Proc. of the 6th RTA,
Kaiserslautern, volume 914 of LNCS, pages 397–402. Springer, 1995.</p>
        <p>[BG90] L. Bachmair and H. Ganzinger. On Restrictions of Ordered Paramodulation
with Simplification. In M.E. Stickel, editor, Proc. of the 10th CADE,
Kaiserslautern, volume 449 of LNAI, pages 427–441. Springer, 1990.</p>
        <p>[BG94] L. Bachmair and H. Ganzinger. Rewrite-Based Equational Theorem
Proving with Selection and Simplification. Journal of Logic and Computation,
3(4):217–247, 1994.</p>
        <p>[BG98] L. Bachmair and H. Ganzinger. Equational Reasoning in Saturation-Based
Theorem Proving. In W. Bibel and P.H. Schmitt, editors, Automated
Deduction — A Basis for Applications, volume 9 (1) of Applied Logic Series,
chapter 11, pages 353–397. Kluwer Academic Publishers, 1998.</p>
        <p>[DKS97] J. Denzinger, M. Kronenburg, and S. Schulz. DISCOUNT: A
Distributed and Learning Equational Prover. Journal of Automated Reasoning,
2(18):189–198, 1997. Special Issue on the CADE 13 ATP System Competition.</p>
        <p>[KN86] D. Kapur and P. Narendran. NP-Completeness of the Set Unification and
Matching Problems. In J.H. Siekmann, editor, Proc. of the 8th CADE,
Oxford, volume 230 of LNCS, pages 489–495. Springer, 1986.</p>
        <p>[MW97] W.W. McCune and L. Wos. Otter: The CADE-13 Competition
Incarnations. Journal of Automated Reasoning, 18(2):211–220, 1997. Special Issue
on the CADE 13 ATP System Competition.</p>
        <p>[NR01] R. Nieuwenhuis and A. Rubio. Paramodulation-Based Theorem Proving. In
A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning,
volume I, chapter 7, pages 371–443. Elsevier Science and MIT Press, 2001.</p>
        <p>[RV01] A. Riazanov and A. Voronkov. Vampire 1.1 (System Description). In
R. Goré, A. Leitsch, and T. Nipkow, editors, Proc. of the 1st IJCAR, Siena,
volume 2083 of LNAI, pages 376–380. Springer, 2001.</p>
        <p>[RV03] A. Riazanov and A. Voronkov. Limited resource strategy in resolution
theorem proving. Journal of Symbolic Computation, 36(1–2):101–115, 2003.</p>
        <p>[Sch04a] S. Schulz. Simple and Efficient Clause Subsumption with Feature Vector
Indexing. In G. Sutcliffe, S. Schulz, and T. Tammet, editors, Proc. of
the IJCAR-2004 Workshop on Empirically Successful First-Order Theorem
Proving, Cork, Ireland, ENTCS. Elsevier Science, 2004. (to appear).</p>
        <p>1 This is an understated version of “The prover stops cold”.</p>
      </sec>
    </sec>
    <sec id="sec-33">
      <title>Term Indexing for the LEO-II Prover</title>
      <sec id="sec-33-1">
        <title>-</title>
        <p>Frank Theiß1</p>
        <p>Christoph Benzmüller2
1FR Informatik, Universität des Saarlandes, Saarbrücken, Germany</p>
        <p>lime@ags.uni-sb.de
2Computer Laboratory, The University of Cambridge, UK
chris@ags.uni-sb.de</p>
        <p>Abstract</p>
        <p>We present a new term indexing approach intended to support efficient
automated theorem proving in classical higher order logic. Key features of our indexing
method are a shared representation of terms, the use of partial syntax trees to
speed up logical computations, and the indexing of subterm occurrences. For the
implementation of explicit substitutions, additional support is offered by the indexing of
bound variable occurrences. A preliminary evaluation of our approach shows some
encouraging first results.</p>
      </sec>
      <sec id="sec-33-2">
        <title>Introduction</title>
        <p>Term indexing has become standard in first order theorem proving and is applied in
all major systems in this domain [RV02, Sch02, WBH+02]. An overview on first order
term indexing is given in [RSV01] and [NHRV01] presents an evaluation of different
techniques. Comparably few term indexing techniques have been developed and studied
for higher order logic. An example is Pientka’s work on higher order substitution tree
indexing [Pie03].</p>
        <p>In this paper we present a new approach to higher order term indexing developed for
the higher order resolution prover LEO-II1, the successor of LEO [BK98]. Our approach
is motivated by work presented in [TSP06], which studies the application of indexing
techniques for interfacing between theorem proving and computer algebra.</p>
        <p>Pientka’s approach is based on substitution tree indexing and relies on unification of
linear higher order patterns. While higher order pattern unification is a comparatively
high level operation, the approach we present here is based on coordinate and path
indexing [Sti89] and thus relies on lower level operations, for example, operations on
hashtables. Apart from indexing and retrieval of terms, we particularly want to speed up
basic operations such as replacement of (sub-)terms and occurs checks.</p>
        <p>1The LEO-II project at Cambridge University has just started (in October 2006). The project is
funded by EPSRC under grant EP/D070511/1.</p>
      </sec>
      <sec id="sec-33-3">
        <title>Terms in de Bruijn Notation</title>
        <p>LEO (and its successor LEO-II under development) is based on Church's simple type
theory, that is, a logic built on top of the simply typed λ-calculus [Chu40]. In contrast
to LEO, our new term data structure for LEO-II uses de Bruijn [dB72] indices for the
internal representation of bound variables. In this paper, de Bruijn indices have the form
xi, where x is a nameless dummy and i the actual index. Constants and free variables
in LEO-II, called symbols in the remainder of this paper, have named representations.
Due to Currying, applications have only one argument term in LEO-II.2</p>
        <p>Terms in LEO-II are thus defined as follows:
• Symbols are either constant symbols (taken from an alphabet Σ) or (free,
existential) variable symbols (taken from an alphabet V). Every symbol is a term.
• Bound variables, represented by de Bruijn indices xi for some index i ∈ {1, 2, . . .},
are terms.
• If s and t are terms, then the application s@t is a term.
• If t is a term, then the abstraction λ.t is a term.</p>
        <p>For bound variables xi, the de Bruijn index i denotes the distance between the
variable and its binder in terms of scopes. Scopes are limited by occurrences of λ-binders,
thus the index i is determined by the number of occurrences of λ-binders between the
variable and its binder.</p>
        <p>For instance, the term
λa.λb.(b = ((λc.(cb))a))
translates to
λλ.(x0 = ((λ.(x0x1))x1))
in de Bruijn notation. While de Bruijn indices ease the treatment of α-conversion in
the implementation, they are less intuitive. As can be seen in the above example,
different occurrences of the same bound variable may have different de Bruijn indices.
This is the case here for b, which translates to both x0 and x1. Vice versa, different
occurrences of the same de Bruijn index may refer to different λ-binders. This is the case
for x0, which relates to both the bound variable b (first occurrence of x0) and the bound
variable c (second occurrence of x0). Similarly, x1 is related to both the bound variable b
and the bound variable a.</p>
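        <p>The translation from named terms to de Bruijn form can be made concrete with a
small converter (a toy sketch, independent of LEO-II's actual data structures):</p>

```python
# Named terms: ("lam", name, body), ("app", s, t), or a plain string (a free
# symbol or a bound-variable name). De Bruijn terms: ("lam", body),
# ("app", s, t), ("x", i) for a de Bruijn index, or a free symbol name.
def to_de_bruijn(term, binders=()):
    if isinstance(term, str):
        if term in binders:
            return ("x", binders.index(term))  # lambdas crossed since binding
        return term                            # free symbol keeps its name
    if term[0] == "lam":
        _, name, body = term
        return ("lam", to_de_bruijn(body, (name,) + binders))
    _, s, t = term
    return ("app", to_de_bruijn(s, binders), to_de_bruijn(t, binders))

# λa.λb.(b = ((λc.(c b)) a)), with = written as the Curried symbol "eq":
named = ("lam", "a", ("lam", "b",
         ("app", ("app", "eq", "b"),
                 ("app", ("lam", "c", ("app", "c", "b")), "a"))))
print(to_de_bruijn(named))
# λλ.(x0 = ((λ.(x0 x1)) x1)): b becomes x0 in one place and x1 in another
```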
        <p>2Alternative representations, for example, spine notation [CP97], offer at first sight shorter paths to
term parts that are relevant for a number of operations. The difference is primarily the order in which
the parts of a term can be accessed. In the case of the spine notation, for example, the head symbol of
a term can be directly accessed. In our approach we try to offer these shortcuts by representing indexed
terms in a graph structure. This allows additional ways of accessing (sub-)terms to be added by introducing
additional graph edges. For instance, the head symbol of each term and its position are indexed in our
data structure.</p>
        <p>The presentation of LEO-II’s type system (simple types) is omitted here. With
respect to LEO-II’s term indexing, typing only provides an additional criterion for the
distinction between terms, for example, different occurrences of the same de Bruijn
index may have different types. Apart from this, typing has no further impact on the
indexing mechanism. Overall correctness is ensured outside the index structure, that is,
by indexing only terms in βη normal form.
</p>
      </sec>
      <sec id="sec-33-4">
        <title>The Index</title>
        <p>The indexing mechanism we describe here supports fast access to indexed terms and
their subterms. The index will be used in LEO-II to store intermediate results (consisting
of clauses, literals, and terms in literals) of LEO-II’s resolution based proof procedure.
The idea is that these intermediate results are always kept in βη normal form (that is, η
short and β normal; for example, the term (λ.(λ.(x1 x0)))((λ.(g x0))c) has the βη normal
form (gc)). Hence, βη normalisation is an invariant of our approach and we assume that
we never insert non-normal terms into an index.</p>
        <p>Key features of our indexing mechanism are a shared representation of terms (see
Section 3.1), the use of partial syntax trees to speed up logical computations (see
Section 3.2) and the indexing of subterm occurrences (see Section 3.3). For the
implementation of explicit substitutions [FKP96, ACCL90], additional support is offered by
indexing of bound variable occurrences (see Section 3.4).</p>
        <p>Partial syntax trees are used to index occurrences of symbols and subterms within a
term. They help to avoid occurs checks and to prune superfluous branches early in the
implementation of operations like replacement, substitution or β-reduction. Checking
whether or not a symbol occurs within a term or in a given branch of its syntax tree
requires only constant time.</p>
        <p>The implementation of the index is furthermore based on the use of cascaded
hashtables, for example, to index application terms according to their function or argument
term. This allows requests for terms in a style similar to SQL [Ame92]. For example,
indexing of applications is realised similar to an SQL table Applications featuring the
columns appl for application terms, func for their respective function terms and arg for
their argument terms. Retrieval of terms is similar to SQL queries like “select appl
from Applications where func=t”, which returns terms whose function term is t.
</p>
      </sec>
      <sec id="sec-33-5">
        <title>Shared Representation of Terms</title>
        <p>Terms in LEO-II have a perfectly shared representation, that is, all occurrences of
syntactically equal terms (in de Bruijn notation) are represented by a single instance. An
exception are bound variables, where instances of the same variable may have different
de Bruijn indices. The treatment of bound variables is described further in Section 3.4.</p>
        <p>Terms are represented as term nodes. Term nodes are numbered by n ∈ {1, 2, . . .} in
the order they are created. In the following, term nodes are referred to either by their
number or by their graph representation, which is defined as follows:
• For each symbol s ∈ Σ occurring in some term, a term node symbol(s) is created.
• For each bound variable xi occurring in some term, a term node bound(i) is created.
• If an application s@t occurs in some term, where s is represented by term node i
and t by term node j, a term node application(i, j) is created.
• If an abstraction λt occurs in some term, where t is represented by term node i, a
term node abstraction(i) is created.</p>
        <p>This graph representation of terms is implemented using hashtables:
• Hashtable abstr with scope : IN → IN is used to look up abstractions with a
given scope i.
• Hashtable appl with func : IN → IN → IN is used to look up an application with
a given function i and argument j.
• Hashtable appl with arg : IN → IN → IN is used to look up an application with
a given argument j and function i. This is similar to appl with func, but the
hashtable keys are used here in reversed order.</p>
        <p>This hashtable system can be employed to retrieve term nodes in a similar way as
in a relational database. It can be used to retrieve single terms as well as sets of terms,
for example, all application term nodes whose function term is represented by node i.
</p>
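        <p>A minimal realisation of these numbered term nodes and lookup tables might look
as follows (an illustrative sketch, not LEO-II's code):</p>

```python
class TermIndex:
    """Numbered term nodes with SQL-style lookup tables.

    Nodes are numbered in creation order. appl_with_func and appl_with_arg
    answer queries like "all applications whose function term is node i"."""
    def __init__(self):
        self.nodes = {}             # graph representation -> node number
        self.appl_with_func = {}    # func node -> {arg node: appl node}
        self.appl_with_arg = {}     # arg node -> {func node: appl node}
        self.abstr_with_scope = {}  # scope node -> abstraction node

    def _intern(self, key):
        if key not in self.nodes:
            self.nodes[key] = len(self.nodes) + 1
        return self.nodes[key]

    def symbol(self, s):
        return self._intern(("symbol", s))

    def bound(self, i):
        return self._intern(("bound", i))

    def application(self, i, j):
        n = self._intern(("application", i, j))
        self.appl_with_func.setdefault(i, {})[j] = n
        self.appl_with_arg.setdefault(j, {})[i] = n
        return n

    def abstraction(self, i):
        n = self._intern(("abstraction", i))
        self.abstr_with_scope[i] = n
        return n

# Index f@a and f@b, then "select appl from Applications where func=f":
idx = TermIndex()
f, a, b = idx.symbol("f"), idx.symbol("a"), idx.symbol("b")
idx.application(f, a)
idx.application(f, b)
print(sorted(idx.appl_with_func[f].values()))  # [4, 5]
```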
      </sec>
      <sec id="sec-33-6">
        <title>Partial Syntax Trees</title>
        <p>Term indexing in LEO-II is based on partial syntax trees (PST), a concept that is
newly introduced in this paper. Partial syntax trees are used to indicate positions of a
symbol or a particular subterm within a term. PSTs are called partial because they only
represent relevant parts of a term. Examples are PSTs recording symbol occurrences
in a term, where relevant part means that the term part in question actually contains
an occurrence of that symbol. Such PSTs allow for early detection of branches in a
term’s syntax tree with no occurrences of a specific symbol, since these branches are not
represented in the PST for this symbol.</p>
        <p>In LEO-II’s term system (recall that it is based on the simply typed λ-calculus
with Currying), a term position is defined as follows:
• Symbol nodes and bound variable nodes have no children in a term’s syntax tree,
whereas abstraction nodes have exactly one child and application nodes exactly two.
The position of a child relative to its parent node is described by abstr (the relation
between an abstraction node and its scope), func, or arg (the relations between an
application node and its function term and its argument term, respectively).
• A position is a (possibly empty) sequence of relative positions. Starting
from the top position in a term, each entry of the sequence describes one traversal
step in the term’s syntax tree.
• The empty sequence of relative positions represents the root position or empty
position, which is the topmost position in a term.</p>
        <p>Consider, for example, the term (λ.x0)@(f @a). Its subterms occur at the following
positions:
• (λ.x0)@(f @a) at the empty position
• λ.x0 at [func]
• x0 at [func; abstr]
• f @a at [arg]
• f at [arg; func]
• a at [arg; arg]</p>
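The traversal behind this notion of positions can be sketched in Python (the tuple encoding of terms is ours, chosen for illustration only):

```python
# Enumerate the (position, subterm) pairs of a term. Terms are nested tuples:
# ('sym', s), ('bnd', i), ('app', f, a), ('abs', body). A position is a tuple
# of relative positions drawn from 'func', 'arg' and 'abstr'; the empty tuple
# is the root position.

def positions(term, here=()):
    yield here, term
    tag = term[0]
    if tag == 'app':
        yield from positions(term[1], here + ('func',))
        yield from positions(term[2], here + ('arg',))
    elif tag == 'abs':
        yield from positions(term[1], here + ('abstr',))

# the example term (λ.x0)@(f @a):
example = ('app', ('abs', ('bnd', 0)),
                  ('app', ('sym', 'f'), ('sym', 'a')))
```

Running `dict(positions(example))` yields exactly the six positions listed above.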
        <p>Based on this notion of positions, we introduce the notion of partial syntax trees.
As an example3, consider the term a = 0 · a, which translates in Curried form to
(= @a)@((·@0)@a). (The figure showing this term's syntax tree is not reproduced here.)</p>
        <p>A partial syntax tree (PST) is a tree of nodes corresponding to positions in a term.
Each term position which occurs in a PST is represented as a node which
• has up to three child trees4 (these children are partial syntax trees which
correspond to the terms at one of the relative positions abstr, func or arg), and
• may be annotated by some data.</p>
        <p>A partial syntax tree t is denoted by pst (tabstr , tfunc , targ ), where tabstr is the PST of
the scope of t if t is an abstraction, and where tfunc and targ are the PSTs of the function
term and the argument term of t if t is an application. If no position in a branch of
the syntax tree is annotated by some data, this branch is empty and is denoted by an
underscore ( ).</p>
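The PST datatype just defined can be sketched as follows; the dataclass encoding and all names are ours, not LEO-II's, and `None` plays the role of the empty branch written as an underscore in the text:

```python
# A PST node has up to three child PSTs (abstr, func, arg) and an optional
# annotation. None stands for the empty branch.

from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class PST:
    abstr: Optional['PST'] = None
    func: Optional['PST'] = None
    arg: Optional['PST'] = None
    data: Any = None

# The PST of all occurrences of a in (= @a)@((·@0)@a), cf. pa1..pa5 below:
pa3 = PST(data='a')
pa5 = PST(data='a')
pa2 = PST(arg=pa3)
pa4 = PST(arg=pa5)
pa1 = PST(func=pa2, arg=pa4)
```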
        <p>The PST corresponding to the whole term in the above example mirrors the term's
complete syntax tree; its nodes at the symbol positions are annotated with =, a, ·, 0,
and a, respectively (the figure is not reproduced here).</p>
        <p>3We present a simple first-order example here, since the treatment of bound variables is special
and is described later in Section 3.4.</p>
        <p>4The data structure of PSTs cannot only be partial; it can also exceed the syntax tree of
terms as defined in Section 2, namely when all three children of a node are nonempty. This may be the case when
using PSTs to represent coordinates, that is, term positions which occur in some term. This is used when
building the index as described in Section 3.3, and is similar to Stickel's path indexing and coordinate
indexing methods [Sti89, McC92].
</p>
        <p>When the term is added to the index, however, it is not the PST of the entire term
that is recorded, but one PST for each of the occurring symbols (and subterms).</p>
        <p>If a symbol occurs at a term position, the corresponding PST entry is annotated by
that symbol. If a branch of a term’s syntax tree has no occurrences of the symbol in
question, the PST contains no entry for this branch. The PST for occurrences of symbol
a in the above example is thus given by:
pa1 = pst (_, pa2, pa4)
pa2 = pst (_, _, pa3)
pa3 = pst (_, _, _) with annotation a
pa4 = pst (_, _, pa5)
pa5 = pst (_, _, _) with annotation a</p>
        <p>Similarly, the PSTs for the remaining symbols =, ·, and 0 are recorded (figures not
reproduced here).</p>
        <p>If the PST of all occurrences of a symbol (or subterm) t′ in a given term t is available,
this provides a basis for speeding up replacements of t′. A costly occurs check is also
avoided, since the existence or non-existence of a PST for a symbol can be used as a
criterion. The nodes of the PST for t′ determine the nodes in t that have to be modified
when performing the replacement operation; all nodes in t that are not represented
in the PST for t′ remain unchanged (i.e., the recursion over the term structure for
replacement operations is pruned early).</p>
        <p>When replacing a by (f @b) in the above example, the operation proceeds as follows:
• The operation starts at the root position with term (= @a)@((·@0)@a) and with the
corresponding PST for a, pa1 = pst (_, pa2, pa4). As both the function child pa2
and the argument child pa4 of the PST are nonempty, the replacement operation
recurses over both the function term (= @a) and the argument term ((·@0)@a).
• To replace a in (= @a) with corresponding PST pa2 = pst (_, _, pa3), only the
argument term has to be processed. The child PST corresponding to the function
term in pa2 is empty, indicating that there are no occurrences of a in that subterm.
The argument term ((·@0)@a) with PST pa4 = pst (_, _, pa5) is treated analogously.
• Finally, a is replaced in its two occurrences as the term a itself, with corresponding
PSTs pa3 and pa5. Both pa3 and pa5 have no child nodes and are annotated with a,
so the result is in both cases the replacement term (f @b).</p>
        <p>The result of this operation is thus (= @(f @b))@((·@0)@(f @b)).</p>
        <p>During the operation, only those branches of the syntax tree with an actual
occurrence of a are processed and branches with no occurrences of a, here the terms =
and (·@0), are avoided. In this example, only five out of nine term nodes have to be
processed due to the guidance provided by the PST.</p>
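The PST-guided replacement just described can be sketched as follows. This is our own illustrative reconstruction over a hypothetical tuple encoding of terms, not LEO-II's OCaml implementation:

```python
# Replacement guided by a PST: only branches in which the PST records an
# occurrence are descended into; everything else is returned unchanged.
# Terms are tuples ('sym', s), ('bnd', i), ('app', f, a), ('abs', body).

from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class PST:
    abstr: Optional['PST'] = None
    func: Optional['PST'] = None
    arg: Optional['PST'] = None
    data: Any = None

def replace(term, pst, new):
    if pst is None:               # no occurrence below this position: prune
        return term
    if pst.data is not None:      # annotated node: this is an occurrence
        return new
    tag = term[0]
    if tag == 'app':
        return ('app', replace(term[1], pst.func, new),
                       replace(term[2], pst.arg, new))
    if tag == 'abs':
        return ('abs', replace(term[1], pst.abstr, new))
    return term
```

In the running example, the recursion never enters the subterms = and (·@0), because their branches are absent from the PST for a.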
        <p>As an indicator of the speedup obtainable for replacements in this way, we investigate
the ratio of PST size to term size, both counted in tree nodes, that is, the number of
abstractions, applications, symbols, and bound variables. As illustrated above, this
ratio is a measure of the speedup we expect for replacement operations.</p>
        <p>In the above example, the term size is 9 (we have 9 nodes), which gives the following
ratios for the occurring symbols:
Symbol   PST size   PST size/term size
a        5          0.56
=        3          0.33
·        4          0.44
0        4          0.44</p>
        <p>We examined an excerpt of Jutting's Automath encoding of Landau's book
Grundlagen der Analysis [vBJ77, Lan30] with over 900 definitions and theorems (see Section 4
for details) and found an average PST size/term size ratio of 0.21 for symbol
occurrences. When nonprimitive terms (that is, applications and abstractions) are indexed as
well, this ratio dropped to 0.12.</p>
        <p>The index records whether and at which positions a subterm5 occurs in a term. As in
a relational database, both the subterms occurring in a given term and the terms in which
a given subterm occurs are indexed. Thus, the index can be used to find terms in the
database with occurrences of particular subterms and also to speed up logical operations
such as substitution by avoiding occurs checks.</p>
        <p>The index is built recursively, starting from symbols, which are the leaf nodes in
a term's syntax tree. The only term which occurs in a symbol is the symbol itself, at
root position. Nonprimitive terms, that is, abstractions and applications, are built up as
follows:
• The subterms occurring in an abstraction are all subterms which occur in the scope
of the abstraction. For a symbol whose occurrences in a term A are recorded in
the PST t′, its occurrences in the abstraction λ.A are given by t = pst (t′, _, _).
• The subterms occurring in an application are all subterms which occur in its
function term or in its argument term. For a symbol whose occurrences in the
function and argument term are recorded in the PSTs t′func and t′arg, the PST t
recording its occurrences in the application is given by t = pst (_, t′func, t′arg). If the
term occurs only in the argument or only in the function of an application,
t′func or t′arg, respectively, is empty.</p>
        <p>Furthermore, each primitive and nonprimitive term is recorded to occur as a subterm
of itself at root position.</p>
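The bottom-up construction of the symbol-occurrence PSTs can be sketched as follows (our own reconstruction over the tuple term encoding used in the earlier sketches; it covers symbol occurrences only, not nonprimitive subterms):

```python
# Build, for every symbol of a term, the PST of its occurrences: a symbol
# occurs in itself at root position; an abstraction wraps the scope's PSTs
# under abstr; an application merges the function's PSTs under func and the
# argument's under arg.

from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class PST:
    abstr: Optional['PST'] = None
    func: Optional['PST'] = None
    arg: Optional['PST'] = None
    data: Any = None

def symbol_psts(term):
    """Map each symbol to the PST of its occurrences in `term`."""
    tag = term[0]
    if tag == 'sym':
        return {term[1]: PST(data=term[1])}
    if tag == 'bnd':
        return {}
    if tag == 'abs':
        return {s: PST(abstr=p) for s, p in symbol_psts(term[1]).items()}
    # application: merge the occurrence PSTs of function and argument
    fpsts, apsts = symbol_psts(term[1]), symbol_psts(term[2])
    return {s: PST(func=fpsts.get(s), arg=apsts.get(s))
            for s in set(fpsts) | set(apsts)}
```

Applied to (= @a)@((·@0)@a), this reproduces the PST pa1 for a and the three-node PST for = described above.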
        <p>The result is a PST for each subterm of the term to be indexed, describing the
occurrences of this subterm. These PSTs are added to the hashtable occurrences.
Additionally, terms are indexed according to their subterms in a second hashtable occurs in.
A third hashtable, occurs at, indexes terms according to subterms at a given term
position. Thus the core of the index consists of:</p>
        <p>5In the current implementation, both occurring symbols and nonprimitive subterms are indexed.
However, we plan to further evaluate the tradeoff between the speedup gained this way and the cost
of maintaining the index. Depending on this evaluation, we may want to restrict the nonprimitive
subterms to be indexed, for example, by using their size as a criterion. Similar ideas have been examined
by McCune [McC92], for example, the effect of limitations on the length of paths used in path indexing.
• Hashtable occurrences : IN → IN → PST indexes occurrences of subterms (the
second key) in a given term (the first key). The indexed value is a PST of the
positions where the subterm occurs. If a subterm does not occur, then there is no
entry in the hashtable.
• Hashtable occurs in : IN → IN ∗ is used to index a list of all terms in which a
given subterm (the key) occurs.
• Hashtable occurs at : pos → IN → IN ∗ indexes all terms in which
a given subterm (the second key) occurs at a given position (the first key).</p>
        <p>For example, occurrences of symbol a in the example term (= @a)@((·@0)@a) are
indexed by the following hashtable updates (we assume that a is represented by term
node i and (= @a)@((·@0)@a) by term node j):
• add the PST for a (pa1 above) with first key j and second key i to occurrences
• add j to the list with key i in occurs in
• add j to the set hashed in occurs at with first key [func; arg] and second key i;
if no such set exists in the hashtable, add the singleton {j}
• add j to the set hashed in occurs at with first key [arg; arg] and second key i; if
no such set exists in the hashtable, add the singleton {j}</p>
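These updates can be sketched as follows. For brevity this toy version (names follow the text, the update logic is ours) stores a set of positions where the real index stores a PST:

```python
# The three core index tables, with term nodes as plain ints and positions
# as tuples of relative positions.

from collections import defaultdict

occurrences = {}              # (term, subterm) -> set of positions (PST in LEO-II)
occurs_in = defaultdict(set)  # subterm -> terms containing it
occurs_at = defaultdict(set)  # (position, subterm) -> terms

def record(term, subterm, position):
    occurrences.setdefault((term, subterm), set()).add(position)
    occurs_in[subterm].add(term)
    occurs_at[(position, subterm)].add(term)

# occurrences of symbol a (node i) in (= @a)@((·@0)@a) (node j):
i, j = 2, 8
record(j, i, ('func', 'arg'))
record(j, i, ('arg', 'arg'))
```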
        <p>The basic operations of adding a term to the index take constant time (except for
rehashing). The indexing of a term of length n takes time O(n).
</p>
        <p>3.4 Bound Variables</p>
        <p>Bound variables play a special role in the term system. To see this, recall our
example from the beginning, that is, the term λa.λb.((= b)((λc.(cb))a)) or, with de
Bruijn indices, λλ.((= x0)((λ.(x0x1))x1)). This example shows that two occurrences of
the same bound variable may have syntactically different de Bruijn indices and that the
de Bruijn indices of occurrences of different variables may be syntactically equal. It is
desirable to provide quick access to all variables bound by a given binder, to speed up
β-reduction and related operations such as raising or lowering of bound variable indices.
We will now illustrate our solution to this issue. Remember that indexed terms are
always kept in βη normal form, hence, normalisation is mandatory after instantiation
of existential variables or expansion of defined constants (if the modified terms shall be
indexed again).</p>
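The grouping of variable occurrences by their binder can be sketched as follows (our own illustration over the tuple term encoding; each binder is labelled by its own position):

```python
# Walk a de Bruijn term and group bound-variable occurrences by the λ-binder
# that binds them. Terms are tuples ('sym', s), ('bnd', i), ('app', f, a),
# ('abs', body); index 0 refers to the innermost enclosing λ.

def binder_occurrences(term, here=(), binders=(), out=None):
    out = {} if out is None else out
    tag = term[0]
    if tag == 'abs':
        out.setdefault(here, [])          # the λ's own position labels it
        binder_occurrences(term[1], here + ('abstr',), binders + (here,), out)
    elif tag == 'app':
        binder_occurrences(term[1], here + ('func',), binders, out)
        binder_occurrences(term[2], here + ('arg',), binders, out)
    elif tag == 'bnd':
        i = term[1]
        if i < len(binders):              # bound by the (i+1)-th enclosing λ
            out[binders[-1 - i]].append(here)
    return out
```

On the example term λλ.((= x0)((λ.(x0 x1)) x1)), the second binder (b) collects exactly the two occurrences that the PST p1..p9 below records, even though they carry the different indices x0 and x1.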
        <p>The syntax tree of our example term in de Bruijn notation is shown as a figure in
the original (not reproduced here).</p>
        <p>In this term, the bound variable b has two instances, denoted by x0 and x1, while
x0 (resp. x1) can denote both c and b (resp. b and a). Since bound variables are
indexed as described in Section 3.3, the information on where to find the variables
bound by one particular λ-binder is somewhat scattered. This kind of information,
however, is important in practice, for example, to support efficient β-reduction. We
therefore once more employ PSTs to describe the occurrences of the variables bound by
one and the same λ-binder (the figures for the variables bound by λ1, λ2, and λ3 are
not reproduced here).</p>
        <p>For example, the PST indicating occurrences of variables bound by λ2 in the above
example is given by:
p1 = pst (p2, _, _)
p2 = pst (p3, _, _)
p3 = pst (_, p4, p6)
p4 = pst (_, _, p5)
p5 = pst (_, _, _) with annotation 0
p6 = pst (_, p7, _)
p7 = pst (p8, _, _)
p8 = pst (_, _, p9)
p9 = pst (_, _, _) with annotation 1</p>
        <p>Input: Two fluent terms Z1, Z2.</p>
        <p>Output: A complete set of substitutions θ such that Z1 #θAC1 Z2.</p>
        <p>1. Deterministically match as many fluents of Z1 as possible to fluents of Z2. Substitute Z1 with
the substitution found. If some fluent of Z1 does not match any fluent of Z2, decide Z1 !θAC1 Z2.
2. Match as many fluents of Z1 as possible to fluents of Z2 using ObjCon-based deterministic matching.</p>
        <p>Substitute Z1 with the substitution found. If some fluent of Z1 does not match any fluent of Z2,
decide Z1 !θAC1 Z2.
3. Build the substitution graph (V, E) for Z1 and Z2 with nodes v = (µ, i) ∈ V , where µ is a
matching candidate for Z1 and Z2, i.e., matches some fluent at position i in Z1 to some fluent in
Z2, and i ≥ 1 is referred to as the layer of v. Two nodes (µ1, i1) and (µ2, i2) are connected with an
edge iff µ1µ2 = µ2µ1 and i1 ≠ i2. Delete all nodes (µ, i) with Xµ = o, for some X ∈ Vars (Z1)
and o ∈ Obj (Z2), and ObjCon(X, Z1, d) ! ObjCon(o, Z2, d) for some d. Find all cliques of size
|Z1| in (V, E).</p>
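The substitution-graph step can be sketched in Python. This is a deliberately simplified toy reconstruction, not the paper's algorithm: a matching candidate is a dict of variable bindings, the commutation test µ1µ2 = µ2µ1 is replaced by an agreement check on shared variables, the ObjCon-based node deletion is omitted, and layer-wise combinations are enumerated instead of running a dedicated clique algorithm. All names are ours.

```python
# Fluents are pairs (predicate, argument tuple); variables of Z1 are
# uppercase strings, objects of Z2 lowercase (an assumption of this sketch).

from itertools import product

def candidates(z1, z2):
    """Per layer i, all bindings matching fluent z1[i] to some fluent of z2."""
    out = []
    for p1, args1 in z1:
        layer = []
        for p2, args2 in z2:
            if p1 == p2 and len(args1) == len(args2):
                layer.append(dict(zip(args1, args2)))
        out.append(layer)
    return out

def compatible(m1, m2):
    """Stand-in for the commutation test: agreement on shared variables."""
    return all(m1[x] == m2[x] for x in m1.keys() & m2.keys())

def all_matchers(z1, z2):
    """Consistent combinations, i.e. cliques with one node per layer."""
    result = []
    for combo in product(*candidates(z1, z2)):
        if all(compatible(a, b)
               for i, a in enumerate(combo) for b in combo[i + 1:]):
            merged = {}
            for m in combo:
                merged.update(m)
            result.append(merged)
    return result
```

A clique of size |Z1| contains exactly one node per layer, which is why pairwise-compatible one-per-layer combinations correspond to complete matchers.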
        <p>Algorithm 1: ObjCon-alltheta.</p>
        <p>We omit the algorithm for computing all cliques in a substitution graph; it can
be found in [KRS06].
</p>
        <p>6.2 Experimental Evaluation</p>
        <p>
Figure 4 depicts the timing comparison between the LitCon-based subsumption
reasoner, referred to as AllTheta, and its ObjCon-based opponent, referred to as
FluCaP. The results were obtained using RedHat Linux running on a 2.4GHz Pentium
IV machine with 2GB of RAM.</p>
        <p>We demonstrate the advantages of exploiting the object-based context information
on problems that stem from the colored Blocksworld and Pipesworld planning
scenarios. The Pipesworld domain models the flow of oil-derivative liquids through pipeline
segments connecting areas, and is inspired by applications in the oil industry. Liquids
are modeled as batches of a certain unit size. A segment must always contain a certain
number of batches (i.e., it must always be full). Batches can be pushed into pipelines
from either side, leading to the batch at the opposite end “falling” into the incident
area. Batches have associated product types, and batches of certain types may never
be adjacent to each other in a pipeline. Moreover, areas may have constraints on
how many batches of a certain product type they can hold.</p>
        <p>For each problem, 1000 subsumption tests were performed, with a time limit of
100 minutes. The results show that FluCaP scales better than AllTheta; this is best
observed on the fourteen-, twenty-, and thirty-block problems. As the empirical results
demonstrate, the optimal value of the depth parameter for Blocksworld and Pipesworld
is four.</p>
        <p>The main reason for the computational gain of FluCaP is that it is less sensitive
to the growth of the depth parameter. Under the condition that the number of objects
in a state is strictly less than the number of fluents and other parameters are fixed,
the amount of object-based context information is strictly less than the amount of the
literal-based context information. Moreover, on the Pipesworld problems, FluCaP
requires two orders of magnitude less time than AllTheta.
</p>
        <p>Figure 4 (plots not reproduced): subsumption time in ms against context depth (1 to 5)
for FluCaP and AllTheta, on Blocksworld problems with 10, 20, and 30 blocks and on
Pipesworld problems with 4, 6, 8, and 10 areas.</p>
        <p>
We follow the symbolic dynamic programming (SDP) approach within the Situation
Calculus (SC) of [BRP01] in using first-order state abstraction for FOMDPs. In the course
of first-order value iteration, the state space may contain redundant abstract states that
dramatically affect the algorithm's efficiency. To achieve computational savings,
normalization must be performed to remove this redundancy. However, in the original
work [BRP01] this was done by hand; to the best of our knowledge, the preliminary
implementation of the SDP approach within SC uses human-provided rewrite rules for
logical simplification. In contrast, [HS04] have developed an automated normalization
procedure for FOVI that brings a computational gain of several orders of magnitude.
Another crucial difference is that our algorithm uses heuristic search to limit the number
of states for which a policy is computed.</p>
        <p>The ReBel algorithm by [KvOdR04] relates to LIFT-UP in that it also uses a
representation language that is simpler than Situation Calculus. This feature makes the
state space normalization computationally feasible.</p>
        <p>All the above algorithms can be classified as deductive approaches to solving FOMDPs.
They can be characterized by the following features: (1) they are model-based, (2) they
aim at exact solutions, and (3) logical reasoning methods are used to compute
abstractions. We should note that FOVI aims at an exact solution for a FOMDP, whereas
LIFT-UP, due to its heuristic search that avoids evaluating all states, seeks an
approximate solution. It is therefore more appropriate to classify LIFT-UP as an
approximate deductive approach to FOMDPs.</p>
        <p>In another vein, there is some research on developing inductive approaches to solving
FOMDPs, e.g., by [FYG03]. The authors propose the approximate policy iteration
(API) algorithm, where they replace the use of cost-function approximations as policy
representations in API with direct, compact state-action mappings, and use a standard
relational learner to learn these mappings. A recent approach by [GT04] proposes an
inductive policy construction algorithm that strikes a middle-ground between deductive
and inductive techniques.
</p>
        <p>8 Conclusions</p>
        <p>
We have proposed a new approach that combines heuristic search and first-order state
abstraction for solving first-order MDPs more efficiently. In contrast to existing systems,
which start with propositionalizing the decision problem at the outset of any solution
attempt, we perform lifted reasoning on the first-order structure of an MDP directly.
However, there is plenty remaining to be done. For example, we are interested in the
question of to what extent the optimization techniques applied in modern propositional
planners can be combined with first-order state abstraction.</p>
        <p>Acknowledgements
We thank anonymous reviewers for their comments on the previous versions of this
paper. We also thank Zhengzhu Feng for fruitful discussions and for providing us with
the executable of the symbolic LAO∗ planner. Olga Skvortsova was supported by a
grant within the Graduate Programme GRK 334 "Specification of discrete processes and
systems of processes by operational models and logics" under the auspices of the Deutsche
Forschungsgemeinschaft (DFG).
</p>
        <p>[BBS95] A. G. Barto, S. J. Bradtke, and S. P. Singh. Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1-2):81–138, 1995.</p>
        <p>[BDH99] C. Boutilier, T. Dean, and S. Hanks. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11:1–94, 1999.</p>
        <p>[Bel57] R. Bellman. Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.</p>
        <p>[BRP01] C. Boutilier, R. Reiter, and B. Price. Symbolic dynamic programming for first-order MDPs. In Bernhard Nebel, editor, Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI'2001), pages 690–700. Morgan Kaufmann, 2001.</p>
        <p>[DKKN95] T. Dean, L. Kaelbling, J. Kirman, and A. Nicholson. Planning under time constraints in stochastic domains. Artificial Intelligence, 76:35–74, 1995.</p>
        <p>[FH02] Z. Feng and E. Hansen. Symbolic heuristic search for factored Markov Decision Processes. In R. Dechter, M. Kearns, and R. Sutton, editors, Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI'2002), pages 455–460, Edmonton, Canada, 2002. AAAI Press.</p>
        <p>[FYG03] A. Fern, S. Yoon, and R. Givan. Approximate policy iteration with a policy language bias. In S. Thrun, L. Saul, and B. Schölkopf, editors, Proceedings of the Seventeenth Annual Conference on Neural Information Processing Systems (NIPS'2003), Vancouver, Canada, 2003. MIT Press.</p>
        <p>[GT04] C. Gretton and S. Thiebaux. Exploiting first-order regression in inductive policy selection. In M. Chickering and J. Halpern, editors, Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI'2004), Banff, Canada, July 2004. Morgan Kaufmann.</p>
        <p>[HS90] S. Hölldobler and J. Schneeberger. A new deductive approach to planning. New Generation Computing, 8:225–244, 1990.</p>
        <p>[HS04] S. Hölldobler and O. Skvortsova. A logic-based approach to dynamic programming. In Proceedings of the Workshop on "Learning and Planning in Markov Processes – Advances and Challenges" at the Nineteenth National Conference on Artificial Intelligence (AAAI'04), pages 31–36, San Jose, CA, July 2004. AAAI Press.</p>
        <p>[HZ01] E. Hansen and S. Zilberstein. LAO*: A heuristic search algorithm that finds solutions with loops. Artificial Intelligence, 129:35–62, 2001.</p>
        <p>[KL94] J.-U. Kietz and M. Lübbe. An efficient subsumption algorithm for inductive logic programming. In Proceedings of the Eleventh International Conference on Machine Learning, pages 130–138, 1994.</p>
        <p>[KN86] D. Kapur and P. Narendran. NP-completeness of the set unification and matching problems. In J. H. Siekmann, editor, Proceedings of the Conference on Automated Deduction, pages 489–495. Springer, Berlin, 1986. Lecture Notes in Computer Science 230.</p>
        <p>[KRS06] E. Karabaev, G. Rammé, and O. Skvortsova. Efficient symbolic reasoning for first-order MDPs. In Proceedings of the Workshop on "Planning, Learning and Monitoring with Uncertainty and Dynamic Worlds" at the Seventeenth European Conference on Artificial Intelligence (ECAI'2006), Riva del Garda, Italy, 2006. To appear.</p>
        <p>[KvOdR04] K. Kersting, M. van Otterlo, and L. de Raedt. Bellman goes relational. In C. E. Brodley, editor, Proceedings of the Twenty-First International Conference on Machine Learning (ICML'2004), pages 465–472, Banff, Canada, July 2004. ACM.</p>
        <p>[Put94] M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley &amp; Sons, Inc., New York, NY, 1994.</p>
        <p>[Rei91] R. Reiter. The frame problem in the situation calculus: A simple solution (sometimes) and a completeness result for goal regression. In Vladimir Lifschitz, editor, Artificial Intelligence and Mathematical Theory of Computation: Papers in Honor of John McCarthy, pages 359–380. Academic Press, San Diego, CA, 1991.</p>
        <p>[Rob65] J. A. Robinson. A machine-oriented logic based on the resolution principle. Journal of the Association for Computing Machinery, 12(1):23–41, 1965.</p>
        <p>[SHW96] T. Scheffer, R. Herbrich, and F. Wysotzki. Efficient θ-subsumption based on graph algorithms. In Proceedings of the 6th International Workshop on Inductive Logic Programming, pages 212–228, Berlin, August 1996. Volume 1314 of LNAI.</p>
        <p>[Thi98] M. Thielscher. Introduction to the fluent calculus. Electronic Transactions on Artificial Intelligence, 2(3-4):179–192, 1998.</p>
        <p>[YLWA05] H. L. S. Younes, M. L. Littman, D. Weissman, and J. Asmuth. The first probabilistic track of the International Planning Competition. Journal of Artificial Intelligence Research, 24:851–887, 2005.</p>
        <p>Multiple Preprocessing for Systematic SAT Solvers</p>
        <p>Anbulagan1, John Slaney1,2
1Logic and Computation Program, National ICT Australia Ltd.
2Computer Sciences Laboratory, Australian National University
{anbulagan, john.slaney}@nicta.com.au</p>
        <p>High-performance SAT solvers based on systematic search generally use either
conflict driven clause learning (CDCL) or lookahead techniques to gain efficiency.
Both styles of reasoning can gain from a preprocessing phase in which some form
of deduction is used to simplify the problem. In this paper we undertake an
empirical examination of the effects of several recently proposed preprocessors on both
CDCL and lookahead-based SAT solvers. One finding is that the use of multiple
preprocessors one after the other can be much more effective than using any one of
them alone, but that the order in which they are applied is significant. We intend
our results to be particularly useful to those implementing new preprocessors and
solvers.
In the last decade, propositional satisfiability (SAT) has become one of the most
intensively studied research problems within artificial intelligence (AI). This tendency can be
seen through the development of a number of powerful SAT solvers, based on either
systematic search or stochastic local search (SLS), for solving various hard
combinatorial search problems such as automatic deduction, hardware and software verification,
planning, scheduling, and FPGA routing.</p>
        <p>The power of contemporary systematic SAT solvers derives not only from the
underlying Davis-Putnam-Logemann-Loveland (DPLL) algorithm but also from
enhancements aimed at increasing the amount of unit propagation, improving the choices of
variables for splitting or making backtracking more intelligent. Two of the most
important such enhancements are conflict driven clause learning (CDCL), made practicable on
a large scale by the watched literal technique, and one-step lookahead. These two tend
to exclude each other: the most successful solvers generally incorporate one or the other
but not both. The benefits they bring are rather different too, as is clear from the results
of recent SAT competitions. For problems in the “industrial” category, CDCL, as
implemented in MINISAT [ES03, SE05], siege [Rya04] and zChaff [MMZ+01, ZMMM01] is
currently the method of choice. On random problems, however, lookahead-based solvers
such as Dew Satz [AS05], Kcnfs [DD01] and March dl [HvM06] perform better.</p>
        <p>Lookahead, of course, is expensive at every choice node, while clause learning is
expensive only at backtrack points. Since half of the nodes (plus one) in any binary
tree are leaves, this difference is significant for lookahead-based solvers which process
nodes relatively slowly and gain, if at all, by reducing the search tree size. Looking
ahead is an investment in at-node processing which can pay off only if it results in more
informed choices with an impact on the total number of nodes visited. Where learnt
clauses prune the tree at least as effectively as complex choice heuristics, CDCL must
win. This seems to be the case in many classes of highly structured problems such as the
“industrial” ones in the SAT competitions. We have no clearer definition than anyone
else of “structure”, but are interested to find ways in which lookahead-based solvers
might detect and exploit it as well as the clause learners do.</p>
        <p>A noteworthy feature of many recent systems is a preprocessing phase, often using
inference by some variant of resolution, to transform problems prior to the search.
One suggestion we wish to make and explore is that such transformations may help
lookahead-based solvers to discover useful structure. That is, much of the reasoning done
by nogood inference might be done cheaply “up-front”, provided that the subsequent
variable choice heuristics are good enough to exploit it. In what follows, we are therefore
concerned mainly with the effects of preprocessing, including those of using multiple
preprocessors in series, on the performance of a lookahead-based solver Dew Satz on
problems where it does poorly in comparison with a clause learning solver (MINISAT). We
also include some results and remarks on benefits to be gained in the opposite direction,
where MINISAT is helped by preprocessing to attack problems to which Dew Satz is more
suited.</p>
        <p>In this paper we propose a multiple preprocessing technique to boost the performance
of systematic SAT solvers. The motivation for applying multiple preprocessors prior to
the systematic search is clear: each preprocessor uses a different strategy to simplify
clause sets derived from real-world problems, which exhibit a great deal of structure,
such as symmetries, variable dependencies, clustering, and the like. Our initial
observations showed that each strategy works well for some problems, often turning them
from hard to easy. When a problem exhibits several different kinds of structure, a single
preprocessor may have difficulty simplifying them all; in this case, we need to run
multiple preprocessors one after the other.</p>
        <p>We report performance statistics for the two solvers, Dew Satz and MINISAT, with
and without combinations of five (and for one problem set, six) of the best contemporary
SAT preprocessors when solving parity, planning, bounded model checking and FPGA
routing benchmark problems from SATLIB and the recent SAT competitions. One
finding is that the use of multiple preprocessors one after the other can be much more
effective than using any one of them alone, but that the order in which they are applied
is significant. We intend our results to be particularly useful to those implementing new
preprocessors and solvers.</p>
        <p>The rest of the paper is organized as follows: Section 2 addresses related work;
Sections 3 and 4 briefly describe the preprocessors and solvers examined in our study;
the main part of the paper consists of experimental results; and we conclude with a few
remarks and suggestions.</p>
        <p>Resolution-based SAT preprocessors for CNF formula simplification have a dramatic
impact on the performance of even the most efficient SAT solvers on many benchmark
problems [LMS01]. The simplest preprocessor consists of just computing length-bounded
resolvents and deleting duplicate and subsumed clauses, as well as tautologies and any
duplicate literals in a clause.</p>
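The clause cleanup part of such a simple preprocessor can be sketched as follows (an illustrative toy in DIMACS-style integer literals, not an implementation from any of the cited systems):

```python
# Remove duplicate literals, drop tautologies and duplicate clauses, and
# delete subsumed clauses. Literals are nonzero ints; -l is the negation of l.

def cleanup(clauses):
    seen, out = set(), []
    for c in clauses:
        c = frozenset(c)                  # removes duplicate literals
        if any(-l in c for l in c):       # tautology x ∨ ¬x: always true
            continue
        if c in seen:                     # duplicate clause
            continue
        seen.add(c)
        out.append(c)
    # delete clauses subsumed by a smaller clause (subset of literals)
    out.sort(key=len)
    kept = []
    for c in out:
        if not any(k <= c for k in kept):
            kept.append(c)
    return kept
```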
        <p>Two previous works are most directly related. The first is that of Anbulagan
et al. [APSS06], which examined the integration of five resolution-based preprocessors,
alone or in combination, with stochastic local search (SLS) solvers. Their
experimental results show that SLS solvers benefit from the presence of resolution-based
preprocessing and multiple preprocessing techniques. The second is that of Lynce
and Marques-Silva [LMS01]. They empirically evaluated the impact of some
preprocessors developed before 2001, including 3-Resolution, on the performance of systematic
SAT solvers, but without considering multiple preprocessing. In recent years, many
other, more sophisticated preprocessors have been applied to modern propositional
reasoners. Among them are 2-SIMPLIFY [Bra01], the preprocessor in Lsat [OGMS02]
for recovering and exploiting Boolean gates, HyPre [Bac02, BW04], Shatter [ASM03]
for dealing with symmetry structure, NiVER [SP05], and SatELite [EB05]. We consider
some of these preprocessors plus 3-Resolution in our study.
3</p>
        <p>SAT Preprocessors
We describe briefly the six SAT preprocessors used in the experiments. The first
five are all based on resolution and its variants such as hyper-resolution.
Resolution [Qui55, DP60, Rob65] itself is widely used as a rule of inference in first order
automated deduction, where the clauses tend to be few in number and contain few
literals, and where the reasoning is primarily driven by unification. As a procedure for
propositional reasoning, however, resolution is rarely used on its own because in
practice it has not been found to lead to efficient algorithms. The sixth preprocessor is a
special-purpose tool for symmetry detection, which is important for one problem class
in the experiments.
3.1</p>
        <p>3-Resolution
k-Resolution is just saturation under resolution with the restriction that the parent
clauses are of length at most k. The special cases of 2-Resolution and 3-Resolution
are of most interest. 3-Resolution has been used in a number of SAT solvers, notably
Satz [LA97] and the SLS solver R+AdaptNovelty+ [APSS05] which won the satisfiable
random problem category in the SAT2005 competition. Since it is the preprocessor
already used by Satz, we expect it to work well with Dew Satz.
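</p>
        <p>The saturation loop just described can be sketched as follows (an illustrative Python sketch, not Satz's implementation); only pairs of clauses each of length at most k are resolved, tautological resolvents are discarded, and the loop repeats until no new clause appears:

```python
# k-Resolution saturation: resolve only pairs of clauses of length <= k.
# Clauses are frozensets of integer literals. Illustrative only.

def k_resolution(clauses, k=3):
    cnf = {frozenset(c) for c in clauses}
    changed = True
    while changed:
        changed = False
        short = [c for c in cnf if len(c) <= k]
        for i, a in enumerate(short):
            for b in short[i + 1:]:
                for lit in a:
                    if -lit in b:
                        r = (a - {lit}) | (b - {-lit})
                        # Skip tautological resolvents.
                        if any(-x in r for x in r):
                            continue
                        if r not in cnf:
                            cnf.add(r)
                            changed = True
    return cnf
```

Since resolvents of two clauses of length at most k have length at most 2k-2 and the literal universe is finite, the loop terminates.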
3.2</p>
        <p>2-SIMPLIFY
2-SIMPLIFY [Bra01] constructs an implication graph from all binary clauses in the
problem. Where there is an implication chain from a literal ¬X to X, X can be deduced
as a unit and propagated. The method also collapses strongly connected
components, propagates shared implications (literals implied in the graph by every
literal of some clause), and removes some redundant binary clauses. Experimental results
[Bra01, Bra04] show that systematic search benefits markedly from 2-SIMPLIFY on a
wide range of problems.
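</p>
        <p>The unit-deduction step described above can be sketched as follows (illustrative Python, not the 2-SIMPLIFY code): each binary clause (a or b) contributes the implication edges not-a to b and not-b to a, and any literal reachable from its own negation is forced true:

```python
# Derive forced units from the binary-clause implication graph.
from collections import defaultdict

def forced_units(binary_clauses):
    graph = defaultdict(set)
    lits = set()
    for a, b in binary_clauses:
        graph[-a].add(b)   # not-a implies b
        graph[-b].add(a)   # not-b implies a
        lits.update((a, b, -a, -b))

    def reaches(src, dst):
        # Simple DFS reachability in the implication graph.
        stack, seen = [src], {src}
        while stack:
            v = stack.pop()
            if v == dst:
                return True
            for w in graph[v] - seen:
                seen.add(w)
                stack.append(w)
        return False

    # If not-x reaches x, assuming not-x is contradictory, so x is a unit.
    return {x for x in lits if reaches(-x, x)}
```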
3.3</p>
        <p>HyPre
HyPre [BW04] also reasons with binary clauses, but incorporates full hyper-resolution,
making it more powerful than 2-SIMPLIFY. In addition, unit reduction and equality
reduction are incrementally applied to infer more binary clauses. It can be costly in
terms of time, but since it is based explicitly on hyper-resolution it avoids the space
explosion of computing a full transitive closure. HyPre has been used in the SAT solver,
2CLS+EQ [Bac02], and we consider it a very promising addition to many other solvers.
It is generally useful for exploiting implicational structure in large problems.
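</p>
        <p>A single hyper-binary-resolution step of the kind HyPre builds on can be sketched as follows (an illustrative simplification, not HyPre's incremental algorithm): an n-ary clause resolves in one step with binary clauses (not-l or u) for all but one of its literals, producing a new binary clause:

```python
# One hyper-binary-resolution step: clause (l1 ... ln) plus binaries
# (-l_i or u) for all i except j yields the binary (l_j or u).

def hyper_binary_resolvents(clause, binaries):
    bins = {frozenset(b) for b in binaries}
    out = set()
    cl = list(clause)
    universe = {u for b in bins for u in b}
    for j, keep in enumerate(cl):
        rest = cl[:j] + cl[j + 1:]
        for u in universe:
            # All other literals of the clause must imply u via a binary.
            if all(frozenset((-l, u)) in bins for l in rest):
                out.add(frozenset((keep, u)))
    return out
```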
3.4</p>
        <p>NiVER
Variable Elimination Resolution (VER) is an ancient inference method consisting of
performing all resolutions on a chosen variable and then deleting all clauses in which
that variable occurs, leaving just the resolvents. It is easy to see that this is a complete
decision procedure for SAT problems, and almost as easy to see that it is not practicable
because of exponential space complexity. Recently, Subbarayan and Pradhan [SP05]
proposed NiVER (Non increasing VER) which restricts the variable elimination to the
case in which there is no increase in the number of literals after elimination. This shows
promise as a SAT preprocessor, improving the performance of a number of solvers [SP05].
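</p>
        <p>The non-increasing elimination test can be sketched as follows (illustrative Python, not the NiVER implementation): a variable is eliminated by resolution only when its non-tautological resolvents contain no more literals in total than the clauses they replace:

```python
# One NiVER-style elimination attempt on a single variable.
# Clauses are iterables of integer literals. Illustrative only.

def niver_step(clauses, var):
    cnf = [frozenset(c) for c in clauses]
    pos = [c for c in cnf if var in c]
    neg = [c for c in cnf if -var in c]
    resolvents = []
    for p in pos:
        for n in neg:
            r = (p - {var}) | (n - {-var})
            if not any(-x in r for x in r):   # drop tautologies
                resolvents.append(r)
    old = sum(len(c) for c in pos + neg)
    new = sum(len(r) for r in resolvents)
    if new > old:                             # literal count would grow
        return clauses                        # refuse the elimination
    rest = [c for c in cnf if var not in c and -var not in c]
    return rest + resolvents
```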
3.5</p>
        <p>SatELite
Eén and Biere [EB05] proposed the SatELite preprocessor, which extends NiVER with
a rule of Variable Elimination by Substitution. Several additions including subsumption
detection and improved data structures further improved performance in both space and
time. SatELite was combined with MINISAT to form SatELiteGTI, the system which
dominated the SAT2005 competition on the crafted and industrial problem categories.
Since we use MINISAT for our experiments, it is obvious that SatELite should be one of
the preprocessors we consider.
3.6</p>
        <p>Shatter
It is clear that eliminating symmetries is essential to solving realistic instances of many
problems. None of the resolution-based preprocessors does this, so for problems that
involve a high degree of symmetry we added Shatter [AMS03] which detects symmetries
and adds symmetry-breaking clauses. These always increase the size of the clause set
and for satisfiable problems they remove some of the solutions, but they typically make
the problem easier by pruning away isomorphic copies of parts of the search space.
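</p>
        <p>For the special case of a symmetry that merely swaps two variables, the added symmetry-breaking clause can be sketched as follows (a deliberately minimal illustration; Shatter itself handles arbitrary permutations with chained lex-leader encodings):

```python
# For a symmetry swapping variables a and b (a earlier in the variable
# order), the lex-leader constraint reduces to x_a <= x_b, i.e. the
# single clause (not-a or b). Deliberately minimal sketch.

def break_swap_symmetries(clauses, swaps):
    extra = [[-a, b] for a, b in swaps]   # one clause per swapped pair
    return list(clauses) + extra
```

Note that, as stated above, the clause set only grows: the pruning happens later, inside the solver's search.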
4.1</p>
        <p>MINISAT
As noted in Section 1, we concentrate on just two solvers: MINISAT, which relies on
clause learning, and Dew Satz, which uses lookahead.</p>
        <p>Sörensson and Eén [ES03, SE05] released the MINISAT solver in 2005. Its design is based
on Chaff, particularly in that it learns nogoods or “conflict clauses” and accesses them
during the search by means of two watched literals in each clause. MINISAT is quite
small (a few hundred lines of code) and easy to use either alone or as a module of a
larger system. Its speed in comparison with similar solvers such as zChaff comes from a
series of innovations of which the most important are an activity-decay schedule which
proceeds by frequent small reductions rather than occasional large ones, and an inference
rule for reducing the size of conflict clauses by introducing a restricted subsumption test.
The cited paper contains a brief but informative description of these ideas.
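</p>
        <p>The flavour of that subsumption-based shrinking can be sketched as follows (our own illustrative sketch, not MINISAT's implementation; the data shapes are assumptions): a literal is dropped from a learnt clause when the reason clause that propagated its complement is subsumed by the remaining literals:

```python
# Shrink a learnt (conflict) clause by a restricted subsumption test.
# reasons maps an assigned literal to the clause that propagated it,
# e.g. reasons[-l] is the clause that forced -l. Illustrative only.

def minimize_conflict(learnt, reasons):
    learnt = set(learnt)
    for lit in list(learnt):
        reason = reasons.get(-lit)
        # If the reason's other literals already occur in the clause,
        # resolving on lit removes it without adding anything new.
        if reason and set(reason) - {-lit} <= learnt - {lit}:
            learnt.discard(lit)
    return learnt
```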
4.2</p>
        <p>Dew Satz
The solver Dew Satz [AS05] is a recent version of the Satz solver [LA97]. Like its parent
Satz, it gains efficiency by a restricted one-step lookahead scheme which rates some of
the neighbouring variables every time a choice must be made for branching purposes. Its
lookahead is more sophisticated than the original one of Satz, adding a DEW (dynamic
equality weighting) heuristic to deal with equalities. This enables the variable selection
process to avoid duplicating the work of weighting variables detected to be equivalent
to those already examined. Thus, while the solver has no special inference mechanism
for propositional equalities, it does deal tolerably well with problems containing them.
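</p>
        <p>The one-step lookahead idea can be sketched as follows (illustrative Python; the scoring combination is a common lookahead mix, not necessarily Satz's exact rule): each candidate variable is tentatively assigned both ways, unit propagation is run, and the variable whose two propagation counts are jointly largest is chosen for branching:

```python
# One-step lookahead branching: score a variable by how much unit
# propagation each of its two assignments triggers. Conflict handling
# is omitted for brevity. Illustrative only.

def propagate(clauses, lit):
    assigned = {lit}
    changed = True
    while changed:
        changed = False
        for c in clauses:
            if any(l in assigned for l in c):
                continue                      # clause already satisfied
            unassigned = [l for l in c if -l not in assigned]
            if len(unassigned) == 1 and unassigned[0] not in assigned:
                assigned.add(unassigned[0])   # unit clause: propagate
                changed = True
    return len(assigned) - 1                  # implied literals only

def pick_branch_var(clauses, candidates):
    def score(v):
        pos, neg = propagate(clauses, v), propagate(clauses, -v)
        return pos * neg + pos + neg          # a common lookahead mix
    return max(candidates, key=score)
```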
5</p>
        <p>We present results on four benchmark problem sets chosen to present challenges for one
or other or both of the SAT solvers. The experiments were conducted on a cluster of 16
AMD Athlon 64 processors running at 2 GHz with 2 GB of RAM. Ptime in the tables
represents preprocessing time, while Stime represents the solver's runtime excluding
Ptime. The time bound on Stime is 15,000 seconds per problem instance. It is worth
noting that in our study the results of SatELiteGTI, the solver which dominated the
SAT2005 competition on the crafted and industrial problem categories, are represented
by the results of SatELite+MINISAT.
5.1</p>
        <p>The 32-bit Parity Problem
The 32-bit parity problem was listed by Selman et al. [SKM97] as one of ten challenges
for research on satisfiability testing. The ten instances of the problem are satisfiable.
The first response to this challenge was by Warners and van Maaren [WvM98] who
solved the par32-*-c problem (5 instances) using a special-purpose preprocessor to
deal with equivalency conditions. Two years later, Li [Li00] solved the ten par32*
instances by enhancing Satz's search process with equivalency reasoning.</p>
        <p>[Table 1: results of Dew Satz and MINISAT on the par32 instances, with and without preprocessing.]</p>
        <p>Ostrowski et al. [OGMS02] solved the problems with Lsat, which performs a preprocessing step to
recover and exploit the logical gates of a given CNF formula and then applies DPLL
with a Jeroslow-Wang branching rule. The challenge has now been met convincingly by
Heule et al. [HvM04] with their March eq solver, which combines equivalency reasoning
in a preprocessor with a lookahead-based DPLL and which solves all of the par32*
instances in seconds. Dew Satz is one of the few solvers to have solved any instances of
the 32-bit parity problem without special-purpose equivalency reasoning [AS05].</p>
        <p>Table 1 shows the results of running the lookahead-based solver Dew Satz and the
CDCL-based solver MINISAT on the ten par32 instances, with and without
preprocessing. As preprocessors we used 3-Resolution, HyPre, NiVER and SatELite alone and
followed by 3-Resolution for the last three. We eliminated 2-SIMPLIFY from this test
as it aborted while processing the first five par32* instances presented in
Table 1. We also experimented with all combinations of two preprocessors for the
problems par32-1 and par32-4. Where lines are omitted from the table (e.g. there is no
line for HyPre on par32-1 or for SatELite+3-Resolution on par32-2), this is because
neither solver produced a solution for those simplified instances.</p>
        <p>It is evident from the table that these problems are seriously hard for both solvers.
Even with preprocessing, MINISAT times out on all of them except for par32-2 and
par32-5-c. Curiously, on the par32-2 instance, preprocessing with 3-Resolution slightly
degrades its performance. This is not a uniform effect: Table 4 below shows
examples in which MINISAT benefits markedly from 3-Resolution. Without preprocessing,
Dew Satz times out on nine of ten par32 instances, but in every case except par32-5
and par32-5-c 3-Resolution suffices to help it find a solution, and running multiple
preprocessors improves its performance.</p>
        <p>In general, Table 1 shows that multiple preprocessing contributes significantly to
enhancing the performance of Dew Satz, and that 3-Resolution dominates the
contribution, whether used as a single preprocessor or in combination.
5.2</p>
        <p>A Planning Benchmark Problem
The ferry planning benchmark problems, taken from the SAT2005 competition, are all easy
for MINISAT, which solves all of them in about one second without needing preprocessors.
Dew Satz, however, is challenged by them. The problems are satisfiable. We show the
Dew Satz and MINISAT results on the problems in Table 2. Clearly the original problems
contain some structure that CDCL is able to exploit but which one-step lookahead
fails to uncover. It is therefore interesting to see which kinds of reasoning carried out in a
preprocessing phase are able to make that same structure available to Dew Satz. Most
strikingly, reasoning with binary clauses in the manner of the 2-SIMPLIFY preprocessor
reduces runtimes by upwards of four orders of magnitude in some cases. HyPre, NiVER
and SatELite, especially HyPre, are also effective on these planning problems. In most
cases the number of backtracks drops from millions to fewer than 100, or even to zero
for the ferry8 v01a, ferry9 v01a, and ferry10 ks99a instances, which means that the input
formula is solved at the root node of the search tree.
</p>
        <p>[Table 2: Dew Satz and MINISAT results on the ferry planning problems.]</p>
        <p>[Table 3: Dew Satz and MINISAT results on the bounded model checking problems.]</p>
        <p>[Table 4: Dew Satz and MINISAT results on the IBM formal verification problems.]
5.3</p>
        <p>Bounded Model Checking Problems
Another domain providing benchmark problem sets which appear to be easy for MINISAT
but sometimes hard for Dew Satz is bounded model checking. In Table 3 we report
results on five of eleven BMC-IBM problems, two BMC-galileo problems and two of four
BMC-alpha problems. All other benchmark problems in the BMC-IBM class are easy for
both solvers and so are omitted from the table. The other two BMC-alpha instances
are harder than the two reported, even for MINISAT, both before and after preprocessing. The
problems presented in Table 3 are satisfiable.</p>
        <p>Each of these bounded model checking problems is brought within the range of
Dew Satz by some form of preprocessing. In general, HyPre and 3-Resolution are the
best for this purpose, especially when used together, though on problem BMC-IBM-13
they are ineffective without the additional use of NiVER. The column showing the
number of times Dew Satz backtracks is worthy of note. In many cases, preprocessing
reduces the problem to one that can be solved without backtracking. Solving
“without backtracking” has to be interpreted with care here, of course, since a nontrivial
amount of lookahead may be required in a “backtrack-free” search. The results for
BMC-galileo-9 furnish a good example of this: HyPre takes 407 seconds to refine the
problem, following which Dew Satz spends 90 seconds on lookahead reasoning while
constructing the first (heuristic) branch of its search tree, but then that branch leads
directly to a solution. Adding 3-Resolution to the preprocessing step does not change
the number of variables, and only slightly reduces the number of clauses, but it roughly
halves the time subsequently spent on lookahead.</p>
        <p>The instance BMC-alpha-4408 is hard for Dew Satz even after preprocessing, while
MINISAT with multiple preprocessing solves the problem instance an order of
magnitude faster. We can also observe that HyPre brings more benefit than SatELite.</p>
        <p>Table 4 shows results for both solvers on a related problem set consisting of formal
verification problems taken from the SAT2005 competition. The IBM-FV-01 problems
are satisfiable except for the problem IBM-FV-01-k10; the IBM-FV-26 problems are
unsatisfiable. Most of these satisfiable problems are easy for MINISAT, but the
unsatisfiable cases show that the SatELite preprocessor (with which MINISAT was paired in the
competition) is by far the least effective of the four we consider for MINISAT on these
problems. The preprocessor HyPre proved the unsatisfiability of IBM-FV-01-k10 in 1.27
seconds. 2-SIMPLIFY was not used to simplify the IBM-FV-26 problems, because it is
limited to input formulas with at most 100,000 variables. Again there are cases in
which Dew Satz is improved from a 15,000 second timeout to a one-branch proof of
unsatisfiability. Note that the numbers of clauses in these cases are actually increased by
the preprocessor 3-Resolution, confirming that the point of such reasoning is to expose
structure rather than to reduce problem size.
5.4</p>
        <p>A Highly Symmetrical Problem
The FPGA routing problems are highly symmetrical; they model the routing of wires in
the channels of field-programmable integrated circuits [AMS03]. The problem instances
used in the experiment, artificially designed by Fadi Aloul, are taken from the
SAT2002 competition.</p>
        <p>Without preprocessing to break symmetries, many of the FPGA routing problems
are hard—harder for CDCL solvers than for lookahead-based ones. Not only do they
have many symmetries, but the clause graphs are also disconnected. Lookahead
techniques with a neighbourhood variable ordering heuristic seem able to choose inferences
within one graph component before moving to another, whereas MINISAT jumps
frequently between components. Table 5 shows the performance of both solvers on the FPGA
routing problem set. Of the 21 selected satisfiable (bart) problems, MINISAT solves 8 in
some 2 hours. It manages better with the unsatisfiable (homer) instances, solving 14 of
15 in a total time of around 6 hours. Dew Satz solves all of the bart problems in 17.5
seconds and the homer ones in 45 minutes.</p>
        <p>The detailed results for two of the satisfiable problems and two unsatisfiable ones
(Table 6) are interesting. The resolution-based preprocessors do not
modify the size of the input formula, except for SatELite. The Shatter
preprocessor, which removes certain symmetries, is tried on its own and in combination
with the five resolution-based preprocessors. It should be noted that the addition of
symmetry-breaking clauses increases the sizes of the problems, but of course it greatly
reduces the search spaces in most cases.</p>
        <p>The performance of Dew Satz after preprocessing is often worse in terms of time
than it was before, though there is always a decrease in the size of its search tree. This is
because the increase in problem size increases the amount of lookahead work. MINISAT,
by contrast, sometimes speeds up by several orders of magnitude
after preprocessing.
</p>
        <p>[Table 5: Dew Satz and MINISAT summary results on the bart (21 SAT) and homer (15 UNSAT) FPGA routing sets.]</p>
        <p>[Table 6: detailed results for bart28, bart30, homer19 and homer20 with Shatter and the resolution-based preprocessors.]</p>
        <p>[Order of preprocessors: formula sizes and times for various preprocessor orderings, including repeated applications of 2-SIMPLIFY and NiVER.]</p>
        <p>For one of these instances, preprocessing with 2-SIMPLIFY followed by NiVER is insufficient to allow solution before the timeout.
Simplifying again with 2-SIMPLIFY brings the runtime down to under an hour; adding
NiVER again brings it down again to a couple of minutes; repeating 2-SIMPLIFY, far
from improving matters, causes the time to blow out to two hours.
6</p>
        <p>Conclusions
We performed an empirical study of the effects of several recently proposed SAT
preprocessors on both CDCL and lookahead-based SAT solvers. We summarize several
outcomes from this study as follows.</p>
        <p>1. High-performance SAT solvers, whether they depend on clause learning or on
lookahead, benefit greatly from preprocessing. Improvements of four orders of
magnitude in runtimes are not uncommon.
2. It is not sufficient to equip a SAT solver with just one preprocessor of the kind
considered in this paper: very different preprocessing techniques are appropriate to
different problem classes.
3. There are frequently benefits to be gained from running two or more preprocessors
in series on the same problem instance.
4. Both clause learning and lookahead need to be enhanced with techniques specific
to reasoning with binary clauses, in order to exploit dependency chains, and with
techniques for equality reasoning.
5. Lookahead-based solvers also benefit greatly from resolution between longer clauses,
as in the 3-Resolution preprocessor. This seems to capture ahead of the search
some of the inferences which would be achieved during it by learning clauses.
CDCL solvers can also benefit from the 3-Resolution preprocessor—dramatically in
certain instances—but the effects are far from uniform.
6.1</p>
        <p>Future work
The following lines of research are open:
1. It would, of course, be easy, if tedious, to extend the experiments to more problem
sets, more preprocessors and especially to more solvers. We shall probably look
at some more DPLL solvers, but do not expect the results to add much more
than detail to what is reported in the present paper. One of the more important
additions to the class of solvers will be a non-clausal (Boolean circuit) reasoner.
We have not yet experimented with such a solver. We have already investigated
preprocessing for several state-of-the-art SLS (stochastic local search) solvers, but
that is such a different game that we regard it as a different experiment and do
not report it here.
2. The more important line of research is to investigate methods for automatically
choosing among the available preprocessors for a given problem instance, and
for automatically choosing the order in which to apply successive preprocessors.
Machine learning may help here, though it would be better, or at least more
insightful, to be able to base decisions on a decent theory about the interaction of
reasoning methods.
3. Another interesting project is to combine preprocessors not as a series of separate
modules but as a single reasoner. For example, it would be possible to
saturate under 3-Resolution and hyper-resolution together, in the manner found in
resolution-based theorem provers. Whether this would be cost-effective in terms
of time, and whether the results would differ in any worthwhile way from those
obtained by ordering separate preprocessors, are unknown at this stage.</p>
        <p>As SAT solvers are increasingly applied to real-world problems, we expect deductive
reasoning by preprocessors to become increasingly important to them.
Acknowledgments
This work was funded by National ICT Australia (NICTA). National ICT Australia is
funded through the Australian Government’s Backing Australia’s Ability initiative, in
part through the Australian Research Council.</p>
        <p>References</p>
        <p>Fadi A. Aloul, Igor L. Markov, and Karem A. Sakallah. Shatter:
Efficient symmetry-breaking for boolean satisfiability. In Design Automation
Conference, pages 836–839. ACM/IEEE, 2003.</p>
        <p>Anbulagan, Duc Nghia Pham, John Slaney, and Abdul Sattar. Old
resolution meets modern SLS. In Proceedings of 20th AAAI, pages 354–359,
2005.</p>
        <p>Anbulagan, Duc Nghia Pham, John Slaney, and Abdul Sattar. Boosting
SLS performance by incorporating resolution-based preprocessor. In
Proceedings of Third International Workshop on Local Search Techniques in
Constraint Satisfaction (LSCS), in conjunction with CP-06, pages 43–57,
2006.</p>
        <p>Anbulagan and John Slaney. Lookahead saturation with restriction for SAT.
In Proceedings of 11th CP, pages 727–731, 2005.</p>
        <p>Fadi A. Aloul, Karem A. Sakallah, and Igor L. Markov. Efficient symmetry
breaking for boolean satisfiability. In Proceedings of 18th IJCAI, Mexico,
2003.</p>
        <p>Fahiem Bacchus. Enhancing Davis Putnam with extended binary clause
reasoning. In Proceedings of 18th AAAI, pages 613–619, Edmonton, Canada,
August 2002. AAAI Press.</p>
        <p>Ronen I. Brafman. A simplifier for propositional formulas with many binary
clauses. In Proceedings of 17th IJCAI, pages 515–522, 2001.</p>
        <p>Ronen I. Brafman. A simplifier for propositional formulas with many binary
clauses. IEEE Transactions on Systems, Man, and Cybernetics, Part B,
34(1):52–59, 2004.</p>
        <p>Fahiem Bacchus and Jonathan Winter. Effective preprocessing with
hyperresolution and equality reduction. In Revised Selected Papers of SAT 2003,
LNCS 2919 Springer, pages 341–355, 2004.</p>
        <p>Olivier Dubois and Gilles Dequen. A backbone-search heuristic for efficient
solving of hard 3-SAT formulae. In Proceedings of 17th IJCAI, pages 248–
253, Seattle, Washington, USA, 2001.</p>
        <p>M. Davis and H. Putnam. A computing procedure for quantification theory.
Journal of the ACM, 7:201–215, 1960.</p>
        <p>Niklas Eén and Armin Biere. Effective preprocessing in SAT through
variable and clause elimination. In Proceedings of 8th SAT, LNCS Springer,
2005.</p>
        <p>Niklas Eén and Niklas Sörensson. An extensible SAT-solver. In Proceedings
of 6th SAT, 2003.</p>
        <p>Marijn Heule and Hans van Maaren. Aligning CNF- and
equivalence-reasoning. In Proceedings of 7th SAT, Vancouver, Canada, 2004.</p>
        <p>Marijn Heule and Hans van Maaren. March dl: Adding adaptive heuristics
and a new branching strategy. Journal on Satisfiability, Boolean Modeling
and Computation, (2):47–59, 2006.</p>
        <p>Chu Min Li and Anbulagan. Look-ahead versus look-back for satisfiability
problems. In Proceedings of 3rd CP, pages 341–355, Schloss Hagenberg,
Austria, 1997.</p>
        <p>Chu Min Li. Integrating equivalency reasoning into Davis-Putnam
procedure. In Proceedings of 17th AAAI, pages 291–296, USA, 2000. AAAI
Press.</p>
        <p>I. Lynce and J. Marques-Silva. The interaction between simplification and
search in propositional satisfiability. In Proceedings of CP’01 Workshop on
Modeling and Problem Formulation, 2001.
[MMZ+01] M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik. Chaff:
Engineering an efficient SAT solver. In Proceedings of DAC, pages 530–535,
2001.
[OGMS02] Richard Ostrowski, Éric Grégoire, Bertrand Mazure, and Lakhdar Sais.</p>
        <p>Recovering and exploiting structural knowledge from CNF formulas. In
Proceedings of 8th CP, pages 185–199, 2002.</p>
        <p>Lawrence O. Ryan. Efficient Algorithms for Clause Learning SAT Solvers.
PhD thesis, Simon Fraser University, Burnaby, Canada, 2004.</p>
        <p>Niklas Sörensson and Niklas Eén. MINISAT v1.13 - A SAT solver with
conflict-clause minimization. In 2005 International SAT Competition
website: http://www.lri.fr/∼simon/contest05/results/descriptions/solvers/minisat static.pdf,
2005.</p>
        <p>Bart Selman, Henry Kautz, and David McAllester. Ten challenges in
propositional reasoning and search. In Proceedings of 15th IJCAI, pages 50–54,
Nagoya, Aichi, Japan, 1997.</p>
        <p>Sathiamoorthy Subbarayan and Dhiraj K. Pradhan. NiVER: Non increasing
variable elimination resolution for preprocessing SAT instances. In Revised
Selected Papers of SAT 2004, LNCS 3542 Springer, pages 276–291, 2005.
Joost P. Warners and Hans van Maaren. A two-phase algorithm for solving
a class of hard satisfiability problems. Operations Research Letters, 23:81–
88, 1998.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [SAHB00]
          <string-name>
            <given-names>R.</given-names>
            <surname>St-Aubin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hoey</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Boutilier</surname>
          </string-name>
          . APRICODD:
          <article-title>Approximate policy construction using decision diagrams</article-title>
          . In T. K. Leen,
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Dietterich</surname>
          </string-name>
          , and V. Tresp, editors,
          <source>Proceedings of the Fourteenth Annual Conference on Neural Information Processing Systems</source>
          (NIPS'
          <year>2000</year>
          ), pages
          <fpage>1089</fpage>
          -
          <lpage>1095</lpage>
          , Denver,
          <year>2000</year>
          . MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>W. V.</given-names>
            <surname>Quine</surname>
          </string-name>
          .
          <article-title>A way to simplify truth functions</article-title>
          .
          <source>American Mathematical Monthly</source>
          ,
          <volume>62</volume>
          :
          <fpage>627</fpage>
          -
          <lpage>631</lpage>
          ,
          <year>1955</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Robinson</surname>
          </string-name>
          .
          <article-title>A machine-oriented logic based on the resolution principle</article-title>
          .
          <source>Journal of the ACM</source>
          ,
          <volume>12</volume>
          :
          <fpage>23</fpage>
          -
          <lpage>41</lpage>
          ,
          <year>1965</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>