<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Workshop on Practical Aspects of Automated Reasoning, August</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>The Vampire Approach to Induction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Márton Hajdu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Kovács</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Rawson</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrei Voronkov</string-name>
        </contrib>
        <aff>TU Wien</aff>
        <aff>U. Manchester</aff>
        <aff>EasyChair</aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>1</volume>
      <fpage>1</fpage>
      <lpage>12</lpage>
      <abstract>
        <p>We discuss practical aspects of automating inductive reasoning in the first-order superposition prover Vampire. We explore solutions for reasoning over inductively defined datatypes; generating, storing and applying induction schema instances; and directly integrating inductive reasoning into the saturation loop of Vampire. Our techniques significantly improve the performance of Vampire despite the inherent difficulty of automated induction. We expect our exposition to be useful when implementing induction in saturation-style provers, and to stimulate further discussion and advances in the area.</p>
      </abstract>
      <kwd-group>
        <kwd>Induction</kwd>
        <kwd>Saturation</kwd>
        <kwd>Superposition</kwd>
        <kwd>First-order theorem proving</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Due to its emerging relevance in program analysis and synthesis, inductive reasoning has
seeped into major automated reasoning systems over the past few years [1, 2, 3, 4, 5, 6, 7],
complementing seminal efforts in the inductive theorem proving community [8, 9]. Still, the
automated reasoning community has yet to reach a consensus on which parts of
induction to automate and what to leave to the user.</p>
      <p>In this paper we focus on fully automatic methods to efficiently support inductive
reasoning in saturation-based theorem proving, as saturation algorithms [10] provide
state-of-the-art solutions implemented by several modern first-order theorem provers [11, 12, 13].
Most provers limit their inductive capabilities to term algebras (also known as
inductively defined datatypes) [14, 15], relying heavily on case-analysis techniques, such as the
AVATAR framework [16] in [5] or duplicating the current search space in [3]. Others opt for
lightweight approaches that introduce arbitrary induction formulas into the search space
and let the saturation algorithm do the rest [4], thus allowing for induction on integers
[17] and more general induction schemas [18]. In this paper, we describe practical aspects
of integrating induction in saturation-based first-order theorem proving, in particular
within the first-order theorem prover Vampire [11].</p>
      <p>
        Example 1. Let us focus on the term algebras of natural numbers and lists. We consider
recursive function and predicate definitions over natural numbers and lists as given in
Figure 1:
      </p>
      <p>
app(nil, z) ≃ z (a1)
app(cons(x, y), z) ≃ cons(x, app(y, z)) (a2)
rev(nil) ≃ nil (r1)
rev(cons(x, y)) ≃ app(rev(y), cons(x, nil)) (r2)
¬mem(x, nil) (m1)
¬mem(x, cons(y, z)) ∨ x ≃ y ∨ mem(x, z) (m2)
mem(x, cons(y, z)) ∨ x ≄ y (m3)
mem(x, cons(y, z)) ∨ ¬mem(x, z) (m4)
      </p>
      <p>
Our goal is to prove the property
∀x, y. mem(x, y) → mem(x, rev(y)), (1)
where x is a natural number and y is a list. Saturation-based first-order provers establish
validity of (1) by showing the unsatisfiability of its negation. Following this method, after
converting the negation of (1) into clause normal form, we are left with the following two
clauses:
      </p>
      <p>
mem(σ0, σ1) (2)
¬mem(σ0, rev(σ1)) (3)
where σ0, σ1 are Skolem constants. Our approach for disproving (2)–(3), or more generally
proving formulas via inductive reasoning, is to perform induction directly in saturation-based
proof search, by using valid induction schemata (such as structural induction). To
this end, we generate valid induction formulas, that is, instances of induction schemata.
We further clausify these induction formulas and add them to the search space. We
instantiate the structural induction schema over lists
      </p>
      <p>
(F[nil] ∧ ∀y, z.(F[z] → F[cons(y, z)])) → ∀x.F[x]
with the negation of clauses (2)–(3) over the term σ1 to obtain the induction formula
      </p>
      <p>
((¬mem(σ0, nil) ∨ mem(σ0, rev(nil))) ∧
∀y, z.((¬mem(σ0, z) ∨ mem(σ0, rev(z))) →
(¬mem(σ0, cons(y, z)) ∨ mem(σ0, rev(cons(y, z))))))
→ ∀x.(¬mem(σ0, x) ∨ mem(σ0, rev(x))) (4)
      </p>
      <p>
We further clausify the induction formula (4) to obtain the following six clauses:
mem(σ0, nil) ∨ mem(σ0, cons(σ3, σ2)) ∨ ¬mem(σ0, x) ∨ mem(σ0, rev(x)) (4.1)
mem(σ0, nil) ∨ ¬mem(σ0, σ2) ∨ mem(σ0, rev(σ2)) ∨ ¬mem(σ0, x) ∨ mem(σ0, rev(x)) (4.2)
mem(σ0, nil) ∨ ¬mem(σ0, rev(cons(σ3, σ2))) ∨ ¬mem(σ0, x) ∨ mem(σ0, rev(x)) (4.3)
¬mem(σ0, rev(nil)) ∨ mem(σ0, cons(σ3, σ2)) ∨ ¬mem(σ0, x) ∨ mem(σ0, rev(x)) (4.4)
¬mem(σ0, rev(nil)) ∨ ¬mem(σ0, σ2) ∨ mem(σ0, rev(σ2)) ∨ ¬mem(σ0, x) ∨ mem(σ0, rev(x)) (4.5)
¬mem(σ0, rev(nil)) ∨ ¬mem(σ0, rev(cons(σ3, σ2))) ∨ ¬mem(σ0, x) ∨ mem(σ0, rev(x)) (4.6)
We call the clauses (4.1)–(4.6), and in general clauses stemming from clausifying induction
formulas, the induction clauses of these formulas. Notice that the non-ground literals
of each clause stem from the conclusion of formula (4) — we refer to these non-ground
literals as the conclusion literals of these clauses or of the formula (4) — while the ground
literals come from the antecedent of (4). An induction formula F is applied on a clause
set C, or alternatively speaking C is inducted upon, when a sequence of binary resolutions
is performed between the induction clauses of F and clauses of C such that the conclusion
literals are resolved. We say we induct on a term t in a clause set C with induction
formula F when, during the application of F on C, the variable in the conclusion literals
is bound to t. In Section 3, we explore solutions on how to generate and apply induction
formulas in saturation.
      </p>
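      <p>The distribution step that yields the six induction clauses can be sketched in a few lines of Python. This is only an illustration: the string encoding of literals (with ~ for negation) and all helper names are hypothetical, not Vampire internals.</p>
      <p>
```python
from itertools import product

# Hypothetical string encoding of literals; "~" marks negation.
# negF(t) lists the literals of the negation of F[t], where
# F[x] = ~mem(s0,x) \/ mem(s0,rev(x)) is the conclusion of (4).
def negF(t):
    return ["mem(s0,%s)" % t, "~mem(s0,rev(%s))" % t]

# CNF of the negated antecedent of (4):
#   ~F[nil] contributes its literals as unit conjuncts;
#   ~(F[s2] -> F[cons(s3,s2)]) contributes F[s2] as one clause,
#   plus the literals of ~F[cons(s3,s2)] as unit conjuncts.
base_conjuncts = [[lit] for lit in negF("nil")]
step_conjuncts = [["~mem(s0,s2)", "mem(s0,rev(s2))"]]
step_conjuncts += [[lit] for lit in negF("cons(s3,s2)")]

conclusion = ["~mem(s0,x)", "mem(s0,rev(x))"]

# Distributing the disjunction of the two negated conjunctions over
# their CNFs yields exactly the six induction clauses (4.1)-(4.6).
clauses = [b + s + conclusion for b, s in product(base_conjuncts, step_conjuncts)]
assert len(clauses) == 6
```
      </p>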
      <p>
        The refutation of clauses (4.1)–(4.6) with clauses (2) and (3) goes as follows. We
eliminate the conclusion literals of (4.1)–(4.6) by applying formula (4) on them, i.e. by
binary resolving the six clauses first with (2), then with (3). Then, using the axioms
given in Figure 1 within standard superposition rules, we obtain the following clauses:
σ0 ≃ σ3 ∨ mem(σ0, rev(σ2)) (5)
¬mem(σ0, app(rev(σ2), cons(σ3, nil))) (6)
First, a nested induction with the same structural induction schema of lists on the term
rev(σ2) in clauses (5)–(6) is performed; then, using standard superposition rules, we get
the clause:
      </p>
      <p>¬mem(σ0, app(rev(σ2), cons(σ0, nil)))
One more nested induction on the term rev(σ2) in this clause results in a refutation.</p>
      <p>Example 1 showcases two main ingredients towards automating induction in saturation-based
theorem proving: generating induction formulas/clauses and using them further
in saturation. In this paper, we address these ingredients of automating induction and
present practical aspects related to applying induction formulas/clauses in saturation. First,
should we add induction clauses to the search space, at the cost of rapidly increasing our
search space? If not, what inferences should we perform so that we do not lose “good”
induction clauses? We present our solutions answering these two questions in Section 3.
Another interesting line of work towards efficient induction in saturation comes with
reusing induction formulas/clauses. To this end, a natural question arises: should we
store induction formulas/clauses and query them when needed, or is it better to generate
induction formulas/clauses on demand? In Section 4 we address these questions by storing
induction clauses efficiently during saturation. We also discuss more general techniques
that may be integrated easily into any saturation-based theorem prover, and then focus
on specialized techniques for the AVATAR architecture in Section 5. Our experiments in
Section 6 showcase the practical benefits of our work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <p>We briefly introduce saturation-based proof search, and refer to [11] for more details.
First-order theorem provers typically work with clauses, rather than with arbitrary first-order
formulas. Given a set of assumptions A1, ..., An and a conjecture G, to check validity
of A1, ..., An |= G, the set {A1, ..., An, ¬G} is transformed into an equisatisfiable set of
clauses C (i.e. computing clausal normal forms via clausification). A first-order prover then
saturates C by computing all logical consequences of C with respect to a sound inference
system I. The resulting set is called the closure of C and the process of computing the
closure of C is called saturation. If the closure of C contains the empty clause □, the
original set C of clauses is unsatisfiable and thus G follows from A1, ..., An. A simplified
saturation algorithm for a sound inference system I is given in Algorithm 1, with a set of
input clauses C. We call the clause sets A the active set and P the passive set. Further,
we refer to a clause from A as an active clause; similarly, clauses from P are passive
clauses. One of the most common inference systems for first-order logic with equality
is the superposition calculus [10]; many saturation provers [11, 12, 13] implement this
calculus as I in Algorithm 1. Completeness and efficiency of saturation-based reasoning
rely heavily on properties of selecting clauses from P and adding them to A. A related
notion is redundancy of clauses, which allows simplifications and deletions to eliminate
redundant clauses without loss of completeness.</p>
      <p>Algorithm 1: A Simplified Saturation Loop.
1 P := C, A := ∅
2 repeat
3   if P = ∅ then return C is SAT
4   C := select(P)                     /* select and activate clause C from P */
5   P := P \ {C}, A := A ∪ {C}
6   {C1, ..., Cn} := derive(C, A, I)   /* derive consequences of C w.r.t. I and A */
7   P := P ∪ {C1, ..., Cn}
8   if □ ∈ P then return C is UNSAT</p>
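      <p>Algorithm 1 can be rendered as a short, runnable sketch. The clause representation (frozensets of signed integers) and the toy resolution-based derive function are hypothetical simplifications of a real superposition prover, used here only to make the loop concrete.</p>
      <p>
```python
# A minimal, runnable sketch of Algorithm 1. Clauses are frozensets of
# signed integers (a negative integer is a negated atom); EMPTY is the
# empty clause. `derive` stands in for the inference system I.
EMPTY = frozenset()

def resolve_all(c, active):
    """Toy propositional binary resolution of c against the active set."""
    out = set()
    for d in active:
        for lit in c:
            if -lit in d:
                out.add((c - {lit}) | (d - {-lit}))
    return out

def saturate(input_clauses, derive=resolve_all):
    passive = set(input_clauses)        # P := C
    active = set()                      # A := empty set
    while passive:                      # P = empty set means SAT
        c = min(passive, key=len)       # select and activate a clause
        passive.discard(c)
        active.add(c)
        passive |= derive(c, active)    # consequences of c w.r.t. I and A
        if EMPTY in passive:
            return "UNSAT"
    return "SAT"
```
      </p>
      <p>For instance, saturate({frozenset({1}), frozenset({-1})}) returns "UNSAT", while saturate({frozenset({1})}) returns "SAT".</p>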
      <p>The AVATAR framework for first-order reasoning. The AVATAR architecture [16]
is a splitting framework that combines first-order theorem proving with SAT solving.
In a nutshell, the first-order prover handles proof search in AVATAR, whereas the SAT
solver tracks splitting decisions and refutations at the Boolean structure of clauses. An
A-clause C ← A consists of a first-order clause C and a conjunction of propositional
literals (assertions) A. When AVATAR encounters a first-order clause C, it splits C
into variable-disjoint components C = C1 ∨ ... ∨ Cn. Then, a propositional split clause
[[C1]] ∨ ... ∨ [[Cn]] is passed to the SAT solver, which in turn computes a model for the
propositional clauses it has seen so far. If the solver reports unsatisfiability, the original
problem is unsatisfiable. Otherwise, the clauses Ci ← [[Ci]] are added to the first-order
prover. A-clauses C ← A are allowed to participate in first-order proof search as long as
A is satisfied by the current model. Whenever the first-order prover produces ⊥ ← A,
the conflict clause ¬A is added to the SAT solver.</p>
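      <p>The component computation at the heart of AVATAR splitting can be sketched as follows. The (text, variable-set) encoding of literals is a hypothetical stand-in for real first-order terms; only the grouping logic is the point.</p>
      <p>
```python
# Sketch of splitting a clause into variable-disjoint components, as
# AVATAR does before handing split clauses to the SAT solver. A literal
# is modelled as a (text, variable_set) pair; purely illustrative.
def split_components(clause):
    components = []  # list of (literals, shared_variables) pairs
    for text, vs in clause:
        vs = set(vs)
        merged = [text]
        rest = []
        for lits, seen in components:
            if vs & seen:            # shares a variable: merge components
                merged = lits + merged
                vs |= seen
            else:
                rest.append((lits, seen))
        rest.append((merged, vs))
        components = rest
    # Ground literals have no variables, so each forms its own component.
    return [lits for lits, _ in components]
```
      </p>
      <p>On an induction clause of the shape L1 ∨ ... ∨ Ln ∨ C discussed in Section 5, this yields n singleton ground components plus one component for the non-ground conclusion C.</p>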
    </sec>
    <sec id="sec-3">
      <title>3. Applying Induction Formulas in Saturation</title>
      <p>
        A key ingredient in Example 1, and in general in proving inductive properties, is the
generation of inductive formulas, such as (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ), using an appropriate induction schema.
Yet, in a saturation loop such as Algorithm 1, induction is not given special attention.
All clauses are activated sequentially, and inferences are applied only over active clauses.
In our approach, instances of induction schemata are first generated when a clause set
matching the negation of their conclusion gets activated. This makes our induction
goal-oriented, controlling the search-space explosion by adding only instances of schemata
that are already applicable.</p>
      </p>
      <p>
        Example 2. We now show one possible sequence of derivation steps of Example 1 w.r.t.
saturation in detail. Suppose that clause (2) is activated first. Then, we may only induct
on (2), using induction formulas with a conclusion matching the negation of (2), since
(3) is not yet in A. Therefore, (4) is not generated until (3) is also activated. On the
other hand, even if (3) is eventually activated and (4.1) is derived, note that (3) is not an
ancestor of (4.1). However, we can apply (4.1) on (3) by performing the following binary
resolution step:
mem(σ0, nil) ∨ mem(σ0, cons(σ3, σ2)) ∨ ¬mem(σ0, x) ∨ mem(σ0, rev(x))
mem(σ0, nil) ∨ mem(σ0, cons(σ3, σ2)) ∨ ¬mem(σ0, σ1) (7)
That is, if we do not treat active clauses (4.1) and (3) as ordinary clauses but keep track
of their (inductive) derivation, we could prioritize the above binary resolution step in
saturation.
      </p>
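      <p>The resolution step (7), which resolves a conclusion literal of an induction clause against an activated ground clause, can be sketched as below. The string encoding and the single conclusion variable x are illustrative simplifications: since the other premise is ground, substituting the induction term for x plays the role of matching.</p>
      <p>
```python
# Sketch of applying an induction clause via binary resolution, as in (7).
# Literals are plain strings with "~" for negation.
def neg(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve_conclusion(induction_clause, premise_lit, term):
    # Bind the sole conclusion variable "x" to the induction term
    # (naive textual substitution, standing in for proper matching).
    grounded = [lit.replace("x", term) for lit in induction_clause]
    clashing = neg(premise_lit)
    assert clashing in grounded, "conclusion literal must clash with premise"
    grounded.remove(clashing)        # both resolved literals disappear
    return grounded
```
      </p>
      <p>Resolving (4.1) against clause (3) with x bound to σ1 produces the conclusion of (7).</p>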
      <p>Example 2 shows that the activation and use of induction formulas/clauses in saturation
need a more tailored treatment. In particular, an induction inference may serve as a way
of either adding instances of induction schemata to P or adding the application of these
instances on the premises to P. To this end, we propose the following two options
for applying induction formulas in saturation.</p>
      <p>
        Addition – We allow induction formulas to participate in any inference with any active
clause, by using induction formulas as new lemmas in proof search. In the case of
Example 2, we add (4.1) to P without any further inference. Later, when (4.1) is
activated, inference (7) with (3) may occur. After refuting the antecedent of (4), the
derived clauses can be used as lemmas. Unfortunately, since these lemmas are not part of
the initial search space, their use on (2) and (3) after their addition to P can be delayed
practically indefinitely.
      </p>
      <p>
        Resolution – We restrict the use of induction clauses as active clauses, and add only
conclusions of binary resolutions with induction clauses to P. In the case of Example 2, we
resolve clauses (3) and (4.1), adding only the conclusion of (7) to P. At this point, it also
makes sense to perform an additional binary resolution between (2) and the conclusion
of (7), and hence activate the conclusion of (7), as (2) is already in A.
      </p>
      <p>It turns out that while allowing induction formulas to interact freely as lemmas
(Addition) blows up the search space in practice, storing and applying them in only
restricted ways (Resolution) requires more book-keeping (Section 4). In fact, we can
combine Addition and Resolution.</p>
      <p>
        Example 3. Using Addition and Resolution in combination for Example 2, we add to
P the clause (4.1) together with the result of a binary resolution between the conclusion
of (7) and clause (2). Now suppose the following two clauses get activated (in any order):
mem(σ0, app(σ0, σ1)) (8)
¬mem(σ0, rev(app(σ0, σ1))) (9)
      </p>
      <p>
        Note that these two clauses are similar to clauses (2) and (3), except that they contain
app(σ0, σ1) instead of σ1. Hence, their negation matches the conclusion of induction
formula (4) and we can reuse the induction formula (4) to refute these two clauses as well.
As (4.1) was added to saturation through P by Addition, the saturation algorithm
can perform the application. However, if we want to avoid the search-space explosion
caused by adding free lemmas to the search space, and only use Resolution, clause
(4.1) does not appear in the search space and we must generate a new induction formula.
To overcome the burden of generating new induction formulas that are identical to (4),
we can instead store clauses (4.1)–(4.6) in an index and query these clauses when the
same induction formula (4) is needed (see Section 4).
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Reusing Induction Formulas in Saturation</title>
      <p>As discussed in Example 3, there may be cases where induction formulas can or must be used
multiple times, but generating the same induction formulas repeatedly is inefficient. In
our work, we detect whether an induction formula has already been generated. Additionally,
for Resolution, we store induction formulas to later query them for reuse.</p>
      <p>The obvious way to recognize, store and query clauses from induction formulas is to
use term indexing [19]. The idea is to index the conclusion literals, which will be resolved
with the literals inducted upon. Standard term indexing approaches are however not very
efficient for induction in saturation. Many clauses of an induction formula may share
conclusion literals, which — upon indexing — result in an insertion overhead, since these
literals could be stored under the same entry instead of being added separately. Moreover,
we may want to avoid performing some induction inferences, for example inducting on
complex terms like app(σ0, σ1) in clauses (8) and (9); yet, filtering out clauses after they
have been queried from induction formulas is inefficient. When inducting on non-unit
clauses, it becomes difficult to identify which induction formulas have been generated by
inspecting only conclusion literals. In order to simplify indexing of induction formulas
even for clause sets, we make the following assumptions:</p>
      <p>Assumption 1 – Clauses used and queried for induction are ground, reducing the
indexing problem from unification to matching.</p>
      <p>Assumption 2 – Induction formulas contain one induction term, thus normalizing
conclusion literals becomes easier.</p>
      <p>
        With the above two assumptions, we can index a set of clauses {C1, ..., Cn} inducted upon
with induction term t of sort τ with the following key function:
key({C1, ..., Cn}, t) := {C1[t/cτ], ..., Cn[t/cτ]} (10)
where cτ is a sort-indexed family of constants. A simple term index then uses the values
returned by this key function as keys in an associative array, in which each entry contains
the induction formulas with the same conclusions, up to variable renaming.
      </p>
      <p>
Example 4. Using the constant clist for lists, evaluating the key function with either
the clause set {(2), (3)} and induction term σ1 or the clause set {(8), (9)} and induction
term app(σ0, σ1) results in the same value:
key({(2), (3)}, σ1) = key({(8), (9)}, app(σ0, σ1)) = {mem(σ0, clist), ¬mem(σ0, rev(clist))}
Querying the index with this value gives the entry with formula (4).
      </p>
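      <p>Under Assumptions 1 and 2, the index keyed by (10) can be sketched as a plain hash map. All names here are hypothetical, and the naive substring replacement of the induction term is a simplification of a proper occurrence-based substitution over terms.</p>
      <p>
```python
# Sketch of the induction-formula index built around key function (10).
# Clauses are tuples of literal strings; the key replaces the induction
# term t by a placeholder constant c_tau for t's sort.
class InductionIndex:
    def __init__(self):
        self.table = {}

    @staticmethod
    def key(clauses, term, sort):
        c_tau = "c_" + sort
        return frozenset(
            tuple(lit.replace(term, c_tau) for lit in clause)
            for clause in clauses
        )

    def insert(self, clauses, term, sort, formula):
        self.table.setdefault(self.key(clauses, term, sort), []).append(formula)

    def query(self, clauses, term, sort):
        return self.table.get(self.key(clauses, term, sort), [])
```
      </p>
      <p>Inserting clauses (2)–(3) with induction term σ1 and then querying with clauses (8)–(9) and term app(σ0, σ1) hits the same entry, so the stored induction formula is reused rather than regenerated.</p>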
      <p>Note that multiple induction formulas with the same conclusions up to variable renaming
may be produced, corresponding, for example, to different well-founded relations. Hence,
there may be multiple induction formulas that are mapped to the same key in the
index. An interesting further direction towards weakening the assumptions could be
performing unification instead of matching (Assumption 1) and implementing variant
checking (Assumption 2).</p>
      <p>We conclude by noting that the saturation loop of Algorithm 1 eagerly simplifies every
generated clause Ci with respect to existing clauses before adding Ci to the passive set
P. When using induction clauses, these simplifications are performed after resolving the
conclusion literals and they may be performed as many times as the induction formula is
queried. In our work, we propose to store the induction formulas already simplified and
use these simplified formulas in inferences of the saturation loop. Our experiments in
Section 6 show the benefit of storing simplified induction clauses towards making proof
search more efficient.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Improved Induction Using AVATAR</title>
      <p>An induction clause L1 ∨ ... ∨ Ln ∨ C, with ground literals Li and a non-ground C
containing the conclusion literals, can be split into n + 1 components; these components
are variable-disjoint as the Li are ground. Splitting the search space along L1 ∨ ... ∨ Ln ∨ C can
be done by many different splitting techniques [20], but this is done especially efficiently
in Vampire using AVATAR [16].</p>
      <p>
        Example 5. Continuing from the clausification of formula (4) in Example 1, let us assume
that we add clauses (4.1)–(4.6) to P and AVATAR splits them. We name components by
propositional variables and use integers to denote them; for example,
we write 2 for a positive atom and ¬2 for its negation. The propositional split clauses
are passed to the SAT solver, which reports satisfiability with a model, for example,
1 ∧ ¬7, resulting in the following A-clauses added to P:
mem(σ0, nil) ← 1 (11)
¬mem(σ0, rev(nil)) ← ¬7 (12)
Upon activation, the A-clause (11) is resolved with the axiom (m1) from Figure 1, and
we derive the A-clause ⊥ ← 1, which adds the conflict clause ¬1 to the SAT solver. A new
model for the updated propositional split clauses might be ¬1 ∧ 2 ∧ 5 ∧ ¬6. As shown here,
the base-case literal corresponding to 1 is refuted only once, instead of three times in three
different clauses.
      </p>
      <p>Some considerations from Sections 3 and 4 are orthogonal to splitting. When using
Addition, saturation handles the splitting of clauses automatically. However, an
interesting question is whether to store induction clauses (that is, using Resolution) after
splitting. One advantage of storing induction clauses after they are split — similarly to
storing them simplified — is that splitting is performed only once, not multiple times.
Since only the conclusion components are used when applying an induction formula to
the current premises, only these splitting components need to be stored in an index and
all other components can be pushed into P by AVATAR when needed. In contrast, when
conclusion components are queried from the index, inferences with these components
should only be performed if these components are satisfied in the current SAT model, as
displayed by the following example.</p>
      <p>
        Example 6. Using Resolution after splitting clauses (4.1)–(4.6), the current model
needs to be recomputed with the propositional assumption 4 (as in Example 5) to force
the conclusion component. This either results in no model, in which case inferences with
induction formula (4) cannot be performed and the assumption is retracted, or in a
model satisfying 4, which yields the following A-clause:
      </p>
      <p>
        ¬mem(σ0, x) ∨ mem(σ0, rev(x)) ← 4
After two binary resolutions with e.g. clauses (2) and (3), we obtain the empty A-clause
⊥ ← 4, yielding the conflict clause ¬4, and a SAT model for (4.1)–(4.6) is recomputed.
      </p>
      <p>This overhead of recomputing SAT models in AVATAR suggests a hybrid approach:
considering the size of induction clauses with and without splitting, and occasionally not
splitting relatively small clauses. An additional disadvantage of using AVATAR could be
that split components are introduced even when an induction application results in only
unit clauses, possibly due to simplifications. We leave further investigation of these issues
as future work.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Implementation and Experiments</title>
      <p>Implementation. We implemented our solutions from Sections 3–5 in Vampire1.
Handling AVATAR split clauses turned out to be more challenging than we expected, so,
due to the lack of a fair comparison, we do not show any results related to Section 5.
We extended Vampire with the new option induction_formula_generation (indfg) to
implement the techniques discussed in Section 3. The indfg option in Vampire has the
following values: add for Addition; resolve for Resolution with the index described in
Section 4; add_resolve for combining Addition and Resolution without an index; and
regenerate to always generate a new induction formula. We also extended Vampire
with the new option simplify_induction_clauses (sic), as in Section 4, to be used
only with resolve.</p>
      <p>Benchmarks. We evaluated our work using the benchmark suite UFDTLIA from
SMT-LIB [21] and our DTY inductive benchmark set [22]; both sets of examples employ,
among others, structural induction over term algebras.
1 https://github.com/vprover/vampire/tree/induction-paar22</p>
      <p>Experimental setup and results. Our experiments were performed using BenchExec [23],
with a 300s time limit and 16GB memory limit, using four portfolio strategies of Vampire
for structural induction. The strategies are orthogonal to the newly introduced options
and are denoted with numbers 1–4. We measured how each of the indfg option values
performed, with our results summarized in Table 1. We also measured whether enabling
sic on top of resolve impacts performance. Note that with resolve and over 1308 proof
attempts on the set UFDTLIA, the 1.29 × 10⁸ unique induction formulas generated were
applied almost 2.18 × 10⁸ times, averaging 1.69 applications per induction formula over
UFDTLIA examples, which may be the reason that resolve has only a marginal advantage
over the naive regenerate on these benchmarks. This ratio is only 1.19 for DTY,
with over 1.5 × 10¹³ unique induction formulas and around 1.78 × 10¹³ applications over
13584 proof attempts. Table 1 shows that restricted induction in saturation (Resolution)
in general performs better, while reusing and simplifying induction formulas
plays an important role (cf. the sic option).</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>We implement and experimentally evaluate a number of approaches for inductive reasoning
in Vampire-style saturation. We present possible ways to add consequences of induction
schemata into a saturation loop: experimentally speaking, it turns out that restricted
induction (Resolution) is the best we have so far developed. Further, we show that, by
careful application of indexing techniques, it is possible to reuse induction formulas to
achieve higher performance, including by storing them already simplified. Finally, we
discuss the interaction of AVATAR, and clause splitting more generally, with this line of
inductive reasoning.</p>
      <p>Acknowledgements. This work was partially funded by the ERC CoG ARTIST
101002685, the EPSRC grant EP/P03408X/1, the FWF grant LogiCS W1255-N23, and
the TU Wien SecInt DK.</p>
      <p>[7] H. Barbosa, C. W. Barrett, M. Brain, G. Kremer, H. Lachnitt, M. Mann, A. Mohamed, M. Mohamed, A. Niemetz, A. Nötzli, A. Ozdemir, M. Preiner, A. Reynolds, Y. Sheng, C. Tinelli, Y. Zohar, CVC5: A Versatile and Industrial-Strength SMT Solver, in: TACAS, 2022, pp. 415–442.</p>
      <p>[8] M. Kaufmann, J. S. Moore, ACL2: An Industrial Strength Version of Nqthm, in: COMPASS, 1996, pp. 23–34.</p>
      <p>[9] A. Bundy, A. Stevens, F. van Harmelen, A. Ireland, A. Smaill, Rippling: A Heuristic for Guiding Inductive Proofs, Artif. Intell. 62 (1993) 185–253.</p>
      <p>[10] R. Nieuwenhuis, A. Rubio, Paramodulation-Based Theorem Proving, in: Handbook of Automated Reasoning, 2001, pp. 371–443.</p>
      <p>[11] L. Kovács, A. Voronkov, First-Order Theorem Proving and Vampire, in: CAV, 2013, pp. 1–35.</p>
      <p>[12] S. Schulz, S. Cruanes, P. Vukmirovic, Faster, Higher, Stronger: E 2.3, in: CADE, 2019, pp. 495–507.</p>
      <p>[13] C. Weidenbach, D. Dimova, A. Fietzke, R. Kumar, M. Suda, P. Wischnewski, SPASS Version 3.5, in: CADE, 2009, pp. 140–145.</p>
      <p>[14] L. Kovács, S. Robillard, A. Voronkov, Coming to Terms with Quantified Reasoning, in: POPL, 2017, pp. 260–270.</p>
      <p>[15] J. C. Blanchette, N. Peltier, S. Robillard, Superposition with Datatypes and Codatatypes, in: IJCAR, 2018, pp. 370–387.</p>
      <p>[16] A. Voronkov, AVATAR: The Architecture for First-Order Theorem Provers, in: CAV, 2014, pp. 696–710.</p>
      <p>[17] P. Hozzová, L. Kovács, A. Voronkov, Integer Induction in Saturation, in: CADE, 2021, pp. 361–377.</p>
      <p>[18] M. Hajdú, P. Hozzová, L. Kovács, A. Voronkov, Induction with Recursive Definitions in Superposition, in: FMCAD, 2021, pp. 1–10.</p>
      <p>[19] R. Sekar, I. Ramakrishnan, A. Voronkov, Term Indexing, in: Handbook of Automated Reasoning, 2001, pp. 1853–1964.</p>
      <p>[20] K. Hoder, A. Voronkov, The 481 Ways to Split a Clause and Deal with Propositional Variables, in: CADE, 2013, pp. 450–464.</p>
      <p>[21] C. Barrett, P. Fontaine, C. Tinelli, The Satisfiability Modulo Theories Library (SMT-LIB), www.SMT-LIB.org, 2016.</p>
      <p>[22] M. Hajdú, P. Hozzová, L. Kovács, J. Schoisswohl, A. Voronkov, Inductive Benchmarks for Automated Reasoning, in: CICM, 2021, pp. 124–129.</p>
      <p>[23] D. Beyer, Reliable and Reproducible Competition Results with BenchExec and Witnesses (Report on SV-COMP 2016), in: TACAS, 2016, pp. 887–904.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Reynolds</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kunčak</surname>
          </string-name>
          ,
          <article-title>Induction for SMT Solvers</article-title>
          , in: VMCAI,
          <year>2015</year>
          , pp.
          <fpage>80</fpage>
          -
          <lpage>98</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Claessen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Johansson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rosén</surname>
          </string-name>
          , N. Smallbone,
          <source>HipSpec: Automating Inductive Proofs of Program Properties</source>
          , in: ATx/WInG,
          <year>2012</year>
          , pp.
          <fpage>16</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Wand</surname>
          </string-name>
          , Superposition: Types and Induction,
          <source>Ph.D. thesis</source>
          , Saarland University,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Reger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Voronkov</surname>
          </string-name>
          ,
          <article-title>Induction in Saturation-Based Proof Search</article-title>
          , in: CADE,
          <year>2019</year>
          , pp.
          <fpage>477</fpage>
          -
          <lpage>494</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Cruanes</surname>
          </string-name>
          ,
          <article-title>Superposition with Structural Induction</article-title>
          , in: FroCoS,
          <year>2017</year>
          , pp.
          <fpage>172</fpage>
          -
          <lpage>188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Passmore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cruanes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ignatovich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Aitken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kagan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kanishev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Maclean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mometto</surname>
          </string-name>
          ,
          <source>The Imandra Automated Reasoning System, in: IJCAR</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>464</fpage>
          -
          <lpage>471</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>