<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Some Thoughts About FOL-Translations in Vampire</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Manchester</institution>
          ,
          <addr-line>Manchester</addr-line>
          ,
          <country country="UK">U.K</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>2095</volume>
      <fpage>11</fpage>
      <lpage>25</lpage>
      <abstract>
        <p>It is a common approach when faced with a reasoning problem to translate that problem into first-order logic and utilise a first-order automated theorem prover (ATP). One of the reasons for this is that first-order ATPs have reached a good level of maturity after decades of development. However, not all translations are equal and in many cases the same problem can be translated in ways that either help or hinder the ATP. This paper looks at this activity from the perspective of a first-order ATP (mostly Vampire).</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        This paper looks at the common activity of encoding problems in first-order logic and running
an automated theorem prover (ATP) on the resulting formulas from the perspective of the
ATP. This paper focusses on the Vampire [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] theorem prover (available at https://vprover.
github.io/) but much of the discussion applies to other similarly constructed theorem provers
(e.g. E [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ] and SPASS [
        <xref ref-type="bibr" rid="ref51">51</xref>
        ]).
      </p>
      <p>Over the last few years we have been looking at the application of program
analysis/verification, the related encodings, and how Vampire can work with these encodings. In this paper I
also mention another setting where we have begun to do some work: working with logics more
expressive than first-order logic. This paper comes at the start of an activity to inspect the
first-order problems coming from translations involving these logics and considering how we can
make Vampire perform better on them.</p>
      <p>Throughout the paper I use the terms encoding and translation reasonably interchangeably
and lazily. Sometimes the term translation makes more sense (we are moving from one
formally defined language to another) and sometimes encoding makes more sense (we are taking
a description of a problem and representing it in first-order logic).</p>
      <p>This issue of different encodings have differing impacts on how easily a problem is solved
is well-known in the automated reasoning community. A standard example is the explosion in
conversion to CNF in SAT when not using the Tseitin encoding. The result is obviously bad
because it is much larger. In the setting of first-order theorem proving large inputs are also
bad but there are other (more subtle) ways in which an encoding can impact the effectiveness
of proof search.</p>
      <p>The main points of this paper are as follows:
1. The way in which a problem is expressed can have a significant impact on how easy or
hard it is for an ATP to solve it and the causes of this go beyond the size of the translation
2. It is often non-obvious whether a translation will be good; we need experiments
3. Sometimes there is no best solution i.e. different encodings may be helpful in different
scenarios. In such cases it can be advantageous to move this choice within the ATP
4. Sometimes the only solution is to extend the ATP with additional rules or heuristics to
make the ATP treat an encoding in the way we want it to be treated</p>
      <sec id="sec-1-1">
        <title>This points are expanded (with examples) in the rest of the paper.</title>
        <p>The rest of the paper is structured as follows. Section 2 describes the relevant inner workings
of Vampire that might have a significant impact on the way that it handles different encodings.
Section 3 reviews some encodings described in the literature. Section 4 attempts to make some
hints about things to consider when designing a representation of a problem in first-order in
logic. Section 5 gives some advice on how different encodings should be compared. Section 6
concludes.
2</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>The Relevant Anatomy of a First-Order ATP</title>
      <p>
        In this section we review the main components of Vampire [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] that might affect how well it
handles different encodings. As mentioned above, Vampire shares some elements with other
well-known first-order theorem provers.
      </p>
      <p>We consider multi-sorted first-order logic with equality as input to the tool. It will become
clear later that Vampire accepts extensions of this in its input, however its underlying logic
remains multi-sorted first-order logic with equality and extensions are (mostly) handled as
translations into this.</p>
      <p>Usually, Vampire deals with the separate notions of axioms and conjecture (or goal ).
Intuitively, axioms formalise the domain of interest and the conjecture is the claim that should
logically follow from the axioms. The conjecture is, therefore, negated before we start the
search for a contradiction. Conjectures are captured by the TPTP input language but not by
the SMT-LIB input language where everything is an axiom.
2.1</p>
      <sec id="sec-2-1">
        <title>Preprocessing</title>
        <p>
          Vampire works with clauses. So before proof search it is necessary to transform input
formulas into this clausal form. In addition to this process there are a number of (satisfiability,
but not necessarily equivalence, preserving) optimisation steps. For more information about
preprocessing see [
          <xref ref-type="bibr" rid="ref17 ref43">17, 43</xref>
          ].
        </p>
        <p>Clausification. This involves the replacement of existentially quantified variables by Skolem
functions, the expansion of equivalences, rewriting to negation normal form and then application
of associativity rules to reach conjunctive normal form. It is well know that this process can
lead to an explosion in the number of clauses.</p>
        <p>Subformula naming. A standard approach to dealing with the above explosion is the naming
of subformulas (similar to the Tseitin encoding from SAT). This process is parametrised by a
threshold which controls at which point we choose to name a subformula. There is a trade-off
here between a (possible) reduction in the size and number of resulting clauses and restricting
the size of the signature. Later we will learn that large clauses and a large signature are both
detrimental for proof search.</p>
        <p>
          Definition inlining. Vampire detects definitions at the formula and term level. The
equivalence p(X) ↔ F [X] is a definition if p does not occur in formula F , similarly the unit equality
2
f (X) = t is a definition if f does not appear in term t. Definitions may be inlined i.e. all
occurrences of p or f are replaced by their definition. This reduces the size of the signature with
the cost of a potential increase in the size of clauses. Note that it can be quite easy to break
this notion of definition in encodings. For example, by introducing a guard to a definition.
Goal-based premise selection. Vampire can use a goal to make proof search more
goaldirected. The point here is that if an encoding does not explicitly highlight the goal we cannot
make use of these techniques. There are two techniques used in preprocessing. The first is SInE
selection [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] which heuristically selects a subset of axioms that are likely to be used in the
proof of the goal. Axioms are selected using a notion of closeness to the goal that is based on
whether they can be connected to the goal via their least common symbol. The second is the
set of support strategy [
          <xref ref-type="bibr" rid="ref52">52</xref>
          ] where clauses from the goal are put into a set of support and proof
search is restricted so that it only makes inferences with clauses in, or derived from, this set.
2.2
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Saturation-Based Proof Search</title>
        <p>After a clause set has been produced, Vampire attempts to saturate this set with respect to
some inference system I. The clause set is saturated if for every inference from I with premises
in S the conclusion of the inference is also added to S. If the saturated set S contains a
contradiction then the initial formulas are unsatisfiable. Otherwise, if I is a complete inference
system and, importantly, the requirements for this completeness have been preserved, then the
initial formulas are satisfiable. Finite saturation may not be possible and many heuristics are
employed to make finding a contradiction more likely.</p>
        <p>To compute this saturation we use a set of active clauses, with the invariant that all
inferences between active clauses have been performed, and a set of passive clauses waiting to be
activated. The algorithm then iteratively selects a given clause from passive and performs all
necessary inferences to add it to active. The results of these inferences are added to passive
(after undergoing some processing). This is illustrated in Figure 1. An important aspect of this
process is clause selection. Clauses are selected either based on their age (youngest first) or
their weight (lightest first) with these two properties being alternated in some specified ratio.</p>
        <p>
          A recent addition to this story is AVATAR [
          <xref ref-type="bibr" rid="ref40 ref50">50, 40</xref>
          ], which (optionally) performs clause
splitting using a SAT solver. The main point here is that the success of AVATAR is driven by
the observation that saturation-based proof search does not perform well with long or heavy
clauses. Therefore, encodings should avoid the introduction of such clauses. As an additional
point, AVATAR can only be utilised if the boolean structure of a problem is exposed at the
literal-level. For example, including a predicate implies with associated axioms would not play
to AVATAR’s strengths.
2.3
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Inference Rules</title>
        <p>
          Vampire uses resolution and superposition as its inference system I [
          <xref ref-type="bibr" rid="ref1 ref34">1, 34</xref>
          ]. A key feature of this
calculus is the use of literal selection and orderings to restrict the application of inference rules,
thus restricting the growth of the clause sets. Vampire uses a Knuth-Bendix term ordering
(KBO) [
          <xref ref-type="bibr" rid="ref23 ref25 ref32">23, 25, 32</xref>
          ] which orders terms first by weight and then by symbol precedence whilst
agreeing with a multisubset ordering on free variables. The symbol ordering is taken as a
parameter but is relatively coarse in Vampire e.g. by order of occurrence in the input, arity,
frequency or the reverse of these. There has been some work beginning to explore more clever
things to do here [
          <xref ref-type="bibr" rid="ref22 ref38">22, 38</xref>
          ] but we have not considered treating symbols introduced by translations
differently (although they will appear last in occurrence).
        </p>
        <p>
          Understanding this is important as some translations change symbols used in terms, which
can change where the term comes in the term ordering. Vampire makes use of both complete
and incomplete literal selection functions [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] which combine the notion of maximality in the
term ordering with heuristics such as selecting the literal with the least top-level variables.
        </p>
        <p>
          Another very important concept related to saturation is the notion of redundancy. The idea
is that some clauses in S are redundant in the sense that they can be safely removed from S
without compromising completeness. The notion of saturation then becomes
saturation-up-toredundancy [
          <xref ref-type="bibr" rid="ref1 ref34">1, 34</xref>
          ]. An important redundancy check is subsumption. A clause A subsumes B if
some subclause of B is an instance of A, in which case B can be safely removed from the search
space as doing so does not change the possible models of the search space S. The fact that
Vampire removes redundant formulas is good but if there is a lot of redundancy in the encoding
we can still have issues as this removal can be lazy (e.g. when using the discount saturation
loop that does not remove redundancies from the passive set).
        </p>
        <p>Figure 2 gives a selection of inference rules used in Vampire. An interesting point can be
illustrated by examining the demodulation rule. This uses unit equalities to rewrite clauses
to replace larger terms by smaller terms. A similar, but more general, rewriting is performed
by superposition. Ordering restrictions in inference rules make proof search practical but, as I
comment later, can mean that the theorem prover does not treat encoded rules in the way that
we want.
2.4</p>
      </sec>
      <sec id="sec-2-4">
        <title>Strategies and Portfolio Mode</title>
        <p>
          Vampire is a portfolio solver [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ]. It implements many different techniques and when solving
a problem it may use tens to hundreds of different strategies in a time-sliced fashion. Along
with the above saturation-based proof search method, Vampire also implements the InstGen
calculus [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ] and a finite-model building through reduction to SAT [
          <xref ref-type="bibr" rid="ref42">42</xref>
          ]. In addition, Vampire
can be run in a parallel mode where strategies are distributed over a given number of cores.
        </p>
        <p>This is important as Vampire can try different preprocessing and proof search parameters
in different strategies, meaning that it does not matter if a particular encoding does not work
well with one particular strategy. Furthermore, if Vampire is allowed to do the translation itself
then it can try multiple different translations during proof search.
2.5</p>
      </sec>
      <sec id="sec-2-5">
        <title>Problem Characteristics that Matter</title>
        <p>As summary, we can discuss the problem characteristics that matter. The main ones we usually
talk about are:
4
Resolution
A ∨ C1 ¬A0 ∨ C2 ,</p>
        <p>(C1 ∨ C2)θ
Superposition
s 6' t ∨ C ,</p>
        <p>Cθ
where θ = mgu(s, t)
Demodulation
l ' r L[lθ] ∨ C ,</p>
        <p>(L[rθ] ∨ C)θ
where lθ</p>
        <p>rθ
Factoring
A ∨ A0 ∨ C ,
(A ∨ C)θ
where, for both inferences, θ = mgu(A, A0) and A is not an equality literal
l ' r ∨ C1
where θ = mgu(l, s) and rθ 6 lθ and, for the left rule L[s] is not an equality literal,
and for the right rule ⊗ stands either for ' or 6' and t0θ 6 t[s]θ
EqualityResolution</p>
        <p>EqualityFactoring</p>
        <p>s ' t ∨ s0 ' t0 ∨ C
(t 6' t0 ∨ s0 ' t0 ∨ C)θ</p>
        <p>,
where θ = mgu(s, s0), tθ 6 sθ, and t0θ 6 s0θ
UnitResultingResolution
C ∨ A1 ∨ . . . An
¬B1 . . .</p>
        <p>¬Bn ,</p>
        <p>Cθ
where |C| ≤ 1 and θ = F mgu(Ai, Bi)
• Number of resulting clauses. Clearly, if there are more clauses it will take longer to
process them initially. However, the number of clauses is a bad proxy for effort as a small
clause set can lead to many consequences, where a large clause set may have almost no
consequences at all.
• Size of resulting clauses. Long and heavy clauses lead to even longer and heavier clauses
when applying rules such as resolution. Some processes, such as subsumption checking,
are exponential in the length of a clause.
• Size of signature. In SAT-solving the relationship with the signature is clear as each
symbol represents a potential choice point. In saturation-based proof search we are effectively
trying to eliminate the literals in clauses until we get an empty clause. The combination
of clause selection and literal selection steers this process. The notion of maximality built
into literal selection means that ‘bigger’ symbols are preferred, driving proof search to
effectively eliminate symbols in this order. So, like in SAT solving, the larger the signature
the more work we need to do.</p>
        <p>But as we can see from the discussions of literal selection in this paper, issues with encodings
can be more subtle than this.
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Examples of Translations/Encodings</title>
      <p>This section reviews some translations or encodings targeting first-order logic. I don’t attempt
to be exhaustive; in fact there are many well-established examples that I will have missed.
3.1</p>
      <sec id="sec-3-1">
        <title>Simplifying things further</title>
        <p>Not all ATPs can handle multi-sorted first-order logic with equality. There are methods that
can be used to remove things that are not supported or wanted.</p>
        <p>
          Removing sorts. It is not always possible to simply drop sort information from problems as
the ‘size’ of different sorts may have different constraints [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Two proposed solutions are to
either guard the use of sorted variables by a sort predicate that indicates whether a variable is
of that sort (this predicate can be set to false in a model for all constants not of the appropriate
sort) or tag all values of a sort using a sort function for that sort (in a model the function
can map all constants to a constant of the given sort). Both solutions add a lot of clutter to
the signature although sort predicates add more as they also require the addition of axioms
for the sort predicates. Experimental evaluation [
          <xref ref-type="bibr" rid="ref42 ref8">8, 42</xref>
          ] concluded that the encodings can be
complementary. Similar techniques can be used for removing polymorphism [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. An important
optimisation here is that we can simply drop sorts if they are monotonic [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
Removing equality. Every atom t = s can be replaced by eq (t, s) with the addition of
axioms for reflexivity, symmetry, transitivity of eq and congruence of functions and predicates.
However, the result is not able to use the efficient superposition or demodulation rules. Vampire
optionally makes this transformation as in some cases it can aid proof search, in particular when
we do not require deep equality reasoning. It is necessary to remove equality when using InstGen
as this does not support equality.
        </p>
        <p>
          Removing functions. A well known decidable fragment of first-order logic is the
BernaysSch¨onfinkel fragment (also known as effectively propositional ) where formulas contain no
(nonzero arity) function symbols (even after Skolemisation). If a problem fits in this fragment then
we obtain this useful property. For example, the problem of finding a finite model of a first-order
formula can be reduced to a sequence of effectively propositional problems [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
3.2
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Syntactic Extensions for Program Verification</title>
        <p>
          A common setting where we see problems encoded in first-order logic is for program verification.
Tools such as Boogie [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] and Why3 [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] produce first-order proof obligations. There tend to
be common expressions in such obligations, such as if-then-else, which is why languages such
as TPTP [
          <xref ref-type="bibr" rid="ref48 ref49">48, 49</xref>
          ] and SMT-LIB [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] support these features. However, ATPs typically do not
support these features directly but via the following translations.
6
Boolean sort. In first-order logic predicates are of boolean sort and functions are not.
However, in programs we typically want to reason about boolean functions, which requires a
firstclass boolean sort in the problem. This can be encoded in first-order logic (as explained in
the FOOL work [
          <xref ref-type="bibr" rid="ref26 ref28">26, 28</xref>
          ]) by the introduction of two constants true and false and two axioms
true 6= false and ∀x ∈ bool : x = true ∨ x = false. This second axiom is problematic as it will
unify with every boolean sorted term and assert that it is either true or false. To overcome
this we introduce a specialised inference rule that captures the desired behaviour without the
explosive nature [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ].
        </p>
        <p>If-then-else. Conditionals are a common construct in programs and it is important to model
them. This is often done by extending the syntax by a term of the form if b then s else t for a
boolean sorted term b and two terms of the same sort t and s. The question is then how this
should be translated to first-order logic. I discuss three alternatives. In each case we assume
we are given a formula containing an if-then-else term e.g. F [if b then s else t]. In the first case
we translate this formula into two formulas
b → F [s] ,
¬b → F [t]
where b and its negation are used as guards. This has the disadvantage that it copies F . An
altnernative is to produce the three formulas</p>
        <p>F [g(X)] , b → g(X) = s ,
¬b → g(X) = t
where g is a fresh symbol and X are the free variables of the original if-then-else term. This
introduces a new symbol g. Finally, we could replace the formula by F [iteτ (b, s, t)], where ite
is a fixed function symbol of the sort bool × τ × τ → τ , and add the general axioms
(X = true) → iteτ (X, s, t) = s , (X = false) → iteτ (X, s, t) = t
capturing the intended behaviour of iteτ . This only introduce a pair of axioms per set of
if-then-else expressions with the same resultant sort.</p>
        <p>
          We now have three different translations and the question is: which should we use? There
are various observations we can make at this point. The first translation copies F , which may
lead to many more clauses being introduced if F is complex. However, it does not extend the
signature, whilst the second two translations do. The last translation introduces some axioms
that are likely to be quite productive during proof search. We should also think about what we
want to happen here; ideally the guards will be evaluated first and then the rest of the expression
either selected or thrown away. Recall that literal selection is the mechanism for choosing which
part of a clause is explored next. In general, literal selection prefers heavier literals that are
more ground and dislikes positive equalities. Therefore, in the second translation it is likely
that the guards will be evaluated before the equalities. In general, it is likely that the guard
will be simpler than the conditional body and therefore the first translation is likely to not do
what we want. However, it is difficult to draw conclusions from this; as discussed at the end
of this paper, we should really run some experiments to see. Note that currently Vampire will
pick between the first two depending on the subformula naming threshold [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ].
Next state relations. Recent work [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] has used a tuple-based let-in expression to encode
next-state relations of loop-free imperative programs. These tuple-based let-in expressions are
then translated into clausal form by introducing names for the bound functions [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] and relevant
axioms for the tuples (if required). The point of this translation was to avoid translation to
single static assignment form as is often done in translations of imperative programs [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The
reason to avoid such a form is that it drastically increases the size of the signature. Another
benefit of this work is that we can translate loop-free imperative programs directly into
firstorder logic, allowing Vampire to take such programs directly in its input.
3.3
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Proofs about Programming Languages</title>
        <p>
          Recent work [
          <xref ref-type="bibr" rid="ref14 ref15">15, 14</xref>
          ] has used ATPs to aid in establishing soundness properties of type systems.
The idea is to translate a language specification and an ‘exploration task’ into first-order logic.
They define and evaluate 36 different compilation strategies for producing first-order problems.
        </p>
        <p>The most interesting observation to make about this work is that they reinvent a number
of encodings and fail to take advantage of the encodings that already exist within ATPs. For
example, they consider if-then-else and let-in expressions but do not encode these in the TPTP
format but choose their own encodings. In this case they make the first choice for encoding
conditionals given above and they choose an alternative way to capture let-in expressions. They
also explore the inlining of definitions in their encoding, a step that Vampire already takes in
preprocessing. Even though they were able to use more contextual information to decide when
to inline, they found that their inlining had minimal impact on performance.</p>
        <p>A positive point about this work is that they made a thorough comparison of the different
encodings, although some of the suggested encodings were likely to hinder rather than help the
ATP. In particular, those that remove information from the problem.
3.4</p>
      </sec>
      <sec id="sec-3-4">
        <title>Higher-Order Logic</title>
        <p>
          Many of the automated methods for reasoning in higher-order logic work via (a series of)
encodings into first-order logic. Well-known examples are the Sledgehammer tool [
          <xref ref-type="bibr" rid="ref35 ref9">9, 35</xref>
          ] and
Leo III [
          <xref ref-type="bibr" rid="ref47">47</xref>
          ]. The translation of higher-order logic to first-order logic has been well studied
[
          <xref ref-type="bibr" rid="ref33">33</xref>
          ]. In the translation it is necessary to remove three kinds of higher-order constructs and we
discuss the standard translation for removing these here.
        </p>
        <p>Handling partial application and applied variables. In higher-order formulas it is
possible to partially apply function symbols, which means that we cannot use the standard first-order
term structure. The standard solution is to use the so-called applicative form. The general idea
is to introduce a special app symbol (per pair of sorts in the multi-sorted setting). An
application st is then translated as app(s, t). This also addresses the issue that higher-order formulas
can contain applied variables e.g. ∃f.∀x.f (x) = x now becomes ∃f.∀x.app(f, x) = x.</p>
        <p>Whilst this translation is correct it also adds a lot of clutter to clauses that causes problems
for proof search. One example of this is in literal selection as applicative form can change
the notion of maximal literal. For example, the literal g(a) = a will be maximal in clause
f (a) = a∨g(b) = b if g f in the symbol ordering. However, in appτ1 (f, a) = a∨appτ2 (g, b) = b
the maximal literal depends on the ordering of appτ1 and appτ2 , which cannot consistently agree
with the symbol ordering (consider h of sort τ1 such that h g). As a result, it seems likely
that the number of maximal literals in clauses will significantly increase. Additionally, literal
selection heuristics often consider the number of ‘top’ variables of a literal but the applicative
form changes the height of variables in terms. Finally, the applicative form can decrease the
efficiency of term indexing as (i) these often use function symbols to organise the tree but these
have been replaced, and (ii) terms are larger making the indices larger.
8</p>
        <p>
          However, the applicative form is only needed for supporting partial application and applied
variables. If neither are used in a problem then it is not needed (although note that the
combinator translation discussed below for λ-expressions introduces applied variables).
Removing λ-expressions. The standard approach to removing λ-expressions is rewriting
with Turner combinators [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] and the addition of axioms defining these combinators. The
translation itself can lead to very large terms, which can be mitigated by the introduction of
additional axioms at the cost of further axioms and extensions to the signature. An alternative
is λ-lifting which introduces new function symbols for every nested λ-expression, which again
can significantly increase the size of the signature. To avoid combinator axioms cluttering proof
search we have begun to explore replacing these with (incomplete) inference rules emulating
their behaviour [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>Translating logical operators. It is sometimes necessary to translate logical operators
occurring inside λ-expressions (other embedded logical operators can be lifted via naming). These
must then be specified by axioms e.g. OR could be defined by
app(app(OR, x), y) = true ⇐⇒</p>
        <p>X = true ∨ Y = true
which produces the clauses
app(app(OR, X), Y ) = f alse ∨ X = true ∨ Y = true
app(app(OR, X), Y ) = true ∨ X = f alse
app(app(OR, X), Y ) = true ∨ Y = f alse</p>
        <sec id="sec-3-4-1">
          <title>Alternatively, the last clause could be dropped and replaced by</title>
          <p>app(app(OR, X), Y ) = app(app(OR, Y ), X)
The non-orientability of this last equality suggests that it would be too productive. Finally,
whilst it seems most natural to introduce an equivalence, we would not typically need to reason
in the other direction and it could be interesting to see how necessary the first clause is.
3.5</p>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>Other Logics</title>
        <p>As we have seen from the previous example, a common activity is to consider other logics as
(possibly incomplete) embeddings into first-order logic. In all cases, an interesting direction of
research will be to inspect the kinds of problems produced and consider whether the encoding
or ATP could be improved.</p>
        <p>
          Translations via higher-order logic. Many non-classical logics, included quantified
multimodal logics, intuitionistic logic, conditional logics, hybrid logic, and free logics can be encoded
in higher-order logic [
          <xref ref-type="bibr" rid="ref13 ref5">5, 13</xref>
          ].
        </p>
        <p>
          Ontology Languages. In this setting it is more typical to use optimised reasoners for
decidable logics. However, some work has looked at translating these logics into first-order logic and
employing an ATP. As two examples, Schneider and Sutcliffe [
          <xref ref-type="bibr" rid="ref45">45</xref>
          ] give a translation of OWL 2
Full into first-order logic and Horrocks and Voronkov [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] translate the KIF language. In both
cases the translations allowed ATPs to solve difficult problems but it was not clear whether
they were in any sense optimal.
Propositional Modal Logic. There exists a standard relational encoding of certain
propositional modal logics into first-order logic where the accessibility relation is given as a predicate R
and relational correspondence properties are captured as additional axioms. To combat certain
deficiencies in this translation, a functional translation method was introduced with different
variations [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. One observation here is that many of the relational correspondence properties
are not friendly for proof search and it may be interesting to explore specific ATP extensions
that could help with such translations.
4
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Improving a Translation</title>
      <p>When thinking about translations of problems to first-order logic there are things that can be
done to improve ATP performance within the translation and there are things that can only be
done within the ATP. I discuss both sets of improvements here.
4.1</p>
      <sec id="sec-4-1">
        <title>What Can be Controlled in an Encoding</title>
        <p>The following describes some things to watch out for when designing a translation.
Throwing things away. A common mistake in encodings is to throw away information
that the theorem prover can use in proof search. An obvious example of this is performing
Skolemisation and dropping information about which function symbols are Skolem functions –
in some applications, e.g. inductive theorem proving, such Skolem constants play a special role
in proof search. Another example is dropping information about which formula is the goal. A
more subtle example of this is to perform subformula naming before passing the problem to
Vampire. In this case the original structure is lost and it is not possible for Vampire to choose
to name fewer subformulas. Recall that an advantage of the strategy scheduling structure of
Vampire is that it can try out various different preprocessing steps. In general, if an ATP can
perform a preprocessing step then it should be given the option to.</p>
        <p>
          Not adding what is helpful. Quite often there is some information about a problem that
may not affect the solvability of the problem but can be useful in improving the performance of
the solver. An obvious example is in the translation of the problem of finding first-order models
to EPR [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. In this translation there are a number of symmetries that are known during the
translation but that would be expensive to recover after the fact. Failing to add these gives the
solver unnecessary work to do. Another example would be failing to exclude (sets of) axioms
that are known to have no relation to the current goal – something that may be known when
generating the problem but can be difficult to determine by the ATP.
4.2
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>What Needs ATP Support</title>
        <p>Here I discuss some issues with translations that require ATP support to handle.
Exploding axioms. It is quite common to include an axiomitisation of some theory that is
included in the problem but only needed a little bit in proof search. It is also common for these
axioms to be explosive, in the sense that they generate a lot of unnecessary consequences in
the search space. A typical example would be axioms for arithmetic, or even for equality. We
have already seen an example with this when encoding the boolean sort.
10</p>
        <p>
          One ATP-based solution to this that we have explored [
          <xref ref-type="bibr" rid="ref39 ref7">39, 7</xref>
          ] is to identify such axioms and
limit the depth to which these axioms can interact. This is effectively the same as precomputing
the set of consequences of the axioms up to a certain size but happens dynamically at proof
search so may not require the full set. An alternative ATP-based solution is to capture the
rules represented by the axioms as additional inference rules.
        </p>
        <p>Rewriting the wrong way. Sometimes rules may not do what we want them to do. For
example, the following rule for set extensionality
(∀x)(∀y)(((∀e)(e ∈ y ↔ e ∈ y)) → x = y)</p>
        <p>f (x, y) 6∈ x ∨ f (x, y) 6∈ y ∨ x = y
will be clausified as
which is correct but will not be treated in the way we want by the ATP. Literal selection prefers
larger literals and dislikes (positive) equalities. Therefore, the literal x = y will not be selected.
However, given the goal a 6= b for sets a and b we want this to unify with x = y to generate
the necessary subgoals. The rewrite rule is oriented the wrong way. This can also happen with
implications or equalities where we want to rewrite something small into something big.</p>
        <p>
          One ATP-based solution is to introduce specific rules to handle special cases. For example,
in the case of extensionality Vampire includes the inference rule [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
x ' y ∨ C
        </p>
        <p>s 6' t ∨ D ,</p>
        <p>C{x 7→ s, y 7→ t} ∨ D
where x ' y ∨ C is identified as an extensionality clause.</p>
        <p>
          Finding the goal. As mentioned previously, preprocessing and proof search can benefit from
knowing what the goal of a problem is. Ideally the translation preserves this but the ATP can
also attempt to guess the goal based on the location of axioms in the input problem and the
frequency of symbols occurring in different parts of the problem [
          <xref ref-type="bibr" rid="ref37">37</xref>
          ].
        </p>
        <p>
          Changing the translation. It can be possible to detect cases where an alternative encoding
may be preferable and switch to that encoding via a further translation step. For example, some
work [
          <xref ref-type="bibr" rid="ref44">44</xref>
          ] in the CVC4 SMT solver looked at identifying datatypes encoding natural numbers
and translating these to guarded usage of integers, which the solver has better support for.
Supporting multiple encodings. Ultimately we may want to try various encodings via the
strategy scheduling approach. However, if the ATP cannot understand the original problem then
it cannot natively support these encodings. The solution here is to extend the input language
of the ATP such that it supports additional features that allow the different encodings to take
place. This is what we have done with our work on programs [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and datatypes [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ].
5
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Comparing Encodings</title>
      <p>
        It is generally hard to evaluate the addition of a new feature to a first-order ATP [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ]. The
same can be said for encodings. Here I give a few thoughts about how to go about it:
1. Actually do it. Without experiments it is impossible to draw any real conclusions. A
corollary here is don’t rely on properties of the output of the encoding that you assume are
good proxies for performance e.g. the number of resulting clauses. Whilst bigger problems
can pose a problem for ATPs, there are many small (&lt; 50 clauses) problems that ATPs
find very challenging.
2. Use portfolio mode. The encoding may require certain preprocessing or proof search
parameters to be switched on (or off) to be effective. Running in a single mode may
miss this and the wrong conclusion may be drawn. Furthermore, the encoding may react
positively to multiple different strategies and similarly, without portfolio mode this will
be missed. A corollary here is to look at the strategies actually used to solve problems
and see if there are any common patterns.
3. Don’t use portfolio mode. The portfolio modes in a solver are tuned to the current options
and assumptions about input problems. They are usually slightly over-fitted. Perhaps
the parameter option needed for your encoding is not included. The obvious solution is
to search the parameter space for the optimal combination of parameters for a particular
encoding. This is what we often do in Vampire but it is very expensive.
4. Think about the resources. Do the input problems reflect what you care about; often it is
easy to come up with pathological bad cases but optimising for these often makes little
practical difference. Is the time limit sensible; for portfolio mode allow minutes (I use 5)
and for single strategies allow seconds (I use 10 to 30) but use cases vary. The point here
is whether the results reflect actual usage for the problem being targeted – one encoding
may work better than another for cases that you don’t care about.
6
      </p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>In this paper I aimed to give some thoughts about the activity of translating problems into
first-order logic. My main conclusion is that for some encodings we need ATP support, and
this is what we are trying to provide in Vampire. If you want help extending Vampire for a
particular encoding please contact me.
12</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Bachmair</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Ganzinger</surname>
          </string-name>
          .
          <article-title>Resolution theorem proving</article-title>
          .
          <source>In A. Robinson and A</source>
          . Voronkov, editors,
          <source>Handbook of Automated Reasoning, volume I, chapter 2</source>
          , pages
          <fpage>19</fpage>
          -
          <lpage>99</lpage>
          . Elsevier Science,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Barnett and K. Rustan M. Leino</surname>
          </string-name>
          .
          <article-title>To goto where no statement has gone before</article-title>
          .
          <source>In Verified Software: Theories</source>
          , Tools, Experiments, Third International Conference, VSTTE 2010,
          <article-title>Edinburgh</article-title>
          ,
          <string-name>
            <surname>UK</surname>
          </string-name>
          ,
          <year>August</year>
          16-
          <issue>19</issue>
          ,
          <year>2010</year>
          . Proceedings, pages
          <fpage>157</fpage>
          -
          <lpage>168</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Clark</given-names>
            <surname>Barrett</surname>
          </string-name>
          , Aaron Stump, and
          <string-name>
            <given-names>Cesare</given-names>
            <surname>Tinelli</surname>
          </string-name>
          .
          <article-title>The Satisfiability Modulo Theories Library (SMT-LIB)</article-title>
          .
          <source>www.SMT-LIB.org</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Peter</given-names>
            <surname>Baumgartner</surname>
          </string-name>
          , Alexander Fuchs, Hans de Nivelle, and
          <string-name>
            <given-names>Cesare</given-names>
            <surname>Tinelli</surname>
          </string-name>
          .
          <article-title>Computing finite models by reduction to function-free clause logic</article-title>
          .
          <source>Journal of Applied Logic</source>
          ,
          <volume>7</volume>
          (
          <issue>1</issue>
          ):
          <fpage>58</fpage>
          -
          <lpage>74</lpage>
          ,
          <year>2009</year>
          . Special Issue:
          <article-title>Empirically Successful Computerized Reasoning</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Benzmu</surname>
          </string-name>
          <article-title>¨ller and Lawrence C Paulson. Multimodal and intuitionistic logics in simple type theory</article-title>
          .
          <source>Logic Journal of IGPL</source>
          ,
          <volume>18</volume>
          (
          <issue>6</issue>
          ):
          <fpage>881</fpage>
          -
          <lpage>892</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Ahmed</given-names>
            <surname>Bhayat</surname>
          </string-name>
          and
          <string-name>
            <given-names>Giles</given-names>
            <surname>Reger</surname>
          </string-name>
          .
          <article-title>Higher-order reasoning vampire style</article-title>
          .
          <source>In 25th Automated Reasoning Workshop, page 19</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Ahmed</given-names>
            <surname>Bhayat</surname>
          </string-name>
          and
          <string-name>
            <given-names>Giles</given-names>
            <surname>Reger</surname>
          </string-name>
          .
          <article-title>Set of support for higher-order reasoning</article-title>
          .
          <source>In PAAR</source>
          <year>2018</year>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Blanchette</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          <article-title>Bo¨hme, A. Popescu, and</article-title>
          <string-name>
            <given-names>N.</given-names>
            <surname>Smallbone</surname>
          </string-name>
          .
          <article-title>Encoding monomorphic and polymorphic types</article-title>
          .
          <source>In Proceedings of the 19th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS</source>
          <year>2013</year>
          ),
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Jasmin</given-names>
            <surname>Christian</surname>
          </string-name>
          <string-name>
            <given-names>Blanchette</given-names>
            , Sascha B¨ohme, and
            <surname>Lawrence</surname>
          </string-name>
          <string-name>
            <given-names>C.</given-names>
            <surname>Paulson</surname>
          </string-name>
          .
          <article-title>Extending sledgehammer with SMT solvers</article-title>
          .
          <source>J. Autom. Reasoning</source>
          ,
          <volume>51</volume>
          (
          <issue>1</issue>
          ):
          <fpage>109</fpage>
          -
          <lpage>128</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Franois</surname>
            <given-names>Bobot</given-names>
          </string-name>
          , Jean christophe Fillitre, Claude March, and Andrei Paskevich. Why3:
          <article-title>Shepherd your herd of provers</article-title>
          .
          <source>In In Workshop on Intermediate Veri cation Languages</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Koen</surname>
            <given-names>Claessen</given-names>
          </string-name>
          , Ann Lilliestro¨m, and Nicholas Smallbone.
          <article-title>Sort it out with monotonicity - translating between many-sorted and unsorted first-order logic</article-title>
          .
          <source>In Automated Deduction - CADE-23 - 23rd International Conference on Automated Deduction, Wroclaw, Poland, July 31 - August 5</source>
          ,
          <year>2011</year>
          . Proceedings, pages
          <fpage>207</fpage>
          -
          <lpage>221</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Laura</given-names>
            <surname>Kovcs Evgenii Kotelnikov</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>A FOOLish encoding of the next state relations of imperative programs</article-title>
          .
          <source>In IJCAR</source>
          <year>2018</year>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Tobias</surname>
            <given-names>Gleißner</given-names>
          </string-name>
          , Alexander Steen, and
          <article-title>Christoph Benzmu¨ller. Theorem provers for every normal modal logic</article-title>
          .
          <source>In LPAR-21, 21st International Conference on Logic for Programming</source>
          ,
          <source>Artificial Intelligence and Reasoning</source>
          , Maun, Botswana, May 7-
          <issue>12</issue>
          ,
          <year>2017</year>
          , pages
          <fpage>14</fpage>
          -
          <lpage>30</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Sylvia</surname>
            <given-names>Grewe</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Erdweg</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Mira</given-names>
            <surname>Mezini</surname>
          </string-name>
          .
          <article-title>Using vampire in soundness proofs of type systems</article-title>
          .
          <source>In Proceedings of the 1st and 2nd Vampire Workshops, Vampire@VSL</source>
          <year>2014</year>
          , Vienna, Austria, July
          <volume>23</volume>
          ,
          <year>2014</year>
          / Vampire@CADE 2015, Berlin, Germany,
          <source>August</source>
          <volume>2</volume>
          ,
          <year>2015</year>
          , pages
          <fpage>33</fpage>
          -
          <lpage>51</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Sylvia</surname>
            <given-names>Grewe</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Erdweg</surname>
          </string-name>
          , Andr´e Pacak, Michael Raulf, and
          <string-name>
            <given-names>Mira</given-names>
            <surname>Mezini</surname>
          </string-name>
          .
          <article-title>Exploration of language specifications by compilation to first-order logic</article-title>
          .
          <source>Sci. Comput</source>
          . Program.,
          <volume>155</volume>
          :
          <fpage>146</fpage>
          -
          <lpage>172</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Ashutosh</surname>
            <given-names>Gupta</given-names>
          </string-name>
          , Laura Kovcs, Bernhard Kragl, and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>Extensional crisis and proving identity</article-title>
          . In Franck Cassez and
          <string-name>
            <surname>Jean-Franois</surname>
            <given-names>Raskin</given-names>
          </string-name>
          , editors,
          <source>Automated Technology for Verification and Analysis</source>
          , volume
          <volume>8837</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>185</fpage>
          -
          <lpage>200</lpage>
          . Springer International Publishing,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Krystof</surname>
            <given-names>Hoder</given-names>
          </string-name>
          , Zurab Khasidashvili, Konstantin Korovin, and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>Preprocessing techniques for first-order clausification</article-title>
          . In Formal Methods in Computer-Aided Design,
          <string-name>
            <surname>FMCAD</surname>
          </string-name>
          <year>2012</year>
          , Cambridge, UK,
          <source>October 22-25</source>
          ,
          <year>2012</year>
          , pages
          <fpage>44</fpage>
          -
          <lpage>51</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Kryˇstof</given-names>
            <surname>Hoder</surname>
          </string-name>
          , Giles Reger,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Suda</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>Selecting the selection</article-title>
          .
          <source>In Nicola Olivetti and Ashish Tiwari</source>
          , editors,
          <source>Automated Reasoning: 8th International Joint Conference, IJCAR</source>
          <year>2016</year>
          , Coimbra, Portugal, June 27 - July 2,
          <year>2016</year>
          , Proceedings, pages
          <fpage>313</fpage>
          -
          <lpage>329</lpage>
          , Cham,
          <year>2016</year>
          . Springer International Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Krystof</given-names>
            <surname>Hoder</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>Sine qua non for large theory reasoning</article-title>
          .
          <source>In Automated Deduction - CADE-23 - 23rd International Conference on Automated Deduction, Wroclaw, Poland, July 31 - August 5</source>
          ,
          <year>2011</year>
          . Proceedings, pages
          <fpage>299</fpage>
          -
          <lpage>314</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Ian</given-names>
            <surname>Horrocks</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>Reasoning support for expressive ontology languages using a theorem prover</article-title>
          .
          <source>In Proceedings of the 4th International Conference on Foundations of Information and Knowledge Systems</source>
          ,
          <source>FoIKS'06</source>
          , pages
          <fpage>201</fpage>
          -
          <lpage>218</lpage>
          , Berlin, Heidelberg,
          <year>2006</year>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Ullrich</given-names>
            <surname>Hustadt</surname>
          </string-name>
          and
          <string-name>
            <given-names>Renate A.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          . MSPASS:
          <article-title>Modal reasoning by translation and first-order resolution</article-title>
          . In Roy Dyckhoff, editor,
          <source>Automated Reasoning with Analytic Tableaux and Related Methods</source>
          , pages
          <fpage>67</fpage>
          -
          <lpage>71</lpage>
          , Berlin, Heidelberg,
          <year>2000</year>
          . Springer Berlin Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Jan</surname>
            <given-names>Jakubuv</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Suda</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Josef</given-names>
            <surname>Urban</surname>
          </string-name>
          .
          <article-title>Automated invention of strategies and term orderings for vampire</article-title>
          .
          <source>In GCAI 2017, 3rd Global Conference on Artificial Intelligence</source>
          , Miami, FL, USA,
          <fpage>18</fpage>
          -
          <lpage>22</lpage>
          October
          <year>2017</year>
          ., pages
          <fpage>121</fpage>
          -
          <lpage>133</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>D.</given-names>
            <surname>Knuth</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Bendix</surname>
          </string-name>
          .
          <article-title>Simple word problems in universal algebras</article-title>
          . In J. Leech, editor,
          <source>Computational Problems in Abstract Algebra</source>
          , pages
          <fpage>263</fpage>
          -
          <lpage>297</lpage>
          . Pergamon Press, Oxford,
          <year>1970</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Korovin</surname>
          </string-name>
          .
          <article-title>Inst-Gen - A Modular Approach to Instantiation-Based Automated Reasoning</article-title>
          . In Andrei Voronkov and Christoph Weidenbach, editors,
          <source>Programming Logics: Essays in Memory of Harald Ganzinger</source>
          , pages
          <fpage>239</fpage>
          -
          <lpage>270</lpage>
          , Berlin, Heidelberg,
          <year>2013</year>
          . Springer Berlin Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Korovin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>Orienting rewrite rules with the Knuth-Bendix order</article-title>
          .
          <source>Inf. Comput.</source>
          ,
          <volume>183</volume>
          (
          <issue>2</issue>
          ):
          <fpage>165</fpage>
          -
          <lpage>186</lpage>
          ,
          <year>June 2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Evgenii</surname>
            <given-names>Kotelnikov</given-names>
          </string-name>
          , Laura Kova´cs, Giles Reger, and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>The vampire and the FOOL</article-title>
          .
          <source>In Proceedings of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs</source>
          ,
          <string-name>
            <surname>CPP</surname>
          </string-name>
          <year>2016</year>
          , pages
          <fpage>37</fpage>
          -
          <lpage>48</lpage>
          . ACM,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Evgenii</surname>
            <given-names>Kotelnikov</given-names>
          </string-name>
          , Laura Kova´cs, Martin Suda, and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>A clausal normal form translation for FOOL</article-title>
          .
          <source>In GCAI 2016. 2nd Global Conference on Artificial Intelligence, September 19 - October 2</source>
          ,
          <year>2016</year>
          , Berlin, Germany, pages
          <fpage>53</fpage>
          -
          <lpage>71</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Evgenii</surname>
            <given-names>Kotelnikov</given-names>
          </string-name>
          ,
          <article-title>Laura Kova´cs, and Andrei Voronkov. A first class boolean sort in first-order theorem proving and TPTP</article-title>
          . In Intelligent Computer Mathematics - International Conference, CICM 2015, Washington, DC, USA, July
          <volume>13</volume>
          -
          <issue>17</issue>
          ,
          <year>2015</year>
          , Proceedings, pages
          <fpage>71</fpage>
          -
          <lpage>86</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Laura</given-names>
            <surname>Kov</surname>
          </string-name>
          <article-title>´acs, Simon Robillard, and Andrei Voronkov</article-title>
          .
          <article-title>Coming to terms with quantified reasoning</article-title>
          . In Giuseppe Castagna and Andrew D. Gordon, editors,
          <source>Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL</source>
          <year>2017</year>
          , Paris, France,
          <source>January 18-20</source>
          ,
          <year>2017</year>
          , pages
          <fpage>260</fpage>
          -
          <lpage>270</lpage>
          . ACM,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Laura</given-names>
            <surname>Kova</surname>
          </string-name>
          <article-title>´cs and Andrei Voronkov</article-title>
          .
          <article-title>First-order theorem proving and Vampire</article-title>
          .
          <source>In CAV</source>
          <year>2013</year>
          , volume
          <volume>8044</volume>
          <source>of LNCS</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>35</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>K. Rustan M. Leino</surname>
          </string-name>
          and
          <article-title>Philipp Ru¨mmer. A polymorphic intermediate verification language: Design and logical encoding</article-title>
          .
          <source>In Proceedings of the 16th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS'10</source>
          , pages
          <fpage>312</fpage>
          -
          <lpage>327</lpage>
          , Berlin, Heidelberg,
          <year>2010</year>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Michel</given-names>
            <surname>Ludwig</surname>
          </string-name>
          and
          <string-name>
            <given-names>Uwe</given-names>
            <surname>Waldmann</surname>
          </string-name>
          .
          <article-title>An extension of the Knuth-Bendix ordering with LPO-like properties</article-title>
          .
          <source>In Logic for Programming</source>
          ,
          <source>Artificial Intelligence, and Reasoning</source>
          , 14th International Conference, LPAR 2007, Yerevan, Armenia,
          <source>October 15-19</source>
          ,
          <year>2007</year>
          , Proceedings, pages
          <fpage>348</fpage>
          -
          <lpage>362</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Jia</given-names>
            <surname>Meng</surname>
          </string-name>
          and Lawrence C. Paulson.
          <article-title>Translating higher-order clauses to first-order clauses</article-title>
          .
          <source>Journal of Automated Reasoning</source>
          ,
          <volume>40</volume>
          (
          <issue>1</issue>
          ):
          <fpage>35</fpage>
          -
          <lpage>60</lpage>
          ,
          <year>Jan 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>R.</given-names>
            <surname>Nieuwenhuis</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rubio</surname>
          </string-name>
          .
          <article-title>Paramodulation-based theorem proving</article-title>
          .
          <source>In A. Robinson and A</source>
          . Voronkov, editors,
          <source>Handbook of Automated Reasoning, volume I, chapter 7</source>
          , pages
          <fpage>371</fpage>
          -
          <lpage>443</lpage>
          . Elsevier Science,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <surname>Lawrence</surname>
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Paulson</surname>
          </string-name>
          and Jasmin Christian Blanchette.
          <article-title>Three years of experience with sledgehammer, a practical link between automatic and interactive theorem provers</article-title>
          . In Geoff Sutcliffe, Stephan Schulz, and Eugenia Ternovska, editors,
          <source>IWIL 2010. The 8th International Workshop on the Implementation of Logics</source>
          , volume
          <volume>2</volume>
          of EPiC Series in Computing, pages
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          . EasyChair,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Rawson</surname>
          </string-name>
          and
          <string-name>
            <given-names>Giles</given-names>
            <surname>Reger</surname>
          </string-name>
          .
          <article-title>Dynamic strategy priority: Empower the strong and abandon the weak</article-title>
          .
          <source>In PAAR</source>
          <year>2018</year>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Giles</given-names>
            <surname>Reger</surname>
          </string-name>
          and
          <string-name>
            <given-names>Martin</given-names>
            <surname>Riener</surname>
          </string-name>
          .
          <article-title>What is the point of an smt-lib problem?</article-title>
          <source>In 16th International Workshop on Satisfiability Modulo Theories</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>Giles</given-names>
            <surname>Reger</surname>
          </string-name>
          and
          <string-name>
            <given-names>Martin</given-names>
            <surname>Suda</surname>
          </string-name>
          .
          <article-title>Measuring progress to predict success: Can a good proof strategy be evolved</article-title>
          ?
          <source>AITP</source>
          <year>2017</year>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>Giles</given-names>
            <surname>Reger</surname>
          </string-name>
          and
          <string-name>
            <given-names>Martin</given-names>
            <surname>Suda</surname>
          </string-name>
          .
          <article-title>Set of support for theory reasoning</article-title>
          .
          <source>In IWIL Workshop and LPAR Short Presentations</source>
          , volume
          <volume>1</volume>
          of Kalpa Publications in Computing, pages
          <fpage>124</fpage>
          -
          <lpage>134</lpage>
          . EasyChair,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <surname>Giles</surname>
            <given-names>Reger</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Suda</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>Playing with AVATAR</article-title>
          . In P. Amy Felty and Aart Middeldorp, editors,
          <source>Automated Deduction - CADE-25: 25th International Conference on Automated Deduction</source>
          , Berlin, Germany,
          <source>August 1-7</source>
          ,
          <year>2015</year>
          , Proceedings, pages
          <fpage>399</fpage>
          -
          <lpage>415</lpage>
          , Cham,
          <year>2015</year>
          . Springer International Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <surname>Giles</surname>
            <given-names>Reger</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Suda</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>The challenges of evaluating a new feature in Vampire</article-title>
          . In Laura Kova´cs and Andrei Voronkov, editors,
          <source>Proceedings of the 1st and 2nd Vampire Workshops</source>
          , volume
          <volume>38</volume>
          of EPiC Series in Computing, pages
          <fpage>70</fpage>
          -
          <lpage>74</lpage>
          . EasyChair,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <surname>Giles</surname>
            <given-names>Reger</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Suda</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>Finding finite models in multi-sorted first-order logic</article-title>
          . In Nadia Creignou and Daniel Le Berre, editors,
          <source>Theory and Applications of Satisfiability Testing - SAT</source>
          <year>2016</year>
          : 19th International Conference, Bordeaux, France,
          <source>July 5-8</source>
          ,
          <year>2016</year>
          , Proceedings, pages
          <fpage>323</fpage>
          -
          <lpage>341</lpage>
          . Springer International Publishing,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <surname>Giles</surname>
            <given-names>Reger</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Suda</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>New techniques in clausal form generation</article-title>
          . In Christoph Benzmu¨ller, Geoff Sutcliffe, and Raul Rojas, editors,
          <source>GCAI 2016. 2nd Global Conference on Artificial Intelligence</source>
          , volume
          <volume>41</volume>
          of EPiC Series in Computing, pages
          <fpage>11</fpage>
          -
          <lpage>23</lpage>
          . EasyChair,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Reynolds</surname>
          </string-name>
          and
          <string-name>
            <given-names>Viktor</given-names>
            <surname>Kuncak</surname>
          </string-name>
          .
          <article-title>Induction for SMT solvers</article-title>
          . In Verification, Model Checking, and
          <string-name>
            <surname>Abstract</surname>
          </string-name>
          Interpretation - 16th
          <source>International Conference, VMCAI 2015</source>
          , Mumbai, India, January
          <volume>12</volume>
          -
          <issue>14</issue>
          ,
          <year>2015</year>
          . Proceedings, pages
          <fpage>80</fpage>
          -
          <lpage>98</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Schneider</surname>
          </string-name>
          and
          <string-name>
            <given-names>Geoff</given-names>
            <surname>Sutcliffe</surname>
          </string-name>
          .
          <article-title>Reasoning in the OWL 2 full ontology language using first-order automated theorem proving</article-title>
          .
          <source>In Nikolaj Bjørner and Viorica</source>
          Sofronie-Stokkermans, editors,
          <source>Automated Deduction - CADE-23</source>
          , pages
          <fpage>461</fpage>
          -
          <lpage>475</lpage>
          , Berlin, Heidelberg,
          <year>2011</year>
          . Springer Berlin Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulz. E -</surname>
          </string-name>
          <article-title>a brainiac theorem prover</article-title>
          .
          <volume>15</volume>
          (
          <issue>2-3</issue>
          ):
          <fpage>111</fpage>
          -
          <lpage>126</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Steen</surname>
          </string-name>
          and
          <article-title>Christoph Benzmu¨ller. The higher-order prover Leo-III. CoRR</article-title>
          , abs/
          <year>1802</year>
          .02732,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <given-names>Geoff</given-names>
            <surname>Sutcliffe</surname>
          </string-name>
          .
          <article-title>The TPTP problem library and associated infrastructure</article-title>
          .
          <source>J. Autom. Reasoning</source>
          ,
          <volume>43</volume>
          (
          <issue>4</issue>
          ):
          <fpage>337</fpage>
          -
          <lpage>362</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <given-names>Geoff</given-names>
            <surname>Sutcliffe</surname>
          </string-name>
          and
          <string-name>
            <given-names>Evgenii</given-names>
            <surname>Kotelnikov</surname>
          </string-name>
          .
          <article-title>TFX: The TPTP extended typed first-order form</article-title>
          .
          <source>In PAAR</source>
          <year>2018</year>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov. AVATAR:</surname>
          </string-name>
          <article-title>The architecture for first-order theorem provers</article-title>
          .
          <source>In Armin Biere and Roderick Bloem</source>
          , editors,
          <source>Computer Aided Verification</source>
          , volume
          <volume>8559</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>696</fpage>
          -
          <lpage>710</lpage>
          . Springer International Publishing,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>C.</given-names>
            <surname>Weidenbach</surname>
          </string-name>
          .
          <article-title>Combining superposition, sorts and splitting</article-title>
          . In A. Robinson and
          <string-name>
            <surname>A</surname>
          </string-name>
          . Voronkov, editors,
          <source>Handbook of Automated Reasoning</source>
          , volume II, chapter
          <volume>27</volume>
          , pages
          <fpage>1965</fpage>
          -
          <lpage>2013</lpage>
          . Elsevier Science,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          [52]
          <string-name>
            <surname>Lawrence</surname>
            <given-names>Wos</given-names>
          </string-name>
          , George A.
          <string-name>
            <surname>Robinson</surname>
          </string-name>
          , and Daniel F. Carson.
          <article-title>Efficiency and completeness of the set of support strategy in theorem proving</article-title>
          .
          <source>J. ACM</source>
          ,
          <volume>12</volume>
          (
          <issue>4</issue>
          ):
          <fpage>536</fpage>
          -
          <lpage>541</lpage>
          ,
          <year>October 1965</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>