<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Set of Support for Higher-Order Reasoning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ahmed Bhayat</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giles Reger</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Manchester</institution>
          ,
          <addr-line>Manchester</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>16</lpage>
      <abstract>
<p>Higher-order logic (HOL) is utilised in numerous domains from program verification to the formalisation of mathematics. However, automated reasoning in the higher-order domain lags behind first-order automation. Many higher-order automated provers translate portions of HOL problems to first-order logic (FOL) and pass them to FOL provers. However, FOL provers are not optimised for dealing with these translations. One of the reasons for this is that the axioms introduced during the translation (e.g. those defining combinators) may combine with each other during proof search, deriving consequences of the axioms irrelevant to the goal. In this work we evaluate the extent to which this issue affects proof search and introduce heuristics based on the set-of-support strategy for minimising the effects. Our experiments are carried out within the Vampire theorem prover and show that limiting how axioms introduced during translation are used can improve proof search on higher-order problems.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Most automated theorem provers for higher-order logic (HOL) utilise a
first-order (FO) theorem prover to discharge some of the proof burden [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. However,
this can lead to inefficiencies in practice due to FO provers not being optimised
for the often unwieldy translations. The main goal of this work is to extend the
Vampire theorem prover [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] to efficiently support higher-order reasoning. This
involves extending proof search to handle the axioms arising from the translation
from HOL to FOL in a special way. Our main focus is a translation from a
fragment of TPTP's higher-order form to FOL implemented within Vampire,
but we also look at a sample of output from the Sledgehammer tool [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        This work builds on previous work [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] for controlling the use of theory
axioms when using Vampire to reason about theories such as real and integer
arithmetic. In this previous work (and the work presented here) the idea is to
extend the set-of-support strategy to avoid inferences which cause the search
space to grow too quickly with irrelevant consequences of the theory. Indeed, we
can see the axioms introduced in the HOL translations as a theory of combinators
and logical connectives (see later for further details).
      </p>
      <p>Our preliminary results indicate that seeking to control the use of such axioms
in proof search is important and that the set-of-support strategy can be helpful
in doing this. The rest of the paper is organised as follows. We first introduce
relevant background (Section 2), we then present the translation from
higher-order logic to first-order logic (Section 3), next we report on some initial analysis
of the use of higher-order axioms in proof search within Vampire (Section 4),
this is followed by a description of the set-of-support strategy (Section 5) and
its evaluation (Section 6) before some final concluding remarks (Section 7).</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>We introduce the necessary background of first-order and higher-order logic,
saturation-based theorem proving in Vampire, set of support reasoning, and the tools for
automated higher-order reasoning.</p>
      <sec id="sec-2-1">
        <title>First-order and Higher-order Logic</title>
        <p>
          We consider a many-sorted first-order logic with equality. A term is either a
variable, a constant, or a function symbol applied to terms. Function symbols are
sorted, i.e. their arguments and the return value have a unique sort drawn from
a finite set of sorts S. We only consider well-sorted literals. There is an equality
symbol per sort and equalities can only be between terms of the same sort. Here
we consider a fragment of first-order logic where all atoms are equalities and
literals are atoms or their negation. That is, we treat predicates as functions
of boolean sort and include a theory of booleans (see [
          <xref ref-type="bibr" rid="ref7 ref8">7,8</xref>
          ]). Vampire supports
the use of predicates natively but our translation does not make use of them.
At this point we note that the theory of booleans requires the addition of the
axioms true ≠ false and (∀x : bool)(x = true ∨ x = false), as well as congruence
axioms for functions. Previous work [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] explores the addition of an additional
inference rule to remove the need for these axioms. We draw attention to this
as the translation from HOL to FOL introduces a lot of boolean structure, and
dealing with booleans becomes important when reasoning with the output of
this translation.
        </p>
        <p>Formulas (in FOL and HOL) may use the standard notions of quantification
and logical connectives, but in this work we assume all formulas are clausified
using standard techniques. A clause is a disjunction of literals where all variables
are universally quantified (existentially quantified variables can be replaced by
Skolem functions during clausification). We assume the reader is familiar with
the standard semantics of FOL.</p>
        <p>The main difference between FOL and HOL is that in the former variables
can only range over individuals and not over functions. Below a version of HOL
syntax based on the simply-typed lambda calculus is presented. The terms of
HOL are defined over a signature Σ = (F, S) where F is a set of function symbols
and S is a set of atomic sorts containing o, the sort of truth values, and ι, the sort
of individuals. For each type, it is assumed that there exists a countably infinite
set of variables with that type.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Types and Terms</title>
        <p>Types: τ ::= σ | τ1 → τ2, where σ ∈ S and τ1 and τ2 are types.</p>
        <p>Terms: t ::= x | c | (t1 : τ1 → τ2) (t2 : τ1) | λx : τ. t, where c is a constant of type τ.</p>
        <sec id="sec-2-2-1">
          <title>Logical Constants and Semantics</title>
          <p>It is assumed that the set of function symbols F contains at least the logical
constants ¬ : o → o, ∨ : o → o → o and, for each type τ, constants Π : (τ → o) → o and =τ : τ → τ → o
for universal quantification and primitive equality. The remaining logical
constants can be defined in terms of these. HOL formulas are defined as terms of
type o.</p>
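As a purely illustrative sketch (not part of the paper or of Vampire), the simply-typed syntax above can be modelled directly as a data type; all names below (Atomic, Arrow, Var, Const, App, Lam, sort_of) are our own:

```python
from dataclasses import dataclass

# Sorts: an atomic sort (e.g. "o", "i") or an arrow type t1 -> t2.
@dataclass(frozen=True)
class Atomic:
    name: str

@dataclass(frozen=True)
class Arrow:
    left: object
    right: object

# Terms: variables, constants, applications and lambda abstractions.
@dataclass(frozen=True)
class Var:
    name: str
    sort: object

@dataclass(frozen=True)
class Const:
    name: str
    sort: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Lam:
    var: Var
    body: object

O = Atomic("o")   # sort of truth values
I = Atomic("i")   # sort of individuals

def sort_of(t):
    """Compute the sort of a well-sorted term."""
    if isinstance(t, (Var, Const)):
        return t.sort
    if isinstance(t, App):
        f = sort_of(t.fun)
        assert isinstance(f, Arrow) and f.left == sort_of(t.arg)
        return f.right
    if isinstance(t, Lam):
        return Arrow(t.var.sort, sort_of(t.body))

# Example: \x:i. f x, where f : i -> o, is a formula-valued function of sort i -> o.
f = Const("f", Arrow(I, O))
x = Var("x", I)
term = Lam(x, App(f, x))
print(sort_of(term))  # Arrow(left=Atomic(name='i'), right=Atomic(name='o'))
```

Applying such a term to an individual yields a term of sort o, i.e. a formula, matching the definition above.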
          <p>In the semantics of higher-order logic, a model is a pair ({Dτ | τ ∈ T}, I),
where T is a set of sorts and {Dτ} is a set of frames, one for each sort τ ∈ T.
Each frame is itself a set, with Do = {⊥, ⊤} and Dτ1→τ2 being a set of functions
from Dτ1 to Dτ2. The function I maps each f ∈ F of type τ to an element of Dτ and is
chosen such that the logical constants have their expected denotation.</p>
          <p>There are two main semantics for higher-order logic. In standard semantics,
the set Dτ1→τ2 is taken to be the set of all possible functions with domain Dτ1 and
range Dτ2. In contrast, in Henkin or general semantics the frame Dτ1→τ2 is only
assumed to contain the minimal necessary number of members.</p>
          <p>
            A variable assignment ρ is a function that maps each variable X of type τ to a member
of Dτ. A valuation function V from terms of higher-order logic to their
denotations can be defined in the obvious manner. A term t of type o is valid in a model M if,
for all variable assignments ρ, Vρ(t) = ⊤. For further details on the semantics
of higher-order logic refer to [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ].
          </p>
          <p>In our work, we target a fragment of higher-order logic with Henkin
semantics. The fragment targeted does not contain choice or extensionality. Henkin
semantics is more or less standard in the automated higher-order logic
community. The higher-order section of the TPTP library stipulates Henkin semantics
as the intended semantics for all problems. Adding choice and extensionality
axioms to a search space is known to have an adverse effect on proof search. An
option for future experiments would be to add a strategy to Vampire which runs
with these axioms included.</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>Saturation-based Reasoning in the Vampire Theorem Prover</title>
        <p>Theorem provers such as Vampire are refutational and saturation-based. The
idea is that an input formula of the form Premises → Conjecture is negated, to
give Premises ∧ ¬Conjecture, then clausified to produce a set of clauses S. This
set is then saturated with respect to some inference system I, meaning that for
every inference from I with premises in S the conclusion of the inference is also
in S. If the saturated set S contains a contradiction then the initial formula is
necessarily valid. Otherwise, if I is a complete inference system, and importantly
the requirements for this completeness have been preserved, then S is satisfiable
and the input formula is not valid. Note that in our context of higher-order
reasoning we are never complete, but we mention this notion here as it is key to
the approach. Finite saturation may not be possible, and a lot of research over the
past 50 years has focussed on how to control saturation-based proof search
to make finding a contradiction more likely. This is the focus of this paper.</p>
        <p>The standard approach to saturation is the given-clause algorithm illustrated
in Figure 1. The idea is to have an active set of clauses with the invariant that
all inferences between active clauses have been performed and a passive set of
clauses waiting to be activated. The algorithm then iteratively selects a given
clause from passive and performs all necessary inferences to add it to active.
The process of clause selection is important but not discussed here.</p>
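To make the loop concrete, here is a minimal sketch of the given-clause algorithm over propositional clauses, with FIFO clause selection and binary resolution standing in for Vampire's far more sophisticated machinery; everything below is our own illustrative code:

```python
def resolve_all(c1, c2):
    """All binary resolvents of two clauses; a clause is a frozenset of ints,
    a negative int being a negated atom."""
    return [frozenset((c1 - {l}) | (c2 - {-l})) for l in c1 if -l in c2]

def given_clause(initial_clauses, max_iterations=10000):
    """Sketch of the given-clause saturation loop."""
    active, passive = [], list(initial_clauses)
    for _ in range(max_iterations):
        if not passive:
            return "saturated"        # no contradiction found: set is saturated
        given = passive.pop(0)        # clause selection (here simply FIFO)
        if not given:
            return "refutation"       # empty clause derived
        for other in active + [given]:
            passive.extend(resolve_all(given, other))
        active.append(given)          # invariant: all active-active inferences done
    return "timeout"

# p, p -> q, not q  (atom 1 is p, atom 2 is q): unsatisfiable, so we refute.
print(given_clause([frozenset({1}), frozenset({-1, 2}), frozenset({-2})]))  # refutation
```

A real prover additionally performs redundancy elimination and uses heuristics for clause selection, but the active/passive structure is exactly as above.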
        <p>Vampire uses resolution and superposition as its inference system I. A key
feature of this calculus is the use of literal selection and orderings to restrict the
application of inference rules, thus restricting the growth of the clause sets.
Another very important concept related to saturation is the notion of redundancy.
It is not key to this paper so we do not discuss the details but the idea is that
some clauses in S are redundant in the sense that they can be safely removed
from S without changing what can be derived. The notion of saturation then
becomes saturation-up-to-redundancy.</p>
        <p>
          As a final note, from this discussion it may appear that completeness of
I is important. But as we showed in our recent work on literal selection [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ],
breaking the conditions required for completeness can be helpful in proof search.
Additionally, the focus of this paper is on higher-order reasoning, where the
previous completeness arguments for FOL no longer hold.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>Set of Support</title>
        <p>
          The set of support strategy was introduced in [
          <xref ref-type="bibr" rid="ref10 ref16">16,10</xref>
          ] as a method for restricting
the possible inferences. The idea is to split the clauses into a set of support and
the rest, and then restrict inferences so that they must include at least one clause
that is in the set of support, with new clauses being added to the set of support.
Another way of phrasing this is that all inferences must have a clause in the
initial set of support as an ancestor.
        </p>
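The restriction can be sketched in a few lines (our own illustrative code, again over propositional clauses): every inference must take at least one premise from the set of support, and conclusions join the set of support:

```python
def resolve_all(c1, c2):
    """All binary resolvents of two clauses; a clause is a frozenset of ints,
    a negative int being a negated atom."""
    return [frozenset((c1 - {l}) | (c2 - {-l})) for l in c1 if -l in c2]

def sos_saturate(clauses, sos, max_rounds=100):
    """Set-of-support restriction: every inference takes at least one premise
    from the set of support; conclusions are added to the set of support."""
    rest = [c for c in clauses if c not in sos]   # never resolved with each other
    sos = list(sos)
    seen = set(sos) | set(rest)
    for _ in range(max_rounds):
        new = []
        for s in sos:                  # at least one premise (s) is supported
            for other in sos + rest:
                for concl in resolve_all(s, other):
                    if concl not in seen:
                        seen.add(concl)
                        new.append(concl)
        if any(not c for c in new):
            return "refutation"        # empty clause derived
        if not new:
            return "saturated"
        sos.extend(new)                # new clauses go into the set of support
    return "timeout"

# Goal-directed: only the (negated) conjecture {-q} starts in the set of support.
clauses = [frozenset({1}), frozenset({-1, 2}), frozenset({-2})]   # p, p -> q, not q
print(sos_saturate(clauses, [frozenset({-2})]))   # refutation
```

Note that the clauses outside the set of support are never resolved with each other, which is exactly the invariant-breaking behaviour described above.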
        <p>If this description feels familiar, it should: the above given-clause
algorithm is based on the set of support strategy. Here the set of support is the
passive set, and the rest of the clauses must be immediately added to the
active set, breaking the invariant that all inferences between clauses in this set
have been performed. This immediately breaks the condition required for
completeness. Note that our context of higher-order reasoning also introduces larger
barriers to completeness. We are interested in how this strategy can be used to
prevent the explosion of axioms coming from the HOL to FOL translation.</p>
        <p>
          In the general set of support strategy it is not prescribed which clauses should
be added to the set of support, but it is reasonable to assume that it should
at least include the goal, thus encouraging goal-directed reasoning. This is the
approach taken in Vampire where only the conjecture is added to the set of
support. We note here that this means that set of support can only be used for
problems containing a conjecture which excludes many problems in TPTP and
all problems in SMTLIB [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>In Vampire there are four set-of-support related options:
– off: do not perform set of support reasoning,
– on: perform set of support reasoning with standard literal selection,
– all: perform set of support reasoning selecting all literals in those clauses
initially added to active; this helps mitigate some of the incompleteness
introduced by the set of support strategy,
– theory: apply set of support reasoning for theory axioms only (see below
and Section 5).</p>
        <p>
          Our previous work [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] in extending set of support for theory reasoning
demonstrates that (i) set of support is a generally useful strategy, and (ii) it
can be adapted to give more fine-grained control over the impact of including
axioms from some theory in proof search.
        </p>
      </sec>
      <sec id="sec-2-5">
        <title>Automated Higher-order Reasoning</title>
        <p>There exists a rich literature on techniques for automating higher-order
reasoning. A portion of this centres on techniques for translating higher-order to
first-order logic. It is this previous work that motivates us, as Vampire has often
been used to discharge the first-order problems produced by such translations.</p>
        <p>
          The higher-order theorem prover Isabelle is well-known as an interactive
prover, but due to the possibility of chaining powerful automatic proof tactics
together, it can also be run as an automatic prover. Notable amongst these tactics
is Sledgehammer [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] which translates the current goal to first-order logic and
then calls a number of first-order provers (including SMT solvers [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]) on this
translation in parallel [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>
          The fully automated higher-order prover LEO-III utilises a calculus based
on paramodulation, equality and primitive substitution. It implements special
inference rules to deal with choice, function synthesis and extensionality [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
LEO-III also cooperates with first-order provers by regularly translating the
clause set to first-order logic and passing it to a prover.
        </p>
        <p>The translation utilised by both Sledgehammer and LEO-III differs
minimally from that implemented in Vampire, details of which can be found in Section
3. One difference between the Vampire translation and the Isabelle/LEO
translation relates to higher-order functions. If a higher-order function f always
appears in the input problem with at least n arguments, then both
Sledgehammer and LEO-III will translate it to a first-order function of arity n. Vampire,
on the other hand, deals with all higher-order application via the binary app
function discussed below. Vampire's translation results in a more nested
structure (which can be problematic for proof search), but is more complete. For
example, if the function symbol f always appears with two arguments, in the
non-Vampire translation this would appear as f(t1, t2), and first-order
unification with ¬app(app(X, t1), t2) would fail.</p>
        <p>
          Another important automated higher-order prover is Satallax [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Satallax
implements a tableau calculus, attempting to ground the formulas on a branch.
The proof search maintains a set of propositional clauses linked to the set of
formulas. If this set of clauses ever becomes unsatisfiable, the original problem
is unsatisfiable and Satallax records the input problem as a theorem.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Translating HOL to FOL</title>
      <p>Higher-order logic supports a number of features not supported in first-order
logic. To translate from HOL to FOL we need to deal with each of these features.
These features are as follows:</p>
      <sec id="sec-3-1">
        <title>Features of HOL Not Present in FOL</title>
        <p>1. partial application: (f : τ1 → τ2 → τ3) applied to (c : τ1)
2. applied variables: ∀x : τ1 → τ2. ¬(x a = x b)
3. lambda binders: λx. f x
4. boolean sub-terms: p_{o→o}(x_o ∨ y_o)
The following sections explain how Vampire deals with each feature in its
translation (which, as mentioned previously, is similar to that of Sledgehammer and
LEO-III). Lambda expressions are translated using combinators rather than
lambda-lifting for a couple of reasons. Firstly, the implementation of the combinator
translation was relatively straightforward, allowing the quick evaluation of
various heuristics prior to the exploration of other translations. Secondly,
lambda-lifting is incomplete. The idea behind the Vampire translation is to achieve
as complete a translation as possible, and then control the search space with
heuristics and special inference rules.</p>
        <p>Note that the translation results in a fully first-order set of formulas, and
therefore no modification is required to Vampire's data structures, proof
procedures or heuristics.</p>
        <sec id="sec-3-1-1">
          <title>Translating Partial Application and Applied Variables</title>
          <p>All higher-order application, whether of constants or variables, is dealt with via a
two-place application function app. More precisely, let ⟦·⟧ be our translation from
higher-order to first-order logic. Then if s t is any higher-order application term,
⟦s t⟧ = app(⟦s⟧, ⟦t⟧). For example, the higher-order term f (X a) would be
translated to the first-order term app(f, app(X, a)). Note that higher-order function
symbols are translated to first-order constants.</p>
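As a sketch (our own code, not Vampire's), if we represent HOL application nodes as pairs and atoms as strings, the applicative part of the translation is a short recursion:

```python
def translate_app(t):
    """[[s t]] = app([[s]], [[t]]): flatten HOL application into the binary
    FO function `app`; atoms (constants/variables) map to themselves."""
    if isinstance(t, tuple) and len(t) == 2:     # application node (s, t)
        s, u = t
        return ("app", translate_app(s), translate_app(u))
    return t                                     # constant or variable name

# f (X a) becomes app(f, app(X, a)), as in the example above.
print(translate_app(("f", ("X", "a"))))  # ('app', 'f', ('app', 'X', 'a'))
```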
          <p>All higher-order types are translated to first-order sorts. If, in the example
given above, f has higher-order type o → ι and X a has type o, then the outer
app function would be of type ('o → ι', o) → ι, where 'o → ι' is a first-order
sort. The extended set of Turner combinators used in the translation is given in Figure 2:
I = λx. x
K = λx. λy. x
C = λx. λy. λz. (x z y)
B = λx. λy. λz. (x (y z))
S = λx. λy. λz. (x z (y z))
Lambda-expressions are translated to first-order logic utilising these combinators.
The translation of a term t containing a lambda prefix is defined inductively:
if t = λx. x then ⟦t⟧ = iCOM
if t = λx. t1 x and x is not free in t1 then ⟦t⟧ = ⟦t1⟧
if t = λx. t1 and x is not free in t1 then ⟦t⟧ = app(kCOM, ⟦t1⟧)
if t = λx. t1 t2 and x is free in t2, not t1, then ⟦t⟧ = app(app(bCOM, ⟦t1⟧), ⟦λx. t2⟧)
if t = λx. t1 t2 and x is free in t1, not t2, then ⟦t⟧ = app(app(cCOM, ⟦λx. t1⟧), ⟦t2⟧)
if t = λx. t1 t2 and x is free in t1 and t2, then ⟦t⟧ = app(app(sCOM, ⟦λx. t1⟧), ⟦λx. t2⟧)
For each combinator introduced in the translation, its defining axiom is added
to the clause set. In particular, instances of the following 5 axiom schemas are
added as relevant:
1. ∀X : τ1, Y : τ2, Z : τ3. app(app(app(sCOM, X), Y), Z) = app(app(X, Z), app(Y, Z))
2. ∀X : τ1, Y : τ2, Z : τ3. app(app(app(bCOM, X), Y), Z) = app(X, app(Y, Z))
3. ∀X : τ1, Y : τ2, Z : τ3. app(app(app(cCOM, X), Y), Z) = app(app(X, Z), Y)
4. ∀X : τ1, Y : τ2. app(app(kCOM, X), Y) = X
5. ∀X : τ1. app(iCOM, X) = X</p>
          <p>Note that the translation is to many-sorted first-order logic, and thus the
above axioms are in reality axiom schemas. In the actual translation, sorted
versions of these axioms would be used.</p>
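The inductive clauses above amount to a bracket-abstraction algorithm. The following is our own untyped sketch of it (Vampire's actual implementation is typed and more involved); lambda terms are tuples ('lam', x, body), applications are ('app', s, t), and variables and constants are strings:

```python
def free_in(x, t):
    """Is variable x free in term t?"""
    if isinstance(t, str):
        return t == x
    if t[0] == 'app':
        return free_in(x, t[1]) or free_in(x, t[2])
    return t[1] != x and free_in(x, t[2])            # ('lam', v, body)

def translate(t):
    """Eliminate lambdas bottom-up, then abstract each bound variable."""
    if isinstance(t, str):
        return t
    if t[0] == 'app':
        return ('app', translate(t[1]), translate(t[2]))
    _, x, body = t
    return abstract(x, translate(body))

def abstract(x, t):
    """Bracket abstraction of x over a lambda-free term t, following the
    six cases above (iCOM / eta / kCOM / bCOM / cCOM / sCOM)."""
    if t == x:
        return 'iCOM'                                # \x. x
    if not free_in(x, t):
        return ('app', 'kCOM', t)                    # x not free in t
    _, t1, t2 = t                                    # t is an application t1 t2
    if t2 == x and not free_in(x, t1):
        return t1                                    # eta: \x. t1 x
    if free_in(x, t2) and not free_in(x, t1):
        return ('app', ('app', 'bCOM', t1), abstract(x, t2))
    if free_in(x, t1) and not free_in(x, t2):
        return ('app', ('app', 'cCOM', abstract(x, t1)), t2)
    return ('app', ('app', 'sCOM', abstract(x, t1)), abstract(x, t2))

print(translate(('lam', 'x', 'x')))                  # iCOM
print(translate(('lam', 'x', ('app', 'f', 'x'))))    # f  (eta case)
print(translate(('lam', 'x', ('lam', 'y', 'x'))))    # kCOM
```

Running this sketch on the worked example in Section 3.4 reproduces the paper's result app(app(bCOM, app(bCOM, vPI)), cCOM).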
        </sec>
        <sec id="sec-3-1-2">
          <title>Translating Logical Operators</title>
          <p>
            In higher-order logic, it is possible for terms of boolean sort to be function
arguments. Further, logical operators can be partially applied. Both of these are
disallowed in first-order logic. Where higher-order logical operators occur neither
partially applied nor as function arguments, they are mapped to their first-order
counterparts. Thus, ⟦f a ∨ f b⟧ = app(f, a) ∨ app(f, b). Where logical constants
appear as arguments, but are not within the scope of any lambdas, Vampire
deals with this via a process known as renaming [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ]. However, renaming within
the scope of a lambda could cause a bound variable to become free, and thus logical
constants appearing in this context are translated to uninterpreted first-order
constants. For example: ⟦λX : o, Y : o. X ∧ Y⟧ = ⟦λX : o, Y : o. ⟦X ∧ Y⟧⟧ =
⟦λX : o, Y : o. app(app(vAND, X), Y)⟧. Here ⟦·⟧ again represents our translation
function and vAND is a first-order constant of sort o → o → o.
          </p>
          <p>Axioms are added to provide these constants with the desired semantics. The
axioms are:
1. ∀X : o, Y : o. app(app(vOR, X), Y) = true ⟺ X = true ∨ Y = true
2. ∀X : o, Y : o. app(app(vAND, X), Y) = true ⟺ X = true ∧ Y = true
3. ∀X : o, Y : o. app(app(vIMP, X), Y) = true ⟺ (X = true ⟹ Y = true)
4. ∀X : o, Y : o. app(app(vIFF, X), Y) = true ⟺ (X = true ⟺ Y = true)
5. ∀X : o, Y : o. app(app(vXOR, X), Y) = true ⟺ (X = true ⊕ Y = true)
6. ∀X : o, Y : o. app(app(vEQUALS, X), Y) = true ⟺ X = Y
7. ∀X : o. app(vNOT, X) = true ⟺ X ≠ true
8. ∀X : τ → o. (app(vPI, X) = true ⟺ ∀Y : τ. app(X, Y) = true)
9. ∀X : τ → o. (app(vSIGMA, X) = true ⟺ ∃Y : τ. app(X, Y) = true)</p>
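To see that the propositional axioms pin down the usual truth tables, here is a small self-check (our own code, not from the paper): we model a few of the proxy constants over the two-element domain {true, false} and verify each defining equivalence on all assignments:

```python
T, F = 'true', 'false'

# Proxy constants modelled as Python functions on the boolean domain.
def vNOT(x):    return T if x != T else F
def vAND(x, y): return T if x == T and y == T else F
def vOR(x, y):  return T if x == T or y == T else F
def vIMP(x, y): return T if x != T or y == T else F

# Check the defining axioms on all boolean assignments.
for x in (T, F):
    for y in (T, F):
        assert (vAND(x, y) == T) == (x == T and y == T)
        assert (vOR(x, y) == T) == (x == T or y == T)
        assert (vIMP(x, y) == T) == ((x == T) <= (y == T))  # bool <= is implication
    assert (vNOT(x) == T) == (x != T)
print("axioms hold")
```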
        </sec>
        <sec id="sec-3-1-3">
          <title>Examples of Translation</title>
          <p>Consider the following higher-order formula:</p>
          <p>mforall = λPhi : (ι → o) → ι → o. λW : ι. (∀P : ι → o. Phi P W)</p>
          <p>Let ⟦·⟧ be the Vampire translation function. The first few steps the function
would take to translate the above formula are given below.</p>
          <p>⟦mforall = λPhi. λW. (∀P : ι → o. Phi P W)⟧
⟦mforall⟧ = ⟦λPhi. λW. (∀P : ι → o. Phi P W)⟧
mforall = ⟦λPhi. ⟦λW. (∀P : ι → o. Phi P W)⟧⟧
mforall = ⟦λPhi. ⟦λW. ⟦∀P : ι → o. Phi P W⟧⟧⟧
mforall = ⟦λPhi. ⟦λW. ⟦vPI_{((ι→o)→o)→o} (λP : ι → o. Phi P W)⟧⟧⟧
mforall = ⟦λPhi. ⟦λW. app(vPI_{((ι→o)→o)→o}, ⟦λP : ι → o. Phi P W⟧)⟧⟧
⋮
mforall = app(app(bCOM, app(bCOM, vPI)), cCOM)</p>
          <p>The final formula above has not been annotated with sort and type
information in order to keep it reasonably compact. However, it should be noted that
the sorts of the two bCOM constants and the types of the two app functions are
different.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Analysing the Impact of Axioms on Proof Search</title>
      <p>We analyse the impact of axioms introduced in the translation on proof search via
the following experiment. We take 2,369 higher-order TPTP problems and run
Vampire in default mode using the translation from HOL to FOL described in the
previous section. We record (i) the maximum depth of pure descendants of HOL
axioms in proofs, and (ii) the percentage of the clause search space consisting
of clauses derived purely from HOL axioms in both successful and unsuccessful
proof search attempts. By depth here we mean the maximum number of inference
steps between a clause and a member of the initial clause set.</p>
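The depth statistic can be computed from a proof DAG in a few lines; the following is our own illustrative sketch, not Vampire's instrumentation:

```python
def pure_hol_depths(proof, hol_axioms):
    """For each clause in a proof DAG (clause -> list of parent clauses),
    compute whether it is a pure descendant of HOL axioms and its depth:
    the maximum number of inference steps back to an input clause."""
    memo = {}
    def depth(c):
        if c in memo:
            return memo[c]
        parents = proof.get(c, [])
        if not parents:                       # input clause
            memo[c] = (c in hol_axioms, 0)
        else:
            infos = [depth(p) for p in parents]
            pure = all(p for p, _ in infos)   # pure iff all ancestors are HOL axioms
            memo[c] = (pure, 1 + max(d for _, d in infos))
        return memo[c]
    for c in proof:
        depth(c)
    return memo

# Toy proof: c3 is derived from two HOL axioms, c4 mixes in the goal clause.
proof = {'ax1': [], 'ax2': [], 'goal': [],
         'c3': ['ax1', 'ax2'], 'c4': ['c3', 'goal']}
info = pure_hol_depths(proof, {'ax1', 'ax2'})
print(info['c3'])  # (True, 1)  pure HOL descendant at depth 1
print(info['c4'])  # (False, 2) not pure: it has the goal as an ancestor
```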
      <p>The results are given in Tables 1 and 2. Table 1 gives the number of problems
solved with a certain depth of pure HOL axiom descendants. This shows us that
the majority of cases (almost 70%) do not require us to combine any axioms
introduced during proof search and most (almost 90%) only require combinations
up to depth 2. This suggests that restricting the depth of such combinations
will remove unnecessary work from proof search. Table 2 gives the number of
problems where the percentage of pure HOL descendants in the search space is
greater than a given percent for solved and unsolved problems (where zero is
a special case). This shows us that in the majority of cases such combinations
do not dominate the search space. However, in some cases they do and this is
more prevalent in cases where Vampire does not solve the problem, perhaps
suggesting that this could be a contributing factor. The evidence for combinations
of HOL axioms significantly impacting proof search is not overwhelming, but it is
strong enough to be worthy of investigation. Looking at the 24 problems where
the percentage is over 50%, we have 16 problems with more than 90% of the
search space being used, with a maximum of 96%.</p>
    </sec>
    <sec id="sec-5">
      <title>Set of Support for Reasoning with Higher-order Axioms</title>
    </sec>
    <sec id="sec-6">
      <title>Approach and Evaluation</title>
      <p>
        In this section we briefly describe how the notion of set of support is extended
to control the explosive nature of axioms introduced during translation. This is
broadly similar to the approach taken for theory axioms [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>The general idea is to treat the whole input problem, with the exception of
such axioms, as the set of support. However, as we saw in the previous section,
we sometimes need combinations of these axioms to find a proof. To allow this
we implement an approach that attempts to saturate the axioms up to some
depth during proof search (not preprocessing). The idea is to block all inferences
that would produce a clause derived only from the HOL axioms at a certain
depth. To achieve this we attach a flag and a counter to each clause, where the
flag indicates whether the clause is a pure consequence of HOL axioms and the
counter records the maximum number of inference steps between the clause and
a HOL axiom, i.e. the clause's depth in the proof. Inferences are updated to
maintain these values as appropriate. Whenever we generate a pure consequence
of HOL axioms with depth larger than a given threshold we delete this clause.</p>
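The bookkeeping just described can be sketched as follows (our own illustrative code; the limit plays the role of the depth-limit option described below):

```python
HOL_DEPTH_LIMIT = 2   # threshold on pure-HOL inference depth (illustrative value)

def conclusion_allowed(premises, limit=HOL_DEPTH_LIMIT):
    """Each premise carries (is_pure_hol, depth). A generated clause is a pure
    HOL consequence iff all premises are; its depth is one more than the
    deepest premise. Pure HOL consequences beyond the limit are discarded."""
    pure = all(p for p, _ in premises)
    depth = 1 + max(d for _, d in premises)
    if pure and depth > limit:
        return None                      # block: the conclusion is deleted
    return (pure, depth)

# An inference between two pure HOL clauses at depths 2 and 1 is blocked:
print(conclusion_allowed([(True, 2), (True, 1)]))    # None
# Mixing in a non-HOL premise always passes (the conclusion is not pure):
print(conclusion_allowed([(True, 2), (False, 5)]))   # (False, 6)
```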
      <p>Whilst testing the approach implemented above we noticed a number of
problems where the proof contained very shallow reasoning with input clauses
and deep reasoning with combinator axioms e.g. to derive a function to satisfy
some existential claim. In such cases the inverse of the above approach could
be helpful. To explore whether this was a useful approach we implemented an
option which acted as above but with the roles of HOL axioms and other clauses
reversed.</p>
      <p>The techniques described above rely on the translation being carried out by
Vampire. However, there is nothing here that relies on control over the
translation process; we only need to be able to identify HOL axioms. It happens that
problems produced by the Sledgehammer tool in TPTP format use the "help"
prefix for the name of any axiom introduced during translation¹. We add an
option to detect axioms with such names and identify them as HOL axioms.</p>
      <p>
        Finally, during testing we also observed that the axioms introduced for the
theory of booleans also have a significant impact on proof search in some cases.
Whilst a direct inference rule (FOOL paramodulation [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]) has been introduced
to tackle this, we also noted that the set of support approach for theory reasoning
introduced in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] could be applied to this theory.
      </p>
      <p>
        These discussions lead to the following changes/additions to Vampire options:
– The sos option is extended with three new values: hol, hol_inverse, and
both, where both turns both hol and theory on.
– A new option sos_hol_limit (sshl) is added to limit the depth of inferences
with pure HOL axioms. This is in addition to the sos_theory_limit (sstl)
introduced in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
– A new option detect_sledgehammer_axioms (dsa) will attempt to detect
HOL axioms introduced by Sledgehammer.
      </p>
      <p>¹ Thanks to Jasmin Blanchette for pointing this out to us.</p>
      <p>
        The experiments described in this section use problems from TPTP v7.0.0 and
were run on the StarExec cluster [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] where each node contains an Intel Xeon
2.4GHz processor. In the experiments we run Vampire 4.2.1 extended with the
options previously described. In this experiment we consider some combinations of the new options introduced
in the previous section and attempt to draw some conclusions about their benefit.
We have not evaluated the full space of options as some were introduced late
in the development and evaluation process. We select all relevant higher-order
problems from TPTP, i.e. those in higher-order form that are Theorems, and
exclude 93 problems that we cannot currently parse. This gives us a total of 2,369
problems. We run for 60 seconds with all options other than those specified set
to their default values.
      </p>
      <p>Table 3 presents the results of these experiments. Using both kinds of
set-of-support where no descendants of HOL axioms are allowed improves on using
no set-of-support at all. It is interesting to note that the best performance was
achieved when boolean axioms were restricted along with HOL axioms. This is
also where all unique solutions were found (solutions found by a single set of
options).</p>
      <p>The most significant result, not present in the above table, is that overall
strategies employing set-of-support in some way solve 56 problems not solved
by that without set-of-support (off). This is a strong indicator that including
the set-of-support variations discussed here in a portfolio of techniques will be
of benefit.</p>
      <p>Furthermore, there are 64 cases where the difference in solution times between
the best set-of-support strategy and off is more than 1 second. If we just consider
these 64 cases, the average speedup using the set-of-support strategy is 179x;
31 cases have a speedup of at least 100x. Effectively this means that solutions
that were taking tens of seconds are now found immediately.</p>
      <p>
        As a comparison, this approach is currently not competitive with the leading
automated higher-order provers. On the same set of benchmarks Satallax [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
solves 2,155 problems and Leo-III [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] solves 2,216 problems. However, there
were 16 problems solved by Vampire not solved by either of these solvers.
      </p>
      <sec id="sec-6-1">
        <title>Looking at Problems from Sledgehammer</title>
        <p>We identified 364 problems inside the TPTP library that (i) make use of the
Sledgehammer translation, (ii) use axioms that can be identified using the
approach introduced previously, and (iii) can be parsed by Vampire. This is the
point in our experiments where we introduced the hol inverse option.
Sledgehammer axioms include those describing booleans. Therefore, in these
experiments we can view hol as restricting both kinds of axiom, with the same depth
for both (and combinations of the two, which did not apply previously).</p>
        <p>This time we run in the CASC portfolio mode, which runs a number of
different strategies in sequence. The results are given in Table 4. In terms of
overall problems solved, the set-of-support strategies did not make much impact.
However, strategies employing set-of-support in some way solve 10 problems
not solved by the strategy without set-of-support (off); of these, 8 were solved by
strategies using the hol inverse option. This time the average speedup was
much more modest at around 2.1x, with 29 problems being solved faster out
of 53 problems where the solution time varied by more than 1 second (so 24
solutions were slower with set-of-support strategies).</p>
        <p>Table 5 repeats the exercise of finding the depth of pure descendants of HOL
axioms in proofs. Here we see two cases with very deep combinations. These were
for SWW246+1.p and SWW255+1.p. This exercise reinforces our previous conclusion
that the depth of such reasoning is generally small.</p>
        <p>This paper has introduced a preliminary study aiming to understand the impact
of axioms introduced during translation from HOL to FOL on proof search in
first-order saturation-based theorem provers such as Vampire. We also presented
some preliminary work using an extension of the set-of-support strategy (which
was previously useful for theory reasoning) to improve proof search in such cases.
Our results show that such approaches could help but indicate that our current
implementation has room for improvement.</p>
        <p>This work leaves a number of open questions, which suggest possible future
directions:
Detecting HOL axioms. For our work on Sledgehammer problems we were able
to take advantage of the fact that the axioms we wanted to treat differently were
annotated. What can we do if this is not the case? Many of the axioms have a
particular syntactic form and it seems plausible that this would be enough in
many cases.</p>
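        <p>As an entirely hypothetical illustration of the syntactic-form idea (the term encoding and combinator names below are invented for the sketch, not taken from any particular translation): a combinator axiom is an equation whose left-hand side is a combinator constant applied, through an explicit application symbol, to distinct variables, and this shape is easy to test for.</p>

```python
# Hypothetical sketch: detect clauses with the rigid shape of a
# combinator-definition axiom. Terms are nested tuples ("app", f, x);
# variables are lowercase strings, constants are uppercase strings.

COMBINATORS = {"S", "K", "I", "B", "C"}  # typical combinator names

def spine(term):
    """Return (head, [args]) of a left-nested application term."""
    args = []
    while isinstance(term, tuple) and term[0] == "app":
        args.append(term[2])
        term = term[1]
    return term, list(reversed(args))

def looks_like_combinator_axiom(lhs):
    """True if lhs is a combinator applied to distinct variables,
    e.g. the left-hand side of  K x y = x."""
    head, args = spine(lhs)
    return (head in COMBINATORS
            and all(isinstance(a, str) and a.islower() for a in args)
            and len(set(args)) == len(args))  # variables must be distinct

# K x y = x  has left-hand side  app(app(K, x), y)
k_lhs = ("app", ("app", "K", "x"), "y")
```

        <p>A real detector would of course work on the prover's own term datatype and would need to handle the sort parameterisation of the translated axioms; the sketch only conveys that the check is a shallow structural match.</p>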
        <p>Are all HOL axioms equal? In this work we have treated all of the different
classes of axioms arising from the HOL to FOL translation in the same way,
for example, combinator axioms and axioms for logical symbols. Do they have the
same impact on proof search? Related experiments with theory axioms suggest
that separating the different classes would have a positive impact, but this comes
with the added cost of a larger space of proof search parameters.
What about non-pure axioms? In this work we have ignored all axioms that are
not pure HOL axiom descendants, but perhaps this is a spectrum: perhaps
axioms that are 'close' to the pure set should be treated differently from those
that are 'far'. Does it help to quantify this notion and refine our notion of depth
accordingly? The technical solution to quantifying closeness does not seem too
complex, but it is unclear whether this will help.</p>
        <p>When to do this? In this work we have chosen to restrict the depth of the pure
set of descendants during proof search. The alternatives would be to perform a
pre-saturation of this set or to statically (offline) produce such a set. The offline
approach will not work in general as the set of axioms is not known statically
due to their parameterisation by sort. The pre-saturation approach may involve
unnecessary computation, as we saw earlier that many problems required no
pure descendants. We have no data on how many 'unnecessary' descendants are
produced, but it is almost certain that we would not normally saturate to the
given depth, as the normal proof search heuristics will guide exploration of the
search space. Whilst we still believe doing this during proof search makes the
most sense, we should explore the alternatives.</p>
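        <p>The pre-saturation alternative can be sketched generically as a bounded forward closure of the axiom set in isolation. This is a minimal Python sketch under invented names; infer stands in for the prover's generating inferences between two clauses, which in practice would be resolution and superposition restricted to pure HOL-axiom descendants.</p>

```python
# Hypothetical sketch of the pre-saturation alternative: derive all
# consequences of the translation axioms alone, up to a fixed depth,
# before the main saturation loop begins.

def presaturate(axioms, infer, max_depth):
    """Bounded forward saturation of the axiom set in isolation.
    infer(c1, c2) returns the set of conclusions of c1 and c2."""
    levels = [set(axioms)]  # levels[d] holds clauses first derived at depth d
    for _ in range(max_depth):
        derived = set().union(*levels)
        new = set()
        for c1 in derived:
            for c2 in derived:
                new |= infer(c1, c2) - derived
        if not new:          # fixpoint reached before the depth bound
            break
        levels.append(new)
    return set().union(*levels)
```

        <p>The sketch also makes the cost concern above concrete: every pair of derived clauses is combined at each round, so the set can grow quickly even when, as we saw, most proofs need no pure descendants at all.</p>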
        <p>Common patterns. A final point is that there may be some common consequences
of pure HOL axioms that are used frequently in proofs and should be added
automatically. When we looked at the same idea with theory axioms we found
that such consequences were very shallow and adding them had no measurable
impact, but the case may be different in this setting and should be explored.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. <string-name>Clark Barrett</string-name>, <string-name>Aaron Stump</string-name>, and <string-name>Cesare Tinelli</string-name>. <article-title>The Satisfiability Modulo Theories Library (SMT-LIB)</article-title>. <source>www.SMT-LIB.org</source>, <year>2010</year>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. Christoph Benzmuller, Chad E. Brown, and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Kohlhase</surname>
          </string-name>
          .
          <article-title>Higher-order semantics and extensionality</article-title>
          .
          <source>Journal of Symbolic Logic</source>
          ,
          <volume>69</volume>
          , 12
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Christoph Benzmuller, Nik Sultana, Lawrence C. Paulson, and
          <string-name>
            <given-names>Frank</given-names>
            <surname>Theib</surname>
          </string-name>
          .
          <article-title>The higher-order prover Leo-II</article-title>
          .
          <source>Journal of Automated Reasoning</source>
          ,
          <volume>55</volume>
          (
          <issue>4</issue>
          ):
          <volume>389</volume>
          {
          <fpage>404</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Jasmin</given-names>
            <surname>Christian</surname>
          </string-name>
          <string-name>
            <given-names>Blanchette</given-names>
            , Sascha Bohme, and
            <surname>Lawrence</surname>
          </string-name>
          <string-name>
            <given-names>C.</given-names>
            <surname>Paulson</surname>
          </string-name>
          .
          <article-title>Extending sledgehammer with SMT solvers</article-title>
          .
          <source>J. Autom. Reasoning</source>
          ,
          <volume>51</volume>
          (
          <issue>1</issue>
          ):
          <volume>109</volume>
          {
          <fpage>128</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chad</surname>
            <given-names>E. Brown. Satallax:</given-names>
          </string-name>
          <article-title>An automatic higher-order prover</article-title>
          . In Bernhard Gramlich,
          <string-name>
            <given-names>Dale</given-names>
            <surname>Miller</surname>
          </string-name>
          , and Uli Sattler, editors,
          <source>Automated Reasoning</source>
          , pages
          <volume>111</volume>
          {
          <fpage>117</fpage>
          , Berlin, Heidelberg,
          <year>2012</year>
          . Springer Berlin Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Krystof</given-names>
            <surname>Hoder</surname>
          </string-name>
          , Giles Reger,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Suda</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>Selecting the selection</article-title>
          .
          <source>In Nicola Olivetti and Ashish Tiwari</source>
          , editors,
          <source>Automated Reasoning: 8th International Joint Conference, IJCAR</source>
          <year>2016</year>
          , Coimbra, Portugal, June 27 { July 2,
          <year>2016</year>
          , Proceedings, pages
          <volume>313</volume>
          {
          <fpage>329</fpage>
          ,
          <string-name>
            <surname>Cham</surname>
          </string-name>
          ,
          <year>2016</year>
          . Springer International Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Evgenii</given-names>
            <surname>Kotelnikov</surname>
          </string-name>
          , Laura Kovacs, Giles Reger, and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>The vampire and the FOOL</article-title>
          .
          <source>In Proceedings of the 5th ACM SIGPLAN Conference on Certi ed Programs and Proofs</source>
          , Saint Petersburg, FL, USA, January
          <volume>20</volume>
          -
          <issue>22</issue>
          ,
          <year>2016</year>
          , pages
          <fpage>37</fpage>
          {
          <fpage>48</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Evgenii</given-names>
            <surname>Kotelnikov</surname>
          </string-name>
          , Laura Kovacs, and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>A rst class boolean sort in rst-order theorem proving and TPTP</article-title>
          . In Intelligent Computer Mathematics - International Conference, CICM 2015, Washington, DC, USA, July
          <volume>13</volume>
          -
          <issue>17</issue>
          ,
          <year>2015</year>
          , Proceedings, pages
          <volume>71</volume>
          {
          <fpage>86</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Laura</given-names>
            <surname>Kovacs</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrei</given-names>
            <surname>Voronkov</surname>
          </string-name>
          .
          <article-title>First-order theorem proving and Vampire</article-title>
          . In Natasha Sharygina and Helmut Veith, editors,
          <source>CAV</source>
          <year>2013</year>
          , volume
          <volume>8044</volume>
          of Lecture Notes in Computer Science, pages
          <volume>1</volume>
          {
          <fpage>35</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>William</given-names>
            <surname>McCune</surname>
          </string-name>
          .
          <source>OTTER 2.0. In 10th International Conference on Automated Deduction, Kaiserslautern</source>
          , FRG,
          <source>July 24-27</source>
          ,
          <year>1990</year>
          , Proceedings, pages
          <volume>663</volume>
          {
          <fpage>664</fpage>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Jia</given-names>
            <surname>Meng</surname>
          </string-name>
          and Lawrence C. Paulson.
          <article-title>Translating higher-order clauses to rst-order clauses</article-title>
          .
          <source>Journal of Automated Reasoning</source>
          ,
          <volume>40</volume>
          (
          <issue>1</issue>
          ):
          <volume>35</volume>
          {
          <fpage>60</fpage>
          ,
          <string-name>
            <surname>Jan</surname>
          </string-name>
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. Lawrence C Paulson and
          <article-title>Jasmin Christian Blanchette</article-title>
          .
          <article-title>Three years of experience with sledgehammer, a practical link between automatic and interactive theorem provers</article-title>
          .
          <source>IWIL-2010</source>
          , 1,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Giles</given-names>
            <surname>Reger</surname>
          </string-name>
          and
          <string-name>
            <given-names>Martin</given-names>
            <surname>Suda</surname>
          </string-name>
          .
          <article-title>Set of support for theory reasoning</article-title>
          . In Thomas Eiter, David Sands, Geo Sutcli e, and Andrei Voronkov, editors,
          <source>IWIL Workshop and LPAR Short Presentations</source>
          , volume
          <volume>1</volume>
          of Kalpa Publications in Computing, pages
          <volume>124</volume>
          {
          <fpage>134</fpage>
          .
          <string-name>
            <surname>EasyChair</surname>
          </string-name>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <article-title>Alexander Steen and Christoph Benzmuller. The higher-order prover Leo-III. CoRR</article-title>
          , abs/
          <year>1802</year>
          .02732,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Aaron</surname>
            <given-names>Stump</given-names>
          </string-name>
          , Geo Sutcli e, and Cesare Tinelli.
          <article-title>StarExec, a cross community logic solving service</article-title>
          . https://www.starexec.org,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Lawrence</surname>
            <given-names>Wos</given-names>
          </string-name>
          , George A.
          <string-name>
            <surname>Robinson</surname>
            , and
            <given-names>Daniel F.</given-names>
          </string-name>
          <string-name>
            <surname>Carson</surname>
          </string-name>
          .
          <article-title>E ciency and completeness of the set of support strategy in theorem proving</article-title>
          .
          <source>J. ACM</source>
          ,
          <volume>12</volume>
          (
          <issue>4</issue>
          ):
          <volume>536</volume>
          {
          <fpage>541</fpage>
          ,
          <year>October 1965</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>