<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Natural-Language Proof Assistant for Higher-Order Logic (Work in Progress)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Adam Dingle</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>KSVI, Faculty of Mathematics and Physics, Charles University</institution>
          ,
          <addr-line>Prague</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Natty is a new proof assistant in an early stage of development. It can read input written in a controlled natural language that looks like mathematical English with a restricted grammar and vocabulary. It translates this input to a series of formulas of classical higher-order logic. It can export these formulas to files in the standard THF format, or can attempt to prove them itself using a built-in automatic prover based on the higher-order superposition calculus. The built-in prover clausifies formulas incrementally to preserve as much structure as possible as it performs inferences. Although Natty is small (less than 2500 lines of OCaml code), its performance seems to be competitive with established provers such as E and Vampire in proving some basic arithmetic identities involving induction over natural numbers. In addition, the THF files that it generates for arithmetic identities may serve as a useful test set for other higher-order provers.</p>
      </abstract>
      <kwd-group>
        <kwd>controlled natural language</kwd>
        <kwd>formalization</kwd>
        <kwd>theorem proving</kwd>
        <kwd>higher-order logic</kwd>
        <kwd>superposition</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Natty is a new natural-language proof assistant with an embedded automatic prover based on
higher-order superposition. An input file for Natty is written in a controlled natural language [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] that
looks like mathematical English with a restricted grammar and vocabulary. It may contain a series of
axioms, definitions, and theorems, which may optionally include proofs written in natural language.
Natty translates this input to a series of formulas of classical higher-order logic, each representing an
entire theorem or a proof step. Natty then attempts to prove these formulas using its embedded prover,
reporting success or failure to the user.
      </p>
      <p>Natty is an early work in progress, and its capabilities are limited at this time. It is at least able to
read a natural-language input file that defines the natural numbers using the Peano axioms and asserts
a series of basic theorems about the naturals, such as that addition and multiplication are associative
and commutative. Natty’s embedded prover can prove most of these theorems in less than 1 second per
theorem on a typical machine. Many of these proofs require a higher-order induction step over the
natural numbers, which Natty can infer automatically. Additionally, some theorems in this input file
contain hand-written proofs. Natty can translate each step in these proofs to a separate formula, and
can verify almost all of these steps automatically.</p>
      <p>
        Natty can also export theorems and proof steps to files in the standard THF (Typed Higher-order
Form) format [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This is useful for comparing performance between Natty and other provers, and in
fact THF files with textbook theorems about natural numbers may form an interesting test set in their
own right.
      </p>
      <p>Today, Natty cannot do much more than this. However, the program is under active development,
and our goal is to evolve it into a system that can be used for verifying mathematics in arbitrary
domains. In the months to come, we plan to extend its parser, logic, and prover to be adequate for
defining the integers and rationals and for verifying results in elementary number theory such as the
Fundamental Theorem of Arithmetic, i.e. that each positive integer has a unique prime factorization.
More mathematics will follow after that.</p>
      <p>
        Natty is publicly available [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] under a permissive open source license. It is a relatively small program,
currently consisting of only about 2,500 lines of OCaml code.
      </p>
      <p>
        Any discussion of Natty will inevitably draw comparisons with Naproche [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], another
natural-language proof assistant that has been under development for a number of years. Broadly speaking
our goals are similar to those of Naproche: we want to build a proof assistant in which the user writes
theorems and proofs in controlled natural language rather than interactively applying tactics as in
assistants such as Isabelle [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and Lean [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. One difference between our systems is that Natty is based
on higher-order logic, while Naproche uses first-order Morse-Kelley set theory. Another is that Natty
contains an embedded prover that we hope will be able to handle all proof obligations, unlike Naproche
which uses E [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] as an external prover.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Natural-language input</title>
      <p>
        Natty reads source files written in the Natty language, a controlled natural language for writing
mathematics. A file in this language has the extension .n, and may contain type declarations, constant
declarations, axioms, definitions and theorems. To illustrate, here is the beginning of a file nat.n that
defines the natural numbers axiomatically using the Peano axioms and then asserts a series of theorems
about the naturals, loosely following the development in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Axiom. There exists a type N with an element 0 : N and a function s : N → N such that
a. There is no n : N such that s(n) = 0.
b. For all n, m : N, if s(n) = s(m) then n = m.
c. Let P : N → B. If P(0) is true, and P(k) implies P(s(k)) for all k : N, then P(n) is true for all n : N.</p>
      <p>Definition. Let 1 : N = s(0).</p>
      <p>Lemma. Let a : N. Suppose that a ≠ 0. Then there is some b : N such that a = s(b).</p>
      <sec id="sec-2-1">
        <title>Axiom. There is a binary operation + : N → N → N such that for all n, m : N,</title>
        <p>a. n + 0 = n.</p>
        <p>b. n + s(m) = s(n + m).</p>
        <p>Theorem. Let a, b, c : N.
1. If a + c = b + c, then a = b.
2. (a + b) + c = a + (b + c).
...</p>
        <sec id="sec-2-1-1">
          <title>Listing 1: Beginning of nat.n</title>
          <p>The Axiom keyword introduces a series of type declarations, constant declarations, and/or axioms.
We see that the first Axiom block above includes a type declaration for N, constant declarations for
0 and s, and three axioms. The Definition keyword introduces a new constant and declares that it
is equal to some other value. The Theorem and Lemma keywords are synonyms, and introduce one or
more theorems. Each theorem in a single block may be individually numbered.</p>
          <p>Notice that Unicode characters such as N, B and ≠ are allowed and in fact encouraged in Natty
source files. Our goal in designing the Natty language is for the user to be able to write in a way that
resembles textbook mathematics as closely as possible, within the limitations of plain text and the
Unicode character set.</p>
          <p>Also notice that each constant or variable in the source file above has a type such as N, N → N, or
N → N → N (a curried function type). These reflect the fact that Natty is based on strongly typed
higher-order logic. The type B, visible in axiom (c) above, is predefined in Natty and represents the
booleans.</p>
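          <p>Curried function types such as N → N → N nest to the right. As a small illustration, here is a hypothetical OCaml sketch of such a type representation (the constructor and function names are ours, not Natty’s actual source):</p>

```ocaml
(* Illustrative sketch of simple higher-order types: the built-in booleans B,
   declared base types such as N, and curried function types. *)
type ty =
  | Bool                 (* the predefined type B *)
  | Base of string       (* a declared base type such as N *)
  | Fun of ty * ty       (* a function type; Fun nests to the right *)

(* Render a type, parenthesizing a function type on the left of an arrow. *)
let rec show = function
  | Bool -> "B"
  | Base s -> s
  | Fun ((Fun _ as a), b) -> "(" ^ show a ^ ") -> " ^ show b
  | Fun (a, b) -> show a ^ " -> " ^ show b

(* The type of a binary operation such as + : N -> N -> N. *)
let plus_ty = Fun (Base "N", Fun (Base "N", Base "N"))
```

          <p>Under this sketch, show plus_ty renders as N -&gt; N -&gt; N, while a higher-order type such as (N → B) → B (the type of the induction principle’s predicate argument position) is parenthesized on the left.</p>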
          <p>At the moment, the user must always specify a type when declaring a constant or variable in a Natty
source file. In the future we intend to implement at least some level of type inference so that types may
sometimes be omitted.</p>
          <p>
            We also see that in the file above addition is introduced axiomatically, rather than by definition in
terms of lower-level concepts. In the current system, axioms such as these are the only practical method
for introducing new functions. This is unfortunate, since an erroneous axiom may make the entire input
inconsistent. In the future, we intend to implement a mechanism that lets the user declare inductive
types and define functions recursively over those types. Then the user will not need to introduce new
axioms for each new type or function. Both Isabelle [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ] and HOL Light [
            <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
            ] include this sort of
inductive type mechanism.
          </p>
          <p>Further down in nat.n we introduce multiplication axiomatically, then assert several related theorems:</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>Axiom. There is a binary operation · : N → N → N such that for all n, m : N, a. n · 0 = 0. b. n · s(m) = (n · m) + n.</title>
        <p>Notice above that Natty will implicitly multiply (using the · operator) variables that appear in adjacent
succession, as in the expression (ab)c. It distinguishes this situation from function application, as in
the expression s(a), by using type information: a has type N but s has type N → N. This implicit
multiplication syntax is convenient, and we are not aware of any other proof assistant that offers it.</p>
        <p>Further down in nat.n we see that several theorems are accompanied by hand-written proofs.
Listing 3 shows one such theorem, which asserts that multiplication of natural numbers can be cancelled
on the right. This theorem is too difficult for Natty (at this time) to prove automatically in its entirety,
so the hand-written proof allows Natty to verify the theorem by checking one proof step at a time.</p>
        <p>Theorem. Let a, b, c : N. If c ≠ 0 and ac = bc then a = b.</p>
        <p>Proof. Let</p>
        <p>G : N → B = { x : N | for all y, z : N, if z ≠ 0 and xz = yz then x = y }.</p>
        <p>Let b, c : N with c ≠ 0 and 0 · c = bc. Then bc = 0. Since c ≠ 0, we must have b = 0.
So 0 = b, and hence 0 ∈ G.</p>
        <p>Now let a : N, and suppose that a ∈ G. Let b, c : N, and suppose that c ≠ 0 and
s(a) · c = bc. Then ca + c = bc. If b = 0, then either s(a) = 0 or c = 0, which is
a contradiction. Hence b ≠ 0. By Lemma 1 there is some p : N such that b = s(p).</p>
        <p>Therefore ca + c = s(p) · c, and we see that ca + c = cp + c. It follows that ca = cp,
so ac = pc. By hypothesis it follows that a = p. Therefore s(a) = s(p) = b.</p>
        <p>Hence s(a) ∈ G, and we deduce that x ∈ G for all x : N.</p>
        <sec id="sec-2-2-1">
          <title>Listing 3: Proof of right cancellation of multiplication</title>
          <p>At the top of this proof we see an example of Natty’s set comprehension syntax. In Natty a set is
identified with a function whose codomain is B, the booleans. This is a common set representation in
higher-order logic. So in fact the set comprehension { x : τ | φ } denotes exactly the higher-order term
λx : τ. φ, where φ has type B, though the casual user should not need to think about this much.</p>
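          <p>The identification of sets with B-valued functions can be sketched directly in OCaml (a toy illustration of the encoding, not Natty’s code), where membership is simply function application:</p>

```ocaml
(* A set over 'a is a function into bool, mirroring the higher-order logic
   representation of sets as functions with codomain B. *)
type 'a set = 'a -> bool

(* A comprehension such as { x : Z | x is even } becomes a lambda term. *)
let evens : int set = fun x -> x mod 2 = 0

(* Membership x ∈ s is just the application s(x). *)
let mem (x : 'a) (s : 'a set) : bool = s x
```

          <p>So a statement like 0 ∈ G simply becomes the application G(0), which is exactly how membership appears in Natty’s inferred proof structures.</p>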
          <p>Notice that in the proof some steps include a justification such as By Lemma 1 or By hypothesis.
The current implementation of Natty ignores such justifications. However, in the future we intend to
use them as strong hints to the automatic prover about which assumptions it should use in verifying a
proof step.</p>
          <p>
            The expected workflow of a user using Natty is as follows. The user can create a source file and
enter axioms and theorems, then run Natty, which will attempt to prove all theorems automatically. If
the proof of any theorem fails, the user must edit the source file to provide a proof, and then on the
next run Natty will attempt to prove each proof step in turn. If it cannot prove any step, the user must
edit the source file again and make the step smaller. In the near future we plan to build an interactive
environment for Natty (probably as an extension for Visual Studio Code) that will make this process
easier. The Naproche component in the Isabelle/jEdit IDE [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ] serves a similar purpose.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Logic</title>
      <p>
        Natty is founded on classical higher-order logic with rank-1 polymorphism, boolean extensionality,
functional extensionality, and choice, with Henkin semantics. This is essentially the same logic used in
proof assistants such as Isabelle/HOL [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and HOL Light [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], by automatic higher-order provers such
as recent versions of E [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], Vampire [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], and Zipperposition [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], and by the THF specification [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
We will not describe this logic here. A standard presentation (though without rank-1 polymorphism) is
given by Andrews [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>In fact, the current implementation of Natty does not use the full power of this logic. In the current
system the equality operator ≈ and the quantifiers ∀ and ∃ have polymorphic types, but user-defined
functions may not be polymorphic. Natty’s development of the natural numbers in nat.n does not
require extensionality or choice. Finally, as we discuss in our section on Natty’s proof procedure below,
Natty can make few higher-order inferences at this time.</p>
      <p>Higher-order logic is nevertheless useful as a foundation even in this early stage of development. For
example, it lets us express the Peano induction axiom in a natural way, without using an axiom schema
as would be needed in first-order logic. As we have already seen, we can express sets as functions with
codomain B, which is an elegant formulation that does not require any set theory axioms. The strong
type system allows Natty to perform useful type checking on the user’s behalf, and may even make
inference more efficient: for example, Natty’s unifier will not even attempt to unify two terms with
distinct types.</p>
      <p>In Natty’s logic the usual constants ⊤, ⊥, ¬, ∧, ∨, →, ∀, ∃, and ≈ are predefined. Of these, only
≈ is accessible via a similar symbol (i.e. the ordinary equals sign =) in Natty source files. To express
formulas involving the other logical constants, the user must use natural-language equivalents, e.g.
a = 3 or b = 4 for the formula a ≈ 3 ∨ b ≈ 4.</p>
      <p>In higher-order logic terms and formulas are identified, and we will generally use these words as
synonyms. We use the syntax φ[t/x] to denote the result of substituting t for the variable x in φ.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Translation into logic</title>
      <p>
        In this section we will describe how Natty translates an input file to a series of formulas of higher-order
logic.
      </p>
      <p>4.1. Parsing and type checking</p>
      <p>
        Natty first parses the input file, following a mostly context-free grammar for its controlled natural
language. It implements the parser using the parser combinator library MParser [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], which is patterned
after Haskell’s Parsec library [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. We have found parser combinators to be a convenient and powerful
tool for implementing this sort of natural-language parser.
      </p>
      <p>
        We will not present Natty’s grammar here. A summary of it can be found in the file grammar.ebnf
in the Natty source distribution [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The grammar is likely to evolve quickly as we enhance Natty to
handle more kinds of mathematical statements.
      </p>
      <p>Natty’s parser outputs a list of statements, each of which is a type declaration, a constant declaration,
an axiom, a definition, or a theorem. A theorem that includes a natural-language proof will contain
parsed proof steps. Each proof step is one of the following.</p>
      <p>Assert(formula) : A step such as “And so x &gt; 4”, asserting that a formula is true.
Let(var list, type) : A step such as “Let x, y: N”, introducing new arbitrary values of a certain
type.</p>
      <p>LetVal(var, type, formula) : A step such as “Let x = 2 + 2”, defining a local constant with a certain
value.</p>
      <p>Assume(formula) : A step such as “Suppose that f(y) = 0”, introducing an assumption for some
part of the proof.</p>
      <p>IsSome(var, type, formula) : A step such as “there is some p : N such that b = s(p)”,
introducing a value that is asserted to exist.</p>
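      <p>The five step kinds above map naturally onto an algebraic data type. The following hypothetical OCaml sketch (our names, not Natty’s actual definitions) uses strings as stand-ins for parsed formulas and types:</p>

```ocaml
(* Stand-ins for the real parsed representations. *)
type formula = string
type ty = string

(* One constructor per proof step kind described in the text. *)
type step =
  | Assert of formula                  (* "And so x > 4" *)
  | Let of string list * ty            (* "Let x, y : N" *)
  | LetVal of string * ty * formula    (* "Let x = 2 + 2" *)
  | Assume of formula                  (* "Suppose that f(y) = 0" *)
  | IsSome of string * ty * formula    (* "there is some p : N such that b = s(p)" *)

let describe = function
  | Assert _ -> "assert" | Let _ -> "let" | LetVal _ -> "let_val"
  | Assume _ -> "assume" | IsSome _ -> "is_some"
```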
      <p>As a next step, Natty type checks all statements. It checks that every constant symbol appearing in
an axiom, definition, or theorem has been declared, and that all function applications apply a function
to a value of appropriate type. A theorem’s proof steps are not type checked in this phase; that will
happen later, after a proof structure is inferred. Type checking is fairly easy, since Natty does not yet
perform any type inference and does not yet allow any form of polymorphism for variables or constants
defined by the user.</p>
      <p>At this point Natty’s translation process is essentially complete for axioms, definitions, and theorems
without proofs. However Natty must now translate every natural-language proof into a series of
formulas to be proven, which is the subject of the following sections.</p>
      <p>4.2. Inferring proof structure</p>
      <p>After Natty has produced a series of proof steps, it will arrange them into a tree representing the block
structure of the proof. This tree will indicate the scope of each variable and assumption in the proof. In
the tree the top-level node represents the entire proof, and every other node is a proof step. Every node
has a sequence of zero or more children. Assert steps never have children; other types of steps usually
do. A preorder traversal of the tree will always produce all proof steps in their original order.</p>
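      <p>The tree shape described here can be sketched as follows (a hypothetical OCaml illustration, with steps reduced to strings): the root carries no step, every other node carries one, and a preorder traversal recovers the original step order.</p>

```ocaml
(* Proof structure tree: the root holds only children; every other node is a
   proof step together with the steps inside its scope. *)
type tree =
  | Root of tree list
  | Node of string * tree list

(* Preorder traversal yields the proof steps in their original order. *)
let rec preorder = function
  | Root kids -> List.concat_map preorder kids
  | Node (step, kids) -> step :: List.concat_map preorder kids
```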
      <p>Before we present the algorithm for inferring block structure, it will be helpful to see an example of
a tree that it may produce. Recall the proof of cancellation of multiplication that we saw in Listing 3
above. When run with the -d (debug) option, Natty will print out the structure that it infers for this
proof:</p>
      <p>This tree illustrates all of the proof step types that Natty uses. The root node represents the entire
proof and is not shown. It has two children: the let_val step at the top, and the final step which asserts
the theorem that is being proven.</p>
      <p>Also notice that in this tree instances of the set membership operator ∈ appear as function application,
which is logically equivalent since sets are identified with functions. For example, 0 ∈ G appears here
as G(0).</p>
      <p>The tree structure indicates where each introduced assumption will be discharged, and where each
introduced variable’s scope will end. In a natural-language proof, this information is usually implicit. For
example, the proof in Listing 3 introduces an assumption “suppose that c ≠ 0 and s(a) · c = bc.”
Later on, the proof asserts “Hence s(a) ∈ G”. This assertion must not fall within the scope of the
assumption, for then it would be conditional and would be too weak to participate in the final induction
step. In fact this assumption must be discharged immediately before this assertion is made. In Listing 4,
we see that the assertion (which appears as G(s(a))) is correctly outside of the scope of the assumption.</p>
      <p>To construct the structure for a proof, Natty takes the steps parsed from the user-provided proof and
appends a final step asserting the formula of the theorem that is being proven. It then infers the proof
structure using a recursive algorithm that applies the following heuristics:</p>
      <p>• A LetVal, Let or IsSome proof step s encloses a scope that is as small as possible, subject to
these constraints:
– It must extend far enough to enclose all proof steps that refer to variables that s introduces.
– It must extend far enough to enclose the scopes of all its children.</p>
      <p>• An Assume proof step’s scope extends to the end of the scope of its nearest enclosing LetVal,
Let or IsSome step.</p>
      <p>• An Assert proof step never has children, so there is no need to infer a scope for it.</p>
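      <p>The first heuristic can be sketched concretely (an illustrative OCaml fragment, not Natty’s implementation): given the variables a step introduces and, for each following step, the variables it mentions, the scope must extend at least to the last following step that mentions an introduced variable.</p>

```ocaml
(* Returns the 1-based index of the last later step that uses any variable
   the current step introduces, or 0 if no later step uses them. The scope
   must extend at least this far. *)
let scope_end (introduced : string list) (later : string list list) : int =
  let uses vars = List.exists (fun v -> List.mem v vars) introduced in
  let rec last i best = function
    | [] -> best
    | vars :: rest -> last (i + 1) (if uses vars then i else best) rest
  in
  last 1 0 later
```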
      <p>The effects of these heuristics are visible in Listing 4. For example, the second occurrence of the
step let b, c : N encloses a scope that extends only to the assertion s(a) = s(p) ∧ s(p) =
b, because that is the last step that uses either of the variables b or c. That forces the scope of the
assumption c ≠ 0 ∧ s(a) · c = bc to end as well, so the assertion G(s(a)) correctly falls outside
that assumption.</p>
      <p>These heuristics seem promising; however, we have only tested them on the three proofs in nat.n.
We hope they will work well for future proofs. In the worst case we may find that we need to introduce
some way for proof authors to specify block structure explicitly.</p>
      <p>4.3. Generating formulas</p>
      <p>After Natty has inferred a proof’s structure, it will generate a series of formulas representing the logical
deductions made in each step in the proof. Each of these formulas will be independent, in the sense
that it can be proven by Natty (or another prover) without the presence of the other proof steps as
given hypotheses. The conjunction of these formulas will imply the theorem statement, so that if Natty
proves them all then the theorem is proven.</p>
      <p>Each Assert step (and also each IsSome step, which is another kind of assertion) will generate a
formula, and other steps will transform these formulas in various ways. Each generated formula will
capture the proof state at the moment that an assertion is made. This proof state will reflect both the
steps that enclose the assertion in the proof structure, and also the steps that precede it in the proof.</p>
      <p>Let us first consider the effect of enclosing proof steps. Broadly speaking:</p>
      <p>• If a step Let([x], τ) encloses an assertion φ, then the assertion will be transformed to ∀x : τ. φ,
because the proof is asserting that φ is true for any x.</p>
      <p>• If a step LetVal(x, τ, v) encloses an assertion φ, then the assertion will be transformed to
φ[v/x], substituting the constant value of x into the assertion.1</p>
      <p>• If a step Assume(ψ) encloses an assertion φ, then the assertion will be transformed to
ψ → φ.</p>
      <p>• If a step IsSome(x, τ, ψ) encloses an assertion φ, then the assertion will be transformed to
ψ → φ, because the proof is asserting that φ is true in the presence of an x with the given
property. In addition, the IsSome step will generate its own formula asserting that the given x
exists.</p>
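      <p>These transformations can be sketched over a toy formula type (hypothetical OCaml, not Natty’s code; the LetVal case is omitted since it requires substitution of the constant’s value into the assertion):</p>

```ocaml
type formula =
  | Atom of string
  | Forall of string * string * formula   (* variable, type, body *)
  | Implies of formula * formula

type step =
  | Let of string * string                (* variable, type *)
  | Assume of formula
  | IsSome of string * string * formula   (* variable, type, property *)

(* Transform an assertion phi according to the step that encloses it. *)
let enclose (s : step) (phi : formula) : formula =
  match s with
  | Let (x, ty) -> Forall (x, ty, phi)        (* phi holds for any x *)
  | Assume psi -> Implies (psi, phi)          (* phi becomes conditional on psi *)
  | IsSome (_, _, psi) -> Implies (psi, phi)  (* phi holds given the witness *)
```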
      <p>We will make these ideas more precise in the definition of generated formulas below.</p>
      <p>Now let us consider the effect of preceding proof steps. Any step in a natural-language proof may
conceivably use the results of any previous steps in the proof. In some cases a proof author may
explicitly indicate a previous step that is being used, but very often this will be implicit. And so when
we generate a formula representing an assertion φ, we could conceivably include the results of all
steps ψ₁ . . . ψₙ that occurred anywhere previously in the proof. This would yield a formula such as
ψ₁ → ψ₂ → . . . → ψₙ → φ.</p>
      <p>A disadvantage of such an approach is that the generated formulas could become very large for steps
that occur late in a large proof. In fact, most proof steps depend only on the previous step and possibly
some other recent or important steps. So we might choose to include a select subset of previous steps
in each generated formula.</p>
      <p>In fact Natty takes this approach and only includes certain previous steps in each formula. If a
node’s children are a sequence of assertions φ₁ . . . φₙ then Natty includes each of these assertions as an
assumption when proving the others, generating the sequence of formulas φ₁, φ₁ → φ₂, φ₁ → φ₂ →
φ₃, . . . , φ₁ → . . . → φₙ.</p>
      <p>However only the conclusion of this sequence, namely the formula φₙ, is used as a known fact in
other parts of the proof. In other words, the formulas φ₁ . . . φₙ₋₁ are treated as local, used only as
helpers for proving the conclusion φₙ.</p>
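      <p>This scheme for sibling assertions is easy to state as code. In the following sketch (illustrative OCaml, with formulas as strings), each assertion may assume all of its earlier siblings:</p>

```ocaml
(* From assertions phi1 ... phin, generate the formulas
   phi1, phi1 -> phi2, ..., phi1 -> ... -> phin. *)
let chain (phis : string list) : string list =
  let imp hyps goal =
    List.fold_right (fun h acc -> h ^ " -> " ^ acc) hyps goal in
  let rec go prev = function
    | [] -> []
    | phi :: rest -> imp (List.rev prev) phi :: go (phi :: prev) rest
  in
  go [] phis
```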
      <p>This approach dramatically cuts down on the size of generated formulas. It seems to capture the
relevant facts needed for proving each proof step in the proofs we have verified. However, it is unclear
whether it will be adequate for larger or more complex proofs. If it is not, we may need to adapt or
abandon it, or provide a mechanism whereby the user can explicitly identify previous proof steps to be
used in certain cases.</p>
      <p>With these ideas in mind, we can now formally define the formulas that are generated from a proof
structure tree. For each tree node, we will define the formulas generated by the node and also the node’s
conclusion, which is also a formula. Also, for any sequence of nodes (which will always be the children
of a parent node), we will define the formulas generated by the sequence. These notions are defined via
mutual recursion as follows:</p>
      <p>• A node Assume(ψ) generates the formula ψ. Its conclusion is ψ.</p>
      <p>• A node Let([x], τ) generates the formulas ∀x : τ. φ₁, . . . , ∀x : τ. φₙ, where φ₁, . . . , φₙ are
generated by the sequence of its children. Its conclusion is ∀x : τ. φ, where φ is the conclusion
of its last child.</p>
      <p>• A node LetVal(x, τ, v) generates the formulas φ₁[v/x], . . . , φₙ[v/x], where φ₁, . . . , φₙ
are generated by the sequence of its children. Its conclusion is φ[v/x], where φ is the conclusion
of its last child.</p>
      <p>• A node IsSome(x, τ, ψ) generates the formulas ∃x : τ. ψ, ψ → φ₁, . . . , ψ → φₙ, where
φ₁, . . . , φₙ are generated by the sequence of its children. Its conclusion is the conclusion of its
last child.</p>
      <p>• For a sequence of nodes n₁, . . . , nₖ, let χ₁, . . . , χₖ be the conclusions of each node in the
sequence. For each formula φ generated by a node nᵢ, the sequence generates the formula
χ₁ → . . . → χᵢ₋₁ → φ.</p>
      <p>• The top-level node, representing the entire proof, generates the formulas generated by its sequence
of children. Its conclusion is the conclusion of its last child (which is always the theorem being
proven).</p>
      <p>1 Alternatively we could transform the assertion to ∀x : τ. x = v → φ. It is not clear which alternative is best; we intend to
experiment with this in our implementation.</p>
      <p>Theorem. Given a tree of proof steps for a theorem, the conjunction of the formulas generated by the
tree’s root node implies that the theorem is true.</p>
      <p>Proof. A proof sketch is as follows. By induction on the depth of the proof step tree, we can show that
the conjunction of the formulas generated by any node implies that node’s conclusion, and that the
conjunction of the formulas generated by any sequence implies the conclusion of the last node in the
sequence. The conclusion of the root node is the theorem statement, so the result follows.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Proof calculus</title>
      <p>
        Natty’s proof calculus is a pragmatic, incomplete variant of the higher-order superposition calculus
λoSup developed recently by Bentkamp et al. [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. As such, it operates on the same clauses used in that
calculus. We will review the relevant definitions of terms, literals, and clauses here.
      </p>
      <p>In λoSup, a term is defined as a βη-equivalence class of λ-terms, which are in turn α-equivalence
classes of raw λ-terms. We adopt these notions. In most situations Natty stores formulas in β-reduced
form. λoSup also defines a βηQη-normal form of λ-terms by applying a rewrite rule Qη to ensure that
the quantifiers ∀ and ∃ are only applied to λ-terms. We have not implemented this rule in Natty at this
time.</p>
      <p>A literal is an equation s ≈ t or disequation s ̸≈ t of terms. Equality is not ordered, so s ≈ t is the
same as t ≈ s. Literals are purely equational, so a non-equational formula φ must be encoded as φ ≈ ⊤
and ¬φ as φ ≈ ⊥.</p>
      <p>A clause is a finite multiset of literals ℓ₁ ∨ . . . ∨ ℓₙ representing a disjunction.</p>
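      <p>A minimal sketch of this clause representation in OCaml (toy terms as strings; the names are ours, not Natty’s): equations are unordered, so literal equality must identify s ≈ t with t ≈ s.</p>

```ocaml
type term = string

(* A literal is an equation or disequation of terms; a clause is a multiset
   (represented here as a list) of literals. *)
type literal =
  | Eq of term * term     (* s ≈ t *)
  | Neq of term * term    (* s ≉ t *)
type clause = literal list

(* Equality of literals, treating each (dis)equation as unordered. *)
let lit_equal l1 l2 =
  match l1, l2 with
  | Eq (a, b), Eq (c, d) | Neq (a, b), Neq (c, d) ->
      (a = c && b = d) || (a = d && b = c)
  | _ -> false
```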
      <p>
        Starting with this section of the paper, we print logical symbols in bold (as in [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]) to distinguish
them from the syntax of literals and clauses.
      </p>
      <p>
        λoSup defines the notion of a green subterm of a λ-term, and many rules in λoSup operate only on
green subterms. Essentially a green subterm is at a position where a first-order inference could take
place. We adopt the notion of green subterms; see [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], section 3 for a definition. As in λoSup, we write
C⟨u⟩ to denote a clause C with a green subterm u.
      </p>
      <p>
        Bentkamp et al. also define blue subterms, which are like green subterms but may also occur at some
positions under quantifiers. We will sometimes use these too; see [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], section 3.7 for a definition. We
write C⎷u⌄ to denote a clause C with a blue subterm u.
      </p>
      <p>λoSup specifies that the input set of clauses must be preprocessed by two rewrite rules ∀≈ and ∃≈
to eliminate quantified variables in certain higher-order contexts. This is required for completeness;
however, we have not implemented this preprocessing in Natty at this time.</p>
      <p>
        As in λoSup, Natty’s calculus is parameterized by a strict term order ≺ that must be well-founded,
total on ground terms, stable under grounding substitutions, and also satisfy certain other conditions
listed in Definition 15 in [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. We will not repeat these conditions here. ⪯ is the reflexive closure of ≺.
We will describe the specific term order that Natty uses below. As is common practice, the term order
≺ is lifted to literals and clauses via the multiset encoding described in [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], section 2.4.
      </p>
      <p>
        As in λoSup, a fluid term is either a variable applied to one or more arguments, or a λ-expression t
such that for some substitution σ, tσ is no longer a λ-expression due to η-reduction. As suggested in
[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], section 5, Natty assumes that any non-ground λ-expression is fluid.
      </p>
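      <p>The approximation just described can be sketched over a toy term type (hypothetical OCaml, not Natty’s source): a term is treated as fluid if it is a variable applied to arguments, or a lambda-expression that is not ground.</p>

```ocaml
type term =
  | Const of string
  | Var of string
  | App of term * term
  | Lam of string * term

(* The head of a chain of applications. *)
let rec head = function App (f, _) -> head f | t -> t

(* A term is ground if it contains no free variables. *)
let rec is_ground bound = function
  | Const _ -> true
  | Var x -> List.mem x bound
  | App (f, a) -> is_ground bound f && is_ground bound a
  | Lam (x, b) -> is_ground (x :: bound) b

(* Fluid: a variable applied to arguments, or any non-ground lambda. *)
let is_fluid = function
  | App _ as t -> (match head t with Var _ -> true | _ -> false)
  | Lam _ as t -> not (is_ground [] t)
  | _ -> false
```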
      <p>
        λoSup also defines which positions are eligible in a clause with respect to a substitution σ. We will
adopt this notion as well. The definition of an eligible position in λoSup is slightly complicated due to
the possibility of selected literals. Natty does not support literal selection, which simplifies this notion
a bit. Consider a literal ℓ to be maximal in a clause C with respect to σ if ℓσ is maximal in Cσ with
respect to the term ordering ≺. Then essentially a green position is eligible in C with respect to σ if it
can be reached by descending the term structure of a maximal literal while never passing through a
term s in an equation s ≈ t where sσ ⪯ tσ. See [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], section 3.3 for a precise definition.
      </p>
      <p>In the rules below, csu(s, t) refers to the complete set of unifiers of two terms s and t.
5.1. Generating rules
Natty generates new clauses using the following two rules. A side condition marked with an asterisk (*)
will be relaxed by Natty in some circumstances, as discussed in a following section.</p>
      <p>Superposition:</p>
      <p>D′ ∨ t ≈ t′    C⟨u⟩
(D′ ∨ C⟨t′⟩)σ</p>
      <p>Sup
σ ∈ csu(t, u)
(i) u is not fluid
(ii) u is not a variable
(iii) tσ ̸⪯ t′σ
(iv) the position of u is eligible in C w.r.t. σ (*)
(v) Cσ ̸⪯ Dσ
(vi) t ≈ t′ is maximal in D w.r.t. σ
(vii) t is not a fully applied Boolean logical symbol
(viii) if t′ = ⊥, u is at the top level of a positive literal</p>
      <sec id="sec-5-1">
        <p>The clauses D = D′ ∨ t ≈ t′ and C = C⟨u⟩ must have no free variables in common.</p>
        <p>This is a slightly simplified form of the superposition rule from oλSup. The most notable difference
is that oλSup may perform superposition into a variable in some circumstances, but our rule never
does. (This presumably breaks completeness, but our pragmatic calculus is not complete anyway.) Also,
our condition (vii) only excludes fully applied Boolean logical symbols (i.e. ⊤, ⊥, ¬, ∧, ∨, →), whereas
the corresponding condition in oλSup excludes all logical symbols in this context. We relaxed this
condition because we found that a superposition step with a fully applied quantifier (∀ or ∃) can be an
important inference in some proofs.</p>
        <p>Equality resolution:
C′ ∨ s ̸≈ s′
C′σ
(i) s ̸≈ s′ is maximal in C w.r.t. σ</p>
      </sec>
      <sec id="sec-5-2">
        <title>ERes</title>
        <p>σ ∈ csu(s, s′)</p>
        <p>This is a slightly simplified form of the equality resolution rule from oλSup. (Natty does not actually
implement the side condition (i) at this time, but we will implement it soon.)</p>
        <p>Natty should also include an equality factoring rule, which is necessary for first-order completeness.
However, we have not implemented one yet.
5.2. Clausifying rules
Natty uses these rules to clausify formulas as described in a following section. We have written them
with a double line to indicate that they could be applied destructively (i.e. discarding the parent formula)
while retaining completeness. However, we will see later that Natty does not always apply these rules
destructively.</p>
        <p>Outer clausification:
φ ≈ ⊤ ∨ C
oc(φ) ∨ C</p>
        <p>OC
φ ≈ ⊥ ∨ C
oc(¬φ) ∨ C</p>
        <p>OC</p>
        <p>
          This rule is a variant of the OuterClaus rule in [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. However, unlike that rule it never splits a clause
into two. The function oc is defined on formulas as follows:
oc(φ ∨ ψ) = (φ ≈ ⊤ ∨ ψ ≈ ⊤)
oc(φ → ψ) = (φ ≈ ⊥ ∨ ψ ≈ ⊤)
        </p>
        <p>oc(s ≈ t) = (s ≈ t)
oc(∀x.φ) = (φ[y/x] ≈ ⊤)
oc(∃x.φ) = (φ[k(x̄)/x] ≈ ⊤)
oc(¬(φ ∧ ψ)) = (φ ≈ ⊥ ∨ ψ ≈ ⊥)</p>
        <p>oc(¬(s ≈ t)) = (s ̸≈ t)
oc(¬(∀x.φ)) = (φ[k(x̄)/x] ≈ ⊥)</p>
        <p>oc(¬(∃x.φ)) = (φ[y/x] ≈ ⊥)</p>
        <p>In the equations above, k is a new constant, y is a variable not appearing in φ or the adjacent clause
C, and x̄ represents all free variables appearing in ∃x.φ or ¬(∀x.φ).</p>
        <p>Splitting:</p>
        <p>φ ≈ ⊤ ∨ C
sp(φ, C)</p>
      </sec>
      <sec id="sec-5-3">
        <title>Split</title>
        <p>φ ≈ ⊥ ∨ C
sp(¬φ, C)</p>
      </sec>
      <sec id="sec-5-4">
        <title>Split</title>
        <p>
          This rule is derived from the same OuterClaus rule in [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], and splits clauses into two. The function
sp is defined on formulas as follows:
        </p>
        <p>
          sp(φ ∧ ψ, C) = {φ ≈ ⊤ ∨ C, ψ ≈ ⊤ ∨ C}
sp(¬(φ ∨ ψ), C) = {φ ≈ ⊥ ∨ C, ψ ≈ ⊥ ∨ C}
sp(¬(φ → ψ), C) = {φ ≈ ⊤ ∨ C, ψ ≈ ⊥ ∨ C}
5.3. Contracting rules
Natty includes additional rules that can rewrite, subsume, and simplify clauses. It also includes rules
that delete propositional tautologies and AC tautologies [
          <xref ref-type="bibr" rid="ref23 ref8">8, 23</xref>
          ] that can be proven from the associativity
and commutativity axioms for operators that are known to be associative and commutative. All of these
rules are similar to those found in other superposition-based provers such as E [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. For reasons of
limited space, we omit a detailed presentation of these rules here.
        </p>
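        <p>For concreteness, the propositional cases of the clausifying functions oc and sp can be sketched in OCaml as follows. The datatype and the encoding of literals are hypothetical illustrations, not Natty’s actual code, and the quantifier cases (which require fresh variables and Skolem constants) are omitted.</p>

```ocaml
(* A minimal sketch of the clausifying functions oc and sp, restricted to
   the propositional cases.  The datatype and names are hypothetical
   illustrations, not Natty's actual code. *)
type formula =
  | Atom of string
  | Not of formula
  | And of formula * formula
  | Or of formula * formula
  | Implies of formula * formula

(* A literal (phi, true) stands for phi ≈ ⊤ and (phi, false) for phi ≈ ⊥;
   a clause is a disjunction of such literals. *)
type clause = (formula * bool) list

(* oc rewrites a formula into a disjunction of literals without ever
   splitting the clause; callers pass [Not phi] for a negative literal. *)
let oc : formula -> clause option = function
  | Or (a, b) -> Some [ (a, true); (b, true) ]
  | Implies (a, b) -> Some [ (a, false); (b, true) ]
  | Not (And (a, b)) -> Some [ (a, false); (b, false) ]
  | _ -> None

(* sp splits the clause [c] in two; again [Not phi] encodes a
   negative literal. *)
let sp (phi : formula) (c : clause) : (clause * clause) option =
  match phi with
  | And (a, b) -> Some ((a, true) :: c, (b, true) :: c)
  | Not (Or (a, b)) -> Some ((a, false) :: c, (b, false) :: c)
  | Not (Implies (a, b)) -> Some ((a, true) :: c, (b, false) :: c)
  | _ -> None
```

        <p>Note that oc returns a single replacement disjunction, while sp returns a pair of clauses, mirroring the fact that OC never splits a clause and Split always does.</p>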
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Proof procedure</title>
      <p>
        In this section we describe Natty’s automatic prover, which finds proofs constructed from the inference
rules outlined in the previous sections. Natty is broadly similar to other superposition-based provers,
but its mechanisms for selecting the next given clause and for performing clausification are somewhat
unusual. We will first describe some fundamental building blocks including Natty’s term ordering and
unification mechanism, and then move on to discussing its full proof procedure.
6.1. Term ordering
As mentioned above, the calculus oλSup requires a term ordering that satisfies certain technical
conditions listed in Definition 15 in [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Because Natty uses the superposition rule from this calculus,
it inherits these term ordering requirements. In section 3.9 of their paper, Bentkamp et al. suggest a
concrete term ordering that satisfies their requirements. To compare two higher-order terms, they map
each of them to a first-order term using a certain encoding, then use the transfinite Knuth-Bendix order
[
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] to compare the resulting first-order terms using certain weights. Natty uses this suggested term
ordering and implements this mapping scheme and the Knuth-Bendix order. Instead of using the ordinal
number ω for the weights of the symbols ∀1 and ∃1, Natty simply uses a large integer constant, which
is effectively equivalent. Natty gives all other constant symbols a weight of 1, except unary function
symbols, which have a weight of 2. (We assigned this weight because it seems to facilitate certain helpful
reductions in proofs about the natural numbers. However, these weight values are certainly subject to
further experimentation.)
6.2. Unification
A complete higher-order prover must perform full higher-order unification, which is challenging and
expensive because it is only semi-decidable, and because two terms may have an infinite number of
unifiers that are not instances of one another. Natty is a pragmatic, incomplete prover and in fact
its unification procedure is almost entirely first-order. Natty performs first-order unification using
a simple recursive procedure including an occurs check. In addition, it can unify two lambda terms
if the variables bound by the lambdas occur in exactly the same positions in those terms, and the
remaining subterms of the lambda terms are unifiable. An example of such terms is the pair λx. f(x, y)
and λx. f(x, 4). This is the entire extent of Natty’s unification capabilities today.
      </p>
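      <p>This unification procedure can be sketched in OCaml roughly as follows. This is an illustrative reconstruction, not Natty’s actual code; in particular, the scope check that would reject a binding mentioning a frozen bound variable is omitted, and binders are assumed to already carry matching names.</p>

```ocaml
(* A sketch (not Natty's actual code) of first-order unification with an
   occurs check, plus the special case for lambda terms whose bound
   variables occur at the same positions. *)
type term =
  | Var of string             (* free variable *)
  | App of string * term list (* constant or function applied to arguments *)
  | Lam of string * term      (* lambda abstraction *)

exception Clash

let rec occurs x = function
  | Var y -> x = y
  | App (_, args) -> List.exists (occurs x) args
  | Lam (_, body) -> occurs x body

let rec subst (x, t) = function
  | Var y -> if x = y then t else Var y
  | App (f, args) -> App (f, List.map (subst (x, t)) args)
  | Lam (y, body) -> if x = y then Lam (y, body) else Lam (y, subst (x, t) body)

let apply sub t = List.fold_left (fun u b -> subst b u) t sub

let rec unify sub s t =
  match apply sub s, apply sub t with
  | Var x, u | u, Var x ->
      if u = Var x then sub
      else if occurs x u then raise Clash          (* occurs check *)
      else (x, u) :: sub
  | App (f, ss), App (g, ts) when f = g && List.length ss = List.length ts ->
      List.fold_left2 unify sub ss ts
  | Lam (x, s'), Lam (y, t') when x = y ->
      (* freeze the bound variable as a constant, forcing it to occur at
         exactly the same positions in both bodies *)
      let frozen = App ("!" ^ x, []) in
      unify sub (subst (x, frozen) s') (subst (y, frozen) t')
  | _ -> raise Clash
```

      <p>For example, unifying λn. f(n, y) with λn. f(n, 4) succeeds with the binding y ↦ 4, whereas unifying λn. f(n, y) with λn. f(y, 4) fails because the bound variable does not occur at the same positions in both bodies.</p>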
      <p>Despite these limitations, Natty can actually perform useful unifications in which a variable will
become bound to a higher-order term. For example, consider the Peano axiom of induction over the
natural numbers:</p>
      <p>∀f : (N → B). (f(0) → ∀n : N. (f(n) → f(s(n))) → ∀n : N. f(n))</p>
      <p>
        In higher-order logic, the final consequent ∀n : N. f(n) is represented as ∀(λn : N. f(n)). But
this then η-reduces to simply ∀(f). Now suppose that we are trying to prove the simple identity
∀n : N. 0 + n ≈ n. In higher-order logic, this is represented as ∀(λn : N. 0 + n ≈ n). This now
unifies trivially with the consequent ∀(f), yielding the substitution σ = {f ↦ λn : N. 0 + n ≈ n}. In
Natty, superposition between the goal and this induction axiom finds this substitution and substitutes it
into the axiom, which allows the proof by induction to proceed. This all works without any special
higher-order unification machinery. And so Natty does not need a special process for instantiating
induction axioms with abstractions from goal clauses, as is found in E [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and Zipperposition [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
6.2.1. Allowing inductive inferences
However, regarding the useful superposition inference we just described, a word of caution is in order.
The restrictions of the superposition rule would not normally allow this superposition step to proceed.
The goal is negated, so in this example we are performing superposition with (∀n : N. 0 + n ≈ n) ≈ ⊥ on
the left, and the induction axiom on the right. If we have not clausified the induction axiom, restriction
(viii) will not allow the superposition since ∀(f) is not at the top level of a literal. We could clausify
the induction axiom until ∀(f) becomes a positive literal. The clausified axiom will have this form:
¬f(0) ∨ ¬(∀n : N. (f(n) → f(s(n)))) ∨ ∀(f)
      </p>
      <p>But then restriction (iv) will not allow the superposition since the literal ∀(f) will not be maximal
after the substitution σ is applied. The substitution will ground all literals, and then according to the
Knuth-Bendix ordering the literal with the largest weight will be maximal, namely the second of the three
literals. One might hope that further clausification steps could cause ∀(f) to become maximal (e.g.
after the clause splits in two), but experiments with Natty show that this is not the case.</p>
      <p>Now, we very much want to allow this superposition step, so Natty makes an exception. It considers
a formula to be inductive if it has the form ∀x : τ. φ, where τ is a function type with codomain B. If the
right formula in a superposition is inductive, then Natty does not enforce the eligibility restriction (iv).
6.3. Main loop
The input to Natty’s automatic prover is a formula to be proved, plus a set of formulas that are known.
The known formulas are either axioms or are theorems that appeared earlier in the input file and have
already been proved.</p>
      <p>Like most other superposition-based provers, Natty negates the goal, then searches for a contradiction
by saturating the set of input formulas. Because the proof calculus works on equational clauses, Natty
initially converts each input formula to a clause of the form φ ≈ ⊤, or to φ ≈ ⊥ for an input formula
of the form ¬φ. Unlike in some other provers, no clausification steps (i.e. applications of the rules OC
or Split) are performed at this time.</p>
      <p>
        Natty’s main loop is modeled after the main loop in E [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which is itself a variation of the DISCOUNT
loop used in an earlier system of that name [25]. Like E, Natty keeps clauses in two sets containing
processed and unprocessed clauses, respectively. In its main loop, it repeatedly selects a given clause from
the unprocessed set and adds it to the processed set. As it does so, it performs all possible simplifications
of the given clause using clauses from the processed set. Furthermore, it back-simplifies clauses in the
processed set using the given clause, and sends them back to the unprocessed set if they have changed.
In this way, Natty maintains the invariant that all processed clauses are always reduced with respect to
each other.
      </p>
      <p>We will not provide pseudocode for Natty’s main loop or discuss it more, because it is so similar to
that of E. Instead, we will discuss Natty’s mechanisms for given clause selection and for clausification,
which are less similar to other provers.
6.4. Given clause selection
Most other superposition-based provers select given clauses in a weighted round-robin fashion from
two or more priority queues. Each queue orders the available unprocessed clauses according to some
characteristic function. One basic scheme used by some provers has two priority queues. One queue is
a FIFO, which orders clauses by age, so that at any given moment the oldest available clause will be
picked next. A second queue orders clauses by weight, which may be a simple count of the symbols in
a clause or which may be a weighted sum of those symbols. In this queue the clause with the smallest
weight will be picked next. Many variations of this scheme are possible. In [26], Schulz and Möhrmann
present various selection heuristics and the results of experiments they performed to see which of them
might work best.</p>
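      <p>As an illustration of the basic two-queue scheme just described (which is not Natty’s mechanism), a given-clause selector might look like this in OCaml, with plain lists standing in for real priority queues; all names are hypothetical.</p>

```ocaml
(* A sketch of the common two-queue given-clause selection scheme, with
   hypothetical names; plain lists stand in for priority queues. *)
type clause = { age : int; weight : int; name : string }

(* Pick the clause minimizing [by], keeping the earliest one on ties. *)
let best by = function
  | [] -> None
  | c :: cs ->
      Some (List.fold_left (fun a x -> if by x < by a then x else a) c cs)

(* Every [ratio]-th selection takes the oldest clause (FIFO behaviour);
   the other selections take the lightest clause by symbol weight. *)
let select_given ~ratio ~tick unprocessed =
  if tick mod ratio = 0 then best (fun c -> c.age) unprocessed
  else best (fun c -> c.weight) unprocessed
```

      <p>A real prover would of course use indexed heaps rather than a linear scan, and would remove the selected clause from the unprocessed set.</p>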
      <p>Natty uses a different, experimental given clause selection mechanism. In Natty there is only a single
priority queue, ordering all unprocessed clauses using a single cost function. Unlike in most systems,
where clauses are selected based on their size (or weight), Natty’s cost function reflects the changes in
size that have occurred during the proof steps that have derived a clause. It is based on the intuition
that many proofs involve only a small number (perhaps just one or even zero) of non-obvious or “uphill”
steps that may increase the size of the formula to be proved, combined with a larger number of “easy” or
“downhill” steps that reduce the formula size and bring the proof toward a close. For example, applying
mathematical induction will usually increase the formula size and can be considered an uphill step,
whereas applying a theorem that simplifies a formula is a downhill step that decreases the formula size.
Natty’s cost function is designed to penalize the uphill steps, and find proofs that go downhill most of
the time.</p>
      <p>To be more specific, Natty assigns a cost to each clause as follows. First, recall that the Knuth-Bendix
ordering assigns a weight to each clause, defined as the sum of the weights of all of its symbols. As
we saw before, Natty gives all symbols a weight of 1, except unary function symbols which have a
weight of 2. Let w(C) be the total weight of clause C, computed as in the Knuth-Bendix ordering except
considering the quantifier symbols ∀ and ∃ to have weight 0. (This is because e.g. prefixing a universal
quantifier to a formula does not really change its essence, and we do not want to count that as an uphill
step.)</p>
      <p>Now we can describe how costs are calculated. Each input clause has cost 0. Suppose that a new
clause E is derived from clauses D and C by superposition, where D and C are on the left and right in
the superposition rule. We will compute two values r(E) and c(E), both of which will be stored in
memory along with E. r(E) represents E’s cost relative to its right parent C, and c(E) represents its
absolute cost, which will be used for ranking it in the priority queue.</p>
      <p>r(E) is computed as follows. If w(E) &lt; w(C), then r(E) = 0.01. If w(E) = w(C), then r(E) =
0.02. Otherwise, r(E) = 1.0.</p>
      <p>c(E) is computed to be r(E) + h, where h is the cost that is inherited from the parents D and C, and
is computed as follows. Let S be a set of clauses containing D, C, and all ancestors of D or C in the
graph of inferences. Then h = ∑_{B ∈ S} r(B). To put it differently, h is the total cost of the proof that
derived D and C. Natty computes it by performing a depth-first search to find all ancestors of D and C.</p>
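      <p>The cost calculation can be sketched in OCaml as follows, with hypothetical record fields standing in for Natty’s actual data structures.</p>

```ocaml
(* A sketch of the clause cost calculation; names are hypothetical. *)
type clause = {
  id : int;                (* unique identifier *)
  weight : int;            (* Knuth-Bendix weight, quantifiers counted as 0 *)
  r : float;               (* cost relative to the right parent *)
  parents : clause list;   (* premises that derived this clause *)
}

(* r(E): compare the weight of the derived clause E with that of its
   right parent C. *)
let relative_cost ~derived_weight ~right_parent_weight =
  if derived_weight < right_parent_weight then 0.01
  else if derived_weight = right_parent_weight then 0.02
  else 1.0

(* h: the total relative cost of D, C and all their ancestors, found by
   depth-first search. *)
let inherited_cost d c =
  let visited = Hashtbl.create 16 in
  let rec total acc cl =
    if Hashtbl.mem visited cl.id then acc
    else begin
      Hashtbl.add visited cl.id ();
      List.fold_left total (acc +. cl.r) cl.parents
    end
  in
  total (total 0.0 d) c

(* c(E) = r(E) + h, the value used to rank E in the priority queue. *)
let absolute_cost ~derived_weight d c =
  relative_cost ~derived_weight ~right_parent_weight:c.weight
  +. inherited_cost d c
```

      <p>One design point worth noting is the visited table in the depth-first search: since S is a set and D and C may share ancestors, each ancestor’s r value must be counted only once.</p>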
      <p>One might ask why we compare the weight of the new clause E with the weight of C, and not with
D. One reason is that the superposition restriction Cσ ̸⪯ Dσ generally ensures that C will be larger,
and in some cases D may be quite small, e.g. an identity such as ∀n : N. 0 + n ≈ n.</p>
      <p>In searching for a proof, Natty only considers clauses whose costs fall under a fixed limit. By default,
this limit is 1.3, meaning that Natty will only find a proof that contains at most one uphill step (i.e.
a step costing 1.0). Any induction step will cost 1.0, so a proof by induction is not allowed to make
any more uphill steps at all. Nevertheless, Natty can prove many identities about the natural numbers
inductively within this cost limit.</p>
      <p>Before considering superposition inferences between two clauses, Natty precomputes the cost h that
will be inherited from those clauses, which is the minimum cost that any derived clause will have. If
this h exceeds the fixed cost limit, Natty will not even look for possible inferences between the clauses.
This optimization may make searches for proofs whose cost is near the limit significantly faster than in
a search with an unbounded cost.
6.5. Clausification
If we start with a clause containing a single formula and exhaustively apply the inference rules OC and
Split until no more applications are possible, we will eventually reach clause normal form. This process
is called clausification. If the starting formula is first order, clause normal form will be a conjunction of
disjunctions of atomic formulas. During the process of clausification, all quantifiers will be eliminated
and new Skolem constants will be introduced for quantifiers that are effectively existential.</p>
      <p>All provers must perform clausification in some way. Some provers such as E immediately clausify
every formula at the beginning of the proof process. Even if a prover does this, higher-order inferences
may create new formulas later on that are not in clause normal form. E will immediately clausify such
formulas as soon as they appear.</p>
      <p>Unfortunately, completely clausifying a formula will often create a large number of small clauses
whose relationship to the original formula is difficult to understand. As a result, even if a refutation is
found, the generated proof may be cryptic. A further disadvantage of immediate clausification is that it
destroys high-level formula structure that may be useful for inferring higher-level proof steps, such as
by unifying with the antecedent of a complex theorem.</p>
      <p>Natty attempts to preserve formula structure as much as possible when generating inferences.
However, the clausification rules OC and Split present a fundamental dilemma that all provers must
deal with somehow. If a prover applies these rules destructively, then formula structure is lost. On the
other hand, if it applies them non-destructively, it will introduce clauses that are redundant and in fact
equivalent. This may lead to many duplicate inferences if the redundant clauses remain present.</p>
      <p>Natty attempts to resolve this dilemma through a process of dynamic clausification , which works as
follows. Suppose that it is time to compute all possible superpositions between clauses D and C on
the left and right. Natty will apply the rule OC to D and C repeatedly as many times as is possible,
generating two sequences of clauses D₁, . . . , Dₘ and C₁, . . . , Cₙ. (Recall that OC never splits a clause.)
All of these clauses are kept in memory.</p>
      <p>Now Natty must look for superposition inferences. Every possible inference will involve some
equational literal from D, and some green subterm from C. Natty will consider all possible pairs
(Dᵢ, Cⱼ), where 1 ≤ i ≤ m and 1 ≤ j ≤ n. It will look for superposition inferences between Dᵢ and
Cⱼ, but only considering those literals that first appeared in Dᵢ, i.e. were not already present in Dᵢ₋₁.
Similarly, it will only consider green subterms that first appeared in Cⱼ. In this way, any superposition
inference that is generated between a literal from D and a subterm from C will use the earliest Dᵢ and
Cⱼ in which it is possible to make that inference, preserving as much formula structure as is possible.</p>
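      <p>The pairing scheme can be sketched as follows, abstracting clauses to lists of strings that stand for their literals or green subterms. The names are hypothetical, and the real implementation would attempt an actual superposition inference for each returned tuple.</p>

```ocaml
(* A sketch of the pairing scheme used in dynamic clausification. *)

(* Pair each element of a sequence with the items that first appear in
   it, i.e. that were not already present in its predecessor. *)
let with_new (extract : 'a -> string list) (xs : 'a list) =
  let rec go prev = function
    | [] -> []
    | x :: rest ->
        let items = extract x in
        let fresh = List.filter (fun e -> not (List.mem e prev)) items in
        (x, fresh) :: go items rest
  in
  go [] xs

(* Enumerate candidate inferences between every pair (d_i, c_j), using
   only literals new in d_i and subterms new in c_j, so that each
   inference is attempted at the earliest (least clausified) stage. *)
let dynamic_pairs ds cs =
  let ds' = with_new (fun d -> d) ds in
  let cs' = with_new (fun c -> c) cs in
  List.concat_map
    (fun (d, new_lits) ->
      List.concat_map
        (fun (c, new_subs) ->
          List.concat_map
            (fun l -> List.map (fun s -> (d, l, c, s)) new_subs)
            new_lits)
        cs')
    ds'
```

      <p>Because each literal and subterm is attributed to the stage at which it first appears, no candidate inference is ever generated twice across the two sequences.</p>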
      <p>Finally, when all possible inferences between D and C have been found, Natty will discard the
sequences D₁, . . . , Dₘ and C₁, . . . , Cₙ from memory. They will be regenerated again when necessary.
We believe this dynamic clausification process is actually fairly cheap, since most subterms between
the dynamically generated formulas are shared in memory.</p>
      <p>In this way, each clause in the generated set actually serves as a representative of all the clauses that
can be generated from it using the rule OC.</p>
      <p>When Natty adds a clause C to the processed set, it looks for a way to split the clause by applying
OC zero or more times and then applying the rule Split. If this is possible, it will apply this rule
non-destructively, creating two new clauses and retaining the original clause C. Any particular clause
will only be split at most once. The newly created clauses will have an r value of 0, so their costs will be
the same as that of the parent clause. For that reason, they will soon be pulled from the priority queue
and also enter the processed set, at which point they may split again. In this way, all clauses of the
normal form of the original clause will be generated in a short time. There may be a fair number of
such clauses, but none of them will be equivalent to any other.</p>
      <p>This scheme seems to work reasonably well, but the non-destructive splitting may still sometimes
cause an undesirable level of redundancy between active clauses. We intend to experiment more with
variations of this scheme that may allow splitting to be destructive in some cases.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Performance</title>
      <p>We evaluated Natty’s performance versus that of E 3.0.08, Vampire 4.8, and Zipperposition 2.1 in proving
theorems and proof steps exported to THF files from nat.n. As we saw above, this file defines the
natural numbers axiomatically and then asserts various elementary theorems about them. Many of
these theorems require proofs by induction. We ran E with the --auto option, and built Vampire from
the v4.8HO4Sledgahammer tag (dated October 19, 2023) containing higher-order improvements not yet
merged into its master branch. We ran all of these provers in a mode that used a single strategy.</p>
      <p>nat.n contains 18 theorems. Of these, 3 include hand-written natural-language proofs, because they
are too difficult for Natty to prove automatically. Natty generates a total of 43 proof steps from these
proofs.</p>
      <p>We ran the four provers on all the exported theorems, and then again separately on all the exported
proof steps. We used a 5-second time limit for all proof attempts. This limit is much shorter than is
typical in competitive evaluations of automatic theorem provers, but we chose it to reflect Natty’s
intended use as a prover that can be used interactively and quickly.</p>
      <p>The results are in Table 1 and Table 2. We show individual results for theorems, but only aggregate
results for proof steps. The average times in the tables only reflect theorems or proof steps that were
actually proved, and are unaffected by failed proof attempts. The tables also show the PAR-2 score,
which is the average time over all attempts, in which a failed attempt is assigned twice the time limit.</p>
      <p>We think Natty’s performance at this early stage is promising. It was able to prove almost as many
theorems as any prover, and it proved the most proof steps of any of the provers. The comparison may
not be entirely fair, since the other provers may have been designed mostly to solve difficult problems
after running for longer periods of time using a portfolio of strategies, not easy problems in just a few
seconds. Still, Natty seems adequate for its intended use case. Furthermore, we could certainly optimize
its performance more. For example, it is not currently performing any term indexing at all.</p>
    </sec>
    <sec id="sec-8">
      <title>8. Future work</title>
      <p>We do not have space here to list all the enhancements we would like to make to Natty, but here are
some of our plans. A major goal for the near future is to move beyond the natural numbers and prove
theorems about the integers. To do this, we will need some sort of overloading, polymorphism, or type
classes to handle e.g. the fact that addition on naturals and on the integers will have the same name
and syntax but will be two different functions. Before long we will also need some way to define types
inductively, or to define new types as isomorphic to equivalence classes of an existing type.</p>
      <p>An interactive environment where the user can edit a proof and get feedback on it in real time is also
a high priority.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgments</title>
      <p>Thanks to Martin Suda and the anonymous reviewers for their helpful comments.</p>
      <p>This research was supported by the Grant Agency of Charles University (grant 128524).</p>
      <p>[25] J. Denzinger, M. Kronenburg, S. Schulz, DISCOUNT: a distributed and learning equational prover,
Journal of Automated Reasoning 18 (1997) 189–198.</p>
      <p>[26] S. Schulz, M. Möhrmann, Performance of clause selection heuristics for saturation-based theorem
proving, in: Automated Reasoning: 8th International Joint Conference, IJCAR 2016, Coimbra,
Portugal, June 27–July 2, 2016, Proceedings 8, Springer, 2016, pp. 330–345.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kuhn</surname>
          </string-name>
          ,
          <article-title>A survey and classification of controlled natural languages</article-title>
          ,
          <source>Computational Linguistics</source>
          <volume>40</volume>
          (
          <year>2014</year>
          )
          <fpage>121</fpage>
          -
          <lpage>170</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hales</surname>
          </string-name>
          ,
          <article-title>An argument for controlled natural languages in mathematics, 2019</article-title>
          . URL: https://jiggerwit. wordpress.com/wp-content/uploads/2019/06/header.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Sutcliffe</surname>
          </string-name>
          ,
          <article-title>The logic languages of the TPTP world</article-title>
          ,
          <source>Logic Journal of the IGPL</source>
          <volume>31</volume>
          (
          <year>2023</year>
          )
          <fpage>1153</fpage>
          -
          <lpage>1169</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dingle</surname>
          </string-name>
          , Natty code repository,
          <year>2024</year>
          . URL: https://github.com/medovina/natty.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Cramer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Fisseni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Koepke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kühlwein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schröder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Veldman</surname>
          </string-name>
          ,
          <article-title>The Naproche project: controlled natural language proof checking of mathematical texts</article-title>
          , in: International Workshop on Controlled Natural Language, Springer,
          <year>2009</year>
          , pp.
          <fpage>170</fpage>
          -
          <lpage>186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Paulson</surname>
          </string-name>
          ,
          <article-title>Isabelle: A generic theorem prover</article-title>
          , Springer,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L. d.</given-names>
            <surname>Moura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ullrich</surname>
          </string-name>
          ,
          <article-title>The Lean 4 theorem prover and programming language</article-title>
          ,
          <source>in: Automated Deduction-CADE 28: 28th International Conference on Automated Deduction, Virtual Event, July 12-15</source>
          ,
          <year>2021</year>
          , Proceedings 28, Springer,
          <year>2021</year>
          , pp.
          <fpage>625</fpage>
          -
          <lpage>635</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulz</surname>
          </string-name>
          ,
          <article-title>E - a brainiac theorem prover</article-title>
          ,
          <source>AI Communications</source>
          <volume>15</volume>
          (
          <year>2002</year>
          )
          <fpage>111</fpage>
          -
          <lpage>126</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Bloch</surname>
          </string-name>
          ,
          <source>The real numbers and real analysis</source>
          ,
          <source>Springer Science &amp; Business Media</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Berghofer</surname>
          </string-name>
          , M. Wenzel,
          <article-title>Inductive datatypes in HOL-lessons learned in formal-logic engineering</article-title>
          , in: International Conference on Theorem Proving in Higher Order Logics, Springer,
          <year>1999</year>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Harrison</surname>
          </string-name>
          ,
          <article-title>Inductive definitions: automation and application</article-title>
          , in: International Conference on Theorem Proving in Higher Order Logics, Springer,
          <year>1995</year>
          , pp.
          <fpage>200</fpage>
          -
          <lpage>213</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Harrison</surname>
          </string-name>
          ,
          <article-title>HOL light: A tutorial introduction</article-title>
          , in: International Conference on Formal Methods in Computer-Aided Design, Springer,
          <year>1996</year>
          , pp.
          <fpage>265</fpage>
          -
          <lpage>269</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>De Lon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Koepke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lorenzen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Marti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schütz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wenzel</surname>
          </string-name>
          ,
          <article-title>The Isabelle/Naproche natural language proof assistant</article-title>
          ,
          <source>in: Automated Deduction - CADE 28: 28th International Conference on Automated Deduction, Virtual Event, July 12-15, 2021, Proceedings 28</source>
          , Springer International Publishing,
          <year>2021</year>
          , pp.
          <fpage>614</fpage>
          -
          <lpage>624</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Vukmirović</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Blanchette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulz</surname>
          </string-name>
          ,
          <article-title>Extending a high-performance prover to higher-order logic</article-title>
          ,
          <source>in: International Conference on Tools and Algorithms for the Construction and Analysis of Systems</source>
          , Springer,
          <year>2023</year>
          , pp.
          <fpage>111</fpage>
          -
          <lpage>129</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhayat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Reger</surname>
          </string-name>
          ,
          <article-title>A combinator-based superposition calculus for higher-order logic</article-title>
          ,
          <source>in: International Joint Conference on Automated Reasoning</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>278</fpage>
          -
          <lpage>296</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Vukmirović</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bentkamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Blanchette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cruanes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Nummelin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tourret</surname>
          </string-name>
          ,
          <article-title>Making higher-order superposition work</article-title>
          ,
          <source>in: CADE</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>415</fpage>
          -
          <lpage>432</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>P. B.</given-names>
            <surname>Andrews</surname>
          </string-name>
          ,
          <source>An introduction to mathematical logic and type theory: to truth through proof</source>
          , volume
          <volume>27</volume>
          ,
          <source>Springer Science &amp; Business Media</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mouratov</surname>
          </string-name>
          ,
          <source>MParser</source>
          ,
          <year>2024</year>
          . URL: https://github.com/murmour/mparser.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>D.</given-names>
            <surname>Leijen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Meijer</surname>
          </string-name>
          ,
          <article-title>Parsec: direct style monadic parser combinators for the real world</article-title>
          (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bentkamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Blanchette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tourret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vukmirović</surname>
          </string-name>
          ,
          <article-title>Superposition for higher-order logic</article-title>
          ,
          <source>Journal of Automated Reasoning</source>
          <volume>67</volume>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>L.</given-names>
            <surname>Bachmair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ganzinger</surname>
          </string-name>
          ,
          <article-title>Rewrite-based equational theorem proving with selection and simplification</article-title>
          ,
          <source>Journal of Logic and Computation</source>
          <volume>4</volume>
          (
          <year>1994</year>
          )
          <fpage>217</fpage>
          -
          <lpage>247</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>V.</given-names>
            <surname>Nummelin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bentkamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tourret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vukmirović</surname>
          </string-name>
          ,
          <article-title>Superposition with first-class booleans and inprocessing clausification</article-title>
          ,
          <source>in: CADE</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>378</fpage>
          -
          <lpage>395</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Avenhaus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hillenbrand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Löchner</surname>
          </string-name>
          ,
          <article-title>On using ground joinable equations in equational theorem proving</article-title>
          ,
          <source>Journal of Symbolic Computation</source>
          <volume>36</volume>
          (
          <year>2003</year>
          )
          <fpage>217</fpage>
          -
          <lpage>233</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ludwig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Waldmann</surname>
          </string-name>
          ,
          <article-title>An extension of the Knuth-Bendix ordering with LPO-like properties</article-title>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>