<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Structure-Generating First-Order Theorem Proving</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christoph Wernhard</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Potsdam</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <fpage>64</fpage>
      <lpage>83</lpage>
      <abstract>
        <p>Provers based on the connection method can become much more powerful than currently believed. We substantiate this thesis with certain generalizations of known techniques. In particular, we generalize proof structure enumeration interwoven with unification - the proceeding of goal-driven connection and clausal tableaux provers - to an interplay of goal- and axiom-driven processing. It permits lemma re-use and heuristic restrictions known from saturating provers. Proof structure terms, proof objects that allow to specify and implement various ways of building proofs, are central for our approach. Meredith's condensed detachment represents the prototypical base case of such proof structure terms. Hence, we focus on condensed detachment problems, first-order Horn problems of a specific form. Experiments show that for this problem class the approach keeps up with state-of-the-art first-order provers, leads to remarkably short proofs, solves an ATP challenge problem, and is useful in machine learning for ATP. A general aim is to make ATP more accessible to systematic investigations in the space between calculi and implementation aspects.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Our thesis is that provers based on the connection method (CM) can become much more
powerful than currently believed. We substantiate this with certain generalizations of known
techniques.</p>
      <p>
        Current realizations of the CM operate by enumerating proof structures, interwoven with
unification of atomic formulas that are associated with nodes in the structure. The scope of
variables in these formulas includes the whole structure; in technical terms, they are rigid [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Failure of unification on a partially built-up structure effects backtracking. All those structures
that would extend this partial structure are abandoned at once.
      </p>
      <p>
        As the proof structures are often represented in a tree form as clausal tableaux [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we
call provers that proceed in this way CM-CT provers, for Connection Method/Clausal Tableaux.
Examples are the Prolog Technology Theorem Prover [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], SETHEO [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], leanCoP [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ] and
CMProver [
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7, 8, 9</xref>
        ]. Aside of the CM and clausal tableaux also model elimination [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is a
conceptual model that covers such provers. The underlying principle of enumerating proof
structures in combination with unification is already present in [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ], as an improvement
compared to enumerating formula instantiations. In [13, Sect. 3] resolution theorem proving is
contrasted as a formula enumeration process with tableau enumeration. With techniques based
on proof structure enumeration, each proof structure usually appears at most once, while, e.g.,
in proof search by resolution the same subproof may appear several times.
AReCCa 2023
      </p>
      <p>CEUR-WS.org</p>
      <p>
        Typically, CM-CT provers perform the structure enumeration wrapped in iterative deepening
upon the maximal value of some structure size measure. A basic motivation there is completeness
while avoiding the storage requirements of breadth-first search [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In addition, alternate depth
measures are explored as heuristics variations, e.g., [4, Sect. 6.2].
      </p>
      <p>CM-CT provers proceed goal-driven through an initially instantiated goal formula at the top
of the proof structure, e.g., with a ground clause obtained from Skolemizing universal variables
in the original goal formula.</p>
      <p>
        Structure-generating first-order theorem proving as discussed here generalizes these principles.
One major influence is the consideration of proof structure terms in Carew A. Meredith’s
condensed detachment (CD) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. These are full binary trees,1 called D-terms [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], referring to
their convenient presentation as terms with the binary function symbol D. A D-term permits to
associate a most general formula that is proven by it if the constants or leaf nodes are associated
with given axioms. See, e.g., [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] for original examples. D-terms allow to specify many useful
operations on proof structures in immediate and straightforward ways. For example, notions
of proof size. Or proof composition: proofs of lemmas can simply be inserted into an overall
proof term. Such a composition of proofs is, for example, necessary to obtain an overall proof
in the case where precomputed and preselected lemmas are supplied as additional axioms to
the theorem prover [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Also decomposition of proofs is straightforward: Each subterm of a
D-term is itself a D-term that represents the proof of a lemma.
      </p>
      <p>
        CD plays a core role in witness theory [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and is known [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] as the first technical approach
to the Curry-Howard correspondence or “formulas as types” [
        <xref ref-type="bibr" rid="ref20 ref21">20, 21</xref>
        ]. It provides the basis of
Metamath2 [
        <xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>
        ], a rather successful computer-processable language for verifying, archiving,
and presenting mathematical proofs. The problems covered by CD form a subclass of first-order
Horn problems. Although many historic successes in ATP were achieved involving CD problems
[
        <xref ref-type="bibr" rid="ref24 ref25 ref26 ref27 ref28 ref29 ref30 ref31 ref32">24, 25, 26, 27, 28, 29, 30, 31, 32</xref>
        ], CD was in ATP mostly considered just as a restricted form of
hyperresolution. Emphasis on its proof structures became prevalent in ATP only later, based on
observing parallels to the CM [
        <xref ref-type="bibr" rid="ref15 ref33">33, 15</xref>
        ].
      </p>
      <p>
        Thus our investigations focuses on CD problems, for which these D-terms are readily available,
and we call our prototypical system to investigate the approach SGCD – Structure-Generating
theorem proving for Condensed Detachment. Our starting point is enumerating proof structures,
interwoven with unification, as performed by the CM-CT provers. SGCD computes proofs in
increasing levels, characterized for example by some size measure of the D-terms, e.g., number
of nodes or height, similar to the iterative deepening in CM-CT provers. The simple form of the
D-terms, just full binary trees, allows to find precise upper bounds of the search space growth
for increasing levels in the The On-Line Encyclopedia of Integer Sequences (OEIS) [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]. Also
measures that were so far hardly considered for CM-CT provers can be taken into account. For
example compacted size, that is, the number of nodes of the minimal DAG representation of the
proof tree. Or we can characterize sets of structures level by level in some inductive way. With
one specific such characterization, the so-called PSP-level (PSP for Proof-SubProof ), SGCD finds
a strikingly short proof of a historic problem by Łukasiewicz [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and a proof of a problem that
was solved by Meredith but was so far very hard for ATP [
        <xref ref-type="bibr" rid="ref17 ref31">17, 31</xref>
        ].
      </p>
      <sec id="sec-1-1">
        <title>1A binary tree is full if each node has 2 or 0 children. 2https://us.metamath.org/.</title>
        <p>Another issue to address is the purely goal-driven operation of the CM-CT provers. Their
depth-first search effects that subgoals are re-proven again and again. Moreover, information
about the goal in form of the initial goal instantiation is often propagated only to a few subgoals
that are very close to the overall goal.</p>
        <p>
          From our enumeration view, instantiating the top node with a given goal is just an optional
possibility. We can also leave it with variables and collect their bindings in each enumerated
structure. Under such a binding, the goal is a lemma formula that can be proven by the structure
in a given level from given axioms. Alternate proof structures yield alternate bindings, alternate
lemmas. We call this mode axiom-driven. SETHEO was used for bottom-up preprocessing in this
way [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ]. Hypertableaux [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ] may be viewed as proving by maintaining such derived lemmas
in a specific way. SGCD can be invoked in a loop where goal-driven phases alternate with
axiom-driven phases, for increasing levels. The parameters that control this loop are highly
configurable. Unit lemmas obtained by the axiom-driven phases are cached and available in
later phases, both axiom- and goal-driven. They can be deleted on the basis of heuristics that are
well-known from saturating theorem proving, for example deleting lemmas whose term height
exceeds a threshold. Such heuristics are usually not available in CM-CT provers, because the
atom instances maintained by them are deeper instantiated, impacted always from the overall
proof structure in contrast to just the substructure that proves the lemma.
        </p>
        <p>In a typical configuration SGCD starts with an empty cache and proceeds in a main loop
over a “current” level initialized with 0. As level we may, e.g., assume tree size, the number of
inner nodes of the proof tree. In each iteration of the loop SGCD first tries to prove the goal in
the goal-driven way with iterative deepening upon a level that starts with the current level, is
incremented in each deepening step by one, and ends with, say, the current level plus 2. Much
like a CM-CT prover with iterative deepening except that: (1) Iterative deepening starts at the
current level and is only tried for two more levels; (2) In addition to the original axioms also
cached lemmas are used to solve subgoals. If no proof is found in the goal-driven phase, SGCD
invokes the axiom-driven mode to generate lemmas in the current level, which are then put
into the cache. For generating these lemmas it solves subgoals by lemmas in smaller levels that
were cached in previous iterations of the main loop. With a generated lemma not just its proof
but also its level is recorded. The level is then available to control the search when the lemma is
used to solve a subgoal in either goal- or axiom-driven mode. If, for example, the search is for
an overall proof with tree size 10 where the fragment under construction already has tree size 8,
then only lemmas with a tree size of at most 2 may be attached to the fragment. After placing
the lemmas generated by the axiom-driven phase in the cache the current level is incremented
by one and SGCD moves to the next iteration of its main loop.</p>
        <p>
          In the recent terminology of [
          <xref ref-type="bibr" rid="ref37">37</xref>
          ], our approach combines the following features: Cs
(separation of proof structural information from the formula), Cm (mixed manner of operation, that
combines starting from the goal theorem (Cg) and starting from the axioms), and a form of Cc
(cut) through unit lemmas.
        </p>
        <p>
          In the following Sect. 2 we provide more background on condensed detachment. Then, in
Sect. 3 we explicate our approach with presenting the system SGCD (Structure-Generating
theorem proving for Condensed Detachment), covering the range from general concepts to
implementation aspects and experimental results. In Sect. 4 we discus some particular open
issues and speculative ideas concerning the approach. Section 5 concludes the paper.
Detailed descriptions of experiments are provided in [
          <xref ref-type="bibr" rid="ref38">38</xref>
          ] and [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
        <p>The system is available as free software from http://cs.christophwernhard.com/cdtools/,
where also additional data and presentations for the experiments can be found, including
HTML-formatted comprehensive result tables and full logs.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Background: Condensed Detachment for ATP</title>
      <p>
        CD was developed in the mid-1950s by Carew A. Meredith as an evolution of the method of
substitution and detachment practiced by Łukasiewicz [
        <xref ref-type="bibr" rid="ref14 ref39 ref40 ref41">14, 39, 40, 41</xref>
        ]. Reasoning steps are by
detachment, or modus ponens, under implicit substitution by most general unifiers. Its basic
application field is the investigation of axiomatizations of propositional logics from a first-order
meta level. Some of the most advanced formal proofs ever developed by humans were such
applications of CD by Meredith and some other researchers. From a first-order ATP perspective,
a CD problem consists of proper axioms (briefly axioms), that is, positive unit clauses, a goal
theorem, that is, a single negative ground unit clause (representing a universally quantified
atomic goal theorem after Skolemization), and the following ternary Horn clause that models
detachment.
      </p>
      <p>Det =def P(i(x, y)) ∧ P(x) → P(y).</p>
      <p>The premises of Det are called major and minor premise, respectively. All atoms in the problem
have the same predicate P, which is unary and stands for something like provable. The formulas
of the investigated propositional logic are expressed as terms, where the binary function symbol i
stands for implies.</p>
      <p>
        CD may be considered as an inference rule. With this view, the most straightforward way to
describe a CD inference step from an ATP perspective is to consider it as a positive hyperresolution
step such that from Det and two positive unit clauses, used as major and minor premise, a
third positive unit clause is inferred. A CD proof is a proof of a CD problem constructed with
the CD inference rule, in contrast to, for example, a proof involving binary resolution steps
involving Det that yield non-unit resolvents. Prover9 [
        <xref ref-type="bibr" rid="ref42">42</xref>
        ] actually by default chooses positive
hyperresolution as inference rule for CD problems and thus produces CD proofs for these.
      </p>
      <p>
        The structure of CD proofs can be represented in a very simple and convenient way as full
binary trees or terms, which can be directly taken into account as objects in proving. In ATP
we find this aspect in the connection methods (CM) [
        <xref ref-type="bibr" rid="ref43 ref44 ref45">43, 44, 45</xref>
        ], where the proof structure as a
whole is in the focus, in contrast to extending a set of formulas by inferred formulas. This view
of CD has been made precise and elaborated in [
        <xref ref-type="bibr" rid="ref15 ref33">33, 15</xref>
        ].
      </p>
      <p>As already mentioned, we call the structure representations of CD proofs D-terms. A D-term is
a term built from D as binary function symbol and atom labels as constant symbols or primitive
D-Terms, labels of axioms. In other words, it is a full binary tree where the leaf nodes are labels
of axioms. The following line shows four example D-terms for axiom labels 1, 2.</p>
      <p>1, 2, D(1, 1), D(D(2, 1), D(1, D(2, 1))).</p>
      <p>The mapping between the representation of CD proofs in terms of CD inference steps and
D-terms is straightforward: The use of an axiom corresponds to a primitive D-term. A CD
inference step corresponds to a D-term D(d1, d2) where d1 is the D-term that proves the unit
clause for the major premise and d2 is the D-term that proves the unit clause for the minor
premise.</p>
      <p>
        A CD proof in full is represented by a D-term (which just describes structure) taken together
with an axiom assignment, that is, a mapping of primitive D-terms to axioms. It proves from the
axioms all instances of a most general formula that is associated through unification, constrained
by the axioms for the leaf nodes and the requirements imposed by Det for the inner nodes.3
Following [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], we call this formula the most general theorem (MGT) of the D-term with respect
to the axiom assignment. The MGT is unique, modulo renaming of variables. Of course, for a
given axiom assignment not all D-terms necessarily have an MGT. If the unification constraints
imposed by the axioms and Det cannot be satisfied, we say the D-term has no MGT. And, of
course, also depending on the axiom assignment, it is well possible that two different D-terms
have the same MGT or that the MGT of one is subsumed by the MGT of the other. A CD problem
also includes a goal theorem, a ground atom. A D-term is a proof of the problem if its MGT
with respect to the axioms subsumes the goal theorem.
      </p>
      <p>
        Example 1. Consider the axiom assignment α =def {1 7→ P(i(x, i(x, x)))} declaring the primitive
D-term 1 as label of the axiom known as Mingle [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ]. The MGT of the primitive D-term 1 is then
just this axiom. The MGT of the D-term D(1, 1) is P(y)σ where σ is the most general unifier of
the set of pairs
      </p>
      <p>{{P(i(x, y)), P(i(x′, i(x′, x′)))}, {P(x), P(i(x′′, i(x′′, x′′)))}}.</p>
      <p>The most general unifier of these pairs is σ = {x 7→ i(x′′, i(x′′, x′′)), x′ 7→ i(x′′, i(x′′, x′′)), y 7→
i(i(x′′, i(x′′, x′′)), i(x′′, i(x′′, x′′))).}. Hence, after renaming variables (recall that the MGT is just
characterized modulo renaming of variables) the MGT of D(1, 1) is P(i(x, i(x, x)), i(x, i(x, x))).
A D-term proves as goal theorems all ground instances of its MGT, which is defined with respect
to an axiom assignment. So here D(1, 1) proves P(i(a, i(a, a)), i(a, i(a, a))), and also, for example
P(i(i(a, b), i(i(a, b), i(a, b))), i(i(a, b), i(i(a, b), i(a, b)))).</p>
      <p>
        D-terms, full binary trees, facilitate characterizing and investigating structural properties of
proofs. While, for a variety of reasons, it is far from obvious how to measure the size of proofs
by ATP systems in general, for D-terms there are three immediate size measures:
• The tree size of a D-term is the number of its inner nodes.
• The height of a D-term is the number of edges of the longest downward path from the
root to a leaf.
• The compacted size of a D-term is the number of distinct compound subterms, or, in other
words, the number of inner nodes of its minimal DAG.4
In the literature compacted size is also called length, height level and tree size CDcount [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
3There is actually some fine-print here with respect to the usual notion of most general unifier [
        <xref ref-type="bibr" rid="ref15 ref20 ref33">33, 15, 20</xref>
        ].
4It is well known that a tree or set of trees is represented by a unique (modulo isomorphism) minimal DAG, which is
maximally factored (it has no multiple occurrences of the same subtree) or, equivalently, is minimal with respect to
the number of nodes.
      </p>
      <sec id="sec-2-1">
        <title>Example 2. The D-term</title>
        <p>D(D(1, D(1, 1)), D(D(1, D(1, 1)), D(D(1, 1), 1)))
has tree size 8, height 4 and compacted size 5. If we replace the second occurrence of 1 from left
with 2 we obtain</p>
        <p>D(D(1, D(2, 1)), D(D(1, D(1, 1)), D(D(1, 1), 1))).</p>
      </sec>
      <sec id="sec-2-2">
        <title>Tree size and height remain unaltered, but the compacted size is then 7.</title>
        <p>CD proofs suggest and allow a specific form of lemmas, which we call unit/subtree lemmas,
reflecting two views: As formulas, they are positive unit clauses, which can be re-used in
different CD inference steps. In the structural view, they are subterms (or subtrees5) of the
overall D-term. If they occur multiply there, they would be factored in the minimal DAG of
the overall D-term. Both views are linked in that the formula of a lemma is the MGT of its
D-term. The most adequate size measure for D-terms that takes the compression achieved by
unit/subtree lemmas into account is compacted size. From the perspective of proof structure
compression methods, unit/subtree lemmas have the property that the compression target is
unique, simply because each tree is represented by a unique minimal DAG.</p>
        <p>
          A textual representation of a D-term that respects its compacted size, the DAG compression
by unit/subtree lemmas, is provided by a list of factor equations. There, distinct subproofs
with multiple incoming edges in the DAG receive numeric labels by which they are referenced.
Meredith uses such a form with Polish notation for D-terms and their MGTs, e.g., in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], which
is discussed in [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
        <p>Example 3. Consider the D-term D(D(1, D(1, 1)), D(D(1, D(1, 1)), D(D(1, 1), 1))) from
Example 2. As a list of factor equations it can be represented as follows.</p>
        <p>2 =
3 =
4 =</p>
        <p>D(1, 1)
D(1, 2)</p>
        <p>D(3, D(3, D(2, 1)))
Here we assume that labels 2–4 are not used for axioms and thus are free for use as reference labels.
Label 4, which is not referenced, is associated with the root or the overall D-term.</p>
        <p>Compared to general first-order ATP problems, CD problems are restricted: viewed as a
CNF to be refuted, the problem has positive unit clauses, a single negative ground clause, and
a single ternary Horn clause. Also equality is not explicitly considered. Nevertheless, core
characteristics of general first-order ATP problems are present: first-order variables, a binary
function symbol, and cyclic predicate dependency.6
5We use subtree with the meaning common in computer science and matching the notion of subterm: A subtree of a
tree T is a tree consisting of a node in T and all of its descendants in T .
6In the sense of the dependency graph of a logic program. It has an edge from predicate p to predicate q if there is a
clause with a body atom whose predicate is p and with a head whose predicate is q.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. SGCD – Structure-Generating Theorem Proving for</title>
    </sec>
    <sec id="sec-4">
      <title>Condensed Detachment</title>
      <p>
        We explicate our approach by means of describing the SGCD system, which is implemented
in SWI-Prolog [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ]. There we enter the space between calculi and “system descriptions”. In
a sense, the calculus (e.g., condensed detachment as some inference system) just tells us in
an inductive way how proof structures are built-up and relate to proven formulas or MGTs.
Although the inductive specification suggests a construction, for theorem proving in practice,
there are many ways in which the proof structures can actually be built. We think, this is not
merely the domain of ad-hoc low-level heuristics, but a potentially large field that should be
systematically approached as well.
      </p>
      <p>The programming language Prolog is inherently rather close to proof structure enumeration
with incorporated unification. In fact, it expresses arbitrary computations according to this
principle. A predicate (or nondeterministic procedure) enumerates alternate bindings of output
variables, each corresponding to a different (implicitly represented) proof of the “predicate”,
viewed as a goal unit clause with free output variables. Thus, Prolog is for SGCD not just
suitable as an implementation language, but also to abstractly describe the involved methods.
3.1. The Core Enumeration Predicate, its Modes and the Cache
We assume CD problems with D-terms as proof structures. Further we assume that the set of
D-terms is grouped into levels, possibly but not necessarily disjoint, where each strict subterm
of a given D-term is in some strictly lower level than the given D-term. The set of the sets of all
D-terms with a specific tree size is an example of such a level grouping. We then say that the
level is characterized by the tree size.</p>
      <p>At the core of SGCD is a ternary predicate that, for a given level, enumerates all pairs of a
D-term in the level and a formula such that the formula is the MGT of the D-term, with respect
to some assumed axioms that do not explicitly appear as parameter.</p>
      <p>enum_dterm_mgt_pairs(+Level, ?D-term, ?Formula)</p>
      <p>Depending on the mode in which enum_dterm_mgt_pairs is invoked, that is, which of the
D-term and Formula arguments are inputs or outputs, it realizes different functionalities. We
assume there that as an input argument Formula is ground, representing a universally quantified
theorem after Skolemization, whereas, as an output argument it is just the MGT of D-term,
which may involve variables.</p>
      <p>• If D-term and Formula are both input arguments, then the predicate verifies that D-term
is a proof of Formula. The predicate then either succeeds (enumerates the empty binding
as sole value) or fails.
• If only D-term is an input and Formula an output, it computes the MGT of D-term. The
predicate then either enumerates a single value for Formula or, in case D-term has no
MGT, fails.
• If only Formula is an input and D-term is an output, it acts as a goal-driven prover,
enumerating proofs of Formula.
• If both D-term and Formula are outputs, it enumerates pairs of a proof and its MGT, that
is, it generates lemmas with proofs in an axiom-driven way.</p>
      <p>Invoked goal-driven, i.e., with Formula as input and D-term as output, for increasing values
of Level it is in essence a CM-CT prover with iterative deepening, specialized to CD problems.</p>
      <p>To process the lemmas enumerated by axiom-driven invocations, i.e., with both D-term and
Formula as outputs, SGCD maintains a cache of hLevel, D-term, Formulai triples, solutions of
enum_dterm_mgt_pairs(Level, D-term, Formula). If lemmas are computed and cached for
increasing levels, this cache can be used within the implementation of enum_dterm_mgt_pairs
to solve subproblems from the cache instead of recomputing them, if they are for a lower level
than that of the top call. This is possible not just within axiom-driven top calls, but also within
goal-driven invocations. A part of the proof search is then replaced by retrieval from the cache.7</p>
      <p>
        Moreover, the cache can be subjected to heuristic restrictions based on formulas, the MGTs,
as known from resolution-based (or more generally, saturating) provers. There, for example,
lemmas with formulas that are subsumed by an already present lemma or whose term height
exceeds some threshold are discarded. Such heuristic restrictions are not possible with
conventional CM-CT provers, because these do not have MGTs at hand, but rather just formulas that
are due to rigid variables more deeper instantiated, in dependence of the context where they
occur in the proof (this is discussed in more depth as IPT vs. MGT in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]). With such heuristic
restrictions, the replacement of search by lemmas realized with SGCD’s cache then effects an
actual modification, a heuristically motivated restriction, of the explored search space.
3.2. Level Characterizations
SGCD can be used with alternate versions of enum_dterm_mgt_pairs for different level
characterizations, including by tree size, height, maximal tree size, and maximal height. The
number of distinct D-terms for increasing values of some size measure gives an upper bound of
the number of trees to consider in proof search by enumerating D-terms level-by-level. This
number is just an upper bound, because it does not take into account that D-term enumeration
is interwoven with unification. Through the unification, which is constrained by given axioms
and possibly a given goal, fragments of D-terms for which unifiability fails are immediately
discarded and D-terms extending them are skipped in the enumeration. Heuristic restrictions
may further limit the number of enumerated structures.
      </p>
      <p>The number of distinct D-terms for increasing values of some size measure also indicates a
measure-specific size value up to which it is easily possible to compute for given axioms all
proofs, together with the lemma formulas proven by them.</p>
      <p>If we assume a single axiom such that we can identify D-terms with full binary trees without
any additional labeling, the sequences of the number of distinct D-terms for increasing tree
size, height, or compacted size are well-known and can be found in the OEIS, with identifiers
A000108, A001699, and A254789, respectively, as shown in Table 1.</p>
      <p>Aside of these “conventional” criteria, our use of proof structure terms also permits inductive
structure specifications as level characterizations. Particularly prospective there is the PSP-level
(indicating Proof-SubProof ), specified as follows.</p>
      <sec id="sec-4-1">
        <title>7This realizes a replacing form of lemma application [47, 17]</title>
        <p>|PSPLevel (n)| oeis:A001147 1 1 3 15 105 945 10,395 135,135 2,027,025
Compacted size oeis:A254789 1 1 3 15 111 1,119 14,487 230,943 4,395,855
(i) Structures in PSP-level 0 are the axiom identifiers.
(ii) Structures in PSP-level n + 1 are all structures D(d1, d2) and D(d2, d1) where d1 is a
structure in PSP-level n and d2 is a (not necessarily strict) subterm of d1 or an axiom
identifier.</p>
        <p>
          The PSP-level was motivated by an analysis [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] of Meredith’s variant [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] of Łukasiewicz’s
completeness proof for his shortest single axiom for implicational logic [
          <xref ref-type="bibr" rid="ref48">48</xref>
          ], where it was
observed that most steps in Meredith’s proof can be ascribed to the relationship between levels
that underlies the PSP-level. As an inferencing technique it is “global” in the sense that it takes
whole proof structures and combines them to a new proof structure. For this reason it may so
far have escaped the traditional inference systems, which are more locally oriented, e.g., by
extending a branch based on properties of the leaf node or adding the resolvent of two clauses.
        </p>
        <p>PSP-levels are disjoint. All D-terms in PSP-level n have compacted size n. However, the
cardinality of D-terms in PSP-level n grows slower than that of D-terms of compacted size n,
according to OEIS sequence A001147 in contrast to A254789. Table 2 compares the initial values
of both sequences. It follows that the enumeration of D-terms according to the PSP-level is
“incomplete”, that is, there are D-terms that are not a member of any PSP-level.</p>
        <p>
          Experimental successes involving enumeration by PSP-level include relatively short proofs
for problems where exhaustive search for a shortest proof seems unfeasible [38, Sect. 4.5], a
short proof for Łukasiewicz’s shortest axiom, LCL038-1 [38, Sect. 4.6][
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], and an automatically
found proof of a challenge problem, the “Meredith single axiom” LCL073-1 [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
3.3. Combining Goal- and Axiom-Driven Reasoning
SGCD combines goal- and axiom-driven invocation of enum_dterm_mgt_pairs by
proceeding in two nested loops, as informally outlined in the introduction and shown in Fig. 1. The
goal formula, in Skolemized ground form, is given as input parameter g. The parameters
maxLevel and preAddMaxLevel are specified with the configuration. The core predicate
enum_dterm_mgt_pairs is for some particular assumed level characterization, for
example by tree size. It enumerates argument bindings nondeterministically. If it succeeds once in
the inner loop, the exception proof_found(d) returns the bound D-term d.
Cache := ∅;
for l := 0 to maxLevel do begin
for m := l to l + preAddMaxLevel do begin
enum_dterm_mgt_pairs(m, d, g);
throw proof_found(d)
end;
News := {hl, d, f i | enum_dterm_mgt_pairs(l, d, f )};
if News = ∅ then throw exhausted;
        </p>
        <p>Cache := merge_news_into_cache(News, Cache)
end</p>
        <p>The procedure merge_news_into_cache(News, Cache) merges the newly generated
hLevel, D-term, Formulai
triples News with the cache Cache, optionally taking into account configured heuristics. For
example, sorting the union of the triples according to increasing term height of the lemma
formulas and truncating the sorted list after, say, 1,000 entries.</p>
        <p>If maxLevel is configured as 0, the method proceeds purely in goal-driven mode with the
inner loop performing iterative deepening upon the level m from 0 up to preAddMaxLevel .
The similarity to CM-CT provers can even be shown empirically by comparing the sets of solved
TPTP problems (see Sect. 3.5.3). If combined with axiom-driven lemma generation, generally
successful configurations of preAddMaxLevel typically have rather low values, about 0–3.</p>
        <p>Of course, also variations of the basic schema of Fig. 1 can be useful. For example to enumerate
alternate proofs instead of exiting with the first found proof through the throw statement.</p>
        <p>
          For lemma generation [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], SGCD can be run with some resource bound, e.g. a timeout or
a Prolog inference limit, or until axiom-driven enumeration exhausts by heuristic limitations.
After SGCD terminates, the content of the cache represents a set of generated lemmas. Lemmas
deleted at merge_news_into_cache may be maintained in a separate “passive” store which
can be used for example in lemma generation (Sect. 3.5.2) and theorem finding (Sect. 3.5.4).
        </p>
        <p>
          “Hybrid” configurations are possible, where different level characterizations are used for the
goal- and axiom-driven phases. The cache has initially not to be empty: It is possible to initialize
it with externally generated lemmas (possibly by another invocation of SGCD) and modify the
initial settings of the l and m parameters. We call this lemma injection. Search at lower levels is
then completely replaced by retrieval from these external lemmas. SGCD with lemma injection
was used successfully for lemmas selected with the help of machine learning [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
3.4. Implementation Aspects and the CD Tools Environment
SGCD is integrated in CD Tools, an environment to experiment with CD, which is embedded
in SWI-Prolog8 and utilizes PIE [
          <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
          ] as basic support for first-order ATP. For experimenting
8No efforts were made to support other Prolog systems. Brief tries with free systems reckoned as particularly fast,
notably YAP and Eclipse, unfortunately gave the impression that these are broken in their recent versions. SWI-Prolog
provides some library predicates that proved very useful, for example variant_sha1/3 or term_factorized/3.
with machine learning [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] SGCD was invoked via Bash scripts, also in portfolio settings where
different configurations are run in parallel.
        </p>
        <p>
          CD Tools provides various interfaces to external worlds. It supports, for example, conversion
to and from Łukasiewicz’s Polish notation. There are means to enumerate and test TPTP
problems whether they represent a CD problem and to canonicalize the vocabulary of such CD
problems. A pre-defined association of about 170 common names for about 120 well-known
formulas and combinators, e.g., from [
          <xref ref-type="bibr" rid="ref32 ref40">40, 32</xref>
          ], helps to specify these as axioms and to detect
them among computed lemmas. Also some support for handling combinators and λ-terms is
implemented, which is used, e.g., for the second included prover CCS [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ].
        </p>
        <p>
          CD Tools includes implementations of many concepts and operations introduced or discussed
in [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], including D-term comparison by ≥c, various notions of regularity, variations of the
organic property (based on calling a SAT solver9 or based on previously proven lemmas), the
prime property, and n-simplification, that is, replacing complex subproofs of minor premises
whose goal is irrelevant for the conclusion with a primitive subproof.
        </p>
        <p>Many of these operations are available to SGCD for heuristic restrictions that can be optionally
configured to be applied either immediately to the results of axiom-driven invocations of
enum_dterm_mgt_pairs or when merging new results into the cache.</p>
        <p>Attempting to introduce at least a little bit of goal-direction in the axiom-driven mode, an
experimental heuristic restriction takes the number of distinct subterms in a formula that do
not appear in the goal as a negative criterion.</p>
        <p>
          For handling large (millions of tree nodes) proof terms, CD Tools provides an implementation
of MGT computation and verification that transparently operates on a factored representation.
3.5. Experimental Results
SGCD was evaluated in various experiments, most of them on the 196 pure CD problems in
TPTP 8.0.0 (there are 10 further CD problems in the TPTP with status satisfiable or a more
complex problem form). We call this problem corpus TPTPCD. The report [
          <xref ref-type="bibr" rid="ref38">38</xref>
          ] describes several
experiments with SGCD on that corpus. Additional experiments with machine learning for ATP
and proving a challenge problem for ATP, the “Meredith single axiom” LCL073-1, are described
in [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. Experiments with CCS, the second prover included in CD Tools, are reported in [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ].
Comprehensive result tables and log files can be found at http://cs.christophwernhard.com/
cdtools/. Here we give brief summaries of the experiments with SGCD.
3.5.1. SGCD in Generally Successful Configurations
In generic configurations SGCD solves 176 of the 196 TPTPCD problems. This may be compared
with Prover9, which also creates CD proofs, like SGCD but differently from most other general
first-order provers. Prover9 solves 168 of the problems. SGCD fails on five problems that
Prover9 can solve. In the generally successful configurations SGCD solves no “new” problems,
i.e., problems rated in the TPTP with 1.00, but 93% of the problems that are solvable at all by a
first-order prover and 95% of the problems that can be solved by E [
          <xref ref-type="bibr" rid="ref51">51</xref>
          ]. See [38, Sect. 4.1].
9Via PIE, which provides a call interface to SAT and QBF solvers by translating between PIE’s Prolog term
representation of formulas and DIMACS or QDIMACS files. In experiments we used MiniSat [
          <xref ref-type="bibr" rid="ref50">50</xref>
          ].
        </p>
        <p>
          These 176 solutions are obtained as the joint results of SGCD in four configurations whose key
settings are as follows. Configuration 1: level characterization by tree size; maximally 1000 cache
entries; goal-driven phases upon 2 levels (preAddMaxLevel = 1); 165 solutions. Configuration 2:
similar to configuration 1 but height as level characterization; 109 solutions. Configuration 3:
similar to configuration 1 but maximally 3000 cache entries; 161 solutions. Configuration 4:
similar to configuration 1 but goal-driven phases just upon 1 level (preAddMaxLevel = 0) and
strong limitations of the formula size of the cached lemmas just slightly above the maximal
sizes found in the input problems; 80 solutions.
3.5.2. SGCD in Specialized Settings with Lemma Injection
In specialized settings with lemma injection, SGCD solves more difficult problems, including the
1.00-rated LCL073-1 [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. With or without lemma injection it outperforms Vampire [
          <xref ref-type="bibr" rid="ref52">52</xref>
          ] (without
lemma injection 285 vs. 263 solutions for the 312 problems considered in [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]; with lemma
injection 289; Vampire, when given lemmas obtained from SGCD is boosted from 263 to 285
solutions). For an overall comparison of SGCD with and without lemma injection with various
provers, invoked with and without lemmas provided by SGCD, see http://cs.christophwernhard.
com/cdtools/exp-lemmas/lemmas.html. An excerpt of this table is reproduced in [53, App. E].
3.5.3. SGCD in Purely Goal-Driven Configurations
In purely goal-driven configurations SGCD is indeed very similar to the CM-CT provers and
can solve roughly the same TPTPCD problems. In two purely goal-driven configurations SGCD
solves 89 of the 196 TPTPCD problems, compared to 92 solvable by various CM-CT provers
according to previously reported experiments. With 86 problems, the overlap of problems
that are solved by both SGCD and CM-CT provers is rather large. See [38, Sect. 4.2] and
http://cs.christophwernhard.com/cdtools/exp-tptpcd-2022-07/table_3.html. The 92 solutions
by CM-CT provers could be stretched to a superset of 96 solutions, obtained by CMProver in
nine configurations in an experiment with a longer timeout of 1 hour per problem and prover
configuration. Also the well-known leanCoP 2.1 [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] was tried. With a single exception, the
196 TPTPCD problems are provided in the TPTP CNF format, which is not accepted by leanCoP,
such that we had to take conversions to TPTP FOF format (with axioms and goal conjecture
distinguished). Experiments with a 1 hour timeout per problem led to 72 or 51 solutions,
depending on whether the major or minor subgoal appears first in axiom Det in the FOF input.
3.5.4. SGCD for Theorem Finding
In purely axiom-driven configurations SGCD can perform theorem finding. It finds proofs
of Łukasiewicz’s 68 Theses in a single run in 2.5 minutes. These theses from a textbook by
Łukasiewicz were used extensively with OTTER [
          <xref ref-type="bibr" rid="ref54">54</xref>
          ] by Larry Wos [
          <xref ref-type="bibr" rid="ref25 ref26 ref27 ref28">25, 26, 27, 28</xref>
          ]. They could
be solved by OTTER in a single run after the introduction of weight templates [
          <xref ref-type="bibr" rid="ref25 ref27">25, 27</xref>
          ]. In [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]
the focus was finding proofs with small compacted size for sets of goals, subsets of the theses
that represent axiom systems. With respect to compacted size, the proofs obtained in our brief
experiments cannot compete with the carefully developed proofs from [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. However, the tree
size of our proofs is for all considered axiom systems smaller. See [38, Sect. 4.3].
3.5.5. SGCD and Proof Size
Prover9 computes for CD problems CD proofs that can be represented by D-terms. Hence we can
directly compare Prover9’s proofs with the D-terms output by SGCD. In general, SGCD finds
smaller proofs than Prover9 in its default mode, moderately smaller with respect to compacted
size and height, and drastically smaller with respect to tree size. See [38, Sect. 4.4].
        </p>
        <p>
          CCS can be applied in a configuration that returns proofs with ascertained minimal compacted
size by exhaustive search [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ]. This means that the method then guarantees that the problem
has no proof whose compacted size is strictly smaller than that of the returned proof. This
succeeds for about 44% problems of the TPTPCD corpus. SGCD with enumeration by PSP-level
is particularly useful to find proofs with small compacted size in cases where the exhaustive
search for ascertained minimal compacted size appears to be unfeasible. See [38, Sect. 4.5].
3.5.6. SGCD with Enumeration by PSP-Level for Specific Problems and for Lemma
        </p>
        <p>
          Generation
With enumeration by PSP-level, SGCD finds a short proof of LCL038-1, the most difficult
of the three subproblems of proving completeness of Łukasiewicz’s shortest single axiom for
implicational logic. Short extensions of this proof that solve the other two easier subproblems
were then performed with naive structure enumeration predicates provided by CD Tools.
SGCD’s proof of LCL038-1 is drastically shorter than proofs by other ATP systems and even
substantially shorter than the proofs by Łukasiewicz and Meredith. The combined proof for all
three subproblems is a few steps shorter than the proofs by these logicians. See [38, Sect. 4.6]
and also [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
        <p>
          For proving LCL073-1, SGCD was first invoked purely axiom-driven by PSP-level to generate
about 100k lemmas. These were ranked by general heuristic features, mostly size properties
of the lemma formula, smaller preferred. Then SGCD was invoked a second time, now for
proving, with alternating goal- and axiom-driven phases, initialized by lemma injection with
the best-ranked 3k of the previously genrerated lemmas. The configuration of the phases was
there “hybrid”: while the axiom-driven phases were again by PSP-level, the goal-driven phases
were by height as level characterization. For more details, see [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
        <p>
          Enumeration by PSP-level for lemma generation in general led to good results in the machine
learning scenario [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. It appears to supply to other provers useful lemmas that would not be
found by their own calculi.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Some Issues Requiring Further Research</title>
      <p>
        The approach of structure-generating theorem proving in its current state of development raises
questions that require further investigations. We outline some of these, organized into two
major topics.
4.1. Stronger Proof Compressions
SGCD so far focuses on unit/subtree lemmas, which correspond to the sharing of subtrees in
DAGs. There are stronger proof compressions that correspond to shared trees with “holes”.
Technically this can be expressed with binary resolution that results in Horn clauses (the
body atoms correspond to the “holes”), with tree grammar compressions where nonterminals
may contain variables (the “holes”), or with combinators, where the “holes” are mapped to
subtree sharing in DAGs (by getting rid of the variables through combinators) [
        <xref ref-type="bibr" rid="ref49">49</xref>
        ]. The
combinator representation is particularly suitable for proof structure enumeration, interwoven
with unification, initialized with a goal, as in SGCD when operated goal-driven. This is realized
with CCS [
        <xref ref-type="bibr" rid="ref49">49</xref>
        ], our second experimental prover in CD Tools. This prover may be viewed as
implementing the connection structure calculus [
        <xref ref-type="bibr" rid="ref55">55</xref>
        ], restricted to CD problems. Depending on
the permitted combinators or proof term patterns involving combinators, it can simulate other
calculi, in particular goal-driven versions of binary resolution.
      </p>
      <p>
        The most immediate level characterization for enumeration with the combinator
representation is compacted size. We then obtain in essence DAG enumeration, with iterative deepening
upon the DAG size. Allowing combinators to enter the D-terms is only justified if they permit
proofs with smaller compacted size. That is, if the proving D-term with combinators has smaller
compacted size than those without combinators. Systematic enumeration of proof DAGs is
not as simple as tree enumeration because it requires to maintain subtrees and the lemmas
proven by them which already appear in the proof under construction.10 Rigid variables are not
sufficient, some copying or forgetting [
        <xref ref-type="bibr" rid="ref55">55</xref>
        ] of variables is required.
      </p>
      <p>So far, CCS has been tried only in goal-driven mode. It is not clear, how enumeration by
compacted size transfers to SGCD with its axiom-driven mode. The level characterizations
for SGCD are determined for a given proof structure “globally”, independently from context:
tree size, height, PSP-level. During enumeration by compacted size, a given structure has to be
assessed differently, context depending: if it already occurred in the partially built-up structure,
its cost value is zero – it can be attached again without increasing the overall compacted size. Is
the PSP-level a version of compacted size that is limited in a certain sense such that it permits
global value assessment? If so, is it a distinguished best version within that limitation?</p>
      <p>CCS performs roughly comparable with CM-CT provers. If configured without combinators, it
yields as a bonus proofs with guaranteed minimal compacted size. While stronger compressions
can in principle achieve drastic savings, it is not clear what role this may play in practice.
4.2. Systematization of Level Characterizations
It seems desirable to formally specify and systematize criteria and possibilities of level
characterizations for SGCD. Levels may be disjoint (e.g., tree size, height, compacted size, PSP-level)
or cumulative (e.g., tree size less or equal than the level index). There is an interplay with the
subterm relationship among proof terms. There may be gaps with respect to given axioms: some
level may have no member with MGT, but the next level has members with MGT. A method
that incrementally generates levels then may exhaust too early. The level characterization may
be incomplete, as observed for the PSP-level in Sect. 3.2. As noted in Sect. 4.1, compacted size is
in a sense context dependent and seems to bear a certain relationship to PSP-level enumeration.</p>
      <p>
        How can different level characterizations be combined? Abstractly into a single new level
characterization? The experiment with LCL073-1 shows that “hybrid” configurations with
10As outlined in [
        <xref ref-type="bibr" rid="ref49">49</xref>
        ], the DAG enumeration by CCS is with a variation of the value-number method [
        <xref ref-type="bibr" rid="ref56 ref57">56, 57</xref>
        ].
different level characterizations for the goal- and axiom-driven phases are important. A trivial
pragmatic approach to combine different level characterizations is invoking prover instances for
each of them in portfolio mode. Also “heuristic” limitations like, e.g., maximal term height of
lemma formulas, which are so far handled in SGCD orthogonally to the level characterizations,
might be incorporated into level characterizations.
      </p>
      <p>
        In semi-naive evaluation (e.g., [
        <xref ref-type="bibr" rid="ref58">58</xref>
        ]) certain recursive invocations of enumeration predicates
are replaced with accesses to “delta” predicates that represent results newly obtained in the
immediately preceding round. This realizes a forward-reasoning trigger effect: a subgoal with a
delta predicate is evaluated first; only for its results, that is, the new results from the preceding
round, the remaining subgoals are considered. A related effect can be observed in some cases
for our proof/formula pair enumeration predicate, where recursive invocations are for the
given level minus one. But so far, such effects and their consequences for improving SGCD’s
enumeration algorithms have not been systematically studied.
      </p>
      <p>The grouping into levels suggests a “bulk view” on first-order ATP, where instead of proving a
single theorem from axioms the objective is to derive from the axioms all lemmas in some given
level. This perspective hides a key difficulty in goal-driven proceeding to prove a single theorem,
which precludes decomposing a problem – an open subgoal – into subproblems – further
subgoals – that are independent from each other: The subgoals obtained in the decomposition,
e.g., for the major and minor premise of a CD problem, may share variables. The bulk view
may explain the success of axiom-driven proceeding and its adequacy for theorem finding
(Sect. 3.5.4). It also may facilitate interfacing with techniques for machine learning that are
based on problem decomposition into independent subproblems.</p>
      <p>
        The view of enumeration by level suggests to try in ATP research also an “empirical” style of
pattern-observing, inspired by physics, as exemplified for combinatory logic in [
        <xref ref-type="bibr" rid="ref59">59</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion</title>
      <p>We have generalized the proof structure enumeration interwoven with unification as performed
by goal-driven CM-CT provers to an interplay of goal- and axiom-driven processing. It makes
heuristic restrictions known from saturating provers applicable. The approach was evaluated
with SGCD, an experimental system.</p>
      <p>Proof structure terms, proof objects that allow to specify and implement various ways of
building proofs, were a central tool. Meredith’s CD represents the prototypical base case of
such proof structure terms. Hence, so far we focused on the CD problems, a subclass of the
first-order Horn problems with a binary function symbol, and cyclic predicate dependency.</p>
      <p>From the view of first-order ATP, CD provides a simplified setting that inspired and facilitated
our work. We expect that our main techniques and results can be transferred to first-order
ATP in general. Incorporation of dedicated equality handling should through the axiom-driven
operation be possible without resorting to lazy variants of paramodulation.</p>
      <p>
        Another view on generalization is to explore the potential scope of CD. Generalizing CD to
arbitrary first-order Horn problems is straightforward and has been realized [
        <xref ref-type="bibr" rid="ref49">49</xref>
        ] in a system
related to SGCD. Typical applications of CD are investigating axiomatizations of propositional
logics whose formulas are expressed as first-order terms. Proving problems are then first-order
problems with a single predicate. If the axiomatized logic is classical propositional logic, this
covers theorem proving in that logic, although with an incomparably weaker performance
than that of a SAT solver. Nevertheless, with a generalization of CD to take quantification into
account, this approach is used very successfully by Metamath to formalize mathematics [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
The mismatch between this CD approach and current ATP systems (a generalization of the
mentioned ways to perform propositional reasoning via CD vs. a SAT solver) has recently been
observed and addressed in different ways [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. Dedicated CD-based reasoning, as represented
by our SGCD system, may be a further way to combine Metamath – with its inherent generality
– with ATP. Theoretical results concerning the extension of CD to first-order logic in general
can be found in [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The CM and the connection structure calculus offer proof representations
for first-order logic in general, which possibly may be used like the D-terms of CD [
        <xref ref-type="bibr" rid="ref15 ref33 ref49">33, 15, 49</xref>
        ].
      </p>
      <p>
        Concerning related proving techniques, similarly to SGCD an implementation [
        <xref ref-type="bibr" rid="ref60">60</xref>
        ] of
hypertableaux [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] creates lemmas axiom-driven level-by-level. This approach is, however, much
less flexibly configurable and it incorporates goal-driven phases only in a rudimentary way, by
termination before a level is completed as soon as a clause subsuming the goal is generated. A
recent enhancement of E by machine learning supplements a classification based on derivation
history as clause selection guidance [
        <xref ref-type="bibr" rid="ref61">61</xref>
        ]. Since in structure-generating proving the D-term
itself, or, so-to-speak, the derivation history, provides the main guidance, this enhancement
might possibly be viewed as introducing a little bit of structure-generating proving into E.
      </p>
      <p>Experiments show that on CD problems the structure-generating approach keeps up with
state-of-the-art first-order provers and results in comparatively short proofs, which, moreover,
are gap-free and in a simple calculus. In powerful configurations, state-of-the-art provers such
as E and Vampire do not emit gap-free proofs. Further particular strengths of the approach show
up with two problems that were extensively investigated in ATP. For the first, never solved by a
CM-CT prover, SGCD quickly finds a proof that is shorter than all known ones. For the second,
which escapes all state-of-the-art provers, SGCD, in a specific configuration, is able to find a
proof. In both cases the key is a novel proof structure enumeration strategy.</p>
      <p>
        As demonstrated in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the structure-generating approach is also useful in machine learning
for ATP. It supports a data flow centered around proof structure terms, D-terms. Features of
formulas can be complemented by features of proof structures. In purely axiom-driven mode
SGCD is well suited to generate a large number of proven lemmas from which a small subset is
then selected on the basis of learning and passed as auxiliary input to arbitrary provers.
      </p>
      <p>Open issues that require further research include the incorporation of stronger proof
compressions and a systematic investigation of the possibilities to group proof structures into ordered
“levels” governing the enumeration order of SGCD.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Acknowledgments</title>
      <p>The author thanks anonymous reviewers for helpful suggestions to improve the presentation.
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) –
ProjectID 457292495. The work was supported by the North-German Supercomputing Alliance (HLRN).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Hähnle</surname>
          </string-name>
          ,
          <article-title>Tableaux and related methods</article-title>
          , in: A.
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Voronkov (Eds.),
          <source>Handb. of Autom. Reasoning</source>
          , volume
          <volume>1</volume>
          ,
          <string-name>
            <surname>Elsevier</surname>
          </string-name>
          ,
          <year>2001</year>
          , pp.
          <fpage>101</fpage>
          -
          <lpage>178</lpage>
          . doi:
          <volume>10</volume>
          .1016/ b978-044450813-3/
          <fpage>50005</fpage>
          -9.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Letz</surname>
          </string-name>
          , Tableau and
          <string-name>
            <given-names>Connection</given-names>
            <surname>Calculi</surname>
          </string-name>
          . Structure, Complexity, Implementation, Habilitationsschrift, TU München,
          <year>1999</year>
          . Available from http://www2.tcs.ifi.lmu.de/~letz/habil.ps, accessed Jul 19,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Stickel</surname>
          </string-name>
          ,
          <article-title>A Prolog technology theorem prover: implementation by an extended Prolog compiler</article-title>
          ,
          <source>J. Autom. Reasoning</source>
          <volume>4</volume>
          (
          <year>1988</year>
          )
          <fpage>353</fpage>
          -
          <lpage>380</lpage>
          . doi:
          <volume>10</volume>
          .1007/BF00297245.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Letz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bayerl</surname>
          </string-name>
          , W. Bibel,
          <article-title>SETHEO: A high-performance theorem prover</article-title>
          ,
          <source>J. Autom. Reasoning</source>
          <volume>8</volume>
          (
          <year>1992</year>
          )
          <fpage>183</fpage>
          -
          <lpage>212</lpage>
          . doi:
          <volume>10</volume>
          .1007/BF00244282.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          , W. Bibel,
          <article-title>leanCoP: lean connection-based theorem proving</article-title>
          ,
          <source>J. Symb. Comput</source>
          .
          <volume>36</volume>
          (
          <year>2003</year>
          )
          <fpage>139</fpage>
          -
          <lpage>161</lpage>
          . doi:
          <volume>10</volume>
          .1016/S0747-
          <volume>7171</volume>
          (
          <issue>03</issue>
          )
          <fpage>00037</fpage>
          -
          <lpage>3</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          ,
          <article-title>Restricting backtracking in connection calculi</article-title>
          ,
          <source>AI</source>
          Communications
          <volume>23</volume>
          (
          <year>2010</year>
          )
          <fpage>159</fpage>
          -
          <lpage>182</lpage>
          . doi:
          <volume>10</volume>
          .3233/AIC-2010-0464.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>I.</given-names>
            <surname>Dahn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wernhard</surname>
          </string-name>
          ,
          <article-title>First order proof problems extracted from an article in the Mizar mathematical library</article-title>
          , in: M. P. Bonacina, U. Furbach (Eds.), FTP'97,
          <string-name>
            <surname>RISC-Linz Report</surname>
          </string-name>
          Series No.
          <volume>97</volume>
          -
          <fpage>50</fpage>
          , Joh. Kepler Univ.,
          <string-name>
            <surname>Linz</surname>
          </string-name>
          ,
          <year>1997</year>
          , pp.
          <fpage>58</fpage>
          -
          <lpage>62</lpage>
          . URL: https://www.logic.at/ftp97/ papers/dahn.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wernhard</surname>
          </string-name>
          ,
          <article-title>The PIE system for proving, interpolating and eliminating</article-title>
          , in: P. Fontaine,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulz</surname>
          </string-name>
          , J. Urban (Eds.),
          <source>PAAR</source>
          <year>2016</year>
          , volume
          <volume>1635</volume>
          <source>of CEUR Workshop Proc., CEUR-WS.org</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>125</fpage>
          -
          <lpage>138</lpage>
          . URL: http://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>1635</volume>
          /paper-11.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wernhard</surname>
          </string-name>
          ,
          <article-title>Facets of the PIE environment for proving, interpolating and eliminating on the basis of first-order logic</article-title>
          , in: P.
          <string-name>
            <surname>Hofstedt</surname>
          </string-name>
          , et al. (Eds.),
          <source>DECLARE</source>
          <year>2019</year>
          , volume
          <volume>12057</volume>
          <source>of LNCS (LNAI)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>160</fpage>
          -
          <lpage>177</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>030</fpage>
          -46714-2_
          <fpage>11</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D. W.</given-names>
            <surname>Loveland</surname>
          </string-name>
          ,
          <source>Automated Theorem Proving: A Logical Basis</source>
          , North-Holland,
          <year>1978</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Prawitz</surname>
          </string-name>
          ,
          <article-title>An improved proof procedure</article-title>
          ,
          <source>Theoria</source>
          <volume>26</volume>
          (
          <year>1960</year>
          )
          <fpage>102</fpage>
          -
          <lpage>139</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Prawitz</surname>
          </string-name>
          ,
          <article-title>Advances and problems in mechanical proof procedures</article-title>
          ,
          <source>Machine Intelligence</source>
          <volume>4</volume>
          (
          <year>1969</year>
          )
          <fpage>59</fpage>
          -
          <lpage>71</lpage>
          .
          <article-title>Reprinted with author preface in J</article-title>
          . Siekmann, G. Wright (eds.):
          <source>Automation of Reasoning</source>
          , vol
          <volume>2</volume>
          :
          <source>Classical Papers on Computational Logic</source>
          <year>1967</year>
          -1970, Springer,
          <year>1983</year>
          , pp.
          <fpage>283</fpage>
          -
          <lpage>297</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Goller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Letz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mayr</surname>
          </string-name>
          , J. Schumann,
          <string-name>
            <surname>SETHEO</surname>
          </string-name>
          <year>V3</year>
          .
          <article-title>2: recent developments - system abstract</article-title>
          , in: A.
          <string-name>
            <surname>Bundy</surname>
          </string-name>
          (Ed.),
          <source>CADE-12</source>
          , volume
          <volume>814</volume>
          <source>of LNCS (LNAI)</source>
          ,
          <year>1994</year>
          , pp.
          <fpage>778</fpage>
          -
          <lpage>782</lpage>
          . doi:
          <volume>10</volume>
          .1007/3-540-58156-1_
          <fpage>59</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Prior</surname>
          </string-name>
          , Logicians at play; or Syll, Simp and Hilbert,
          <source>Australasian Journal of Philosophy</source>
          <volume>34</volume>
          (
          <year>1956</year>
          )
          <fpage>182</fpage>
          -
          <lpage>192</lpage>
          . doi:
          <volume>10</volume>
          .1080/00048405685200181.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wernhard</surname>
          </string-name>
          , W. Bibel,
          <article-title>Investigations into proof structures</article-title>
          ,
          <source>CoRR abs/2304</source>
          .12827 (
          <year>2023</year>
          ). doi:
          <volume>10</volume>
          .48550/arXiv.2304.12827, submitted.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Meredith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Prior</surname>
          </string-name>
          ,
          <article-title>Notes on the axiomatics of the propositional calculus</article-title>
          .,
          <source>Notre Dame J. of Formal Logic</source>
          <volume>4</volume>
          (
          <year>1963</year>
          )
          <fpage>171</fpage>
          -
          <lpage>187</lpage>
          . doi:
          <volume>10</volume>
          .1305/ndjfl/1093957574.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rawson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wernhard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zombori</surname>
          </string-name>
          , W. Bibel, Lemmas: Generation, selection, application, in: R.
          <string-name>
            <surname>Ramanayake</surname>
          </string-name>
          , J. Urban (Eds.),
          <source>TABLEAUX</source>
          <year>2023</year>
          , volume
          <volume>14278</volume>
          <source>of LNCS (LNAI)</source>
          , Springer,
          <year>2023</year>
          , pp.
          <fpage>153</fpage>
          -
          <lpage>174</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>031</fpage>
          -43513-
          <issue>3</issue>
          _
          <fpage>9</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rezuş</surname>
          </string-name>
          ,
          <source>Witness Theory - Notes on λ-calculus and Logic</source>
          , volume
          <volume>84</volume>
          of Studies in Logic, College Publications,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rezuş</surname>
          </string-name>
          ,
          <article-title>The Curry-Howard Correspondence revisited (2019b)</article-title>
          , in: [18],
          <year>2020</year>
          , pp.
          <fpage>209</fpage>
          -
          <lpage>214</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Hindley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Meredith</surname>
          </string-name>
          ,
          <article-title>Principal type-schemes and condensed detachment</article-title>
          ,
          <source>J. Symbolic Logic</source>
          <volume>55</volume>
          (
          <year>1990</year>
          )
          <fpage>90</fpage>
          -
          <lpage>105</lpage>
          . doi:
          <volume>10</volume>
          .2307/2274956.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Hindley</surname>
          </string-name>
          , Basic Simple Type Theory, Cambridge University Press,
          <year>1997</year>
          . doi:
          <volume>10</volume>
          .1017/ CBO9780511608865.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>N. D.</given-names>
            <surname>Megill</surname>
          </string-name>
          ,
          <article-title>A finitely axiomatized formalization of predicate calculus with equality</article-title>
          ,
          <source>Notre Dame J. of Formal Logic</source>
          <volume>36</volume>
          (
          <year>1995</year>
          )
          <fpage>435</fpage>
          -
          <lpage>453</lpage>
          . doi:
          <volume>10</volume>
          .1305/ndjfl/1040149359.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Carneiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. E.</given-names>
            <surname>Brown</surname>
          </string-name>
          , J. Urban,
          <source>Automated theorem proving for Metamath</source>
          , in: A.
          <string-name>
            <surname>Naumowicz</surname>
          </string-name>
          , R. Thiemann (Eds.),
          <source>ITP</source>
          <year>2023</year>
          , volume
          <volume>268</volume>
          of LIPIcs,
          <source>Schloss Dagstuhl - Leibniz-Zentrum für Informatik</source>
          ,
          <year>2023</year>
          , pp.
          <volume>9</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          :
          <fpage>19</fpage>
          . doi:
          <volume>10</volume>
          .4230/LIPIcs.ITP.
          <year>2023</year>
          .
          <volume>9</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Winker</surname>
          </string-name>
          ,
          <string-name>
            <surname>W. McCune</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Overbeek</surname>
          </string-name>
          , E. Lusk,
          <string-name>
            <given-names>R.</given-names>
            <surname>Stevens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Butler</surname>
          </string-name>
          ,
          <article-title>Automated reasoning contributes to mathematics and logic</article-title>
          , in: M. E. Stickel (Ed.),
          <source>CADE-10</source>
          , Springer,
          <year>1990</year>
          , pp.
          <fpage>485</fpage>
          -
          <lpage>499</lpage>
          . doi:
          <volume>10</volume>
          .1007/3-540-52885-7_
          <fpage>109</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wos</surname>
          </string-name>
          ,
          <article-title>Automated reasoning and Bledsoe's dream for the field</article-title>
          , in: R. S. Boyer (Ed.),
          <source>Automated Reasoning: Essays in Honor of Woody Bledsoe, Automated Reasoning Series</source>
          , Kluwer Academic Publishers,
          <year>1991</year>
          , pp.
          <fpage>297</fpage>
          -
          <lpage>345</lpage>
          . doi:
          <volume>10</volume>
          .1007/
          <fpage>978</fpage>
          -94-011-3488-0_
          <fpage>15</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>W.</given-names>
            <surname>McCune</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wos</surname>
          </string-name>
          ,
          <article-title>Experiments in automated deduction with condensed detachment</article-title>
          , in: D.
          <string-name>
            <surname>Kapur</surname>
          </string-name>
          (Ed.),
          <source>CADE-11</source>
          , volume
          <volume>607</volume>
          <source>of LNCS (LNAI)</source>
          , Springer,
          <year>1992</year>
          , pp.
          <fpage>209</fpage>
          -
          <lpage>223</lpage>
          . doi:
          <volume>10</volume>
          .1007/3-540-55602-8_
          <fpage>167</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wos</surname>
          </string-name>
          ,
          <article-title>The resonance strategy</article-title>
          ,
          <source>Computers Math. Applic</source>
          .
          <volume>29</volume>
          (
          <year>1995</year>
          )
          <fpage>133</fpage>
          -
          <lpage>178</lpage>
          . doi:
          <volume>10</volume>
          . 1016/
          <fpage>0898</fpage>
          -
          <lpage>1221</lpage>
          (
          <issue>94</issue>
          )
          <fpage>00220</fpage>
          -
          <lpage>F</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wos</surname>
          </string-name>
          ,
          <article-title>The power of combining resonance with heat</article-title>
          ,
          <source>J. Autom. Reasoning</source>
          <volume>17</volume>
          (
          <year>1996</year>
          )
          <fpage>23</fpage>
          -
          <lpage>81</lpage>
          . doi:
          <volume>10</volume>
          .1007/BF00247668.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>R.</given-names>
            <surname>Veroff</surname>
          </string-name>
          ,
          <article-title>Finding shortest proofs: An application of linked inference rules</article-title>
          ,
          <source>J. Autom. Reasoning</source>
          <volume>27</volume>
          (
          <year>2001</year>
          )
          <fpage>123</fpage>
          -
          <lpage>139</lpage>
          . doi:
          <volume>10</volume>
          .1023/A:
          <fpage>1010635625063</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>B.</given-names>
            <surname>Fitelson</surname>
          </string-name>
          , L. Wos, Missing proofs found,
          <source>J. Autom. Reasoning</source>
          <volume>27</volume>
          (
          <year>2001</year>
          )
          <fpage>201</fpage>
          -
          <lpage>225</lpage>
          . doi:
          <volume>10</volume>
          .1023/A:
          <fpage>1010695827789</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wos</surname>
          </string-name>
          ,
          <article-title>Conquering the Meredith single axiom</article-title>
          ,
          <source>J. Autom. Reasoning</source>
          <volume>27</volume>
          (
          <year>2001</year>
          )
          <fpage>175</fpage>
          -
          <lpage>199</lpage>
          . doi:
          <volume>10</volume>
          .1023/A:
          <fpage>1010691726881</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>D.</given-names>
            <surname>Ulrich</surname>
          </string-name>
          ,
          <article-title>A legacy recalled and a tradition continued</article-title>
          ,
          <source>J. Autom. Reasoning</source>
          <volume>27</volume>
          (
          <year>2001</year>
          )
          <fpage>97</fpage>
          -
          <lpage>122</lpage>
          . doi:
          <volume>10</volume>
          .1023/A:
          <fpage>1010683508225</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wernhard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Bibel</surname>
          </string-name>
          ,
          <article-title>Learning from Łukasiewicz and Meredith: Investigations into proof structures</article-title>
          , in: A.
          <string-name>
            <surname>Platzer</surname>
          </string-name>
          , G. Sutcliffe (Eds.),
          <source>CADE 28</source>
          , volume
          <volume>12699</volume>
          <source>of LNCS (LNAI)</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>58</fpage>
          -
          <lpage>75</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>030</fpage>
          -79876-
          <issue>5</issue>
          _
          <fpage>4</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>OEIS</given-names>
            <surname>Foundation</surname>
          </string-name>
          <article-title>Inc</article-title>
          .,
          <source>The On-Line Encyclopedia of Integer Sequences</source>
          ,
          <year>2023</year>
          . Published electronically at http://oeis.org.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <surname>J. M. P. Schumann</surname>
          </string-name>
          , DELTA
          <article-title>- A bottom-up preprocessor for top-down theorem provers</article-title>
          ,
          <source>in: CADE-12</source>
          , volume
          <volume>814</volume>
          <source>of LNCS (LNAI)</source>
          , Springer,
          <year>1994</year>
          , pp.
          <fpage>774</fpage>
          -
          <lpage>777</lpage>
          . doi:
          <volume>10</volume>
          .1007/ 3-540-58156-1_
          <fpage>58</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>P.</given-names>
            <surname>Baumgartner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Furbach</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Niemelä</surname>
          </string-name>
          ,
          <article-title>Hyper tableaux</article-title>
          , in: J. J.
          <string-name>
            <surname>Alferes</surname>
            ,
            <given-names>L. M.</given-names>
          </string-name>
          <string-name>
            <surname>Pereira</surname>
          </string-name>
          , E. Orlowska (Eds.),
          <source>JELIA'96</source>
          , volume
          <volume>1126</volume>
          <source>of LNCS (LNAI)</source>
          , Springer,
          <year>1996</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          . doi:
          <volume>10</volume>
          .1007/3-540-61630-
          <issue>6</issue>
          _
          <fpage>1</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>W.</given-names>
            <surname>Bibel</surname>
          </string-name>
          ,
          <article-title>Comparison of proof methods</article-title>
          , in: J.
          <string-name>
            <surname>Otten</surname>
          </string-name>
          , W. Bibel (Eds.),
          <source>AReCCa</source>
          <year>2023</year>
          , CEUR Workshop Proc.,
          <source>CEUR-WS.org</source>
          ,
          <year>2023</year>
          . This volume.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wernhard</surname>
          </string-name>
          , CD Tools -
          <article-title>Condensed detachment and structure generating theorem proving (system description)</article-title>
          ,
          <source>CoRR abs/2207</source>
          .08453 (
          <year>2023</year>
          ). doi:
          <volume>10</volume>
          .48550/arXiv.2207. 08453.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Lemmon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Meredith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Meredith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Prior</surname>
          </string-name>
          , I. Thomas, Calculi of pure strict implication, in: J. W. Davis,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Hockney</surname>
          </string-name>
          , W. K. Wilson (Eds.),
          <source>Philosophical Logic</source>
          , Springer Netherlands,
          <year>1969</year>
          , pp.
          <fpage>215</fpage>
          -
          <lpage>250</lpage>
          . doi:
          <volume>10</volume>
          .1007/
          <fpage>978</fpage>
          -94-010-9614-0_
          <fpage>17</fpage>
          ,
          <source>reprint of a technical report</source>
          , Canterbury University College, Christchurch,
          <year>1957</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Prior</surname>
          </string-name>
          , Formal Logic, 2nd ed., Clarendon Press, Oxford,
          <year>1962</year>
          . doi:
          <volume>10</volume>
          .1093/acprof: oso/9780198241560.001.0001.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>D.</given-names>
            <surname>Meredith</surname>
          </string-name>
          , In memoriam: Carew Arthur Meredith (
          <year>1904</year>
          -
          <fpage>1976</fpage>
          ).,
          <source>Notre Dame J. of Formal Logic</source>
          <volume>18</volume>
          (
          <year>1977</year>
          )
          <fpage>513</fpage>
          -
          <lpage>516</lpage>
          . doi:
          <volume>10</volume>
          .1305/ndjfl/1093888116.
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>W.</given-names>
            <surname>McCune</surname>
          </string-name>
          ,
          <source>Prover9 and Mace4</source>
          ,
          <fpage>2005</fpage>
          -
          <lpage>2010</lpage>
          . http://www.cs.unm.edu/~mccune/prover9.
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>W.</given-names>
            <surname>Bibel</surname>
          </string-name>
          ,
          <source>Automated Theorem Proving</source>
          , Vieweg, Braunschweig,
          <year>1982</year>
          . doi:
          <volume>10</volume>
          .1007/ 978-3-
          <fpage>322</fpage>
          -90102-6, second edition
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>W.</given-names>
            <surname>Bibel</surname>
          </string-name>
          , Deduction: Automated Logic, Academic Press,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>W.</given-names>
            <surname>Bibel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          ,
          <article-title>From Schütte's formal systems to modern automated deduction</article-title>
          , in: R. Kahle, M. Rathjen (Eds.),
          <source>The Legacy of Kurt Schütte</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>215</fpage>
          -
          <lpage>249</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>030</fpage>
          -49424-7_
          <fpage>13</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wielemaker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schrijvers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Triska</surname>
          </string-name>
          , T. Lager, SWI-Prolog,
          <source>Theory and Practice of Logic Programming</source>
          <volume>12</volume>
          (
          <year>2012</year>
          )
          <fpage>67</fpage>
          -
          <lpage>96</lpage>
          . doi:
          <volume>10</volume>
          .1017/S1471068411000494.
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>O. L.</given-names>
            <surname>Astrachan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Stickel</surname>
          </string-name>
          ,
          <article-title>Caching and lemmaizing in model elimination theorem provers</article-title>
          , in: D.
          <string-name>
            <surname>Kapur</surname>
          </string-name>
          (Ed.),
          <source>CADE-11</source>
          , Springer,
          <year>1992</year>
          , pp.
          <fpage>224</fpage>
          -
          <lpage>238</lpage>
          . doi:
          <volume>10</volume>
          .1007/ 3-540-55602-8_
          <fpage>168</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <given-names>J.</given-names>
            <surname>Łukasiewicz</surname>
          </string-name>
          ,
          <article-title>The shortest axiom of the implicational calculus of propositions</article-title>
          ,
          <source>in: Proc. of the Royal Irish Academy</source>
          , volume
          <volume>52</volume>
          ,
          <string-name>
            <surname>Sect</surname>
          </string-name>
          . A, No.
          <volume>3</volume>
          ,
          <issue>1948</issue>
          , pp.
          <fpage>25</fpage>
          -
          <lpage>33</lpage>
          . URL: http: //www.jstor.org/stable/20488489.
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wernhard</surname>
          </string-name>
          ,
          <article-title>Generating compressed combinatory proof structures - an approach to automated first-order theorem proving</article-title>
          , in: B.
          <string-name>
            <surname>Konev</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Schon</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Steen (Eds.),
          <source>PAAR</source>
          <year>2022</year>
          , volume
          <volume>3201</volume>
          <source>of CEUR Workshop Proc., CEUR-WS.org</source>
          ,
          <year>2022</year>
          . https://arxiv.org/abs/2209.12592.
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <given-names>N.</given-names>
            <surname>Eén</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sörensson</surname>
          </string-name>
          ,
          <article-title>An extensible SAT-solver</article-title>
          , in: E.
          <string-name>
            <surname>Giunchiglia</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Tacchella (Eds.),
          <source>SAT</source>
          <year>2003</year>
          , volume
          <volume>2919</volume>
          <source>of LNCS</source>
          , Springer,
          <year>2003</year>
          , pp.
          <fpage>502</fpage>
          -
          <lpage>518</lpage>
          . doi:
          <volume>10</volume>
          .1007/ 978-3-
          <fpage>540</fpage>
          -24605-3_
          <fpage>37</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cruanes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vukmirović</surname>
          </string-name>
          , Faster, higher,
          <source>stronger: E 2</source>
          .3, in: P. Fontaine (Ed.),
          <source>CADE 27, number 11716 in LNAI</source>
          , Springer,
          <year>2019</year>
          , pp.
          <fpage>495</fpage>
          -
          <lpage>507</lpage>
          . doi:
          <volume>10</volume>
          .1007/ 978-3-
          <fpage>030</fpage>
          -29436-6_
          <fpage>29</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          [52]
          <string-name>
            <given-names>L.</given-names>
            <surname>Kovács</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Voronkov</surname>
          </string-name>
          ,
          <article-title>First-order theorem proving and Vampire</article-title>
          , in: N.
          <string-name>
            <surname>Sharygina</surname>
          </string-name>
          , H. Veith (Eds.),
          <source>CAV</source>
          <year>2013</year>
          , volume
          <volume>8044</volume>
          <source>of LNCS</source>
          , Springer,
          <year>2013</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>35</lpage>
          . doi:
          <volume>10</volume>
          .1007/ 978-3-
          <fpage>642</fpage>
          -39799-
          <issue>8</issue>
          _
          <fpage>1</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          [53]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rawson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wernhard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zombori</surname>
          </string-name>
          , W. Bibel, Lemmas: Generation, selection, application,
          <source>CoRR abs/2303</source>
          .05854 (
          <year>2023</year>
          ). doi:
          <volume>10</volume>
          .48550/arXiv.2303.05854, extended version of [17].
        </mixed-citation>
      </ref>
      <ref id="ref54">
        <mixed-citation>
          [54]
          <string-name>
            <given-names>W.</given-names>
            <surname>McCune</surname>
          </string-name>
          , OTTER
          <volume>3</volume>
          .3 Reference Manual,
          <source>Technical Report ANL/MCS-TM-263</source>
          , Argonne National Laboratory,
          <year>2003</year>
          . https://www.cs.unm.edu/~mccune/otter/Otter33.pdf,
          <source>accessed Jul 19</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref55">
        <mixed-citation>
          [55]
          <string-name>
            <given-names>E.</given-names>
            <surname>Eder</surname>
          </string-name>
          ,
          <article-title>A comparison of the resolution calculus and the connection method, and a new calculus generalizing both methods</article-title>
          , in: E. Börger,
          <string-name>
            <given-names>H. Kleine</given-names>
            <surname>Büning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Richter</surname>
          </string-name>
          (Eds.),
          <source>CSL '88</source>
          , volume
          <volume>385</volume>
          <source>of LNCS</source>
          , Springer,
          <year>1989</year>
          , pp.
          <fpage>80</fpage>
          -
          <lpage>98</lpage>
          . doi:
          <volume>10</volume>
          .1007/BFb0026296.
        </mixed-citation>
      </ref>
      <ref id="ref56">
        <mixed-citation>
          [56]
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Aho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sethi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Ullman</surname>
          </string-name>
          , Compilers - Principles, Techniques, and Tools, AddisonWesley,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref57">
        <mixed-citation>
          [57]
          <string-name>
            <given-names>A.</given-names>
            <surname>Genitrini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gittenberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kauers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wallner</surname>
          </string-name>
          ,
          <article-title>Asymptotic enumeration of compacted binary trees of bounded right height</article-title>
          ,
          <source>J. Comb. Theory, Ser. A</source>
          <volume>172</volume>
          (
          <year>2020</year>
          )
          <article-title>105177</article-title>
          . doi:
          <volume>10</volume>
          . 1016/j.jcta.
          <year>2019</year>
          .
          <volume>105177</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref58">
        <mixed-citation>
          [58]
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Ullman</surname>
          </string-name>
          ,
          <article-title>Principles of Database and Knowledge-Base Systems</article-title>
          , volume I:
          <article-title>Classical Database Systems</article-title>
          , Computer Science Press, Rockville, Maryland,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref59">
        <mixed-citation>
          [59]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wolfram</surname>
          </string-name>
          ,
          <string-name>
            <surname>Combinators - A Centennial View</surname>
          </string-name>
          , Wolfram Media Inc,
          <year>2021</year>
          . Accompanying webpage: https://writings.stephenwolfram.com/
          <year>2020</year>
          /12/combinators
          <article-title>-a-centennial-view/</article-title>
          ,
          <source>accessed Jun 30</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref60">
        <mixed-citation>
          [60]
          <string-name>
            <given-names>B.</given-names>
            <surname>Pelzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wernhard</surname>
          </string-name>
          , System description: E-KRHyper, in: F. Pfenning (Ed.),
          <source>CADE-21</source>
          , volume
          <volume>4603</volume>
          <source>of LNCS (LNAI)</source>
          , Springer,
          <year>2007</year>
          , pp.
          <fpage>503</fpage>
          -
          <lpage>513</lpage>
          . doi:
          <volume>10</volume>
          .1007/ 978-3-
          <fpage>540</fpage>
          -73595-3_
          <fpage>37</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref61">
        <mixed-citation>
          [61]
          <string-name>
            <given-names>M.</given-names>
            <surname>Suda</surname>
          </string-name>
          ,
          <article-title>Improving ENIGMA-style clause selection while learning from history</article-title>
          , in: A.
          <string-name>
            <surname>Platzer</surname>
          </string-name>
          , G. Sutcliffe (Eds.),
          <source>CADE 28</source>
          , volume
          <volume>12699</volume>
          <source>of LNCS (LNAI)</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>543</fpage>
          -
          <lpage>561</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>030</fpage>
          -79876-5_
          <fpage>31</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>