=Paper=
{{Paper
|id=Vol-3613/AReCCa2023_paper9
|storemode=property
|title=Comparison of Proof Methods
|pdfUrl=https://ceur-ws.org/Vol-3613/AReCCa2023_paper9.pdf
|volume=Vol-3613
|authors=Wolfgang Bibel
|dblpUrl=https://dblp.org/rec/conf/arecca/Bibel23
}}
==Comparison of Proof Methods==
<pdf width="1500px">https://ceur-ws.org/Vol-3613/AReCCa2023_paper9.pdf</pdf>
<pre>
                                Comparison of Proof Methods
                                Wolfgang Bibel1
                                1
                                    Technical University of Darmstadt, Germany


                                                                         Abstract
                                                                         The paper analyses and compares on a high level of abstraction signiﬁcant features of the leading proof
                                                                         methods in Automated Theorem Proving. It aims at the identiﬁcation of promising areas of research
                                                                         in order to further advance the ﬁeld. The evaluation yields relevant evidence for promising such a
                                                                         substantial advance by solving open research questions in the Connection Method. The most striking
                                                                         feature thereby is the method’s unique formula-orientedness which recently has led to a notable success
                                                                         by the system SGCD based on it.

                                                                         Keywords
                                                                         Automated Reasoning, Automated Theorem Proving, Connection Method, Resolution, Tableaux Method


                                1. Introduction
                                The discipline of Automated Reasoning (AR), or Automated Theorem Proving (ATP), derives its
                                central importance for the science comprising Artiﬁcial Intelligence (AI) from the following three
                                fundamental assumptions. A1: Knowledge of any kind can be represented in a language with a
                                precisely deﬁnable semantics. A2: Logical languages provide a close-enough approximation for
                                such a knowledge representation language. A3: Logical reasoning on the basis of such logical
                                languages provides the key mechanism for crucial tasks involved in the various applications of
                                knowledge and its handling including consistency-checking of knowledge bases as well as their
                                extensions by reasoning.
                                   Mathematics due to its wide-ranging success in the sciences and their applications provides
                                convincing evidence for the validity of these fundamental assumptions. For this reason, ATP
                                systems have primarily been tested paradigmatically on mathematical knowledge although its
                                mechanisms apply to any kind of knowledge and are not at all restricted to the mathematical
                                one.
                                   There is a variety of logical languages known for representing knowledge. First-order logic
                                (fol) may be seen as the language exhibiting core features shared by most others. Hence, fol
                                along with its inferential structure has been studied so extensively. In this paper we will follow
                                this attitude and restrict the discussion to fol. This allows us to assume that the reader is familiar
                                with this widely known formal language.
                                   Along with the underlying language any logic features some mechanism which to some
                                generality allows to identify true resp. valid statements among all possible ones. In Logic these

                                AReCCa 2023: Automated Reasoning with Connection Calculi, 18 September 2023, Prague, Czech Republic
                                $ bibel@gmx.net (W. Bibel)
                                 0000-0003-3892-0171 (W. Bibel)
                                                                       © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                                    CEUR
                                    Workshop
                                    Proceedings
                                                  http://ceur-ws.org
                                                  ISSN 1613-0073
                                                                       CEUR Workshop Proceedings (CEUR-WS.org)


                                AReCCa 2023                                                                                              119                                                           CEUR-WS.org


CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
are called formal systems while in ATP we refer to them as proof calculi. Numerous such calculi
have been developed in ATP. Many of them are based on the resolution rule, others are variants
of tableaux, further ones derive from the Connection Method (CM),1 and so forth.
   The main topic of this paper is a high-level analysis of their diﬀerences, virtues, advantages,
disadvantages, and the resulting perspectives for our ﬁeld. Thereby we will focus on the three
clusters of calculi just mentioned. Let us refer to these clusters of calculi as RM, TM, and CM.
   The subsequent section will present the intended analysis of these three methods. The section
thereafter will point out several aspects of the CM which still are waiting for being explored and
realized in competitive proof systems. The most important aspect thereby is the separation of the
structural work in proof search from the formula to be proved.2 This separation allows a global
guidance of the search from all proof-speciﬁc syntactic features of the formula as demonstrated
by the system SGCD in a ﬁrst step into such a direction. We outline the resulting research
perspective in a ﬁrst subsection. Apart from pointing to further hints in the literature towards
advancing the CM, we then outline the special handling of equality within the CM in a second
subsection, an important gap still waiting to be closed presumably in a rather straightforward
way simulating the analogous techniques successfully applied in the RM. Conclusions complete
the paper.


2. A comparative analysis of proof methods
The concept of proof has been developed, used, and reﬁned in Mathematics for more than two
thousand years. Establishing the validity of a statement, here assumed to be a formula F in fol,
is the goal of a proof. Generally, F is of the form A1 ∧ . . . ∧ An → T, n ≥ 0. The Ai ’s are the
axioms and T is the theorem to be established by the proof. “Axioms” here and in the following
is meant as an abbreviation for “axioms, theorems, lemmata, or assumptions”.
   Typically, mathematical proofs are presented in a semi-formal natural language text, ie. in a
linear format, and involves some sort of logical/mathematical proof steps which have evolved
by tradition over the centuries. Gerhard Gentzen has precisely formalized this semi-formal
style in his seminal paper [3]. Among others it features his well-known Kalkül des natürlichen
Schließens, NK, or calculus of natural deduction, which is modelling the traditional proof steps
in a formal way. Gentzen’s formal system LK [3, p.192f], a technical modiﬁcation of NK, features
19 rules (“Schlußﬁguren”) including the so-called cut rule. Gentzen proved in his Hauptsatz
that this rule is admissible, ie. that it can be dispensed with without aﬀecting the amount of
provable formulas.
1
  The cluster denoted by CM of course also includes the extensive work done in fol as well as in hol under the name
  of matings; see eg. [1]. Connection and mate are two diﬀerent notions for the very same concept. To the best of the
  author’s knowledge, the notion connection was coined ﬁrst by Robert Kowalsky and mate by Martin Davis.
2
  Proof search in the CM operates on connections along with their associated substitutions determined by the
  structure of the formula to be proved. Thereby the formula itself is not subjected to any operation changing it.
  In this sense the structural work is separated from the formula. This is in contrast to, say, resolution where the
  manipulations during proof search are performed on the formula itself by generating resolvents from parts of
  it, thus operating on the formula and its parts. Note that the resolvent carries a meaning as does the original
  formula, ie. both have semantics while connections are purely syntactic concepts; they are pairs of occurrences
  in the formula. This fundamental diﬀerence sets the CM apart from most other proof methods. (The reader not
  familiar with details of the CM is referred to the literature such as eg. [2].)


                                                        120
   This cut elimination leads to longer proofs, in fact exponentially longer proofs in the worst
case [4, Sect. 5.2]. This increase of the proof lengths is therefore an important, even crucial issue
for proof search. Imagine a competition of two systems one searching for the short proof of a
theorem achievable with cut, the other for the proof of the same theorem, but without cut and
for the exponentially longer proof. It is obvious that the latter system starts at an unassailable
disadvantage.
   This observation leads to the ﬁrst fundamental distinction of proof calculi: calculi with cut
or some equivalent feature vs. calculi without, shortly Cc vs. C−c . Thereby, the calculi with
property Cc have an exponential advantage over those in C−c for some set of theorems. LK
belongs to Cc unless the cut-rule is deleted in which case it belongs to C−c .
   Proofs in Gentzen’s systems are trees with axioms at the leaves and the theorem at the root.
In a more compressed form they might also be represented as dags, ie. directed acyclic graphs.
Several proof calculi developed thereafter took Gentzen as sort of a reference point. In principle,
all complete and consistent proof systems for fol related with LK are equally suitable for realizing
the automatic search for proofs. There is, however, the crucial criterion just mentioned putting
a subset among these proof systems at a serious disadvantage; the subset consists of those
inheriting the just mentioned exponential expansion of proof lengths.
   Since TM calculi are derived from LK without cut by way of inverting its rules, C−c holds for
TM and thus suﬀers from that serious disadvantage. It also holds for most CM calculi realized
in implemented systems like leanCoP. In contrast, RM calculi involve the cut since resolution
amounts to sort of a cut: resolution on ground clauses is a version of the cut rule restricted to
atomic formulas. Each resolution step involves the generation of a lemma which can be used
multiple times in the search for a proof, a feature closely related with cut. At our high level
of discussion we may thus say that RM calculi belong to Cc , without going into the nontrivial
details of this statement (see eg. [5, 6]). Cc holding for RM has been considered as a signiﬁcant
advantage in comparison with TM.3
   In [7, Sect. 2.10] a CM calculus has been presented which involves connection structures in
addition to simple connections. For this calculus it has been proved that it can linearly simulate
the resolution calculus, and vice versa. In other words, both calculi are computationally equally
powerful. For the CM as a family of calculi this means that its assumed major weakness in
comparison with RM of lacking cuts does in fact not hold. In short, for the CM we rather have
Cc . The ﬁrst ATP system with such connection structure features is CCS [8]. First promising
results achieved with it in its interplay with the system SGCD [9, 10] have been reported in [11].
   In human mathematical proofs the proof steps typically consist in relatively small changes of
the representational state, expressed in semi-formal statements. This stepwise manipulation of
parts of these statements takes into account the limited computational capabilities of humans
in carrying out such a step in a precise way and supports the minimization of errors. The
consequence is an interweaving of problem statements with proof statements or of problem
structure with proof structure. To a large extent this characteristical feature of not separating
these two diﬀerent structural aspects, noted as C−s , is modelled in some way or another by
3
    For a better overview of the various notions used here and in the following to denote the diﬀerent characteristics
    like Cc of the considered proof methods, the reader is recommended to keep an eye on the summarizing table
    towards the end of the present section which would be helpful for keeping track of these diﬀerent notions in
    relation to the proof methods during the subsequent discussion.


                                                          121
most logical calculi. They modify in each step the actual state, represented by some derived
formula, in some relatively small syntactic way.4 This characteristical feature of interweaving
the deductive structure with some formula structure is thus expressed by the “−s” in C−s in
order to express that these two diﬀerent structural aspects are not separated.
   The unique exception is CM whose calculi during the course of a proof search leave the
formula itself completely untouched. This is because they strictly separate the proof structural
information from the formula, ie. for CM we have Cs , a signiﬁcant distinction. While all other
calculi involve the ballast of formulas associated with each state in proof search, the CM has
been freed completely from this burden which thus turns out to be superﬂuous and is totally
unnecessary from a computational point of view; it is meaningful only with regard to our
human understanding. This separation of the proof structure from the formula at the same
time allows to focus the proof search on all structural features inherent in the formula. Thereby
proof structure includes all structural information possibly leading to proofs, not only the ﬁnally
found proof, of course. This includes, eg., alternative spanning sets of connections, variable
multiplicities, term transformations involved with connections, the discrepancy between goal
and axiom terms to be bridged, and so forth. A global view may be taken on all this relevant
information, as we termed it in [12]. In other words, the search can be done in a formula-oriented
manner as we denoted it in [12]. We consider these advantages of equal relevance as those
going along with the cut. They have been demonstrated in a ﬁrst step by the already mentioned
system CCS/SGCD.
   LK is of a generative type, as it is called in the formal languages literature. Any proof in it
starts with axioms and ends with the given theorem. For the search for a proof of a theorem
the opposite direction, namely starting with the theorem, is much more promising for rather
obvious reasons which are not repeated here. For enabling this switch of direction an analytic
type of calculus is asked for which starts from the theorem in a goal-oriented manner, shortly Cg ,
and ends up with axioms. Tableaux calculi belong to this family. They have been derived from
LK roughly by inverting its deductive rules along with certain simpliﬁcations and modiﬁcations
in detail. Since inverting the cut rule would involve a guess for the cut formula without any
guidance whatsoever, TM calculi have dispensed with that rule altogether, ie. TM calculi as
such are members of C−c as already noted above.
   RM calculi start out with the proof problem as a whole, ie. including axioms and theorem;
more precisely, they do so in the simpliﬁed format as a set of clauses. RM therefore features the
ﬂexibility to operate in both directions in a mixed manner, shortly Cm , which may include Cg .5
   As presented in [13, Sect. 13.2] in formal details, the CM is a method abstracted from TM
(and LK) so that all these are closely related. Hence, for the CM calculi published until recently
in the literature we have Cg , ie. it shares this property with TM. As noted above CM and TM
diﬀer in that we have Cs for CM, but C−s for TM.6 For a computational comparison all this

4
  As already mentioned in Footnote 2, in the example of resolution the manipulations during proof search are
  performed on the formula itself by generating resolvents, thus manipulating on the formula structure itself and this
  way interweaving manipulation on the formula structure with that on the deductive structure.
5
  With resolution a goal-oriented search can be achieved by way of an appropriate set-of-support and an axiom-
  oriented search for instance by way of the hyperresolution rule. Since RM calculi typically operate in an exhaustive
  manner, the distinction is of minor relevance in this context.
6
  Disjunction (in the sense of the originally given formula) is handled diﬀerently in CM and TM; whether or not this


                                                        122
amounts to a clear advantage of the CM over TM. Since otherwise both methods are so closely
related, in the important computational context we may thus set aside TM in our comparative
analysis as being represented by the computationally superior CM. In what follows we will
therefore reduce our discussion to CM vs. RM.
   For CCS/SGCD mentioned above, and thus also for CM, we have Cm as well which gives the
ﬂexibility of combining goal-oriented search with lemmata derived from the axioms. In fact,
Cm has turned out to become the crucial improvement for CM calculi in order to achieve the
decisive property Cc for them. For that reason it has been included into the list of the three
properties discussed here. Altogether, the comparison of CM with RM so far has thus resulted
in an advantage for CM because of the favorable Cs while otherwise both methods share the
same relevant computational features Cc and Cm .
   Unless P=co-NP we have to assume that the complexity of any proof calculus is (at least)
exponential. So, there will always be theorems for which any calculus will fail to ﬁnd a proof
for them within reasonable time limits. But it could well be the case that one calculus always
fares better than another one. That this is deﬁnitely not the case for resolution vs. the CM has
been shown in [14]. Namely, Haken has shown in [15] that resolution exhibits an exponential
behavior for the so-called pigeonhole formulas while the present author has presented in [14]
short CM proofs for exactly these formulas; recently a similar result was achieved for RM [16].7
In other words, CM and RM so far are at least on an equal footing, except for the mentioned
advantage of Cs for the CM.
   The ATP community has established a standard for comparing proof systems by way of the
CASC competition (along with TPTP).8 The best system in any problem category is the one
which, roughly, within a given time-limit proves most theorems. Sean Holden puts this on his
homepage such: “The automated theorem-proving (ATP) world likes to see things going faster and/or
more things being proved. If your prover is implemented in Prolog it starts at a disadvantage.”9
This manner of comparison suits the inherent human attitude towards spectacular races and

  bears an eﬀect on the complexity of the two methods is unknown to the present author; given the way of diﬀerent
  handling in both, it could only be an eﬀect favorable for CM.
7
  [16] fails to refer to [14].
8
  CASC was initiated in the 1990s by members of the present author’s former Munich research group along with
  Geoﬀ Sutcliﬀe during his collaborative visit of it. This footnote remark in the preceding sentence, contained already
  in the original version of the paper, was meant to indicate the present author’s favorable attitude towards Geoﬀ’s
  enormous and extremely valuable contributions done within this enterprise for many decades. In response to one
  of the referees’s “objection/criticism” concerning the subsequent critical assessment of CASC in the paper’s main
  text I extend this remark with the following additional explanations.
  First of all, the present author has supported TPTP/CASC in various ways, not least in my role as initiator,
  coordinator, and speaker of the DFG Focus Programme on Deduction in Germany (1992–1998) which included
  several substantial contributions to TPTP/CASC. But I stay with my assessment expressed in the main text that the
  present format of CASC implicitly favors the leading and popular proof methods in a too indiscriminate manner.
  I fully agree with the referee in that “TPTP/CASC is agnostic wrt the proof methods used in the provers.” But a
  comprehensive assessment in addition has to take into consideration the well-known human attitudes in reaction
  to the results of any such competition. Under such a more comprehensive view TPTP/CASC in its present format is
  focussing merely on quantitative measures without any consideration of further relevant information whatsoever
  and, of course, it is supporting the winners in an indirect way, even though the winning methods might not be
  the most promising ones in view of their future performance. Admittedly, a ﬁner tuned way of competition is not
  easily achievable.
9
  https://www.cl.cam.ac.uk/∼sbh11/, access 8.7.2023.


                                                         123
cheering winners. But obviously it involves too crude a measure of long-term comparison for
revealing the true virtues or weaknesses of the underlying proof methods, ie. their potential
performance in the future.
   The winning method under this crude measure for many years has been RM which this way
has achieved a monopoly in our ﬁeld, in accordance with the well-known social principle: “The
winner takes all.” Since computers have dramatically increased in speed and power along these
years, so has the number of proved theorems. Yet, there are still many theorems proved by
humans which have withstood any attempts to prove them by leading systems. Why that? It
is not unlikely that it is this particular manner of comparison within CASC which is favoring
in some way a possibly suboptimal research direction and through that might be suppressing
support for promising alternatives. The analysis in this paper is meant to alert the community
to this possibility – once again – in order to open its interest in competing directions (as I have
vainly done at numerous occasions within the last decades). Some ATP researchers do share this
encouragement to widen ATP’s horizons; among these is Sean Holden who has stated [17]: “It
seems therefore important that researchers devote time to developing methods that, while perhaps
not winning competitions in the short-term – and under the current metrics – have the potential to
allow us to address harder problems in the future.”
   The contradiction between our analysis favoring CM and the CASC favoring RM (in some
way) has simple explanations. Building CM systems in earnest has started in the 1980s engaging
since then altogether a mere few dozens of researchers. This is in sharp contrast with RM
systems whose serious research and development started about two decades earlier engaging
over the decades altogether thousands of researchers. For these sheer investmental reasons,
the RM technology this way has advanced far ahead of the CM one, of course. Under these
circumstances it is even remarkable how relatively successful the CM has become none the less.
Let us mention some of these achievements.
   CM allows the proof search for theorems in their original form. This makes the transforma-
tions to normal form including Skolemization superﬂuous which oﬀers a lot of advantages. Also,
the CM generalizes to logics other than fol in a rather straightforward way. The separate proof
structures are quite similar in all these variants. In consequence, a system like leanCoP [18] has
required only marginal changes to mutate to a non-clausal form prover [19], to an intuitionistic
prover [20], to a modal logic prover [21], and so forth. In [12] we introduced the term uniformity
for denoting this advantageous property of CM. Last not least, due to its close relationship with
LK – shared with TM – a discovered CM proof can be transformed in a straightforward way
into a human-readable text, a further important advantage. The discussion of this section is
summarized in Table 1.10
   In a personal correspondence with the present author (and with further correspondents)11 Iris

10
   The inclusion of such a table was proposed by one of the referees. While it certainly supports the view over the
   discussion of the various properties for the reader, it has to be interpreted with the following proviso. Each of the
   properties carries with it a longer story. The simpliﬁcation to a binary interpretation suggested by table is not the
   whole story. Consider property orf , for instance. There is some work on generalizing resolution to work also for
   non-clausal form input; it resulted in concluding that resolution under several aspects suits much better to clausal
                      √
   input. The lack of in the table has to be interpreted in this more elaborate manner and similarly for the other
   entries.
11
   Email of 22 Sept. 2023.


                                                         124
                                C±c    C±s    Cm/g     cmp    orf   unf   thr
                         RM     Cc     C−s    Cm
                                                              √     √     √
                         TM     C−c    C−s    Cg
                                                       √      √     √     √
                         CM     Cc     Cs     Cm

Legend: RM resolution; TM tableaux; CM connection method; C±c with/without cut; C±s formula and
proof separated/not separated; Cm mixed manner operation; Cg goal-oriented operation; cmp compact
    proof structure; orf formula in original form; unf uniformity over diﬀerent logics; thr ease of
                              transformation to a human-readable text.
Table 1
Summary of Comparison


van der Giessen pointed to the close relationship between the CM and combinatorial proofs [22].
Indeed the two proof representations share the decisive feature Cs . Details of this relationship
have to be worked out in future research. From a quick look at the work on combinatorial
proofs (CP) it appears to me that both approaches are derivatives from Gentzen’s LK, whereby
CM features a higher degree of abstraction from the details in LK in comparison with CP. This
results in the fact that a CM proof represents a set of diﬀerent LK derivations not just a single
one. This fact bears major advantages for proof search.


3. Research tasks in the pipeline
Depending on the taken perspective the state of the art in ATP may be regarded as extremely
impressive or relatively disappointing. The fact that its systems are able to prove tens of
thousands of theorems, many of them within a few seconds is impressive indeed. The fact,
however, that there are still theorems with proofs found by humans within a reasonable amount
of time, but still untractable for our systems is certainly disappointing. It demonstrates that the
underlying methodology is still lacking of possibly fundamental features. In the present section
I will try to point to features that I feel would advance the ﬁeld if explored and exploited. After
the result of the comparison in the previous section it is needless to mention that I am expecting
progress especially by advancing the CM.

3.1. Exploiting structural features
The characteristic of CM to separate the deductive structure from the formula, ie. Cs , opens
the door for studying and exploiting this structure in a global way wrt. the given problem.
This has been demonstrated in a series of recent papers [23, 11, 24]. In order to simplify the
underlying research task to a reasonable state of tractability for the beginning, these papers
have restricted the study to the class of fol formulas known as condensed detachment problems
(CD). A well-known theorem of this class is the following one due to Łukasiewicz. It is shown
in Figure 1 along with its ﬁve uniﬁable connections c1 , c2 , c3 , c4 , c5 . In fact, there is one more
connection, c0 , connecting the axiom with the goal which, however, is not uniﬁable for which
reason it is not shown explicitly. All CD formulas have the same structure A ∧ DET → G
illustrated by the Figure whereby A is the axiom, DET the detachment rule (or modus ponens),


                                                 125
                                           4
                                                                      1
                                       5
               P i(i(ixy, z), i(izx, iux)) ∧ (P v ∧ P ivw → P w) → P i(iab, i(ibc, iac))
                                                                2
                                                            3
Figure 1: Łukasiewicz with its five unifiable connections.


and G the goal. While CD is a rather restricted class, it already features most important aspects
of general fol.
   Any proof of such a CD formula consists of a number of instances of each of these ﬁve
connections along with a global substitution of the instances of the variables x, y, . . . , w. This
global view on the structure of proofs is unique to the CM and expressed as Cs in the previous
section. In consequence, this opens the door for novel and promising ways of proof search.
   One such approach consists in enumerating these proof structures by the following list S:

                              {c0 }, {c11   11 12       11 12 11 22 23
                                       1 , c4 , c5 }, {c1 , c2 , c5 , c4 , c5 }, . . .

Recall that a connection is a pair of occurrences of literals in instances of formula parts or clauses
whereby these instances are represented just by multiplicities attached to the occurrences within
the given formula.12 The upper indices attached to the connections in S denote the pair of
instances of the connected positive and negative literals. For instance, c12     2 is the connection
from the (positive) major premise of the ﬁrst instance of DET to its (negative) conclusion in
the second instance, and similarly for all others.
   This enumeration is restricted to contain only spanning sets of connections. This guarantees
that the identiﬁcation of a uniﬁable element in the list proves the theorem.13 In the case of CD
formulas this restriction to spanning sets is easily guaranteed by associating with each incoming
connection containing the conclusion of an instance of DET two outgoing connections from
the two premises of this instance. In [23, 11, 24] the sets of S have been represented as so-called
D-terms built from terms of the form D(d1 , d2 ). In correspondence with the members of S the
D corresponds to the incoming connection and its two arguments to the outgoing ones (or,
more precisely, to the corresponding occurrences along with their multiplicities in the formula).
So, we have two equivalent notations for exactly the same structure. However, the notation
used in S is completely general for any kind of formulas, not only CD ones.14
   On the basis of this kind of proof structures SGCD mentioned in the previous section has
achieved an impressive performance [11]. Using D-terms it enumerates such proof structures
together with the uniﬁcation of associated formulas. It also generates lemmata for supporting
proof search in an axiom-oriented way, thus demonstrating the characteristic Cm . SGCD is
12
   The reader not familiar with details of the CM is referred to the literature such as eg. [2].
13
   In an email of 26.Aug.2023 Peter Andrews has reminded the author of the possibility to involve Quantum Computing
   (QC) which might allow to test many list elements at the same time for uniﬁability. In his words: “The problem
   of searching for complete sets of connections may be radically diﬀerent when quantum computers are used.” This
   possibility would in fact speed up the proof process in a sensational manner.
14
   Formally representing connection sets may be done in various ways. The text already mentions two diﬀerent
   alternatives, whereby D-terms is a specialized way for CD formulas. Instead of naming connections as done in the
   text these could as well be represented as ordered pairs using denotations for occurrences of literals. For instance,
   we could enumerate the ﬁve occurrences of literals in the formula in Fig. 1 from right to left as 1, . . . , 5. Then the
   connection c122 explained in the text could be represented eg. by the ordered pair (3.1, 2.2).


                                                           126
the ﬁrst proof system among all currently available provers which achieved a fully automatic
proof for the TPTP problem LCL073-1, known as Meredith Single Axiom problem. Altogether
it ranks among the very best systems within the CD domain. This excellent performance of
SGCD demonstrates the importance of enhancing CM systems the way discussed here.
   The steps taken into this direction by the research discussed here is, however, just a beginning.
In the remaining of the subsection I shall outline several future research options. First of all,
the sequence S may be enhanced in the following way. In its present form it lists what may be
represented as full binary trees, for human understanding a more intuitive representation of the
same thing. The diﬀerent members of S may, however, share common substructures. If these
are represented such that any such common substructure occurs only once in any element of S
then its elements represent dags rather than trees. This way we may reduce the size of those
proof structures considerably and speed up proof search accordingly. Let us assume from here
on that S represents such a list of dags rather than trees.
   Along with each such dag there is a substitution of its variables by uniﬁcation of associated
formulas, including the goal and the axioms. If the uniﬁcation fails, like in the trivial case of
{c0 }, the set may be discarded from the list without restricting generality. Such failure may
be discovered in a strategic manner thus reducing the amount of uniﬁcational work involved.
Also, the terms in A and G may give guidance towards limiting the size of the members of
S, and thus reducing S. By closer inspection into the dynamics of the terms determined by
the connection structure,15 involved strategies may be worked out resulting in preferences
among the members of S, or such preferences might be learned by experiments. By noting such
preferences the amount of search might be reduced considerably.
   More generally, along with the given formula’s connection structure CS like the ﬁve connec-
tions in Fig. 1 there is a transformation T of terms for each of the connections which yields the
uniﬁcation of the connected literals. The information I given by CS along with T is necessary
to establish a proof, and at the same time it is also suﬃcient. Thus, the isolated information
contained in I contains all information needed for ﬁnding a proof in a maximally condensed
form. In other words, the property Cs from the previous section may be taken as to include the
minimality of the separated information I. Exploiting the various aspects contained in I is the
main task for ﬁnding proofs. Note that proof search in this generality is a fairly unexplored and
unexploited domain of ATP research, as it has been invisible through the eyes of RM.16
   All these considerations and many further ones have the potential of improving CM systems
designed along these lines in a substantial way well beyond the current state of the art in ATP.
In our opinion it is a good research strategy to carry out all these improvements ﬁrst for limited

15
   For illustration consider again the connection c122 . The associated substitution is w2 \iv1 w1 , ie. each activation of
   this connection adds, for instance, an i to the resulting w-term. In this sense, an activation of a connection triggers
   some kind of term transformation.
16
   In the present paper we can only provide pointers for future research, of course. Here is one more of those pointers.
   Instead of considering single connections during proof search, whole connection structures consisting of more
   than one connection may be taken into consideration. For illustration, consider again the connection c12        2 . The
   simplest kind of such a structure would, for instance, be an n-fold iteration of this connection, but with varying
   multiplicities. The n might even be taken as a variable to be instantiated by some constraint-driven mechanism
   aimed at achieving a certain goal like a certain number of occurrences of i’s in our example. This same idea might
   be generalized to connection structures with more than a single connection. This way lemmata would enter the
   game in a rather general setting.


                                                           127
structures like that of CD formulas. The crucial test of course comes with the generalization
of the gained insights for the limited cases eventually to arbitrary fol formulas, an enormous
research task and challenge. Since a list like S or the characterizing information like I may be
generated for any formula, as already mentioned above, there is no fundamental barrier for this
generalization.
   All this shows that the development of CM is by far not completed at all. The ATP community
is rather faced with substantial work ahead waiting to be undertaken. This amounts to promising
work with a potential for advancing the ﬁeld in a way which was not recognized in other methods
due to their particular features, especially to their C−s — in contrast to the Cs for CM.

3.2. Equality handling and others
In Mathematics equality is surely the most ubiquitous notion at all. It is worth a special handling
not only for Mathematicians, but also for ATP systems. While there has been extensive work in
the context of RM for handling equality for more than half a century, no analogous work exists
for the underdog CM. CM systems like leanCoP [18] handle equality in an axiomatic manner
like any other relation. In consequence, CM systems are doing poorly in CASC competitions
for any problem featuring equality, hence for many problems indeed.
    In principle, transferring the existing techniques for handling equality from RM including
superposition to CM in a sense might be a straightforward task. Yet, it requires top expertise
and familiarity in both subﬁelds of ATP, a rare constellation indeed, if existing at all anywhere
in the world. With the necessary support it could be acquired none the less. In the hope that
this will happen in the not-too-distant future I outline here a ﬁrst high-level version of how
equality could be incorporated more eﬃciently in CM.
    In the books [2, V.3] as well as [25, 4.5] ﬁrst steps towards such an incorporation have already
been outlined. There the treatment is based on the three axioms of equality, namely reﬂexivity
as well as substitutivity for functions and predicates. Typical treatments in the context of
resolution rather use the three congruence axioms along with functional substitutivity [26].
While logically the two sets of axioms are equivalent, it is worthwhile to think about which of
these suites the environment more adequately.
    In the two just referenced books the concept of a connection is extended by so-called eq-
connections. Eg. see the examples 3.2, 3.3, and 3.4 in [2, V.3] illustrating the concept. They
allow the direct substitution of terms by equal terms in the given formula or matrix similarly to
paramodulation in RM.17 They perfectly ﬁt into the structural representation of proofs in CM
and fulﬁl the preferred property Cs . The corollary 3.6 in that reference provides the theoretical
basis for this approach. Theoretically, this basis has been worked out in further detail in several
publications including [27].18
    Without a clever strategic guidance for substituting equals by equals, this initial step is of
little help however. What is needed is some ordering relation which imposes a direction to these

17
   Note that the mentioned examples generalize to any path through some matrix or formula, not only for the
   illustrated case of the matrix consisting of a single path (without considering the axiom).
18
   The present author is not aware of any work continuing this way of handling equality in the context of ATP.
   Whatever the reason for this ignorance in the last four decades, it clearly demonstrates the lack of interest in the
   CM within the ATP community.


                                                         128
possible replacements. Within RM the decisive technique to realize such a directed replacement
without restricting generality has been presented in [28], optimized and polished in subsequent
publications by the same and other authors (eg. [26]). In principle, the transfer of these tech-
niques into the context of eq-connections within the CM seems to be a straightforward task.
It would complement the comfortable way of dealing with equality by way of eq-connections
with an ordering relation for restricting the proliﬁc nature of substituting equals by equals. But,
as already stated, carrying out such a transfer never the less requires high competence in two
diﬀerent demanding areas.
   Unfortunately, there is little incentive for getting the performance of one system in view of a
feature like equality to the level of a competing one by incorporating the latter’s techniques.
In fact, this unattractiveness is most likely the reason why this transfer has not been done up
to now (but see [29] for a recent encouraging step into this direction). In the context of the
perspectives outlined in the previous subsection the situation eventually might change, since in
combination the respective eﬀorts open up the perspective of CM systems outperforming RM
systems.
   The present author has outlined many further potential improvements in detail for the CM
in the last decades. Here I may thus restrict myself to just mentioning the existence of these
details in those publications. For instance, see [12, 30] and the references to earlier work given
therein.
   Recall the reference in Sect. 2 to Sean Holden’s opinion concerning the programmming
languages used in implementations which certainly reﬂects the situation in the ATP in an apt
manner. The use of Prolog for developing the CM further is fully appropriate in my opinion,
especially if the ATP community will change its attitude in view of a more suitable metric for
measuring progress as discussed in Sect. 2. But once the CM has reached a state competitive
with that of its competitors the time will have come that a system will have to be implemented
in some low-level programming language to speed-up the low-level eﬃciency in its performance
matching that of leading RM systems.
   Once such a stable system with high performance will be available, also the time has come
for building a comfortable interface for the interaction with users, especially – but by far not
exclusively – with mathematicians. Such an interface would present the discovered proofs to the
users in a natural mathematical style of text. Since a CM proof may easily be translated into an
LK proof and such an LK proof is in fact just a formalized version of a natural mathematical proof,
resulting proofs may indeed be presented to users in a style familiar to mathematicians.19 From
that time on, mathematicians, Math-teachers, engineers etc. might ﬁnally become interested in
a daily use of such a system within their work.

19
     Decades ago there has been valuable work on translating CM-proofs into a form familiar to mathematicians. This
     is not the place to give a comprehensive overview of such work which has extensively been cited in earlier papers
     of the author. Just one notable work is worth mentioning in the present context as a representative example which
     is [31].
     One of the referees emphasizes this issue of human-readability in his review in the following way: “Connection
     provers . . . provide output easily transferable to a human-readable form. The last point is especially important for
     applications as theorem provers can be used as veriﬁers, which ought to provide explanations to users. The complexity
     of symbolic output has recently been addressed by the explainable AI community and is seen as an important issue to
     tackle.”


                                                            129
4. Conclusions
The paper has taken a high-level comparative look at the three major proof methods in ATP,
namely resolution (RM), tableaux (TM), and the Connection Method (CM). Generally, the
competition among diﬀerent research directions is fruitful if it is done on fair grounds. We have
argued that the community in ATP seems to be misguided by superﬁcial advantages favoring
RM. If one takes a closer look at qualitative characteristics of these competing methods, as
done in this text, the arguments all rather favor the CM as the most promising of the three
competitors.
   To a certain degree, however, the CM is sort of still in its infancy, lacking important aspects
to be worked out in much further detail. As described in Sect. 3.1 the most important aspect is
the realization of an involved proof search along the information provided by the connection
structure extractable from the formula to be proved. Further aspects like the special handling of
equality are described in Sect. 3.2. For being ready to the substantial investments needed for
advancing the CM in these directions, the community has ﬁrst to become convinced that such
an investment would advance ATP as such. The arguments presented in the text have tried to
nourish exactly this kind of conviction.


Acknowledgments
The initiative to the Workshop AReCCa 2023 is due to Jens Otten which in turn triggered the
writing of this text. I am grateful to him for this promotion of the CM as well as for decades
of fruitful collaboration. Thanks are due to Christoph Wernhard and several anonymous
referees for a number of helpful suggestions towards improving the text in terms of clarity and
understandability.


References
 [1] P. B. Andrews, Theorem proving via general matings, Journal of the ACM 28 (1981)
     193–214.
 [2] W. Bibel, Automated Theorem Proving, second ed., Vieweg Verlag, Braunschweig, 1987.
     First ed. 1982.
 [3] G. Gentzen, Untersuchungen über das logische Schließen, Mathematische Zeitschrift 39
     (1935) 176–210 and 405–431. Engl. transl. in [32].
 [4] A. S. Troelstra, H. Schwichtenberg, Basic Proof Theory, Cambridge Tracts in Theoretical
     Computer Science, 2nd ed., Cambridge University Press, Cambridge, 2000.
 [5] N. Arai, T. Pitassi, A. Urquhart, The complexity of analytic tableaux, Journal of Symbolic
     Logic 71 (2006) 777–790.
 [6] A. Urquhart, The complexity of propositional proofs, Bulletin of Symbolic Logic 1 (1995)
     425–467.
 [7] W. Bibel, E. Eder, Methods and calculi for deduction, in: D. M. Gabbay, C. J. Hogger, J. A.
     Robinson (Eds.), Handbook of Logic in Artiﬁcial Intelligence and Logic Programming,
     volume 1, Oxford University Press, Oxford, 1993, pp. 67–182.


                                               130
 [8] C. Wernhard, Generating compressed combinatory proof structures – an approach to
     automated ﬁrst-order theorem proving, in: B. Konev, C. Schon, A. Steen (Eds.), PAAR 2022.
     CEUR Workshop Proceedings, volume 3201, 2022. https://arxiv.org/abs/2209.12592.
 [9] C. Wernhard, CD tools – Condensed detachment and structure generating
     theorem proving (system description), CoRR abs/2207.08453, 2023. https: //-
     doi.org/10.48550/ARXIV.2207.08453.
[10] C. Wernhard, Structure-generating ﬁrst-order theorem proving, in: J. Otten, W. Bibel
     (Eds.), AReCCa 2023. CEUR Workshop Proceedings, 2023.
[11] M. Rawson, C. Wernhard, Z. Zombori, W. Bibel,                Lemmas: Generation, selec-
     tion, application,      in: R. Ramanayake, J. Urban (Eds.), The 32nd International
     Conference on Automated Reasoning with Analytic Tableaux and Related Meth-
     ods (TABLEAUX 2023), Prague, Proceedings, volume 14278 of LNAI, Springer,
     Cham, 2023, pp. 153–174. https://link.springer.com/content/pdf/10.1007/978-3-031-43513-
     3_9?pdf=chapter%20toc, https://arxiv.org/abs/2303.05854.
[12] W. Bibel, J. Otten, Experiments with connection method provers, in: 4th Conference
     on Artiﬁcial Intelligence and Theorem Proving, AITP 2019, April 7-12, 2019, Obergurgl,
     Austria, 2019. http://aitp-conference.org/2019/slides/WB.pdf.
[13] W. Bibel, J. Otten, From Schütte’s formal systems to modern automated deduction, in:
     R. Kahle, M. Rathjen (Eds.), The Legacy of Kurt Schütte, Springer, Cham, 2020, pp. 215–249.
     https://link.springer.com/chapter/10.1007/978-3-030-49424-7_13.
[14] W. Bibel, Short proofs of the pigeonhole formulas based on the connection method, Journal
     of Automated Reasoning 6 (1990) 287–297.
[15] A. Haken, The intractability of resolution, Theor. Comput. Sci. 39 (1985) 297–308.
[16] M. J. H. Heule, B. Kiesl, A. Biere, Short proofs without new variables, in: L. de Moura (Ed.),
     CADE 26, volume 10395 of LNCS (LNAI), Springer, 2017, pp. 130–147.
[17] S. B. Holden, Machine Learning for Automated Theorem Proving: Learning to Solve SAT
     and QSAT, volume 14 of Foundations and Trends® in Machine Learning, now publish-
     ers, Boston, Delft, 2021. URL: https://www.nowpublishers.com/article/Details/MAL-081.
     doi:10.1561/2200000081.
[18] J. Otten, W. Bibel, leanCoP: Lean connection-based theorem proving, Journal of Symbolic
     Computation 36 (2003) 139–161.
[19] J. Otten, nanoCoP: A non-clausal connection prover, in: N. Olivetti, A. Tiwari (Eds.),
     International Joint Conference on Automated Reasoning, IJCAR 2016, volume 9706 of
     LNAI, Springer, Heidelberg, 2016, pp. 302–312.
[20] J. Otten, Clausal connection-based theorem proving in intuitionistic First-Order Logic,
     in: TABLEAUX 2005, International Conference on Automated Reasoning with Analytic
     Tableaux and Related Methods, volume 3702 of Lecture Notes in Computer Science, Springer,
     Berlin, 2005, pp. 245–261.
[21] J. Otten, Implementing connection calculi for ﬁrst-order modal logics, in: E. Ternovska,
     K. Korovin, S. Schulz (Eds.), 9th International Workshop on the Implementation of Logics
     (IWIL 2012), Merida, Venezuela, 2012.
[22] D. J. Hughes, First-order proofs without syntax, arXiv:1906.11236v1 [math.LO] (2019)
     1–42.
[23] C. Wernhard, W. Bibel, Learning from Łukasiewicz and Meredith: Investigations into


                                               131
     proof structures, in: A. Platzer, G. Sutcliﬀe (Eds.), Proceedings of the 28th International
     Conference on Automated Deduction (CADE-28), volume 12699 of LNAI, Springer, 2021,
     pp. 58–75.
[24] C. Wernhard, W. Bibel, Investigations into proof structures, JAR (under review 2023).
[25] W. Bibel, Deduction: Automated Logic, Academic Press, London, 1993.
[26] L. Bachmair, H. Ganzinger, Equational Reasoning in Saturation-Based Theorem Proving,
     volume Automated Deduction – A Basis for Applications. W. Bibel and P. H. Schmitt (eds.),
     Volume I of Applied Logic Series, Kluwer, Dordrecht, 1998, pp. 353–398.
[27] J. Gallier, P. Narendran, S. Raatz, W. Snyder, Theorem proving using equational matings
     and rigid E-uniﬁcation, J.ACM 39 (1992) 377–429.
[28] L. Bachmair, H. Ganzinger, C. Lynch, W. Snyder, Basic paramodulation, Information and
     Computation 121 (1995) 172–192.
[29] G. Prusak, C. Kaliszyk, Lazy paramodulation in practice, in: B. Konev, C. Schon, A. Steen
     (Eds.), PAAR 2022, volume 3201 of CEUR Workshop Proceedings, CEUR-WS.org, 2022.
[30] W. Bibel, A vision for Automated Deduction rooted in the connection method, in:
     R. Schmidt, C. Nalon (Eds.), The 26th International Conference on Automated Reasoning
     with Analytic Tableaux and Related Methods (TABLEAUX 2017), volume 10501 of LNAI,
     Springer, 2017, pp. 3–21. DOI: 10.1007/978-3-319-66902-1_1.
[31] D. A. Miller, Expansion trees and their conversion to natural deduction proofs, in: R. E.
     Shostak (Ed.), 7th International Conference on Automated Deduction, Springer, Berlin,
     1984, pp. 375–393.
[32] M. E. Szabo (Ed.), The collected papers of Gerhard Gentzen, North-Holland, Amsterdam,
     1969.


                                              132

</pre>