<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Transforming and Analyzing Proofs in the CERES-system</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Stefan Hetzl Alexander Leitsch Daniel Weller Bruno Woltzenlogel Paleo Institute of Computer Languages (E185), Vienna University of Technology</institution>
        </aff>
      </contrib-group>
      <fpage>77</fpage>
      <lpage>91</lpage>
      <abstract>
        <p>Cut-elimination is the most prominent form of proof transformation in logic. The elimination of cuts in formal proofs corresponds to the removal of intermediate statements (lemmas) in mathematical proofs. Cut-elimination can be applied to mine real mathematical proofs, i.e. for extracting explicit and algorithmic information. The system CERES (cut-elimination by resolution) is based on automated deduction and was successfully applied to the analysis of nontrivial mathematical proofs. In this paper we focus on the input-output environment of CERES, and show how users can interact with the system and extract new mathematical knowledge.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>systems have been used successfully to verify complicated and famous mathematical proofs like this of
the four color theorem (see http://research.microsoft.com/~gonthier/). The core algorithm of
CERES, however, works on LK-proofs and performs cut-elimination. For this reason we have developed
the higher proof language HLK which is close to the sequent calculus LK, making translations to LK easy
and efficient.</p>
      <p>In this paper we present the input-output environment of CERES and illustrate the interaction of
users with the system. In the first step proofs are formalized in HLK. The formalized proof is then
compiled to LKDe and analyzed by the CERES-algorithm, and finally the resulting atomic cut normal
form can be viewed by Prooftool. Moreover, from the normal form a Herbrand sequent can be extracted
which contains mathematical information in a more condensed form. We illustrate all phases of proof
specification and analysis by an example, the tape proof of Christian Urban (see also [1]).
2</p>
    </sec>
    <sec id="sec-2">
      <title>System overview</title>
      <p>3. The HandyLK proof is input to the compiler HLK.
4. HLK generates a formal proof in sequent calculus LKDe.
6. CERES extracts from the formal proof a characteristic clause set, which contains clauses formed
from ancestors of cut-formulas in the formal proof.
7. The characteristic clause set is then input to a resolution theorem prover, e.g. Otter4 or Prover95.
8. The resolution theorem prover outputs a refutation of the characteristic clause set.
9. CERES receives the refutation, which will be used as a skeleton for the transformed proof in
atomiccut normal form (ACNF).
10. CERES outputs the grounded refutation in a tree format and the characteristic clause set. Moreover
it extracts projections from the formal proof and plugs them into the refutation in order to generate
the ACNF. The projections and the ACNF are also output. A Herbrand sequent is obtained from
the instantiation information of the quantifiers of the end-sequent (resulting in an (equationally)
valid sequent) and output as well. The Herbrand sequent typically summarizes the creative content
of the ACNF. For details, see [8].
11. All outputs and inputs of CERES can be opened with ProofTool.
12. ProofTool, a graphical user interface, renders all proofs, refutations, projections, sequents and
clause sets so that they can be visualized by the user.
13. The information displayed via ProofTool is analyzed by the user.
14. Based on his analysis, the user can formulate new mathematical ideas, e.g. a new informal direct
proof corresponding to the ACNF.
3</p>
    </sec>
    <sec id="sec-3">
      <title>Proof analysis with CERES</title>
      <p>Our calculus LKDe is based on standard LK with permutation, contraction and weakening, atomic
axioms and multiplicative rules. For example, the following rules deal with the ^-connective:
G ` D; A P ` L; B
G; P ` D; L; A ^ B ^ : r</p>
      <p>A; G ` D
A ^ B; G ` D ^ : l1</p>
      <p>A; G ` D
B ^ A; G ` D ^ : l2
In addition, to bring the calculus closer to mathematical practice, definition introduction and equality
handling rules are used:</p>
      <p>Let A be a first-order formula with the free variables x1; : : : ; xk (denoted by A(x1; : : : ; xk)) and P be a
new k-ary predicate symbol (corresponding to the formula A). Then the definition introduction rules are:
A(t1; : : : ; tk); G ` D
P(t1; : : : ; tk); G ` D
defP : l</p>
      <p>G ` D; A(t1; : : : ; tk)
G ` D; P(t1; : : : ; tk)
defP : r
for arbitrary sequences of terms t1; : : : ; tk. Definition introduction is a simple and very powerful tool
in mathematical practice. Note that the introduction of important concepts and notations like groups,
integrals etc. can be formally described by introduction of new symbols. There are also definition
introduction rules for new function symbols which are of similar type.
4available at http://www-unix.mcs.anl.gov/AR/otter/
5available at http://www.cs.unm.edu/~mccune/prover9/
The equality rules (also called paramodulation rules) are:
on the right, where L denotes a set of positions of subterms where replacement of s by t has to be
performed. We call s = t the active equation of the rules.</p>
      <p>
        We will now introduce the CERES system by presenting the analysis of a proof from [13]; it was
formalized in LKDe and analyzed by CERES in [1]. The end-sequent formalizes the statement: on a
tape with infinitely many cells which are all labelled by 0 or by 1 there are two cells labelled by the same
number. f (x) = 0 expresses that the cell nr. x is labelled by 0. Indexing of cells is done by number terms
defined over 0; 1 and +. The proof j below uses two lemmas: (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) there are infinitely many cells labelled
by 0 and (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) there are infinitely many cells labelled by 1. These lemmas are eliminated by CERES and a
more direct argument is obtained in the resulting proof j0.
3.1 Specifying proofs in HandyLK
In this section, we introduce the higher proof language HandyLK by presenting its most important
features and syntax by means of our example proof. HandyLK is based on the idea that many aspects of
writing LKDe proofs can be automatized. It supports a many-sorted first-order language.
      </p>
      <p>Before starting to write proofs, it is first necessary to define the language and the function and
predicate definitions one intends to use. In our example, we deal with a language with one sort (natural
numbers) and the constants mentioned above. In HandyLK , we write:
define type nat;
define constant 0, 1 of type nat;
define infix function + of type nat,nat to nat with weight 100;
define function f of type nat to nat;
In the definition of infix functions like +, we may set a weight that allows terms to be written in the usual
mathematical way, where superfluous brackets may be dropped. In addition to the constants, we define
some variables that will be used in the proof. Here, the function f is the labelling function assigning
labels to tape cells.
define variable k, l, n, p, q, x, n_0, n_1 of type nat;
We also define the axioms we will use. Note that :- represents ` in HandyLK .
define axiom :- k + l = l + k;
define axiom :- k + (l + n) = (k + l) + n;
define axiom k = k + (1 + l) :- ;
define axiom :- k = k;</p>
      <sec id="sec-3-1">
        <title>Finally, we introduce some predicate definitions.</title>
        <p>define predicate A by all x ( f(x) = 0 or f(x) = 1 );
define predicate I by all n ex k f( n + k ) = x;
Note that in the definition of I, x is a free variable. The free variables determine the arity of the defined
predicate, and can be instantiated by parameters. If P is a defined predicate, F is its defining formula
and x¯ are the free variables of F, then the predicate definition is interpreted as 8x¯(P(x¯) () F). In our
example, the predicate A is the assumption: All cells are labelled either 0 or 1. The predicate I(x) states
that there are infinitely many cells labelled x.</p>
        <p>We now turn to formalizing our example proof j. In LKDe, j is</p>
        <p>
          (t) (e(0))
A ` I(0); I(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) I(0) ` 9p9q(p 6= q ^ f (p) = f (q)) cut (e(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ))
        </p>
        <p>
          A ` 9p9q(p 6= q ^ f (p) = f (q)); I(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) I(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) ` 9p9q(p 6= q ^ f (p) = f (q))
        </p>
        <p>A ` 9p9q(p 6= q ^ f (p) = f (q))
cut
Here t is a proof of the fact that either infinitely many cells are labelled 0, or infinitely many cells are
labelled 1. The two applications of cut encode the application of two lemmas e(n) for n 2 f0; 1g: if there
are infinitely many cells labelled n, then there are two cells labelled by the same number.</p>
        <p>In HandyLK , we start the specification of the proof by giving an identifier, here we choose the name
the-proof:
define proof the-proof</p>
      </sec>
      <sec id="sec-3-2">
        <title>Next, the end-sequent is fixed:</title>
        <p>
          proves A :- ex p ex q ( not p = q and f(p) = f(q) );
Now, we begin by specifying the inferences from the bottom up. In HandyLK , only the active formulas
of a rule have to be specified, and no structural rules (except cut) have to be written down explicitly. The
HLK compiler keeps track of the formulas used in the proof, and automatically inserts the structural rules
necessary to be able to apply the rules specified by the user. The first inference in j is a cut, which is a
binary rule. Due to the linear nature of the HandyLK language, one of the subproofs above the binary
rule has to be referenced. The following proof reference clauses are available:
by proof hproof i
auto propositional hsequenti
explicit axiom hsequenti
The first clause gives the name of the subproof with which to continue, the second clause states that
the subproof ends in a propositional tautology whose proof should be computed automatically, and the
last clause asserts that the subproof consists solely of an axiom. In our example we reference the proof
e(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ).
        </p>
        <p>
          with cut I(
          <xref ref-type="bibr" rid="ref1">1</xref>
          )
        </p>
        <p>
          right by proof \epsilon(
          <xref ref-type="bibr" rid="ref1">1</xref>
          );
This specifies an application of cut with cut formula I(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ), where the right subproof will be e(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ). In
LKDe, this will be a rule application
(e(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ))
G ` D; I(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) I(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ); P ` L
        </p>
        <p>G; P ` D; L</p>
        <p>cut
where G; P; D; L will be populated by the appropriate formulas automatically by the HLK compiler.</p>
        <p>
          We continue the HandyLK specification of this proof part with the left subproof of the first
application of cut by encoding the second application: here, we reference both the left subproof t and the right
subproof e(0).
;
with cut I(0)
left by proof \tau
right by proof \epsilon(0);
This completes the HandyLK specification of j — but to be able to actually compile the proof into an
LKDe-proof, the subproofs t, e(0) and e(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) have to be encoded in HandyLK . We continue by showing
the HandyLK rule applications in t, which are definition introduction and quantifier rules. The proof t
starts by expanding the two occurrences of the defined predicate I:
        </p>
        <p>A ` 8n9k f (n + k) = 0; 8n9k f (n + k) = 1</p>
        <p>A ` I(0); 8n9k f (n + k) = 1</p>
        <p>
          A ` I(0); I(
          <xref ref-type="bibr" rid="ref1">1</xref>
          )
defI : r
defI : r
In HandyLK , it suffices to specify the predicate to be expanded and give the auxiliary formulas of the
rules, the correct main formula is then found by HLK through matching:
with undef I
        </p>
        <p>:- all n ex k f( n + k ) = 0, all n ex k f( n + k ) = 1;
Next, the two strong quantifiers are eliminated by 8 : r rules, introducing the eigenvariables n0 and n1:
A ` 9k f (n0 + k) = 0; 9k f (n1 + k) = 1
A ` 9k f (n0 + k) = 0; 8n9k f (n + k) = 1 8 : r</p>
        <p>A ` 8n9k f (n + k) = 0; 8n9k f (n + k) = 1 8 : r</p>
      </sec>
      <sec id="sec-3-3">
        <title>The corresponding HandyLK code is</title>
        <p>with all right</p>
        <p>:- ex k f( n_0 + k ) = 0, ex k f( n_1 + k ) = 1;
Again, it suffices to specify the auxiliary formulas, the correct main formulas are found by matching. The
same holds for the weak quantifier rules. In our example, we instantiate them by n1 and n0, respectively:
A ` f (n0 + n1) = 0; f (n1 + n0) = 1
A ` 9k f (n0 + k) = 0; f (n1 + n0) = 1 9 : r</p>
        <p>A ` 9k f (n0 + k) = 0; 9k f (n1 + k) = 1 9 : r
with ex right</p>
        <p>:- f( n_0 + n_1 ) = 0, f( n_1 + n_0 ) = 1;
Of course, HLK checks whether the quantifier rule applications are legal. Our proof t is nearly done —
it remains to decompose our assumption A and to use the commutativity of +. We only show the latter,
which is encoded with an application of paramodulation:
` n0 + n1 = n1 + n0 f (n1 + n0) = 1 ` f (n1 + n0) = 1
f (n0 + n1) = 1 ` f (n1 + n0) = 1
= : l2
Such a paramodulation application is common when formalizing proofs from mathematics: an equality
axiom from the background theory is used as the active equation. For this reason, it receives special
treatment in HandyLK : the active equation is simply specified in a by clause.</p>
        <p>
          with paramod by n_0 + n_1 = n_1 + n_0
right f( n_1 + n_0 ) = 1 :- ;
This rule application can also be encoded in the general syntax for binary rules, using a proof reference:
with paramod
f( n_1 + n_0 ) = 1
:left explicit axiom :- n_0 + n_1 = n_1 + n_0 ;
Recall that the definition of j in HandyLK contained references to proofs e(0) and e(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ), which show
that under the assumption that there are infinitely many cells labelled 0 and 1, respectively, it follows that
two cells are labelled by the same number. Clearly, these proofs have a similar structure — this fact can
be exploited in HandyLK by writing a meta proof that can be instantiated with some specific terms:
define proof \epsilon
with meta term i of type nat;
proves
        </p>
        <p>I(i) :- ex p ex q ( not p = q and f(p) = f(q) );
We omit the details of the specification of the meta proof e(i). Additionally, HandyLK supports the
definition of proofs in a recursive way, which is a convenient way to encode sequences of proofs. This
feature was used to encode a sequence of proofs for the formula scheme from [3].
3.2</p>
        <p>XML format for proofs
The programs in the CERES system do not work directly with proofs written in HandyLK — these proofs
are compiled into a flexible XML format using the HLK compiler. XML is a well known data
representation language which allows the use of arbitrary and well known utilities for editing, transformation and
presentation and standardized programming libraries. We are interested in tree-style proofs, formulas
and terms. Data is structured in trees in XML, therefore the encoding and decoding of these objects is
very straightforward.</p>
        <p>
          The format is flexible in the sense that it only assumes that proofs are trees or directed acyclic graphs
of rule applications that are labelled by sequents. In our practice, it has been used successfully to encode
LKDe and resolution proofs. The following abbreviated XML code represents the proof j:
&lt;proof symbol="the-proof" calculus="LK"&gt;
&lt;rule symbol="c:r" type="contrr" param="2"&gt;
&lt;sequent&gt;...&lt;/sequent&gt;
&lt;rule symbol="cut" type="cut"&gt;
&lt;sequent&gt;...&lt;/sequent&gt;
&lt;rule symbol="\pi:r" type="permr" param="(
          <xref ref-type="bibr" rid="ref12">1 2</xref>
          )"&gt;
&lt;sequent&gt;...&lt;/sequent&gt;
&lt;rule symbol="cut" type="cut"&gt;
&lt;sequent&gt;...&lt;/sequent&gt;
&lt;prooflink symbol="\tau"/&gt;
&lt;prooflink symbol="\epsilon(0)"/&gt;
&lt;/rule&gt;
&lt;/rule&gt;
&lt;prooflink symbol="\epsilon(
          <xref ref-type="bibr" rid="ref1">1</xref>
          )"/&gt;
&lt;/rule&gt;
&lt;/rule&gt;
&lt;/proof&gt;
The symbol attributes are used for the visual presentation of the proof trees: they allow the user to
identify proofs and rules. The param attributes contain additional rule information, in the example they
specify which formulas are to be contracted (by the contraction rule c:r) and how the formulas are to be
permuted (by the permutation rule \pi:r). The prooflink tags reference other proofs by their name
and allow the representation of proofs as DAGs.
        </p>
        <p>In the above example, the formulas contained in the sequents have been left out. To give an example
of how a formula is encoded in the XML format, consider the following XML code corresponding to the
formula R(c; f (c)) ^ P(c):
&lt;conjunctiveformula type="and"&gt;
&lt;constantatomformula symbol="R"&gt;
&lt;constant symbol="c"/&gt;
&lt;function symbol="f"&gt;</p>
        <p>&lt;constant symbol="c"/&gt;
&lt;/function&gt;
&lt;/constantatomformula&gt;
&lt;constantatomformula symbol="P"&gt;</p>
        <p>&lt;constant symbol="c"/&gt;
&lt;/constantatomformula&gt;
&lt;/conjunctiveformula&gt;</p>
        <p>Data is organized in our XML format in a proofdatabase containing a number of proofs. The
proofdatabase may also contain
1. a list of defined predicates and their definitions,
2. a list of sequents specifying the axioms of the background theory in which the proofs are written,
3. arbitrary lists of sequents identified by some symbol (e.g. to store a Herbrand sequent).
3.3</p>
        <p>The CERES method
To prepare for the following section, we will now introduce the theoretical background for the CERES
method of cut-elimination, which is used by the CERES system to compute a proof in atomic cut normal
form and, in the end, a Herbrand sequent.</p>
        <p>The central idea of CERES consists in extracting a so-called characteristic clause set from a proof,
and then using a resolution refutation of this set to obtain a proof with only atomic cuts. We consider the
proofs in LKDe as directed trees with nodes which are labelled by sequents, where the root is labelled
by the end-sequent. According to the inference rules, we distinguish binary and unary nodes. In an
inference
n1 : S1 n2 : S2 x</p>
        <p>n : S
where n is labelled by S, n1 by S1 and n2 by S2, we call n1; n2 predecessors of n. Similarly n0 is
predecessor of n in a unary rule if n0 labels the premiss and n the consequent. Then the predecessor
relation is defined as the reflexive and transitive closure of the relation above. Every node is predecessor
of the root, and the axioms have only themselves as predecessors. For a formal definition of the concepts
we refer to [4] and [5]. A similar relation holds between formula occurrences in sequents. Instead of a
formal definition we give an example.</p>
        <p>Consider the rule:
The occurrences of P(a) and P(b) in the premiss are ancestors of the occurrence of P(a) ^ P(b) in
the consequent. P(a) and P(b) are called auxiliary formulas of the inference, and P(a) ^ P(b) the main
formula. 8x:P(x) in the premisses are ancestors of 8x:P(x) in the consequent. Again the ancestor relation
is defined by reflexive transitive closure.</p>
        <p>Let W be the set of all occurrences of cut-formulas in sequents of an LKDe-proof j. The cut-formulas
are not ancestors of the formulas in the end-sequent, but they might have ancestors in the axioms (if the
cuts are not generated by weakening only). The construction of the characteristic clause set is based on
the ancestors of the cuts in the axioms. Note that clauses are just defined as atomic sequents. We define
a set of clauses Cn for every node n in j inductively:</p>
        <p>If n is an occurrence of an axiom sequent S(n), and S0 is the subsequent of S(n) containing only
the ancestors of W then Cn = fS0g.</p>
        <p>Let n0 be the predecessor of n in a unary inference then Cn = Cn0 .</p>
        <p>Let n1; n2 be the predecessors of n in a binary inference. We distinguish two cases
(a) The auxiliary formulas of n1; n2 are ancestors of W. Then
(b) The auxiliary formulas of n1; n2 are not ancestors of W. Then</p>
        <p>Cn = Cn1 [ Cn2 :
Cn = Cn1</p>
        <p>Cn2 :
where C</p>
        <p>D = fC D j C 2 C ; D 2 D g and C D is the merge of the clauses C and D.</p>
        <p>The characteristic clause set CL(j) of j is defined as Cn0 , where n0 is the root.</p>
        <p>Theorem 1. Let j be a proof in LKDe. Then the clause set CL(j) is equationally unsatisfiable.
Remark. A clause set C is equationally unsatisfiable if C does not have a model where = is interpreted
as equality over a domain.</p>
        <p>Proof. This proof first appeared in [1]. Let n be a node in j and S0(n) the subsequent of S(n) which
consists of the ancestors of W (i.e. of a cut). It is shown by induction that S0(n) is LKDe-derivable from
Cn . If n0 is the root then, clearly, S0(n0) = ` and the empty sequent ` is LKDe-derivable from the axiom
set Cn0 , which is just CL(j). As all inferences in LKDe are sound over equational interpretations (where
new symbols introduced by definition introduction have to be interpreted according to the defining
equivalence), CL(j) is equationally unsatisfiable. Note that, without the rules = : l and = : r, the set CL(j)
is just unsatisfiable. Clearly the rules = : l and = : r are sound only over equational interpretations.</p>
      </sec>
      <sec id="sec-3-4">
        <title>The next steps in CERES are</title>
        <p>
          (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) the computation of the proof projections j[C] w.r.t. clauses C 2 CL(j),
(
          <xref ref-type="bibr" rid="ref2">2</xref>
          ) the refutation of the set CL(j), resulting in an RP-tree g, i.e. in a deduction tree defined by the
inferences of resolution and paramodulation, and
(
          <xref ref-type="bibr" rid="ref3">3</xref>
          ) “inserting” the projections j[C] into the leaves of g.
        </p>
        <p>
          For step (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) we skip in j all inferences where the auxiliary resp. main formulas are ancestors of a
cut. Instead of the end-sequent S we get S C for a C 2 CL(j).
        </p>
        <p>
          Step (
          <xref ref-type="bibr" rid="ref2">2</xref>
          ) consists in ordinary theorem proving by resolution and paramodulation (which is
equationally complete). For refuting CL(j) any first-order resolution prover can be used. By the completeness
of the methods we find a refutation tree g as CL(j) is unsatisfiable by Theorem 1.
        </p>
        <p>
          Step (
          <xref ref-type="bibr" rid="ref3">3</xref>
          ) makes use of the fact that, after computation of the simultaneous most general unifier of
the inferences in g, the resulting tree g0 is actually a derivation in LKDe. Indeed, after computation of
the simultaneous unifier, paramodulation becomes = : l and = : r and resolution becomes cut in LKDe.
Note that the definition rules, like the logical rules, do not appear in g0. Now for every leaf n in g0, which
is labelled by a clause C0 (an instance of a clause C 2 CL(j)) we insert the proof projection j[C0]. The
result is a proof with only atomic cuts.
        </p>
        <p>The proof projection is only sound if the proof j is skolemized, i.e. there are no strong quantifiers
(i.e. quantifiers with eigenvariable conditions) in the end-sequent. If j is not skolemized a priori it can
be transformed into a skolemized proof j0 in polynomial (at most quadratic) time; for details see [3].</p>
        <p>For illustration, consider the following example:
j =
j1 =
j2 =
We have CL(j) = fP(u) ` Q(u); ` P(a); Q(v) `g. The resolution refutation d of CL(j)
j1 j2
(8x)(P(x) ! Q(x)) ` (9y)(P(a) ! Q(y))</p>
        <p>cut
P(u)? ` P(u) Q(u) ` Q(u)?</p>
        <p>P(u)?; P(u) ! Q(u) ` Q(u)? !: l</p>
        <p>P(u) ! Q(u) ` (P(u) ! Q(u))? !: r</p>
        <p>P(u) ! Q(u) ` (9y)(P(u) ! Q(y))? 9 : r
(8x)(P(x) ! Q(x)) ` (9y)(P(u) ! Q(y))? 8 : l
(8x)(P(x) ! Q(x)) ` (8x)(9y)(P(x) ! Q(y))? 8 : r</p>
        <p>P(a) ` P(a)? Q(v)? ` Q(v)</p>
        <p>P(a); (P(a) ! Q(v))? ` Q(v) !: l
(P(a) ! Q(v))? ` P(a) ! Q(v) !: r
(P(a) ! Q(v))? ` (9y)(P(a) ! Q(y)) 9 : r
(9y)(P(a) ! Q(y))? ` (9y)(P(a) ! Q(y)) 9 : l
(8x)(9y)(P(x) ! Q(y))? ` (9y)(P(a) ! Q(y)) 8 : l</p>
        <p>Q(a) ` R
does the job of refuting CL(j). By applying the most general unifier s of d , we obtain a ground
refutation g = d s :
This will serve as a skeleton for a proof in ACNF. To complete the construction, we combine the skeleton
with the following projections (grounded by s ):
j(C1) =
j(C2) =
j(C3) =</p>
        <p>P(a) ` P(a) Q(a) ` Q(a)</p>
        <p>P(a); P(a) ! Q(a) ` Q(a) !: l</p>
        <p>P(a); (8x)(P(x) ! Q(x)) ` Q(a) 8 : l
P(a); (8x)(P(x) ! Q(x)) ` (9y)(P(a) ! Q(y)); Q(a)
w : r
P(a) ` P(a)</p>
        <p>w : r</p>
        <p>P(a) ` P(a); Q(v)
` P(a) ! Q(v); P(a) !: r
` (9y)(P(a) ! Q(y)); P(a) 9 : r
(8x)(P(x) ! Q(x)) ` (9y)(P(a) ! Q(y)); P(a) w : l</p>
        <p>Q(a) ` Q(a)</p>
        <p>P(a); Q(a) ` Q(a) w : l</p>
        <p>Q(a) ` P(a) ! Q(a) !: r</p>
        <p>Q(a) ` (9y)(P(a) ! Q(y)) 9 : r
Q(a); (8x)(P(x) ! Q(x)) ` (9y)(P(a) ! Q(y)) w : l</p>
        <p>j(C2) j(C1)
B ` C; P(a) P(a); B ` C; Q(a)</p>
        <p>cut
B; B ` C;C; Q(a)</p>
        <p>B; B; B ` C;C;C</p>
        <p>B ` C</p>
        <p>j(C3)
Q(a); B ` C
contractions
cut
The composition of skeleton and projections yields
j(g) =
where B = (8x)(P(x) ! Q(x)), C = (9y)(P(a) ! Q(y)). Clearly, j(g) is a proof of the end-sequent of
j in ACNF.
3.4</p>
        <p>Cut-elimination, proof visualization and Herbrand sequent extraction with CERES
After compiling the HandyLK source to XML with HLK, it is possible to perform proof analysis using
the CERES tool. In our example, we are interested in extracting a Herbrand sequent from a cut-free proof
of the end-sequent of j. To this end, the CERES method is used, which extracts an unsatisfiable set of
clauses from the proof and refutes it. In theory, any resolution prover can be used to refute this set of
clauses; Table 1 lists the format conversions currently supported by our implementation.</p>
        <p>In our example, we use Otter to find a resolution refutation. Running CERES yields the following
characteristic clause set in Otter input format
-&gt; =("f"(+(x4, x5)), "0"), =("f"(+(x5, x4)), "1"). % (C1)
=("f"(+(x4, x6)), "0"), =("f"(+(+(+(x4, x6), "1"), x7)), "0") -&gt;. % (C2)
=("f"(+(x4, x6)), "1"), =("f"(+(+(+(x4, x6), "1"), x7)), "1") -&gt;. % (C3)
which corresponds to the set of clauses
which is refutable in arithmetic.</p>
        <p>The Otter proof object resulting from the successful refutation is then converted into a resolution tree
in XML format. According to the CERES method, this refutation is then transformed into an LKDe
proof j0 of the end-sequent of j in ACNF. j0 consists of 1262 rules, and is clearly too large to be
depicted here (a small part of it can be seen in Figure 2(b)). From j0, a Herbrand sequent is extracted
and stored in XML as a list of sequents (containing only the Herbrand sequent). All proofs and sequents
computed during the analysis can be visualized using ProofTool as seen in Figure 2.
(a) Visualizing the input proof j.</p>
        <p>(b) A bird’s eye view on the output proof j0.</p>
        <p>To make the Herbrand sequent extracted in our example easier to read, the terms have been simplified
modulo commutativity and associativity of +, and the following abbreviations are used:
p1 = x + z + 2; p2 = x + z + 4; p3 = z + 1;
p5 = w + x + y + z + 6; p6 = w + x + y + z + 8; p7 = w + x + y + z + 4;
p4 = x + y + z + 3;
With this, the Herbrand sequent of the ACNF of the tape proof is
f (p1) = 0 _ f (p1) = 1; f (p2) = 0 _ f (p2) = 1; f (p3) = 0 _ f (p3) = 1;
f (p4) = 0 _ f (p4) = 1; f (p5) = 0 _ f (p5) = 1; f (p6) = 0 _ f (p6) = 1;
f (p7) = 0 _ f (p7) = 1
`
p1 6= p2 ^ f (p1) = f (p2); p3 6= p1 ^ f (p3) = f (p1);
p3 6= p2 ^ f (p3) = f (p2); p1 6= p4 ^ f (p1) = f (p4);
p5 6= p6 ^ f (p5) = f (p6); p7 6= p5 ^ f (p7) = f (p5);
p7 6= p6 ^ f (p7) = f (p6); p4 6= p7 ^ f (p4) = f (p7):
Clearly, the Herbrand sequent is much smaller than the ACNF from which it is extracted. As it contains
all quantifier instantiation information from the proof, it is much better suited to human interpretation
than the proof itself. The Herbrand sequent we extracted from j0 can be interpreted as the following
proof:
Theorem 2. On a tape with infinitely many cells where each cell is labelled 0 or 1, there are two distinct
cells that are labelled the same.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Proof. It is easy to see that the following inequalities hold:</title>
        <p>p1 6= p2; p3 6= p1; p3 6= p2; p1 6= p4;
p5 6= p6; p7 6= p5; p7 6= p6; p4 6= p7:
We may therefore delete their occurrences from the right side of the Herbrand sequent and obtain the
simplified sequent
`
f (p1) = f (p2); f (p3) = f (p1); f (p3) = f (p2); f (p1) = f (p4);
f (p5) = f (p6); f (p7) = f (p5); f (p7) = f (p6); f (p4) = f (p7):
Now, we assume that this sequent is false and obtain a contradiction. If the sequent is false, then all
formulas on the left side are true and all formulas on the right side are false, so we may assume f (pi) =
0 _ f (pi) = 1 for i 2 f1; : : : ; 7g. Let f (p1) = a with a 2 f0; 1g and let a¯ = 1 a. We have assumed that
the first formula on the right side is false, so f (p2) = a¯. From the second formula we obtain f (p3) = a¯,
and from the third f (p3) = a¯¯ = a, which yields the desired contradiction.</p>
        <p>Note that the proof extracted from the Herbrand sequent uses only 3 cells even though the sequent
itself is not minimal. Still, the proof yields a minimal Herbrand sequent directly:
f (p1) = 0 _ f (p1) = 1; f (p2) = 0 _ f (p2) = 1; f (p3) = 0 _ f (p3) = 1;
`
p1 6= p2 ^ f (p1) = f (p2); p3 6= p1 ^ f (p3) = f (p1); p3 6= p2 ^ f (p3) = f (p2):
Comparing the proof obtained from the Herbrand sequent with the original proof, we have gained an
important piece of information: in the original proof, it is shown that either infinitely many cells are
labelled 0 or 1, while in the cut-free proof, only finitely many cells are used in the proof.
4</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Open problems and future work</title>
      <p>On the theoretical side we are working on the following two major extensions of the CERES-method:
The first is an extension of CERES to second-order logic, a first step in that direction has been achieved
by extending it to the fragment of second-order proofs defined by containing only quantifier-free
comprehension in [9]. This extension will allow the formalization of a much broader range of mathematical
proofs and yield a more convenient proof formalization. The second theoretical extension enables the
method to work without skolemization whose main benefit will be the ability to eliminate single cuts or
even parts of a cut which is more realistic in mathematical applications than the simultaneous elimination
of all cuts.</p>
      <p>Concerning the practical aspects, it became clear in our experiments with the system that there are
two bottlenecks: 1. the formalization of the input proof and 2. finding a resolution refutation of the
characteristic clause set. Accordingly also these two points are of major concern for our future work.</p>
      <p>To ease the formalization of the input proof, we plan to enhance the capabilities of HLK, in
particular those of the auto propositional-mode to cover also reasoning in equational background theories
specified by term rewriting systems. We expect this feature to greatly reduce the amount of time spent on
proof formalization, as large parts of formalized proofs consist of equational reasoning. In the long term,
however, an interesting option would be to use one of the available proof assistants for the formalization
of proofs as these are highly developed tools. The theoretical obstacle to the use of existing proof
assistants lies in the fact that a translation of the formalized proof is necessary, as proof assistants typically
work in set theory or higher-order logic. Moreover, this translation should not be uniform because,
depending on the formalized proof and the aims of the proof analysis, one logical translation will be more
useful than another. The main practical obstacle lies in transforming proof objects generated by the proof
assistant (if they are provided at all) into the sequent calculus format as described in Section 3.2. Here, a
lot of work will have to go into importing the standard library of basic number systems, basic operations,
data structures, etc.</p>
      <p>The search for a refutation of the characteristic clause set turned out to be a hard problem for current
theorem provers (as described in [2]). To handle this problem, we need automated theorem provers with
high flexibility, allowing an interactive construction of the resolution refutation. For using several provers
in order to combine their respective advantages it would be very beneficial to have a standard output
format for encoding resolution refutations, a goal which is partially realized by the TPTP output format.
Another approach to enhance the power of the provers is to exploit the fact that the characteristic clause
sets of the CERES-method form a very specific subclass of theorem proving problems. In particular, it
seems promising to store in the clause set certain information about the structure of the original proof as
hints on how to find a refutation and to develop resolution refinements for CERES to use this additional
information.</p>
      <p>Another important area for future improvement is the human-readable representation of the output
proofs as well as other results of the analysis (e.g. Herbrand sequents); this aspect is also critical for
the method to adapt better to larger proofs. As the terms generated by cut-elimination are frequently
very large it would pay out to implement term simplifiers. A general approach to the simplification of
Herbrand sequents lies in computing a minimal variant of it by most general unification and a (possibly
external) tautology checker. For specific theories, e.g. arithmetic, one can even do better by using term
simplification algorithms based on term rewriting systems. Another useful addition is to include a flexible
handling of definitions in the extraction and display of Herbrand sequents which would make them an
even more powerful tool for understanding the mathematical content of a formal proof. These changes
will further contribute to the creation of an interface for proof analysis which is attractive to logicians
and mathematicians alike.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Baaz</surname>
          </string-name>
          , Stefan Hetzl, Alexander Leitsch, Clemens Richter, and
          <string-name>
            <given-names>Hendrik</given-names>
            <surname>Spohr</surname>
          </string-name>
          .
          <article-title>Proof transformation by CERES</article-title>
          .
          <source>In Lecture Notes in Artificial Intelligence</source>
          , volume
          <volume>4108</volume>
          , pages
          <fpage>82</fpage>
          -
          <lpage>93</lpage>
          . Mathematical Knowledge Management, Springer Berlin,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Baaz</surname>
          </string-name>
          , Stefan Hetzl, Alexander Leitsch, Clemens Richter, and
          <string-name>
            <given-names>Hendrik</given-names>
            <surname>Spohr</surname>
          </string-name>
          .
          <article-title>Ceres: An Analysis of Fu¨rstenberg's Proof of the Infinity of Primes</article-title>
          .
          <source>Theoretical Computer Science</source>
          ,
          <volume>403</volume>
          :
          <fpage>160</fpage>
          -
          <lpage>175</lpage>
          ,
          <year>August 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Baaz</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Leitsch</surname>
          </string-name>
          .
          <article-title>Cut normal forms and proof complexity</article-title>
          .
          <source>Annals of Pure and Applied Logic</source>
          ,
          <volume>97</volume>
          (
          <issue>1-3</issue>
          ):
          <fpage>127</fpage>
          -
          <lpage>177</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Baaz</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Leitsch</surname>
          </string-name>
          .
          <article-title>Cut-elimination and Redundancy-elimination by Resolution</article-title>
          .
          <source>Journal of Symbolic Computation</source>
          ,
          <volume>29</volume>
          (
          <issue>2</issue>
          ):
          <fpage>149</fpage>
          -
          <lpage>176</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Matthias</given-names>
            <surname>Baaz</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Leitsch</surname>
          </string-name>
          .
          <article-title>Towards a clausal analysis of cut-elimination</article-title>
          .
          <source>Journal of Symbolic Computation</source>
          ,
          <volume>41</volume>
          :
          <fpage>381</fpage>
          -
          <lpage>410</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Gentzen</surname>
          </string-name>
          .
          <article-title>Untersuchungen u¨ber das logische Schließen</article-title>
          .
          <source>Mathematische Zeitschrift</source>
          ,
          <volume>39</volume>
          :
          <fpage>176</fpage>
          -
          <lpage>210</lpage>
          ,
          <fpage>405</fpage>
          -
          <lpage>431</lpage>
          ,
          <fpage>1934</fpage>
          -
          <lpage>1935</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Jean-Yves Girard</surname>
          </string-name>
          .
          <source>Proof Theory and Logical Complexity. Elsevier</source>
          ,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Hetzl</surname>
          </string-name>
          , Alexander Leitsch, Daniel Weller, and Bruno Woltzenlogel Paleo.
          <article-title>Herbrand sequent extraction</article-title>
          .
          <source>In Lecture Notes in Artificial Intelligence</source>
          , volume
          <volume>5144</volume>
          , pages
          <fpage>462</fpage>
          -
          <lpage>477</lpage>
          . Mathematical Knowledge Management, Springer Berlin,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Hetzl</surname>
          </string-name>
          , Alexander Leitsch, Daniel Weller, and
          <article-title>Bruno Woltzenlogel Paleo. A Clausal Approach to Proof Analysis in Second-Order Logic</article-title>
          .
          <source>In Symposium on Logical Foundations of Computer Science (LFCS</source>
          <year>2009</year>
          ), Lecture Notes in Computer Science. Springer,
          <year>2009</year>
          . to appear.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Leitsch</surname>
          </string-name>
          and
          <string-name>
            <given-names>Clemens</given-names>
            <surname>Richter</surname>
          </string-name>
          .
          <article-title>Equational theories in ceres</article-title>
          . unpublished (available at http: //www.logic.at/ceres/),
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <article-title>The Coq development team. The Coq proof assistant reference manual</article-title>
          .
          <source>LogiCal Project</source>
          ,
          <year>2004</year>
          . Version 8.0.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Tobias</surname>
            <given-names>Nipkow</given-names>
          </string-name>
          , Lawrence C. Paulson, and Markus Wenzel.
          <article-title>Isabelle/HOL - a proof assistant for HigherOrder Logic</article-title>
          , volume
          <volume>2283</volume>
          <source>of LNCS</source>
          . Springer Berlin,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Urban</surname>
          </string-name>
          .
          <article-title>Classical Logic and Computation</article-title>
          .
          <source>PhD thesis</source>
          , University of Cambridge Computer Laboratory,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>