=Paper=
{{Paper
|id=Vol-192/paper-11
|storemode=property
|title=System Description: The Cut-Elimination System CERES
|pdfUrl=https://ceur-ws.org/Vol-192/paper11.pdf
|volume=Vol-192
}}
==System Description: The Cut-Elimination System CERES==
System Description:
The Cut-Elimination System CERES ∗
Matthias Baaz1 , Stefan Hetzl2 , Alexander Leitsch2 ,
Clemens Richter2 , Hendrik Spohr2
1
Institute of Discrete Mathematics and Geometry (E104),
Vienna University of Technology, Wiedner Hauptstraße 8-10,
1040 Vienna, Austria
baaz@logic.at
2
Institute of Computer Languages (E185),
Vienna University of Technology, Favoritenstraße 9,
1040 Vienna, Austria
{hetzl|leitsch|richter|spohr}@logic.at
Abstract
Cut-elimination is the most prominent form of proof transformation in logic.
The elimination of cuts in formal proofs corresponds to the removal of intermediate
statements (lemmas) in mathematical proofs. The cut-elimination method CERES
(cut-elimination by resolution) works by constructing a set of clauses from a proof
with cuts. Any resolution refutation of this set then serves as a skeleton of an
LK-proof with only atomic cuts.
The use of resolution and the enormous size of (formalized) mathematically rele-
vant proofs suggest an implementation able to handle rather complex cut-elimination
problems. In this paper we present an implementation of CERES: the system CERES.
It already implements an improvement based on an extension of LK to the calcu-
lus LKDe containing equality rules and rules for introduction of definitions which
makes it easier to formalize and interpret mathematical proofs in LK. Furthermore
it increases the efficiency of the cut-elimination method. The system CERES already
performs well in handling quite large proofs.
1 Introduction
Proof analysis is a central mathematical activity which proves crucial to the development
of mathematics. Indeed many mathematical concepts such as the notion of group or the
notion of probability were introduced by analyzing existing arguments. In some sense
the analysis and synthesis of proofs form the very core of mathematical progress [12, 13].
Cut-elimination introduced by Gentzen [8] is the most prominent form of proof
transformation in logic and plays an important role in automating the analysis of math-
ematical proofs. The removal of cuts corresponds to the elimination of intermediate
∗
supported by the Austrian Science Fund (project no. P17995-N12)
Empirically Successful Computerized Reasoning 159
statements (lemmas) from proofs resulting in a proof which is analytic in the sense,
that all statements in the proof are subformulas of the result. Therefore, the proof of a
combinatorial statement is converted into a purely combinatorial proof. Cut elimination
is therefore an essential tool for the analysis of mathematical proofs, especially to make
implicit parameters explicit. Cut free derivations allow for
• the extraction of Herbrand disjunctions, which can be used to establish bounds
on existential quantifiers (e.g. Luckhardt’s analysis of the Theorem of Roth [11]).
• the construction of interpolants, which allow for the replacement of implicit defi-
nitions by explicit definitions according to Beth’s Theorem.
• the calculation of generalized variants of the end formula.
In a formal sense Girard’s analysis of van der Waerden’s theorem [9] is the application
of cut-elimination to the proof of Fürstenberg/Weiss with the “perspective” of obtaining
van der Waerden’s proof. Indeed an application of a complex proof transformation like
cut-elimination by humans requires a goal oriented strategy.
Note that cut-elimination is non-unique, i.e. there is no single cut-free proof which
represents the analytic version of a proof with lemmas. Therefore the application of
purely computational methods on existing proofs may even produce new interesting
proofs. Indeed, it is non-uniqueness which makes computational experiments with cut-
elimination interesting [3]. The experiments can be considered as a source for a base
of proofs in formal format which provide different mathematical and computational
information.
CERES [5, 6] is a cut-elimination method that is based on resolution. The method
roughly works as follows: The structure of the proof containing cuts is mapped to an
unsatisfiable set of clauses C (the characteristic clause set). A resolution refutation of
C, which is obtained using a first-order theorem prover, serves as a skeleton for the new
proof which contains only atomic cuts. In a final step also these atomic cuts can be
eliminated, provided the (atomic) axioms are valid sequents; but this step is of minor
mathematical interest and of low complexity. In the system CERES1 this method of cut-
elimination has been implemented. The system is capable of dealing with formal proofs
in an extended version of LK, among them also very large ones (i.e. proofs with more
than 1500 nodes and cut-elimination has been done within 14 seconds).
The extension of CERES to LKDe a calculus containing definition introductions and
equality rules [10] moves the system closer to real mathematical proofs. In particular,
introduction of definitions and concepts is perhaps the most significant activity of a
mathematician in structuring proofs and theories. By its high efficiency (the core of the
method is first-order theorem proving by resolution and paramodulation) CERES might
become a strong tool in automated proof mining and contribute to an experimental
culture of computer-aided proof analysis in mathematics.
1
available at http://www.logic.at/ceres/
160 Empirically Successful Computerized Reasoning
2 The System CERES
The cut-elimination system CERES is written in ANSI-C++. There are two main tasks.
One is to compute an unsatisfiable set of clauses characterizing the cut formulas. This
is done by automatically extracting the so-called characteristic clause set CL(ϕ). The
other is to generate a resolution refutation of CL(ϕ) by an external theorem prover2 ,
and to compute the necessary projections of ϕ w.r.t. the clauses in CL(ϕ) actually used
for the refutation. A projection of ϕ modulo a clause C ∈ CL(ϕ) is gained by applying
all inference rules of ϕ not operating on cut ancestors to the initial sequents of C. The
properly instantiated projections are concatenated, using the refutation obtained by the
theorem prover as a skeleton of a proof with only atomic cuts.
The extraction of the characteristic clause set CL(ϕ) from an LKDe-proof ϕ is
defined inductively. For every node ν of ϕ either:
• If ν is an occurrence of an axiom S and S ′ is the subsequent of S containing only
ancestors of cut formulas then Cν = { S ′ }.
• Let ν1 , ν2 be the predecessors of ν in a binary inference then we distinguish:
– The auxiliary formulas of ν1 , ν2 are ancestors of cut formulas then Cν =
Cν1 ∪ Cν2 .
– Otherwise Cν = Cν1 × Cν2 , where C × D = { C ◦ D | C ∈ C, D ∈ D } and C ◦ D
is the merge of the clauses C and D.
• Let ν ′ be the predecessor of ν in a unary inference then Cν = Cν ′ .
The characteristic clause set CL(ϕ) is defined as Cν where ν is the root node of ϕ.
The definition rules of LKDe directly correspond to the extension principle (see [7])
in predicate logic. It simply consists in introducing new predicate- and function sym-
bols as abbreviations for formulas and terms. Mathematically this corresponds to the
introduction of new concepts in theories. Let A be a first-order formula with the free
variables x1 , . . . , xk (denoted by A(x1 , . . . , xk ) and P be a new k-ary predicate symbol
(corresponding to the formula A). Then the rules are:
A(t1 , . . . , tk ), Γ ⊢ ∆ Γ ⊢ ∆, A(t1 , . . . , tk )
def P : l def P : r
P (t1 , . . . , tk ), Γ ⊢ ∆ Γ ⊢ ∆, P (t1 , . . . , tk )
for arbitrary sequences of terms t1 , . . . , tk . There are also definition introduction rules
for new function symbols which are of similar type. The equality rules are:
Γ1 ⊢ ∆1 , s = t A[s]Λ , Γ2 ⊢ ∆2 Γ1 ⊢ ∆1 , t = s A[s]Λ , Γ2 ⊢ ∆2
=: l1 =: l2
A[t]Λ , Γ1 , Γ2 ⊢ ∆1 , ∆2 A[t]Λ , Γ1 , Γ2 ⊢ ∆1 , ∆2
for inference on the left and
Γ1 ⊢ ∆1 , s = t Γ2 ⊢ ∆2 , A[s]Λ Γ1 ⊢ ∆1 , t = s Γ2 ⊢ ∆2 , A[s]Λ
=: r1 =: r2
Γ1 , Γ2 ⊢ ∆1 , ∆2 , A[t]Λ Γ1 , Γ2 ⊢ ∆1 , ∆2 , A[t]Λ
2
The current version of CERES uses the automated theorem prover Otter (see http://www-unix.mcs.
anl.gov/AR/otter/), but any refutational theorem prover based on resolution and paramodulation may
be used.
Empirically Successful Computerized Reasoning 161
on the right, where Λ denotes a set of positions of subterms where replacement of s by
t has to be performed. We call s = t the active equation of the rules. Note that, on
atomic sequents, the rules coincide with paramodulation – under previous application of
the most general unifier.
Concerning the extension of the CERES-method from LK to LKDe, equality rules
appearing within the input proof are propagated to the projections like any other binary
rules. During theorem proving equality is treated by paramodulation; its application
within the final clausal refutation is then transformed to the appropriate equality rules in
LKDe. The definition introductions do not require any other special treatment within
CERES than all other unary rules; in particular, they have no influence on the theorem
proving part.
Since the restriction to skolemized proofs is crucial to the CERES-method, the sys-
tem also performs skolemization (according to Andrew’s method [2]) on the input proof.
The system CERES expects an LKDe proof of a sequent S and a set of axioms as
input, and, after validation of the input proof, computes a proof of S containing at most
atomic-cuts. Input and output are formatted using the well known data representation
language XML. This allows the use of arbitrary and well known utilities for editing,
transformation and presentation and standardized programming libraries. To increase
the performance and avoid redundancy, most parts of the proofs are internally repre-
sented as directed acyclic graphs. This representation turns out to be very handy, also
for the internal unification algorithms.
The formal analysis of mathematical proofs (especially by a mathematician as a pre-
and post-“processor”) relies on a suitable format for the input and output of proofs,
and on an appropriate aid in dealing with them. We developed an intermediary proof
language connecting the language of mathematical proofs with LKDe and the compiler
HLK3 transforming proofs written in this higher-level language to LKDe. Furthermore
we implemented a proof tool4 with a graphical user interface, allowing a convenient input
and analysis of the output of CERES. Thereby the integration of definition- and equality-
rules into the underlying calculus plays an essential role in overlooking, understanding
and analyzing complex mathematical proofs by humans.
3 Example
The example proof below is taken from [14]; it was formalized in LK and analyzed by
a former version of CERES in the paper [3]. Here we use the extensions by equality
rules and definition-introduction in LKDe to give a simpler formalization and analysis
of the proof. The end-sequent formalizes the statement: on a tape with infinitely many
cells which are all labelled by 0 or by 1 there are at least two cells labelled by the same
number. f (x) = 0 expresses that the cell number x is labelled by 0. Indexing of cells
is done by number terms defined over 0, 1 and +. The proof ϕ below uses two lemmas:
(1) there are infinitely many cells labelled by 0 and (2) there are infinitely many cells
labelled by 1. The proof shows the statement by a case distinction: If there are infinitely
many cells labelled by 0 then choose two of them, otherwise there are infinitely many
3
available at http://www.logic.at/hlk/
4
available at http://www.logic.at/prooftool/
162 Empirically Successful Computerized Reasoning
cells labelled by 1 and we can then choose two of them. In each case there are two
cells with the same value. These lemmas are eliminated by CERES and a more direct
argument is obtained in the resulting proof ϕ′ . Ancestors of the cuts in ϕ are indicated
in boldface.
Let ϕ be the proof
(τ ) (ǫ0 )
A ⊢ I0 , I1 I0 ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q)) (ǫ1 )
cut
A ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q)), I1 I1 ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
cut
A ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
where τ =
(τ ′ )
A ⊢ f (n0 + n1 ) = 0, f (n1 + n0 ) = 1
∃r
A ⊢ f (n0 + n1 ) = 0, ∃k.f (n1 + k) = 1
∃r
A ⊢ ∃k.f (n0 + k) = 0, ∃k.f (n1 + k) = 1
∀: r
A ⊢ ∃k.f (n0 + k) = 0, ∀n∃k.f (n + k) = 1
∀: r
A ⊢ ∀n∃k.f (n + k) = 0, ∀n∃k.f (n + k) = 1
def I0 : r
A ⊢ I0 , ∀n∃k.f (n + k) = 1
def I1 : r
A ⊢ I0 , I 1
For τ ′ =
(Axiom)
⊢ n1 + n0 = n0 + n1 f (n1 + n0 ) = 1 ⊢ f (n1 + n0 ) = 1
=: l1
f (n0 + n1 ) = 0 ⊢ f (n0 + n1 ) = 0 f (n0 + n1 ) = 1 ⊢ f (n1 + n0 ) = 1
∨: l
f (n0 + n1 ) = 0 ∨ f (n0 + n1 ) = 1 ⊢ f (n0 + n1 ) = 0, f (n1 + n0 ) = 1
∀: l
∀x(f (x) = 0 ∨ f (x) = 1) ⊢ f (n0 + n1 ) = 0, f (n1 + n0 ) = 1
def A : l
A ⊢ f (n0 + n1 ) = 0, f (n1 + n0 ) = 1
And for i = 1, 2 we define the proofs ǫi =
ψ ηi
∧: r
f (s) = i, f (t) = i ⊢ s 6= t ∧ f (s) = f (t)
∃: r
f (s) = i, f (t) = i ⊢ ∃q(s 6= q ∧ f (s) = f (q))
∃: r
f (s) = i, f (t) = i ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
∃: l
f (n0 + k0 ) = i, ∃k.f (((n0 + k0 ) + 1) + k) = i ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
∀: l
f (n0 + k0 ) = i, ∀n∃k.f (n + k) = i ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
∃: l
∃k.f (n0 + k) = i, ∀n∃k.f (n + k) = i ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
∀: l
∀n∃k.f (n + k) = i, ∀n∃k.f (n + k) = i ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
c: l
∀n∃k.f (n + k) = i ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
def Ii : l
Ii ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
for s = n0 + k0 , t = ((n0 + k0 ) + 1) + k1 , and the proofs
ψ=
(axiom) (axiom)
⊢ (n0 + k0 ) + (1 + k1 ) = ((n0 + k0 ) + 1) + k1 n0 + k0 = (n0 + k0 ) + (1 + k1 ) ⊢
=: l1
n0 + k0 = ((n0 + k0 ) + 1) + k1 ⊢
¬: r
⊢ n0 + k0 6= ((n0 + k0 ) + 1) + k1
Empirically Successful Computerized Reasoning 163
and ηi =
(axiom)
f (t) = i ⊢ f (t) = i ⊢ i = i
=: r2
f (s) = i ⊢ f (s) = i f (t) = i ⊢ i = f (t)
=: r2
f (s) = i, f (t) = i ⊢ f (s) = f (t)
The characteristic clause set is (after variable renaming)
CL(ϕ) = {⊢ f (x + y) = 0, f (y + x) = 1; (C1 )
f (x + y) = 0, f (((x + y) + 1) + z) = 0 ⊢; (C2 )
f (x + y) = 1, f (((x + y) + 1) + z) = 1 ⊢} (C3 ).
The axioms used for the proof are the standard axioms of type A ⊢ A and instances of
⊢ x = x, of commutativity ⊢ x + y = y + x, of associativity ⊢ (x + y) + z = x + (y + z),
and of the axiom
x = x + (1 + y) ⊢,
expressing that x + (1 + y) 6= x for all natural numbers x, y.
The comparison with the analysis of Urban’s proof formulated in LK (without equal-
ity) [3] shows that the proof based on equality is much more transparent. In fact the
set of characteristic clauses contains only 3 elements (instead of 5), which are also sim-
pler. This also facilitates the refutation of the clause set and makes the output proof
simpler and more transparent. On the other hand, the analysis below shows that the
mathematical argument obtained by cut-elimination is the same as in [3].
The program Otter found the following refutation of CL(ϕ) (based on hyperresolu-
tion only — without equality inference):
The first hyperesolvent, based on the clash sequence (C2 ; C1 , C1 ), is
C4 = ⊢ f (y + x) = 1, f (z + ((x + y) + 1)) = 1, with the intermediary clause
D1 = f (((x + y) + 1) + z) = 0 ⊢ f (y + x) = 1.
The next clash is sequence is (C3 ; C4 , C4 ) which gives C5 with intermediary clause D2 ,
where:
C5 = ⊢ f (v ′ + u′ ) = 1, f (v + u) = 1,
D2 = f (x + y) = 1 ⊢ f (v + u) = 1.
Factoring C5 gives C6 : ⊢ f (v +u) = 1 (which roughly expresses that all fields are labelled
by 1). The final clash sequence (C3 ; C6 , C6 ) obviously results in the empty clause ⊢ with
intermediary clause D3 : f (((x + y) + 1) + z) = 1 ⊢. The hyperresolution proof ψ3 in form
of a tree can be obtained from the following resolution trees, where C ′ and ψ ′ stand for
renamed variants of C and of ψ, respectively:
ψ1 : =
C1 C2
res
D1 C1′
res
C4
164 Empirically Successful Computerized Reasoning
ψ2 : =
C3′ ψ1
res
D2 ψ1′
res
C5
factor
C6
ψ3 : =
ψ2 C3′′
res
D3 ψ2′
res
⊢
Instantiation of ψ3 by the uniform most general unifier σ of all resolutions gives a
deduction tree ψ3 σ in LKDe; indeed, after application of σ, resolution becomes cut
and factoring becomes contraction. The proof ψ3 σ is the skeleton of an LKDe-proof
of the end-sequent with only atomic cuts. Then the leaves of the tree ψ3 σ have to be
replaced by the proof projections. E.g., the clause C1 is replaced by the proof ϕ[C1 ],
where s = n0 + n1 and t = n1 + n0 :
(Axiom)
⊢ t = s f (t) = 1 ⊢ f (t) = 1
=: l
f (s) = 0 ⊢ f (s) = 0 f (s) = 1 ⊢ f (t) = 1
∨: l
f (s) = 0 ∨ f (s) = 1 ⊢ f (s) = 0, f (t) = 1
∀: l
∀x(f (x) = 0 ∨ f (x) = 1) ⊢ f (s) = 0, f (t) = 1
def A : l
A ⊢ f (s) = 0, f (t) = 1
w: r
A ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q)), f (s) = 0, f (t) = 1
Furthermore C2 is replaced by the projection ϕ[C2 ] and C3 by ϕ[C3 ], where (for
i = 0, 1) ϕ[C2+i ] =
ψ ηi
∧: r
f (s) = i, f (t) = i ⊢ s 6= t ∧ f (s) = f (t)
∃: r
f (s) = i, f (t) = i ⊢ ∃q(s 6= q ∧ f (s) = f (q))
∃: r
f (s) = i, f (t) = i ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
w: l
f (s) = i, f (t) = i, A ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
Note that ψ, η0 , η1 are the same as in the definition of ǫ0 , ǫ1 above.
By inserting the σ-instances of the projections into the resolution proof ψ3 σ and
performing some additional contractions, we eventually obtain the desired proof ϕ′ of
the end-sequent
A ⊢ ∃p∃q(p 6= q ∧ f (p) = f (q))
with only atomic cuts. ϕ′ no longer uses the lemmas that infinitely many cells are
labelled by 0 and by 1, respectively.
4 Intended Extensions of CERES And Future Work
We plan to develop the following extensions of CERES:
Empirically Successful Computerized Reasoning 165
• As the cut-free proofs are often very large and difficult to interpret, we intend
to provide the possibility to analyze certain characteristics of the cut-free proof
(which are simpler than the proof itself). An important example are Herbrand
sequents which may serve to extract bounds from proofs (see e.g. [11]). We plan
to develop algorithms for extracting Herbrand sequents (also from proofs of non-
prenex sequents as indicated in [4]) and for computing interpolants, which allow
for the replacement of implicit definitions by explicit ones according to Beth’s
Theorem.
• In the present version CERES eliminates all cuts at once. But - for the application
to real mathematical proofs human user interaction is required — in particular
only interesting cuts (i.e. lemmas of mathematical importance) deserve to be
eliminated, others should be integrated as additional axioms.
• As CERES requires the skolemization of the end-sequent the original proof must be
transformed to Skolem form. We plan to develop an efficient reversed-skolemization-
algorithm, which transforms the theorem to be proved into its original form.
• A great challenge in the formal analysis of mathematical proofs lies in provid-
ing a suitable format for the input and output of proofs. We plan to improve
our intermediary proof language and to move closer to the “natural” language of
mathematical proofs.
To demonstrate the abilities of CERES and the feasibility of formalizing and analyzing
complex proofs of mathematical relevance, we currently investigate a well known proof of
the infinity of primes using topology (which may be found in [1]). Our aim is to eliminate
the topological concepts from the proof by means of CERES, breaking it down to a proof
solely based on elementary number arithmetic. This way one can obtain new insights
into the construction of prime numbers contained in the topological proof.
References
[1] M. Aigner, G. M. Ziegler: Proofs from THE BOOK. Springer, 1998.
[2] P. B. Andrews: Resolution in Type Theory, Journal of Symbolic Logic, 36, pp. 414–
432, 1971.
[3] M. Baaz, S. Hetzl, A. Leitsch, C. Richter, H. Spohr: Cut-Elimination: Experiments
with CERES, Proc. LPAR 2004, pp. 481–495, Springer, 2004.
[4] M. Baaz, A. Leitsch: On Skolemization and Proof Complexity, Fundamenta Infor-
maticae, 20(4), pp. 353–379, 1994.
[5] M. Baaz, A. Leitsch: Cut-Elimination and Redundancy-Elimination by Resolution,
Journal of Symbolic Computation, 29, pp. 149–176, 2000.
[6] M. Baaz, A. Leitsch: Towards a Clausal Analysis of Cut-Elimination, Journal of
Symbolic Computation, 41, pp. 381–410, 2006.
166 Empirically Successful Computerized Reasoning
[7] E. Eder: Relative complexities of first-order calculi, Vieweg, 1992.
[8] G. Gentzen: Untersuchungen über das logische Schließen, Mathematische
Zeitschrift, 39, pp. 405–431, 1934–1935.
[9] J. Y. Girard: Proof Theory and Logical Complexity, in Studies in Proof Theory,
Bibliopolis, Napoli, 1987.
[10] A. Leitsch, C. Richter: Equational Theories in CERES, unpublished (available
at http://www.logic.at/ceres/), 2005.
[11] H. Luckhardt: Herbrand-Analysen zweier Beweise des Satzes von Roth: polynomi-
ale Anzahlschranken, The Journal of Symbolic Logic, 54, pp. 234–263, 1989.
[12] G. Polya: Mathematics and plausible reasoning, Volume I: Induction and Analogy
in Mathematics, Princeton University Press, Princeton, New Jersey, 1954.
[13] G. Polya: Mathematics and plausible reasoning, Volume II: Patterns of Plausible
Inference, Princeton University Press, Princeton, New Jersey, 1954.
[14] C. Urban: Classical Logic and Computation Ph.D. Thesis, University of Cambridge
Computer Laboratory, 2000.
Empirically Successful Computerized Reasoning 167