=Paper= {{Paper |id=Vol-192/paper-5 |storemode=property |title=Translating Higher-Order Problems to First-Order Clauses |pdfUrl=https://ceur-ws.org/Vol-192/paper05.pdf |volume=Vol-192 }} ==Translating Higher-Order Problems to First-Order Clauses== https://ceur-ws.org/Vol-192/paper05.pdf

Translating Higher-Order Problems to First-Order Clauses
Jia Meng (1), Lawrence C. Paulson (2)
(1) National ICT, Australia, jiameng@nicta.com.au
(2) Computer Laboratory, University of Cambridge, U.K., LP15@cam.ac.uk

Abstract
Proofs involving large specifications are typically carried out through interactive
provers that use higher-order logic. A promising approach to improve the automa-
tion of interactive provers is by integrating them with automatic provers, which
are usually based on first-order logic. Consequently, it is necessary to translate
higher-order logic formulae to first-order form. This translation should ideally be
both sound and practical. We have implemented three higher-order to first-order
translations, with particular emphasis on the translation of types. Omitting some
type information improves the success rate, but can be unsound, so the interactive
prover must verify the proofs. In this paper, we will describe our translations and
experimental data that compares the three translations in respect of their success
rates for various automatic provers.

1 Introduction
Interactive theorem provers, such as HOL4 [GM93], Isabelle [NPW02] and
PVS [ORR+96], are widely used for formal verification and specification. They provide
expressive formalisms and tools for managing large scale proof projects. However, a ma-
jor weakness of interactive provers is the lack of automation. In order to overcome the
problem, we have integrated Isabelle with automatic theorem provers (ATPs) [MQP06].
ATPs combine a variety of reasoning methods and do not require the user to say how
or when to use an axiom. For example, they do not require
equalities to be oriented, but use them as undirected equations.
Among the many logics used by interactive provers, higher-order logic (HOL) is the
most popular one because of its expressiveness. One of the most widely used logics in
Isabelle is Isabelle/HOL. In contrast, most of the powerful ATPs are based on first-order
logic (FOL). Therefore, it is important to translate HOL problems into FOL format.
Those HOL problems that do not involve function variables, predicate variables or
λ-abstractions can be translated directly to FOL format. However, we must be careful
about how to translate HOL problems that are truly higher-order. In particular, we need
to include the problem’s type information to preserve soundness. A sound approach is to
include all types for all terms. Unfortunately, this will result in large terms and clauses,
and much of the type information is redundant. A more compact type representation

70 Empirically Successful Computerized Reasoning
could enhance the performance of ATPs. Omitting some type information could lead to
unsoundness, which ultimately we will prevent through proof reconstruction (Sect. 2.5).
We have implemented three HOL to FOL translations, and we believe two of them are
new. We have also carried out extensive experiments in order to assess their effectiveness
with the provers E [Sch04], SPASS [Wei01] and Vampire [RV01].
Paper outline. We first describe three HOL to FOL translations and discuss their
soundness (Sect. 2). We then describe the experiments we ran (Sect. 3) and finally offer
some conclusions (Sect. 4).

2 Background
Higher-order logic (HOL) extends first-order logic (FOL) in several respects. The main
difference is that HOL terms can denote truth values and functions. Function values
can be expressed using λ-abstractions or by currying: that is, by applying a function
to fewer than the maximum number of arguments. In FOL, a function must always be
supplied the same number of arguments. In translating from HOL to FOL, the only way
to reconcile this difference is to regard all HOL functions as constants while providing
a two-argument function (called app and abbreviated to @ below) to express function
application. In addition, we need a predicate B to convert all top-level FOL terms of
boolean type to predicates. These translations allow first-order provers to solve many
problems that contain higher-order features, though they do not yield the full power of
higher-order logic.
For example, the HOL formula ∀F p(F(x)) is translated to

{++B(@(p,@(F,x)))}
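
The example above can be reproduced with a few lines of code. The following is a minimal sketch, not the implementation used in Isabelle: the class names Atom and App and the functions to_fol and to_literal are hypothetical, and types are ignored entirely at this stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """A HOL constant or variable; it becomes a first-order constant or variable."""
    name: str


@dataclass(frozen=True)
class App:
    """Curried HOL application of a function term to a single argument."""
    fun: object
    arg: object


def to_fol(term):
    """Render a HOL term as a first-order term built with the binary symbol @."""
    if isinstance(term, Atom):
        return term.name
    return f"@({to_fol(term.fun)},{to_fol(term.arg)})"


def to_literal(term, positive=True):
    """Wrap a boolean-valued term in the predicate B to obtain a literal."""
    sign = "++" if positive else "--"
    return f"{sign}B({to_fol(term)})"


# p(F(x)), where F is a function variable (hypothetical example data)
formula = App(Atom("p"), App(Atom("F"), Atom("x")))
print(to_literal(formula))  # ++B(@(p,@(F,x)))
```

Because every application goes through @, the function variable F ends up in an ordinary argument position, where a first-order prover can instantiate it like any other variable.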

We use the combinators I, K, C, B and S to represent λ-abstractions, asserting the
combinator reduction equations as axioms. (Although K and S suffice in theory, the
resulting translation is exponential in the number of abstractions.) Another axiom we
assert is function extensionality:

∀f g [(∀x f(x) = g(x)) → f = g].

It has the following clause form, where e is a reserved Skolem function symbol, yielding
some x such that f(x) ≠ g(x).

{--equal(@(F,@(@(e,F),G)),@(G,@(@(e,F),G))),
++equal(F,G)}
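
The combinator treatment of λ-abstractions mentioned above can be illustrated by a standard bracket-abstraction procedure. The sketch below is an assumption about the general technique, not a transcription of our code: it uses the textbook rules for I, K, B, C and S without further optimisations, represents terms as nested tuples, and assumes that no object-level variable is named after a combinator.

```python
def app(f, *args):
    """Build a left-nested application f a1 a2 ..."""
    for a in args:
        f = ("app", f, a)
    return f


def free_in(x, t):
    """Does variable x occur free in term t?"""
    if isinstance(t, str):
        return t == x
    if t[0] == "app":
        return free_in(x, t[1]) or free_in(x, t[2])
    return t[1] != x and free_in(x, t[2])  # ("lam", var, body)


def abstract(x, t):
    """Return a lambda-free term equal to (lambda x. t); t must be lambda-free."""
    if t == x:
        return "I"
    if not free_in(x, t):
        return app("K", t)
    _, f, a = t  # x occurs in t and t is not x, so t is an application
    if not free_in(x, f):
        return app("B", f, abstract(x, a))
    if not free_in(x, a):
        return app("C", abstract(x, f), a)
    return app("S", abstract(x, f), abstract(x, a))


def remove_lambdas(t):
    """Eliminate all abstractions, innermost first."""
    if isinstance(t, str):
        return t
    if t[0] == "app":
        return app(remove_lambdas(t[1]), remove_lambdas(t[2]))
    return abstract(t[1], remove_lambdas(t[2]))


def show(t):
    """Print a term with explicit parentheses around applications."""
    return t if isinstance(t, str) else f"({show(t[1])} {show(t[2])})"


# lambda x. f (g x)
composed = ("lam", "x", app("f", app("g", "x")))
print(show(remove_lambdas(composed)))  # ((B f) ((B g) I))
```

The rules for B and C avoid copying x into subterms where it does not occur, which is why using them keeps the output smaller than a translation that relies on K and S alone.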

Finally, equality in Isabelle may appear in a λ-abstraction, and thus is treated as
an ordinary function symbol. We use a new function symbol fequal to represent the
function version of equality. Through λ-reduction, this equality may be promoted to
predicate level, becoming an ordinary equality. Promotion requires two additional ax-
ioms:

{++B(@(@(fequal,X),Y)),--equal(X,Y)}
{--B(@(@(fequal,X),Y)),++equal(X,Y)}

Not all subgoals require the full power of HOL. Often the initial steps of the proof
replace complicated constructions by simple ones. Of the remaining subgoals, many
are purely first-order. Others are higher-order but use no λ-abstractions. These special
cases admit more efficient translations into first-order clauses, though naturally we must
also provide a translation that accommodates the general case.

2.1 Types in Isabelle/HOL
We have just seen how to represent HOL formulae in FOL form. A more important
issue is to embed type information of HOL formulae in FOL clauses. Let us review how
this works for problems that are already first-order [MP04, MQP06]. Before that, we
give a brief overview of Isabelle/HOL’s polymorphically sorted type system. We refer
readers to the two papers above for more information.
Isabelle/HOL supports axiomatic type classes, where a type class is a set of types.
For example, the type for real numbers real is a member of type class linorder. A type
class is axiomatic because it may have a set of properties—specified by axioms—that
all its member types should satisfy. A type may belong to several type classes and an
intersection of type classes is a sort. Moreover, each type constructor has one or more
arities, which describe how the result type class depends upon the arguments’ type
classes. For example, type constructor list has an arity that says if its argument is a
member of class linorder then the resulting list’s type is also a member of linorder.
Constants can be overloaded and types can be polymorphic, allowing instantiation
to more specific types. For example, the ≤ operator has the polymorphic type α →
α → bool; when it has type nat → nat → bool it denotes the usual ordering of the
natural numbers, and when it has type α set → α set → bool it denotes the subset
relation. The latter type is still polymorphic in the type of the set’s elements. Isabelle’s
overloading cannot be eliminated by preprocessing because polymorphic theorems about
≤ are applicable to all instances of this function, despite their different meanings.
When we translate Isabelle formulae to FOL clauses, we need to formalize types,
especially in view of Isabelle’s heavy use of overloading. We need to ensure that Isabelle
theorems involving polymorphic functions are only used for appropriate types. To ac-
complish this, polymorphic functions carry type information as additional arguments
and we translate Isabelle types to FOL terms. For example, we translate x ≤ y where
x and y are α set to le(x, y, set(α)). We also translate Isabelle’s axiomatic type classes
into first-order clauses. For this, we translate type classes to FOL predicates and types
to FOL terms.
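
The first-order treatment of types just described can be sketched as follows. This is an illustration under stated assumptions: the names TVar, TCon, type_to_fol, poly_call and class_literal are hypothetical, and the exact clause syntax used by the translation in [MP04] may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TVar:
    """A type variable such as alpha."""
    name: str


@dataclass(frozen=True)
class TCon:
    """A type constructor applied to zero or more argument types."""
    name: str
    args: tuple = ()


def type_to_fol(ty):
    """Translate an Isabelle type to a first-order term."""
    if isinstance(ty, TVar) or not ty.args:
        return ty.name
    inner = ",".join(type_to_fol(a) for a in ty.args)
    return f"{ty.name}({inner})"


def poly_call(symbol, term_args, type_args):
    """A polymorphic function carries its type instantiation as extra arguments."""
    parts = list(term_args) + [type_to_fol(t) for t in type_args]
    return f"{symbol}({','.join(parts)})"


def class_literal(type_class, ty, positive=True):
    """Type classes become unary predicates over type terms."""
    sign = "++" if positive else "--"
    return f"{sign}{type_class}({type_to_fol(ty)})"


alpha = TVar("a")
print(poly_call("le", ["x", "y"], [TCon("set", (alpha,))]))  # le(x,y,set(a))
print(class_literal("linorder", TCon("real")))               # ++linorder(real)
```

With the extra argument in place, a theorem about ≤ on sets can only unify with goals whose type argument is also of the form set(...).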
This translation is reasonably compact, and it enforces overloading (≤ on sets is
not confused with ≤ on integers), but it does not capture other aspects of types. For
example, if we declare a two-element datatype two, then we obtain the theorem

∀x [x = a ∨ x = b].

The corresponding clause does not mention type two:
{++equal(X,a), ++equal(X,b)}
It therefore asserts that the universe consists of two elements; given our other axioms,
ATPs easily detect the inconsistency. We simply live with this risk for the moment,

pending the implementation of proof reconstruction. A simple way of detecting such
issues is to check whether a proof refers to at least one conjecture clause: if not, then the
axiom clauses by themselves are inconsistent. This method detects some invalid proofs,
but not all of them.
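
The conjecture check described above amounts to a reachability test over the proof's inference graph. The sketch below assumes a hypothetical proof representation (a dictionary from clause identifiers to a role and a list of parent identifiers); real ATP output would first have to be parsed into this form.

```python
def uses_conjecture(steps, root):
    """Return True if the derivation of `root` depends on a conjecture clause.

    `steps` maps a clause id to (role, parent_ids), where role is
    "axiom", "conjecture" or "derived" (a hypothetical encoding).
    """
    seen = set()
    stack = [root]
    while stack:
        clause_id = stack.pop()
        if clause_id in seen:
            continue
        seen.add(clause_id)
        role, parents = steps[clause_id]
        if role == "conjecture":
            return True
        stack.extend(parents)
    return False


# Hypothetical proof: clause 4 is derived from axioms only, clause 5 also
# uses the conjecture clause 3.
steps = {
    1: ("axiom", []),
    2: ("axiom", []),
    3: ("conjecture", []),
    4: ("derived", [1, 2]),
    5: ("derived", [4, 3]),
}
print(uses_conjecture(steps, 4))  # False: the axioms alone are inconsistent
print(uses_conjecture(steps, 5))  # True
```

A result of False for the empty clause signals that the refutation came from the axioms alone, which is the symptom produced by clauses such as the one for datatype two. A result of True does not establish soundness, in line with the remark that the method does not catch every invalid proof.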
Since HOL problems require currying, we need a different type embedding method
from first-order ones. We have implemented three type translations, namely fully-typed,
partial-typed and constant-typed.

2.2 The Fully-Typed Translation
The fully-typed translation, which resembles Hurd’s translation [Hur02], is sound. The
special function typeinfo, which we abbreviate to T, pairs each term with its type. For
instance, the formula P < Q is translated to

{++B(T(@(T(@(T(<, a=>a=>bool), T(P,a)), a=>bool), T(Q,a)), bool))}

This translation is sound because it includes types for all terms and subterms, right
down to the variables. When two terms are unified during a resolution step, their types
are unified as well. This instantiation of types guarantees that terms created in the
course of a proof continue to carry correct types. Isabelle unifies polymorphic terms
similarly. In fact, the resolution steps performed by an ATP could in principle be
reconstructed in Isabelle. Each FOL axiom clause corresponds to an Isabelle theorem.
If two FOL clauses are resolved, then the resolvent FOL clause will correspond to the
Isabelle theorem produced by Isabelle’s own resolution rule.
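
The fully-typed encoding of the formula P < Q can be generated mechanically once every atom carries its type. The following sketch is illustrative only: the tuple representation and the helper names are hypothetical, types are printed in the abbreviated a=>a=>bool notation used above rather than as genuine first-order terms, and each term is assumed to be a well-typed monomorphic instance.

```python
def fun(dom, ran):
    """Function type dom => ran; base types and type variables are strings."""
    return ("fun", dom, ran)


def atom(name, ty):
    return ("atom", name, ty)


def app(f, a):
    return ("app", f, a)


def show_type(ty):
    """Print a type, with => associating to the right."""
    if isinstance(ty, str):
        return ty
    _, dom, ran = ty
    left = dom if isinstance(dom, str) else f"({show_type(dom)})"
    return f"{left}=>{show_type(ran)}"


def type_of(term):
    """Compute the type of a term, checking each application."""
    if term[0] == "atom":
        return term[2]
    _, f, a = term
    fty = type_of(f)
    if isinstance(fty, str) or fty[1] != type_of(a):
        raise TypeError("ill-typed application")
    return fty[2]


def fully_typed(term):
    """Pair every subterm with its type using the function T."""
    ty = show_type(type_of(term))
    if term[0] == "atom":
        return f"T({term[1]},{ty})"
    _, f, a = term
    return f"T(@({fully_typed(f)},{fully_typed(a)}),{ty})"


less = atom("<", fun("a", fun("a", "bool")))
formula = app(app(less, atom("P", "a")), atom("Q", "a"))
print(f"++B({fully_typed(formula)})")
```

The printed literal agrees with the clause shown above, up to whitespace, and the repeated occurrences of a and bool make the redundancy of the encoding visible.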
The fully-typed translation introduces much redundancy. Every part of a function
application is typed: the function’s type includes its argument and result types, which
are repeated in the translation of the function’s argument and again in the type attached
to the returned result. Through experiments (Sect. 3), we have found that these large terms
degrade the ATPs’ performance. A more compact HOL translation should improve the
success rate.
Hurd [Hur03] uses an untyped translation for the same reason. No term or predicate
has any type information. Because this translation can produce unsound proofs, Hurd
relies on proof reconstruction to verify them. If reconstruction fails, Hurd calls the ATP
again, using a typed translation. Hurd says that this happens less than one percent of
the time. This combination of an efficient but unsound translation with a soundness
check achieves both efficiency and soundness. We intend to take the same approach.
If we are to achieve a compact HOL translation, we will have to omit some types,
potentially admitting some unsound proofs. We cannot use a completely untyped trans-
lation because our requirements differ from Hurd’s. His tactic sends to ATPs a few
theorems that are chosen by users. In contrast, we send ATPs hundreds of theorems,
many involving overloading. Omitting the types from this large collection would result
in many absurd proofs where, for example, the operator ≤ simultaneously denotes “less
than or equal to” on integers and the subset relation. We have designed and experimented with two
compact HOL translations: the partial-typed and constant-typed translations. These
translations attach the most important type information (such as type instantiations of
polymorphic constants) that can block some incorrect resolutions.

2.3 The Partial-Typed Translation
The partial-typed translation only includes the types of functions in function calls. The
type is translated to a FOL term and is inserted as a third argument of the application
operator (@). Taking the previous formula P < Q as an example, we translate it to

{++B(@(@(<, P, a=>a=>bool), Q, a=>bool))}.

Here, the type of < is a=>a=>bool, and we include this type as an additional argument
of function application @.
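
For comparison, the partial-typed encoding can be sketched in the same style. As before, this is an illustration under assumptions rather than our implementation: the representation and helper names are hypothetical, and types are printed in the abbreviated notation used above.

```python
def fun(dom, ran):
    """Function type dom => ran; base types and type variables are strings."""
    return ("fun", dom, ran)


def atom(name, ty):
    return ("atom", name, ty)


def app(f, a):
    return ("app", f, a)


def show_type(ty):
    """Print a type, with => associating to the right."""
    if isinstance(ty, str):
        return ty
    _, dom, ran = ty
    left = dom if isinstance(dom, str) else f"({show_type(dom)})"
    return f"{left}=>{show_type(ran)}"


def type_of(term):
    """Type of a term that is assumed to be well-typed."""
    if term[0] == "atom":
        return term[2]
    return type_of(term[1])[2]  # result type of the applied function


def partial_typed(term):
    """Attach the applied function's type as the third argument of @."""
    if term[0] == "atom":
        return term[1]  # atoms that are not being applied carry no type
    _, f, a = term
    return f"@({partial_typed(f)},{partial_typed(a)},{show_type(type_of(f))})"


less = atom("<", fun("a", fun("a", "bool")))
formula = app(app(less, atom("P", "a")), atom("Q", "a"))
print(f"++B({partial_typed(formula)})")
```

Only the two applications are annotated here, whereas the fully-typed encoding of the same formula annotates five subterms.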
In a HOL formula, a function may be passed to another function as an argument. If
a function appears without arguments, we do not include its type. The FOL clauses are
derived from Isabelle formulae, which we know to be well-formed and type correct. The
partial-typed translation avoids the redundancy of the fully-typed translation. Most
of the time, this type encoding also ensures correct treatment of Isabelle overloading:
Isabelle’s overloaded constants are most likely to appear as operators (functions and
predicates) in formulae, whose types are inserted by the partial-typed encoding.
However, the partial-typed translation can still yield unsound proofs. It is vulnerable
to the example involving datatype two, described in Sect. 2.1 above.

2.4 The Constant-Typed Translation
In the constant-typed translation, we include types of polymorphic constants only. Fur-
thermore, we do not include a constant’s full type but only the instantiated values of its
type variables. Monomorphic constants do not need to carry types because their names
alone determine the types of their arguments. A polymorphic constant is translated to
a first-order function symbol. Its arguments, which represent types, are obtained by
matching its actual type against its declared type. This treatment of types is similar to
the one we use for problems that are already first-order.
Again considering our standard example, if P and Q are natural numbers (type nat),
we translate the formula P < Q to

{++B(@(@(<(nat), P), Q))}.

Similarly, if P and Q are sets (type α set), it becomes

{++B(@(@(<(set(a)), P), Q))}.
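
The type arguments in these clauses come from matching the constant's actual type against its declared type. A minimal sketch of that matching step follows. The representation and the names match, type_to_fol and constant_symbol are hypothetical, and the order in which type arguments are listed (first occurrence in the declared type) is an assumption.

```python
def tvar(name):
    return ("var", name)


def tcon(name, *args):
    return ("con", name, args)


def fun(dom, ran):
    return tcon("fun", dom, ran)


def match(pattern, actual, subst):
    """One-way matching: instantiate type variables of `pattern` only."""
    if pattern[0] == "var":
        if subst.setdefault(pattern[1], actual) != actual:
            raise ValueError("type variable instantiated inconsistently")
        return subst
    if (actual[0] != "con" or pattern[1] != actual[1]
            or len(pattern[2]) != len(actual[2])):
        raise ValueError("actual type is not an instance of the declared type")
    for p, a in zip(pattern[2], actual[2]):
        match(p, a, subst)
    return subst


def type_to_fol(ty):
    """Translate a type to a first-order term."""
    if ty[0] == "var" or not ty[2]:
        return ty[1]
    return f"{ty[1]}({','.join(type_to_fol(a) for a in ty[2])})"


def constant_symbol(name, declared, actual):
    """Monomorphic constants stay bare; polymorphic ones get type arguments."""
    subst = match(declared, actual, {})
    if not subst:
        return name
    return f"{name}({','.join(type_to_fol(t) for t in subst.values())})"


NAT, BOOL = tcon("nat"), tcon("bool")
less_declared = fun(tvar("'a"), fun(tvar("'a"), BOOL))
set_a = tcon("set", tvar("a"))

print(constant_symbol("<", less_declared, fun(NAT, fun(NAT, BOOL))))      # <(nat)
print(constant_symbol("<", less_declared, fun(set_a, fun(set_a, BOOL))))  # <(set(a))
```

A constant whose declared type contains no type variables yields an empty substitution and is therefore emitted without any type argument.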

As for equality, if it appears as a predicate, then we do not insert its type. However,
if it appears as a constant in a combinator term, then we include the type of its arguments
as a type argument. Accordingly, the two equality axioms above become the clauses

{++B(@(@(fequal(T),X),Y)),--equal(X,Y)}
{--B(@(@(fequal(T),X),Y)),++equal(X,Y)}

This translation can reduce the size of terms significantly. However, like the partial-
typed one, it can be unsound.

2.5 Which Translation to Use and Soundness Issues
Of the three HOL to FOL translations above, the fully-typed one is sound but pro-
duces excessively large terms. The partial-typed and constant-typed translations are
more compact, but may introduce unsound proofs. If we use either of the compact
translations, then we must verify proofs in Isabelle to ensure soundness.
There are several factors that affect the decision about which translation should be
used as the default.

• Can we verify the proofs for partial-typed and constant-typed translations? The
answer is yes. Although the FOL clauses carry insufficient type information, the
clauses still correspond to Isabelle lemmas and goals. Our approach to proof
reconstruction, which is currently being implemented, involves following the low-
level resolution steps. If two clauses cannot be resolved due to incompatible types,
Isabelle will detect this.

• What benefit do we obtain from using the compact translations? Our experi-
mental results (Sect. 3) show that the compact translations can boost the success
rate significantly. Therefore, it is worthwhile to use a compact translation, even
if occasional unsound proofs require retrying the problem using the fully-typed
translation.

Moreover, we aim to reconstruct proofs in Isabelle even if we use the fully-typed
translation, so that every proof passes through the Isabelle kernel. Since proof reconstruction
is needed regardless of which translation is used, the only potential extra cost involved
in using a compact translation is that of occasional retries.
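
The intended workflow, a compact translation first and a fully-typed retry if reconstruction fails, can be summarised in a short driver loop. This is a sketch of the plan described above, not existing code: prove, run_atp and reconstruct are hypothetical names, and the two callbacks stand in for the ATP invocation and for proof reconstruction in Isabelle.

```python
def prove(problem, run_atp, reconstruct, translations=("constant", "full")):
    """Try each translation in turn and accept only reconstructed proofs.

    run_atp(problem, translation) returns a proof object or None;
    reconstruct(problem, proof) returns True if the proof is accepted.
    Both callbacks are assumptions supplied by the caller.
    """
    for translation in translations:
        proof = run_atp(problem, translation)
        if proof is None:
            continue  # the prover gave up or timed out with this translation
        if reconstruct(problem, proof):
            return translation, proof
    return None


# Stub callbacks for illustration: the compact translation yields a proof
# that reconstruction rejects, so the fully-typed translation is retried.
def stub_atp(problem, translation):
    return "bogus proof" if translation == "constant" else "sound proof"


def stub_reconstruct(problem, proof):
    return proof == "sound proof"


print(prove("goal", stub_atp, stub_reconstruct))  # ('full', 'sound proof')
```

Reconstruction is applied to both attempts, so the fully-typed run is checked as well and the retry is the only additional cost of starting with a compact translation.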

3 Experiments
It is obvious that the constant-typed translation is the most compact, while the fully-
typed one is the least compact. We can therefore predict that the constant-typed trans-
lation will deliver the best results with ATPs, while the fully-typed one will turn out
to be the worst. However, such claims need to be backed up by observations, especially
given that the fully-typed translation is the best for soundness.
For our experiments, we took 79 problems generated by Isabelle, most of which are
higher-order. Since our HOL translation can also be used for purely FOL problems
and our experiments were aimed at testing efficiency of the translation methods, we
translated all problems (both HOL and FOL) using the three translation methods we
mentioned in the previous section. We used our relevance filter [MP06] to reduce the
sizes of the problems. We ran these tests on a bank of dual AMD Opteron processors
running at 2400MHz, using Condor (http://www.cs.wisc.edu/condor/) to manage our batch jobs.
Each graph compares the success rates of the three translations, for some prover,
as the runtime per problem increases from 10 to 300 seconds. These short runtimes
are appropriate for our application of ATPs to support interactive proofs. We tested
three provers: E (Fig. 1), SPASS 2.2 (Fig. 2) and Vampire 8 (Fig. 3). We used E version

[Figure: success rates (%) of the full, partial and constant translations against runtime per problem]