Handling Common Transitive Relations
                    in First-Order Automated Reasoning

                                      Koen Claessen                Ann Lillieström
                                            Chalmers University of Technology
                                                {koen,annl}@chalmers.se


                                                        Abstract
                        We present a number of alternative ways of handling transitive binary
                        relations that commonly occur in first-order problems, in particular
                        equivalence relations, total orders, and reflexive, transitive relations.
                        We show how such relations can be discovered syntactically in an in-
                        put theory. We experimentally evaluate different treatments on prob-
                        lems from the TPTP, using resolution-based reasoning tools as well as
                        instance-based tools. Our conclusions are that (1) it is beneficial to
                        consider different treatments of binary relations as a user, and that (2)
                        reasoning tools could benefit from using a preprocessor or even built-in
                        support for certain binary relations.

1    Introduction
Most automated reasoning tools for first-order logic have some kind of built-in support for reasoning about equal-
ity. Equality is one of the most common binary relations, and there are great performance benefits from providing
built-in support for equality. Together, these two advantages by far outweigh the cost of implementation.
   Other common concepts for which there exists built-in support in many tools are associative, commutative
operators; and real-valued, rational-valued, and integer-valued arithmetic. Again, these concepts seem to appear
often enough to warrant the extra cost of implementing special support in reasoning tools.
   This paper is concerned with investigating what kind of special treatment we could give to commonly appearing
transitive binary relations, and what effect this treatment has in practice. Adding special treatment of transitive
relations to reasoning tools has been the subject of study before, in particular by means of chaining [1]. The
transitivity axiom

     @x , y, z . Rpx , yq ^ Rpy, z q ñ Rpx , z q

can lead to an expensive proof exploration in resolution and superposition based theorem provers, and can
generate a huge number of instances in instance-based provers and SMT-solvers. Transitive relations are also
common enough to motivate special built-in support. However, as far as we know, chaining is not implemented
in any of the major first-order reasoning tools (at least not in E [6], Vampire [5], Z3 [4], and CVC4 [2], which
were used in this paper).
   As an alternative to adding built-in support, in this paper we mainly look at (1) what a user of a reasoning
tool may do herself to optimize the handling of these relations, and (2) how a preprocessing tool may be able to
do this automatically. Adding built-in reasoning support in the tools themselves is not a main concern of this
paper.

    Copyright c by the paper’s authors. Copying permitted for private and academic purposes.
   In: P. Fontaine, S. Schulz, J. Urban (eds.): Proceedings of the 5th Workshop on Practical Aspects of Automated Reasoning
(PAAR 2016), Coimbra, Portugal, 02-07-2016, published at http://ceur-ws.org


                                                             11
                                     reflexive     ” @x        . Rpx , x q
                                     euclidean     ” @x , y, z . Rpx , yq ^ Rpx , z q ñ Rpy, z q
                                     antisymmetric ” @x , y . Rpx , yq ^ Rpy, x q ñ x “ y
                                     transitive    ” @x , y, z . Rpx , yq ^ Rpy, z q ñ Rpx , z q
                                     asymmetric    ” @x , y . Rpx , yq _ Rpy, x q
                                     total         ” @x , y . Rpx , yq _ Rpy, x q
                                     symmetric     ” @x , y . Rpx , yq ñ Rpy, x q
                                     coreflexive   ” @x , y . Rpx , yq ñ x “ y

                                Figure 1: Definitions of basic properties of binary relations
   By “treatment” we mean any way of logically expressing the relation. For example, a possible treatment of a
binary relation R in a theory T may simply mean axiomatizing R in T . But it may also mean transforming T
into a satisfiability-equivalent theory T 1 where R does not even syntactically appear.
   As an example, consider a theory T in which an equivalence relation R occurs. One way to deal with R is to
simply axiomatize it, by means of reflexivity, symmetry, and transitivity:

      @x        . Rpx , x q
      @x , y . Rpx , yq ñ Rpy, x q
      @x , y, z . Rpx , yq ^ Rpy, z q ñ Rpx , z q

Another way is to “borrow” the built-in equality treatment that exists in most theorem provers. We can do this
by introducing a new symbol rep, and replacing all occurrences of Rpx , yq by the formula:

      reppx q “ reppyq

The intuition here is that rep is now the representative function of the relation R. No axioms are needed. As
we shall see, this alternative treatment of equivalence relations is satisfiability-equivalent with the original one,
and actually is beneficial in practice in certain cases.
   In general, when considering alternative treatments, we strive to make use of concepts already built-in to the
reasoning tool in order to express other concepts that are not built-in.
   For the purpose of this paper, we have decided to focus on three different kinds of transitive relations: (1)
equivalence relations and partial equivalence relations, (2) total orders and strict total orders, and (3) general
reflexive, transitive relations. The reason we decided to concentrate on these three are because (a) they appear
frequently in practice, and (b) we found well-known ways but also novel ways of dealing with these.
   The target audience for this paper is thus both people who use reasoning tools and people who implement
reasoning tools.

2     Common properties of binary relations
In this section, we take a look at commonly occurring properties of binary relations, which combinations of these
are interesting for us to treat specially, and how we may go about discovering these.
    Take a look at Fig. 1. It lists 8 basic and common properties of binary relations. Each of these properties can
be expressed using one logic clause, which makes it easy to syntactically identify the presence of such a property
in a given theory.
    When we investigated the number of occurrences of these properties in a subset of the TPTP problem library1
[7], we ended up with the table in Fig. 2. The table was constructed by gathering all clauses from all TPTP
problems (after clausification), and keeping every clause that only contained one binary relation symbol and,
possibly, equality. Each such clause was then categorized as an expression of a basic property of a binary relation
symbol. We found only 163 such clauses that did not fit any of the 8 properties we chose as basic properties,
but were instead instances of two new properties. Both of these were quite esoteric and did not seem to have a
standard name in mathematics.
    The table also contains occurrences where a negated relation was stated to have a certain property, and also
occurrences where a flipped relation (a relation with its arguments swapped) was stated to have a certain property,
    1 For the statistics in this paper, we decided to only look at unsorted TPTP problems with 10.000 clauses or less.


                                                                 12
                                              4945    reflexive
                                              2082    euclidean
                                              1874    antisymmetric
                                              1567    transitive
                                               784    asymmetric
                                               784    total
                                               388    symmetric
                                                 3    coreflexive
                                              (163    other)
                    Figure 2: Number of occurrences of binary relation properties in TPTP

                          430+18    equivalence relations
                           181+7    partial equivalence relations
                           328+7    (strict) total orders
                          573+20    reflexive, transitive relations (excluding the above)
Figure 3:      Number of occurrences of binary relations in TPTP, divided                        up    into   Theo-
rem/Unsatisfiable/Unknown/Open problems + Satisfiable/CounterSatisfiable problems.

and also occurrences of combined negated and flipped relations. This explains for example why the number of
occurrences of total relations is the same as for asymmetric relations; if a relation is total, the negated relation
is asymmetric and vice-versa.
   We adopt the following notation. Given a property of binary relations prop, we introduce its negated version,
which is denoted by prop . The property prop holds for R if and only if prop holds for R. Similarly, we
introduce the flipped version of a property prop, which is denoted by prop ò . The property prop ò holds for R if
and only if prop holds for the flipped version of R.
   Using this notation, we can for example say that total is equivalent with asymmetric . Sometimes the
property we call euclidean here is called right euclidean; the corresponding variant left euclidean can be denoted
euclidean ò . Note that prop is not the same as prop! For example, a relation R can be reflexive, or reflexive
(which means that R is reflexive), or reflexive, which means that R is not reflexive.
   Using this notation on the 8 original basic properties from Fig. 1, we end up with 32 new basic properties
that we can use. (However, as we have already seen, some of these are equivalent to others.)
   This paper will look at 5 kinds of different binary relations, which are defined as combinations of basic
properties:

    equivalence relation           ” treflexive, symmetric, transitive u
    partial equivalence relation ” tsymmetric, transitive u
    total order                    ” ttotal , antisymmetric, transitive u
    strict total order             ” tantisymmetric , asymmetric, transitive u
    reflexive, transitive relation ” treflexive, transitive u

As a side note, in mathematics, strict total orders are sometimes defined using a property called trichotomous,
which means that exactly one of Rpx , yq, x “ y, or Rpy, x q must be true. However, when you clausify this property
in the presence of transitivity, you end up with antisymmetric which says that at least one of Rpx , yq, x “ y,
or Rpy, x q must be true. There seems to be no standard name in mathematics for the property antisymmetric ,
which is why we use this name.
   In Fig. 3, we display the number of binary relations we have found in (our subset of) the TPTP for each
category. The next section describes how we found these.

3   Syntactic discovery of common binary relations
If our goal is to automatically choose the right treatment of equivalence relations, total orders, etc., we must
have an automatic way of identifying them in a given theory. It is easy to discover for example an equivalence
relation in a theory by means of syntactic inspection. If we find the presence of the axioms reflexive, symmetric,
and transitive, for the same relational symbol R, we know that R is an equivalence relation.


                                                        13
                                          total             ô total ò
                                          total             ô asymmetric
                                          total             ô asymmetric ò
                                          reflexive         ô reflexive ò
                                          reflexive         ô reflexive ò
                                          symmetric         ô symmetric
                                          symmetric         ô symmetric ò
                                          symmetric         ô symmetric ò
                                          transitive        ô transitive ò
                                          transitive        ô transitive ò
                                          coreflexive       ô coreflexive ò
                                          coreflexive       ô coreflexive ò
                                          antisymmetric     ô antisymmetric ò
                                          antisymmetric     ô antisymmetric ò

                                   Figure 4: Basic properties that are equivalent

  But there is a problem. There are other ways of axiomatizing equivalence relations. For example, a much
more common way to axiomatize equivalence relations in the TPTP is to state the two properties reflexive and
euclidean for R.
   Rather than enumerating all possible ways to axiomatize certain relations by hand, we wrote a program that
computes all possible ways for any combination of basic properties to imply any other combination of basic
properties. Our program generates a table that can be precomputed in a minute or so and then used to very
quickly detect any alternative axiomatization of binary relations using basic properties.
   Let us explain how this table was generated. We start with a list of 32 basic properties (the 8 original basic
properties, plus their negated, flipped, and negated flipped versions). Firstly, we use an automated theorem
prover (we used E [6]) to discover which of these are equivalent with other such properties. The result is
displayed in Fig. 4. Thus, 17 basic properties can be removed from the list, because they can be expressed using
other properties. The list of basic properties now has 15 elements left.
  Secondly, we want to generate all implications of the form tprop 1 , . . , prop n u ñ prop where the set
tprop 1 , . . , prop n u is minimal. We do this separately for each prop. The results are displayed in Fig. 5.
   The procedure uses a simple constraint solver (a SAT-solver) to keep track of all implications it has tried so
far, and consists of one main loop. At every loop iteration, the constraint solver guesses a set tprop 1 , . . , prop n u
from the set of all properties P´ tprop u. The procedure then asks E whether or not tprop 1 , . . , prop n u ñ prop
is valid. If it is, then we look at the proof that E produces, and print the implication tprop a , . . , prop b u ñ prop,
where tprop a , . . , prop b u is the subset of properties that were used in the proof. We then also tell the constraint
solver never to guess a superset of tprop a , . . , prop b u again. If the guessed implication can not be proven, we tell
the constraint solver to never guess a subset of tprop 1 , . . , prop n u again. The procedure stops when no guesses
that satisfy all constraints can be made anymore.
  After the loop terminates, we may need to clean up the implications somewhat because some implications
may subsume others.
   In order to avoid generating inconsistent sets tprop 1 , . . , prop n u (that would imply any other property), we
also add the artificial inconsistent property false to the set, and generate implications for this property first. We
exclude any found implication here from the implication sets of the real properties.
  This procedure generates a complete list of minimal implications. It works well in practice, especially if all
guesses are maximized according to their size. The vast majority of the time is spent on the implication proofs,
and no significant time is spent in the SAT-solver.
   To detect a binary relation R with certain properties in a given theory, we simply gather all basic properties
about R that occur in the theory, and then compute which other properties they imply, using the pre-generated
table. In this way, we never have to do any theorem proving in order to detect a binary relation with certain
properties.


                                                           14
false         ð treflexive, reflexive u                    transitive ð teuclidean, antisymmetric u
false         ð treflexive, asymmetric u                   transitive ð teuclidean, euclidean u
false         ð ttotal, reflexive u                        transitive ð ttotal, euclidean ò u
false         ð ttotal, asymmetric u                       transitive ð tasymmetric, euclidean ò u
reflexive     ð ttotal u                                   transitive ð tsymmetric, euclidean ò u
euclidean     ð tsymmetric, asymmetric u                   transitive ð tasymmetric, euclidean ò u
euclidean     ð ttotal, symmetric u                        transitive ð tantisymmetric, euclidean ò u
euclidean     ð tantisymmetric, coreflexive u              transitive ð teuclidean, reflexive u
euclidean     ð ttransitive, euclidean u                   transitive ð teuclidean, euclidean ò u
euclidean     ð tantisymmetric, euclidean u                transitive ð teuclidean, euclidean ò u
euclidean     ð treflexive, coreflexive u                  transitive ð tantisymmetric, euclidean ò u
euclidean     ð treflexive, euclidean u                    transitive ð teuclidean ò , antisymmetric u
euclidean     ð tcoreflexive u                             transitive ð teuclidean , euclidean ò u
euclidean     ð ttotal, coreflexive u                      total       ð treflexive, coreflexive u
euclidean     ð treflexive , euclidean ò u                 total       ð treflexive, euclidean ò u
                                        ò
euclidean     ð treflexive, euclidean     u                total       ð treflexive, euclidean u
euclidean     ð tasymmetric, euclidean ò u                 total       ð treflexive, antisymmetric u
euclidean     ð tsymmetric, antisymmetric u                total       ð treflexive, transitive u
                                                   ò
euclidean     ð ttransitive, reflexive , euclidean   u     symmetric ð tcoreflexive u
euclidean     ð ttotal, euclidean u                        symmetric ð teuclidean , reflexive u
euclidean     ð ttransitive, coreflexive u                 symmetric ð teuclidean, reflexive u
euclidean     ð tasymmetric, coreflexive u                 symmetric ð tcoreflexive u
euclidean     ð tasymmetric, euclidean u                   symmetric ð ttotal, euclidean u
                                    ò
euclidean     ð ttotal, euclidean     u                    symmetric ð ttotal, euclidean u
                                                       ò
euclidean     ð tantisymmetric, reflexive , euclidean    u symmetric ð teuclidean, asymmetric u
euclidean     ð tasymmetric, euclidean ò u                 symmetric ð tasymmetric, euclidean u
euclidean     ð tsymmetric, transitive u                   symmetric ð treflexive, euclidean ò u
                                         ò
euclidean     ð teuclidean , euclidean u                   symmetric ð treflexive, euclidean ò u
euclidean     ð tsymmetric, euclidean ò u                  symmetric ð treflexive , euclidean ò u
euclidean     ð treflexive, euclidean ò u                  symmetric ð treflexive , euclidean ò u
euclidean     ð treflexive, symmetric, antisymmetric u     symmetric ð ttotal, euclidean ò u
euclidean     ð treflexive, symmetric, transitive u        symmetric ð tasymmetric, euclidean ò u
euclidean     ð ttotal, euclidean ò u                      symmetric ð teuclidean , euclidean ò u
euclidean     ð teuclidean ò , coreflexive u               symmetric ð ttotal, euclidean ò u
antisymmetric ð tcoreflexive u                             symmetric ð teuclidean, euclidean ò u
antisymmetric ð tasymmetric u                              symmetric ð teuclidean, euclidean ò u
antisymmetric ð ttransitive, reflexive u                   symmetric ð teuclidean, reflexive u
antisymmetric ð treflexive , euclidean ò u                 symmetric ð tasymmetric, euclidean ò u
antisymmetric ð teuclidean, reflexive u                    symmetric ð teuclidean , euclidean ò u
transitive    ð tsymmetric, asymmetric u                   symmetric ð treflexive, euclidean u
transitive    ð ttotal, symmetric u                        coreflexive ð tsymmetric, asymmetric u
transitive    ð tantisymmetric, coreflexive u              coreflexive ð tantisymmetric, coreflexive u
transitive    ð tcoreflexive u                             coreflexive ð ttransitive, euclidean , reflexive u
transitive    ð teuclidean, transitive u                   coreflexive ð tasymmetric, coreflexive u
transitive    ð tsymmetric, antisymmetric u                coreflexive ð teuclidean, reflexive u
transitive    ð treflexive, euclidean u                    coreflexive ð treflexive, antisymmetric, euclidean u
transitive    ð teuclidean, reflexive u                    coreflexive ð tasymmetric, euclidean u
transitive    ð ttotal, euclidean u                        coreflexive ð teuclidean, asymmetric u
transitive    ð treflexive, coreflexive u                  coreflexive ð tsymmetric, antisymmetric u
transitive    ð tasymmetric, coreflexive u                 coreflexive ð ttotal, antisymmetric, euclidean u
transitive    ð tasymmetric, euclidean u                   coreflexive ð treflexive, antisymmetric, euclidean ò u
transitive    ð tasymmetric, transitive u                  coreflexive ð treflexive, antisymmetric, euclidean ò u
transitive    ð tantisymmetric, transitive u               coreflexive ð tsymmetric, transitive, reflexive u
transitive    ð teuclidean ò , euclidean ò u               coreflexive ð ttransitive, reflexive , coreflexive u
transitive    ð teuclidean, coreflexive u                  coreflexive ð tantisymmetric, euclidean , euclidean ò u
transitive    ð ttotal, coreflexive u                      coreflexive ð ttotal, antisymmetric, euclidean ò u
transitive    ð teuclidean, asymmetric u                   coreflexive ð ttotal, euclidean, antisymmetric u
transitive    ð ttotal, euclidean u                        coreflexive ð ttotal, antisymmetric, euclidean ò u
transitive    ð treflexive, euclidean ò u                  coreflexive ð tantisymmetric, reflexive , euclidean ò u
transitive    ð teuclidean ò , transitive u                coreflexive ð treflexive , euclidean ò u
transitive    ð teuclidean, symmetric u                    coreflexive ð tantisymmetric, euclidean , reflexive u
transitive    ð treflexive , euclidean ò u                 coreflexive ð tasymmetric, euclidean ò u
transitive    ð tantisymmetric, euclidean u                coreflexive ð tasymmetric, euclidean ò u
transitive    ð ttotal, euclidean ò u                      coreflexive ð tantisymmetric, euclidean , euclidean ò u
transitive    ð teuclidean, antisymmetric u                coreflexive ð teuclidean, antisymmetric, euclidean ò u
transitive    ð treflexive, euclidean ò u                  coreflexive ð teuclidean, antisymmetric, euclidean ò u
transitive    ð treflexive, symmetric, antisymmetric u     coreflexive ð teuclidean, reflexive, antisymmetric u
transitive    ð teuclidean ò , coreflexive u               coreflexive ð ttransitive, reflexive , euclidean ò u
transitive    ð treflexive, symmetric, transitive u


                 Figure 5: The complete list of implications between properties


                                                    15
4   Handling equivalence relations
Equalification As mentioned in the introduction, an alternative way of handling equivalence relations R is
to create a new symbol rep and replace all occurrences of R with a formula involving rep:

    R reflexive
    R symmetric            _
    R transitive
    T r. . Rpx , yq . .s         T r. . reppx q “ reppyq . .s

To explain the above notation: We have two theories, one on the left-hand side of the arrow, and one on the
right-hand side of the arrow. The transformation transforms any theory that looks like the left-hand side into
a theory that looks like the right-hand side. We write T r. . e . .s for theories in which e occurs syntactically; in
the transformation, all occurrences of e should be replaced.
   We call the above transformation equalification. This transformation may be beneficial because the reasoning
now involves built-in equality reasoning instead of reasoning about an unknown symbol using axioms.
   The transformation is correct, meaning that it preserves (non-)satisfiability: (ñ) If we have a model of the
LHS theory, then R must be interpreted as an equivalence relation. Let rep be the representative function of R,
in other words we have Rpx , yq ô reppx q “ reppyq. Thus we also have a model of the RHS theory. (ð) If we
have a model of the RHS theory, let Rpx , yq :“ reppx q “ reppyq. It is clear that R is reflexive, symmetric, and
transitive, and therefore we have model of the LHS theory.
   In the transformation, we also remove the axioms for reflexivity, symmetry, and transitivity, because they
are not needed anymore. But what if R is axiomatized as an equivalence relation using different axioms? Then
we can remove any axiom about R that is implied by reflexivity, symmetry, and transitivity. Luckily we have
already computed a table of which properties imply which other ones (shown in Fig. 5).
Pequalification There are commonly occurring binary relations called partial equivalence relations that almost
behave as equivalence relations, but not quite. In particular, they do not have to obey the axiom of reflexivity.
Can we do something for these too?
  It turns out that a set with a partial equivalence relation R can be partitioned into two subsets: (1) one subset
on which R is an actual equivalence relation, and (2) one subset of elements which are not related to anything,
not even themselves.
  Thus, an alternative way of handling partial equivalence relations R is to create two new symbols, rep and P ,
and replace all occurrences of R with a formula involving rep and P:

    R symmetric
    R transitive           _
    T r. . Rpx , yq . .s         T r. . pP px q ^ P pyq ^ reppx q “ reppyqq . .s

Here, P is the predicate that indicates the subset on which R behaves as an equivalence relation.
    We call this transformation pequalification. This transformation may be beneficial because the reasoning now
involves built-in equality reasoning instead of reasoning about an unknown symbol using axioms. However, there
is also a clear price to pay since the size of the problem grows considerably.
    The transformation is correct, meaning that it preserves (non-)satisfiability: (ñ) If we have a model of the
LHS theory, then R must be interpreted as a partial equivalence relation. Let P px q :“ Rpx , x q, in other words P
is the subset on which R behaves like an equivalence relation. Let rep be a representative function of R on P , in
other words we have pP px q ^ P pyqq ñ pRpx , yq ô reppx q “ reppyqq. By the definition of P we then also have
Rpx , yq ô pP px q ^ P pyq ^ reppx q “ reppyqq. Thus we also have a model of the RHS theory. (ð) If we have a
model of the RHS theory, let Rpx , yq :“ P px q ^ P pyq ^ reppx q “ reppyq. This R is symmetric and transitive,
and therefore we have model of the LHS theory.

5   Handling total orders
Ordification Many reasoning tools have built-in support for arithmetic, in particular they support an order
ď on numbers. It turns out that we can “borrow” this operator when handling general total orders. Suppose we
have a total order:


                                                           16
    R : A ˆ A Ñ Bool

We now introduce a new injective function:

    rep : A Ñ R

We then replace all occurrences of R with a formula involving rep in the following way:

    R total
    R antisymmetric        _     @x , y. reppx q “ reppyq ñ x “ y
    R transitive
    T r. . Rpx , yq . .s         T r. . reppx q ď reppyq . .s

(Here, ď is the order on reals.) We call this transformation ordification. This transformation may be beneficial
because the reasoning now involves built-in arithmetic reasoning instead of reasoning about an unknown symbol
using axioms.
   The above transformation is correct, meaning that it preserves (non-)satisfiability: (ñ) If we have a model
of the LHS theory, then without loss of generality (by Löwenheim-Skolem), we can assume that the domain is
countable. Also, R must be interpreted as a total order. We now construct rep recursively as a mapping from
the model domain to R, such that we have Rpx , yq ô reppx q ď reppyq, in the following way. Let ta0 , a1 , a2 , . .u
be the domain of the model, and set reppa0 q :“ 0. For any n ą 0, pick a value for reppan q that is consistent with
the total order R and all earlier domain elements ai , for 0 ď i ă n. This can always be done because there is
always extra room for a new, unique element between any two distinct values of R. Thus rep is injective and we
also have a model of the RHS theory. (ð) If we have a model of the RHS theory, let Rpx , yq :“ reppx q ď reppyq.
It is clear that R is total and transitive, and also antisymmetric because rep is injective, and therefore we have
model of the LHS theory.
Note on Q vs. R The proof would have worked for Q as well instead of R. The transformation can therefore
be used for any tool that supports Q or R or both, and should choose whichever comparison operator is cheapest
if there is a choice. Using integer arithmetic would however not have been correct.
Note on strict total orders One may have expected to have a transformation specifically targeted to strict
total orders, i.e. something like:

    R antisymmetric
    R asymmetric            _      @x , y. reppx q “ reppyq ñ x “ y
    R transitive
    T r. . Rpx , yq . .s           T r. . reppx q ă reppyq . .s

However, the transformation for total orders already covers this case! Any strict total order R is also recognized
as a total order R, and ordification already transforms such theories in the correct way. The only difference is
that Rpx , yq is replaced with preppx q ď reppyqq instead of reppx q ă reppyq, which is satisfiability-equivalent.
(There may be a practical performance difference, which is not something we have investigated.)
Maxification Some reasoning tools do not have orders on real arithmetic built-in, but they may have other
concepts that are built-in that can be used to express total orders instead. One such concept is handling of
associative, commutative (AC) operators.
   For such a tool, one alternative way of handling total orders R is to create a new function symbol max and
replace all occurrences of R with a formula involving max:

    R total                      max associative
    R antisymmetric        _     max commutative
    R transitive                 @x , y. maxpx , yq “ x _ maxpx , yq “ y
    T r. . Rpx , yq . .s         T r. . maxpx , yq “ y . .s

We call this transformation maxification. This transformation may be beneficial because the reasoning now
involves built-in equality reasoning with AC unification (and one extra axiom) instead of reasoning about an
unknown relational symbol (using three axioms).


                                                          17
   The above transformation is correct, meaning that it preserves (non-)satisfiability: (ñ) If we have a model
of the LHS theory, then R must be interpreted as a total order. Let max be the maximum function associated
with this order. Clearly, it must be associative and commutative, and the third axiom also holds. Moreover, we
have Rpx , yq ô maxpx , yq “ y. Thus we also have a model of the RHS theory. (ð) If we have a model of the
RHS theory, let Rpx , yq :“ maxpx , yq “ y. Given the axioms in the RHS theory, R is total, antisymmetric, and
transitive, and therefore we have model of the LHS theory.
   (It may be the case that maxification can also be used to express orders that are weaker than total orders.
At the time of this writing, we have not figured out how to do this.)

6   Handling reflexive, transitive relations
Transification The last transformation we present is designed as an alternative treatment for any relation
that is reflexive and transitive. It does not make use of any built-in concept in the tool. Instead, it transforms
theories with a transitivity axiom into theories without that transitivity axiom. Instead, transitivity is specialized
at each positive occurrence of the relational symbol.
   As such, an alternative way of handling reflexive, transitive relations R is to create a new symbol Q and
replace all positive occurrences of R with a formula involving Q; the negative occurrences are simply replaced
by Q:

    R reflexive                    Q reflexive
    R transitive            _
    T r. . Rpx , yq . .            T r. . p@r . Qpr , x q ñ Qpr , yqq . .
            Rpx , yq . .s                   Qpx , yq                  . .s

We call this transformation transification. This transformation may be beneficial because reasoning about
transitivity in a naive way can be very expensive for theorem provers, because from transitivity there are many
possible conclusions to draw that trigger each other “recursively”. The transformation only works on problems
where every occurrence of R is either positive or negative (and not both, such as under an equivalence operator).
If this is not the case, the problem has to be translated into one where this is the case. This can for example be
done by means of clausification.
    Note that the resulting theory does not blow-up; only clauses with a positive occurrence of R gets one extra
literal per occurrence.
    We replace any positive occurrence of Rpx , yq with an implication that says “for any r , if you could reach x
from r , now you can reach y too”. Thus, we have specialized the transitivity axiom for every positive occurrence
of R.
    Note that in the RHS theory, Q does not have to be transitive! Nonetheless, the transformation is correct,
meaning that it preserves (non-)satisfiability: (ñ) If we have a model of the LHS theory, then R is reflexive
and transitive. Now, set Q px , yq :“ R px , yq. Q is obviously reflexive. We have to show that R px , yq implies
@r . Qpr , x q ñ Qpr , yq. This is indeed the case because R is transitive. Thus we also have a model of the RHS
theory. (ð) If we have a model of the RHS theory, then Q is reflexive. Now, set Rpx , yq :“ @r . Qpr , x q ñ Qpr , yq.
R is reflexive (by reflexivity of implication) and transitive (by transitivity of implication). Finally, we have to
show that Qpx , yq implies Rpx , yq, which is the same as showing that @r . Qpr , x q ñ Qpr , yq implies Qpx , yq,
which is true because Q is reflexive. Thus we also have a model of the RHS theory.

7   Experimental results
We evaluate the effects of the different axiomatizations using two different resolution based theorem provers, E
1.9 [6] (with the xAuto and tAuto options) and Vampire 4.0 [5] (with the casc mode option), and two SMT-
solvers, Z3 4.4.2 [4] and CVC4 1.4 [2]. The experiments were performed on a 2xQuad Core Intel Xeon E5620
processor with 24 GB physical memory, running at 2.4 GHz. We use a time limit of 5 minutes on each problem.
   We started from a set of 11674 test problems from the TPTP, listed as Unsatisfiable, Theorem, Unknown or
Open (leaving out the very large theories). For each problem, a new theory was generated for each applicable
transformation. For most problems, no relation matching any of the given criteria was detected, and thus no new
theories were produced for these problems. Evaluation of reasoning tools on Satisfiable and CounterSatisfiable
problems is left as future work.
   The experimental results are summarized in Fig. 6.


                                                             18
                                         E                Vampire            Z3                CVC4
        equalification    (430)   422    +4   -33   428      +0   -2   362   +50    -3   370    +18   -39
        pequalification   (181)    96    +0   -34    93     +4    -9    38    +9    -4    59     +1   -11
        transification    (573)   324    +2   -26   308    +26 -11     234   +10   -46   255    +13   -42
        ordification      (328)         n/a         296    +16 -12     238   +51   -13   267    +13   -15
        maxification      (328)   273    +1   -23   296      +2   -0   238    +1   -41   267     +4    -0
Figure 6: Table showing for each theorem prover the number of test problems solved before the transformation,
how many new problems are solved after the transformation, and the number of problems that could be solved
before but not after the transformation. (Total number of applicable problems for each transformation in
parentheses). A +value in boldface indicates that there were hard problems (Rating 1.0) solved with that
combination of treatment and theorem prover. An underlined -value indicates that time slicing (running both
methods in 50% of the time each) solves a strictly larger superset of problems with that combination of treatment
and theorem prover.

7.1   Equivalence relations
Equivalence relations were present in 430 of the test problems. The majority of these problems appear in the
GEO and SYN categories. Interestingly, among these 430 problems, there are only 23 problems whose equivalence
relations are axiomatized with transitivity axioms. The remaining 407 problems axiomatize equivalence relations
with euclidean and reflexivity axioms, as discussed in section 3. The number of equivalence relations in each
problem ranges from 1 to 40, where problems with many equivalence relations all come from the SYN category.
There is no clear correspondence between the number of equivalence relations in a problem and the performance
of the prover prior to and after the transformation.
Equalification As can be seen in Fig. 6, equalification performs very well with Z3, and somewhat well with
CVC4, while it worsens the results of the resolution based provers, which already performed well on the original
problems. Using a time slicing strategy, which runs the prover on the original problem for half the time and on
the transformed problem for the second half, solves a strict superset of problems than the original for all of the
theorem provers used in the evaluation. Fig. 7 shows in more detail the effect on solving times for the different
theorem provers.
Pequalification In 181 of the test problems, relations that are transitive and symmetric, but not reflexive,
were found. The majority of these problems are in the CAT and FLD categories of TPTP. All of the tested
theorem provers perform worse on these problems compared to the problems with true equivalence relations.
This is also the case after performing pequalification. Pequalification turns out to be particularly bad for E,
which solves 34 fewer problems after the transformation. Pequalification makes Z3 perform only slightly better
and Vampire and CVC4 slightly worse.
Hard problems solved using Pequalification After Pequalification, one problem with rating 1.0 (FLD041-
2) is solved by Vampire. Rating 1.0 means that no known current theorem prover solves the problem in reasonable
time.

7.2   Total orders
Total orders were found in 328 problems. The majority are in the SWV category, and the remaining in HWV,
LDA and NUM. In 77 of the problems, the total order is positively axiomatized, and in the other 251 problems
it is negative (and thus axiomatized as a strict total order). There is never more than one order present in any
of the problems.
Ordification For each of the problems, we ran the theorem provers with built-in support for arithmetic on
the problems before and after applying ordification. Vampire was run on a version in TFF format [8], and Z3
and CVC4 on a version in SMT format [3]. The original problems were also transformed into TFF and SMT
in order to achieve a relevant comparison. Ordification performs well for Z3, while for Vampire and CVC4 it is
good for some problems and bad for some. Fig. 8 shows how solving times are affected, and the diagrams also
show great potential for time slicing, in particular for Vampire and Z3.
Hard problems solved using Ordification After Ordification, 15 problems, all from the SWV category, with
rating 1.0 are solved. (SWV004-1, SWV035+1, SWV040+1, SWV044+1, SWV049+1, SWV079+1, SWV100+1,


                                                          19
                                   100                                                                               100


                                                                              equalified-vampire
                                    10                                                                                10


                   equalified-E
                                    1                                                                                   1


                                   0.1                                                                                0.1


                               0.01                                                                                  0.01
                                   0.01   0.1   1          10      100                                                   0.01    0.1      1          10    100
                                                original-E                                                                             original-vampire


                                   100                                                                               100


                                   10                                                                                10


                                                                                                   equalified-cvc4
                   equalified-z3


                                    1                                                                                 1


                                   0.1                                                                               0.1


                               0.01                                                                              0.01
                                   0.01   0.1   1             10   100                                               0.01       0.1     1          10     100
                                                original-z3                                                                            original-cvc4


Figure 7: The time taken to solve problems, with and without equalification, using E, Vampire, Z3 and CVC4

SWV101+1, SWV108+1, SWV110+1, SWV113+1, SWV118+1, SWV120+1, SWV124+1, SWV130+1) Vam-
pire and Z3 each solve 14 hard problems and CVC4 solves 13 of them. One problem (SWV004-1) is even
categorized as Unknown, which means that no prover has ever solved it. After ordification, all three provers
were able to solve it in less than a second.
Maxification Maxification, our second possible treatment of total orders, turned out to be disadvantageous
to E and Z3, while having a small positive effect on Vampire and CVC4 (see Fig. 6).

7.3   Transitive and Reflexive relations
573 test problems include relations that are transitive and reflexive, excluding equivalence relations and total
orders. In all of them, transitivity occurs syntactically as an axiom. The problems come from a variety of
categories, but a vast majority are in SET, SEU and SWV. Only about half of the original problems were solved
by each theorem prover, which may indicate that transitivity axioms are difficult for current theorem provers
to deal with. We evaluate the performance of the theorem provers before and after transification. Problems
that include equivalence relations and total orders are excluded, as the corresponding methods equalification
and ordification give better results and should be preferred. Vampire benefits the most from transification, and
solves 32 new problems after the transformation, while 10 previously solvable problems become unsolvable within
the time limit. For E, the effect of transification is almost exclusively negative. Vampire also has a significantly
worse performance than E overall on the original problems that include relations that are transitive and reflexive.
Both Z3 and CVC4 perform worse after the transformation, but time slicing can be a good way to improve the
results.
Hard problems solved using Transification     After transification, two new problems with rating 1.0 are
solved, both by Vampire (SEU322+2 and SEU372+2).


                                                                         20
                                                                                     100                                                                100
                            100


                                                                                      10                                                                 10

         ordified-vampire
                             10


                                                                                                                                       ordified-CVC4
                                                                      ordified-Z3
                                                                                       1                                                                  1

                              1
                                                                                      0.1                                                                0.1


                             0.1
                                                                                     0.01                                                               0.01


                            0.01                                                    0.001                                                              0.001
                                0.01   0.1      1          10   100                     0.001   0.01    0.1         1       10   100                       0.001   0.01   0.1         1      10   100
                                             original-vampire                                                 original-Z3                                                    original-CVC4


                                             Figure 8: Effects of ordification, using Vampire, Z3 and CVC4

Equalification and Transification Since all equivalence relations are transitive and reflexive, the method
for transification works also on equivalence relations. Comparing the two methods on the 430 problems with
equivalence relations, we concluded that equalification and transification work equally bad for E, Vampire and
CVC4. Both transification and equalification improve the results for Z3, but equalification does so significantly.
Ordification and Transification We compared ordification and transification on the 328 problems containing
total orders. Transification seems to make these problems generally more difficult for theorem provers to solve,
while ordification instead improved the results on many of the problems. Transification makes the theorem
prover perform worse on these problems also for E, which cannot make use of ordification since it does not
provide support for arithmetic.

8   Discussion, Conclusions, and Future Work
We have presented 5 transformations that can be applied to theories with certain transitive relations: equalifica-
tion, pequalification, transification, ordification, and maxification. We have also created a method for syntactic
discovery of binary relations where these transformations are applicable.
   For users of reasoning tools that create their own theories, it is clear that they should consider using one of
the proposed alternative treatments when writing theories. For all of our methods, there are existing theories for
which some provers performed better on these theories than others. In particular, there exist 18 TPTP problems
that are now solvable that weren’t previously.
   For implementers of reasoning tools, our conclusions are less clear. For some combinations of treatments
and provers (such as transification for Vampire, and equalification for Z3), overall results are clearly better,
and we would thus recommend these treatments as preprocessors for these provers. Some more combinations
of treatments and provers lend themselves to a time slicing strategy that can solve strictly more problems, and
could thusly be integrated in a natural way in provers that already have the time slicing machinery in place.
Related Work        Chaining [1] is a family of methods that limit the use of transitivity-like axioms in proofs
by only allowing certain chains of them to occur in proofs. The result is a complete proof system that avoids
the derivation of unnecessary consequences of transitivity. However, chaining is not implemented in any of the
reasoning tools we considered for this paper. In personal communication with some of the authors, chaining-like
techniques have not been deemed important enough to be considered for implementation, and their preliminary
experimental results were mostly negative.
Future Work There is a lot of room for improvements and other future work. There are many other relations
that are more or less common that could benefit from an alternative treatment like the transformations described
in this paper. In particular, maxification seems to be an idea that could be applied to binary relations that are
weaker than total orders, which may make this treatment more effective. But there are also other, non-transitive
relations that are of interest.
   Ordification uses total orders as the base-transformation, and treats strict total orders as negated total orders.
We would like to investigate more in which cases it may be beneficial to treat strict total orders differently from
total orders, and when to use ă on R instead of ď.
   We would also like to investigate the effect of our transformations on Satisfiable and CounterSatisfiable


                                                                                                       21
                                                 100                                                                                    100


                                                                                                 transified-vampire
                                                 10                                                                                      10


                 transified-e                      1                                                                                       1


                                                 0.1                                                                                     0.1


                                                0.01                                                                                    0.01
                                                    0.01     0.1    1          10     100                                                   0.01    0.1      1          10    100
                                                                    original-e                                                                            original-vampire


                                                100                                                                                     100


                                                10                                                                                      10


                                                                                                                      transified-cvc4
                                transified-z3


                                                 1                                                                                       1


                                                0.1                                                                                     0.1


                                            0.01                                                                                    0.01
                                                0.01        0.1    1             10   100                                               0.01       0.1     1          10     100
                                                                   original-z3                                                                            original-cvc4


                                                      Figure 9: Effects of transification, using E, Vampire, Z3 and CVC4

problems. What would be the effect on finite model finders? Some transformations change the number of
variables per clause, which will influence the performance of finite model finders. What happens to saturation-
based tools? Is it easier or harder to saturate a problem after transformation? Evaluating these questions is
hard, because we did not find many satisfiable problems in the TPTP that contained relevant relations (see Fig.
3). Instead, we considered using Unsatisfiable problems, and measuring the time it takes to show the absence of
models up to a certain size.
   There are other kinds of relations than binary relations. For example, we can have an ternary relation that
behaves as an equivalence relation in its 2nd and 3rd argument. An alternative treatment of this relation would
be to introduce a binary function symbol rep. We do not know whether or not this occurs often, and if it is a
good idea to treat higher-arity relational symbols specially in this way.
   Lastly, we would like to look at how these ideas could be used inside a theorem prover; as soon as the prover
discovers that a relation is an equivalence relation or a total order, one of our transformations could be applied,
on the fly. The details of how to do this remain to be investigated.

References
[1] Leo Bachmair and Harald Ganzinger. Ordered chaining calculi for first-order theories of transitive relations.
    Journal of the ACM (JACM), 45(6):1007–1049, 1998.

[2] Clark Barrett, Christopher L. Conway, Morgan Deters, Liana Hadarean, Dejan Jovanović, Tim King, Andrew
    Reynolds, and Cesare Tinelli. CVC4. In Proceedings of the 23rd International Conference on Computer Aided
    Verification, CAV’11, pages 171–177, Berlin, Heidelberg, 2011. Springer-Verlag.

[3] Clark Barrett, Aaron Stump, Cesare Tinelli, Sascha Boehme, David Cok, David Deharbe, Bruno Dutertre,
    Pascal Fontaine, Vijay Ganesh, Alberto Griggio, Jim Grundy, Paul Jackson, Albert Oliveras, Sava Krsti,


                                                                                            22
   Michal Moskal, Leonardo De Moura, Roberto Sebastiani, To David Cok, and Jochen Hoenicke. C.: The
   SMT-LIB Standard: Version 2.0. Technical report, 2010.
[4] Leonardo de Moura and Nikolaj Bjørner. Z3: An Efficient SMT Solver. In Tools and Algorithms for the
    Construction and Analysis of Systems, 14th International Conference, TACAS 2008, Held as Part of the
    Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March
    29-April 6, 2008. Proceedings, volume 4963 of Lecture Notes in Computer Science, pages 337–340. Springer,
    2008.
[5] Laura Kovács and Andrei Voronkov. First-Order Theorem Proving and Vampire. In Computer Aided Verifi-
    cation - 25th International Conference, CAV 2013, Saint Petersburg, Russia, July 13-19, 2013. Proceedings,
    pages 1–35, 2013.
[6] Stephan Schulz. The E Theorem Prover, 2015. http://www.eprover.org/.
[7] Geoff Sutcliffe. The TPTP Problem Library, 2015. http://www.tptp.org/.

[8] Geoff Sutcliffe, Stephan Schulz, Koen Claessen, and Peter Baumgartner. The TPTP Typed First-order Form
    and Arithmetic. In Nikolaj Bjørner and Andrei Voronkov, editor, The 18th International Conference on Logic
    for Programming Artificial Intelligence and Reasoning (LPAR-18), pages 406–419, Merida, Venezuela, mar
    2012. Springer Verlag.


                                                     23