Introduction

Coimbra, Portugal

Handling Common Transitive Relations in First-Order Automated Reasoning

Koen Claessen

Ann Lilliestrom

annlg@chalmers.se 0 0 Chalmers University of Technology

2016

0 2 07

We present a number of alternative ways of handling transitive binary relations that commonly occur in rst-order problems, in particular equivalence relations, total orders, and re exive, transitive relations. We show how such relations can be discovered syntactically in an input theory. We experimentally evaluate di erent treatments on problems from the TPTP, using resolution-based reasoning tools as well as instance-based tools. Our conclusions are that (1) it is bene cial to consider di erent treatments of binary relations as a user, and that (2) reasoning tools could bene t from using a preprocessor or even built-in support for certain binary relations.

Introduction

Most automated reasoning tools for rst-order logic have some kind of built-in support for reasoning about equality. Equality is one of the most common binary relations, and there are great performance bene ts from providing built-in support for equality. Together, these two advantages by far outweigh the cost of implementation.

Other common concepts for which there exists built-in support in many tools are associative, commutative operators; and real-valued, rational-valued, and integer-valued arithmetic. Again, these concepts seem to appear often enough to warrant the extra cost of implementing special support in reasoning tools.

This paper is concerned with investigating what kind of special treatment we could give to commonly appearing transitive binary relations, and what e ect this treatment has in practice. Adding special treatment of transitive relations to reasoning tools has been the subject of study before, in particular by means of chaining [ 1 ]. The transitivity axiom re exive euclidean antisymmetric transitive asymmetric total symmetric core exive @x : Rpx ; x q @x ; y ; z : Rpx ; y q ^ Rpx ; z q ñ Rpy ; z q @x ; y : Rpx ; y q ^ Rpy ; x q ñ x y @x ; y ; z : Rpx ; y q ^ Rpy ; z q ñ Rpx ; z q @x ; y : Rpx ; y q _ Rpy ; x q @x ; y : Rpx ; y q _ Rpy ; x q @x ; y : Rpx ; y q ñ Rpy ; x q @x ; y : Rpx ; y q ñ x y

By \treatment" we mean any way of logically expressing the relation. For example, a possible treatment of a binary relation R in a theory T may simply mean axiomatizing R in T . But it may also mean transforming T into a satis ability-equivalent theory T 1 where R does not even syntactically appear.

As an example, consider a theory T in which an equivalence relation R occurs. One way to deal with R is to simply axiomatize it, by means of re exivity, symmetry, and transitivity: Another way is to \borrow" the built-in equality treatment that exists in most theorem provers. We can do this by introducing a new symbol rep, and replacing all occurrences of Rpx ; y q by the formula: reppx q

reppy q The intuition here is that rep is now the representative function of the relation R. No axioms are needed. As we shall see, this alternative treatment of equivalence relations is satis ability-equivalent with the original one, and actually is bene cial in practice in certain cases.

In general, when considering alternative treatments, we strive to make use of concepts already built-in to the reasoning tool in order to express other concepts that are not built-in.

For the purpose of this paper, we have decided to focus on three di erent kinds of transitive relations: (1) equivalence relations and partial equivalence relations, (2) total orders and strict total orders, and (3) general re exive, transitive relations. The reason we decided to concentrate on these three are because (a) they appear frequently in practice, and (b) we found well-known ways but also novel ways of dealing with these.

The target audience for this paper is thus both people who use reasoning tools and people who implement reasoning tools. 2

Common properties of binary relations

In this section, we take a look at commonly occurring properties of binary relations, which combinations of these are interesting for us to treat specially, and how we may go about discovering these.

Take a look at Fig. 1. It lists 8 basic and common properties of binary relations. Each of these properties can be expressed using one logic clause, which makes it easy to syntactically identify the presence of such a property in a given theory.

When we investigated the number of occurrences of these properties in a subset of the TPTP problem library1 [ 7 ], we ended up with the table in Fig. 2. The table was constructed by gathering all clauses from all TPTP problems (after clausi cation), and keeping every clause that only contained one binary relation symbol and, possibly, equality. Each such clause was then categorized as an expression of a basic property of a binary relation symbol. We found only 163 such clauses that did not t any of the 8 properties we chose as basic properties, but were instead instances of two new properties. Both of these were quite esoteric and did not seem to have a standard name in mathematics.

The table also contains occurrences where a negated relation was stated to have a certain property, and also occurrences where a ipped relation (a relation with its arguments swapped) was stated to have a certain property, 1For the statistics in this paper, we decided to only look at unsorted TPTP problems with 10.000 clauses or less. and also occurrences of combined negated and ipped relations. This explains for example why the number of occurrences of total relations is the same as for asymmetric relations; if a relation is total, the negated relation is asymmetric and vice-versa.

We adopt the following notation. Given a property of binary relations prop, we introduce its negated version, which is denoted by prop . The property prop holds for R if and only if prop holds for R. Similarly, we introduce the ipped version of a property prop, which is denoted by propò. The property propò holds for R if and only if prop holds for the ipped version of R.

Using this notation, we can for example say that total is equivalent with asymmetric . Sometimes the property we call euclidean here is called right euclidean; the corresponding variant left euclidean can be denoted euclideanò. Note that prop is not the same as prop! For example, a relation R can be re exive, or re exive (which means that R is re exive), or re exive, which means that R is not re exive.

Using this notation on the 8 original basic properties from Fig. 1, we end up with 32 new basic properties that we can use. (However, as we have already seen, some of these are equivalent to others.)

This paper will look at 5 kinds of di erent binary relations, which are de ned as combinations of basic properties: equivalence relation partial equivalence relation total order strict total order re exive; transitive relation tre exive; symmetric; transitive u tsymmetric; transitive u ttotal ; antisymmetric; transitive u tantisymmetric ; asymmetric; transitive u tre exive; transitive u As a side note, in mathematics, strict total orders are sometimes de ned using a property called trichotomous, which means that exactly one of Rpx ; y q, x y , or Rpy ; x q must be true. However, when you clausify this property in the presence of transitivity, you end up with antisymmetric which says that at least one of Rpx ; y q, x y , or Rpy ; x q must be true. There seems to be no standard name in mathematics for the property antisymmetric , which is why we use this name.

In Fig. 3, we display the number of binary relations we have found in (our subset of) the TPTP for each category. The next section describes how we found these. 3

Syntactic discovery of common binary relations

If our goal is to automatically choose the right treatment of equivalence relations, total orders, etc., we must have an automatic way of identifying them in a given theory. It is easy to discover for example an equivalence relation in a theory by means of syntactic inspection. If we nd the presence of the axioms re exive, symmetric, and transitive, for the same relational symbol R, we know that R is an equivalence relation. total total total re exive re exive symmetric symmetric symmetric transitive transitive core exive core exive antisymmetric antisymmetric ô total ò ô asymmetric ô asymmetric ò ô re exiveò ô re exive ò ô symmetric ô symmetricò ô symmetric ò ô transitiveò ô transitive ò ô core exiveò ô core exive ò ô antisymmetricò ô antisymmetric ò

But there is a problem. There are other ways of axiomatizing equivalence relations. For example, a much more common way to axiomatize equivalence relations in the TPTP is to state the two properties re exive and euclidean for R.

Rather than enumerating all possible ways to axiomatize certain relations by hand, we wrote a program that computes all possible ways for any combination of basic properties to imply any other combination of basic properties. Our program generates a table that can be precomputed in a minute or so and then used to very quickly detect any alternative axiomatization of binary relations using basic properties.

Let us explain how this table was generated. We start with a list of 32 basic properties (the 8 original basic properties, plus their negated, ipped, and negated ipped versions). Firstly, we use an automated theorem prover (we used E [ 6 ]) to discover which of these are equivalent with other such properties. The result is displayed in Fig. 4. Thus, 17 basic properties can be removed from the list, because they can be expressed using other properties. The list of basic properties now has 15 elements left.

Secondly, we want to generate all implications of the form tprop1; : : ; propn u ñ prop where the set tprop1; : : ; propn u is minimal. We do this separately for each prop. The results are displayed in Fig. 5.

The procedure uses a simple constraint solver (a SAT-solver) to keep track of all implications it has tried so far, and consists of one main loop. At every loop iteration, the constraint solver guesses a set tprop1; : : ; propn u from the set of all properties P tprop u. The procedure then asks E whether or not tprop1; : : ; propn u ñ prop is valid. If it is, then we look at the proof that E produces, and print the implication tpropa; : : ; propb u ñ prop, where tpropa; : : ; propb u is the subset of properties that were used in the proof. We then also tell the constraint solver never to guess a superset of tpropa; : : ; propb u again. If the guessed implication can not be proven, we tell the constraint solver to never guess a subset of tprop1; : : ; propn u again. The procedure stops when no guesses that satisfy all constraints can be made anymore.

After the loop terminates, we may need to clean up the implications somewhat because some implications may subsume others.

In order to avoid generating inconsistent sets tprop1; : : ; propn u (that would imply any other property), we also add the arti cial inconsistent property false to the set, and generate implications for this property rst. We exclude any found implication here from the implication sets of the real properties.

This procedure generates a complete list of minimal implications. It works well in practice, especially if all guesses are maximized according to their size. The vast majority of the time is spent on the implication proofs, and no signi cant time is spent in the SAT-solver.

To detect a binary relation R with certain properties in a given theory, we simply gather all basic properties about R that occur in the theory, and then compute which other properties they imply, using the pre-generated table. In this way, we never have to do any theorem proving in order to detect a binary relation with certain properties. false ð tre exive; re exive u transitive ð teuclidean; antisymmetric u false ð tre exive; asymmetric u transitive ð teuclidean; euclidean u false ð ttotal; re exive u transitive ð ttotal; euclideanò u false ð ttotal; asymmetric u transitive ð tasymmetric; euclideanò u re exive ð ttotal u transitive ð tsymmetric; euclideanò u euclidean ð tsymmetric; asymmetric u transitive ð tasymmetric; euclidean ò u euclidean ð ttotal; symmetric u transitive ð tantisymmetric; euclidean ò u euclidean ð tantisymmetric; core exive u transitive ð teuclidean; re exive u euclidean ð ttransitive; euclidean u transitive ð teuclidean; euclidean ò u euclidean ð tantisymmetric; euclidean u transitive ð teuclidean; euclideanò u euclidean ð tre exive; core exive u transitive ð tantisymmetric; euclideanò u euclidean ð tre exive; euclidean u transitive ð teuclideanò; antisymmetric u euclidean ð tcore exive u transitive ð teuclidean ; euclideanò u euclidean ð ttotal; core exive u total ð tre exive; core exive u eeuucclliiddeeaann ðð ttrree eexxiivvee; e;ueculcidliedaenanòòuu ttoottaall ðð ttrree eexxiivvee;; eeuucclliiddeeaann òu u euclidean ð tasymmetric; euclideanò u total ð tre exive; antisymmetric u euclidean ð tsymmetric; antisymmetric u total ð tre exive; transitive u euclidean ð ttransitive; re exive ; euclidean ò u symmetric ð tcore exive u euclidean ð ttotal; euclidean u symmetric ð teuclidean ; re exive u euclidean ð ttransitive; core exive u symmetric ð teuclidean; re exive u euclidean ð tasymmetric; core exive u symmetric ð tcore exive u euclidean ð tasymmetric; euclidean u symmetric ð ttotal; euclidean u euclidean ð ttotal; euclidean ò u symmetric ð ttotal; euclidean u euclidean ð tantisymmetric; re exive ; euclidean ò u symmetric ð teuclidean; asymmetric u euclidean ð tasymmetric; euclidean ò u symmetric ð tasymmetric; euclidean u euclidean ð tsymmetric; transitive u symmetric ð tre exive; euclidean ò u euclidean ð teuclidean ; euclideanò symmetric ð tre exive; euclideanò u euclidean ð tsymmetric; euclideanò uu symmetric ð tre exive ; euclideanò u euclidean ð tre exive; euclideanò u symmetric ð tre exive ; euclidean ò u euclidean ð tre exive; symmetric; antisymmetric u symmetric ð ttotal; euclidean ò u euclidean ð tre exive; symmetric; transitive u symmetric ð tasymmetric; euclideanò u euclidean ð ttotal; euclideanò u symmetric ð teuclidean ; euclideanò u euclidean ð teuclideanò; core exive u symmetric ð ttotal; euclideanò u antisymmetric ð tcore exive u symmetric ð teuclidean; euclidean ò u antisymmetric ð tasymmetric u symmetric ð teuclidean; euclideanò u antisymmetric ð ttransitive; re exive u symmetric ð teuclidean; re exive u antisymmetric ð tre exive ; euclideanò u symmetric ð tasymmetric; euclidean ò u antisymmetric ð teuclidean; re exive u symmetric ð teuclidean ; euclidean ò u transitive ð tsymmetric; asymmetric u symmetric ð tre exive; euclidean u transitive ð ttotal; symmetric u core exive ð tsymmetric; asymmetric u transitive ð tantisymmetric; core exive u core exive ð tantisymmetric; core exive u transitive ð tcore exive u core exive ð ttransitive; euclidean ; re exive u transitive ð teuclidean; transitive u core exive ð tasymmetric; core exive u transitive ð tsymmetric; antisymmetric u core exive ð teuclidean; re exive u transitive ð tre exive; euclidean u core exive ð tre exive; antisymmetric; euclidean u transitive ð teuclidean; re exive u core exive ð tasymmetric; euclidean u transitive ð ttotal; euclidean u core exive ð teuclidean; asymmetric u transitive ð tre exive; core exive u core exive ð tsymmetric; antisymmetric u transitive ð tasymmetric; core exive u core exive ð ttotal; antisymmetric; euclidean u transitive ð tasymmetric; euclidean u core exive ð tre exive; antisymmetric; euclidean ò u transitive ð tasymmetric; transitive u core exive ð tre exive; antisymmetric; euclideanò u ttrraannssiittiivvee ðð tteaunctilisdyemamnòet;reicu;ctlridaenasnitivòe u u ccoorree eexxiivvee ðð tttsryamnmsiteitvreic; ;retraenxsiivteive;;croereexeixvieve u u transitive ð teuclidean; core exive u core exive ð tantisymmetric; euclidean ; euclideanò u transitive ð ttotal; core exive u core exive ð ttotal; antisymmetric; euclideanò u transitive ð teuclidean; asymmetric u core exive ð ttotal; euclidean; antisymmetric u transitive ð ttotal; euclidean u core exive ð ttotal; antisymmetric; euclidean ò u transitive ð tre exive; euclidean ò u core exive ð tantisymmetric; re exive ; euclidean ò u transitive ð teuclideanò; transitive u core exive ð tre exive ; euclideanò u transitive ð teuclidean; symmetric u core exive ð tantisymmetric; euclidean ; re exive u transitive ð tre exive ; euclideanò u core exive ð tasymmetric; euclideanò u ttrraannssiittiivvee ðð tttaonttails;yemumcliedtreaicn; euòculidean u ccoorree eexxiivvee ðð ttaasnytimsymmemtreictr;iecu;celuidceliadneanò u; euclidean ò u transitive ð teuclidean; antisymmetric u core exive ð teuclidean; antisymmetric; euclidean ò u transitive ð tre exive; euclideanò u core exive ð teuclidean; antisymmetric; euclideanò u transitive ð tre exive; symmetric; antisymmetric u core exive ð teuclidean; re exive; antisymmetric u transitive ð teuclideanò; core exive u core exive ð ttransitive; re exive ; euclidean ò u transitive ð tre exive; symmetric; transitive u

Handling equivalence relations

Equali cation As mentioned in the introduction, an alternative way of handling equivalence relations R is to create a new symbol rep and replace all occurrences of R with a formula involving rep:

R re exive R symmetric R transitive

T r: : Rpx ; y q : :s _

T r: : reppx q reppy q : :s

To explain the above notation: We have two theories, one on the left-hand side of the arrow, and one on the right-hand side of the arrow. The transformation transforms any theory that looks like the left-hand side into a theory that looks like the right-hand side. We write T r: : e : :s for theories in which e occurs syntactically; in the transformation, all occurrences of e should be replaced.

We call the above transformation equali cation. This transformation may be bene cial because the reasoning now involves built-in equality reasoning instead of reasoning about an unknown symbol using axioms.

The transformation is correct, meaning that it preserves (non-)satis ability: (ñ) If we have a model of the LHS theory, then R must be interpreted as an equivalence relation. Let rep be the representative function of R, in other words we have Rpx ; y q ô reppx q reppy q. Thus we also have a model of the RHS theory. (ð) If we have a model of the RHS theory, let Rpx ; y q : reppx q reppy q. It is clear that R is re exive, symmetric, and transitive, and therefore we have model of the LHS theory.

In the transformation, we also remove the axioms for re exivity, symmetry, and transitivity, because they are not needed anymore. But what if R is axiomatized as an equivalence relation using di erent axioms? Then we can remove any axiom about R that is implied by re exivity, symmetry, and transitivity. Luckily we have already computed a table of which properties imply which other ones (shown in Fig. 5).

Pequali cation There are commonly occurring binary relations called partial equivalence relations that almost behave as equivalence relations, but not quite. In particular, they do not have to obey the axiom of re exivity. Can we do something for these too?

It turns out that a set with a partial equivalence relation R can be partitioned into two subsets: (1) one subset on which R is an actual equivalence relation, and (2) one subset of elements which are not related to anything, not even themselves.

Thus, an alternative way of handling partial equivalence relations R is to create two new symbols, rep and P , and replace all occurrences of R with a formula involving rep and P:

R symmetric R transitive

T r: : Rpx ; y q : :s _

T r: : pP px q ^ P py q ^ reppx q reppy qq : :s Here, P is the predicate that indicates the subset on which R behaves as an equivalence relation.

We call this transformation pequali cation. This transformation may be bene cial because the reasoning now involves built-in equality reasoning instead of reasoning about an unknown symbol using axioms. However, there is also a clear price to pay since the size of the problem grows considerably.

The transformation is correct, meaning that it preserves (non-)satis ability: (ñ) If we have a model of the LHS theory, then R must be interpreted as a partial equivalence relation. Let P px q : Rpx ; x q, in other words P is the subset on which R behaves like an equivalence relation. Let rep be a representative function of R on P , in other words we have pP px q ^ P py qq ñ pRpx ; y q ô reppx q reppy qq. By the de nition of P we then also have Rpx ; y q ô pP px q ^ P py q ^ reppx q reppy qq. Thus we also have a model of the RHS theory. (ð) If we have a model of the RHS theory, let Rpx ; y q : P px q ^ P py q ^ reppx q reppy q. This R is symmetric and transitive, and therefore we have model of the LHS theory. 5

Handling total orders

Ordi cation Many reasoning tools have built-in support for arithmetic, in particular they support an order ¤ on numbers. It turns out that we can \borrow" this operator when handling general total orders. Suppose we have a total order: rep : A Ñ R

R total R antisymmetric R transitive

T r: : Rpx ; y q : :s We now introduce a new injective function: We then replace all occurrences of R with a formula involving rep in the following way: _ reppy q ñ x

T r: : reppx q ¤ reppy q : :s

(Here, ¤ is the order on reals.) We call this transformation ordi cation. This transformation may be bene cial because the reasoning now involves built-in arithmetic reasoning instead of reasoning about an unknown symbol using axioms.

The above transformation is correct, meaning that it preserves (non-)satis ability: (ñ) If we have a model of the LHS theory, then without loss of generality (by Lowenheim-Skolem), we can assume that the domain is countable. Also, R must be interpreted as a total order. We now construct rep recursively as a mapping from the model domain to R, such that we have Rpx ; y q ô reppx q ¤ reppy q, in the following way. Let ta0; a1; a2; : : u be the domain of the model, and set reppa0q : 0. For any n ¡ 0, pick a value for reppanq that is consistent with the total order R and all earlier domain elements ai, for 0 ¤ i n. This can always be done because there is always extra room for a new, unique element between any two distinct values of R. Thus rep is injective and we also have a model of the RHS theory. (ð) If we have a model of the RHS theory, let Rpx ; y q : reppx q ¤ reppy q. It is clear that R is total and transitive, and also antisymmetric because rep is injective, and therefore we have model of the LHS theory.

Note on Q vs. R The proof would have worked for Q as well instead of R. The transformation can therefore be used for any tool that supports Q or R or both, and should choose whichever comparison operator is cheapest if there is a choice. Using integer arithmetic would however not have been correct.

Note on strict total orders total orders, i.e. something like:

One may have expected to have a transformation speci cally targeted to strict

R antisymmetric R asymmetric R transitive

T r: : Rpx ; y q : :s _ reppy q ñ x y

T r: : reppx q reppy q : :s

However, the transformation for total orders already covers this case! Any strict total order R is also recognized as a total order R, and ordi cation already transforms such theories in the correct way. The only di erence is that Rpx ; y q is replaced with preppx q ¤ reppy qq instead of reppx q reppy q, which is satis ability-equivalent. (There may be a practical performance di erence, which is not something we have investigated.) Maxi cation Some reasoning tools do not have orders on real arithmetic built-in, but they may have other concepts that are built-in that can be used to express total orders instead. One such concept is handling of associative, commutative (AC) operators.

For such a tool, one alternative way of handling total orders R is to create a new function symbol max and replace all occurrences of R with a formula involving max:

R total R antisymmetric R transitive

T r: : Rpx ; y q : :s _ max associative max commutative @x ; y : maxpx ; y q T r: : maxpx ; y q x _ maxpx ; y q y : :s y We call this transformation maxi cation. This transformation may be bene cial because the reasoning now involves built-in equality reasoning with AC uni cation (and one extra axiom) instead of reasoning about an unknown relational symbol (using three axioms).

The above transformation is correct, meaning that it preserves (non-)satis ability: (ñ) If we have a model of the LHS theory, then R must be interpreted as a total order. Let max be the maximum function associated with this order. Clearly, it must be associative and commutative, and the third axiom also holds. Moreover, we have Rpx ; y q ô maxpx ; y q y . Thus we also have a model of the RHS theory. (ð) If we have a model of the RHS theory, let Rpx ; y q : maxpx ; y q y . Given the axioms in the RHS theory, R is total, antisymmetric, and transitive, and therefore we have model of the LHS theory.

(It may be the case that maxi cation can also be used to express orders that are weaker than total orders. At the time of this writing, we have not gured out how to do this.) 6

Handling re exive, transitive relations

Transi cation The last transformation we present is designed as an alternative treatment for any relation that is re exive and transitive. It does not make use of any built-in concept in the tool. Instead, it transforms theories with a transitivity axiom into theories without that transitivity axiom. Instead, transitivity is specialized at each positive occurrence of the relational symbol.

As such, an alternative way of handling re exive, transitive relations R is to create a new symbol Q and replace all positive occurrences of R with a formula involving Q; the negative occurrences are simply replaced by Q:

R re exive R transitive

T r: : Rpx ; y q : : Rpx ; y q : :s _

Q re exive

T r: : p@r : Qpr ; x q ñ Qpr ; y qq : :

Qpx ; y q : :s We call this transformation transi cation. This transformation may be bene cial because reasoning about transitivity in a naive way can be very expensive for theorem provers, because from transitivity there are many possible conclusions to draw that trigger each other \recursively". The transformation only works on problems where every occurrence of R is either positive or negative (and not both, such as under an equivalence operator). If this is not the case, the problem has to be translated into one where this is the case. This can for example be done by means of clausi cation.

Note that the resulting theory does not blow-up; only clauses with a positive occurrence of R gets one extra literal per occurrence.

We replace any positive occurrence of Rpx ; y q with an implication that says \for any r , if you could reach x from r , now you can reach y too". Thus, we have specialized the transitivity axiom for every positive occurrence of R.

Note that in the RHS theory, Q does not have to be transitive! Nonetheless, the transformation is correct, meaning that it preserves (non-)satis ability: (ñ) If we have a model of the LHS theory, then R is re exive and transitive. Now, set Q px ; y q : R px ; y q. Q is obviously re exive. We have to show that R px ; y q implies @r : Qpr ; x q ñ Qpr ; y q. This is indeed the case because R is transitive. Thus we also have a model of the RHS theory. (ð) If we have a model of the RHS theory, then Q is re exive. Now, set Rpx ; y q : @r : Qpr ; x q ñ Qpr ; y q. R is re exive (by re exivity of implication) and transitive (by transitivity of implication). Finally, we have to show that Qpx ; y q implies Rpx ; y q, which is the same as showing that @r : Qpr ; x q ñ Qpr ; y q implies Qpx ; y q, which is true because Q is re exive. Thus we also have a model of the RHS theory. 7

Experimental results

We evaluate the e ects of the di erent axiomatizations using two di erent resolution based theorem provers, E 1.9 [ 6 ] (with the xAuto and tAuto options) and Vampire 4.0 [ 5 ] (with the casc mode option), and two SMTsolvers, Z3 4.4.2 [ 4 ] and CVC4 1.4 [ 2 ]. The experiments were performed on a 2xQuad Core Intel Xeon E5620 processor with 24 GB physical memory, running at 2.4 GHz. We use a time limit of 5 minutes on each problem.

We started from a set of 11674 test problems from the TPTP, listed as Unsatis able, Theorem, Unknown or Open (leaving out the very large theories). For each problem, a new theory was generated for each applicable transformation. For most problems, no relation matching any of the given criteria was detected, and thus no new theories were produced for these problems. Evaluation of reasoning tools on Satis able and CounterSatis able problems is left as future work.

The experimental results are summarized in Fig. 6. equali cation pequali cation transi cation ordi cation maxi cation (430) (181) (573) (328) (328) 422 96 324 273 Equivalence relations were present in 430 of the test problems. The majority of these problems appear in the GEO and SYN categories. Interestingly, among these 430 problems, there are only 23 problems whose equivalence relations are axiomatized with transitivity axioms. The remaining 407 problems axiomatize equivalence relations with euclidean and re exivity axioms, as discussed in section 3. The number of equivalence relations in each problem ranges from 1 to 40, where problems with many equivalence relations all come from the SYN category. There is no clear correspondence between the number of equivalence relations in a problem and the performance of the prover prior to and after the transformation.

Equali cation As can be seen in Fig. 6, equali cation performs very well with Z3, and somewhat well with CVC4, while it worsens the results of the resolution based provers, which already performed well on the original problems. Using a time slicing strategy, which runs the prover on the original problem for half the time and on the transformed problem for the second half, solves a strict superset of problems than the original for all of the theorem provers used in the evaluation. Fig. 7 shows in more detail the e ect on solving times for the di erent theorem provers.

Pequali cation In 181 of the test problems, relations that are transitive and symmetric, but not re exive, were found. The majority of these problems are in the CAT and FLD categories of TPTP. All of the tested theorem provers perform worse on these problems compared to the problems with true equivalence relations. This is also the case after performing pequali cation. Pequali cation turns out to be particularly bad for E, which solves 34 fewer problems after the transformation. Pequali cation makes Z3 perform only slightly better and Vampire and CVC4 slightly worse.

Hard problems solved using Pequali cation After Pequali cation, one problem with rating 1.0 (FLD0412) is solved by Vampire. Rating 1.0 means that no known current theorem prover solves the problem in reasonable time. 7.2

Total orders Total orders were found in 328 problems. The majority are in the SWV category, and the remaining in HWV, LDA and NUM. In 77 of the problems, the total order is positively axiomatized, and in the other 251 problems it is negative (and thus axiomatized as a strict total order). There is never more than one order present in any of the problems.

Ordi cation For each of the problems, we ran the theorem provers with built-in support for arithmetic on the problems before and after applying ordi cation. Vampire was run on a version in TFF format [ 8 ], and Z3 and CVC4 on a version in SMT format [3]. The original problems were also transformed into TFF and SMT in order to achieve a relevant comparison. Ordi cation performs well for Z3, while for Vampire and CVC4 it is good for some problems and bad for some. Fig. 8 shows how solving times are a ected, and the diagrams also show great potential for time slicing, in particular for Vampire and Z3.

Hard problems solved using Ordi cation After Ordi cation, 15 problems, all from the SWV category, with rating 1.0 are solved. (SWV004-1, SWV035+1, SWV040+1, SWV044+1, SWV049+1, SWV079+1, SWV100+1, 0.1 0.01 100 3 10 z d e iilf a equ 1 0.1 0.01 e ir p m a v d e iilf a u q e 10 1 0.1 100 4 10 c v c d e iif laqu 1 e 0.1 0.01 0.01 0.1 1 original-E 10 100 SWV101+1, SWV108+1, SWV110+1, SWV113+1, SWV118+1, SWV120+1, SWV124+1, SWV130+1) Vampire and Z3 each solve 14 hard problems and CVC4 solves 13 of them. One problem (SWV004-1) is even categorized as Unknown, which means that no prover has ever solved it. After ordi cation, all three provers were able to solve it in less than a second.

Maxi cation Maxi cation, our second possible treatment of total orders, turned out to be disadvantageous to E and Z3, while having a small positive e ect on Vampire and CVC4 (see Fig. 6). 7.3

Transitive and Re exive relations 573 test problems include relations that are transitive and re exive, excluding equivalence relations and total orders. In all of them, transitivity occurs syntactically as an axiom. The problems come from a variety of categories, but a vast majority are in SET, SEU and SWV. Only about half of the original problems were solved by each theorem prover, which may indicate that transitivity axioms are di cult for current theorem provers to deal with. We evaluate the performance of the theorem provers before and after transi cation. Problems that include equivalence relations and total orders are excluded, as the corresponding methods equali cation and ordi cation give better results and should be preferred. Vampire bene ts the most from transi cation, and solves 32 new problems after the transformation, while 10 previously solvable problems become unsolvable within the time limit. For E, the e ect of transi cation is almost exclusively negative. Vampire also has a signi cantly worse performance than E overall on the original problems that include relations that are transitive and re exive. Both Z3 and CVC4 perform worse after the transformation, but time slicing can be a good way to improve the results.

Hard problems solved using Transi cation After transi cation, two new problems with rating 1.0 are solved, both by Vampire (SEU322+2 and SEU372+2). 0.1 3 i-edZ 1 ifr d o 0.1 10 4 C -dCV 1 iirfe d o 0.1 Equali cation and Transi cation Since all equivalence relations are transitive and re exive, the method for transi cation works also on equivalence relations. Comparing the two methods on the 430 problems with equivalence relations, we concluded that equali cation and transi cation work equally bad for E, Vampire and CVC4. Both transi cation and equali cation improve the results for Z3, but equali cation does so signi cantly. Ordi cation and Transi cation We compared ordi cation and transi cation on the 328 problems containing total orders. Transi cation seems to make these problems generally more di cult for theorem provers to solve, while ordi cation instead improved the results on many of the problems. Transi cation makes the theorem prover perform worse on these problems also for E, which cannot make use of ordi cation since it does not provide support for arithmetic. 8

Discussion, Conclusions, and Future Work

We have presented 5 transformations that can be applied to theories with certain transitive relations: equali cation, pequali cation, transi cation, ordi cation, and maxi cation. We have also created a method for syntactic discovery of binary relations where these transformations are applicable.

For users of reasoning tools that create their own theories, it is clear that they should consider using one of the proposed alternative treatments when writing theories. For all of our methods, there are existing theories for which some provers performed better on these theories than others. In particular, there exist 18 TPTP problems that are now solvable that weren't previously.

For implementers of reasoning tools, our conclusions are less clear. For some combinations of treatments and provers (such as transi cation for Vampire, and equali cation for Z3), overall results are clearly better, and we would thus recommend these treatments as preprocessors for these provers. Some more combinations of treatments and provers lend themselves to a time slicing strategy that can solve strictly more problems, and could thusly be integrated in a natural way in provers that already have the time slicing machinery in place. Related Work Chaining [ 1 ] is a family of methods that limit the use of transitivity-like axioms in proofs by only allowing certain chains of them to occur in proofs. The result is a complete proof system that avoids the derivation of unnecessary consequences of transitivity. However, chaining is not implemented in any of the reasoning tools we considered for this paper. In personal communication with some of the authors, chaining-like techniques have not been deemed important enough to be considered for implementation, and their preliminary experimental results were mostly negative.

Future Work There is a lot of room for improvements and other future work. There are many other relations that are more or less common that could bene t from an alternative treatment like the transformations described in this paper. In particular, maxi cation seems to be an idea that could be applied to binary relations that are weaker than total orders, which may make this treatment more e ective. But there are also other, non-transitive relations that are of interest.

Ordi cation uses total orders as the base-transformation, and treats strict total orders as negated total orders. We would like to investigate more in which cases it may be bene cial to treat strict total orders di erently from total orders, and when to use on R instead of ¤.

We would also like to investigate the e ect of our transformations on Satis able and CounterSatis able 0.1 100 3 10 z d e iifs tran 1 0.1 0.01 e ir p m a v d e iifs n a tr 10 1 0.1 100 4 10 c v c d e iif rsan 1 t 0.1 0.01 0.01 0.01 0.1 1 original-e 10 100 problems. What would be the e ect on nite model nders? Some transformations change the number of variables per clause, which will in uence the performance of nite model nders. What happens to saturationbased tools? Is it easier or harder to saturate a problem after transformation? Evaluating these questions is hard, because we did not nd many satis able problems in the TPTP that contained relevant relations (see Fig. 3). Instead, we considered using Unsatis able problems, and measuring the time it takes to show the absence of models up to a certain size.

There are other kinds of relations than binary relations. For example, we can have an ternary relation that behaves as an equivalence relation in its 2nd and 3rd argument. An alternative treatment of this relation would be to introduce a binary function symbol rep. We do not know whether or not this occurs often, and if it is a good idea to treat higher-arity relational symbols specially in this way.

Lastly, we would like to look at how these ideas could be used inside a theorem prover; as soon as the prover discovers that a relation is an equivalence relation or a total order, one of our transformations could be applied, on the y. The details of how to do this remain to be investigated. [3] Clark Barrett, Aaron Stump, Cesare Tinelli, Sascha Boehme, David Cok, David Deharbe, Bruno Dutertre, Pascal Fontaine, Vijay Ganesh, Alberto Griggio, Jim Grundy, Paul Jackson, Albert Oliveras, Sava Krsti,

[1]

Leo

Bachmair and

Harald

Ganzinger . Ordered chaining calculi for rst-order theories of transitive relations . Journal of the ACM (JACM) , 45 ( 6 ): 1007 { 1049 , 1998 .

[2]

Clark

Barrett , Christopher L. Conway, Morgan Deters, Liana Hadarean, Dejan Jovanovic, Tim King,

Andrew

Reynolds , and Cesare Tinelli. CVC4. In Proceedings of the 23rd International Conference on Computer Aided Veri cation, CAV'11 , pages 171 { 177 , Berlin, Heidelberg, 2011 . Springer-Verlag.

[4] Leonardo de Moura and Nikolaj Bj rner. Z3: An E cient SMT Solver . In Tools and Algorithms for the Construction and Analysis of Systems , 14th International Conference, TACAS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software , ETAPS 2008 , Budapest, Hungary, March 29-April 6, 2008 . Proceedings, volume 4963 of Lecture Notes in Computer Science, pages 337 { 340 . Springer, 2008 .

[5]

Laura

Kovacs and

Andrei

Voronkov . First-Order Theorem Proving and Vampire . In Computer Aided Veri - cation - 25th International Conference, CAV 2013 ,

Saint

Petersburg , Russia, July 13-19 , 2013 . Proceedings, pages 1 { 35 , 2013 .

[6]

Stephan

Schulz. The E Theorem Prover , 2015 . http://www.eprover.org/.

[7]

Geo

Sutcli e . The TPTP Problem Library , 2015 . http://www.tptp.org/.

[8] Geo

Sutcli e

, Stephan Schulz, Koen Claessen, and

Peter

Baumgartner . The TPTP Typed First-order Form and Arithmetic . In Nikolaj Bj rner and Andrei Voronkov, editor, The 18th International Conference on Logic for Programming Arti cial Intelligence and Reasoning (LPAR-18) , pages 406 { 419 , Merida , Venezuela, mar 2012 . Springer Verlag.