=Paper=
{{Paper
|id=Vol-2908/short12
|storemode=property
|title=Proof Tree Preserving Sequence Interpolation of Quantified Formulas in the Theory of Equality
|pdfUrl=https://ceur-ws.org/Vol-2908/short12.pdf
|volume=Vol-2908
|authors=Elisabeth Henkel,Jochen Hoenicke,Tanja Schindler
|dblpUrl=https://dblp.org/rec/conf/smt/HenkelHS21
}}
==Proof Tree Preserving Sequence Interpolation of Quantified Formulas in the Theory of Equality==
Proof Tree Preserving Sequence Interpolation of Quantified Formulas in the Theory of Equality (work in progress) Elisabeth Henkel, Jochen Hoenicke and Tanja Schindler Department of Computer Science, University of Freiburg, Germany Abstract Interpolation of SMT formulas is difficult, especially in the presence of quantifiers since quantifier instantiations introduce mixed terms, i.e., terms containing symbols that belong to different partitions of the input formula. Existing interpolation algorithms for quantified formulas require proof modifications or syntactical restrictions on the generated proof trees. We present a non-restrictive, proof tree preserving approach to compute inductive sequences of interpolants for quantified formulas in the theory of equality with uninterpreted functions using a single proof tree. Keywords SMT, interpolation, quantified formulas, sequence interpolants, proof tree preserving interpolation 1. Introduction We present an interpolation procedure for quantified formulas in the theory of equality with uninterpreted functions. Interpolants have various applications in model checking, for example, during abstraction refinement or invariant generation [1, 2, 3, 4]. For the analysis of sequential programs, sequence interpolants [2] that are computed from infeasible error paths, are of particular interest. For each point of a program path, a sequence interpolant provides a reason why an error is unreachable from this point on. Often, the interpolant is general enough to exclude several paths at once. Many verification problems require the ability to reason about quantifiers, which are, among others, needed in user-added assertions, for the modeling of unsupported theories or memory modeling [5, 6]. However, the support of interpolation in the presence of quantified formulas is limited. A major challenge are mixed terms that are introduced in the context of quantifier instantiations. Most interpolation algorithms are either restricted to quantifier-free input for- mulas [7, 8, 9, 10, 11, 12, 13], require proof modifications [14], or require syntactical restrictions on the generated proof trees [15, 16, 17]. Our method is based on the proof tree preserving interpolation approach [12, 13] which allows for interpolation of quantifier-free formulas in the presence of mixed literals without the need for proof modifications or solver restrictions. In particular, sequence or tree interpolants SMT 2021, 19th International Workshop on Satisfiability Modulo Theories, July 18 - 19, 2021 " henkele@informatik.uni-freiburg.de (E. Henkel); hoenicke@informatik.uni-freiburg.de (J. Hoenicke); schindle@informatik.uni-freiburg.de (T. Schindler) Β© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) 3 can be computed from one single proof of unsatisfiability. We extend this procedure to handle instantiation-based resolution proofs of quantified formulas similar to [14]. However, we simplify the presented quantifier introduction step and perform all necessary modifications only virtually such that the proof tree preserving character is restored. The contributions of this paper are a proof tree and satisfiability preserving scheme for the virtual modification of mixed terms, a scheme to interpolate instantiation clauses generated by a quantifier solver, and modifications of interpolation rules for theory lemmas and resolvents to generate valid inductive sequences of interpolants from a single proof tree. Related Work. Interpolation goes back to Craig [18] who proved that for a valid implication π΄ β π΅ in first-order logic, there always exists an interpolant (Craig called it βintermediateβ), i.e., a formula πΌ that is implied by π΄ and implies π΅, and that contains only symbols that π΄ and π΅ share. Since then, several methods for computing interpolants from proofs of unsatisfiability have been presented both for quantifier-free and quantified problems. We focus our presentation on some of the most recent and some of the closest related works, and refer to [15, 19] for an overview of existing work before 2015. Most work on interpolation in the presence of quantified formulas has been done in the context of automated theorem proving. Gleiss et al. [17] present an interpolation method based on splitting the proof of unsatisfiability into local subproofs. Their method can be applied to arbitrary first-order theories, but requires local proofs, i.e., proofs where each inference can be assigned to one partition as all occurring symbols lie in one partition. Each subproof that only uses inferences assigned to π΄ is then summarized in an intermediant. For local proofs, the conjunction of these intermediants is an interpolant. The method is implemented in the Vampire theorem prover [20]. Bonacina and Johansson [15] present a method for interpolation in first-order logic with equality in the context of the superposition calculus which first computes provisional interpolants and then quantifies over constants that are not shared between π΄ and π΅. Their method does not require local proofs, but only yields interpolants if the only non-shared symbols in the provisional interpolants are constants. The authors also discuss how the method can be used in the DPLL(Ξ + π― ) calculus [21], but to our knowledge, there does not exist an implementation. The method presented by KovΓ‘cs and Voronkov [16] does not require local proofs either. They also follow a two-step approach, first computing relational interpolants and then quantifying over non-shared symbols. In contrast to [15], their method can deal with non-shared function symbols in the relational interpolants, but is restricted to logic without equality. A basis for our work is the interpolation method by Christ and Hoenicke [14] for instantiation- based proofs in SMT solvers. It relies on proof transformations to obtain a purified proof where all instances of quantified formulas are local. This is achieved by introducing variables and auxiliary equations relating the variables to the terms they replace. To obtain an interpolant, these variables are bound by quantifiers once the terms they replace are resolved from the proof. Finally, Christ et al. [12] present a framework that allows for computing quantifier-free interpolants for quantifier-free problems from a given proof of unsatisfiability without modifying the proof. In particular, it deals with mixed literals containing non-shared terms from both π΄ and π΅ by auxiliary variables and more complex interpolation rules. We describe this method in more detail in Section 3. 4 2. Notation We assume the usual notions of first-order logic. A theory π― is defined by a signature that contains interpreted and uninterpreted constant, function, and predicate symbols, and by a set of axioms that settle the meaning for its interpreted symbols. We consider the theory of equality (with uninterpreted functions), whose axioms state reflexivity, symmetry, and transitivity for the single interpreted symbol, the equality predicate =, and congruence for each uninterpreted function symbol. As usual, a term is a variable, a constant, or an application of an π-ary function symbol to π terms. An atom is the formula that is always true (β€), an equality between two terms, or the application of an π-ary predicate to π terms. A literal is an atom or its negation. A clause is a disjunction of literals, and a formula is in conjunctive normal form (CNF) if it is a conjunction of clauses. We assume that formulas are in CNF where each clause can be preceded by universal quantifiers only. This can be accomplished by Skolemization. We usually denote functions by π, π, β, terms by π, π, π , π‘, variables by π£, π₯, π¦, π§, literals by β, clauses by πΆ, the empty clause by β₯, and formulas by π, π, π΄, π΅, πΌ. For two formulas π and π, we use π |= π to denote that π is a logical consequence of π in the theory of equality. Formula π is unsatisfiable if and only if π |= β₯. We denote the equivalence of two formulas by π β‘ π. A formula is said to be ground if it does not contain any variable. A substitution maps variables to terms. We denote the result of substituting the terms Β―π‘ for the variables π₯ Β― in a formula π(π₯ Β― ) by π(π‘Β―). We call a substitution ground if the terms Β―π‘ are ground. By π[β] we denote a formula in negation normal form where the literal β only occurs positively, and by π[π] the same formula where all occurrences of β are replaced by the formula π. 3. Preliminaries The basis for our interpolation procedure is an instantiation-based resolution proof for the unsatisfiability of a given formula. We shortly describe the structure of such proofs and how interpolants can be computed from ground resolution proofs. Instantiation-Based Resolution Proofs. An instance of a quantified formula βπ₯ Β― ) is Β― .π(π₯ the result of applying a ground substitution for the variables π₯ Β― to the body π, i.e., the formula π(π‘Β―) for ground terms Β―π‘. Most SMT solvers treat problems containing quantified formulas by successively adding instances of the quantified formulas to the ground part which is then solved in the usual way (e.g., with the DPLL(π― )/CDCL algorithm). This approach is mainly suitable to show unsatisfiability of a formula, but it can also show satisfiability in some restricted fragments. A resolution proof for the unsatisfiability of a formula in CNF is a derivation of the empty clause β₯ using the rule πΆ1 β¨ β πΆ2 β¨ Β¬β πΆ1 β¨ πΆ2 where πΆ1 and πΆ2 are clauses, and β is a literal called the pivot (literal). A resolution proof can be represented by a tree. For a formula in a theory π― that contains quantified subformulas, the tree has three types of leaves: input clauses, theory lemmas, i.e., clauses that are valid in 5 the theory π― , and instantiation clauses of the form Β¬(βπ₯ Β― )) β¨ π(π‘Β―). Note that we treat the Β― .π(π₯ quantified formula as a literal which also occurs positively as an input clause. The inner nodes are clauses obtained by resolution, and the unique root node is the empty clause β₯. Interpolation. A Craig interpolant [18] for an unsatisfiable conjunction π΄ β§ π΅ is a formula πΌ that is implied by π΄, contradicts π΅, and contains only symbols that occur in both π΄ and π΅.1 The notion of a Craig interpolant extends to formulas in a theory π― , in this case, symbols interpreted by the theory may occur in the interpolant. The notion of binary interpolants can be generalized to sequences of formulas. Given such a sequence of formulas π1 , . . . , ππ that together are unsatisfiable, a sequence interpolant [2] πΌ0 , . . . , πΌπ has the following properties: 1. the formulas πΌ0 and πΌπ are β€ and β₯, respectively, 2. the sequence of interpolants is inductive, i.e., for π = 0, . . . , π, the interpolant πΌπ together with the formula ππ+1 implies the next interpolant πΌπ+1 , and 3. for π = 1, . . . , π, each interpolant πΌπ contains only symbols that occur both in some of the formulas π1 , . . . , ππ and in some of the formulas ππ+1 , . . . ππ . We call the formulas ππ the partitions2 of the interpolation problem. If a symbol occurs in two partitions ππ and ππ , we can assume that it also occurs in all partitions between π and π without changing the symbol condition of the sequence interpolants. Therefore, we say a symbol virtually occurs in ππ if it occurs in some ππ with π β€ π and in some ππ with π β€ π. Each interpolant πΌπ in a sequence interpolant is also a binary interpolant of π΄π :β‘ π1 β§Β· Β· Β·β§ππ and π΅π :β‘ ππ+1 β§ Β· Β· Β· β§ ππ . Note that not every sequence of binary interpolants obtained by such partitioning is a sequence interpolant. A literal is mixed if there is no partition ππ where all symbols of the literal virtually occur. All literals of the input problem are obviously not mixed, but mixed literals can be introduced by Nelson-Oppen equality propagation. Interpolants can be computed from resolution proofs, e.g., with the interpolation systems by PudlΓ‘k [7] or by McMillan [8]. The general idea is to compute so-called partial interpolants for every formula proved in the intermediate steps. For resolution proofs such a formula is a clause πΆ. The conjunction Β¬πΆ is split and added to the input formulas to obtain a sequence interpolation problem. First, each term and literal is given one or more colors that correspond to the partitions where it occurs. A term that occurs in multiple partitions can be given multiple colors, or a color can be chosen arbitrarily. If a term has multiple colors, we require that the set of colors forms an interval [π, π]. We define the projection Β¬πΆ β ππ as the conjunction of all literals in Β¬πΆ that have color π. In particular, β β ππ is β if β has color π, and β€ otherwise. Then the partial interpolant of πΆ is a sequence interpolant of π1 β§ Β¬πΆ β π1 , . . . , ππ β§ Β¬πΆ β ππ . The interpolation systems of PudlΓ‘k and McMillan only differ in the way they define the coloring: for PudlΓ‘k a shared literal is colored with all partitions it occurs in, for McMillan a shared literal is colored with only the last partition it occurs in. Both systems require that each literal has at 1 As mentioned in the related work, Craig originally defined an interpolant for a valid implication π΄ β π΅, i.e., as a formula implied by π΄ and contradictory to π΅. The notion we use is also known as reverse interpolant [22]. 2 The word partition can be used in English for the sections, or the boundaries, or the whole partitioning. In this paper we use the word partition for one section of the partitioning. 6 least one color and that all symbols in the literal occur in the corresponding partition. Mixed literals require a different interpolation scheme where projections β β ππ are defined differently. Proof Tree Preserving Interpolation. The proof tree preserving interpolation framework allows for computing binary [12] and sequence or tree interpolants [13] in the presence of mixed equality literals without modifying a given proof tree. It works with equality interpolating theories [9, 23]. A mixed equality literal π = π or π ΜΈ= π is a literal where π is colored with [π β² , π] and π is colored with [π, π β² ] where these intervals are disjoint. W.l.o.g., we assume that ππ < ππ . We also say that this literal is π, π-mixed. The basic idea is to use a fresh auxiliary variable π₯ for each mixed equality literal. When projecting a π, π-mixed literal, we split it into two literals using the auxiliary variable. (π = π) β ππ :β‘ (π = π₯) (π = π) β ππ :β‘ (π₯ = π) (π = π) β ππ :β‘ β€ for π ΜΈ= π, π (π ΜΈ= π) β ππ :β‘ EQ(π₯, π) (π = ΜΈ π) β ππ :β‘ β€ for π ΜΈ= π, π ΜΈ π) β ππ :β‘ Β¬ EQ(π₯, π) (π = Here, EQ is an uninterpreted predicate that forces the interpolant to have a certain shape. The interpolation procedure treats EQ and π₯ as shared symbols, which may occur in the interpolant. Note that β β ππ β§ β β ππ is equisatisfiable to β and has the same models for the symbols in β. Again, we define the partial interpolants of a clause πΆ as the sequence interpolant for the sequence π1 β§ Β¬πΆ β π1 , . . . , ππ β§ Β¬πΆ β ππ . If the negation Β¬πΆ of a clause contains a mixed equality π = π, the partial interpolants πΌπ for πΆ with π β€ π < π may contain the auxiliary variable π₯. If Β¬πΆ contains π ΜΈ= π, we require that the variable π₯ in the interpolant πΌπ only occurs in atoms of the form EQ(π₯, π π ) which must occur positively in πΌπ . This is not really a restriction as interpolants naturally have this shape. The π π are shared terms, which are closely related to equality-interpolating terms in the sense of [9]. As mentioned earlier, an interpolant is computed by annotating each clause πΆ in the proof tree, starting with the leaves, by a partial interpolant πΌ0 , . . . , πΌπ . We denote this by πΆ : πΌ1 , . . . , πΌπβ1 (we omit πΌ0 β‘ β€ and πΌπ β‘ β₯). Resolution combines the partial interpolants of the involved clauses. To resolve on a mixed literal π = π, we combine the partial interpolants of the input clauses using the following function that replaces all occurrences of EQ(π₯, π π ) in πΌ by a copy of πΌ β² where all occurrences of π₯ are replaced by π π . ruleEq(π = π, πΌ[EQ(π₯, π 1 )]...[EQ(π₯, π π )], πΌ β² (π₯)) :β‘ πΌ[πΌ β² (π 1 )]...[πΌ β² (π π )] The following rule is used to compute the partial interpolants for the resolvent of a resolution rule from the partial interpolants of the antecedents. πΆ1 β¨ β : πΌ1 , . . . , πΌπβ1 πΆ2 β¨ Β¬β : πΌ1β² , . . . , πΌπβ1 β² πΆ1 β¨ πΆ2 : πΌ1β²β² , . . . , πΌπβ1 β²β² β§ πΌπ β¨ πΌπβ² if β has only colors π β€ π, (resolution) βͺ βͺ βͺ if β has only colors π > π, βͺ β¨πΌ β§ πΌ π π with πΌπβ²β² :β‘ βͺruleEq(β, πΌπ , πΌπ ) βͺ βͺ β² if β is π, π-mixed, π β€ π < π, β©(πΌ β¨ β) β§ (πΌ β² β¨ Β¬β) if β has colors [π, π] with π β€ π < π. βͺ π π 7 Example 1 (Mixed Literals). Consider the following two (tautological) clauses πΆ :β‘ π = π β¨ π ΜΈ= π‘ β¨ π‘ ΜΈ= π πΆ β² :β‘ π ΜΈ= π β¨ π ΜΈ= π (π) β¨ π = π (π), where π is π1 -local, π is π2 -local, and the remaining symbols are shared. A partial interpolant of πΆ is the formula EQ(π₯, π‘), since it is an interpolant of Β¬πΆ β π1 β‘ EQ(π₯, π) β§ π = π‘ and Β¬πΆ β π2 β‘ Β¬ EQ(π₯, π) β§ π = π‘ . This interpolant can be obtained by using well-known interpolation techniques for the theory of equality. Note that EQ(π₯, π‘) captures the equality-interpolating term π‘ of π = π in the sense of [9]. For the second clause πΆ β² , the formula π = π (π₯) is a partial interpolant, i.e., it is an interpolant of Β¬πΆ β² β π1 β‘ π = π₯ β§ π = π (π) and Β¬πΆ β² β π2 β‘ π₯ = π β§ π ΜΈ= π (π) . The interpolant for the resolvent of πΆ and πΆ β² can now be obtained using the resolution rule: π = π β¨ π ΜΈ= π‘ β¨ π‘ ΜΈ= π : EQ(π₯, π‘) π ΜΈ= π β¨ π ΜΈ= π (π) β¨ π = π (π) : π = π (π₯) π ΜΈ= π‘ β¨ π ΜΈ= π (π) β¨ π‘ ΜΈ= π β¨ π = π (π) : π = π (π‘) The interested reader is invited to check that the annotated partial interpolant is correct. 4. Mixed Terms One obstacle to use proof tree preserving interpolation in the presence of quantifiers are mixed terms. To illustrate the problem, let us consider the following example. Example 2 (Running Example). Take the sequence of formulas π1 , π2 , π3 with π1 : βπ₯.π(β(π₯)) = π₯ π2 : βπ¦.π (π(π¦)) ΜΈ= π π3 : βπ§.π (π§) = π§. The conjunction of the three formulas is unsatisfiable, which can be seen as follows. Instantiating π1 with π gives π(β(π)) = π. Instantiating π2 with β(π) gives π (π(β(π))) ΜΈ= π. Substituting π(β(π)) using the first equality yields π (π) ΜΈ= π, which is clearly a contradiction to π3 . A mixed term is a term that uses symbols from at least two different partitions. In the example above, a mixed term is β(π), because β only occurs in π1 and π only occurs in π2 . Mixed terms may be generated during the instantiation process of quantified formulas. They can thus appear in a resolution proof within instantiation clauses and theory lemmas but not in input clauses. A resolution node may contain a mixed term if it was already contained in one of its antecedents. Interpolation algorithms based on resolution proofs usually rely upon non-mixed terms. We do not want to restrict the instantiation procedure to local terms as this would negatively effect the performance of the solver. Transforming the proofs into localized proofs in the context of sequence interpolation can break the inductivity property if not done in a consistent manner. So it would require to either localize the proof for all partitions at once or to compute the interpolants by repeated binary interpolation, which requires a new proof for each interpolant. We propose a method to handle mixed terms, simply by coloring them with some input partition. This partition does not contain all symbols occurring in the term. Thus, we replace subterms of the wrong color by fresh variables to purify the term. Each variable is defined by adding an auxiliary equation to all partitions where the subtermβs outer function symbol occurs. 8 4.1. Purification of Mixed Literals We proceed by first assigning a color to each term, and purify mixed terms afterwards. As a symbol may appear in several partitions, we can assign one or multiple of these partitions as its color. However, this decision may affect the number of quantifiers within the final interpolant. We use the following coloring scheme to obtain as few quantifiers as possible. To color a term, we start by coloring its subterms from bottom up. The color of a constant symbol corresponds to the partitions it occurs in. For a function application π (π‘1 , . . . , π‘π ), we compute the intersection of the partitions where π occurs and the colors of the terms π‘1 , . . . , π‘π . If the intersection is non-empty, it is used to color the term. Otherwise, if it is empty, meaning that the term contains symbols for which there is no common partition. We color the term by all partitions in which the function symbol π occurs. The whole literal is then colored as if the top-level terms occurred in the partitions corresponding to the termsβ colors. If the literal β colored with partition π contains a mixed term, it contains some constant π or function application π (π‘1 , . . . , π‘π ) where π or π does not occur in the partition π. In the projection β β ππ , these constants and function applications are replaced by fresh variables π£π (π‘1 ,...,π‘π ) . Each fresh variable is set equal to the term it stands for by an auxiliary equation of the form π£π (π‘1 ,...,π‘π ) = π (π£π‘1 , . . . , π£π‘π ). The arguments of the function application are also replaced by fresh variables and the corresponding auxiliary equations are added to their corresponding partitions, i.e., the term is flattened. Example 3 (Purification). Take the sequence of formulas π1 , π2 , π3 from the running example (Example 2). Consider the literal β β‘ π(β(π)) = π that stems from instantiating π1 . The subterm π occurs only in partition 2. The term β(π) is mixed since β only occurs in partition 1. We assign it partition 1. The term π(β(π)) is now also assigned partition 1 as it is the intersection where π occurs (1, 2) and what its argument β(π) was assigned. The whole literal is treated as mixed equality, since the left-hand side π(β(π)) of β is colored 1 and the right-hand side π is colored 2. When computing the projection of the mixed equality β β ππ , we split β first into π(β(π)) = π₯ and π₯ = π, as we explained in Section 3. In the projection, we further replace the subterm π on the left-hand side (that does not match partition 1) with a fresh variable. The auxiliary equation is added to partition 2, since this is the partition where π occurs. β β‘ π(β(π)) = π : β β π1 β‘ π(β(π£π )) = π₯, β β π2 β‘ π₯ = π β§ π£π = π, β β π3 β‘ β€ Consider the literal β β‘ π (π(β(π))) = π that stems from instantiating π2 . As before, the term π(β(π)) is assigned partition 1. The term π (π(β(π))) is mixed as π is only in partition 2, 3. So the whole term gets assigned partitions 2, 3. Since the right-hand side π of the equality appears only in partition 2, the whole literal is assigned partition 2. When computing the projection, we replace all subterms not occurring in partition 2 with fresh variables, as well as their subterms. The auxiliary equations are added to the partitions where the corresponding function symbol appears. β β‘ π (π(β(π))) = π : β β π1 β‘ π£β = β(π£π ), β β π2 β‘ π (π(π£β )) = π β§ π£π = π, β β π3 β‘ β€ Note that if β occurred in multiple partitions, the auxiliary equation π£β = β(π£π ) would be added to all projections where β occurs in the partition. 9 4.2. Quantifier Introduction As mentioned in Section 3, our algorithm computes for each clause πΆ in the resolution proof a partial interpolant of π1 β§Β¬πΆ β π1 , . . . , ππ β§Β¬πΆ β ππ . Interpolation of leaf clauses is performed based on the color of literals, i.e., the interpolants are computed as if the literal occurred in the partition corresponding to its color. This may introduce symbols from the wrong partition in the interpolant, i.e., the resulting interpolant may violate the symbol condition. The algorithm then replaces all terms containing symbols that are not allowed in the interpolant by their auxiliary variable. In many cases, these auxiliary variables are allowed to occur in the partial interpolant, since they satisfy the symbol condition with respect to the corresponding interpolation problem π1 β§ Β¬πΆ β π1 , . . . , ππ β§ Β¬πΆ β ππ . We say that a variable π£ is supported by a clause πΆ if the term that it replaced occurs within a mixed term in πΆ, that is, if the auxiliary equation defining π£ is part of one of the projections Β¬πΆ β ππ . Note that a variable may be unsupported even if the corresponding subterm occurs in the clause, e.g., π£π is not supported by π (π) ΜΈ= π, because the literal contains no mixed term and thus there is no auxiliary equation for π£π . A variable may only appear unbound in the partial interpolant of a clause πΆ if it is supported by πΆ. Thus, in a post-processing step, we bind variables that are not supported by the current clause but still occur in a partial interpolant. An existential quantifier is introduced around the interpolant πΌπ for each unsupported variable π£π where the head function symbol π appears in π΄π . Similarly, a universal quantifier is introduced for π£π if π appears in π΅π . If multiple variables become unsupported at the same time, we need to be careful to add the quantifiers in the right order. If one of the variables corresponds to a subterm of the other, then the variable for the outer term is defined using the variable of the subterm. Hence its quantifier needs to appear after the quantifier for the variable of the outer term. Since we add quantifiers from inside to outside, we need to add the quantifiers in inverse dependency order, i.e., the quantifier for the variable corresponding to the outermost term needs to be added first. 5. Interpolation for Quantified Formulas In this section, we show how to compute partial sequence interpolants for the different clause types involved in a proof of unsatisfiability for quantified input formulas. The overall algorithm follows a simple common theme: first compute the interpolants using only the colors we assigned to literals and ignore the symbol condition. Then introduce the fresh variable π£π‘ for every term π‘ that violates the symbol condition. Finally, introduce quantifiers for each variable π£π‘ that is unsupported by the current clause, as outlined in Section 4.2. In the following, we concretize this for the different kind of clauses in the resolution proof. Input Clauses. Input clauses cannot contain mixed terms. The interpolant πΌπ of an input clause πΆ is either Β¬(Β¬πΆ β π΄π ) if the clause πΆ is part of π΄π , or Β¬πΆ β π΅π if πΆ is part of π΅π . Here Β¬πΆ β π΄π removes all literals from πΆ that do not have any color from π΄π . For PudlΓ‘kβs algorithm where each literal is colored with all partitions it occurs in, all literals are removed and the interpolants are always either β€ or β₯. 10 Instantiation Clauses. We say that an instantiation clause originates from the partition in which its quantified formula occurs in. We extend the definition of Β¬πΆ β π΄π to handle a π, π-mixed equality π = π as follows: if both π, π are in π΄π , then (β β π΄π ) β‘ β€, if both are in π΅π , then (β β π΄π ) β‘ β, and if π is in π΄π and π in π΅π , then (π = π β π΄π ) β‘ π₯ = π and (π ΜΈ= π β π΄π ) β‘ Β¬ EQ(π₯, π). Analogously, we define Β¬πΆ β π΅π . The partial sequence interpolant πΌπ is then computed in three steps. 1. Set πΌπ to Β¬(Β¬πΆ β π΄π ) if πΆ originates from π΄π , and to Β¬πΆ β π΅π , otherwise. 2. Replace every subterm π‘ in πΌπ that must not occur in πΌπ according to the symbol condition by its corresponding fresh variable π£π‘ . 3. Introduce quantifiers for each variable that is not supported by πΆ. Theory Lemmas. Our proof tree can contain theory lemmas for transitivity and congruence. π‘1 ΜΈ= π‘2 β¨ Β· Β· Β· β¨ π‘πβ1 ΜΈ= π‘π β¨ π‘1 = π‘π (transitivity) π‘1 ΜΈ= π‘β²1 β¨ Β· Β· Β· β¨ π‘π ΜΈ= π‘β²π β¨ π (π‘1 , . . . , π‘π ) = π (π‘β²1 , . . . , π‘β²π ) (congruence) Transitivity lemmas are interpolated by summarizing equality chains involving π΄π -colored literals in the negation of the lemma. If the single equality of the transitivity lemma is mixed, the chain EQ(π₯, π1 ) β§ π1 = Β· Β· Β· = ππ is summarized by EQ(π₯, ππ ). As the ends of the π΄π -colored chains are also π΅π -colored (the boundary terms appear in π΅π -colored literals), this satisfies the symbol condition except when a mixed term contains a subterm from a different partition. To fix the interpolant, all subterms π‘ from the wrong partition are replaced by their corresponding variable π£π‘ . Our construction guarantees that these variables are supported by πΆ. For congruence lemmas, the interpolant depends on the color of the equality π (π‘1 , . . . , π‘π ) = π (π‘β²1 , . . . , π‘β²π ). If it is colored with a partition from π΄π , the interpolant is Β¬(Β¬πΆ β π΄π ) and if it is colored with a partition from π΅π , the interpolant is Β¬πΆ β π΅π . If the equality is a mixed equality where π (π‘1 , . . . , π‘π ) is from π΄π , then π is shared and the interpolant is EQ(π₯, π (π₯1 , . . . , π₯π )) where π₯ is the variable for the projection of the mixed equality, and for π = 1, . . . , π the term π₯π is the auxiliary variable of π‘π = π‘β²π , if that equality is mixed, or π‘π if the equality is colored with π΅π , or π‘β²π if it is colored with π΄π . Again, the subterms π‘ from the wrong partition are replaced by their corresponding variable π£π‘ , which is supported by πΆ. Resolution Clauses. The partial interpolant for a resolution node is computed by the rule (resolution) in Section 3. Since the resolution rule removes a literal from both antecedent clauses, the resolvent may no longer support all variables occurring in the interpolant. So in the last step, we add quantifiers for all unsupported variables as outlined in Section 4.2. Example 4. We illustrate the algorithm on our running example (Example 2). π1 : βπ₯.π(β(π₯)) = π₯ π2 : βπ¦.π (π(π¦)) ΜΈ= π π3 : βπ§.π (π§) = π§ The symbol π occurs in partitions 2β3, π in 1β2, β in 1, π in 2. Our goal is to compute sequence interpolants πΌ1 , πΌ2 where π1 implies πΌ1 , and πΌ1 and π2 imply πΌ2 , and πΌ2 and π3 imply β₯. Also πΌ1 may only contain the symbol π and πΌ2 only the symbol π . 11 Figure 1 gives an instantiation-based resolution proof for the unsatisfiability of π1 β§ π2 β§ π3 . Each clause πΆ in the proof tree is annotated with its partial interpolant πΌ1 , πΌ2 . In the following, we explain step by step how these interpolants were computed. First, we color the literals occurring in the proof. The literals π(β(π)) = π and π (π(β(π)) = π were already explained in Section 4.1. The literal π (π(β(π))) = π (π) is colored the same way as π (π(β(π))) = π. Finally, the literal π (π) = π is not mixed and is colored with 2. To summarize, the projection of these literals to the partitions 1β3 is as follows. β β β π1 β β π2 β β π3 π(β(π)) = π π(β(π£π )) = π₯ π₯ = π β§ π£π = π β€ π (π(β(π))) = π (π) π£β = β(π£π ) π (π(π£β )) = π (π) β§ π£π = π β€ π (π(β(π))) = π π£β = β(π£π ) π (π(π£β )) = π β§ π£π = π β€ π (π) = π β€ π (π) = π β€ The first input clause on the top left of the proof tree is from π1 , i.e., it is in π΄1 and π΄2 . By the rule for input clauses its interpolants are β₯, β₯. The projections of the negated instantiation clause Inst on the top right are Β¬Inst β π1 β‘ (π1 β§EQ(π₯, π(β(π£π )))) Β¬Inst β π2 β‘ (Β¬ EQ(π₯, π)β§π£π = π) Β¬Inst β π3 β‘ β€. For πΌ1 , the quantified formula is in π΄1 , and the interpolant is first set to Β¬(Β¬πΆ β π΄1 ). Since π(β(π)) = π is 1, 2-mixed, (π(β(π)) ΜΈ= π β π΄1 ) is Β¬ EQ(π₯, π) and πΌ1 is set to EQ(π₯, π). In the final step, the forbidden symbol π is replaced by its shared variable π£π yielding EQ(π₯, π£π ). The variable π£π is supported by the clause and therefore not quantified. The interpolant πΌ2 is just β₯ as it contains only literals with colors 1β2. Resolving the first two clauses builds the disjunction of the corresponding interpolants, because the literal is colored with partition 1. The equality in the congruence lemma, in the second row in Figure 1, is colored with partition 2. Therefore, interpolant πΌ1 is first set to Β¬πΆ β π΅1 , which is π(β(π)) = π₯. As the function symbol β may not occur in the interpolant, the term β(π) is replaced by its auxiliary variable π£β . The interpolant πΌ2 is set to β₯ since the lemma involves no literal colored with partition 2. The result is π(π£β ) = π₯, β₯, which is indeed a sequence interpolant of π£β = β(π£π ) β§ π(β(π£π )) = π₯, π (π(π£β )) ΜΈ= π (π) β§ π₯ = π β§ π£π = π, β€. The next resolution step involves a 1, 2-mixed literal. Therefore, we use the third case (ruleEq) of rule (resolution). The first interpolants of the antecedents are combined by replacing EQ(π₯, π£π ) with π(π£β ) = π£π , yielding π(π£β ) = π£π . All literals in the transitivity lemma (third row) are colored with partition 2. Therefore the resulting interpolants are β€, β₯. The resolution step combines the interpolants with conjunction or disjunction, respectively, as the literal is colored with partition 2. The second instantiation clause is completely colored with partition 2 as is the corresponding input clause. The interpolants are just β€, β₯ and resolving them with the other branch does not change the previous interpolants. However, since the literal π (π(β(π))) ΜΈ= π gets removed in the resolution step, π£β and π£π are no longer supported. Therefore, quantifiers are introduced: the inner existential quantifier for the 1-colored outermost term π£β = β(π£π ) and the outer universal quantifier for the 2-colored inner term π£π = π. Note that although the clause still contains π in the literal π (π) ΜΈ= π, this literal is not mixed and therefore does not support π£π . The last instantiation clause is originating from π3 , hence πΌ1 is set to Β¬πΆ β π΅1 (which is β€) and πΌ2 is set to Β¬πΆ β π΅2 . Since π (π) ΜΈ= π is colored with partition 2, the second interpolant is 12 βπ₯.π(β(π₯)) = π₯ β₯, β₯ Β¬(βπ₯.π(β(π₯)) = π₯), π(β(π)) = π EQ(π₯, π£π ), β₯ π(β(π)) = π EQ(π₯, π£π ), β₯ π(β(π)) ΜΈ= π, π (π(β(π))) = π (π) π(π£β ) = π₯, β₯ π (π(β(π))) = π (π) π(π£β ) = π£π , β₯ π (π(β(π))) ΜΈ= π (π), π (π) ΜΈ= π, π (π(β(π))) = π β€, β₯ π (π) ΜΈ= π, π (π(β(π))) = π π(π£β ) = π£π , β₯ βπ¦.π (π(π¦)) ΜΈ= π β€, β₯ Β¬(βπ¦.π (π(π¦)) ΜΈ= π), π (π(β(π))) ΜΈ= π β€, β₯ π (π(β(π))) ΜΈ= π β€, β₯ π (π) ΜΈ= π βπ£π .βπ£β .π(π£β ) = π£π , β₯ Input Clause βπ§.π (π§) = π§ β€, β€ Β¬(βπ§.π (π§) = π§), π (π) = π β€, βπ£π .π (π£π ) ΜΈ= π£π Instantiation Clause π (π) = π β€, βπ£π .π (π£π ) ΜΈ= π£π Theory Lemma Resolvent β₯ βπ£π .βπ£β .π(π£β ) = π£π , βπ£π .π (π£π ) ΜΈ= π£π Figure 1: Refutation by resolution for Example 4 annotated by partial sequence interpolants. π (π) ΜΈ= π. In the next step, π is replaced by its auxiliary variable π£π and since this is not supported by the clause, an existential quantifier must be introduced. So the partial sequence interpolant is β€, βπ£π .π (π£π ) ΜΈ= π£π . This is indeed a valid interpolant for the sequence β€, π (π) ΜΈ= π, βπ§.π (π§) = π§. In the last two resolution steps, the interpolants are combined with conjunction or disjunction. Since for each resolution step one of the interpolants is β€ or β₯, the other interpolant is not changed. The final sequence interpolant, i.e., βπ£π βπ£β .π(π£β ) = π£π , βπ£π .π (π£π ) ΜΈ= π£π , can be found in the annotation of the empty clause. 6. Conclusion and Future Work We presented an algorithm to obtain inductive sequences of interpolants from instantiation- based refutation proofs of quantified formulas in the theory of equality. In contrast to most existing algorithms that allow for interpolation of quantified formulas, our approach works on non-local proofs. It neither requires to restrict the inference done by the SMT solver, nor does it require transformations of the proof tree. Mixed terms occurring in a proof tree due to quantifier instantiations are directly treated using virtual purification. To obtain (partial) interpolants satisfying the symbol condition, local terms are replaced by auxiliary variables that are eventually bound by quantifiers. Our algorithm computes sequence interpolants from a single proof of unsatisfiability instead of repeatedly performing binary interpolation on several proof trees. Implementation of the algorithm in SMTInterpol [24] is ongoing work. We plan to extend our algorithm to tree interpolants as well as to other theories including linear arithmetic. 13 Acknowledgments This work was partially supported by the German Research Council (DFG) under HO 5606/1-2. References [1] T. A. Henzinger, R. Jhala, R. Majumdar, K. L. McMillan, Abstractions from proofs, in: N. D. Jones, X. Leroy (Eds.), Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2004, Venice, Italy, January 14-16, 2004, ACM, 2004, pp. 232β244. URL: https://doi.org/10.1145/964001.964021. doi:10.1145/964001. 964021. [2] K. L. McMillan, Lazy abstraction with interpolants, in: T. Ball, R. B. Jones (Eds.), Computer Aided Verification, 18th International Conference, CAV 2006, Seattle, WA, USA, August 17-20, 2006, Proceedings, volume 4144 of Lecture Notes in Computer Science, Springer, 2006, pp. 123β136. URL: https://doi.org/10.1007/11817963_14. doi:10.1007/11817963\_14. [3] K. L. McMillan, Quantified invariant generation using an interpolating saturation prover, in: C. R. Ramakrishnan, J. Rehof (Eds.), Tools and Algorithms for the Construction and Analysis of Systems, 14th International Conference, TACAS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29-April 6, 2008. Proceedings, volume 4963 of Lecture Notes in Computer Science, Springer, 2008, pp. 413β427. URL: https://doi.org/10.1007/978-3-540-78800-3_31. doi:10.1007/978-3-540-78800-3\_31. [4] M. Heizmann, J. Hoenicke, A. Podelski, Nested interpolants, in: POPL, ACM, 2010, pp. 471β482. [5] M. N. Seghir, A. Podelski, T. Wies, Abstraction refinement for quantified array assertions, in: J. Palsberg, Z. Su (Eds.), Static Analysis, 16th International Symposium, SAS 2009, Los Angeles, CA, USA, August 9-11, 2009. Proceedings, volume 5673 of Lecture Notes in Com- puter Science, Springer, 2009, pp. 3β18. URL: https://doi.org/10.1007/978-3-642-03237-0_3. doi:10.1007/978-3-642-03237-0\_3. [6] R. Blanc, A. Gupta, L. KovΓ‘cs, B. Kragl, Tree interpolation in Vampire, in: K. L. McMillan, A. Middeldorp, A. Voronkov (Eds.), Logic for Programming, Artificial Intel- ligence, and Reasoning - 19th International Conference, LPAR-19, Stellenbosch, South Africa, December 14-19, 2013. Proceedings, volume 8312 of Lecture Notes in Computer Science, Springer, 2013, pp. 173β181. URL: https://doi.org/10.1007/978-3-642-45221-5_13. doi:10.1007/978-3-642-45221-5\_13. [7] P. PudlΓ‘k, Lower bounds for resolution and cutting plane proofs and monotone computations, J. Symb. Log. 62 (1997) 981β998. URL: https://doi.org/10.2307/2275583. doi:10.2307/2275583. [8] K. L. McMillan, An interpolating theorem prover, in: K. Jensen, A. Podelski (Eds.), Tools and Algorithms for the Construction and Analysis of Systems, 10th International Conference, TACAS 2004, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2004, Barcelona, Spain, March 29 - April 2, 2004, Proceedings, 14 volume 2988 of Lecture Notes in Computer Science, Springer, 2004, pp. 16β30. URL: https: //doi.org/10.1007/978-3-540-24730-2_2. doi:10.1007/978-3-540-24730-2\_2. [9] G. Yorsh, M. Musuvathi, A combination method for generating interpolants, in: R. Nieuwen- huis (Ed.), Automated Deduction - CADE-20, 20th International Conference on Automated Deduction, Tallinn, Estonia, July 22-27, 2005, Proceedings, volume 3632 of Lecture Notes in Computer Science, Springer, 2005, pp. 353β368. URL: https://doi.org/10.1007/11532231_26. doi:10.1007/11532231\_26. [10] A. Cimatti, A. Griggio, R. Sebastiani, Efficient interpolant generation in satisfiability modulo theories, in: C. R. Ramakrishnan, J. Rehof (Eds.), Tools and Algorithms for the Construction and Analysis of Systems, 14th International Conference, TACAS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29-April 6, 2008. Proceedings, volume 4963 of Lecture Notes in Computer Science, Springer, 2008, pp. 397β412. URL: https://doi.org/10.1007/978-3-540-78800-3_30. doi:10.1007/978-3-540-78800-3\_30. [11] A. Fuchs, A. Goel, J. Grundy, S. Krstic, C. Tinelli, Ground interpolation for the theory of equality, in: S. Kowalewski, A. Philippou (Eds.), Tools and Algorithms for the Construction and Analysis of Systems, 15th International Conference, TACAS 2009, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, York, UK, March 22-29, 2009. Proceedings, volume 5505 of Lecture Notes in Computer Science, Springer, 2009, pp. 413β427. URL: https://doi.org/10.1007/978-3-642-00768-2_34. doi:10. 1007/978-3-642-00768-2\_34. [12] J. Christ, J. Hoenicke, A. Nutz, Proof tree preserving interpolation, in: TACAS, volume 7795 of Lecture Notes in Computer Science, Springer, 2013, pp. 124β138. [13] J. Christ, J. Hoenicke, Proof tree preserving tree interpolation, J. Autom. Reasoning 57 (2016) 67β95. [14] J. Christ, J. Hoenicke, Instantiation-based interpolation for quantified formulae, in: N. BjΓΈrner, R. Nieuwenhuis, H. Veith, A. Voronkov (Eds.), Decision Procedures in Software, Hardware and Bioware, 18.04. - 23.04.2010, volume 10161 of Dagstuhl Seminar Proceedings, Schloss Dagstuhl - Leibniz-Zentrum fΓΌr Informatik, Germany, 2010. URL: http://drops. dagstuhl.de/opus/volltexte/2010/2735/. [15] M. P. Bonacina, M. Johansson, On interpolation in automated theorem proving, J. Autom. Reason. 54 (2015) 69β97. URL: https://doi.org/10.1007/s10817-014-9314-0. doi:10.1007/ s10817-014-9314-0. [16] L. KovΓ‘cs, A. Voronkov, First-order interpolation and interpolating proof systems, in: T. Eiter, D. Sands (Eds.), LPAR-21, 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Maun, Botswana, May 7-12, 2017, volume 46 of EPiC Series in Computing, EasyChair, 2017, pp. 49β64. URL: https://easychair.org/publications/ paper/r2J. [17] B. Gleiss, L. KovΓ‘cs, M. Suda, Splitting proofs for interpolation, in: L. de Moura (Ed.), Automated Deduction - CADE 26 - 26th International Conference on Automated De- duction, Gothenburg, Sweden, August 6-11, 2017, Proceedings, volume 10395 of Lec- ture Notes in Computer Science, Springer, 2017, pp. 291β309. URL: https://doi.org/10.1007/ 978-3-319-63046-5_18. doi:10.1007/978-3-319-63046-5\_18. [18] W. Craig, Three uses of the Herbrand-Gentzen theorem in relating model theory and 15 proof theory, J. Symb. Log. 22 (1957) 269β285. URL: https://doi.org/10.2307/2963594. doi:10.2307/2963594. [19] M. P. Bonacina, M. Johansson, Interpolation systems for ground proofs in automated deduction: a survey, J. Autom. Reason. 54 (2015) 353β390. URL: https://doi.org/10.1007/ s10817-015-9325-5. doi:10.1007/s10817-015-9325-5. [20] L. KovΓ‘cs, A. Voronkov, First-order theorem proving and Vampire, in: N. Sharygina, H. Veith (Eds.), Computer Aided Verification - 25th International Conference, CAV 2013, Saint Petersburg, Russia, July 13-19, 2013. Proceedings, volume 8044 of Lecture Notes in Com- puter Science, Springer, 2013, pp. 1β35. URL: https://doi.org/10.1007/978-3-642-39799-8_1. doi:10.1007/978-3-642-39799-8\_1. [21] L. M. de Moura, N. BjΓΈrner, Engineering DPLL(T) + saturation, in: A. Armando, P. Baum- gartner, G. Dowek (Eds.), Automated Reasoning, 4th International Joint Conference, IJCAR 2008, Sydney, Australia, August 12-15, 2008, Proceedings, volume 5195 of Lec- ture Notes in Computer Science, Springer, 2008, pp. 475β490. URL: https://doi.org/10.1007/ 978-3-540-71070-7_40. doi:10.1007/978-3-540-71070-7\_40. [22] L. KovΓ‘cs, A. Voronkov, Interpolation and symbol elimination, in: R. A. Schmidt (Ed.), Automated Deduction - CADE-22, 22nd International Conference on Automated Deduction, Montreal, Canada, August 2-7, 2009. Proceedings, volume 5663 of Lecture Notes in Computer Science, Springer, 2009, pp. 199β213. URL: https://doi.org/10.1007/978-3-642-02959-2_17. doi:10.1007/978-3-642-02959-2\_17. [23] R. Bruttomesso, S. Ghilardi, S. Ranise, Quantifier-free interpolation in combinations of equality interpolating theories, ACM Trans. Comput. Log. 15 (2014) 5:1β5:34. URL: https://doi.org/10.1145/2490253. doi:10.1145/2490253. [24] J. Christ, J. Hoenicke, A. Nutz, SMTInterpol: An interpolating SMT solver, in: SPIN, volume 7385 of Lecture Notes in Computer Science, Springer, 2012, pp. 248β254. 16