Efficient Interpolation for the Theory of Arrays (work in progress) Jochen Hoenicke and Tanja Schindler∗ Department of Computer Science, University of Freiburg {hoenicke,schindle}@informatik.uni-freiburg.de Abstract Existing techniques for Craig interpolation for the quantifier-free fragment of the theory of arrays require a special solver. The solver needs to know in advance the partitioning (A, B) of the interpolation problem and needs to avoid creating AB-mixed terms to be suitable for interpolation. This limits the efficiency of these solvers especially when computing sequence and tree interpolants. We present a new approach using Proof Tree Preserving Interpolation and an array solver based on Weak Equivalence on Arrays. We give an interpolation algorithm for the lemmas produced by the array solver. 1 Introduction Several model-checkers [1, 2, 5, 11, 12, 13, 14, 15, 17, 18] use interpolants to find candidate in- variants to prove the correctness of software. They require efficient tools to check satisfiability of a formula in a decidable theory and to compute interpolants (usually sequence or tree inter- polants) for unsatisfiable formulas. Moreover, they often need to combine several theories, e.g., integer or bitvector theory for reasoning about numeric variables and array theory for reason- ing about pointers. In this paper we present an interpolation procedure for the quantifier-free fragment of the theory of arrays that allows for the combination with other theories and that can reuse an existing unsatisfiability proof to compute interpolants efficiently. Our method is based on the array solver presented in [6], which fits well into existing Nelson- Oppen frameworks. The solver generates lemmas (valid in the theory of arrays) that explain equalities between variables shared between different theories. The variables do not necessarily belong to the same formula in the interpolation problem and the solver does not need to know the partitioning. Instead, we use the technique of Proof Tree Preserving Interpolation [9], which can produce interpolants from existing proofs propagating equalities between symbols from different partitions. The contribution of this paper is an algorithm to interpolate the lemmas produced by the solver of the theory of arrays. This solver only produces two types of lemmas, namely a variant of the read-over-write axiom and a variant of the extensionality axiom. However, the lemmas can contain array store chains of arbitrary length which need to be handled by the interpolation procedure. Bruttomesso et al. [4] showed that adding a diff function is sufficient to get a theory of arrays that is closed under interpolation and in principle their algorithm can be used to interpolate the lemmas. We give a more efficient algorithm that exploits the special shape of these axioms. Related Work. Brillout et al. [3] give an interpolation procedure for the combination of Presburger Arithmetic and the quantifier-free theory of arrays. However, their procedure pro- duces quantified interpolants. For certain cases, their algorithm removes quantifiers by simple syntactic transformations, but in general this is not possible. ∗ This work is supported by the German Research Council (DFG) under HO 5606/1-1. Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler Equality interpolating theories [21, 4] define a general framework for interpolation in the combination of quantifier-free theories. This framework handles theory lemmas that propagate equalities between variables shared between theories, enabling Nelson-Oppen theory combina- tion. The equalities may relate variables that come from different formulas in the interpolation problem. A theory is equality interpolating if it can find an interpolating term for these equali- ties that is expressed using only the symbols occuring in both parts of the interpolation problem. The algorithm of Yorsh and Musuvathi [21] only supports convex theories and is not ap- plicable to the theory of arrays. Bruttomesso et al. [4] extended the framework to non-convex theories. They also present a complete interpolation procedure for the quantifier-free theory of arrays that works for theory combination. However, their solver depends on the partioning of the interpolation problem and this can lead to exponential blow-up of the solving procedure. Our interpolation procedure works on a proof produced by a more efficient array solver that is independent of the partioning of the interpolation problem. Totla and Wies [20] present an interpolation method for arrays based on complete instan- tiations. It combines the idea of [4] with local theory extension [19]. Given an interpolation problem A and B, they define two sets K[W (A, B)] and K[W (B, A)], each using only symbols from A resp. B, that contain the instantiations of the array axioms needed to prove unsatisfi- abilty. Then an existing solver and interpolation procedure for uninterpreted functions can be used to compute the interpolant. The procedure produces a quadratic blow-up on the input formulas. We also found that their procedure fails for some extensionality lemmas, when we used it to create candidate interpolants. The last two techniques require to know the partitions at solving time. Thus, when com- puting sequence interpolants or tree interpolants, they would require either an adapted inter- polation procedure or the solver has to run multiple times. In contrast, our method can be easily extended to tree interpolation [7]. 2 Notation We assume standard first order logic. A theory T is given by a signature Σ and a set of axioms. The theory of arrays TA is parameterized by an index theory and an element theory. The signature ΣA of TA contains the select (or read ) function ·[·] and the store (or write) function ·h·  ·i. In the following, a, b, s, t will denote array terms, i, j, k index terms and v, w element terms. For an array a, index i and element v, a[i] returns the element stored in a at i, and ahi  vi returns a copy of a where the element at index i is replaced by the element v, leaving a unchanged. The functions are defined by the following axioms proposed by McCarthy [16]. ∀a i v. ahi  vi[i] = v (idx) ∀a i j v. i 6= j → ahi  vi[j] = a[j] (read-over-write) We consider the variant of the extensional theory of arrays proposed by Bruttomesso et al. [4] where the signature is extended by the function diff(·, ·) which for distinct arrays a and b returns an index where a and b differ, and an arbitrary index else. The extensionality axiom then becomes ∀a b. a[diff(a, b)] = b[diff(a, b)] → a = b . (ext-diff) The authors of [4] have shown that the quantifier-free fragment of the theory of arrays with diff, TAxDiff , is closed under interpolation. To express the interpolants conveniently we use the k notation from [20] for rewriting arrays. For k ≥ 0 we define a b for two arrays a and b 2 Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler inductively as 0 k+1 k a b := a a b := ahdiff(a, b)  b[diff(a, b)]i b . k Thus, a b changes the values in a at k indices to the values stored in b. The equation k a b = b holds if and only if a and b differ at at most k indices. The indices where they differ k are the diff terms occuring in a b. An interpolation problem (A, B) is a pair of formulas where A ∧ B is unsatisfiable. A Craig interpolant for (A, B) is a formula I such that (i) A implies I, (ii) I and B are unsatisfiable and (iii) I contains only symbols shared between A and B. Given an interpolation problem (A, B), the symbols shared between A and B are called shared, symbols only occuring in A are called A-local and symbols only occuring in B, B-local. A literal, e.g. a = b, that contains A-local and B-local symbols is called mixed. 3 Preliminaries 3.1 Proof Tree Preserving Interpolation In this section we give a short overview of the proof tree preserving interpolation framework presented by Christ et al. [9]. This method defines two projections ·  A and ·  B that project a literal to its A-part resp. B-part. For a literal ` occurring only in the formula A, the projections are `  A ≡ ` and `  B ≡ >, and similar for a literal occurring only in B. This projection can be naturally extended to conjunctions of literals. Then a partial interpolant of a clause C occurring in the proof tree is defined as the interpolant of A ∧ (¬C)  A and B ∧ (¬C)  B. The paper shows that these partial interpolants can be computed inductively over the proof tree and the partial interpolant of the root is the interpolant of A and B. For a theory lemma C, a partial interpolant is computed from the interpolation problem (¬C  A, ¬C  B). The core idea of proof tree preserving interpolation is a scheme to handle mixed equalities a = b where a is A-local and b is B-local. For these a fresh variable xab is introduced and the projections are defined as follows. (a = b)  A :≡ (a = xab ) (a = b)  B :≡ (xab = b) Thus, a = b is equivalent to ∃xab .(a = b)  A ∧ (a = b)  B and xab is a new shared variable that may occur in interpolants. For disequalities we follow [10] and use an auxiliary variable xab and a Boolean auxiliary variable pxab . We define EQ(x, s) :≡ px xor x = s and define the projections for a 6= b as (a 6= b)  A :≡ EQ(xab , a) (a 6= b)  B :≡ ¬ EQ(xab , b) . For an interpolation problem (A∧(¬C)  A, B ∧(¬C)  B) where ¬C contains a 6= b we require as additional symbol condition that the interpolant has the form I[EQ(xab , s1 )] . . . [EQ(xab , sn )]1 , where s1 , . . . , sn are shared terms and each EQ term occurs positively in I. For a resolution step on the pivot literal a = b the following interpolation rule combines the partial interpolants of the input clauses to a partial interpolant of the resolvent. C1 ∨ a = b : I1 [EQ(xab , s1 )] . . . [EQ(xab , sn )] C2 ∨ a 6= b : I2 (xab ) C1 ∨ C2 : I1 [I2 (s1 )] . . . [I2 (sn )] 1 One can show that such an interpolant exists for every equality interpolating theory in the sense of Defini- tion 4.1 in [4]. The terms s1 , . . . , sn are the terms v in that definition. 3 Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler 3.2 Weakly Equivalent Arrays In this section, we revisit the definitions and results about weakly equivalent arrays that are used in the decision procedure for the theory of arrays presented by Christ and Hoenicke [6]. For a formula F , let V be the set of terms that contains the array terms in F and in addition the select terms a[i] and their indices i and for every store term ahi  vi in F the terms i, v, a[i] and ahi  vi[i]. Let ∼ be the equivalence relation on V representing equality. The weak equivalence graph GW is defined by its vertices, the array-valued terms in V , and its undirected i edges of the form (i) s1 ↔ s2 if s1 ∼ s2 and (ii) s1 ↔ s2 if s1 has the form s2 hi  ·i or vice W versa. If two arrays a and b are connected in G by a path P they are called weakly equivalent. P We write a ⇔ b. Weakly equivalent arrays can differ only at finitely many positions given by i Stores (P ) := {i | ∃s1 s2 . s1 ↔ s2 ∈ P }. Two arrays a and b are called weakly equivalent on i, denoted by a ≈i b, if there exists a path P between them such that k 6∼ i holds for every k ∈ Stores (P ). If a and b are weakly equivalent on i, they must store the same value at i. Two arrays a and b are called weakly congruent on i, a ∼i b, if the equality a0 [j] = b0 [k] holds for j ∼ k ∼ i and a0 ≈i a, b0 ≈i b. Also in this case they must store the same value at i. P We use Cond(a ⇔ b), Cond(a ≈i b), Cond(a ∼i b) to denote the conjunction of the literals v = v (resp. v 6= v 0 ), v, v 0 ∈ V , such that v ∼ v 0 (resp. v 6∼ v 0 ) is necessary to show the 0 corresponding property. Instances of array lemmas are generated according to the following rules: i∼j a ≈i b a[i], b[j] ∈ V (read-over-weakeq) i 6= j ∨ ¬ Cond(a ≈i b) ∨ a[i] = b[j] P a⇔b ∀i ∈ Stores (P ) . a ∼i b a, b ∈ V P _ (weakeq-ext) ¬ Cond(a ⇔ b) ∨ ¬ Cond(a ∼i b) ∨ a = b i∈Stores(P ) The first rule, based on (read-over-write), propagates equalities between select terms and the second, based on extensionality, propagates equalities on array terms. In the following, we will describe how we can derive partial interpolants for these array lemmas. 4 Interpolants for Read-Over-Weakeq Lemmas In this section, we show how to interpolate lemmas generated by (read-over-weakeq). A lemma of this type (see Figure 1) explains a conflict of the form i = j ∧ Cond(a ≈i b) ∧ a[i] 6= b[j] . The weak equivalence a ≈i b ensures that a and b are equal at i = j which contradicts a[i] 6= b[j]. If the index equality i = j is mixed, the interpolation problem contains the shared auxiliary variable xij for i = j. If i (resp. j) is shared, we call i (resp. j) the shared term for i = j. We then identify four basic cases: (i) there exists a shared term for i = j and a[i] 6= b[j] is in B or mixed, (ii) there is a shared term for i = j and a[i] 6= b[j] is A-local, (iii) both i and j are B-local, and (iv) both i and j are A-local. 4.1 There is a Shared Term for i = j and a[i] 6= b[j] is in B or mixed If there exists a shared term x for the index equality i = j, the interpolant I can contain terms s[x] for shared array terms s occuring in the weak path between a and b. The basic idea is to summarize the weak A-paths by applying rule (read-over-weakeq) on their end terms. 4 Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler a[i] | b[j] i j \ — / / k1 k2 km−1 km a ··· b Figure 1: A read-over-weakeq conflict. Solid lines represent strong (dis-)equalities, dotted lines function-argument relations, and zigzag lines represent weak paths consisting of store steps and array equalities. Example 1. Consider the following read-over-weakeq conflict: i = j ∧ a = s1 ∧ s1 hk1  v1 i = s2 ∧ s2 hk2  v2 i = s3 ∧ s3 = b ∧ i 6= k1 ∧ i 6= k2 ∧ a[i] 6= b[j] where a, k2 , v2 , i are A-local, b, k1 , v1 , j are B-local, and s1 , s2 , s3 are shared. Projecting the mixed literals on A and B as described in Section 3.1 yields the interpolation problem A : i = xij ∧ a = s1 ∧ s2 hk2  v2 i = s3 ∧ EQ(xik1 , i) ∧ i 6= k2 ∧ EQ(xa[i]b[j] , a[i]) B : xij = j ∧ s1 hk1  v1 i = s2 ∧ s3 = b ∧ ¬ EQ(xik1 , k1 ) ∧ ¬ EQ(xa[i]b[j] , b[j]) . An interpolant is I ≡ EQ(xa[i]b[j] , s1 [xij ]) ∧ s2 [xij ] = s3 [xij ] ∧ EQ(xik1 , xij ) . A implies I. As a = s1 holds, we get a[xij ] = s1 [xij ]. With EQ(xa[i]b[j] , a[i]) and i = xij follows EQ(xa[i]b[j] , s1 [xij ]). The equality s2 [xij ] = s3 [xij ] follows by applying (read-over-weakeq) on s3 = s2 hk2  v2 i and using xij = i 6= k2 . Finally, EQ(xik1 , i) and i = xij yield EQ(xik1 , xij ). B contradicts I. By ¬ EQ(xik1 , k1 ) and EQ(xik1 , xij ) we get xij 6= k1 . With s2 = s1 hk1 v1 i and (read-over-weakeq) we get s1 [xij ] = s2 [xij ]. From s3 = b we get s3 [xij ] = b[xij ]. Transitivity, EQ(xa[i]b[j] , s1 [xij ]) and s2 [xij ] = s3 [xij ] and the above select equalities yield EQ(xa[i]b[j] , b[xij ]). With xij = j, we get a contradiction to ¬ EQ(xa[i]b[j] , b[j]). Symbol condition for I. I contains only shared symbols and auxiliary variables. All auxiliary variables introduced for disequalities appear in positive EQ terms. Algorithm. Assume that the weak path P : a ≈i b is subdivided into A- and B-paths where shared paths are added to B-paths. Let x be the shared term for i = j, i.e. x stands for i if i is shared, for j if i is not shared but j is, and for the auxiliary variable xij if i = j is mixed. (i) An inner A-path π of P starts and ends with a shared term: π : s1 ≈i s2 (these shared array terms can also be auxiliary variables introduced for a mixed array equality). The summary is s1 [x] = s2 [x]. For every B-local index disequality i 6= k on π add the disjunct x = k and for every mixed index disequality add the disjunct EQ(xik , x). The interpolant of the subpath is _ _ Iπ ≡ s1 [x] = s2 [x] ∨ FπA (x) where FπA (x) :≡ x=k ∨ EQ(xik , x) . k∈Stores(π) k∈Stores(π) i6=k B-local i6=k mixed (ii) If a[i] 6= b[j] is mixed and a[i] is A-local, the first A-path on P starts with a or a is shared, i.e. π : a ≈i s1 (where s1 can be a). For the path π, build the term EQ(xa[i]b[j] , s1 [x]) and add FπA (x) as in case (i). Iπ ≡ EQ(xa[i]b[j] , s1 [x]) ∨ FπA (x) 5 Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler (iii) Similarly in the case where a[i] 6= b[j] is mixed and b[j] is A-local, the last A-path on P ends with b or b is shared, π : sn ≈i b. In this case the disjunct i 6= j needs to be added if i = j is B-local and i, j are both shared. Iπ ≡ EQ(xa[i]b[j] , sn [x]) ∨ FπA (x) [ ∨ i 6= j] (iv) For every B-path π, add the conjunct x 6= k for each A-local index disequality i 6= k, and the conjunct EQ(xik , x) for each mixed index disequality i 6= k on π. We define ^ ^ Iπ ≡ FπB (x) where FπB (x) :≡ x 6= k ∧ EQ(xik , x) . k∈Stores(π) k∈Stores(π) i6=k A-local i6=k mixed The lemma interpolant is the conjunction of the above path interpolants. If i, j are shared and i = j is A-local, add the conjunct i = j. ^ ^ I≡ Iπ ∧ Iπ [ ∧ i = j] . π∈A-paths π∈B-paths 4.2 There is a Shared Term for i = j and a[i] 6= b[j] is A-local If there exists a shared index for i = j and a[i] 6= b[j] is A-local, we build disequalities for the B-paths instead of equalities for the A-paths. This corresponds roughly to obtaining the interpolant of the inverse problem (B, A) by Section 4.1 and negating the resulting formula. Only the EQ terms are not negated because of the asymmetry of the projection. Using the definitions of FπA and FπB from the previous section, the lemma interpolant is _ _ I≡ (s1 [x] 6= s2 [x] ∧ FπB (x)) ∨ FπA (x) [ ∨ i 6= j] . (π:s1 ≈i s2 )∈B-paths π∈A-paths 4.3 Both i and j are B-local When both i and j are B-local, we have no term x representing the weak path index i in the interpolant I. Hence, summarizing A- (resp. B-) paths by terms of the form s1 [x] = s2 [x] ∨ FπA (resp. s1 [x] 6= s2 [x] ∧ FπB ) as above is not possible. Instead we use the diff function to make statements about the desired index. For instance, if a = bhi  vihj  wi for arrays a, b with 2 1 b[j] 6= w, then a b = b and diff(a, b) = j or diff(a b, b) = j hold. Example 2. Consider the following conflict: i = j ∧ a = s1 ∧ s1 hk  vi = s2 ∧ s2 = b ∧ i 6= k ∧ a[i] 6= b[j] where a, b, i, j are B-local, k, v are A-local, and s1 , s2 are shared. Splitting the mixed disequality i 6= k as described in Section 3.1 results in the interpolation problem A : s1 hk  vi = s2 ∧ EQ(xik , k) B : i = j ∧ a = s1 ∧ s2 = b ∧ ¬ EQ(xik , i) ∧ a[i] 6= b[j] . An interpolant should reflect the information that s1 and s2 can differ at most at one index satisfying the EQ term. Using diff, we can express the interpolant 1 I : (s1 = s2 ∨ EQ(xik , diff(s1 , s2 ))) ∧ s1 s2 = s2 . 6 Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler A implies I. The first literal in A implies that s1 and s2 differ at most at index k, yielding 1 s1 s2 = s2 . If s1 6= s2 , then k = diff(s1 , s2 ) and EQ(xik , diff(s1 , s2 )) follows from EQ(xik , k). B contradicts I. B implies that s1 [i] 6= s2 [i] holds. Thus, s1 = s2 cannot hold and I states that EQ(xik , diff(s1 , s2 )) holds. With ¬ EQ(xik , i) in B, this implies diff(s1 , s2 ) 6= i. Hence, 1 s1 and s2 must differ at least at two indices. This contradicts s1 s2 = s2 in I. Symbol condition for I. I contains only shared symbols and the auxiliary variable for i 6= k appears in a positive EQ term only. To generalize this idea, we define inductively for arrays a and b, a number m ≥ 0 and a formula F (·) with one free parameter: weq(a, b, 0, F (·)) :≡ a=b 1 (weq) weq(a, b, m + 1, F (·)) :≡ (a = b ∨ F (diff(a, b))) ∧ weq(a b, b, m, F (·)) . The term weq(a, b, m, F (·)) states that arrays a and b differ at most at m indices and that each index i on which they differ satisfies the formula F (i). Algorithm. For any A-path π : s1 ≈i s2 , we count the number of stores |π| := | Stores (π) |. Each index i where s1 and s2 differ needs to satisfy FπA (i) as defined in Section 4.1. There is nothing to do for B-paths. The lemma interpolant is ^ I≡ weq(s1 , s2 , |π|, FπA (·)) . (π:s1 ≈i s2 )∈A-paths 4.4 Both i and j are A-local In the case where both i and j are A-local, we summarize the B-paths, as all B-path ends are shared terms. Analogously to weq, we define for arrays a, b, a number m ≥ 0 and a formula F : nweq(a, b, 0, F (·)) :≡ a 6= b 1 (nweq) nweq(a, b, m + 1, F (·)) :≡ (a 6= b ∧ F (diff(a, b))) ∨ nweq(a b, b, m, F (·)) . The term nweq(a, b, m, F (·)) expresses that either one of the first m indices i found by stepwise rewriting a to b satisfies the formula F (i), or a and b differ at more than m indices. Analogously to the last section, the lemma interpolant can be expressed as _ I≡ nweq(s1 , s2 , |π|, FπB (·)) . (π:s1 ≈i s2 )∈B-paths Note that this is almost the negation of the interpolant of (B, A) computed as in Section 4.3. Only the EQ terms are not negated because of the asymmetry of the projection. 5 Interpolants for Weakeq-Ext Lemmas In this section, we describe how to compute interpolants for array lemmas of type (weakeq-ext). Figure 2 displays a visualization of the corresponding conflict P ^ Cond(a ⇔ b) ∧ Cond(a ∼i b) ∧ a 6= b . i∈Stores(P ) 7 Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler / i1 i2 im−1 im a ... b i1 — — 1 1 k11 k21 km 1 −1 km 1 a ... b .. . im — — m m k1m k2m km m −1 km m a ... b Figure 2: Visualization of a weakeq-ext lemma. Solid lines represent strong (dis-)equalities, zigzag lines store paths and dashed zigzag lines weak paths which can contain select edges. The main path P shows that a and b differ at most at the indices in Stores (P ) and a ∼i b shows that a and b do not differ at index i. A weak path labelled with i (short: i-path) represents weak congruence on i, i.e. it can contain select edges a0 j, k b0 where i ∼ j and i ∼ k and a0 [j] ∼ b0 [k]. We modify the methods in Section 4 to summarize the i-paths. In the B-local case 4.3, B-local select edges make no difference, as the weq terms are built over A-paths, and analogously for the A-local case 4.4. However, if there are A-local select edges in the B-local case or vice versa, then k is shared or the index equality i = k is mixed and we can use k or the auxiliary variable xik to proceed as in the shared cases 4.1 and 4.2. We have to adapt the interpolation procedures in Sections 4.1 and 4.2 by adding the index equalities that pertain to a select edge, analogously to the index disequality for a store edge before. More specifically, we add to FπA (x) a disjunct x 6= k for each B-local i = k on an A-path, and x 6= xik for each mixed i = k. Here, x is the shared term for the main index equality i = j as before. For B-paths we add to FπB (x) the conjunct x = k for each A-local i = k and x = xik for each mixed i = k. Furthermore, if there is a mixed select equality a0 [j] = b0 [k] on the weak path, the auxiliary variable xa0 [j]b0 [k] is used in the summary for the A-path instead of s[x], i.e., we get a term of the form xa0 [j]b0 [k] = s2 [x] in 4.1, and analogously for 4.2. For (weakeq-ext) lemmas, we distinguish three cases: (i) a 6= b is in B, (ii) a 6= b is A-local, or (iii) a 6= b is mixed. 5.1 a 6= b is in B If the disequality a 6= b is in B, the A-paths both on the main store path and on the weak paths have only shared path ends. Hence, we summarize A-paths similarly to Sections 4.1 and 4.3. P Algorithm. Divide the main path a ⇔ b into A-paths and B-paths. For each i ∈ Stores (P ) on a B-path, summarize the corresponding i-path as in Sections 4.1 or 4.3. The resulting π summary is denoted by Ii . For an A-path s1 ⇔ s2 use a weq term to state that each index where s1 and s2 differ satisfies Ii (·) for some i ∈ Stores (π) where Ii is computed as in 4.1 with the shared term · for i = j. The lemma interpolant is ^ ^ _ I≡ Ii ∧ weq(s1 , s2 , |π|, Ii (·)) . π i∈Stores(π) (s1 ⇔s2 )∈A-paths i∈Stores(π) π∈B-paths 8 Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler 5.2 a 6= b is A-local The case where a 6= b is A-local is similar with the roles of A and B swapped. For each i ∈ Stores (π) on an A-path π on P , interpolate the corresponding weak path as in Sections 4.2 or 4.4 and obtain Ii . For each i ∈ Stores (π) on a B-path π on P , interpolate the corresponding weak path as in Section 4.2 using · as shared term and obtain Ii (·). The lemma interpolant is _ _ ^ I≡ Ii ∨ nweq(s1 , s2 , |π|, Ii (·)) . π i∈Stores(π) (s1 ⇔s2 )∈B-paths i∈Stores(π) π∈A-paths 5.3 a 6= b is mixed If a 6= b is mixed, where w.l.o.g. a is A-local, the outer A- and B-paths end with A-local or B-local terms respectively. The auxiliary variable xab may not be used in store or select terms, thus we first need to find a shared term representing a before we can summarize A-paths. Example 3. Consider the following conflict: a = shi1  v1 i ∧ b = shi2  v2 i ∧ a 6= b (main path) ∧ a[i1 ] = s1 [i1 ] ∧ b = s1 hk1  w1 i ∧ i1 6= k1 (i1 -path) ∧ a = s2 hk2  w2 i ∧ i2 6= k2 ∧ b[i2 ] = s2 [i2 ] (i2 -path) where a, i1 , v1 , k2 , w2 are A-local, b, i2 , v2 , k1 , w1 are B-local and s, s1 , s2 are shared. An interpolant for the conflict is I ≡ EQ(xab , s) ∧ weq(s, s2 , 1, EQ(xi2 k2 , ·))   ∨ nweq s, s1 , 2, EQ(xab , sh·  s1 [·]i) ∧ weq(sh·  s1 [·]i, s2 , 1, EQ(xi2 k2 , :)) ∧ EQ(xi1 k1 , ·) , where in the second line the symbol · refers to the diff term of the outer nweq term and the symbol : to the diff term of the inner weq term. A implies I. Because of a = shi1  v1 i, the arrays a and s can differ only at i1 . If a = s, we get EQ(xab , s) by replacing a in the A-projection of a 6= b and we get weq(s, s2 , 1, EQ(xi2 k2 , ·)) from the i2 -path as in Section 4.3. Otherwise, s[i1 ] 6= s1 [i1 ], as a[i1 ] = s1 [i1 ] holds in A by the i1 -path. By correcting s at i1 with s1 [i1 ], we get a, and therefore EQ(xab , shi1  s1 [i1 ]i) holds. Again, we get weq(shi1  s1 [i1 ]i, s2 , 1, EQ(xi2 k2 , :)) as in Section 4.3. Finally, for i1 , we have the literal EQ(xi1 k1 , i1 ) by projection. We know that i1 is among the diff terms between s and s1 . Thus, the nweq term holds since i1 satisfies the nested formula. I contradicts B. Assume that the first disjunct of I holds. By EQ(xab , s) in I and ¬ EQ(xab , b) in B, we get s 6= b. The only potential difference is at i2 because of B. Since B contains s2 [i2 ] = b[i2 ] this implies that s and s2 also differ at i2 . However, with the weq term in I and ¬ EQ(xi2 k2 , i2 ) in B, we get a contradiction. Assume now that the second disjunct of I holds. Then either s and s1 must differ at some index · which satisfies the formula inside the nweq term, or at more than 2 positions. We know from B that s and s1 can only differ at i2 and k1 . For k1 , there is the term ¬ EQ(xi1 k1 , k1 ) in B. Hence, · can only be i2 . Because of EQ(xab , sh·  s1 [·]i) in I and ¬ EQ(xab , b) in B, we get sh·  s1 [·]i = 6 b. Since s and b only differ at i2 , the difference can only be at i2 = · and sh·  s1 [·]i[i2 ] 6= b[i2 ] = s2 [i2 ]. By the inner weq term in I, i2 has to satisfy EQ(xi2 k2 , i2 ), which is a contradiction to B. 9 Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler π Algorithm. Identify in the main path P the first A-path a ⇔0 s1 and its store indices Stores (π0 ) = {i1 , . . . i|π0 | }. To build an interpolant, we rewrite s1 by storing at each index im the value a[im ]. We use s̃ to denote the intermediate arrays. We build a formula Im (s̃) inductively over m ≤ |π0 |. This formula is an interpolant if s̃ is a shared array that differs from a only at the indices i1 , . . . , im . For m = 0, i.e., a = s̃, we modify the lemma by adding the strong edge s̃ ↔ a in front of all paths and summarize it as if it was B-local using the algorithm in Section 5.1, but drop the π weq term for the path s̃ ↔ a ⇔0 s1 . This yields I5.1 (s̃). We define I0 (s̃) ≡ EQ(xab , s̃) ∧ I5.1 (s̃) . For the induction step to m + 1 we assume that s̃ only differs from a at i1 , . . . , im , im+1 . Our goal is to find a shared index term x for im+1 and a shared value v for a[x]. We use the im+1 -path to conclude that s̃hx  vi is equal to a at im+1 . Then we can include Im (s̃hx  vi) computed using the induction hypothesis. (i) If there is a select edge on a B-subpath of the im+1 -path or if im+1 is itself shared, we immediately get a shared term x for im+1 . If the last B-path π m+1 on the im+1 -path starts with a mixed select equality, then the corresponding auxiliary variable is the shared value v. Otherwise, π m+1 starts with a shared array sm+1 and v := sm+1 [x]. We summarize the im+1 -path from a to the start of π m+1 as in Section 4.2 and get I4.2 (x). Finally, we set Im+1 (s̃) ≡ I4.2 (x) ∨ (Im (s̃hx  vi) ∧ FπBm+1 (x)) . π m+1 (ii) Otherwise, we split the im+1 -path into a ∼im+1 sm+1 and sm+1 ⇔ b, where π m+1 is the last B-subpath of the im+1 -path. If s1 and a are equal on im+1 then also s̃ and a are equal and the interpolant is simply Im (s̃). If a and sm+1 differ on im+1 , we build an interpolant from a ∼im+1 sm+1 as0 in 4.4 and obtain I4.4 . Otherwise, s1 and sm+1 differ on im+1 . We build the P π store path s1 ⇔ sm+1 by concatenating P and π m+1 . Using nweq on the subpaths s ⇔ s0 of 0 |π| 0 P we find the shared term x for im+1 . If π is in A we need to add the conjunct s s = s0 to obtain an interpolant. We get Im+1 (s̃) ≡ Im (s̃) ∨ I4.4 [for a ∼im+1 sm+1 ]   |π| 0 _ ∨ nweq s, s0 , |π|, Im (s̃h·  sm+1 [·]i) ∧ FπBm+1 (·) [ ∧ s s = s0 ] . π s⇔s0 in P 0 The lemma interpolant for the mixed extensionality lemma is I ≡ I|π0 | (s1 ). 6 Conclusion and Future Work We presented an interpolation algorithm for the quantifier-free fragment of the theory of arrays. The algorithm uses an efficient array solver based on weak equivalence on arrays. In contrast to most existing interpolation algorithms for arrays, the solver does not depend on the partioning of the interpolation problem. Thus, our technique allows for efficient interpolation especially in the context of sequence interpolants and tree interpolants where interpolants for different partitions of the same unsatisfiable formula need to be computed. Because we use the framework of proof tree preserving interpolation we only need to provide algorithms to interpolate the lemmas. These algorithms build the formulas by simply iterating over the weak equivalence and weak congruence paths found by the array solver. Implementation of the algorithm in SMTInterpol [8] is ongoing work. The algorithm for read-over-weakeq lemmas is already implemented and supports sequence and tree interpolation. 10 Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler References [1] Pavel Andrianov, Karlheinz Friedberger, Mikhail U. Mandrykin, Vadim S. Mutilin, and Anton Volkov. CPA-BAM-BnB: Block-abstraction memoization and region-based memory models for predicate abstractions - (competition contribution). In TACAS (2), volume 10206 of Lecture Notes in Computer Science, pages 355–359, 2017. [2] Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala, and Rupak Majumdar. The software model checker blast. STTT, 9(5-6):505–525, 2007. [3] Angelo Brillout, Daniel Kroening, Philipp Rümmer, and Thomas Wahl. Program verification via craig interpolation for presburger arithmetic with arrays. In VERIFY@IJCAR, volume 3 of EPiC Series in Computing, pages 31–46. EasyChair, 2010. [4] Roberto Bruttomesso, Silvio Ghilardi, and Silvio Ranise. Quantifier-free interpolation in combi- nations of equality interpolating theories. ACM Trans. Comput. Log., 15(1):5:1–5:34, 2014. [5] Franck Cassez, Anthony M. Sloane, Matthew Roberts, Matthew Pigram, Pongsak Suvanpong, and Pablo González de Aledo Marugán. Skink: Static analysis of programs in LLVM intermediate representation - (competition contribution). In TACAS (2), volume 10206 of Lecture Notes in Computer Science, pages 380–384, 2017. [6] Jürgen Christ and Jochen Hoenicke. Weakly equivalent arrays. In FroCos, volume 9322 of Lecture Notes in Computer Science, pages 119–134. Springer, 2015. [7] Jürgen Christ and Jochen Hoenicke. Proof tree preserving tree interpolation. J. Autom. Reasoning, 57(1):67–95, 2016. [8] Jürgen Christ, Jochen Hoenicke, and Alexander Nutz. Smtinterpol: An interpolating SMT solver. In SPIN, volume 7385 of Lecture Notes in Computer Science, pages 248–254. Springer, 2012. [9] Jürgen Christ, Jochen Hoenicke, and Alexander Nutz. Proof tree preserving interpolation. In TACAS, volume 7795 of Lecture Notes in Computer Science, pages 124–138. Springer, 2013. [10] Jürgen Christ, Jochen Hoenicke, and Alexander Nutz. Proof tree preserving interpolation. CoRR, abs/1705.05309, 2017. Improved and simplified version. http://arxiv.org/abs/1705.05309. [11] Matthias Dangl, Stefan Löwe, and Philipp Wendler. Cpachecker with support for recursive pro- grams and floating-point arithmetic - (competition contribution). In TACAS, volume 9035 of Lecture Notes in Computer Science, pages 423–425. Springer, 2015. [12] Marius Greitschus, Daniel Dietsch, Matthias Heizmann, Alexander Nutz, Claus Schätzle, Christian Schilling, Frank Schüssele, and Andreas Podelski. Ultimate taipan: Trace abstraction and abstract interpretation - (competition contribution). In TACAS (2), volume 10206 of Lecture Notes in Computer Science, pages 399–403, 2017. [13] Matthias Heizmann, Yu-Wen Chen, Daniel Dietsch, Marius Greitschus, Alexander Nutz, Betim Musa, Claus Schätzle, Christian Schilling, Frank Schüssele, and Andreas Podelski. Ultimate au- tomizer with an on-demand construction of floyd-hoare automata - (competition contribution). In TACAS (2), volume 10206 of Lecture Notes in Computer Science, pages 394–398, 2017. [14] Matthias Heizmann, Jochen Hoenicke, and Andreas Podelski. Nested interpolants. In POPL, pages 471–482. ACM, 2010. [15] Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar, and Grégoire Sutre. Lazy abstraction. In POPL, pages 58–70. ACM, 2002. [16] John McCarthy. Towards a mathematical science of computation. In IFIP Congress, pages 21–28, 1962. [17] Kenneth L. McMillan. Lazy abstraction with interpolants. In CAV, volume 4144 of Lecture Notes in Computer Science, pages 123–136. Springer, 2006. [18] Alexander Nutz, Daniel Dietsch, Mostafa Mahmoud Mohamed, and Andreas Podelski. ULTI- MATE KOJAK with memory safety checks - (competition contribution). In TACAS, volume 9035 of Lecture Notes in Computer Science, pages 458–460. Springer, 2015. [19] Viorica Sofronie-Stokkermans. Hierarchic reasoning in local theory extensions. In CADE, volume 11 Efficient Interpolation for the Theory of Arrays Hoenicke and Schindler 3632 of Lecture Notes in Computer Science, pages 219–234. Springer, 2005. [20] Nishant Totla and Thomas Wies. Complete instantiation-based interpolation. J. Autom. Reason- ing, 57(1):37–65, 2016. [21] Greta Yorsh and Madanlal Musuvathi. A combination method for generating interpolants. In CADE, volume 3632 of Lecture Notes in Computer Science, pages 353–368. Springer, 2005. 12