1. Introduction

AXDInterpolator: A Tool for Computing Interpolants for Arrays with MaxDif

Jose Abel Castellanos Joo

Silvio Ghilardi

Alessandro Gianola

Deepak Kapur

0 0 Department of Computer Science, University of New Mexico , Albuquerque , USA 1 Dipartimento di Matematica, Università degli Studi di Milano , Milan , Italy 2 Faculty of Computer Science, Free University of Bozen-Bolzano , Bolzano , Italy

40 52

Several approaches toward quantifier-free interpolation algorithms of theories involving arrays have been proposed by extending the language using a binary function skolemizing the extensionality principle. In FoSSaCS 2021, the last three authors studied the enrichment of the McCarthy's theory of extensional arrays with a maxdif operation. This paper discusses the implementation of the interpolation algorithm proposed in FoSSaCS 2021 using the Z3 API. The implementation allows the user to choose iZ3, Mathsat, or SMTInterpol as interpolation engines. The tool returns a formula in SMTLIB2 format, which allows compatibility with model checkers and invariant generators using such a format. We compare our algorithm with state-of-the-art interpolation engines. Our experiments using unsatisfiable formulae extracted with the model checker UAutomizer show the feasibility of our tool. For that purpose, we used C programs from the ReachSafety-Arrays and MemSafety-Arrays tracks of SV-COMP.

eol>Interpolation Arrays MaxDif SMT

1. Introduction

comparing, searching, sorting functions; verification problems of this kind are collected, e.g., in the SV-COMP benchmarks categories “ReachSafety-Arrays” and “MemSafety-Arrays”1, where safety verification tasks involving arrays of finite but unknown length are considered.

By itself, the theory of arrays with extensionality does not have quantifier free interpolation [ 5 ]; however, in [ 6 ] it was shown that quantifier-free interpolation is restored if one enriches the language with a binary function skolemizing the extensionality axiom. Such a Skolem function, applied to two array variables , , returns an index diff(, ) where , difer (it returns an arbitrary value if is equal to ). This semantics for the diff operation is very undetermined and does not have a significant interpretation in concrete programs. That is why in [ 7 ] the diff operation is modified so as to obtain a defined and natural meaning: we introduce the theory of arrays with maxdif ℛ( ), where diff(, ) returns the biggest index where , difer (in case = we ask for diff(, ) to be the minimum index 0). Since it is natural to view arrays as functions defined on initial intervals of the nonnegative integers, this choice has a clear semantic motivation. The expressive power of the theory of arrays so enriched becomes bigger: for instance, if we also add to the language a constant symbol for the undefined array constantly equal to some ‘undefined’ value ⊥ (where ⊥ is meant to be diferent from the values [] actually in use), then we can define || as diff(, ). In this way we can model the fact that is undefined outside the interval [ 0, || ] - this is useful to formalize the above mentioned SV-COMP benchmarks. The efectiveness of quantifier-free interpolation in the theory of arrays with maxdif is exemplified in the Strcpy function example of Figure 1 in [ 7 ]:2 we need to express | · | to formalize its initial states without quantifiers. In [ 7 ], ℛ( ) is shown to admit quantifier-free interpolation, and a suitable algorithm based on instantiation à-la-Herbrand and on a hierarchical approach ([ 8, 9, 10 ]) is introduced: by exploiting iterated applications of the maxdif operator and reducing the interpolation problem to a reduced interpolating theory, we can compute quantifier-free interpolants out of the input formulae.

In this paper, we present (in Section 3) AXDInterpolator3, a tool that implements the interpolation algorithm of [ 7 ]. Following an instantiation-based approach, the algorithm always (and in particular for all the examples in this paper) returns a quantifier-free formula. A comparison is done with the state-of-the-art interpolation engines that have known interpolant generation algorithms for the theory of arrays–iZ3, MathSat, and SMTInterpol [ 11, 12, 13 ] support the theory of arrays with maxdif. Indeed, on examples supporting expressive features like diff or the | · | operator, no proper comparison is possible. In case unsafe traces for such examples are considered, in contrast to AXDInterpolator, Mathsat and iZ3 cannot usually get a quantifier-free interpolant: either the tools detect the satisfiability of the unsafe trace (since they reason modulo a theory that is weaker than ℛ( )) so they cannot compute an interpolant, 1https://sv-comp.sosy-lab.org/2020/benchmarks.php 2The Strcpy function performs sequential index-wise copies of values of two arrays , from to where is the length of both arrays: the loop invariant for this examples is = ∨ ( > diff(, ) ∧ diff(, ) ≥ ), being the index variable in the loop. This formula is a loop invariant because if the two arrays , are diferent while < then the conjunction > diff(, ) ∧ diff(, ) ≥ asserts that the possibly entry where , difer has not been processed yet.

3We use the prerelease version 0.9.0 of our tool: https://github.com/typesAreSpaces/AXDInterpolator/releases/ tag/0.9.0 or, in case the unsafe trace is correctly detected as unsat (equipping the solver with axioms for maxdif), they fail returning errors or quantified interpolant or timeout; for Example 1 from [ 7 ], SMTInterpol fails. Mathsat and iZ3 fail also in the case of the example from [ 5 ], which is well-known not to admit a quantifier-free interpolant unless the extensionality axiom is skolemized with a generic dif operator: in this specific case, SMTInterpol, instead, can successfully compute a quantifier-free interpolant using the plain dif operator, but AXDInterpolator can compute an interpolant with maxdif a bit faster. Running experiments over two tracks of the SV-COMP (ReachSafety-Arrays and MemSafety-Arrays), we use unsatisfiable formulae extracted with the model checker UAutomizer [ 14 ]: the latter automatically transforms C programs taken from those benchmarks into suitable SMTLIB2 files that allow to verify particular properties of these programs. Since UAutomizer does not support the extended language of arrays with maxdif, the comparison among the diferent solvers becomes acceptable. The results show that AXDInterpolator, iZ3, Mathsat and SMTInterpol have an overall identical behavior over these benchmarks: interpolants are computed for almost every example, apart from a small portion of timeout outcomes reached by AXDInterpolator as its implementation is evolving. On more expressive examples, such as Example 1 from [ 7 ], other solvers including the most expressive SMTInterpol fail, whereas AXDinterpolator produces useful interpolants. AXDInterpolator uses the Z3 API [ 15 ] and allows the user to choose among diferent solvers in order to compute the interpolants in the reduced theory invoked by the hierarchical method: iZ3, Mathsat [ 12 ] and SMTInterpol [ 13 ]. It generates a formula in SMTLIB2 format, which guarantees compatibility with model checkers and invariant generators adopting the same format.

2. Arrays with MaxDif

The McCarthy theory of arrays [ 4 ] with extensionality has three sorts ARRAY, ELEM, INDEX (called “array”, “element”, and “index” sort, respectively) and two function symbols (“read”) and (“write”) of appropriate arities. Besides the axioms for defining and , it contains the ‘extensionality’ axiom: ∀, . ̸= → (∃. (, ) ̸= (, )). In [ 7 ], the Skolem function diff for this axiom has been given a proper semantics, i.e., diff returns the biggest index where two diferent arrays difer. For introducing our theory of arrays with maxdif, we first need to consider an index theory (see [ 7 ] for the general definition). The signature of an index theory contains at least the binary relation symbol ≤ and the constant 0. Examples of index theories are the theory of total orders , integer diference logic ℐℒ, integer linear arithmetic ℒℐ, and real linear arithmetics ℒℛ. For most applications, ℐℒ (namely the theory of integer numbers with 0, ordering, successor and predecessor) sufices as in this theory one can model counters for scanning arrays.

Given an index theory , we now introduce our array theory with maxdif ℛ( ) (parameterized by ) as follows. We still have the same sorts; the language includes the symbols of , the operations , , a binary function diff of type ARRAY × ARRAY → INDEX, as well as constants and ⊥ of sorts ARRAY and ELEM, resp. The constant ⊥ models an undetermined (e.g. undefined, not-in-use, not coming from appropriate initialization, etc.) value and models the totally undefined array; the term diff(, ) returns the maximum index where and difer and returns 0 if and are equal.4 Formally, the axioms of ℛ( ) include, besides the axioms of , the following ones:

As an efect of the above axioms, we have that an array is undefined outside the interval [0, ||], where || is defined as || := diff(, ). Typically, this interval is finite, and it can be shown that any satisfiable constraint is satisfiable in a model where all such intervals (involving variables appearing in the constraint) are finite [ 7 ].

For the interpolation algorithm from [ 7 ], we need to introduce iterated diff operations, similarly to [ 10 ]. As we know, diff(, ) returns the biggest index where and difer (it returns 0 if = ). Now we want an operator that returns the last-but-one index where , difer (0 if , difer in at most one index), etc. Our language is already enough expressive for that, so we can introduce such operators explicitly as follows. Given array variables , , we define by mutual recursion the sequence of array terms 1, 2, . . . and of index terms diff1(, ), diff2(, ), . . . : 1 := ; diff1(, ) := diff(, 1); +1 := (, diff(, ), (, diff(, ))); diff+1(, ) := diff(, +1) Lemma 2.1. A formula like diff1(, ) = 1 ∧ · · · · · · ∧ diff(, ) = is equivalent modulo ℛ to the conjunction of the following five formulae: 1 ≥ 2 ∧ · · · ∧

− 1 ≥ ∧ ≥ 0 ⋀︀<( > +1 → (, ) ̸= (, ))

⋀︀<( = +1 → = 0) ⋀︀≤ ((, ) = (, ) → = 0) ∀ℎ (ℎ > → (, ℎ) = (, ℎ) ∨ ℎ = 1 ∨ · · · ∨ ℎ = − 1) (8) Separated Pairs, -Instantiation and Satisfiability. The key step of the interpolation algorithm from [ 7 ] depends upon the problem of checking satisfiability (modulo ℛ( )) of quantifier-free formulae; this is solved by adapting instantiation techniques, like those from [ 16 ].

4Notice that it might well be the case that diff(, ) = 0 for diferent , , but in that case 0 is the only index where , difer.

The complexity () of a term is defined as the number of function symbols occurring in . If ℐ is a set of -terms, an ℐ-instance of a universal formula of the kind ∀ is a formula of the kind (/) for some ∈ ℐ.

A pair of sets of quantifier-free formulae Φ = (Φ 1, Φ 2) is a separated pair if (1) Φ 1 contains equalities of the form diff(, ) = and = (, , ); moreover if it contains the equality diff(, ) = , it must also contain an equality of the form diff(, ) = for every < ; (2) Φ 2 contains Boolean combinations of -atoms and of atoms of the forms: {(, ) = (, ), (, ) = , 1 = 2}, where , , , , , 1, 2 are variables or constants of the appropriate sorts.

The separated pair is said to be finite if Φ 1 and Φ 2 are both finite. In practice, in a separated pair Φ = (Φ 1, Φ 2), reading (, ) as a functional application, it turns out that the formulae from Φ 2 can be translated into quantifier-free formulae of the combined theory ∪ ℰ ℱ (the array variables occurring in Φ 2 are converted into free unary function symbols). ∪ ℰ ℱ enjoys the decidability of the quantifier-free fragment and has quantifier-free interpolation because is an index theory [ 7 ]: we adopt a hierarchical approach (similar to [ 8, 9 ]) and we rely on satisfiability and interpolation algorithms for such a theory as black boxes .

Let ℐ be a set of -terms and let Φ = (Φ 1, Φ 2) be a separated pair; we let Φ( ℐ) = (Φ 1(ℐ), Φ 2(ℐ)) be the smallest separated pair satisfying the following conditions: - Φ 1(ℐ) is equal to Φ 1 and Φ 2(ℐ) contains Φ 2; - Φ 2(ℐ) contains all ℐ-instances of the two formulae ∀ (, ) = ⊥, ∀ ( < 0 → (, ) = ⊥), where is any array variable occurring in Φ 1 or Φ 2; - if Φ 1 contains = (, , ), then Φ 2(ℐ) contains all the ℐ-instances of the equivalent formula ( ≥ 0 → (, ) = ) ∧ ∀ℎ (ℎ ̸= → (, ℎ) = (, ℎ)); - if Φ 1 contains the conjunction ⋀︀

=1 diff(, ) = , then Φ 2(ℐ) contains the formulae (9), (10), (11), (12) as well as all ℐ-instances of the formula (13).

For ∈ N ∪ {∞}, the -instantiation of Φ = (Φ 1, Φ 2) is the separated pair Φ( ℐΦ ) = (Φ 1(ℐΦ ), Φ 2(ℐΦ )), where ℐΦ is the set of -terms of complexity at most built up from the index variables occurring in Φ 1, Φ 2. The full instantiation of Φ = (Φ 1, Φ 2) is the separated pair Φ( ℐΦ∞) = (Φ 1(ℐΦ∞), Φ 2(ℐΦ∞)) (which is usually not finite). A separated pair Φ = (Φ 1, Φ 2) is -instantiated if Φ = Φ( ℐΦ ); it is ℛ( )-satisfiable if so it is the formula ⋀︀ Φ 1 ∧ ⋀︀ Φ 2. The (ℛ( )) problem is decidable for every index theory (and can be reduced to ∪ ℰ ℱ -satisfiability) thanks to the following results: Lemma 2.2. Given a quantifier-free formula , we can compute finitely many finite separation pairs Φ 1 = (Φ 11, Φ 12), . . . , Φ = (Φ 1, Φ 2) such that is ℛ( )-satisfiable if so is one of the Φ .

Lemma 2.3. The following conditions are equivalent for a finite separation pair Φ = (Φ (i) Φ is ℛ( )-satisfiable; (ii) ⋀︀ Φ 2(ℐΦ0) is ∪ ℰ ℱ -satisfiable. 1, Φ 2):

2.1. The interpolation algorithm

The interpolation algorithm from [ 7 ] is based on instantiation à-la-Herbrand. Our problem is to compute a quantifier-free interpolant in ℛ( ) of two quantifier-free formulae and .

We call the variables occurring in both and common variables, whereas the variables occurring in (resp. in ) are called -variables (resp. -variables). The same terminology applies to terms, atoms and formulae. As shown in [ 7 ], both and can be given in the form of ifnite separated pairs. Thus, is of the form ⋀︀ 1 ∧ ⋀︀ 2, is of the form ⋀︀ 1 ∧ ⋀︀ 2, for separated pairs (1, 2) and (1, 2). Also, and can be assumed to be both 0-instantiated.

The formulae from 2 and 2 are formulae from the signature of ∪ ℰ ℱ (after rewriting terms of the kind (, ) to (), where the are free function symbols). Of course, if 2 ∧ 2 is ∪ ℰ ℱ -inconsistent, we can get our quantifier-free interpolant by using our black box algorithm for interpolation in the weaker theory ∪ ℰ ℱ .

Formally, we apply the Loop below until 2 ∧ 2 becomes ∪ ℰ ℱ -inconsistent: Theorem 4 in [ 7 ] guarantees that, if ∧ is ℛ( )-inconsistent, then the Loop always terminates. When 2 ∧ 2 becomes inconsistent, we can get our interpolant via the interpolation algorithm for ∪ ℰ ℱ . We need a counter recording how many times the Loop has been executed. Loop (to be repeated until 2 ∧ 2 becomes inconsistent modulo ∪ ℰ ℱ , initially = 0). Pick two distinct common ARRAY-variables 1, 2 and ≥ 1 and such that no conjunct of the kind diff(1, 2) = occurs in both 1 and 1 for some ≥ 1 (but such that for every < there is a conjunct of the form diff(, ) = occurring in both 1 and 1). Pick also a fresh INDEX constant ; conjoin diff(1, 2) = to both 1 and 1; then -instantiate both and . Increase to + 1.

Notice that the fresh index constants introduced during the Loop are considered common constants and so they are considered in the -instantiation of both and . In the output, these constants need to be eliminated.

3. Implementation

Our implementation is written in C++ using the Z3 C++ API [ 15 ]. It takes as input: - A SMTLIB2 file using standard elements of the language. - An index theory [ 7 ]. Currently, we support quantifier free fragments of total order theory (QF_TO), integer diference ( QF_IDL), and linear integer arithmetic (QF_LIA). - A positive integer indicating a maximum number of allowed iterations of the main loop. 5 - An integer number denoting the SMT solver to use to compute the interpolant in the case the input is unsatisfiable in the reduced theory ∪ ℰ ℱ . Currently, we support the SMT solvers Z3, Mathsat, and SMTInterpol.

Our extended language is parameterized by the array sorts in the input formula as well as by an index theory. The domain sort of every array is currently implemented using the Int sort 6. (declare-sort A) (declare-fun diff’A’ ((Array Int A) (Array Int A)) Int) (declare-fun length’A’ ((Array Int A)) Int) (declare-fun empty_array’A’ () (Array Int A) (declare-fun undefined’A’ () A) 5Index theories stronger than QF_TO require such upper bound for termination purposes in reasonable time. 6The requirement of including additional information on the name of sorted-functions in our extended language is necessary since Mathsat does not implement ad hoc polymorphism.

The rest of the SMTLIB2 file should specify two assertions. The first one encodes the A-part and the second one the B-part of the interpolation problem. In the following we describe the algorithm in [ 7 ] by phases as it was implemented in our tool.

Preprocessing. This step takes two main parts. First, our tool scans the declaration section in the input file to internalize the signature involving the sorts of the array elements present in the input file. Additionally, we instantiate new declarations of uninterpreted sorts ArraySortA which abstract the array sorts present in the input file, as well as two new functions: (declare-fun wr ((ArraySortA Int A)) ArraySortA) (declare-fun rd (ArraySortA Int) A)

with a proper uninterpreted sort. The reason for this is to internalize an uninterpreted language in the algorithm in order to perform theory reduction from ℛ( ) to ∪ ℰ ℱ .

The second part involves processing the input formula. The algorithm in [ 7 ] handles a conjunctive fragment situation. Thus, the algorithm uses Z3 tactics to perform a DNF transformation and produce disjuncts of conjunctions. Every disjunct is processed by the AXDInterpolator C++ class in our implementation which can handle conjunctive formulae. Each conjunctive formula aforementioned is normalized (we replace length’A’(x) subformulae for diff’A’(x, empty-array’A’)) and flattened by introducing constants of the right sort. During the execution of the aforementioned methods, we keep track of the constants as well as array constants appearing in the input formula. Index constants are used to instantiate quantified formulae of the respective ℛ( ) theory in use while array constants are necessary to compute common pairs of arrays between the A-part and the B-part of the interpolation problem in the main loop of the algorithm.

Standard Pairs. Separated pairs from Section 2 are called in our implementation Standard Pairs, and their requirements are encoded as a C++-class called StandardInput. There, atoms {(, ) = (, ), (, ) = , 1 = 2} are such that , are of sort ArraySort’A’, , are of sort Int, is of uninterpreted sort A, and 1, 2 are of appropriate sorts.

Given a formula, this class keeps two separate z3::expr_vector denoting the conjunction of formulae on the first and second component of the standard pair. We use one instantiation of the StandardInput class for both formulae A-part and B-part. Additionally, this class implements methods handling the instantiations of formulae. Instantiation is achieved by including an undefined z3::expr which is replaced using the z3.substitute(z3::expr_vector const &, z3::expr_vector const&) method.

Main Loop of the Algorithm and Interpolation Generation. First, the algorithm identifies the set of common array constants between A-part and the B-part. If this set is empty or contains a single element, the algorithm proceeds to instantiate an z3::solver and check the satisfiability of the second component of the Standard Pairs of the A-part and B-part. Otherwise, it uses an iterator to select common pair of array constants which selects elements in a fair way 7. Similarly to the previous case, it checks the satisfiability of the second component 7A selection over objects can be seen as a function : {1, . . . , } → N. If ∈ {1, . . . , }, () denotes the of the Standard Pairs. If the state of the solver is unsatisfiable the loop terminates saving the formulae of the Standard Pairs of the A-part and B-part. Otherwise, it selects the common pair (1, 2) of array constants and ‘propagates’ this pair to each of the Standard Pairs of the A-part and B-part. The propagation consists of adding the equation of the form diff(1, 2) = to the first component of the Standard Pairs where is the smallest integer number such that diff(1, 2) does not occur in the Stand Pair of both A-part and B-part and is a fresh Int constant. Subsequently, both Standard Pairs are − .

If the index theory in use extends QF_TO, then the implementation uses heuristics to increase the value of an internal number keeping track of the maximum allowed complexity on the index terms. We noticed the implementation performs well if only 0 − are executed. Better heuristics are under investigation.

If in the main loop the solver eventually detects that the second component of the Standard Pairs is unsatisfiable, then the implementation saves the formulae of the Standard Pairs of the A-part and B-part respectively. Depending on the choice of the user, the implementation executes a command on an external shell calling the SMT-solver to compute the interpolant using the pair of unsatisfiable formulae mentioned before. For the latter, the implementation ‘compiles’ a temporary SMTLIB2 file for the respective SMT-solver chosen.

Reading the interpolant result from the previous process the algorithm removes added constants in order to provide a solution using elements from the input formula. Since no SMT-solver supports interpolation algorithms for the QF_TO we select the QF_IDL as the theory for the interpolation engine and use a custom rewriter implemented on Z3 in order to relax a QF_IDL formula as a QF_TO statement. The latter is important for us in order to remain closed in the respective language.

Correctness of the Output. If the interpolant is successfully computed, our implementation allows the user to automatically verify the correctness of the output, that is the output is a real interpolant in ℛ( ): this is done by exploiting the first-order axioms of ℛ( ) and by calling a z3::solver instance using them. For the latter the implementation ‘lifts’ the result by replacing applications of diff to their recursive definition. This is achieved using a data structure that memorizes these intermediate formulae.

4. Evaluation 4.1. Some interesting examples

We present in detail an example of two formulae , such that our implementation can compute a quantifier-free interpolant for them (in 0.066 sec), whereas the other solvers cannot, since: (i) modulo the standard theory of arrays, they return that ∧ is sat; (ii) when enriched with the quantified axioms of ℛ( ), iZ3 cannot return an answer within 15 minutes, SMTInterpol correctly detects that ∧ is unsat but cannot return an interpolant and Mathsat returns error since it does not support quantifiers. The example is taken from [ 7 ] (Example 1 ): number of times the object has been selected. A selection over objects is fair if for all , in {1, . . . , } we have that | () − ()| ≤ 1. - B-part: 1 < 2 ∧ 2 < 3 ∧ (1, 2) ̸= (2, 2) Our implementation returns the following as an interpolant (and verifies its correctness): - Interpolant: let !1 = ¬((2, diff(2, 1)) = (1, diff(2, 1))) be in let !2 = ¬(diff(2, 1) ≤ 1) ∧ !1 ∧ ¬(3 ≤ diff(2, 1)) be in diff2(2, 1) ≤ 1 ∧ ¬!2 If the previous file is given the iZ3, SMTInterpol and Mathsat, then the SMT-solvers output sat: clearly, the axioms for maxdif are missing and ∧ is satisfiable in the standard theory of arrays. Only AXDInterpolator is able to compute a quantifier-free interpolant.

Consider this example from [ 5 ]: := = (, , ) and := (, ) ̸= (, ) ∧ (, ) ̸= (, ) ∧ ̸= . When adding the axioms for maxdif, Mathsat and iZ3 are not able to find an interpolant, whereas both SMTInterpol and AXDInterpolator get a quantifier-free interpolant: AXDInterpolator is a bit faster than SMTInterpol (0.062 vs 0.176 sec), and the output of the latter uses the plain dif with the undetermined semantics. The Strcpy function from [ 7 ] gives the same results, and AXDInterpolator is still a bit faster than SMTInterpol (0.057 vs 0.205 sec).

4.2. Benchmarks using SV-COMP and UAutomizer

We also tested our implementation on two benchmarks. We performed our experiments on a machine with Linux 5.11.16, Intel i7-9700 4.7 GHz with 32 GB of memory. We used the model checker UAutomizer [ 14 ] to extract their SMT Scripts which are SMTLIB2 files containing the queries produced in order to verify some particular property of C programs. In our case, we chose C-programs from the ReachSafety-Arrays and MemSafety-Arrays tracks of the SV-COMP [ 17 ]. We let the machine produce SMT Scripts for 15 minutes. We used these SMT Scripts files to compare the number of interpolants computed from unsatisfiable formulae. For the latter we assigned each process up to 360 seconds and 6 GB of memory 8. All the examples do not include the expressive features of ℛ( ), since they have been automatically produced by UAutomizer: this allows us to run these examples on the other solvers iZ3, Mathsat and SMTInterpol. We defined two outcomes: Success, and Timeout. Success is declared when an interpolant is computed, and Timeout when the process takes more time or space than the previously specified. No file failed (i.e., terminated without returning an interpolant). Our tool was configured to run with QF_LIA and 1000 allowed loop repetitions with the heuristic of only 0-instantiations. The solvers were configured to use the QF_AUFLIA theory. The results are shown in Tables 1,2,3 and 4. The implementation is available at the following link: https://github.com/typesAreSpaces/AXDInterpolator. The experimental results can be obtained following the instructions in the documentation.

8We were able to use 1008 unsatisfiable formulae from the ReachSafety-Arrays track; 110 queries contained array constants or function applications of select/store. From the MemSafety-Arrays track, we were able to obtain 758 unsatisfiable formulae; 750 contained array related elements. We noticed the input formulae do not contain applications of any dif operator as expected since the proposed extension of ℛ( ) is relatively new.

5. Related Work

In [ 11 ] McMillan introduced an interpolating proof calculus to extract interpolants using refutational proofs obtained from the Z3 SMT-solver. His approach benefits from the flexibility of Z3 to handle a variety of theories and their combination: it relies on a secondary interpolation solver in order to ‘fill the gaps’ of refutational proofs introduced by theory lemmas, which are formulae derived by the satellities theories encoded in Z3 without a detailed explanation in the proof. The secondary interpolation solver only requires an interpolation algorithm for QF_UFLIA. Since the theory of arrays is introduced using quantified formulae, this approach generates quantified formulae.

The authors in [18] employed the proof tree preserving interpolation scheme from [19] to compute interpolants using a resolution proof. This approach is capable of handling mixed literals but not mixed terms9, so the authors introduced the idea of weakly equivalences between arrays to handle those cases for their interpolation calculi. The authors include the diff operation between arrays in order to compute quantifier-free interpolants, but no additionally semantical properties are given to the diff operation.

The authors in [ 10 ] proposed a general framework for computing interpolants via reduction using finite instantiations. In particular, the paper discusses complete interpolation procedures for theories of arrays and linked lists, as well as other heap-allocated data structures.

6. Conclusions

In this paper we presented AXDInterpolator, the implementation of the interpolation algorithm from [ 7 ]. We showed the feasibility of AXDInterpolator by validating it on two benchmarks from the SV-COMP. We also compared our implementation with state-of-the-art solvers: apart from very few timeout outcomes, our tool handled all the examples the other solvers did. We also managed to handle interesting examples that are not handled by the other solvers, which makes the option of our tool appealing. Specifically, expressive examples that make use of features like | · | cannot in general be managed by any existing solver: this suggests that AXDInterpolator could be applied for solving more sophisticated examples coming from concrete C programs using arrays. This analysis is a significant point to investigate in future work.

We noticed that our implementation, in case of the SV-COMP benchmarks, is behind in time with respect to the state-of-the-art solvers. We think the main reasons might be the following: - The UAutomizer model checker does not include our extended signature in their SMT Scripts. We might take into consideration that many industrial problems will not explicitly use this language, however the overall structure of many array programs commonly use | · | for various algorithms. Thus, we need to integrate this structure in the input formula. - Our implementation timeouts slightly more often than other solvers. This point can be further improved by doing better engineering work at the implementation level. The DNF conversion (in worst case, exponential) appears to be the first target to rework. - The current design does not perform incremental satisfiability checks. As future work we plan to perform incremental checks since this can improve the performance of the used z3::solver data structure. Incremental checks can be implemented due to the incremental nature of the proposed interpolation algorithm by including a hash consed data structure on the terms/predicates produced in the main loop of the algorithm and because the data structure z3::solver can keep track of previously proven assertions. 27 – April 1, 2021, Proceedings, Part II 12652 (2021) 401—422. URL: https://europepmc.org/ articles/PMC7984550. [18] J. Hoenicke, T. Schindler, Eficient interpolation for the theory of arrays, in: Proc. of IJCAR, volume 10900 of LNCS (LNAI), Springer, 2018, pp. 549–565. doi:10.1007/ 978-3-319-94205-6\_36. [19] J. Christ, J. Hoenicke, A. Nutz, Proof tree preserving interpolation, in: N. Piterman, S. A. Smolka (Eds.), Tools and Algorithms for the Construction and Analysis of Systems, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 124–138.

[1] K. L. McMillan , Interpolation and SAT-based model checking , in: Proc. of CAV , volume 2725 of LNCS , Springer, 2003 , pp. 1 - 13 . doi: 10 .1007/978-3- 540 -45069-6\_ 1. 9Given an interpolation pair (A, B), a symbol is A-local if appears in the symbols of but not in the symbols of . A literal/term is mixed if it contains A-local and B-local symbols

[2] K. L. McMillan , Lazy abstraction with interpolants , in: Proc. of CAV , volume 4144 of LNCS , Springer, 2006 , pp. 123 - 136 . doi: 10 .1007/11817963\_ 14 .

[3]

Craig , Three uses of the Herbrand-Gentzen theorem in relating model theory and proof theory , J. Symbolic Logic 22 ( 1957 ) 269 - 285 .

[4]

Mccarthy , Towards a mathematical science of computation , in: In IFIP Congress, North-Holland, 1962 , pp. 21 - 28 .

[5]

Kapur ,

Majumdar ,

C. G.

Zarba , Interpolation for Data Structures , in: Proc. of SIGSOFT-FSE, ACM , 2006 , pp. 105 - 116 .

[6]

Bruttomesso ,

Ghilardi ,

Ranise , Quantifier-free interpolation of a theory of arrays , Log. Methods Comput. Sci. 8 ( 2012 ).

[7]

Ghilardi ,

Gianola ,

Kapur , Interpolation and amalgamation for arrays with maxdif , in: S. Kiefer, C. Tasson (Eds.), Foundations of Software Science and Computation Structures - 24th International Conference, FOSSACS 2021 , Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021 ,

Luxembourg

City , Luxembourg, March 27 - April 1, 2021 , Proceedings, volume 12650 of Lecture Notes in Computer Science, Springer, 2021 , pp. 268 - 288 .

[8]

Sofronie-Stokkermans , Interpolation in local theory extensions , Log. Methods Comput. Sci. 4 ( 2008 ).

[9]

Sofronie-Stokkermans , On interpolation and symbol elimination in theory extensions , Log. Methods Comput. Sci . 14 ( 2018 ).

[10]

Totla , T. Wies, Complete instantiation-based interpolation , J. Autom. Reasoning 57 ( 2016 ) 37 - 65 .

[11] K. L. McMillan , Interpolants from Z3 proofs , in: Proc. of FMCAD , 2011 , pp. 19 - 27 .

[12]

Bozzano ,

Bruttomesso ,

Cimatti ,

Junttila , P. van Rossum,

Schulz , R. Sebastiani, MathSAT: Tight integration of SAT and mathematical decision procedures , Journal of Automated Reasoning ( 2005 ).

[13]

Christ ,

Hoenicke , A . Nutz, SMTInterpol: An interpolating SMT solver , in: A.

Donaldson , D. Parker (Eds.), Model Checking Software - 19th International Workshop, SPIN 2012 , Oxford, UK, July 23-24 , 2012 . Proceedings, volume 7385 of Lecture Notes in Computer Science, Springer, 2012 , pp. 248 - 254 .

[14]

Heizmann ,

Christ ,

Dietsch ,

Ermis ,

Hoenicke ,

Lindenmann ,

Nutz ,

Schilling ,

Podelski , Ultimate Automizer with SMTInterpol , in: N. Piterman , S. A. Smolka (Eds.), Tools and Algorithms for the Construction and Analysis of Systems , Springer Berlin Heidelberg, Berlin, Heidelberg, 2013 , pp. 641 - 643 .

[15] L. De Moura , N. Bjørner, Z3: An Eficient SMT Solver , in: Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS'08/ETAPS'08 , Springer-Verlag, Berlin, Heidelberg, 2008 , pp. 337 - 340 .

[16]

A. R.

Bradley ,

Manna ,

H. B.

Sipma , What's decidable about arrays? , in: Proc. of VMCAI , volume 3855 of LNCS , Springer, 2006 , pp. 427 - 442 . doi: 10 .1007/11609773\_ 28 .

[17]

J. F.

Groote ,

K. G.

Larsen ,

Beyer , Software verification: 10th comparative evaluation (sv-comp 2021 ), Tools and Algorithms for the Construction and Analysis of Systems27th International Conference , TACAS 2021, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021 ,

Luxembourg

City , Luxembourg, March