<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AXDInterpolator: A Tool for Computing Interpolants for Arrays with MaxDif</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jose Abel Castellanos Joo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Silvio Ghilardi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Gianola</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Deepak Kapur</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of New Mexico</institution>
          ,
          <addr-line>Albuquerque</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Matematica, Università degli Studi di Milano</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Faculty of Computer Science, Free University of Bozen-Bolzano</institution>
          ,
          <addr-line>Bolzano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>40</fpage>
      <lpage>52</lpage>
      <abstract>
        <p>Several approaches toward quantifier-free interpolation algorithms of theories involving arrays have been proposed by extending the language using a binary function skolemizing the extensionality principle. In FoSSaCS 2021, the last three authors studied the enrichment of the McCarthy's theory of extensional arrays with a maxdif operation. This paper discusses the implementation of the interpolation algorithm proposed in FoSSaCS 2021 using the Z3 API. The implementation allows the user to choose iZ3, Mathsat, or SMTInterpol as interpolation engines. The tool returns a formula in SMTLIB2 format, which allows compatibility with model checkers and invariant generators using such a format. We compare our algorithm with state-of-the-art interpolation engines. Our experiments using unsatisfiable formulae extracted with the model checker UAutomizer show the feasibility of our tool. For that purpose, we used C programs from the ReachSafety-Arrays and MemSafety-Arrays tracks of SV-COMP.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Interpolation</kwd>
        <kwd>Arrays</kwd>
        <kwd>MaxDif</kwd>
        <kwd>SMT</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>comparing, searching, sorting functions; verification problems of this kind are collected, e.g., in
the SV-COMP benchmarks categories “ReachSafety-Arrays” and “MemSafety-Arrays”1, where
safety verification tasks involving arrays of finite but unknown length are considered.</p>
      <p>
        By itself, the theory of arrays with extensionality does not have quantifier free
interpolation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]; however, in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] it was shown that quantifier-free interpolation is restored if one
enriches the language with a binary function skolemizing the extensionality axiom. Such a
Skolem function, applied to two array variables , , returns an index diff(, ) where , 
difer (it returns an arbitrary value if  is equal to ). This semantics for the diff operation is
very undetermined and does not have a significant interpretation in concrete programs. That is
why in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] the diff operation is modified so as to obtain a defined and natural meaning: we
introduce the theory of arrays with maxdif ℛ( ), where diff(, ) returns the biggest
index where ,  difer (in case  =  we ask for diff(, ) to be the minimum index 0). Since
it is natural to view arrays as functions defined on initial intervals of the nonnegative integers,
this choice has a clear semantic motivation. The expressive power of the theory of arrays so
enriched becomes bigger: for instance, if we also add to the language a constant symbol  for
the undefined array constantly equal to some ‘undefined’ value ⊥ (where ⊥ is meant to be
diferent from the values [] actually in use), then we can define || as diff(,  ). In this way
we can model the fact that  is undefined outside the interval [ 0, || ] - this is useful to formalize
the above mentioned SV-COMP benchmarks. The efectiveness of quantifier-free
interpolation in the theory of arrays with maxdif is exemplified in the Strcpy function example of
Figure 1 in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]:2 we need to express | · | to formalize its initial states without quantifiers. In
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], ℛ( ) is shown to admit quantifier-free interpolation, and a suitable algorithm based
on instantiation à-la-Herbrand and on a hierarchical approach ([
        <xref ref-type="bibr" rid="ref10 ref8 ref9">8, 9, 10</xref>
        ]) is introduced: by
exploiting iterated applications of the maxdif operator and reducing the interpolation problem
to a reduced interpolating theory, we can compute quantifier-free interpolants out of the input
formulae.
      </p>
      <p>
        In this paper, we present (in Section 3) AXDInterpolator3, a tool that implements the
interpolation algorithm of [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Following an instantiation-based approach, the algorithm always (and
in particular for all the examples in this paper) returns a quantifier-free formula. A comparison
is done with the state-of-the-art interpolation engines that have known interpolant generation
algorithms for the theory of arrays–iZ3, MathSat, and SMTInterpol [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11, 12, 13</xref>
        ] support
the theory of arrays with maxdif. Indeed, on examples supporting expressive features like
diff or the | · | operator, no proper comparison is possible. In case unsafe traces for such
examples are considered, in contrast to AXDInterpolator, Mathsat and iZ3 cannot usually get a
quantifier-free interpolant: either the tools detect the satisfiability of the unsafe trace (since they
reason modulo a theory that is weaker than ℛ( )) so they cannot compute an interpolant,
1https://sv-comp.sosy-lab.org/2020/benchmarks.php
2The Strcpy function performs  sequential index-wise copies of values of two arrays ,  from  to  where
 is the length of both arrays: the loop invariant for this examples is  =  ∨ ( &gt; diff(, ) ∧ diff(, ) ≥ ),
 being the index variable in the loop. This formula is a loop invariant because if the two arrays ,  are diferent
while  &lt;  then the conjunction  &gt; diff(, ) ∧ diff(, ) ≥  asserts that the possibly entry where , 
difer has not been processed yet.
      </p>
      <p>
        3We use the prerelease version 0.9.0 of our tool: https://github.com/typesAreSpaces/AXDInterpolator/releases/
tag/0.9.0
or, in case the unsafe trace is correctly detected as unsat (equipping the solver with axioms for
maxdif), they fail returning errors or quantified interpolant or timeout; for Example 1 from [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
SMTInterpol fails. Mathsat and iZ3 fail also in the case of the example from [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which is
well-known not to admit a quantifier-free interpolant unless the extensionality axiom is
skolemized with a generic dif operator: in this specific case, SMTInterpol, instead, can successfully
compute a quantifier-free interpolant using the plain dif operator, but AXDInterpolator can
compute an interpolant with maxdif a bit faster. Running experiments over two tracks of the
SV-COMP (ReachSafety-Arrays and MemSafety-Arrays), we use unsatisfiable formulae extracted
with the model checker UAutomizer [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]: the latter automatically transforms C programs taken
from those benchmarks into suitable SMTLIB2 files that allow to verify particular properties
of these programs. Since UAutomizer does not support the extended language of arrays with
maxdif, the comparison among the diferent solvers becomes acceptable. The results show that
AXDInterpolator, iZ3, Mathsat and SMTInterpol have an overall identical behavior over
these benchmarks: interpolants are computed for almost every example, apart from a small
portion of timeout outcomes reached by AXDInterpolator as its implementation is evolving.
On more expressive examples, such as Example 1 from [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], other solvers including the most
expressive SMTInterpol fail, whereas AXDinterpolator produces useful interpolants.
AXDInterpolator uses the Z3 API [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and allows the user to choose among diferent solvers in order
to compute the interpolants in the reduced theory invoked by the hierarchical method: iZ3,
Mathsat [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and SMTInterpol [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. It generates a formula in SMTLIB2 format, which
guarantees compatibility with model checkers and invariant generators adopting the same
format.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Arrays with MaxDif</title>
      <p>
        The McCarthy theory of arrays [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] with extensionality has three sorts ARRAY, ELEM, INDEX (called
“array”, “element”, and “index” sort, respectively) and two function symbols  (“read”) and
 (“write”) of appropriate arities. Besides the axioms for defining  and , it contains the
‘extensionality’ axiom: ∀, .  ̸=  → (∃. (, ) ̸= (, )). In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the Skolem function
diff for this axiom has been given a proper semantics, i.e., diff returns the biggest index
where two diferent arrays difer. For introducing our theory of arrays with maxdif, we first
need to consider an index theory  (see [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] for the general definition). The signature of an
index theory  contains at least the binary relation symbol ≤ and the constant 0. Examples of
index theories are the theory of total orders  , integer diference logic ℐℒ, integer linear
arithmetic ℒℐ, and real linear arithmetics ℒℛ. For most applications, ℐℒ (namely the
theory of integer numbers with 0, ordering, successor and predecessor) sufices as in this theory
one can model counters for scanning arrays.
      </p>
      <p>Given an index theory  , we now introduce our array theory with maxdif ℛ( )
(parameterized by  ) as follows. We still have the same sorts; the language includes the symbols
of  , the operations , , a binary function diff of type ARRAY × ARRAY → INDEX, as well
as constants  and ⊥ of sorts ARRAY and ELEM, resp. The constant ⊥ models an undetermined
(e.g. undefined, not-in-use, not coming from appropriate initialization, etc.) value and  models
the totally undefined array; the term diff(, ) returns the maximum index where  and 
difer and returns 0 if  and  are equal.4 Formally, the axioms of ℛ( ) include, besides
the axioms of  , the following ones:</p>
      <p>
        As an efect of the above axioms, we have that an array  is undefined outside the interval
[0, ||], where || is defined as || := diff(, ). Typically, this interval is finite, and it can be
shown that any satisfiable constraint is satisfiable in a model where all such intervals (involving
variables appearing in the constraint) are finite [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        For the interpolation algorithm from [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], we need to introduce iterated diff operations,
similarly to [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. As we know, diff(, ) returns the biggest index where  and  difer (it
returns 0 if  = ). Now we want an operator that returns the last-but-one index where , 
difer (0 if ,  difer in at most one index), etc. Our language is already enough expressive
for that, so we can introduce such operators explicitly as follows. Given array variables , ,
we define by mutual recursion the sequence of array terms 1, 2, . . . and of index terms
diff1(, ), diff2(, ), . . . :
1 := ;
diff1(, ) := diff(, 1);
+1 := (, diff(, ), (, diff(, )));
diff+1(, ) := diff(, +1)
Lemma 2.1. A formula like
diff1(, ) = 1 ∧ · · · · · · ∧
diff(, ) = 
is equivalent modulo ℛ to the conjunction of the following five formulae:
1 ≥ 2 ∧ · · · ∧
      </p>
      <p>− 1 ≥  ∧  ≥ 0
⋀︀&lt;( &gt; +1 → (,  ) ̸= (,  ))</p>
      <p>
        ⋀︀&lt;( = +1 →  = 0)
⋀︀≤ ((,  ) = (,  ) →  = 0)
∀ℎ (ℎ &gt;  → (, ℎ) = (, ℎ) ∨ ℎ = 1 ∨ · · · ∨
ℎ = − 1)
(8)
Separated Pairs,  -Instantiation and Satisfiability. The key step of the interpolation
algorithm from [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] depends upon the problem of checking satisfiability (modulo ℛ( )) of
quantifier-free formulae; this is solved by adapting instantiation techniques, like those from [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>4Notice that it might well be the case that diff(, ) = 0 for diferent , , but in that case 0 is the only index
where ,  difer.</p>
      <p>The complexity () of a term  is defined as the number of function symbols occurring in .
If ℐ is a set of  -terms, an ℐ-instance of a universal formula of the kind ∀  is a formula of the
kind (/) for some  ∈ ℐ.</p>
      <p>A pair of sets of quantifier-free formulae Φ = (Φ 1, Φ 2) is a separated pair if
(1) Φ 1 contains equalities of the form diff(, ) =  and  = (, , ); moreover if
it contains the equality diff(, ) = , it must also contain an equality of the form
diff(, ) =  for every  &lt; ;
(2) Φ 2 contains Boolean combinations of  -atoms and of atoms of the forms: {(, ) =
(, ), (, ) = , 1 = 2}, where , , , , , 1, 2 are variables or constants of
the appropriate sorts.</p>
      <p>
        The separated pair is said to be finite if Φ 1 and Φ 2 are both finite. In practice, in a separated
pair Φ = (Φ 1, Φ 2), reading (, ) as a functional application, it turns out that the formulae
from Φ 2 can be translated into quantifier-free formulae of the combined theory  ∪ ℰ  ℱ (the
array variables occurring in Φ 2 are converted into free unary function symbols).  ∪ ℰ  ℱ
enjoys the decidability of the quantifier-free fragment and has quantifier-free interpolation
because  is an index theory [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]: we adopt a hierarchical approach (similar to [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]) and we
rely on satisfiability and interpolation algorithms for such a theory as black boxes .
      </p>
      <p>Let ℐ be a set of  -terms and let Φ = (Φ 1, Φ 2) be a separated pair; we let Φ( ℐ) =
(Φ 1(ℐ), Φ 2(ℐ)) be the smallest separated pair satisfying the following conditions:
- Φ 1(ℐ) is equal to Φ 1 and Φ 2(ℐ) contains Φ 2;
- Φ 2(ℐ) contains all ℐ-instances of the two formulae ∀ (, ) = ⊥, ∀ ( &lt; 0 →
(, ) = ⊥), where  is any array variable occurring in Φ 1 or Φ 2;
- if Φ 1 contains  = (, , ), then Φ 2(ℐ) contains all the ℐ-instances of the equivalent
formula ( ≥ 0 → (, ) = ) ∧ ∀ℎ (ℎ ̸=  → (, ℎ) = (, ℎ));
- if Φ 1 contains the conjunction ⋀︀</p>
      <p>=1 diff(, ) = , then Φ 2(ℐ) contains the
formulae (9), (10), (11), (12) as well as all ℐ-instances of the formula (13).</p>
      <p>For  ∈ N ∪ {∞}, the  -instantiation of Φ = (Φ 1, Φ 2) is the separated pair Φ( ℐΦ ) =
(Φ 1(ℐΦ ), Φ 2(ℐΦ )), where ℐΦ is the set of  -terms of complexity at most  built up from
the index variables occurring in Φ 1, Φ 2. The full instantiation of Φ = (Φ 1, Φ 2) is the separated
pair Φ( ℐΦ∞) = (Φ 1(ℐΦ∞), Φ 2(ℐΦ∞)) (which is usually not finite). A separated pair Φ = (Φ 1, Φ 2)
is  -instantiated if Φ = Φ( ℐΦ ); it is ℛ( )-satisfiable if so it is the formula ⋀︀ Φ 1 ∧ ⋀︀ Φ 2.
The   (ℛ( )) problem is decidable for every index theory  (and can be reduced to
 ∪ ℰ  ℱ -satisfiability) thanks to the following results:
Lemma 2.2. Given a quantifier-free formula , we can compute finitely many finite separation
pairs Φ 1 = (Φ 11, Φ 12), . . . , Φ  = (Φ 1, Φ 2) such that  is ℛ( )-satisfiable if so is one of the
Φ .</p>
      <p>Lemma 2.3. The following conditions are equivalent for a finite separation pair Φ = (Φ
(i) Φ is ℛ( )-satisfiable; (ii) ⋀︀ Φ 2(ℐΦ0) is  ∪ ℰ  ℱ -satisfiable.
1, Φ 2):</p>
      <sec id="sec-2-1">
        <title>2.1. The interpolation algorithm</title>
        <p>
          The interpolation algorithm from [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] is based on instantiation à-la-Herbrand. Our problem is to
compute a quantifier-free interpolant in ℛ( ) of two quantifier-free formulae  and .
        </p>
        <p>
          We call the variables occurring in both  and  common variables, whereas the variables
occurring in  (resp. in ) are called -variables (resp. -variables). The same terminology
applies to terms, atoms and formulae. As shown in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], both  and  can be given in the form of
ifnite separated pairs. Thus,  is of the form ⋀︀ 1 ∧ ⋀︀ 2,  is of the form ⋀︀ 1 ∧ ⋀︀ 2, for
separated pairs (1, 2) and (1, 2). Also,  and  can be assumed to be both 0-instantiated.
        </p>
        <p>The formulae from 2 and 2 are formulae from the signature of  ∪ ℰ  ℱ (after rewriting
terms of the kind (, ) to (), where the  are free function symbols). Of course, if
2 ∧ 2 is  ∪ ℰ  ℱ -inconsistent, we can get our quantifier-free interpolant by using our black
box algorithm for interpolation in the weaker theory  ∪ ℰ  ℱ .</p>
        <p>
          Formally, we apply the Loop below until 2 ∧ 2 becomes  ∪ ℰ  ℱ -inconsistent: Theorem 4
in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] guarantees that, if  ∧  is ℛ( )-inconsistent, then the Loop always terminates.
When 2 ∧ 2 becomes inconsistent, we can get our interpolant via the interpolation algorithm
for  ∪ ℰ  ℱ . We need a counter  recording how many times the Loop has been executed.
Loop (to be repeated until 2 ∧ 2 becomes inconsistent modulo  ∪ ℰ  ℱ , initially  = 0).
Pick two distinct common ARRAY-variables 1, 2 and  ≥ 1 and such that no conjunct of the kind
diff(1, 2) =  occurs in both 1 and 1 for some  ≥ 1 (but such that for every  &lt;  there
is a conjunct of the form diff(, ) =  occurring in both 1 and 1). Pick also a fresh INDEX
constant ; conjoin diff(1, 2) =  to both 1 and 1; then  -instantiate both  and .
Increase  to  + 1.
        </p>
        <p>Notice that the fresh index constants  introduced during the Loop are considered common
constants and so they are considered in the  -instantiation of both  and . In the output,
these constants  need to be eliminated.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Implementation</title>
      <p>
        Our implementation is written in C++ using the Z3 C++ API [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. It takes as input:
- A SMTLIB2 file using standard elements of the language.
- An index theory  [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Currently, we support quantifier free fragments of total order
theory (QF_TO), integer diference ( QF_IDL), and linear integer arithmetic (QF_LIA).
- A positive integer indicating a maximum number of allowed iterations of the main loop. 5
- An integer number denoting the SMT solver to use to compute the interpolant in the
case the input is unsatisfiable in the reduced theory  ∪ ℰ  ℱ . Currently, we support
the SMT solvers Z3, Mathsat, and SMTInterpol.
      </p>
      <p>Our extended language is parameterized by the array sorts in the input formula as well as by an
index theory. The domain sort of every array is currently implemented using the Int sort 6.
(declare-sort A)
(declare-fun diff’A’ ((Array Int A) (Array Int A)) Int)
(declare-fun length’A’ ((Array Int A)) Int)
(declare-fun empty_array’A’ () (Array Int A)
(declare-fun undefined’A’ () A)
5Index theories stronger than QF_TO require such upper bound for termination purposes in reasonable time.
6The requirement of including additional information on the name of sorted-functions in our extended language
is necessary since Mathsat does not implement ad hoc polymorphism.</p>
      <p>
        The rest of the SMTLIB2 file should specify two assertions. The first one encodes the A-part
and the second one the B-part of the interpolation problem. In the following we describe the
algorithm in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] by phases as it was implemented in our tool.
      </p>
      <p>Preprocessing. This step takes two main parts. First, our tool scans the declaration section in
the input file to internalize the signature involving the sorts of the array elements present in the
input file. Additionally, we instantiate new declarations of uninterpreted sorts ArraySortA
which abstract the array sorts present in the input file, as well as two new functions:
(declare-fun wr ((ArraySortA Int A)) ArraySortA)
(declare-fun rd (ArraySortA Int) A)</p>
      <p>with a proper uninterpreted sort. The reason for this is to internalize an uninterpreted
language in the algorithm in order to perform theory reduction from ℛ( ) to  ∪ ℰ  ℱ .</p>
      <p>
        The second part involves processing the input formula. The algorithm in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] handles
a conjunctive fragment situation. Thus, the algorithm uses Z3 tactics to perform a DNF
transformation and produce disjuncts of conjunctions. Every disjunct is processed by the
AXDInterpolator C++ class in our implementation which can handle conjunctive formulae.
Each conjunctive formula aforementioned is normalized (we replace length’A’(x)
subformulae for diff’A’(x, empty-array’A’)) and flattened by introducing constants of the
right sort. During the execution of the aforementioned methods, we keep track of the 
constants as well as array constants appearing in the input formula. Index constants are used to
instantiate quantified formulae of the respective ℛ( ) theory in use while array constants
are necessary to compute common pairs of arrays between the A-part and the B-part of
the interpolation problem in the main loop of the algorithm.
      </p>
      <p>Standard Pairs. Separated pairs from Section 2 are called in our implementation Standard Pairs,
and their requirements are encoded as a C++-class called StandardInput. There, atoms
{(, ) = (, ), (, ) = , 1 = 2} are such that ,  are of sort ArraySort’A’, , 
are of sort Int,  is of uninterpreted sort A, and 1, 2 are of appropriate sorts.</p>
      <p>Given a formula, this class keeps two separate z3::expr_vector denoting the
conjunction of formulae on the first and second component of the standard pair. We use
one instantiation of the StandardInput class for both formulae A-part and B-part.
Additionally, this class implements methods handling the instantiations of formulae.
Instantiation is achieved by including an undefined z3::expr which is replaced using the
z3.substitute(z3::expr_vector const &amp;, z3::expr_vector const&amp;)
method.</p>
      <p>Main Loop of the Algorithm and Interpolation Generation. First, the algorithm identifies
the set of common array constants between A-part and the B-part. If this set is empty or
contains a single element, the algorithm proceeds to instantiate an z3::solver and check
the satisfiability of the second component of the Standard Pairs of the A-part and B-part.
Otherwise, it uses an iterator to select common pair of array constants which selects elements in
a fair way 7. Similarly to the previous case, it checks the satisfiability of the second component
7A selection over  objects can be seen as a function  : {1, . . . , } → N. If  ∈ {1, . . . , },  () denotes the
of the Standard Pairs. If the state of the solver is unsatisfiable the loop terminates saving
the formulae of the Standard Pairs of the A-part and B-part. Otherwise, it selects the
common pair (1, 2) of array constants and ‘propagates’ this pair to each of the Standard Pairs
of the A-part and B-part. The propagation consists of adding the equation of the form
diff(1, 2) =  to the first component of the Standard Pairs where  is the smallest integer
number such that diff(1, 2) does not occur in the Stand Pair of both A-part and B-part
and  is a fresh Int constant. Subsequently, both Standard Pairs are  − .</p>
      <p>If the index theory in use extends QF_TO, then the implementation uses heuristics to increase
the value of an internal number keeping track of the maximum allowed complexity on the index
terms. We noticed the implementation performs well if only 0 −  are executed.
Better heuristics are under investigation.</p>
      <p>If in the main loop the solver eventually detects that the second component of the Standard
Pairs is unsatisfiable, then the implementation saves the formulae of the Standard Pairs of the
A-part and B-part respectively. Depending on the choice of the user, the implementation
executes a command on an external shell calling the SMT-solver to compute the interpolant
using the pair of unsatisfiable formulae mentioned before. For the latter, the implementation
‘compiles’ a temporary SMTLIB2 file for the respective SMT-solver chosen.</p>
      <p>Reading the interpolant result from the previous process the algorithm removes added
constants in order to provide a solution using elements from the input formula. Since no
SMT-solver supports interpolation algorithms for the QF_TO we select the QF_IDL as the
theory for the interpolation engine and use a custom rewriter implemented on Z3 in order
to relax a QF_IDL formula as a QF_TO statement. The latter is important for us in order to
remain closed in the respective language.</p>
      <p>Correctness of the Output. If the interpolant is successfully computed, our implementation
allows the user to automatically verify the correctness of the output, that is the output is a real
interpolant in ℛ( ): this is done by exploiting the first-order axioms of ℛ( ) and by
calling a z3::solver instance using them. For the latter the implementation ‘lifts’ the result
by replacing applications of diff to their recursive definition. This is achieved using a data
structure that memorizes these intermediate formulae.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Evaluation</title>
      <sec id="sec-4-1">
        <title>4.1. Some interesting examples</title>
        <p>
          We present in detail an example of two formulae ,  such that our implementation can compute
a quantifier-free interpolant for them (in 0.066 sec), whereas the other solvers cannot, since:
(i) modulo the standard theory of arrays, they return that  ∧  is sat; (ii) when enriched
with the quantified axioms of ℛ( ), iZ3 cannot return an answer within 15 minutes,
SMTInterpol correctly detects that  ∧  is unsat but cannot return an interpolant and
Mathsat returns error since it does not support quantifiers. The example is taken from [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]
(Example 1 ):
number of times the object  has been selected. A selection over  objects is fair if for all ,  in {1, . . . , } we have
that | () −  ()| ≤ 1.
- B-part: 1 &lt; 2 ∧ 2 &lt; 3 ∧ (1, 2) ̸= (2, 2)
Our implementation returns the following as an interpolant (and verifies its correctness):
- Interpolant: let !1 = ¬((2, diff(2, 1)) = (1, diff(2, 1))) be in let !2 =
¬(diff(2, 1) ≤ 1) ∧ !1 ∧ ¬(3 ≤ diff(2, 1)) be in diff2(2, 1) ≤ 1 ∧ ¬!2
If the previous file is given the iZ3, SMTInterpol and Mathsat, then the SMT-solvers
output sat: clearly, the axioms for maxdif are missing and  ∧  is satisfiable in the standard
theory of arrays. Only AXDInterpolator is able to compute a quantifier-free interpolant.
        </p>
        <p>
          Consider this example from [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]:  :=  = (, , ) and  := (, ) ̸= (, ) ∧
(, ) ̸= (, ) ∧  ̸= . When adding the axioms for maxdif, Mathsat and iZ3
are not able to find an interpolant, whereas both SMTInterpol and AXDInterpolator get
a quantifier-free interpolant: AXDInterpolator is a bit faster than SMTInterpol (0.062 vs
0.176 sec), and the output of the latter uses the plain dif with the undetermined semantics. The
Strcpy function from [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] gives the same results, and AXDInterpolator is still a bit faster than
SMTInterpol (0.057 vs 0.205 sec).
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Benchmarks using SV-COMP and UAutomizer</title>
        <p>
          We also tested our implementation on two benchmarks. We performed our experiments on a
machine with Linux 5.11.16, Intel i7-9700 4.7 GHz with 32 GB of memory. We used the model
checker UAutomizer [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] to extract their SMT Scripts which are SMTLIB2 files containing the
queries produced in order to verify some particular property of C programs. In our case, we
chose C-programs from the ReachSafety-Arrays and MemSafety-Arrays tracks of the SV-COMP
[
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. We let the machine produce SMT Scripts for 15 minutes. We used these SMT Scripts files
to compare the number of interpolants computed from unsatisfiable formulae. For the latter
we assigned each process up to 360 seconds and 6 GB of memory 8. All the examples do not
include the expressive features of ℛ( ), since they have been automatically produced by
UAutomizer: this allows us to run these examples on the other solvers iZ3, Mathsat and
SMTInterpol. We defined two outcomes: Success, and Timeout. Success is declared when
an interpolant is computed, and Timeout when the process takes more time or space than the
previously specified. No file failed (i.e., terminated without returning an interpolant). Our tool
was configured to run with QF_LIA and 1000 allowed loop repetitions with the heuristic of
only 0-instantiations. The solvers were configured to use the QF_AUFLIA theory. The
results are shown in Tables 1,2,3 and 4. The implementation is available at the following link:
https://github.com/typesAreSpaces/AXDInterpolator. The experimental results can be obtained
following the instructions in the documentation.
        </p>
        <p>8We were able to use 1008 unsatisfiable formulae from the ReachSafety-Arrays track; 110 queries contained
array constants or function applications of select/store. From the MemSafety-Arrays track, we were able to obtain
758 unsatisfiable formulae; 750 contained array related elements. We noticed the input formulae do not contain
applications of any dif operator as expected since the proposed extension of ℛ( ) is relatively new.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Related Work</title>
      <p>
        In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] McMillan introduced an interpolating proof calculus to extract interpolants using
refutational proofs obtained from the Z3 SMT-solver. His approach benefits from the flexibility
of Z3 to handle a variety of theories and their combination: it relies on a secondary interpolation
solver in order to ‘fill the gaps’ of refutational proofs introduced by
theory lemmas, which are
formulae derived by the satellities theories encoded in Z3 without a detailed explanation in
the proof. The secondary interpolation solver only requires an interpolation algorithm for
QF_UFLIA. Since the theory of arrays is introduced using quantified formulae, this approach
generates quantified formulae.
      </p>
      <p>The authors in [18] employed the proof tree preserving interpolation scheme from [19] to
compute interpolants using a resolution proof. This approach is capable of handling mixed
literals but not mixed terms9, so the authors introduced the idea of weakly equivalences between
arrays to handle those cases for their interpolation calculi. The authors include the diff
operation between arrays in order to compute quantifier-free interpolants, but no additionally
semantical properties are given to the diff operation.</p>
      <p>
        The authors in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] proposed a general framework for computing interpolants via reduction
using finite instantiations. In particular, the paper discusses complete interpolation procedures
for theories of arrays and linked lists, as well as other heap-allocated data structures.
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>
        In this paper we presented AXDInterpolator, the implementation of the interpolation algorithm
from [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. We showed the feasibility of AXDInterpolator by validating it on two benchmarks
from the SV-COMP. We also compared our implementation with state-of-the-art solvers: apart
from very few timeout outcomes, our tool handled all the examples the other solvers did. We also
managed to handle interesting examples that are not handled by the other solvers, which makes
the option of our tool appealing. Specifically, expressive examples that make use of features like
| · | cannot in general be managed by any existing solver: this suggests that AXDInterpolator
could be applied for solving more sophisticated examples coming from concrete C programs
using arrays. This analysis is a significant point to investigate in future work.
      </p>
      <p>We noticed that our implementation, in case of the SV-COMP benchmarks, is behind in time
with respect to the state-of-the-art solvers. We think the main reasons might be the following:
- The UAutomizer model checker does not include our extended signature in their SMT
Scripts. We might take into consideration that many industrial problems will not explicitly
use this language, however the overall structure of many array programs commonly use
| · | for various algorithms. Thus, we need to integrate this structure in the input formula.
- Our implementation timeouts slightly more often than other solvers. This point can be
further improved by doing better engineering work at the implementation level. The DNF
conversion (in worst case, exponential) appears to be the first target to rework.
- The current design does not perform incremental satisfiability checks. As future work
we plan to perform incremental checks since this can improve the performance of the
used z3::solver data structure. Incremental checks can be implemented due to the
incremental nature of the proposed interpolation algorithm by including a hash consed
data structure on the terms/predicates produced in the main loop of the algorithm and
because the data structure z3::solver can keep track of previously proven assertions.
27 – April 1, 2021, Proceedings, Part II 12652 (2021) 401—422. URL: https://europepmc.org/
articles/PMC7984550.
[18] J. Hoenicke, T. Schindler, Eficient interpolation for the theory of arrays, in: Proc.
of IJCAR, volume 10900 of LNCS (LNAI), Springer, 2018, pp. 549–565. doi:10.1007/
978-3-319-94205-6\_36.
[19] J. Christ, J. Hoenicke, A. Nutz, Proof tree preserving interpolation, in: N. Piterman,
S. A. Smolka (Eds.), Tools and Algorithms for the Construction and Analysis of Systems,
Springer Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 124–138.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>K. L. McMillan</surname>
          </string-name>
          ,
          <article-title>Interpolation and SAT-based model checking</article-title>
          ,
          <source>in: Proc. of CAV</source>
          , volume
          <volume>2725</volume>
          <source>of LNCS</source>
          , Springer,
          <year>2003</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>540</fpage>
          -45069-6\_
          <article-title>1. 9Given an interpolation pair (A, B), a symbol  is A-local if  appears in the symbols of  but not in the symbols of . A literal/term is mixed if it contains A-local and B-local symbols</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>K. L. McMillan</surname>
          </string-name>
          ,
          <article-title>Lazy abstraction with interpolants</article-title>
          ,
          <source>in: Proc. of CAV</source>
          , volume
          <volume>4144</volume>
          <source>of LNCS</source>
          , Springer,
          <year>2006</year>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>136</lpage>
          . doi:
          <volume>10</volume>
          .1007/11817963\_
          <fpage>14</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>W.</given-names>
            <surname>Craig</surname>
          </string-name>
          ,
          <article-title>Three uses of the Herbrand-Gentzen theorem in relating model theory and proof theory</article-title>
          ,
          <source>J. Symbolic Logic</source>
          <volume>22</volume>
          (
          <year>1957</year>
          )
          <fpage>269</fpage>
          -
          <lpage>285</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Mccarthy</surname>
          </string-name>
          ,
          <article-title>Towards a mathematical science of computation</article-title>
          , in: In IFIP Congress, North-Holland,
          <year>1962</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kapur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Zarba</surname>
          </string-name>
          ,
          <article-title>Interpolation for Data Structures</article-title>
          ,
          <source>in: Proc. of SIGSOFT-FSE, ACM</source>
          ,
          <year>2006</year>
          , pp.
          <fpage>105</fpage>
          -
          <lpage>116</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bruttomesso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghilardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ranise</surname>
          </string-name>
          ,
          <article-title>Quantifier-free interpolation of a theory of arrays</article-title>
          ,
          <source>Log. Methods Comput. Sci. 8</source>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghilardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gianola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kapur</surname>
          </string-name>
          ,
          <article-title>Interpolation and amalgamation for arrays with maxdif</article-title>
          , in: S. Kiefer,
          <string-name>
            <surname>C.</surname>
          </string-name>
          Tasson (Eds.),
          <source>Foundations of Software Science and Computation Structures - 24th International Conference, FOSSACS</source>
          <year>2021</year>
          ,
          <article-title>Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021</article-title>
          ,
          <string-name>
            <given-names>Luxembourg</given-names>
            <surname>City</surname>
          </string-name>
          , Luxembourg, March 27 - April 1,
          <year>2021</year>
          , Proceedings, volume
          <volume>12650</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2021</year>
          , pp.
          <fpage>268</fpage>
          -
          <lpage>288</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>V.</given-names>
            <surname>Sofronie-Stokkermans</surname>
          </string-name>
          ,
          <article-title>Interpolation in local theory extensions</article-title>
          ,
          <source>Log. Methods Comput. Sci. 4</source>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>V.</given-names>
            <surname>Sofronie-Stokkermans</surname>
          </string-name>
          ,
          <article-title>On interpolation and symbol elimination in theory extensions</article-title>
          ,
          <source>Log. Methods Comput. Sci</source>
          .
          <volume>14</volume>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Totla</surname>
          </string-name>
          , T. Wies,
          <article-title>Complete instantiation-based interpolation</article-title>
          ,
          <source>J. Autom. Reasoning</source>
          <volume>57</volume>
          (
          <year>2016</year>
          )
          <fpage>37</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>K. L. McMillan</surname>
          </string-name>
          ,
          <article-title>Interpolants from Z3 proofs</article-title>
          ,
          <source>in: Proc. of FMCAD</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bozzano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bruttomesso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cimatti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Junttila</surname>
          </string-name>
          , P. van Rossum,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulz</surname>
          </string-name>
          , R. Sebastiani,
          <article-title>MathSAT: Tight integration of SAT and mathematical decision procedures</article-title>
          ,
          <source>Journal of Automated Reasoning</source>
          (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Christ</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoenicke</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Nutz,
          <string-name>
            <surname>SMTInterpol:</surname>
          </string-name>
          <article-title>An interpolating SMT solver</article-title>
          , in: A.
          <string-name>
            <given-names>F.</given-names>
            <surname>Donaldson</surname>
          </string-name>
          , D. Parker (Eds.),
          <source>Model Checking Software - 19th International Workshop, SPIN 2012</source>
          , Oxford, UK,
          <source>July 23-24</source>
          ,
          <year>2012</year>
          . Proceedings, volume
          <volume>7385</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2012</year>
          , pp.
          <fpage>248</fpage>
          -
          <lpage>254</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Heizmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Christ</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dietsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ermis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoenicke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lindenmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Schilling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Podelski</surname>
          </string-name>
          ,
          <article-title>Ultimate Automizer with SMTInterpol</article-title>
          , in: N.
          <string-name>
            <surname>Piterman</surname>
            ,
            <given-names>S. A.</given-names>
          </string-name>
          <string-name>
            <surname>Smolka</surname>
          </string-name>
          (Eds.),
          <source>Tools and Algorithms for the Construction and Analysis of Systems</source>
          , Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2013</year>
          , pp.
          <fpage>641</fpage>
          -
          <lpage>643</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>L. De Moura</surname>
            ,
            <given-names>N. Bjørner,</given-names>
          </string-name>
          <article-title>Z3: An Eficient SMT Solver</article-title>
          ,
          <source>in: Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS'08/ETAPS'08</source>
          , Springer-Verlag, Berlin, Heidelberg,
          <year>2008</year>
          , pp.
          <fpage>337</fpage>
          -
          <lpage>340</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Bradley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Manna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. B.</given-names>
            <surname>Sipma</surname>
          </string-name>
          ,
          <article-title>What's decidable about arrays?</article-title>
          ,
          <source>in: Proc. of VMCAI</source>
          , volume
          <volume>3855</volume>
          <source>of LNCS</source>
          , Springer,
          <year>2006</year>
          , pp.
          <fpage>427</fpage>
          -
          <lpage>442</lpage>
          . doi:
          <volume>10</volume>
          .1007/11609773\_
          <fpage>28</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Groote</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. G.</given-names>
            <surname>Larsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Beyer</surname>
          </string-name>
          ,
          <article-title>Software verification: 10th comparative evaluation (sv-comp</article-title>
          <year>2021</year>
          ),
          <article-title>Tools and Algorithms for the Construction and Analysis of Systems27th International Conference</article-title>
          , TACAS 2021,
          <article-title>Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021</article-title>
          ,
          <string-name>
            <given-names>Luxembourg</given-names>
            <surname>City</surname>
          </string-name>
          , Luxembourg, March
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>