Verification of Imperative Programs by
        Transforming Constraint Logic Programs?

                       Emanuele De Angelis1 , Fabio Fioravanti1 ,
                       Alberto Pettorossi2 , and Maurizio Proietti3
                   1
                    DEC, University ‘G. D’Annunzio’, Pescara, Italy
                    {emanuele.deangelis,fioravanti}@unich.it
                 2
                   DICII, University of Rome Tor Vergata, Rome, Italy
                           pettorossi@disp.uniroma2.it
              3
                IASI-CNR, Rome, Italy maurizio.proietti@iasi.cnr.it


        Abstract. We present a method for verifying partial correctness prop-
        erties of imperative programs that manipulate integers and arrays by
        using techniques based on the transformation of constraint logic pro-
        grams (CLP). We use CLP as a metalanguage for representing imper-
        ative programs, their executions, and their properties. First, we encode
        the correctness of an imperative program, say prog, as the negation of
        a predicate incorrect defined by a CLP program T . By construction,
        incorrect holds in the least model of T if and only if the execution of
        prog from an initial configuration eventually halts in an error configura-
        tion. Then, we apply to program T a sequence of transformations that
        preserve its least model semantics. These transformations are based on
        well-known transformation rules, such as unfolding and folding, guided
        by suitable transformation strategies, such as specialization and gener-
        alization. The objective of the transformations is to derive a new CLP
        program TransfT where the predicate incorrect is defined either by (i)
        the fact ‘incorrect.’ (and in this case prog is not correct), or by (ii)
        the empty set of clauses (and in this case prog is correct). In the case
        where we derive a CLP program such that neither (i) nor (ii) holds, we
        iterate the transformation. Since the problem is undecidable, this pro-
        cess may not terminate. We show through examples that our method
        can be applied in a rather systematic way, and is amenable to automa-
        tion by transferring to the field of program verification many techniques
        developed in the field of program transformation.


1     Introduction

In the last decade formal techniques have received a renewed attention as the
basis of a methodology for increasing the reliability of software artifacts and
reducing the cost of software production. In particular, great efforts have been
made to devise automatic techniques such as software model checking [23], for
verifying the correctness of programs with respect to their specifications.
?
    A preliminary version of this paper appears in [10].
84     Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi and Maurizio Proietti


     In many software model checking techniques, the use of constraints has been
 very effective both for constructing models of programs and for reasoning about
 them [2, 8, 9, 12, 18, 20, 22, 33, 34]. Several kinds of constraints have been consid-
 ered, such as equalities and inequalities over booleans, integers, reals, and finite
 or infinite trees. By using constraints we can represent in a symbolic, compact
 way the (possibly infinite) sets of values computed by programs and, in general,
 the sets of states which are reached during program executions. Then, by us-
 ing powerful solvers specifically designed for the classes of constraints we have
 mentioned above, we can reason about program properties in an efficient way.
     In this paper we consider a simple imperative programming language with
 integer and array variables and we use Constraint Logic Programming (CLP) [21]
 as a metalanguage for representing imperative programs, their executions, and
 the properties to be verified. We use constraints consisting of linear equalities
 and inequalities over integers. Note, however, that the method presented here is
 parametric with respect to the constraint domain which is used. By following an
 approach originally presented in [33], a given imperative program prog and its
 interpreter are first encoded as a CLP program. Then, the proofs of the properties
 of interest about the program prog are sought by analyzing that derived CLP
 program. In order to improve the efficiency of that analysis, it is advisable to first
 compile-away the CLP interpreter of the language in which prog is written. This
 is done by specializing the interpreter with respect to the given program prog
 using well-known program specialization techniques [24, 33].
    In previous papers [9, 16] we have shown that program specialization can
 be used not only as a preprocessing step to improve the efficiency of program
 analysis, but also as a means of analysis on its own. In this paper, we extend
 that approach and we propose a verification method based on more general
 unfold/fold transformation rules for CLP programs [5, 13, 37].
     Transformation-based verification techniques are very appealing because they
 are parametric with respect to both the programming languages in which pro-
 grams are written, and the logics in which the properties of interest are specified.
 Moreover, since the output of a transformation-based verification method is a
 program which is equivalent to the given program with respect to the proper-
 ties of interest, we can apply a sequence of transformations, thereby refining the
 analysis to the desired degree of precision (see, for instance, [9]).
     The specific contributions of this paper are the following. We present a veri-
 fication method based on a set of transformation rules which includes the rules
 for performing conjunctive definition, conjunctive folding, and goal replacement,
 besides the usual rules for unfolding and constraint manipulation which are used
 during program specialization. The rules for conjunctive definition and conjunc-
 tive folding allow us to introduce and transform new predicates defined in terms
 of conjunctions of old predicates, while program specialization can only deal with
 new predicates that correspond to specialized versions of exactly one old pred-
 icate. The goal replacement rule allows us to replace conjunctions of predicates
 and constraints by applying equivalences that hold in the least model of the CLP
Verification of Imperative Programs by Transforming Constraint Logic Programs         85


 program at hand, while program specialization can only replace conjunctions of
 constraints.
      By using these more powerful definition and folding rules, we extend the
 specialization-based verification method in the following two directions: (i) we
 verify programs with respect to specifications given by sets of CLP clauses (for
 instance, recursively defined relations among program variables), whereas pro-
 gram specialization can only deal with specifications given by constraints, and
 (ii) we verify programs manipulating arrays and other data structures by apply-
 ing equivalences between predicates that axiomatize suitable properties of those
 data structures (for instance, the ones deriving from the axiomatization of the
 theory of arrays [31]).
      The paper is organized as follows. In Section 2 we present our transformation-
 based verification method. First, we introduce a simple imperative language and
 we describe how correctness properties of imperative programs can be translated
 into predicates defined by CLP programs. We also present a general strategy
 for applying the transformation rules to CLP programs, with the objective of
 verifying the properties of interest. Next, we present two examples of application
 of our verification method. In particular, in Section 3 we show how we deal with
 specifications given by recursive CLP clauses, and in Section 4 we show how we
 deal with programs which manipulate arrays. Finally, in Section 5 we discuss
 the related work which has been recently done in the area of automatic program
 verification.


 2    The Transformation-Based Verification Method

 We consider an imperative C-like programming language with integer and array
 variables, assignments (=), sequential compositions (;), conditionals (if and
 if else), while-loops (while), and jumps (goto). A program is a sequence of
 (labeled) commands, and in each program there is a unique halt command
 which, when executed, causes program termination.
     The semantics of our language is defined by a transition relation, denoted
 =⇒, between configurations. Each configuration is a pair hhc, δii of a command c
 and an environment δ. An environment δ is a function that maps: (i) every integer
 variable identifier x to its value v, and (ii) every integer array identifier a to a
 finite function from the set {0, . . . , dim(a)−1}, where dim(a) is the dimension of
 the array a, to the set of the integer numbers. The definition of the relation =⇒
 is similar to the ‘small step’ operational semantics given in [35], and is omitted.
     Given an imperative program prog, we address the problem of verifying
 whether or not, starting from any initial configuration that satisfies the prop-
 erty ϕinit , the execution of prog eventually leads to a final configuration that
 satisfies the property ϕerror , also called an error configuration. This problem is
 formalized by defining an incorrectness triple of the form {{ϕinit }} prog {{ϕerror }},
 where ϕinit and ϕerror are encoded by CLP predicates defined by (possibly re-
 cursive) clauses. We say that a program prog is incorrect with respect to ϕinit
 and ϕerror , whose free variables are assumed to be among z1 , . . . , zr , if there
86       Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi and Maurizio Proietti


 exist environments δinit and δh such that: (i) ϕinit (δinit (z1 ), . . . , δinit (zr )) holds,
 (ii) hh`0 : c0 , δinit ii =⇒∗ hh`h : halt, δh ii, and (iii) ϕerror (δh (z1 ), . . . , δh (zr )) holds,
 where `0 : c0 is the first labeled command of prog and `h : halt is the unique
 halt command of prog. A program is said to be correct with respect to ϕinit and
 ϕerror iff it is not incorrect with respect to ϕinit and ϕerror . Note that this notion
 of correctness is equivalent to the usual notion of partial correctness specified by
 the Hoare triple {ϕinit } prog {¬ϕerror }.
      Our verification method is based on the formalization of the notion of pro-
 gram incorrectness by using a predicate incorrect defined by a CLP program.
      In this paper a CLP program is a finite set of clauses of the form A :- c,B,
 where A is an atom, c is a constraint (that is, a possibly empty conjunction
 of linear equalities and inequalities over the integers), and B is a goal (that
 is, a possibly empty conjunction of atoms). The conjunction c,B is called a
 constrained goal. A clause of the form: A :- c is called a constrained fact. We refer
 to [21] for other notions of CLP with which the reader might be not familiar.
      We translate the problem of checking whether or not the program prog is
 incorrect with respect to the properties ϕinit and ϕerror into the problem of
 checking whether or not the predicate incorrect is a consequence of the CLP
 program T defined by the following clauses:
     incorrect :- initConf(X), reach(X).
     reach(X) :- tr(X, X1), reach(X1).
     reach(X) :- errorConf(X).
 together with the clauses for the predicates initConf(X), errorConf(X), and
 tr(X, X1). They are defined as follows: (i) initConf(X) encodes an initial config-
 uration satisfying the property ϕinit , (ii) errorConf(X) encodes an error config-
 uration satisfying the property ϕerror , and (iii) tr(X, X1) encodes the transition
 relation =⇒. (Note that in order to define initConf(X), errorConf(X), and
 tr(X, X1) and, in particular, to represent operations over the integer variables
 and the elements of arrays, we need constraints.) The predicate reach(X) holds
 if an error configuration Y such that errorConf(Y) holds, can be reached from
 the configuration X.
      The imperative program prog is correct with respect to the properties ϕinit
 and ϕerror iff incorrect 6∈ M (T ), where M (T ) denotes the least model of pro-
 gram T [21]. Due to the presence of integer variables and array variables, M (T )
 is in general an infinite model, and both the bottom-up and top-down evaluation
 of the query incorrect may not terminate. In order to deal with this difficulty,
 we propose an approach to program verification which is symbolic and, by using
 program transformations, allows us to avoid the exhaustive exploration of the
 possibly infinite space of reachable configurations.
      Our verification method consists in applying to program T a sequence of pro-
 gram transformations that preserve the least model M (T ) [13, 15]. In particular,
 we apply the following transformation rules, collectively called unfold/fold rules:
 (i) (conjunctive) definition, (ii) unfolding, (iii) goal replacement, (iv) clause re-
 moval, and (v) (conjunctive) folding. Our verification method is made out of the
 following two steps.
Verification of Imperative Programs by Transforming Constraint Logic Programs          87


 Step (A): Removal of the Interpreter. Program T is specialized with respect to the
 given prog (on which tr depends), initConf, and errorConf, thereby deriving a
 new program T 1 such that: (i) incorrect ∈ M (T ) iff incorrect ∈ M (T 1), and
 (ii) tr does not occur explicitly in T 1 (in this sense we say that the interpreter
 is removed or compiled-away).
 Step (B): Propagation of the Initial and Error Properties. By applying a sequence
 of unfold/fold transformation rules, the CLP program T 1 is transformed into a
 new CLP program T 2 such that the program prog is correct with respect to the
 given initial and error properties iff incorrect 6∈ M (T 2).
     The objective of Step (B) is to propagate the initial and the error properties
 so as to derive a program T 2 where the predicate incorrect is defined by either
 (i) the fact ‘incorrect.’ (in which case prog is incorrect), or (ii) the empty set
 of clauses (in which case prog is correct). In the case where neither (i) nor (ii)
 holds, that is, in program T 2 the predicate incorrect is defined by a non-empty
 set of clauses not containing the fact ‘incorrect.’, we cannot conclude anything
 about the correctness of prog and, similarly to what has been proposed in [9],
 we iterate Step (B) in the hope of deriving a program where either (i) or (ii)
 holds. Obviously, due to undecidability limitations, it may be the case that we
 never get a program where either (i) or (ii) holds.
     Steps (A) and (B) are both instances of the Transform strategy outlined in
 Figure 1 below. These two instances are obtained by using two different ways
 of controlling the application of the transformation rules. In particular, in the
 instance of the Transform strategy that realizes Step (A) we never apply the goal
 replacement rule and the resulting strategy coincides with the fully automatic
 specialization strategy presented in [9].
     In the Transform strategy we make use of the following rules, where P is
 the input CLP program, and Defs is a set of clauses, called definition clauses,
 constructed as we indicate in that strategy.
 Definition Rule. By this rule we introduce a clause of the form newp(X) :- c,G,
 where newp is a new predicate symbol, X is a tuple of variables occurring in (c,G),
 c is a constraint, and G is a non-empty conjunction of atoms.
 Unfolding Rule. Given a clause C of the form H :- c,L,A,R, where H and A are
 atoms, c is a constraint, and L and R are (possibly empty) conjunctions of atoms,
 let us consider the set {Ki :- ci ,Bi | i = 1, . . . , m} made out of the (renamed
 apart) clauses of P such that, for i = 1, . . . , m, A is unifiable with Ki via the most
 general unifier ϑi and (c,ci ) ϑi is satisfiable (thus, the unfolding rule performs
 some constraint solving operations). By unfolding C w.r.t. A using P , we derive
 the set {(H :- c,ci ,L,Bi ,R) ϑi | i = 1, . . . , m} of clauses.
 Goal Replacement Rule. If a constrained goal c1 , G1 occurs in the body of a clause
 C, and M (P ) |= ∀ (c1 , G1 ↔ c2 , G2 ), then we derive a new clause D by replacing
 c1 , G1 by c2 , G2 in the body of C.
      The equivalences which are needed for goal replacements are called laws
 and their validity in M (P ) can be proved once and for all, before applying the
 Transform strategy.
88     Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi and Maurizio Proietti


 Input: Program P .
 Output: Program TransfP such that incorrect ∈ M (P ) iff incorrect ∈ M (TransfP).
 Initialization:
 Let InDefs be the set of all clauses of P whose head is the atom incorrect;
 TransfP := ∅ ; Defs := InDefs ;
 while in InDefs there is a clause C do
      Unfolding: Apply the unfolding rule at least once using P , and derive from C
          a set U(C) of clauses;
     Goal Replacement: Apply a sequence of goal replacements, and derive from
       U(C) a set R(C) of clauses;
     Clause Removal: Remove from R(C) all clauses whose body contains an un-
       satisfiable constraint;
     Definition & Folding: Introduce a (possibly empty) set NewDefs of new pred-
       icate definitions and add them to Defs and to InDefs;
       Fold the clauses in R(C) different from constrained facts by using the clauses
       in Defs, and derive a set F(C) of clauses;
     InDefs := InDefs − {C};     TransfP := TransfP ∪ F(C);
 end-while;
 Removal of Useless Clauses:
 Remove from TransfP all clauses whose head predicate is useless.

                           Fig. 1. The Transform strategy.


 Folding Rule. Given a clause E of the form: H :- e, L, Q, R and a clause D in
 Defs of the form K :- d, D such that: (i) for some substitution ϑ, Q = D ϑ, and (ii)
 ∀ (e → d ϑ) holds, then by folding E using D we derive H :- e, L, K ϑ, R.
 Removal of Useless Clauses. The set of useless predicates in a given program Q
 is the greatest set U of predicates occurring in Q such that p is in U iff every
 clause with head predicate p is of the form p(X) :- c, G1 , q(Y), G2 , for some q in
 U . A clause in a program Q is useless if the predicate of its head is useless in Q.
     The termination of the Transform strategy is guaranteed by suitable tech-
 niques for controlling the unfolding and the introduction of new predicates. We
 refer to [28] for a survey of techniques which ensure the finiteness of unfolding.
 The introduction of new predicates is controlled by applying generalization op-
 erators based on various notions, such as widening, convex hull, most specific
 generalization, and well-quasi ordering, which have been proposed for analyzing
 and transforming CLP programs (see, for instance, [8, 11, 17, 32]).
     The correctness of the strategy with respect to the least model semantics
 directly follows from the fact that the application of the transformation rules
 complies with some suitable conditions that guarantee the preservation of that
 model [13].
 Theorem 1. (Termination and Correctness of the Transform strategy) (i) The
 Transform strategy terminates. (ii) Let program TransfP be the output of the
Verification of Imperative Programs by Transforming Constraint Logic Programs               89


 Transform strategy applied on the input program P . Then, incorrect ∈ M (P )
 iff incorrect ∈ M (TransfP).


 3    Verification of Recursively Defined Properties

 In this section we will show, through an example, that our verification method
 can be used when the initial properties and the error properties are specified
 by (possibly recursive) CLP clauses, rather than by constraints only (as done,
 for instance, in [9]). In order to deal with that kind of properties, during the
 Definition & Folding phase of the Transform strategy, we allow ourselves to
 introduce new predicates which are defined by clauses of the form: Newp :- c, G,
 where Newp is an atom with a new predicate symbol, c is a constraint, and G is a
 conjunction of one or more atoms. This kind of predicate definitions allows us to
 perform program verifications that cannot be done by the technique presented
 in [9], where the goal G is assumed to be a single atom.
     Let us consider the following program GCD that computes the greatest com-
 mon divisor z of two positive integers m and n, denoted gcd (m, n, z).
     GCD:            `0 : x = m ;
                     `1 : y = n ;
                     `2 : while (x 6= y) { if (x > y) x = x−y ; else y = y−x ; } ;
                     `3 : z = x ;
                     `h : halt
 We also consider the incorrectness triple {{ϕinit (m, n)}} GCD {{ϕerror (m, n, z)}},
 where:
 (i) ϕinit (m, n) is m ≥ 1 ∧ n ≥ 1, and (ii) ϕerror (m, n, z) is ∃ d (gcd (m, n, d) ∧ d 6= z).
 These properties ϕinit and ϕerror are defined by the following CLP clauses 1 and
 2–5, respectively:
 1. phiInit(M, N) :- M ≥ 1, N ≥ 1.
 2. phiError(M, N, Z) :- gcd(M, N, D), D 6= Z.
 3. gcd(X, Y, D) :- X > Y, X1 = X−Y, gcd(X1, Y, D).
 4. gcd(X, Y, D) :- X < Y, Y1 = Y−X, gcd(X, Y1, D).
 5. gcd(X, Y, D) :- X = Y, Y = D.
 The predicates initConf and errorConf specifying the initial and the error
 configurations, respectively, are defined by the following clauses:
 6. initConf(cf(cmd(0, asgn(int(x), int(m))),
        [[int(m), M], [int(n), N], [int(x), X], [int(y), Y], [int(z), Z]]))
           :- phiInit(M, N).
 7. errorConf(cf(cmd(h, halt),
        [[int(m), M], [int(n), N], [int(x), X], [int(y), Y], [int(z), Z]])) :-
           phiError(M, N, Z).
 Thus, the CLP program encoding the given incorrectness triple consists of clauses
 1–7 above, together with the clauses defining the predicates incorrect, reach,
 and tr given as indicated in Section 2.
90     Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi and Maurizio Proietti


    Now we perform Step (A) of our verification method, which consists in the
 removal of the interpreter, and we derive the following CLP program:
  8. incorrect :- M ≥ 1, N ≥ 1, X = M, Y = N, new1(M, N, X, Y, Z).
  9. new1(M, N, X, Y, Z) :- X > Y, X1 = X−Y, new1(M, N, X1, Y, Z).
 10. new1(M, N, X, Y, Z) :- X < Y, Y1 = Y−X, new1(M, N, X, Y1, Z).
 11. new1(M, N, X, Y, Z) :- X = Y, Z = X, Z 6= D, gcd(M, N, D).
 By moving the constrained atom ‘Z 6= D, gcd(M, N, D)’ from the body of clause 11
 to the body of clause 8, we can rewrite clauses 8 and 11 as follows (this rewriting
 is correct because in clauses 9 and 10 the predicate new1 modifies neither the
 value of M nor the value of N):
 8r. incorrect :- M ≥ 1, N ≥ 1, X = M, Y = N, Z 6= D, gcd(M, N, D), new1(M, N, X, Y, Z).
 11r. new1(M, N, X, Y, Z) :- X = Y, Z = X.
 Note that we could avoid performing the above rewriting and obtain a similar
 program where the constraints characterizing the initial and the error properties
 occur in the same clause by starting our derivation from a more general definition
 of the reachability relation. However, an in-depth analysis of this variant of our
 verification method is beyond the scope of this paper.
    Now we will perform Step (B) of the verification method by applying the
 Transform strategy to the derived program consisting of clauses {3, 4, 5, 8r, 9, 10,
 11r}. Initially, we have that the sets InDefs and Defs of definition clauses are
 both equal to {8r}.
 Unfolding. We start off by unfolding clause 8r w.r.t. the atom new1(M, N, X, Y, Z),
 and we get:
 12. incorrect :- M ≥ 1, N ≥ 1, X = M, Y = N, X > Y, X1 = X−Y, Z 6= D, gcd(M, N, D),
                   new1(M, N, X1, Y, Z).
 13. incorrect :- M ≥ 1, N ≥ 1, X = M, Y = N, X < Y, Y1 = Y−X, Z 6= D, gcd(M, N, D),
                   new1(M, N, X, Y1, Z).
 14. incorrect :- M ≥ 1, N ≥ 1, X = M, Y = N, X = Y, Z = X, Z 6= D, gcd(M, N, D).
 By unfolding clauses 12, 13, and 14 w.r.t. the atom gcd(M, N, D), we derive:
 15. incorrect :- M ≥ 1, N ≥ 1, M > N, X1 = M−N, Z 6= D, gcd(X1, N, D),
                  new1(M, N, X1, N, Z).
 16. incorrect :- M ≥ 1, N ≥ 1, M < N, Y1 = N−M, Z 6= D, gcd(M, Y1, D),
                  new1(M, N, M, Y1, Z).
 (The unfolding of clause 14 produces the empty set of clauses because the con-
 straint ‘X = M, Z = X, Z 6= D, M = D’ is unsatisfiable.) The Goal Replacement and
 Clause Removal phases leave the set of clauses produced by the Unfolding
 phase unchanged, because no laws are available for the predicate gcd.
 Definitions & Folding. In order to fold clauses 15 and 16, we perform a gener-
 alization step and we introduce a new predicate defined by the following clause:
 17. new2(M, N, X, Y, Z, D) :- M ≥ 1, N ≥ 1, Z 6= D, gcd(X, Y, D), new1(M, N, X, Y, Z).
 The body of this clause 17 is the most specific generalization of the bodies of
 clause 8r (which is the only clause in Defs), and clauses 15 and 16 (which are the
Verification of Imperative Programs by Transforming Constraint Logic Programs            91


 clauses to be folded). Now, clauses 15 and 16 can be folded by using clause 17,
 thereby deriving:
 18. incorrect :- M ≥ 1, N ≥ 1, M > N, X1 = M−N, Z 6= D, new2(M, N, X1, N, Z, D).
 19. incorrect :- M ≥ 1, N ≥ 1, M < N, Y1 = N−M, Z 6= D, new2(M, N, M, Y1, Z, D).
 Clause 17 defining the new predicate new2 is added to Defs and InDefs and,
 since the latter set is not empty, we perform a new iteration of the while-loop
 body of the Transform strategy.
 Unfolding. By unfolding clause 17 w.r.t. new1(M,N,X,Y,Z) and then unfolding
 the resulting clauses w.r.t. gcd(X,Y,Z), we derive:
 20. new2(M, N, X, Y, Z, D) :- M ≥ 1, N ≥ 1, X > Y, X1 = X−Y, Z 6= D, gcd(X1, Y, D),
                               new1(M, N, X1, Y, Z).
 21. new2(M, N, X, Y, Z, D) :- M ≥ 1, N ≥ 1, X < Y, Y1 = Y−X, Z 6= D, gcd(X, Y1, D),
                               new1(M, N, X, Y1, Z).
 Definition & Folding. Clauses 20 and 21 can be folded by using clause 17,
 and we derive:
 22. new2(M, N, X, Y, Z, D) :- M ≥ 1, N ≥ 1, X > Y, X1 = X−Y, Z 6= D, new2(M, N, X1, Y, Z).
 23. new2(M, N, X, Y, Z, D) :- M ≥ 1, N ≥ 1, X < Y, Y1 = Y−X, Z 6= D, new2(M, N, X, Y1, Z).
 No new predicate definition is introduced, and the Transform strategy exits the
 while-loop. The final program TransfP is the set {18, 19, 22, 23} of clauses, which
 contains no constrained facts. Hence both predicates incorrect and new2 are
 useless and all clauses of TransfP can be removed. Thus, the Transform strategy
 terminates with TransfP = ∅ and we conclude that the imperative program GCD
 is correct w.r.t. the given initial and error properties.


 4    Verification of Array Programs

 In this section we apply our verification method to the following program
 ArrayMax which computes the maximal element of an array:
      ArrayMax :        `0 : i = 0;
                        `1 : while (i < n) { if (a[i] > max) max = a[i];
                                              i = i+1; };
                        `h : halt
 We consider the following incorrectness triple:
      {{ϕinit (i,n,a,max)}} ArrayMax {{ϕerror (n,a,max)}}
 where: (i) ϕinit (i,n,a,max) is i ≥ 0 ∧ n = dim(a) ∧ n ≥ i + 1 ∧ max = a[i], and
 (ii) ϕerror (n,a,max) is ∃k (0 ≤ k < n ∧ a[k] > max ).
      First, we construct a CLP program T which encodes the above incorrectness
 triple, similarly to what has been done in Section 3. In particular, the properties
 ϕinit and ϕerror are defined by the following CLP clauses, respectively:
 1. phiInit(I, N, A, Max) :- I ≥ 0, N ≥ I+1, read((A, N), I, Max).
 2. phiError(N, A, Max) :- K ≥ 0, N > K, Z > Max, read((A, N), K, Z).
 The clauses defining the predicates initConf(X) and errorConf(X) which specify
 the initial and the error configurations, respectively, are as follows:
92     Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi and Maurizio Proietti


 3. initConf(cf(cmd(0, asgn(int(i), int(0))),
         [[int(i), I], [int(n), N], [array(a), (A, N)], [int(max), Max]])) :-
            phiInit(I, N, A, Max).
 4. errorConf(cf(cmd(h, halt),
         [[int(i), I], [int(n), N], [array(a), (A, N)], [int(max), Max]])) :-
            phiError(N, A, Max).
 Now we start off by applying Step (A) of our verification method which con-
 sists in the removal of the interpreter. From program T we obtain the following
 program T 1:
 5. incorrect :- I = 0, N ≥ 1, read((A, N), I, Max), new1(I, N, A, Max).
 6. new1(I, N, A, Max) :- I1 = I+1, I < N, I ≥ 0, M > Max, read((A, N), I, M),
                          new1(I1, N, A, M).
 7. new1(I, N, A, Max) :- I1 = I+1, I < N, I ≥ 0, M ≤ Max, read((A, N), I, M),
                          new1(I1, N, A, Max).
 8. new1(I, N, A, Max) :- I ≥ N, K ≥ 0, N > K, Z > Max, read((A, N), K, Z).
 As indicated in [9], in order to propagate the error property, we ‘reverse’ the
 derived program T 1 and we get the following program T 1rev :
 rev1. incorrect :- b(U), r2(U).
 rev2. r2(V) :- trans(U, V), r2(U).
 rev3. r2(U) :- a(U).
 where the predicates a, b, and trans are defined as follows:
 s4. a([new1, I, N, A, Max]) :- I = 0, N ≥ 1, read((A, N), I, Max)
 s5. trans([new1, I, N, A, Max], [new1, I1, N, A, M]) :-
                                  I1 = I+1, I < N, I ≥ 0, M > Max, read((A, N), I, M).
 s6. trans([new1, I, N, A, Max], [new1, I1, N, A, Max]) :-
                                  I1 = I+1, I < N, I ≥ 0, M ≤ Max, read((A, N), I, M).
 s7. b([new1, I, N, A, Max]) :- I ≥ N, K ≥ 0, K < N, Z > Max, read((A, N), K, Z).
 This reversal transformation, which from program T 1 derives program T 1rev ,
 can easily be automatedand it is correct in the sense that incorrect ∈ M (T 1)
 iff incorrect ∈ M (T 1rev ). This equivalence holds because: (i) in program T 1 the
 predicate incorrect is defined in terms of the predicate new1 that encodes the
 reachability relation from an error configuration to an initial configuration, and
 (ii) in program T 1rev the predicate incorrect is defined in terms of the pred-
 icate r2 that also encodes the reachability relation, but this time the encoding
 is, so to speak, ‘in the reversed direction’, that is, from an initial configuration
 to an error configuration.
      Now let us apply Step (B) of our verification method starting from the pro-
 gram T 1rev .
 Unfolding. First we unfold clause rev1 w.r.t. the atom b(U), and we get:
 9. incorrect :- I ≥ N, K ≥ 0, K < N, Z > Max, read((A, N), K, Z),
                 r2([new1, I, N, A, Max]).
 Neither Goal Replacement nor Clause Removal is applied.
Verification of Imperative Programs by Transforming Constraint Logic Programs            93


 Definition & Folding. In order to fold clause 9 we introduce the following
 clause:
 10. new2(I, N, A, Max, K, Z) :- I ≥ N, K ≥ 0, K < N, Z > Max, read((A,N), K, Z),
                    r2([new1, I, N, A, Max]).
 By folding clause 9 using clause 10, we get:
 11. incorrect :- I ≥ N, K ≥ 0, K < N, Z > Max, new2(I, N, A, Max, K, Z).
 Now we proceed by performing a second iteration of the body of the while-loop of
 the Transform strategy because InDefs is not empty (indeed, clause 10 belongs
 to InDefs).
 Unfolding. After some unfoldings from clause 10 we get the following clauses:
 12. new2(I1, N, A, M, K, Z) :- I1 = I+1, N = I1, K ≥ 0, K < I1, M > Max, Z > M,
                    read((A, N), K, Z), read((A, N), I, M), r2([new1, I, N, A, Max]).
 13. new2(I1, N, A, Max, K, Z) :- I1 = I+1, N = I1, K ≥ 0, K < I1, M ≤ Max, Z > Max,
                    read((A, N), K, Z), read((A, N), I, M), r2([new1, I, N, A, Max]).
 Goal Replacement. We use the following law which is a consequence of the
 fact that arrays are finite functions:
 (L1) read((A, N), K, Z), read((A, N), I, M) ↔
      (K = I, Z = M, read((A, N), K, Z)) ∨ (K 6= I, read((A, N), K, Z), read((A, N), I, M))
 Thus, (i) we replace the conjunction of atoms ‘read((A, N), K, Z), read((A, N), I, M)’
 occurring in the body of clause 12 by the right hand side of law (L1), and then
 (ii) we split the derived clause with disjunctive body into the following two
 clauses, each of which corresponds to a disjunct of the right hand side of (L1).
 We get the following clauses:
 12.1 new2(I1, N, A, M, K, Z) :- I1 = I+1, N = I1, K ≥ 0, K < I1, M > Max, Z > M,
                K = I, M = Z, read((A, N), K, Z), r2([new1, I, N, A, Max]).
 12.2 new2(I1, N, A, M, K, Z) :- I1 = I+1, N = I1, K ≥ 0, K < I1, M > Max, Z > M,
                K 6= I, read((A, N), K, Z), read((A, N), I, M), r2([new1, I, N, A, Max]).
 Clause Removal. The constraint ‘Z > M, M = Z’ in the body of clause 12.1 is
 unsatisfiable. Hence, this clause is removed from TranfP. By simplifying the
 constraints in clause 12.2 we get:
 14. new2(I1, N, A, M, K, Z) :- I1 = I+1, N = I1, K ≥ 0, K < I, M > Max, Z > M,
                    read((A, N), K, Z), read((A, N), I, M), r2([new1, I, N, A, Max]).
 By applying similar goal replacements and clause removals, from clause 13 we
 get:
 15. new2(I1, N, A, Max, K, Z) :- I1 = I+1, N = I1, K ≥ 0, K < I, M ≤ Max, Z > Max,
                    read((A, N), K, Z), read((A, N), I, M), r2([new1, I, N, A, Max]).
 Definition & Fold. In order to fold clause 14, we introduce the following def-
 inition:
 16. new3(I, N, A, Max, K, Z) :- K ≥ 0, K < N, K < I, Z > Max, read((A, N), K, Z)),
                    r2([new1, I, N, A, Max]).
94     Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi and Maurizio Proietti


 Clause 16 is obtained from clauses 10 and 14 by applying a generalization op-
 erator called WidenSum [17], which is a variant of the classical widening opera-
 tor [6]. Clause 16 can be used also for folding clause 15, and by folding clauses 14
 and 15 using clause 16, we get:
 17. new2(I1, N, A, Max, K, Z) :- I1 = I+1, N = I1, K ≥ 0, K < I, M > Max, Z > M,
                    read((A, N), I, M), new3(I, N, A, Max, K, Z).
 18. new2(I1, N, A, M, K, Z) :- I1 = I+1, N = I1, K ≥ 0, K < I, M ≤ Max, Z > Max,
                    read((A, N), I, M), new3(I, N, A, Max, K, Z).
 Now we perform the third iteration of the body of the while-loop of the strategy.
 After some unfolding, goal replacement, clause removal, and folding steps, from
 clause 16 we get:
 19. new3(I1, N, A, M, K, Z) :- I1 = I+1, K ≥ 0, K < I, N ≥ I1, M > Max, Z > M,
                    read((A, N), I, M), new3(I, N, A, Max, K, Z).
 20. new3(I1, N, A, Max, K, Z) :- I1 = I+1, K ≥ 0, K < I, N ≥ I1, M ≤ Max, Z > Max,
                    read((A, N), I, M), new3(I, N, A, Max, K, Z).
 Since we did not introduce any new definition, and no clause remains to be pro-
 cessed (indeed, the set InDefs of definitions is empty), the Transform strategy ex-
 its the while-loop and we get the program consisting of the set {11, 17, 18, 19, 20}
 of clauses.
     Since no clause in this set is a constrained fact, by the final phase of removing
 the useless clauses we get a final program consisting of the empty set of clauses.
 Thus, the program ArrayMax is correct with respect to the given ϕinit and ϕerror
 properties.


 5   Related Work and Conclusions

 The verification method presented in this paper is an extension of the one in-
 troduced in [9], where Constraint Logic Programming (CLP) and iterated spe-
 cialization have been used to define a general verification framework that is
 parametric with respect to the programming language and the logic used for
 specifying the correctness properties. The main novelties of this paper are the
 following ones: (i) we have considered imperative programs acting on integer
 variables as well as array variables, and (ii) we have allowed a more expres-
 sive specification language, in which one can write properties about elements of
 arrays and, in general, elements of complex data structures.
     In order to deal with this more general setting, we have defined the oper-
 ational semantics of array manipulation, and we have also considered powerful
 transformation rules, such as conjunctive definition, conjunctive folding, and
 goal replacement. These transformation rules together with some strategies for
 guiding their application, have been implemented in the MAP transformation
 system [29], so that the proofs of program correctness have been performed in a
 semi-automatic way.
     The idea of encoding imperative programs into logic programs for reasoning
 about the properties of those imperative programs is not novel. In particular,
Verification of Imperative Programs by Transforming Constraint Logic Programs      95


 for instance, this encoding has been recently used for reasoning about the type
 system of Featherweight Java programs in [1]. The use of constraint-based tech-
 niques for program verification is not novel either.Indeed, CLP programs have
 been successfully applied to perform model checking of both finite and infinite
 state systems [12, 14, 17] because through CLP programs one can express in a
 simple manner both (i) the symbolic executions of imperative programs and
 (ii) the invariants which hold during their executions. Moreover, there are pow-
 erful CLP-based tools, such as ARMC [34], TRACER [22], and HSF [20], that
 can be used for performing model checking of imperative programs. These tools
 are fully automatic, but they are applicable to classes of programs and proper-
 ties that are much more limited than those considered in this paper. We have
 shown in [9] that, by focusing on verification tasks similar to those considered by
 ARMC, TRACER, and HSF, we can design a fully automatic, transformation-
 based verification technique whose effectiveness is competitive to the one of the
 above mentioned tools.
      Our rule-based program transformation technique is also related to conjunc-
 tive partial deduction (CPD) [11], a technique for the specialization of logic
 programs with respect to conjunctions of atoms. There are, however, some sub-
 stantial differences between CPD and the approach we have presented here.
 First, CPD is not able to specialize logic programs with constraints and, thus,
 it cannot be used to prove the correctness of the GCD program where the role
 of constraints is crucial. Indeed, using the ECCE conjunctive partial deduction
 system [27] for specializing the program consisting of clauses {3, 4, 5, 8r, 9,
 10, 11r} with respect to the query incorrect, we obtain a residual program
 where the predicate incorrect is not useless. Thus, we cannot conclude that
 the atom incorrect does not belong to the least model of the program, and thus
 we cannot conclude that the program is correct. One more difference between
 CPD and our technique is that we may use goal replacement rules which allow
 us to evaluate terms over domain-specific theories. In particular, we can apply
 the goal replacement rules using well-developed theories for data structures like
 arrays, lists, heaps and sets (see [4, 30, 19, 3, 36, 39] for some formalizations of
 these theories).
      An alternative, systemic approach to program transformation is supercom-
 pilation [38], which considers programs as machines. A supercompiler runs a
 program and, while it observes its behavior, produces an equivalent program
 without performing stepwise transformations of the original program.
      The verification method we have presented in this paper is also related to
 several other methods for verifying properties of imperative programs acting on
 arrays. Those methods use techniques based on abstract interpretation, theorem
 proving and, in particular, Satisfiability Modulo Theory (see, for instance, [7,
 25, 26]).
      The application of the powerful transformation rules we have considered in
 this paper enables us to verify larger classes of properties, but the strategies to
 be applied for dealing with those classes are not all instances of the automated
 strategy introduced in [9].
96        Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi and Maurizio Proietti


     In the future we intend to consider the issue of designing fully mechanizable
 strategies for guiding the application of our program transformation rules. In
 particular, we want to study the problem of devising suitable unfolding strategies
 and generalization operators, by adapting the techniques already developed for
 program transformation. We also envisage that the application of the laws used
 by the goal replacement rule can be automated by importing in our framework
 the techniques used in the fields of Theorem Proving and Term Rewriting. For
 some specific theories we could also apply the goal replacement rule by exploiting
 the results obtained by external theorem provers or Satisfiability Modulo Theory
 solvers.
     We also plan to address the issue of proving correctness of programs acting
 on dynamic data structures such as lists or heaps, looking for a set of suitable
 goal replacement laws which axiomatize those structures.


 Acknowledgements

 We would like to thank the anonymous referees for their helpful comments and
 constructive criticism.


 References

     1. D. Ancona and G. Lagorio. Coinductive Type Systems for Object-Oriented Lan-
        guages. In Proceedings of the 23th European Conference on Object-Oriented Pro-
        gramming. ECOOP’09, Lecture Notes in Computer Science 5653, pages 2–26.
        Springer, 2009.
     2. D. Beyer, T. A. Henzinger, R. Majumdar, and A. Rybalchenko. Invariant synthesis
        for combined theories. In Proceedings of the 8th International Conference on Ver-
        ification, Model Checking, and Abstract Interpretation. VMCAI’07, Lecture Notes
        in Computer Science 4349, pages 378–394. Springer, 2007.
     3. R. S. Bird. An introduction to the theory of lists. In Proceedings of the NATO
        Advanced Study Institute on Logic of programming and calculi of discrete design,
        pages 5–42. Springer-Verlag New York, Inc., 1987.
     4. A. R. Bradley, Z. Manna, and H. B. Sipma. What’s decidable about arrays? In
        Proceedings of the 7th International Conference on Verification, Model Checking,
        and Abstract Interpretation. VMCAI’06, Charleston, SC, USA, Lecture Notes in
        Computer Science 3855. Springer, 2006.
     5. R. M. Burstall and J. Darlington. A transformation system for developing recursive
        programs. Journal of the ACM, 24(1):44–67, 1977.
     6. P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static
        analysis of programs by construction of approximation of fixpoints. In Proceedings
        of the 4th ACM-SIGPLAN Symposium on Principles of Programming Languages,
        POPL’77, pages 238–252. ACM Press, 1977.
     7. P. Cousot, R. Cousot, and F. Logozzo. A parametric segmentation functor for fully
        automatic and scalable array content analysis. In Proceedings of the 38th ACM
        Symposium on Principles of programming languages. POPL’11, pages 105–118,
        2011.
Verification of Imperative Programs by Transforming Constraint Logic Programs            97


  8. P. Cousot and N. Halbwachs. Automatic discovery of linear restraints among
     variables of a program. In Proceedings of the Fifth ACM Symposium on Principles
     of Programming Languages, POPL’78, pages 84–96. ACM Press, 1978.
  9. E. De Angelis, F. Fioravanti, A. Pettorossi, and M. Proietti. Verifying Programs
     via Iterated Specialization. In Proceedings of the ACM SIGPLAN 2013 Workshop
     on Partial Evaluation and Program Manipulation, PEPM’13, pages 43–52, 2013.
 10. E. De Angelis, F. Fioravanti, A. Pettorossi, and M. Proietti. Verification of Im-
     perative Programs by Constraint Logic Program Transformation. In Semantics,
     Abstract Interpretation, and Reasoning about Programs, SAIRP’13, EPTCS 129,
     pages 186–210, 2013.
 11. D. De Schreye, R. Glück, J. Jørgensen, M. Leuschel, B. Martens, and M. H.
     Sørensen. Conjunctive partial deduction: Foundations, control, algorithms, and
     experiments. Journal of Logic Programming, 41(2–3):231–277, 1999.
 12. G. Delzanno and A. Podelski. Model checking in CLP. In R. Cleaveland, ed.,
     5th International Conference on Tools and Algorithms for the Construction and
     Analysis of Systems, TACAS’99, Lecture Notes in Computer Science 1579, pages
     223–239. Springer-Verlag, 1999.
 13. S. Etalle and M. Gabbrielli. Transformations of CLP modules. Theoretical Com-
     puter Science, 166:101–146, 1996.
 14. F. Fioravanti, A. Pettorossi, and M. Proietti. Verifying CTL properties of infinite
     state systems by specializing constraint logic programs. In Proceedings of the ACM
     SIGPLAN Workshop on Verification and Computational Logic VCL’01, Florence,
     Italy, Technical Report DSSE-TR-2001-3, pages 85–96. University of Southampton,
     UK, 2001.
 15. F. Fioravanti, A. Pettorossi, and M. Proietti. Transformation Rules for Locally
     Stratified Constraint Logic Programs. In K.-K. Lau and M. Bruynooghe, eds
     Program Development in Computational Logic, Lecture Notes in Computer Sci-
     ence 3049, pages 292–340. Springer, 2004.
 16. F. Fioravanti, A. Pettorossi, M. Proietti, and V. Senni. Improving reachability anal-
     ysis of infinite state systems by specialization. In G. Delzanno and I. Potapov, eds.,
     Proceedings of the 5th International Workshop on Reachability Problems, RP’11,
     Genoa, Italy, Lecture Notes in Computer Science 6945, pages 165–179. Springer,
     2011.
 17. F. Fioravanti, A. Pettorossi, M. Proietti, and V. Senni. Generalization strategies
     for the verification of infinite state systems. Theory and Practice of Logic Pro-
     gramming. Special Issue on the 25th Annual GULP Conference, 13(2):175–199,
     2013.
 18. C. Flanagan. Automatic software model checking via constraint logic. Sci. Comput.
     Program., 50(1–3):253–270, 2004.
 19. S. Ghilardi, E. Nicolini, S. Ranise, and D. Zucchelli. Decision procedures for ex-
     tensions of the theory of arrays. Ann. Math. Artif. Intell., 50(3-4):231–254, 2007.
 20. S. Grebenshchikov, A. Gupta, N. P. Lopes, C. Popeea, and A. Rybalchenko.
     HSF(C): A Software Verifier based on Horn Clauses. In Proc. of the 18th Inter-
     national Conference on Tools and Algorithms for the Construction and Analysis
     of Systems, TACAS’12, Lecture Notes in Computer Science 7214, pages 549–551.
     Springer, 2012.
 21. J. Jaffar and M. Maher. Constraint logic programming: A survey. Journal of Logic
     Programming, 19/20:503–581, 1994.
 22. J. Jaffar, V. Murali, J. A. Navas, and A. E. Santosa. TRACER: A symbolic
     execution tool for verification. In CAV’12, Lecture Notes in Computer Science 7358,
     pages 758–766. Springer, 2012.
98     Emanuele De Angelis, Fabio Fioravanti, Alberto Pettorossi and Maurizio Proietti


 23. R. Jhala and R. Majumdar. Software model checking. ACM Computing Surveys,
     41(4):21:1–21:54, 2009.
 24. N. D. Jones, C. K. Gomard, and P. Sestoft. Partial Evaluation and Automatic
     Program Generation. Prentice Hall, 1993.
 25. L. Kovács and A. Voronkov. Finding loop invariants for programs over arrays using
     a theorem prover. In Proceedings of the 12th International Conference on Funda-
     mental Approaches to Software Engineering. FASE’09, Lecture Notes in Computer
     Science 5503, pages 470–485. Springer, 2009.
 26. D. Larraz, E. Rodríguez-Carbonell, and A. Rubio. SMT-based array invariant
     generation. In 14th International Conference on Verification, Model Checking,
     and Abstract Interpretation, VMCAI’13, Rome, Italy, Lecture Notes in Computer
     Science 7737, pages 169–188. Springer, 2013.
 27. M. Leuschel. The ECCE partial deduction system and the DPPD library of bench-
     marks, Release 3, Nov. 2000. Available from http://www.ecs.soton.ac.uk/∼mal.
 28. M. Leuschel and M. Bruynooghe. Logic program specialisation through partial
     deduction: Control issues. Theory and Practice of Logic Programming, 2(4&5):461–
     515, 2002.
 29. MAP: The MAP transformation system. http://www.iasi.cnr.it/∼proietti/
     system.html. Via a WEB interface: http://www.map.uniroma2.it/mapweb.
 30. J. McCarthy. A basis for a mathematical theory of computation. In Computer
     Programming and Formal Systems, pages 33–70. North-Holland, 1963.
 31. J. McCarthy. Towards a mathematical science of computation. In C. Popplewell,
     ed., Information Processing. Proceedings of IFIP 1962, pages 21–28, Amsterdam,
     1963. North Holland.
 32. J. C. Peralta and J. P. Gallagher. Convex hull abstractions in specialization of CLP
     programs. In M. Leuschel, ed., Logic Based Program Synthesis and Tranforma-
     tion, 12th International Workshop, LOPSTR’02, Madrid, Spain, Revised Selected
     Papers, Lecture Notes in Computer Science 2664, pages 90–108, 2003.
 33. J. C. Peralta, J. P. Gallagher, and H. Saglam. Analysis of Imperative Programs
     through Analysis of Constraint Logic Programs. In G. Levi, ed., Static Analysis,
     5th International Symposium, SAS’98, Pisa, Italy, Lecture Notes in Computer
     Science 1503, pages 246–261. Springer, 1998.
 34. A. Podelski and A. Rybalchenko. ARMC: The Logical Choice for Software Model
     Checking with Abstraction Refinement. In M. Hanus, ed., Practical Aspects of
     Declarative Languages, PADL’07, Lecture Notes in Computer Science 4354, pages
     245–259. Springer, 2007.
 35. C. J. Reynolds. Theories of Programming Languages. Cambridge University Press,
     1998.
 36. J. C. Reynolds. Separation logic: A logic for shared mutable data structures. In
     Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science
     LICS’02, pages 55–74. IEEE Computer Society, 2002.
 37. H. Tamaki and T. Sato. Unfold/fold transformation of logic programs. In S.-Å.
     Tärnlund, ed., Proceedings of the Second International Conference on Logic Pro-
     gramming, ICLP’84, pages 127–138, Uppsala, Sweden, 1984. Uppsala University.
 38. V. F. Turchin. The concept of a supercompiler. ACM TOPLAS, 8(3):292–325,
     1986.
 39. M. Wirsing. Algebraic specification. In J. Van Leeuwen, ed., Handbook of Theo-
     retical Computer Science, volume B, pages 675–788. Elsevier, 1990.