=Paper=
{{Paper
|id=Vol-3284/8665
|storemode=property
|title=On the Properties of Partial Completeness in Abstract Interpretation
|pdfUrl=https://ceur-ws.org/Vol-3284/8665.pdf
|volume=Vol-3284
|authors=Marco Campion,Mila Dalla Preda,Roberto Giacobazzi
|dblpUrl=https://dblp.org/rec/conf/ictcs/CampionPG22
}}
==On the Properties of Partial Completeness in Abstract Interpretation==
On the Properties of Partial Completeness in Abstract Interpretation (Short Paper)β Marco Campion, Mila Dalla Preda and Roberto Giacobazzi University of Verona, Italy Abstract We present a weakening of the completeness property in abstract interpretation. Completeness of a static analyzer represents the ideal situation where no false alarms are produced when answering queries on program behavior. Completeness, however, is a very rare condition to be satisfied in practice. We introduce the notion of partial completeness as a weakening of precision, namely, the abstract interpreter may produce a bounded number of false alarms, and then we show the key recursive properties of the class of programs for which an abstract interpreter is partially complete with a given bound of imprecision. Keywords Abstract Interpretation, Program Analysis, Computability, Partial Completeness 1. Introduction Abstract interpretation [2, 3, 4] is a general theory for the design of sound-by-construction program analysis tools. The abstract interpretation of a program π β Programs1 , denoted by Jπ Kπ΄ , consists of an abstract domain π΄, often specified by a pair of abstraction πΌπ΄ : πΆ β π΄ and concretization πΎπ΄ : π΄ β πΆ monotone maps, abstracting some concrete properties of interest πΆ and an interpreter Jπ Kπ΄ : π΄ β π΄, designed for the language used to specify π and on the abstract domain π΄. The structure of the abstract domain is given by a partial order β€π΄ that expresses the relative precision of its objects: If π, π β π΄ and π β€π΄ π then π over approximates π. We are interested in inferring program invariants, hence our concrete semantics Jπ K : β(S) β β(S) will be the standard collecting denotational semantics (e.g., see [5, Chapter 5] or [6, Chapter 3.5]) operating on a domain of sets of memory of states β(S). Soundness means that if program π satisfies the condition Jπ Kπ΄ πΌπ΄ (π) β€π΄ π for the input π β β(S) and output specification π β π΄, then Jπ Kπ β πΎπ΄ (π). When the converse holds we have precision, or completeness of the abstract interpreter, and therefore of the analysis. This Proceedings of the 23rd Italian Conference on Theoretical Computer Science, Rome, Italy, September 7-9, 2022 β This work is an extended abstract of our recent results published in [1]. " marco.campion@univr.it (M. Campion); mila.dallapreda@univr.it (M. Dalla Preda); roberto.giacobazzi@univr.it (R. Giacobazzi) 0000-0002-1099-3494 (M. Campion); 0000-0003-2761-4347 (M. Dalla Preda); 0000-0002-9582-3960 (R. Giacobazzi) Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop CEUR Workshop Proceedings (CEUR-WS.org) Proceedings http://ceur-ws.org ISSN 1613-0073 1 For our purposes, Programs contains programs written in the simple while-language defined in [5, Chapter 2, page 12] 1 Marco Campion et al. CEUR Workshop Proceedings 1β7 represents the ideal situation where no false alarms are produced. Completeness, however, is a very rare condition to be satisfied in practice, therefore, static program analysis needs to, in some way, deal with incompleteness. Moreover, the experience tells us that there are results that are βmore incompleteβ, i.e., less precise, than others, and this depends upon the way the program is written and the way the abstract interpreter is implemented [7]. As an example, consider the following program π ::= while π₯ > 1 do π₯ := π₯ β 2 if π₯ = 1 then π₯ := 10 if π₯ = 2 then π₯ := 100 and the set of values {2, 4}. Clearly Jπ K{2, 4} = {0}. However, when π is analyzed by an abstract interpreter Jπ KInt (e.g., see [6, Chapter 4.5]) which considers abstract arithmetical operators as the best correct approximation (bca for short), defined on the abstract domain of intervals Int β {[π, π] | π, π β Z βͺ {ββ, +β}, π β€ π} βͺ {β₯Int } [2] or defined as the bca for Int, namely, Jπ KInt bca β πΌInt β Jπ K β πΎInt β πΌInt , then it may exhibit different levels of imprecision. Recall that Int abstracts sets of integer values and it contains all intervals [π, π] such that π, π β ZΒ±β where ZΒ±β β Z βͺ {ββ, +β} , and π β€ π. So, for the input of π we get πΌInt ({2, 4}) = [2, 4], while for the output we obtain the single point πΌInt (Jπ K{2, 4}) = [0, 0]. Then, for the two mentioned analysis of π we get Jπ KInt bca {2, 4} = [0, 10] and Jπ KInt πΌInt ({2, 4}) = [0, 100]. Note the three different interval properties obtained by the concrete, bca and abstract evaluations of π over Int with input {2, 4}: [0, 0] β€Int [0, 10] β€Int [0, 100]. These discrepancies are what we want to measure. To this end, we consider a weaker form of metric function that is made specific for the elements of an abstract domain, namely, it is compatible with the underlying ordering relation β€π΄ , hence taking into account the presence of incomparable elements. This distance function incorporates both the qualitative comparison given by the partial order and a quantitative comparison (Section 2). By exploiting this idea, we can introduce the notion of π-partial completeness of an abstract domain π΄ with respect to a given program and a given (set of) input values (Section 3). A partially complete abstract interpretation allows some false-alarms to be reported, but their number is bounded. In this case, the imprecision of the abstract interpreter is bounded by π, namely, the distance between the property represented by the abstraction of the concrete semantics and the result of the abstract interpretation on the given input, is at most π. Given a tolerance π of imprecision and a set of inputs πΌ, we want to answer the following questions (Section 4): Can we implement a program analyzer such that all programs having πΌ as input are π-partial complete? Can we prove if our program analysis has or not a bounded level of imprecision? 2. Quasi-metrics on Abstract Domains Our goal is to derive the bound of imprecision of an abstract interpreter with respect to a given measure over the abstract domain. For this reason, we need a metric to compare the elements of the abstract domain according to their relative degree of precision. We refer to the 2 Marco Campion et al. CEUR Workshop Proceedings 1β7 weaker notion of quasi-metric introduced in [8] on a non-empty set π. This is a metric function πΏ : π Γ π β Qβ₯0 whose symmetry property may not hold. A set endowed with a quasi-metric is called quasi-metric space. Let π(πΆ) be the set of all abstract domains π΄ abstracting some set of concrete properties πΆ and having ordering relation β€π΄ 2 . Definition 2.1 (Quasi-metrics π΄-compatible). We say that the distance function πΏπ΄ : π΄Γπ΄ β Nββ₯0 βͺ {β₯} is a quasi-metric π΄-compatible for π΄ β π(πΆ) if for all π1 , π2 , π3 β π΄, it satisfies the following axioms: (i) π1 = π2 β πΏπ΄ (π1 , π2 ) = 0 (ii) π1 β€π΄ π2 β πΏπ΄ (π1 , π2 ) ΜΈ= β₯ (iii) π1 β€π΄ π2 β€π΄ π3 β πΏπ΄ (π1 , π3 ) β€ πΏπ΄ (π1 , π2 ) + πΏπ΄ (π2 , π3 ) (iv) βπ1 , π2 β π΄, π β Nβ₯0 the predicate πΏπ΄ (π1 , π2 ) β€ π is decidable. We allow the quasi-metric between two uncomparable elements to be β₯, which represents an undefined distance. An abstract domain π΄ β π(πΆ) endowed with a quasi-metric π΄-compatible πΏπ΄ , forms an abstract quasi-metric space, denoted by π΄ β (π΄, πΏπ΄ ). We use A(πΆ) to refer to the set of all abstract quasi-metric spaces, and write π΄ β A(πΆ). As an example of quasi-metric π΄-compatible, we define the weighted path-length πΏπ΄ w as the minimum weighted path, w.r.t. a weight function w, of intermediate elements between two comparable elements of an abstract domain π΄. Here πΏπ΄ w considers the lattice of π΄ as a weighted directed graph where each edge corresponds to two adjacent abstract elements π, π, that is, π β€π΄ π and there are no elements π β π΄ such that π β€π΄ π β€π΄ π. So for example, πΏInt w over the intervals abstraction having w(Β·, Β·) = 1, i.e., setting to 1 the weight of each edge of Int, returns 9 between [0, 0] and [0, 10] meaning that there are exactly 9 more elements in [0, 10] respect to [0, 0], while πΏInt w ([0, 0], [0, +β]) = β and πΏ w ([0, 0], [1, 10]) = β₯ because [0, 0] ΜΈβ€ Int Int [1, 10]. 3. Partial Completeness Standard completeness in program analysis3 , e.g., see [2, 3, 13], means that no false alarms are returned by analyzing the program with an abstract interpreter on any possible input state. Local completeness, instead, is a recently introduced property [14] that requires completeness only with respect to specific inputs. An π-partially complete program analyzer allows some false alarms to be reported over a considered (set of) input, but their amount is bounded by a constant π which is determined according to a quasi-metric which is compatible with the abstract domain used by the analysis. Formally, given a program π β Programs, a constant bound π β Nβ₯0 , a non-empty set of stores π β β(S) and π΄ β A(β(S)), we say that π΄ is π-partially complete for π in π if πΌπ΄ (Jπ Kπ) β€ππΏπ΄ Jπ Kπ΄ πΌπ΄ (π), where π β€ππΏπ΄ π iff πΏπ΄ (π, π) β€ π. We now have all the ingredients to introduce the notion of π-partial completeness class of programs. 2 We consider abstract domains in π(πΆ) that are either recursive or trivial [9, 1]. 3 In some topics concerning abstract interpretation, completeness is typically recurrent, e.g., in comparative seman- tics [10, 11] or formal languages [12]. Here we are considering only the field of program analysis. 3 Marco Campion et al. CEUR Workshop Proceedings 1β7 Definition 3.1 (π-Partial completeness class). The partial completeness class of an abstract quasi-metric space π΄ β A(β(S)), a constant π β Nβ₯0 and a non-empty set of inputs π β β(S), denoted C(π΄, π, π) β Programs, is defined as: C(π΄, π, π) β {π β Programs | πΌπ΄ (Jπ Kπ) β€ππΏπ΄ Jπ Kπ΄ πΌπ΄ (π)}. Similarly to the completeness case [15], for every π, the π-partial completeness class is infinite and non-extensional. It is infinite because for all π β Nβ₯0 and π β β(S), C(π΄) β C(π΄, π, π) and C(π΄) is infinite, where C(π΄) denotes the (global) completeness class of programs. It is also non-extensional because there always exist programs π and π such that: π is partially complete for π΄, Jπ K = JπK, and π is not partially complete for π΄. 4. Recursive Properties of Partial Complete Programs It is well known that for trivial abstractions (i.e., either the identity or an abstraction having one element only) the corresponding completeness class turns out to be the whole set of programs [13]. Moreover, for all non-trivial abstractions in π(β(S)), its completeness class is strictly contained in Programs [15]. In this section, we study the counterpart of these results for the case of partial completeness. It turns out that the quasi-metric π΄-compatible chosen for measuring the imprecision of the analysis, plays an important rule here in order to determine the recursive properties of C(π΄, π, π) and, therefore, of the considered program analysis. We first consider the simplest case of abstract quasi-metric spaces of stores π΄ β A(β(S)) satisfying the property of having a limited imprecision by π β Nβ₯0 , i.e., abstractions such that βπ β π΄ : πΏπ΄ (β₯π΄ , π) β€ π. These include, for instance, the case of finite height lattices with the weighted path-length quasi-metricβcomplete lattices which are both ACC and DCC. Theorem 4.1. If π΄ β A(β(S)) has limited imprecision, then we have for all π β β(S): βπ β Nβ₯0 . C(π΄, π, π) = Programs . For example, the Sign β {Z, β, 0, +, β } abstraction [16] measured with πΏSign w having w(Β·, Β·) = 1, has limited imprecision because by for any π β₯ 3, the partial completeness class turns out C((Sign, πΏSign w ), π, π) = Programs. The difference with respect to the case of standard completeness class C(π΄) is that, thanks to the possibility of admitting an upper margin to imprecision (i.e., possible false alarms), there always exists a class of partial completeness with respect to a given bound which includes all programs. Conversely, an abstract quasi-metric space of stores π΄ β A(β(S)) has unlimited imprecision when: βπ β Nβ₯0 , βπ β π΄. πΏπ΄ (β₯π΄ , π) > π. The abstract quasi-metric space (Int, πΏInt w ) is such an example. Theorem 4.2. Let π΄ β A(β(S)) be any abstract quasi-metric space of stores with unlimited imprecision πΏπ΄ and π β β(S). Then, the following equivalence holds: βπ β Nβ₯0 . C(π΄, π, π) = Programs β π΄ = β(S). Informally, if we consider a non-trivial abstract quasi-metric space that has unlimited imprecision and an input π, then independently of how we set a threshold π of false alarms acceptance, there always exists a program π for which the abstract analysis over π΄ with input π, is not 4 Marco Campion et al. CEUR Workshop Proceedings 1β7 π-partially complete, namely π ΜΈβ C(π΄, π, π). By a straightforward padding argument, any of these programs can be extended to an infinite set of programs for which the abstraction is π-partially incomplete. The class of π-partial incomplete programs for π΄ β A(β(S)) with input π, is the complement set of C(π΄, π, π), formally: C(π΄, π, π) = {π β Programs | πΌπ΄ (Jπ Kπ) ΜΈβ€ππΏπ΄ Jπ Kπ΄ πΌπ΄ (π)}. Theorem 4.2 implies that any non-trivial abstract domain of stores endowed with an unlimited imprecision πΏπ΄ , has an infinite set of programs for which the abstract interpreter is π-partially incomplete. Conversely, if πΏπ΄ has limited imprecision, then, trivially, we can always find a certain level of tolerance that makes the analysis π-partially complete for all programs. We now study the computational limits of the class of partially complete and incomplete programs. In this case, the topological structure of π΄ in terms of ascending chains, is fundamental. We say that an abstract quasi-metric space of stores π΄ β A(β(S)) is π-trivial for some π β Nβ₯0 if C(π΄, π, π) = Programs. The following theorem shows that the class of programs C(π΄, π, π) turns out to be recursively enumerable (r.e.) whenever the abstract domain of stores π΄ satisfies the Ascending Chain Condition (ACC). Theorem 4.3. If π΄ β A(β(S)) is ACC, then for every π β β(S) and π β Nβ₯0 , C(π΄, π, π) is r.e.. Sketch. Run a dovetail algorithm constructing the set π β πβN {πΌπ΄ (Jπ Kπ π )} starting from β¨οΈ π = β₯, where π π β π is an enumeration of π. If, at some point of the iterates, there exists π π β π such that Jπ Kπ π ΜΈ= β , π = π βπ΄ πΌπ΄ (Jπ Kπ π ) and πΏπ΄ (π, Jπ Kπ΄ πΌπ΄ (π)) β€ π, then the algorithm terminates. If π β C(π΄, π, π) then the convergence of the above algorithms is guaranteed by the ACC of π΄. The following theorem states that both C(π΄, π, π) and C(π΄, π, π) are non-r.e. sets when π΄ is not π-trivial and not ACC. Theorem 4.4. If π΄ β A(β(S)) is not π-trivial, then C(π΄, π, π) is non-r.e.. Moreover, if π΄ is also not ACC, then C(π΄, π, π) is non-r.e.. As a straightforward corollary, the local completeness class C(π΄, π) for π΄ satisfying the ACC property is r.e., while non-r.e. for non-ACC abstractions. Furthermore, if π΄ is not trivial, then C(π΄, π) is non-r.e. even if π΄ is ACC. Let us notice that Theorems 4.3 and 4.4 provide a further insight into the structure of C(π΄, π, π) and its complement class C(π΄, π, π). These theorems prove that, given any non-ACC abstract quasi-metric space of stores π΄, whenever we limit the expected imprecision of our analysis to a bound π of possible false alarms w.r.t. an input π, we cannot build a procedure that enumerates all programs satisfying that bound or that do not respect that bound, unless the abstract domain is π-trivial. Therefore, deciding whether a static program analysis can produce or cannot produce some bounded set of false alarms is in general impossible, unless π΄ satisfies the ACC. In this last case, we can at least answer βyesβ if the static analysis has a bounded imprecision over a program with a specific input. The π-partial completeness and incompleteness class of an abstraction are therefore a non-trivial property of programs for which no recursively enumerable procedure may exist which is able to enumerate all of their elements. 5 Marco Campion et al. CEUR Workshop Proceedings 1β7 References [1] M. Campion, M. Dalla Preda, R. Giacobazzi, Partial (in)completeness in abstract interpreta- tion: Limiting the imprecision in program analysis, Proc. ACM Program. Lang. 6 (2022). doi:10.1145/3498721. [2] P. Cousot, R. Cousot, Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints, in: Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, ACM Press, 1977, pp. 238β252. doi:10.1145/512950.512973. [3] P. Cousot, R. Cousot, Systematic design of program analysis frameworks, in: Proceedings of the 6th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, ACM Press, 1979, pp. 269β282. doi:10.1145/567752.567778. [4] P. Cousot, Principles of Abstract Interpretation, The MIT Press, Cambridge, Mass., 2021. [5] G. Winskel, The formal semantics of programming languages: an introduction, MIT press, 1993. [6] A. MinΓ©, Tutorial on static inference of numeric invariants by abstract interpretation, Foundations and Trends in Programming Languages 4 (2017) 120β372. URL: https://doi. org/10.1561/2500000034. doi:10.1561/2500000034. [7] R. Bruni, R. Giacobazzi, R. Gori, I. Garcia-Contreras, D. Pavlovic, Abstract extensionality: on the properties of incomplete abstract interpretations, PACMPL 4 (2020) 28:1β28:28. doi:10.1145/3371096. [8] W. A. Wilson, On quasi-metric spaces, American Journal of Mathematics 53 (1931) 675β684. doi:10.2307/2371174. [9] P. Cousot, R. Giacobazzi, F. Ranzato, Program analysis is harder than verification: A computability perspective, in: International Conference on Computer Aided Verification, Springer, 2018, pp. 75β95. doi:10.1007/978-3-319-96142-2_8. [10] P. Cousot, R. Cousot, Inductive definitions, semantics and abstract interpretation, in: Conference Record of the 19th ACM Symp. on Principles of Programming Languages (POPL β92), ACM Press, 1992, pp. 83β94. [11] P. Cousot, R. Cousot, Compositional and inductive semantic definitions in fixpoint, equational, constraint, closure-condition, rule-based and game-theoretic form (Invited Paper), in: P. Wolper (Ed.), Proc. of the 7th Internat. Conf. on Computer Aided Verification (CAV β95), volume 939 of Lecture Notes in Computer Science, Springer-Verlag, 1995, pp. 293β308. [12] M. Campion, M. Dalla Preda, R. Giacobazzi, Abstract interpretation of indexed grammars, in: International Static Analysis Symposium, Springer, 2019, pp. 121β139. doi:10.1007/ 978-3-030-32304-2_7. [13] R. Giacobazzi, F. Ranzato, F. Scozzari., Making abstract interpretation complete, Journal of the ACM 47 (2000) 361β416. doi:10.1145/333979.333989. [14] R. Bruni, R. Giacobazzi, R. Gori, F. Ranzato, A logic for locally complete abstract interpre- tations, in: Proc. 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2021), IEEE Computer Society, 2021, pp. 1β13. doi:10.1109/LICS52264.2021.9470608, distinguished paper. [15] R. Giacobazzi, F. Logozzo, F. Ranzato, Analyzing program analyses, in: Proceedings 6 Marco Campion et al. CEUR Workshop Proceedings 1β7 of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2015, Mumbai, India, January 15-17, 2015, 2015, pp. 261β273. doi:10. 1145/2676726.2676987. [16] P. Cousot, R. Cousot, Static determination of dynamic properties of programs, in: Pro- ceedings of the 2nd International Symposium on Programming, Dunod, Paris, 1976, pp. 106β130. doi:10.1145/390019.808314. 7