=Paper=
{{Paper
|id=Vol-3326/ARQNL2022_paper4
|storemode=property
|title=(Re)moving Quantifiers to Simplify Parameterised Boolean Equation Systems
|pdfUrl=https://ceur-ws.org/Vol-3326/ARQNL2022_paper4.pdf
|volume=Vol-3326
|authors=Thomas Neele
|dblpUrl=https://dblp.org/rec/conf/cade/Neele22
}}
==(Re)moving Quantifiers to Simplify Parameterised Boolean Equation Systems==
(Re)moving Quantifiers to Simplify Parameterised Boolean Equation Systems⋆ Thomas Neele Eindhoven University of Technology, Eindhoven, The Netherlands Abstract We investigate the simplification of parameterised Boolean equation systems (PBESs), a first-order fixpoint logic, by means of quantifier manipulation. Whereas the occurrence of quantifiers in a PBES can significantly increase the effort required to solve a PBES, i.e., compute its semantics, existing syntactic transformations have little to no support for quantifiers. We resolve this, by proposing a static analysis algorithm that identifies which quantifiers may be moved between equations such that their scope is reduced. This syntactic transformation can drastically reduce the effort required to solve a PBES. Additionally, we identify an improvement to the computation of guards, which can be used to strengthen our static analysis. 1. Introduction A parameterised Boolean equation system (PBES) [2] is a sequence of equations of the shape σX(d:D) = ϕ, where σ ∈ {µ, ν} is a fixpoint sign, X is a predicate variable, d of type D is a sequence of parameters and ϕ is a predicate formula over d. This equation specifies a solution for X, which is a set of values d may take on and which depends recursively on other predicate variables in the PBES; the fixpoint sign indicates whether we are looking for a least (µ) or greatest (ν) set of values. PBESs are the first-order extension of Boolean equation systems (BESs) or, equivalently, parity games. Similar to BESs, they can be used to encode decision problems, and hence see various applications, including model checking of modal µ-calculus formulae [3] and equivalence checking of processes [4]. PBESs are strictly more expressive than BESs, however, since some decision problems cannot be finitely encoded and require the first-order data of a PBES. The general concept of equation systems with fixpoints over complete lattices is captured in hierarchical equation systems [5]. Since directly computing the solution of a PBES is not straightforward, the preferred approach is to instantiate [6] its underlying BES, if that is finite (for infinite BESs, symbolic techniques can be a reasonable alternative [7]). However, this BES may be prohibitively large, thus preventing us in practice from obtaining a solution through instantiation. ARQNL 2022: Automated Reasoning in Quantified Non-Classical Logics, 11 August 2022, Haifa, Israel ⋆ This work previously appeared in the PhD thesis of the author [1, Chapter 7]. $ t.s.neele@tue.nl (T. Neele) 0000-0001-6117-9129 (T. Neele) © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) ARQNL 2022 64 CEUR-WS.org This problem has been studied in the past, resulting in the creation of various syntactic transformations [8, 9, 10] for PBESs that aim to reduce the size of the underlying BES. A shortcoming, though, is that those techniques mostly have no support for quantifiers that may occur in the predicate formulae that form the right-hand side of equations. At the same time, these quantifiers can be (partially) responsible for the large size of the underlying BES. We address this, and achieve a larger reduction than previous work. Our approach is based on the observation that, under certain conditions, a quantifier may be moved from one equation to another, an operation which we call propagation. This effectively reduces the scope of said quantifier, and limits the impact it has on the size of the underlying BES. Since propagation cannot be applied to PBESs with cyclic dependencies, we instead propose to perform static anal- ysis to determine which predicate variables always occur in the same quantifier context, either directly or indirectly. The resulting information, which we call global propagated values, can be used to move quantifiers to other equations or even eliminate them completely. Our algorithm is a generalisation of the constant parameter elimination technique from [9]. Experiments with PBESs from the literature show that this may result in significant reduction of the size of the underlying BES, in one case even reducing an infinite BES to a finite one. The experiments also help us identify a couple of limitations, including the fact that our technique cannot deal with quantified variables whose domain is bounded by an expression. The static analysis can be strengthened by the application of guards: predicate formulae that characterise the dependency between predicate variables. We show how guards can be computed such that information about quantifiers is preserved. As a result, we produce stronger guards than those in [11]. Related Work We informally describe several existing syntactic transformation techniques for PBESs. The first technique is constant elimination [9], which identifies parameters whose value never changes. The analysis takes constant values from a target expression κ and propagates them through the PBES to determine whether their value stays constant. If so, this invariant [2, 12] is used to simplify the right-hand side of equations. Constant elimination by itself never results in state space reduction, but the resulting simplifications may speed up PBES instantiation and also enable other reduction techniques. The same paper also discusses redundant parameter elimination [9], which identifies parame- ters that do not influence any of the conditions in the PBES. This is achieved by extracting from the PBES an influence graph over parameters, together with a set of parameters that are consid- ered influential. Backwards reachability yields the set of all transitively influential parameters. Eliminating non transitively influential parameters from the PBES preserves its semantics, but reduces the size of the underlying BES. A generalisation is the state graph technique of [8], which distinguishes control flow parameters—parameters which can take a limited number of values and are updated deterministically—from data parameters, and performs an analysis of influential data parameters per control flow location. A data parameter that is not influential in a certain control flow location is reset to a fixed value when taking a transition to that location. Finally, abstractions may also benefit PBESs [10], and can for example be used to perform model checking of real-time systems. The approach in [10] requires a manual definition of the necessary abstraction, and is thus not fully-automated like the other works discussed above. 65 Overview We introduce basic concepts in Section 2. Section 3 provides a motivating example. Section 4 shows how to compute and apply global propagated values to simplify PBESs. We discuss guards in Section 5. Section 6 presents a small experiment and discusses limitations of the ideas. Finally, Section 7 concludes. 2. Preliminaries Sequences We adopt the following notation for finite sequences. Firstly, the empty sequence is denoted ǫ. Given a sequence v, v[i] denotes its ith element. If v is a sequence of length n, then a strictly increasing, total function f : {1, . . . , n′ } → {1, . . . , n} characterises a subsequence v′ of length n′ ≤ n, viz., v′ [i] = v[f (i)] for all 1 ≤ i ≤ n′ . The fact that v′ is a subsequence of v with characterising function f is denoted v′ ⊑f v. We omit the subscript f if it is not relevant. Furthermore, given two sequences v′ and v such that v′ ⊑f v, we write v[j] ∈ v′ iff there is an i such that f (i) = j; for such cases the partial inverse function f −1 yields f −1 (j) = i. We use list comprehensions to filter a sequence and obtain a subsequence, e.g., [v[i] | i mod 2 = 0] is the unique subsequence that contains the elements with an even index in v and only those. Lastly, we can calculate the complement of a subsequence, notation v \ v′ , which is defined as / v′ ]. To concatenate two sequences, we use ++. [v[i] | v[i] ∈ Abstract Data In this work, we consider abstract data types and expressions involving those types. Syntactic and semantic data types are distinguished: syntactic data sorts are denoted with D, D′ etc., while the corresponding semantic domains are denoted D, D′ etc. We assume that the domains are non-empty; we typically use letters v, w, . . . for the values they contain. In our examples, we may use B and N (domains B and N) to represent the sorts of Booleans and natural numbers, respectively. Expressions contain data variables (we typically use names such as d, d′ , d1 , . . . ) from the set V. The notation d:D indicates that the sort of variable d is D. Interpreting expressions is done by the interpretation function J·Kδ, which requires a data environment δ that maps variables to a value from their corresponding semantic domain. Intu- itively, Jf Kδ denotes the semantics of the expression f in the context of δ. The data environment is not relevant for interpreting an expression that does not contain variables, called a ground term, in that case we write Jf K. An update to an environment δ is written δ[v/d], defined as δ[v/d](d) = v and δ[v/d](e) = δ(e) for all e 6= d. We also allow sequences of values and sequences of variables in data environment updates, i.e., given a sequence of distinct variables d and a sequence of values v of equal length and equal sort, δ[v/d] is the data environment δ[v[1]/d[1]] . . . [v[n]/d[n]]. Equation Systems A parameterised Boolean equation system (PBES) is a sequence of equations over predicate formulae, where each equation is labelled with either a least or a greatest fixpoint. Here, we shortly introduce the relevant concepts, the interested reader is referred to [2] for a detailed discussion. Definition 1. A predicate formula is defined by the following grammar: ϕ ::= b | ϕ ∨ ϕ | ϕ ∧ ϕ | ¬ϕ | ∃d:D. ϕ | ∀d:D. ϕ | X(e) 66 where b is an expression of sort B, d is a variable of sort D, X is a predicate variable of sort DX → B, which is taken from some set X of sorted predicate variables and argument e of sort DX is a sequence of expressions. A shorthand for ¬ϕ ∨ ψ is ϕ ⇒ ψ. A predicate formula ϕ is interpreted in the context of a data environment δ and predicate environment η, which maps each predicate variable X ∈ X to a subset of the corresponding semantic domain DX . This is denoted JϕKηδ and inductively defined as: JbKηδ ⇔ JbKδ JX(e)Kηδ ⇔ JeKδ ∈ η(X) Jϕ ∧ ψKηδ ⇔ JϕKηδ and JψKηδ hold Jϕ ∨ ψKηδ ⇔ JϕKηδ or JψKηδ hold J¬ϕKηδ ⇔ JϕKηδ does not hold J∀d:D. ϕKηδ ⇔ for all v ∈ D, JϕKηδ[v/d] holds J∃d:D. ϕKηδ ⇔ for some v ∈ D, JϕKηδ[v/d] holds An expression of the shape X(e) is called a predicate variable instance (PVI). The set of all PVIs occurring in a formula ϕ is iocc(ϕ). A predicate formula is monotone if and only if all PVIs occur in the scope of an even number of negations. Substitution of a subformula ψ by ψ ′ in ϕ is denoted ϕ[ψ ′ /ψ]; this is also lifted to sequences, so we allow ϕ[ψ1′ , . . . , ψn′ /ψ1 , . . . , ψn ] in the same way as for environment updates. The set of data variables occurring freely in ϕ, i.e., those not bound in a quantifier, is vars(ϕ). Definition 2. A parameterised Boolean equation system (PBES) is a sequence of equations, as specified by the following grammar: E ::= ∅ | (µX(d:D) = ϕ)E | (νX(d:D) = ϕ)E where ∅ is the empty PBES, µ denotes the least and µ denotes the greatest fixpoint, X ∈ X is a predicate variable, d of sort D is a sequence that contains the parameters of X, which are pairwise distinct, and ϕ is a monotone predicate formula. As we will see later in de definition of the semantics of a PBES, the order of equations is relevant, and a PBES is thus not simply a set of equations. We do not write the trailing ∅ and typically use σ to refer to an arbitrary fixpoint. The variables that occur on the left-hand side of an equation in a PBES E are bound in E; bnd(E) is the set of variables bound in E. A PBES E is well-formed if and only if for every X ∈ bnd(E), there is precisely one equation in E We say E is closed if it does not contain free (predicate) variables, i.e., all variables occurring in the right- hand side of an equation σX(d:D) = ϕ are either bound as parameter in d or by a quantifier that occurs in ϕ, while all predicate variables occurring in ϕ must be contained in bnd(E). From here on, we only consider closed, well-formed PBESs. Therefore, each X ∈ bnd(E) uniquely identifies the components of the corresponding equation, and we thus denote the parameters of X as dX :DX and its right-hand side as ϕX . The signature of E is defined as sig(E) = {(X, DX ) | X ∈ bnd(E)}. Lastly, a PBES that does not contain parameters, i.e., dX = ǫ for all X, is also called a Boolean equation system (BES). The denotational semantics of a PBES is a predicate environment, called the solution. Definition 3. Given environments η and δ, the solution of a PBES E is inductively defined as: J∅Kηδ = η 67 J(µX(dX :DX ) = ϕX )EKηδ = JEKη[µTX /X]δ J(νX(dX :DX ) = ϕX )EKηδ = JEKη[νTX /X]δ where TX is the function TX (R) = {v ∈ DX | JϕX K(JEKη[R/X]δ)δ[v/dX ]}. Since, for bound variables, the solution of a closed PBES is independent of the environments η and δ, we may also plainly write JEK. The solution η = JEK satisfies each equation in E (it holds that v ∈ η(X) ⇔ JϕX Kηδ[v/d] for all X ∈ bnd(E) and δ), while giving priority to the fixpoints of the equations that occur early in the PBES. Since each predicate formula is monotone, the function TX is also monotone and, by Tarski’s theorem, it has unique least and greatest fixpoints in the complete lattice (2D , ⊆). Example 1. Consider the PBES E, defined as µX(b:B) = Y (b) ∨ X(true) νY (b:B) = Y (b) ∧ (b ⇒ X(b)) η[µTX /X] The semantics JEKηδ unfolds to η[µTX /X][νTY /Y ]. Here, the greatest fixpoint νTYη of the function TYη (P ) = {v ∈ B | JY (b) ∧ (b ⇒ X(b))K(J∅Kη[P/Y ]δ)δ[v/b]} = {v ∈ B | JY (b) ∧ (b ⇒ X(b))Kη[P/Y ]δ[v/b]} = {v ∈ B | JY (b)Kη[P/Y ]δ[v/b] ∧ Jb ⇒ X(b)Kη[P/Y ]δ[v/b]} = {v ∈ B | v ∈ P ∧ (v ⇒ v ∈ η(X))} depends on η(X). If true ∈ η(X), then νTYη = B, otherwise νTYη = {false}. Hence, false ∈ νTYη holds regardless of η. Consequently, for the least fixpoint µTX of the function TX (R) = {v ∈ B | JY (b) ∨ X(true)K(JνY (b:B) = Y (b) ∧ (b ⇒ X(b))Kη[R/X]δ)δ[v/d]} η[R/X] = {v ∈ B | JY (b) ∨ X(true)K(J∅Kη[R/X][νTY /Y ]δ)δ[v/d]} η[R/X] = {v ∈ B | JY (b) ∨ X(true)Kη[R/X][νTY /Y ]δ[v/d]} η[R/X] = {v ∈ B | v ∈ νTY ∨ true ∈ R} we must have false ∈ µTX . The set {false} is thus the least fixpoint µTX . It follows that JEKηδ(X) = JEKηδ(Y ) = {false}. Note that if the equations of E are swapped to create E ′ , then the greatest fixpoint of Y becomes dominating and we have JE ′ Kηδ(X) = JE ′ Kηδ(Y ) = B. Typically, we are not interested in the complete solution of a PBES, but we only want to know whether JêK ∈ JEK(X̂) holds for some target PVI X̂(ê), where e is a sequence of ground terms. The most common way to partially solve a PBES E, i.e., compute JêK ∈ JEK(X̂), is through a procedure called instantiation [6], akin to state-space exploration in model checking. Starting from the target PVI, the successors of a node X(e) are discovered by traversing the right-hand side ϕX [e/dX ]. The result is a BES (or sometimes a parity game) containing one equation Xe for each X(e) reachable from X̂(ê). This BES can be solved through Gauss elimination [13] or one of the popular parity game algorithms, e.g., Zielonka’s recursive algorithm [14]. 68 s := ǫ length(s) ≤ 2 s 6= ǫ push(k) pop(head(s)) s := (k mod 6) ⊲ s s := tail(s) Figure 1: Extended finite state machine modelling a stack. 3. Motivating Example To motivate our approach, we consider a small example of a stack that can hold at most two natural numbers. However, for every number k that one tries to push onto the stack, the remainder k mod 6 is stored. See the extended finite state machine in Figure 1. The stack is stored in variable s, which is initially set to the empty list. Each transition is labelled with a condition, an action with parameters and a variable update (in that order). The leftmost transition can be executed for any natural number k. Here, ǫ denotes the empty list and the functions head and tail respectively take the first element and the remainder of a list. Elements are prepended to a list with the operator ⊲. This machine has a state space consisting of 43 states and an infinite number of transitions. Suppose that we want to check the property “there is a number n for which there is an execution in which push(n) is enabled infinitely often”. In the first-order modal µ-calculus, this can be expressed by the formula ϕ, defined as ϕ z }|1 { ϕ = ∃n:N. νX. (µY. h⊤iY ∨ (hpush(n)itrue ∧ X)), | {z } | {z } ϕ2 ϕ3 where ⊤ indicates the set of all actions and subformulae ϕ1 , ϕ2 and ϕ3 are as indicated. The question whether the bounded stack satisfies this formula is answered by the solution for Z(ǫ) in the following PBES (how this PBES originates from (subformulae of) ϕ is indicated on the right): νZ(s:List(N )) = ∃n:N. X(s, n) }ϕ νX(s:List(N ), n:N ) = Y (s, n) } ϕ1 ) µY (s:List(N ), n:N ) = (∃k:N. length(s) ≤ 2 ∧ Y (k mod 6 ⊲ s, n)) ϕ2 ∨ (s 6= ǫ) ∧ Y (tail(s), n) ∨ (length(s) ≤ 2) ∧ X(s, n) } ϕ3 Here, we used the parameterised List data sort to store a finite stack of natural numbers. In the right-hand side of Y , the quantifier ∃k:N in the first disjunct stems from the fact that action push(k) can be executed for any k. The second disjunct corresponds to executing pop(head (s)). The last disjunct expresses that push(n) is enabled (subformula ϕ3 ), which is the case if and only if length(s) ≤ 2. Due to the quantifier in the right-hand side of the equation for Z, we are not able to finitely instantiate this PBES. A first observation that may help to resolve this issue is that in some 69 cases we can use (reverse) substitution of predicate variables to move quantifiers. For example, assuming W only occurs in the right-hand side of V , the following operations preserve [2] the semantics of V : J(νV = ∀d′ :D. W (d′ ))(µW (d:D) = ϕW )Kηδ(V ) = J(νV = ∀d′ :D. ϕW [d′ /d])Kηδ(V ) = J(νV = W )(µW = ∀d′ :D. ϕW [d′ /d])Kηδ(V ) In the first step, we substitute the right-hand side of Y for the PVI W (d′ ); in the second step we do the reverse, but also move the quantifier ∀d′ :D. We managed to propagate the quantifier from the right-hand side of V through the PVI W (d) to the right-hand side of W , in the process eliminating W ’s parameter and thus reducing the signature of the PBES. A similar transformation can be applied if W occurs multiple times in the surrounding PBES, as long as it always occurs in the context of the same quantifiers. In our motivating example, this approach cannot be applied due to the circular dependency between X and Y . However, observe that after a value is chosen for n in the right-hand side of Z, it does not change any more. Thus, however many substitutions we perform in the right-hand side of Z, all occurrences of X and Y have the shape X(f, n) and Y (f, n), where f is some expression. This suggests that all PVIs for X and Y indirectly occur in the context of the quantifier ∃n:N . In the next section we show how to identify such cases and how to exploit them to reduce the signature of a PBES. 4. Global Propagated Values As illustrated by our motivating example, we need an analysis of the complete PBES, and generally cannot rely on local propagation of quantifiers between equations. Our analysis yields so called global propagated values, which indicate when a predicate variable (indirectly) occurs in the context of the same quantifiers throughout the PBES. Before we discuss these ideas in detail, we first introduce several more preliminary concepts. We generally use Q ∈ {∀, ∃} to denote an arbitrary quantifier and Q ∈ ({∀, ∃} × V)* to denote a finite sequence of quantified variables. Each element in such a sequence is a pair of a quantifier and a sorted variable, which is not necessarily unique in Q. The function v extracts the variables, i.e., v((Q, d)) = d; lifted to sequences, we have v(Q) = v(Q[1]), . . . , v(Q[n]). As a generalisation of PVIs, we also consider quantified predicate variable instances (QPVIs). These have the shape Q. X(e): a PVI preceded by zero or more quantifiers. The recursive function qiocc extracts maximal QPVIs from a predicate formula: qiocc(Q, b) = ∅ qiocc(Q, ¬ϕ) = qiocc(ǫ, ϕ) qiocc(Q, ϕ ∧ ψ) = qiocc(ǫ, ϕ) ∪ qiocc(ǫ, ψ) qiocc(Q, ϕ ∨ ψ) = qiocc(ǫ, ϕ) ∪ qiocc(ǫ, ψ) qiocc(Q, ∀d:D. ϕ) = qiocc(Q ++ [(∀, d)], ϕ) qiocc(Q, ∃d:D. ϕ) = qiocc(Q ++ [(∃, d)], ϕ) qiocc(Q, X(e)) = {Q. X(e)} where Q is a sequence of quantified variables. Furthermore, we overload qiocc, such that we have qiocc(ϕ) = qiocc(ǫ, ϕ). 70 To reason about QPVIs, parts of the expression sequence e they carry and the quantifier context they occur in, we introduce the concept of propagated values. Definition 4. A propagated value is a tuple (Qp , dp , ep ), where Qp is a sequence of quantified variables, dp is a sequence of variables and ep is a sequence of expressions and dp and ep are of equal length and sort. Henceforth we will denote a propagated value with Qp . dp := ep . The propagated value (ǫ, ǫ, ǫ) is denoted ⊥. The relation between QPVIs and propagation values is captured in the propagation relation. Definition 5. The propagation relation ֒→, which relates QPVIs and propagated values, is the largest relation such that for all (Q. X(e)) ֒→ Qp . dp := ep the following conditions hold: • Qp is a suffix of Q; and • there is a characterising function f such that ep ⊑f e and dp ⊑f dX ; and • vars(e \ ep ) ∩ v(Qp ) = ∅ and vars(ep ) ⊆ v(Qp ). Note that every element in dX is unique, so for every sequence dp , there is at most one function f such that dp ⊑f dX . Thus, given Q. X(e) ֒→ Qp . dp := ep , the sequence dp uniquely determines ep . Let prop(Q. X(e)) denote left projection of ֒→, i.e., prop(Q. X(e)) = {y | Q. X(e) ֒→ y}. A propagated value y = Q′ . d := e is associated to a predicate variable X iff y ∈ S ′ X Q,e prop(Q. X(e)). To indicate this, we write Q . d := e. A propagated value may be associ- ated to more than one predicate variable, e.g., the propagated value ⊥ is associated to all predicate X variables. Given some predicate variable X and two propagated values y = Qp . dp := dp and X y ′ = Q′p . d′p := e′p , we write y ≼X y ′ if and only if, for some f , dp ⊑f d′p , ep ⊑f e′p and Qp is a suffix of Q′p . Note again that there is at most one such f , due to the uniqueness of elements in dX . In general, the preorder ≼X gives rise to a unique infimum. When considering only the set prop(Q. X(e)), the preorder ≼X also yields a unique supremum. Moreover prop(Q. X(e)) is finite, so (prop(Q. X(e)), ≼X ) is a complete lattice. Although the correctness of our theory does not depend on choosing a specific W propagated value in prop(Q. X(e)), we are often interested in the supremum, denoted with prop(Q. X(e)), since it contains the most information about Q. X(e). Example 2. Let X be a predicate variable, with parameters dX = (d1 , d2 , d3 ). We have prop(∀m:N. ∃n:N. X(d3 , m + d2 , 2n)) = {⊥, ∃n:N. d3 := 2n}. The parameters d1 and d2 do not occur in a propagated value since the corresponding expressions d3 and m + d2 respectively contain variables d3 and d2 that are not bound in the surrounding quantifiers, and hence violate the condition vars(ep ) ⊆ v(Qp ). Consequently, vars(e \ ep ) contains at least m, d2 and d3 ; the quantifier ∀m thus cannot occur in a propagated value since this would violate the condition vars(e \ ep ) ∩ v(Qp ) = ∅. For the QPVI ϕ = ∃n:N. X(3, d2 , n + 1), we have W prop(ϕ) = ∃n:N. d1 , d3 := 3, n + 1. In general, the procedure of quantifier propagation based on forward and backward substitu- tion (outlined in Section 3) does not preserve the semantics if the variable we substitute occurs 71 in multiple QPVIs or in a dependency cycle. To perform an analysis of such cases, we need to reason about information that is common among two QPVIs. This information is exactly captured by the lattice (prop(Q. X(e)) ∩ prop(Q′ . X(e′ )), ≼X ). Furthermore, we also want to reason about which predicate variables are unreachable from the target PVI X̂(ê). For this, we introduce the special value ⊤ such S that y ≼X ⊤ for all propagated values y. The definition of the infimum ∧· in the lattice ( Q,e prop(Q. X(e)) ∪ {⊤}, ≼X ) follows in the standard way. We lift ∧· to sets in the ordinary way (note that ∧· is commutative and associative); ∧· of an empty set yields ⊤. Algorithm 1: Computing shared information of two propagated values. X X Input: Q1 . d1 := e1 , Q 2 . d 2 : = e2 1 Set f1 , f2 such that d1 ⊑f1 d and d2 ⊑f2 d; 2 Q := longest common suffix of Q1 and Q2 ; −1 −1 −1 3 d := [dX [i] | dX [i] ∈ d1 ∧ dX [i] ∈ d2 ∧ e1 [f1 (i)] = e2 [f2 (i)] ∧ vars(e1 [f1 (i)]) ⊆ v(Q)]; −1 4 e := [e1 [f1 (i)] | dX [i] ∈ d]; 5 del := vars(e1 \ e) ∪ vars(e2 \ e); 6 while del 6= ∅ do 7 Q := longest suffix Q′ of Q such that v(Q′ ) ∩ del = ∅ ; 8 d := [d[i] | vars(e[i]) ⊆ v(Q′ )] ; 9 e′ := [e[i] | vars(e[i]) ⊆ v(Q′ )] ; 10 del := vars(e \ e′ ) ; 11 e := e′ ; X 12 return Q. d := e X X To support the intuition, Algorithm 1 shows how to compute Q1 . d1 := e1 ∧· Q2 . d2 := e2 . Initially, we take the quantifiers shared by Q1 and Q2 (line 2) and the parameters that are shared by d1 and d2 and for which the corresponding expressions in e1 and e2 are matching and bound in Q (line 3). The list of expressions e is initialised to match d (line 4). Then, we collect variables from expressions in e1 and e2 that did not make it into e (line 5). If such variables exist, we ensure that v(Q) does not contain deleted variables (line 7) and update d and e to take into account variables that are no longer bound in Q (lines 8, 9 and 11). The update of e might cause more variables to be deleted, so we update del (line 10) and iterate. Once this process finishes, the conditions v(Q) ∩ (vars(e1 \ e) ∪ vars(e2 \ e)) = ∅ and vars(e) ⊆ v(Q) are satisfied, and we return. Example 3. Consider the following PBES: νX(n:N ) = Q1 . X(e1 ) ∧ Q2 . X(e2 ) µY (n1 , n2 , n3 , n4 , n5 :N ) = X(n1 + n2 + n3 + n4 + n5 ) where Q1 . X(e1 ) = ∀m1 :N. ∃m2 :N. ∀m3 :N. ∃m4 :N. Y (n + m1 , m2 , m3 + 2, m3 , m4 ) 72 Q2 . X(e2 ) = ∀m1 :N. ∀m2 :N. ∀m3 :N. ∃m4 :N. Y (n + m1 , m2 , 5m3 , m3 , m4 ) We have W X prop(Q1 . X(e1 )) = ∃m2 :N. ∀m3 :N. ∃m4 :N. n2 , n3 , n4 , n5 := m2 , m3 + 2, m3 , m4 W X prop(Q2 . X(e2 )) = ∀m2 :N. ∀m3 :N. ∃m4 :N. n2 , n3 , n4 , n5 := m2 , 5m3 , m3 , m4 X W W Q. d := e = prop(Q1 . X(e1 )) ∧· prop(Q2 . X(e2 )) = (∃m4 :N. n5 := m4 ) Variable n is not bound in Q1 or Q2 , so n + m1 does not occur in either propagated value by the condition vars(e) ⊆ v(Q). The longest common suffix of quantifiers is (∀, m3 )(∃, m4 ). Thus, m2 is also not contained in e, by the same condition. Furthermore, the expressions m3 + 2 and 5m3 are not equivalent, so parameter n3 is excluded from d. Consequently, (∀, m3 ) ∈ / Q, to satisfy the condition vars(e1 \ e) ∩ v(Q) = ∅, and m3 ∈ / e, to satisfy vars(e) ⊆ v(Q). Let X̂(ê) be a target PVI for which we want to know the solution. Then, given a PBES E, we can detect globally quantified parameters with the following algorithm. Here, the function qi distributes quantifiers over other logical operators. For now, assume that guardX(e) (ϕ) always returns true; its function will be explained later. X̂ pv0 (X̂) = (dX̂ := ê) for every X ∈ bnd(E) \ {X̂}, pv0 (X) = ⊤ for every X ∈ bnd(E), ^ W pvi+1 (X) = pvi (X) ∧· ■ { prop(Q′ . X(e′ )) | ∃Y ∈ bnd(E). Y pvi (Y ) = (Q. d := e) ∧ (Q′ . X(e′ )) ∈ qiocc(qi(Q. ϕY [e/d])) ∧ ∃δ. JguardX(e) (Q. ϕY [e/d])Kδ} ^ Output: gpv = ■ pvi i≥0 X̂ Initially, we take the ground terms from ê and construct a propagated value dX̂ := ê. For other variables, the initial propagated value is set to ⊤. In every iteration, we construct right-hand sides based on the propagated values we have found so far, in a similar fashion as quantifier propagation. From the new right-hand sides, we extract quantified predicates and compute their infimum with the current propagated value. This propagation of information continues until we reach a fixpoint gpv. We remark that this algorithm generalises the algorithm for finding constant parameters [9], since any constant value computed by the latter algorithm is also found by our algorithm. The information that gpv yields is applied as follows. In an equation σX(d:D) = ϕX , where X gpv(X) = Q. d := e, we apply the substitution d := e to ϕX and also add the quantifiers Q to ϕX . Formally, given a PBES E, a target PVI X̂(ê) and the global propagated value function gpv, this is captured by the function global-prop: global-prop(∅) = ∅ 73 global-prop((σX(dX :DX ) = ϕX )E ′ ) = ( (σX(dX :DX ) = Q. ϕX [e/d])global-prop(E ′ ) X if gpv(X) = Q. d := e global-prop(E ′ ) if gpv(X) = ⊤ First, note that if gpv(X) = ⊥ for all X ∈ bnd(E), then global-prop does not modify E. X Otherwise, if gpv(X) = Q. d := e, the parameters of dX that are also contained in d do not occur any longer in Q. ϕX [e/d], the new right-hand side of X. Those parameters are now redundant, and they can be eliminated with the redundant parameter elimination technique from [9]. Subsequently, the quantifier-inside rewriter can eliminate quantifiers from QPVIs where parameter elimination has reduced the set of free variables. If gpv(X) = ⊤, then the equation for X is unreachable from X̂(ê), and it is removed from E. Example 4. Consider the target PVI X(4) and the PBES below: µX(n:N ) = ∀m′ :N. Y (n, m′ ) νY (n, m:N ) = Z(n, m) ∧ (n ≥ 2 ⇒ Y (n/2, m)) ∧ (n ≤ 5 ⇒ Y (n + 7, m)) µZ(n, m:N ) = (n < m ∧ Z(n, m − 1)) ∨ (n ≥ m) The globally propagated values relative to X(4) are iteratively computed as follows: pv0 (X) = n := 4 pv0 (Y ) = ⊤ pv0 (Z) = ⊤ ′ ′ pv1 (X) = n := 4 pv1 (Y ) = ∀m :N. n, m := 4, m pv1 (Z) = ⊤ ′ ′ pv2 (X) = n := 4 pv2 (Y ) = ∀m :N. m := m pv2 (Z) = ∀m′ :N. n, m := 4, m′ pv3 (X) = n := 4 pv3 (Y ) = ∀m′ :N. m := m′ pv3 (Z) = n, m := n, m ′ ′ pv4 (X) = n := 4 pv4 (Y ) = ∀m :N. m := m pv4 (Z) = ⊥ ′ ′ gpv(X) = n := 4 gpv(Y ) = ∀m :N. m := m gpv(Z) = ⊥ Parameter n is not constant in the equation for Y , since its value may change through the PVI Y (n + 7, m). That updated value can also reach Z through the PVI Z(n, m), hence n is also not constant in Z. Applying the global propagated values yields: µX(n:N ) = ∀m′ :N. Y (4, m′ ) νY (n, m:N ) = ∀m′ :N. (Z(n, m′ ) ∧ (n ≥ 2 ⇒ Y (n/2, m′ )) ∧ (n ≤ 5 ⇒ Y (n + 7, m′ ))) µZ(n, m:N ) = (n < m ∧ Z(n, m − 1)) ∨ (n ≥ m) After redundant parameter elimination, which removes parameter n of X and m of Y , and distributing quantifiers over other operators (eliminating them when the variable they bind does not occur), we obtain µX = Y (4) νY (n:N ) = (∀m′ :N. Z(n, m′ )) ∧ (n ≥ 2 ⇒ Y (n/2)) ∧ (n ≤ 5 ⇒ Y (n + 7)) µZ(n, m:N ) = (n < m ∧ Z(n, m − 1)) ∨ (n ≥ m) This PBES has fewer parameters and should thus be easier to solve. In particular, we avoid the instantiation of many nodes Y (n, m) for all different values of m. For Z(n, m), though, we still require infinitely many nodes for all values of m. 74 Next, we shortly revisit the motivating example from Section 3. Example 5. Consider again the PBES from Section 3. The global propagated values we obtain for the target PVI Z(ǫ) are: gpv(Z) = s := ǫ gpv(X) = ∃n:N. n := n gpv(Y ) = ∃n:N. n := n Thus, the simplified PBES obtained after applying the global-prop function, redundant parameter elimination and quantifier elimination is: νZ = X(ǫ) νX(s:List(N )) = Y (s) νY (s:List(N )) = (∃k:N. length(s) ≤ 2 ∧ Y (k mod 6 ⊲ s)) ∨ (s 6= ǫ) ∧ Y (tail (s)) ∨ length(s) ≤ 2 ∧ X(s) All references to the variables n introduced by the quantifier in the mu-calculus formula of Section 3 have been eliminated. The simplified PBES has a finite underlying BES, and is thus much more easy to solve than the PBES we started out with. We have two theorems stating the correctness of the function global-prop. The first theorem claims that the computation of global propagated values terminates. Theorem 1. Quantified predicate analysis terminates for every PBES E. Proof. Let E be a PBES. In every iteration i, there is at least one X ∈ bnd(E) for which either (i) X X pvi (X) changes from ⊤ to Q. d := e; or (ii) if pvi (X) = Q. d := e, then some d[i] is removed from d or some prefix from Q is removed (or both). Since E consists of a finite number of equations, each with a finite number of parameters, this process must terminate. The second theorem states that the function global-prop preserves the semantics of the target PVI. A proof can be found in [1, Theorem 7.25]. Theorem 2. Let E be a closed PBES and X̂(ê) some target PVI. Then JêK ∈ JEK(X̂) if and only if JêK ∈ Jglobal-prop(E)K(X̂). 5. Guards for Predicate Formulae The analysis of the previous section relies solely on static dependencies between predicate variables, by extracting all QPVIs Q. X(e) in a right-hand side with the function qiocc. It does not take into account any other parts of the predicate formula, which might cause subformula X(e) to be irrelevant. For example, in the formula n ≤ 2 ∧ X(m), X(m) is irrelevant if the constant n = 7 was deduced. A formula that characterises for which values of m the PVI X(m) is relevant, is called a guard (a formal definition follows). Guards also play an important role in other types of static analysis of PBESs. The concept of guards first appeared in [9], although it only shows how to construct a guard for a normal 75 form called predicate formula normal form (PFNF). Keiren et al. [8] propose a function that over-approximates guards for arbitrary predicate formulae. Correctness results for this guard function can be found in Keiren’s thesis [11], although the proof contains a small issue (see the discussion following Theorem 3). Here, we improve on these results by strengthening the guards we compute. Before we give a formal definition of a guard, we first introduce several auxiliary notions. Firstly, we have logical equivalence: ϕ ≡ ψ iff for all η and δ, JϕKηδ = JψKηδ. The number of PVIs occurring in ϕ is denoted with npred(ϕ) and the ith PVI in ϕ is PVI(ϕ, i). We write ϕ[i 7→ ψ] to indicate the syntactic replacement of the ith PVI in ϕ (counted from left to right) by ψ. To denote the simultaneous substitution of the ith PVI by some formula ψi for all 1 ≤ i ≤ npred(ϕ) that satisfy the condition f , we write ϕ[i 7→ ψi ]f (i) . We now formally introduce the concept of guards, which originates from [11, Lemma 6.27]. Definition 6. Given a predicate formula ϕ, a formula ψ which does not contain PVIs is a guard for the ith PVI of ϕ if and only if ϕ ≡ ϕ[i 7→ ψ ∧ PVI(ϕ, i)]. To understand the intuition behind guards, consider an equation σX(d:D) = ϕ, the ith PVI in ϕ, PVI(ϕ, i) = Y (e) and associated guard ψ. The guard ψ expresses for which values of e the PVI Y (e) is relevant in ϕ. After all, if JψKδ is false, then JY (e)Kηδ is not relevant to the value of the subformula ψ ∧ Y (e). Thus, in general, a guard over-approximates the true dependency between X and Y , as far as that dependency originates from the occurrence PVI(ϕ, i). Remark that true is a guard for any PVI. We further demonstrate the concept in the next example. Example 6. Consider the predicate formula ϕ = (∃m:N. m = 6 ∧ (W (m) ∨ X)) ∨ (Y ∧ (¬b ⇒ Z)) The PVIs W (m) and X are guarded in the same way: m = 6 is a guard for both these PVIs. From this, we can deduce that replacing W (m) by W (6) preserves logical equivalence of ϕ. For Y , the only guard is true; Z also has ¬b as guard. Hence, if we somehow deduce that b never takes on the value false, the PVI Z can be eliminated from ϕ. The following example shows that the substitution applied in the definition of a guard can yield unexpected results when a variable d has both bound and free occurrences. Example 7. Consider the formula ϕ = (n < 2) ∧ ∀n:N. X(n). The PVI X(n) occurs in the conjunctive context of n < 2, which at first sight leads to believe that n < 2 is a guard. However, we have the logical inequivalence ϕ 6≡ ϕ[1 7→ (n < 2) ∧ X(n)], since the new occurrence of n introduced by the substitution is bound instead of free. Hence, n < 2 is not a guard. To identify situations such as in the latter example, we say ϕ is capture-avoiding iff no free variable of ϕ has a bound occurrence in ϕ, i.e., there is no subformula Qd:D. ϕ′ of ϕ such that d ∈ vars(ϕ). Remark that any predicate formula can be transformed into an equivalent capture-avoiding formula by performing the appropriate alpha conversion, that is, renaming the variables bound in quantifiers. The function guard computes guards for capture-avoiding formulae. 76 Definition 7. Let ϕ be a capture-avoiding predicate formula and i ≤ npred(ϕ). Then, the function guard is defined inductively as follows: guardi (Y (e)) = true guardi (¬ϕ) = guardi (ϕ) ( i s(ϕ) ∧ guardi−npred(ϕ) (ψ) if i > npred(ϕ) guard (ϕ ∧ ψ) = s(ψ) ∧ guardi (ϕ) if i ≤ npred(ϕ) ( s(¬ϕ) ∧ guardi−npred(ϕ) (ψ) if i > npred(ϕ) guardi (ϕ ∨ ψ) = s(¬ψ) ∧ guardi (ϕ) if i ≤ npred(ϕ) guardi (∀d:D. ϕ) = s(∀d:D. ϕ) ∧ guardi (ϕ) guardi (∃d:D. ϕ) = s(¬∃d:D. ϕ) ∧ guardi (ϕ) where s(ϕ) = ϕ[i 7→ true]i≤npred(ϕ) s(¬ϕ) = ¬ϕ[i 7→ false]i≤npred(ϕ) The definition does include the case guardi (b), since it cannot occur due to i ≤ npred(ϕ). In [8], the function s was defined as s(ϕ) = ϕ if npred(ϕ) = 0, and true otherwise. Furthermore, quantifiers were not taken into account. That results in guard yielding a weaker formula than our guard function does. The next theorem shows that our function guard indeed yields guards and that these may be simultaneously applied to all PVIs in a predicate formula. For the proof of this theorem, we refer to [1, Theorem 7.38]. Theorem 3. For all capture-avoiding formulae ϕ, it holds that ϕ ≡ ϕ[i 7→ (guardi (ϕ) ∧ PVI(ϕ, i))]i≤npred(ϕ) Remark that the assumption that ϕ is capture-avoiding is necessary for the correctness of the guard function. Without this assumption, we may compute guard1 ((n ≤ 2) ∧ ∀n:N. X(n)) = (n ≤ 2), which is not a guard (see also Example 7). This predicate formula is also a counter- example to the correctness of [11, Lemma 6.27], which is the equivalent of Theorem 3 in the current work. Adding the assumption that ϕ is capture-avoiding resolves the issue. 6. Implementation The proposed generalisations for constant elimination and guards have been implemented as an extension of the tool pbesconstelm, which implements constant elimination and is part of the mCRL2 toolset [15], available through https://mcrl2.org. The analysis of quantifiers can be enabled by the command line option --check-quantifiers. Before computing global propagated values with the fixpoint algorithm of Section 4, all guards and occurrences of QPVIs are first obtained by recursively traversing each right-hand side in the PBES. Furthermore, we 77 gather information on the free occurrences of variables, such that the effect of the quantifier- inside rewriter can be approximated. Hence, no traversal of the right-hand side (which can be very large) is required during the fixpoint computation itself. To demonstrate a possible application of our ideas, we first perform a small experiment with a model of the alternating bit protocol (ABP) [16], where the data domain D has been restricted to only five elements1 . The transition system corresponding to this model has 182 states. We consider the property ∀d:D. νW. [⊤]W ∧ νX. µY. νZ. ([r(d)]X ∧ (hr(d)itrue ⇒ [r(d)]Y ) ∧ [r(d)]Z which expresses that whenever r(d) is enabled infinitely often, then it also taken infinitely often, i.e., it is treated fairly. This formula occurred earlier in [11]. Observe that the universally quantified value d does not occur meaningfully in the fixpoint W . We thus expect that the same quantifier in the corresponding PBES can be eliminated. We take the PBES that encodes this formula on the ABP and instantiate it to a BES, which has 3641 nodes. Applying the original constant elimination algorithm from [9] on the PBES does not reduce this number. After applying our generalised algorithm that deals with quantifiers, the instantiated BES is reduced to 2913 nodes. Our second experiment concerns the cache coherence protocol (CCP), as modelled in [17]. The instance we consider has two threads, two processes and one region. We slightly altered the specification, such that process identifiers are modelled with natural number instead of a dedicated finite sort; this does not change its behaviour. We consider the property that, if the system is stable, each region has no more copies than the number of processors minus one, formulated in [11] as ∀procId :N. ¬ µX. h⊤i ∨ (hc_copyitrue ∧ hlockempty(procId )itrue ∧ hhomequeueempty(procId )itrue ∧ hremotequeueempty(procId )itrue) The corresponding PBES contains a QPVI of the shape ∀procId :N. X(d, procId ) and has an underlying BES of infinite size; the PBES can hence not be instantiated. However, after applying global propagation, the size of the BES is reduced to 50507 nodes. Global propagation also achieves a reduction when applied to the original specification, where the sort of variable procId contains only two elements. In that case, the size of the BES is reduced from 101197 to 50507 nodes. Experiments with several other model checking PBESs yielded no additional reduction over what is achieved by standard constant elimination. In nearly all cases, this is caused by the fact that the PBES contains quantifiers that cannot be sufficiently distributed over other operators. That is, in the computation of global propagated values, qiocc(qi(Q. ϕY [e/d])) yields QPVIs Q′ . X(e′ ) such that Q′ is empty. This is especially common when the µ-calculus formula used to construct the PBES mixes ∀ and ∨ or the diamond modality hai (respectively ∃ and ∧ or the box modality [a]). Note, however, that many formal properties of interest are linear (they can thus be encoded in LTL) and do not mix said operators. 1 The files used for the experiments are archived at https://doi.org/10.5281/zenodo.6793890 78 The effectiveness of our technique is also limited when quantifiers are accompanied by bounds. In some cases, the PBES contains a subformula of the shape ∀d:D. f ∧ X(e) (respectively ∃d:D. f ⇒ X(e)), where f is an expression over d that restricts its domain. In other cases, such subformulae appear during the computation of qi(Q. ϕY [e/d]). Our technique cannot deals with either case, although a reduction might be possible in theory. 7. Conclusion We saw that although quantifiers can prevent PBES instantiation, they have barely received attention in the literature on syntactic PBES transformations. Hence, we proposed to simplify PBESs based on global propagated values. Our experiment with global propagation indicates that it can indeed achieve a reduction of the state space. Furthermore, we identified an improvement for the computation of guards. As discussed on Section 6, one of the main limitations of our technique is that we cannot deal with bounds on quantified variables. We believe the support of quantifier bounds will greatly benefit its practical applicability, so we plan to handle these cases in future work. References [1] T. Neele, Reductions for Parity Games and Model Checking, Ph.D. thesis, Eindhoven University of Technology, 2020. [2] J. F. Groote, T. A. C. Willemse, Parameterised boolean equation systems, Theoretical Computer Science 343 (2005) 332–369. doi:10.1016/j.tcs.2005.06.016. [3] J. F. Groote, T. A. C. Willemse, Model-checking processes with data, Science of Computer Programming 56 (2005) 251–273. doi:10.1016/j.scico.2004.08.002. [4] T. Chen, B. Ploeger, J. van de Pol, T. A. C. Willemse, Equivalence Checking for Infinite Systems using Parameterized Boolean Equation Systems, in: CONCUR 2007, volume 4703 of LNCS, 2007, pp. 120–135. doi:10.1007/978-3-540-74407-8_9. [5] H. Seidl, Fast and simple nested fixpoints, Information Processing Letters 59 (1996) 303–308. doi:10.1016/0020-0190(96)00130-5. [6] B. Ploeger, J. W. Wesselink, T. A. C. Willemse, Verification of reactive systems via instan- tiation of Parameterised Boolean Equation Systems, Information and Computation 209 (2011) 637–663. doi:10.1016/j.ic.2010.11.025. [7] T. Neele, T. A. C. Willemse, J. F. Groote, Finding Compact Proofs for Infinite-Data Parame- terised Boolean Equation Systems, Science of Computer Programming 188 (2020) 102389. doi:10.1016/j.scico.2019.102389. [8] J. J. A. Keiren, W. Wesselink, T. A. C. Willemse, Liveness Analysis for Parameterised Boolean Equation Systems, in: ATVA 2014, volume 8837 of LNCS, 2014, pp. 219–234. doi:10.1007/978-3-319-11936-6_16. [9] S. Orzan, W. Wesselink, T. A. C. Willemse, Static Analysis Techniques for Parameterised Boolean Equation Systems, in: TACAS 2009, volume 5505 of LNCS, 2009, pp. 230–245. doi:10.1007/978-3-642-00768-2_22. 79 [10] S. Cranen, M. Gazda, W. Wesselink, T. A. C. Willemse, Abstraction in Fixpoint Logic, ACM Transactions on Computational Logic 16 (2015) 29:1–29:39. doi:10.1145/2740964. [11] J. J. A. Keiren, Advanced Reduction Techniques for Model Checking, Ph.D. thesis, Eind- hoven University of Technology, 2013. doi:10.6100/IR757862. [12] S. Orzan, T. A. C. Willemse, Invariants for Parameterised Boolean Equation Systems, Theoretical Computer Science 411 (2010) 1338–1371. doi:10.1016/j.tcs.2009.11.001. [13] A. Mader, Modal µ-calculus, model checking and Gauß elimination, in: TACAS 1995, volume 1019 of LNCS, 1995, pp. 72–88. doi:10.1007/3-540-60630-0_4. [14] W. Zielonka, Infinite games on finitely coloured graphs with applications to automata on infinite trees, Theoretical Computer Science 200 (1998) 135–183. doi:10.1016/ S0304-3975(98)00009-7. [15] O. Bunte, J. F. Groote, J. J. A. Keiren, M. Laveaux, T. Neele, E. P. de Vink, W. Wesselink, A. Wijs, T. A. C. Willemse, The mCRL2 Toolset for Analysing Concurrent Systems: Improvements in Expressivity and Usability, in: TACAS 2019, volume 11428 of LNCS, 2019, pp. 21–39. doi:10.1007/978-3-030-17465-1_2. [16] J. A. Bergstra, J. W. Klop, Verification of an alternating bit protocol by means of process algebra protocol, in: MMSSS 1985, volume 215 of LNCS, 1986, pp. 9–23. doi:10.1007/ 3-540-16444-8_1. [17] J. Pang, W. Fokkink, R. Hofman, R. Veldema, Model checking a cache coherence protocol of a Java DSM implementation, Journal of Logic and Algebraic Programming 71 (2007) 1–43. doi:10.1016/j.jlap.2006.08.007. 80