Towards Verification of Security-Aware E-services Silvio Ranise Dipartimento di Informatica Università di Verona, Italy Abstract. We study an extension of the relational transducers intro- duced by Abiteboul, Vianu, Fordham, and Yesha, which are capable of specifying transaction protocols and their interplay with security con- straints. We investigate the decidability of relevant verification problems such as goal reachability and log validation. 1 Introduction Web-services providing e-commerce capabilities to support business transactions between several parties over the Internet are more and more widespread. The development of such services involves several issues involving security, authenti- cation, and the design of business models to name a few. Relational transducers have been proposed [17] to model the transaction be- havior of e-services for e-commerce and to allow for the analysis of the underlying transaction protocols. Roughly, a relational transducer is a machine capable of translating an input sequence of relations into an output sequence of relations; its state being a relational database. A particular class of relational transducers, called semi-positive cumulative (spocus) transducers, has been identified and studied because its specification is declarative and simple to understand, and some interesting verification problems turn out to be decidable (see again [17]). The input-output mapping in spocus transducers is implicitly defined by a set of non-recursive (variant of) Datalog rules. This is based on the observation that any set of Datalog rules defines a query which is a (partial) function from rela- tional databases (the input relations of the transducers) to answer relations (the output relations of the transducer), see, e.g., [16]. Although spocus transducers support declarative specifications and some in- teresting verification problems are decidable, one of their major shortcomings is the lack of support for the specification and verification of security requirements. In particular, the class of security requirements pertaining to authorization play a crucial role in several business transactions. For example, failure to meet au- thorization constraints may cause economic losses and may even have legal im- plications. In this paper, we propose an extension of the spocus transducers that overcome this problem. In particular, following [12], we extend the rules for input-output in spocus transducers with constraint Datalog rules to formalize access policy statements. This allows us to develop our ideas in the framework of first-order logic and to adapt and reuse specification techniques belonging to the line of research which uses extensions of Datalog to express policy statements (see, e.g., [7, 2, 11]). Technically, the situation is more complex than with spocus transducers as certain patterns in authorization policies (typically, delegation) require recursion (see, e.g., [3]). The (variant of) Datalog rules used in [17] for spocus transducers were assumed to be non-recursive. Plan of the paper. Section 2 introduces some background notions on first-order logic, relational databases, and Datalog rules. Section 3 extends spocus transduc- ers with (constraint) Datalog rules for access policy specifications and introduces two verification problems. In Section 3.1, a simple (yet representative) business process specified by means of a policy-aware transducer illustrates the need to have recursive Datalog rules and a first-order theory to be able to correctly spec- ify delegation and trust relationships, respectively, which are relevant to capture important business rules. As the verification problems can be reduced to satisfi- ability problems for a certain class of formulae, Section 4 defines such a class and proves the decidability of the satisfiability problem under suitable assumptions (for lack of space, proofs are in the extended version [13] of this paper) and Sec- tion 5 shows how to reduce verification to satisfiability. In particular, Section 5.1 discusses the hypotheses under which the fix-point computation of the query induced by a set of constraint Datalog rules terminates. Section 6 concludes. 2 Preliminaries We assume familiarity with the basic syntactic and semantic notions of first- order logic (see, e.g., [8]). We work in first-order logic with equality and consider the equality symbol = as a logical constant. A constraint is a (finite) conjunction of literals. An expression is a term, an atom, a literal, or a formula. A Σ(x)- expression is an expression built out of the symbols in a signature Σ where at most the variables in the sequence x may occur free, and we write E(x) to emphasize that E is a Σ(x)-expression. (Below, by abuse of notation, we consider sequences also as finite sets and use the standard set-theoretic operations to combine them.) Let Σ be a signature. A Σ-theory T is a set of Σ-sentences. In this paper, we assume that all theories are consistent (i.e. they admit at least one model). A Σ-formula ϕ(x) is T -satisfiable iff there exists a model M of T (i.e. M |= T ) such that M |= ∃x.ϕ(x). The satisfiability modulo theory T (in symbols, SMT(T )) problem consists of establishing the T -satisfiability of any quantifier- free Σ-formula. A formula ϕ(x) is T -valid if for every model M of T , we have M |= ∀x.ϕ(x). Two formulae ϕ and ψ are T -equivalent iff the formula ϕ ↔ ψ is T -valid. A theory T admits quantifier elimination if for an arbitrary formula ϕ(x) (possibly containing quantifiers), it is possible to compute a T -equivalent quantifier-free formula ϕ′ (x). A Σ-theory T is locally finite if Σ is finite and, for every set of constants a, there are finitely many ground terms t1 , ..., tka , called representatives, such that for every ground (Σ ∪ a)-term u, we have T |= u = ti 106 for some i. If the representatives are effectively computable from a and ti is computable from u, then T is effectively locally finite. For simplicity, we will often say “locally finite” to mean “effectively locally finite.” We consider the Bernays-Schönfinkel-Ramsey (BSR) class, also called Ef- fectively Propositional Logic, whose satisfiability problem is well-known to be decidable. A formula of the BSR class is of the form ∃x.∀y.ϕ(x, y), where x, y are (disjoint) tuples of variables and ϕ is a quantifier-free formula built out of a signature containing only relation and constant symbols (i.e. no function symbol occurs in ϕ). A finite instance of (a n-ary relation) r is a formula of the form m _ ∀x.r(x) ↔ x = cj , j=1 where x = x1 , ..., xn is a sequence of variables, cj = cj1 , ..., cjn is a sequence of constants (for j = 1, ..., m), and x = cj abbreviates (x1 = cj1 ∧ · · · ∧ xn = cjn ). Let R be a finite set of predicate symbols, a database over R is a conjunction of finite instances of r ∈ R; we sometimes call R a database schema. A Datalog formula is a BSR formula of the form n ^ n ^ ∀x, y. Ai (x, y) → A0 (x) also written as a rule A0 (x) ← Ai (x, y), i=1 i=1 Vnn) and x, y are disjoint tuples of variables. where Ai is an atom (for i = 0, 1, ..., Furthermore, A0 is said the head, i=1 Ai (x, y) the body of the rule, and when n = 0, the Datalog rule is also called a fact. A set of Datalog rules is semi-positive if whenever a negative literal appears in the body of a rule, then the relation symbol of this literal is not the relation symbol of any literal which is the head of a non-trivial rule in the set. Another generalization of Datalog rules is that of constraint Datalog rule where, for some Σ-theory T , one quantifier-free Σ(x, y)- formula may appear in its body. A set of Datalog (semi-positive or constraint Datalog) rules is recursive if some predicate symbol occur both in the heads and the bodies of the rules; otherwise, we say it to be non-recursive. 3 Security-Aware Transducers For the rest of the paper, we fix a Σ-theory T and a relational signature R := In ∪ Out ∪ DB ∪ Policy such that X ∩ Y = ∅ and X ∩ Σ = ∅ for X, Y ∈ {In, Out, DB , Policy }. Definition 1. A (cumulative and policy-aware) transducer (over (Σ, R)) is a tuple (past , τ, ω, π) where – past = past i1 , ..., past in for In = {ri1 , ..., rin } is a sequence of fresh predicate symbols, i.e. past ∩ (Σ ∪ R) = ∅, 107 – given a finite database over the database schema I ⊆ In of the form ^ ni _ (∀x.ri (x) ↔ x = cji ), ri ∈I j=1 where cji are constant symbols, τ (I) is a finite set of (cumulative) transitions of the form ni _ ∀x.past ′ri (x) ↔ x = cji ∨ past ri (x), j=1 for each ri ∈ I and ∀x.past ′ri (x) ↔ past ri (x), for each ri ∈ In \ I. Let τ be the set of cumulative transitions for every finite instance over every possible sub-set of In; Vnis a a finite set of semi-positive non-recursive rules of the form A0 ← – ω i=1 Ai such that (i) A0 is an Out-atom, (ii) Ai is a (In ∪ past ∪ DB )-literal (for i = 1, ..., n), and (iii) every variable appearing in a rule must occur in at least one positive literal; – π is a finite Vn set of (possibly recursive) constraint Datalog rules of the form A0 ← i=1 Ai ∧ ψ such that (i) A0 is a Policy -atom, (ii) Ai is a (DB ∪ Policy ∪ past )-atom, for i = 1, ..., n, (iii) ψ is a quantifier-free Σ-formula, and (iv) the V set of variables occurring in A0 are a sub-set of the variables n occurring in i=1 Ai . To understand why what we have just defined is a transducer, recall that (con- straint) Datalog rules define queries on databases; which, in turn, can be seen as mappings associating a so-called answer relation to each database (see, e.g., [6] for details). In this perspective, we can consider the union of (non-recursive) semi-positive rules in ω with the constraint (Datalog) rules in π as defining a mapping between databases over In ∪past ∪DB to databases over Out ∪past . By taking Policy = ∅ and T to be the empty theory, it is easy to see that our notion of transducer reduces to that of relational Spocus transducer in [17]. Notice that the set π may contain recursive (constraint) Datalog rules. Recursion is cru- cial to express important mechanisms in policy management such as delegation. So, although, recursion complicates the reduction of verification problems for cumulative and policy-aware transducers to logical satisfiability problems (see Section 5 below), it is of paramount importance for naturally expressing several patterns of policy management. Below, given a cumulative and policy-aware transducer (past , τ, ω, π) over (Σ, R) and a (Σ ∪ R ∪ past )-formula ϕ, the formula obtained by replacing each symbol s ∈ In ∪ Out ∪ past occurring in ϕ with a fresh symbol sj and each symbol p′ ∈ past ′ with fresh symbols pj+1 will be denoted by ren(ϕ, j) for j ≥ 0. 108 Definition 2. Let (past , τ, ω, π) be a cumulative and policy-aware transducer over (Σ, R) and db be a finite instance over DB . The sequence I1 , O1 ; ...; In , On is a (finite) run (with length n) of (past , τ, ω, π) iff the formula 0 V db ∧ p∈past ∀x.p (x) ↔ false ∧ Vn (1) j=1 ren(Ij , j) ∧ ren(τ (Ij ), j) ∧ ren(ω, j) ∧ ren(π, j) ∧ ren(Oj , j) is T -satisfiable, where I1 , ..., In is a finite sequence where each Ij is an instance over In, for j = 1, ..., n, called the input sequence, and O1 , ..., On is a finite sequence where each Oj is an instance over Out, for j = 1, ..., n, called the output sequence. Given a cumulative and policy-aware transducer (past , τ, ω, π) over (Σ, R) and a (Σ ∪ R ∪ past )-formula ϕ, R1 ; ...; Rn be a run of the transducer, and Log ⊆ (In ∪ Out); then, prj (Log, R1 ; ...; Rn ) is a finite sequence of formulae obtained from R1 ; ...; Rn by forgetting all those finite instances over (In ∪ Out) \ Log. A goal is a BSR formula of the form n ^ ∃x. Aj (x) j=1 where Aj is an Out-literal, for j = 1, ..., n. Definition 3. Let (past , τ, ω, π) be a cumulative and policy-aware transducer over (Σ, R) and Log ⊆ In ∪ Out. We define the following two verification prob- lems for (past , τ, ω, π): Goal reachability: does a given goal γ hold in the last output of some run of (past , τ, ω, π)? Log validity: given a finite sequence L1 , ..., Ln where each Lj is a finite in- stance over Log for j = 1, ..., n, does there exist a finite run S1 ; ...; Sn of (past , τ, ω, π) such that L1 , ..., Ln = prj (Log, S1 ; ...; Sn )? Goal reachability consists of checking whether a set of goals can be reached by some run of the transducer. This verification problem is a first sanity check on the design of the business model underlying the transducer as the latter is usually conceived to reach a certain goal, e.g., delivering a product provided that certain conditions are met. Log validation consists of checking whether a given log sequence can be generated by some input sequence. This problem arises, for example, when the transducer of a supplier is allowed to run on a customer’s site for efficiency or convenience. The trace provided by the log allows the supplier to validate the transaction carried out by the customer. 3.1 Example We consider a business model where a customer wants to buy a book, is billed for it, pays, and then takes delivery and a discount voucher for his/her next 109 purchase if he/she is a preferred customer and is an employee of an accredited company. One kind of preferred customers are those affiliated to a organization EOrg, which issues credentials to certify that a person is one of its members. The business process has a database for accredited companies which are trusted by the business model to pass to other companies the fact of being accredited as well as the possibility of declaring other companies being accredited. The identification of an employee of a company is done by issuing a certificate signed by the company. Companies are organized according to a certain hierarchy which, for the purpose of this paper, can be assumed to be a partial order. The customer, willing to get a discount voucher should provide suitable valid credentials that he/she is a preferred customer (e.g., a member of EOrg) and an employee of an accredited company. We need the theory of partial orders Tpo to specify the trust relationship between companies. The signature Σpo of Tpo consists of the binary relation is trusted by (written infix) and its axioms are the following three sentences: ∀x, y, z.(x is trusted by y ∧ y is trusted by z → x is trusted by z), ∀x.x is trusted by x , and ∀x, y.(x is trusted by y ∧ y is trusted by x → x = y) . Intuitively, x is trusted by y means that company y trusts company x with respect to the fact of being accredited and the possibility of delegating this capability. The business model can be formalized by a policy-aware transducer (whose policies use the theory Tpo ) as follows. We fix the following relational signature R := In ∪ Out ∪ DB ∪ Policy , where In := {order , pay, eorg, emplcert }, Out := {sendbill , deliver , sendvoucher }, DB := {price, available , accredited }, Policy := {preferred , employee, employeeof }. The business process has three databases price, available , and accredited storing the prices of books, their availability, and the set of (root) accredited companies, respectively. A customer interacts with the business system by in- serting tuples in four input relations, order , pay , eorg, and emplcert . The system responds by producing output relations sendbill , deliver , and sendvoucher and it keeps track of the history of the business transaction using the state relations: past order , past pay , past eorg , and past emplcert . The state of the transducer cumu- latively adds the tuples inserted into the input relations. The output rules ω of the transducer are the following: sendbill (X, Y, Z) ← order (X, Z) ∧ price(X, Y ) ∧ ¬past pay (X, Y, Z) deliver (X, Z) ← past order (X, Z) ∧ price(X, Y ) ∧ pay (X, Y, Z) ∧ ¬past pay (X, Y, Z) sendvoucher (Z) ← past deliver (X, Z) ∧ preferred (Z) ∧ employee(Z) . The first rule governs the sending of a bill whose amount is Y about a book X to a customer Z when an order is placed on X by Z, the price of X is Y , and Z has not already been paid for X. The second rule enables the delivery of a book X to Z if X has been ordered by Z and it is being paid the correct price by Z. Finally, the last rule is about sending the discount voucher to Z if a book has 110 order (Book1 , Alice) pay (Book1 , 8, Alice) input order (Book2 , Bob) pay (Book2 , 15, Bob) eorg(Bob) sequence eorg (Alice) emplcert (Alice, Comp3 ) output sendbill (Book1 , 8, Alice) deliver (Book1 , Alice) sendvoucher (Alice) sequence sendbill (Book2 , 15, Bob) deliver (Book2 , Bob) Table 1. A run of a policy-aware transducer been delivered to Z and this last is both a preferred customer and an employee of an accredited company. Finally, the policy rules π of the transducer are the following: preferred (Z) ← past eorg (Z) employee(Z) ← past emplcert (Z, U ) ∧ employeeof (Z, U ) employeeof (Z, F ) ← accredited (F ) employeeof (Z, U ) ← employeeof (Z, F ) ∧ U is trusted by F The first rule simply establishes whether Z is a preferred customer by checking if the credential saying that Z is an EOrg member has been presented. (There can be other rules of this kind once the business model establishes that customers having a certain affiliation become preferred customers.) The last three rules check if Z has presented a certificate saying that he is an employee of a com- pany which is in the suitable trust relationship with some accredited company. Notice the use of recursion to express delegation of the capability of certifying the fact of being an employee of an accredited company to some of its units. Since employeeof is a recursive predicate, ω ∪ π is not a set of non-recursive semi-positive rules and, hence, the associated transducer is not spocus; all the techniques developed in [17] cannot be used to investigate verification problems for this business process. Also the results in [14] cannot be used here as the rules considered in that work (which allow for non-cumulative transducers) are assumed to be non-recursive. An interesting line of future work is to allow for non-cumulative rules also for the policy-aware transducers considered in this paper (i.e. in presence of recursive rules for policy specification). Table 1 shows an example of a run for the transducer specified above. We have assumed that price(Book1 , 8), price(Book2 , 15), accredited (Comp1 ), Comp3 is trusted by Comp2 , and Comp2 is trusted by Comp1 . Now, if we take Log = Out, then the instance of the log validity problem for the run in Figure 1 reduces to the checking Tpo -satisfiability of the following formula (only an excerpt is shown for the sake of conciseness):   ∀x, y.price(x, y) ↔ ((x = Book1 ∧ y = 8) ∨ (x = Book2 ∧ y = 15)) ∧  ∀x.accredited (x) ↔ x = Comp1 ∧ ∧ (Comp3 is trusted by Comp2 ) ∧ (Comp2 is trusted by Comp1 ) ∀x, y.past 0order (x, y) ↔ false ∧ ∀x, y, z.past 0pay (x, y, z) ↔ false ∧   ∧ ∀x.past 0eorg (x) ↔ false ∧ ∀x, y.past 0emplcert (x, y) ↔ false 111 ∀x, y.(order 0 (x, y) ↔ (x = Book1 ∧ y = Alice) ∨ (x = Book2 ∧ y = Bob)) ∧ ∀x, y.(eorg 0 (x) ↔ x = Alice ∧ 0 ∀x, y, z.(order (x, z) ∧ price 0 (x, y) ∧ ¬past 0pay (x, y, z) → sendbill 0 (x, y, z)) ∧ ... From the piece formula above, it is not difficult to see that the output sequence (at time instant 0) of the transducer is exactly the one depicted in the first column, second line of Table 1. 4 An extension of the BSR class Since the goal reachability and the log validity problems will be reduced to the satisfiability of a certain class of first-order formulae (see Section 5), we define such a class of formulae and study their decidability. Let T be Σ-theory, we are interested in studying the T -satisfiability of for- mulae of the form ∃x.∀y.ϕ(x, y) where ϕ is a quantifier-free Σ R -formula and Σ R := Σ ∪ R such that Σ ∪ R = ∅. Sentences of this form are called BSR(Σ R )-sentences. If T is the empty theory, then BSR(Σ R )-sentences are BSR formulae. We show the decidability of BSR(Σ R )-sentences under suitable hypotheses on the theory T in two steps. First, we eliminate universal quantifiers by identi- fying finitely many instances which are sufficient for (un-)satisfiability checking (under suitable hypotheses). Second, we show the decidability of the resulting quantifier-free formulae. Lemma 1 (Instantiation). Let T be a locally finite Σ-theory and the class of models of T be closed under sub-structures. Furthermore, let R be a finite set of predicate symbols such that Σ ∩ R = ∅. The BSR(Σ R )-sentence ∃x.∀y.ϕ(x, y) is satisfiable iff it is satisfiable in a finite model iff the quantifier-free formula ^ ∃x. ϕ(x, yσ) σ is satisfiable, where σ ranges over the substitutions mapping the variables y into the set of representative Σ(x)-terms and yσ denotes the simultaneous application of σ to the variables of the tuple y. We are left with the problem of showing the decidability of the satisfiability of quantifier-free formulae over Σ R . Theorem 1 (Decidability). Let T be a locally finite Σ-theory whose class of models is closed under sub-structures and the SMT(T ) problem be decidable; R be a finite set of predicate symbols such that Σ ∩ R = ∅. Then, the satisfiability of BSR(Σ R )-sentences is decidable. 112 The assumption of the background theory T to be locally finite, at first, seems to be quite restrictive. However, the constraint domains considered as relevant for trust management in [12] have quite simple algebraic structures so as to allow for efficient algorithm to answer policy access queries. We believe that locally finite theories can be put to productive use for the verification of policy-aware transducers as suitable abstractions of this class of constraint domains. 5 Decidability of goal reachability and log validity Before describing the reduction of the goal reachability and the log validity problems to the satisfiability of the extension of the BSR class considered in the previous section, we re-consider how semi-positive (non-recursive) rules and (constraint) Datalog rules define queries on databases, or, equivalently, mapping of databases to answer relations. As already observed in Section 3, this is crucial to characterize the input-output behavior of the class of relational transducers considered in this paper. 5.1 Constraint Datalog queries A database query is a mapping associating to each database an (answer) relation. A set of (constraint) Datalog rules can be seen as a way to implicitly define queries as follows. Let T be a Σ-theory admitting quantifier-elimination and R be a database schema such that Σ ∩ R = ∅. Let Π = ω ∪ π be a finite set of rules such that R = B ∪ E for some cumulative and policy-aware transducer (past , τ, ω, π), where B is the set of predicate symbols occurring only in the body of the rules in Π and not in their heads, and E is the set of predicate symbols occurring both inVthe body and head of the rules in Π. Consider a rule ρ ∈ Π of n the form e(x) ← i=1 bi (x, y) ∧ ψ(x, y) where e ∈ E, the predicate symbol of bi is in R (for i = 1, ..., n), and ψ is a quantifier-free Σ(x, y)-formula, the constraint query associated to ρ is the formula n ^ ∃y. bi (x, y) ∧ ψ(x, y). i=1 If bi contains only symbols in B (as it is the case of the semi-positive non- recursive rules in ω), then it is possible to replace each occurrence of such sym- bols with disjunctions of conjunctions of equalities (recall the definition of finite instance in Section 2). In this way, the resulting (existentially quantified) for- mula turns out to be a Σ(x, y)-formula and the decidability result above (i.e. Theorem 1) can be used. This is the key insight used in reducing both log va- lidity and goal reachability to the satisfiability of BSR formulae in [17] which in our framework corresponds to the case there π = ∅. However, this is not enough for the class of transducers considered in this paper because of the presence of the constraint Datalog rules in Π which may be recursive (we have already ar- gued that this complication is necessary to be able to express certain patterns 113 function constraintFixPoint(F : constraint facts, R : constraint Datalog rules) 1 results ←− F ; Changed ←− true; 2 while Changed do 3 Changed ←− false 4 foreach rule ∈ R do 5 foreach tuple of constraint facts constructed from results do 6 newres ←− constraint facts obtained by constraint rule application between rule and tuple 7 foreach fact ∈ newres do 8 if (results 6|= fact) then results ←− results ∪ {fact }; 9 Changed ←− true; 10 end 11 end 12 end 13 end 14 return results Fig. 1. Least fix-point computation of constraint Datalog rules of policy management such as delegation). As a consequence, we need a fix- point characterization for the queries associated to a set Π of possible recursive rules. To this end, we proceed as follows. First, we regardW each finite instance of r ∈ In ∪ Out ∪ DB as a constraint fact, i.e. ∀r(x) ↔ m j=1 x = c j is transformed into the Datalog rule r(x) ← m j W j=1 x = c , where x is a tuple of variables and cj is a tuple of constants, for j = 1, ..., m. Let r0 (x) ← ni=1 ri (x) ∧ ψ0 (x) be a V constraint Datalog rule in Π for ri ∈ R and i = 0, 1, ..., n and ri (xki ) ← ψi (xki ) be a constraint fact for ψi be a Σ(xi )-quantifier-free formula, ki the arity of ri , and i = 1, ..., n. A constraint rule application produces m ≥ 0 facts of the form r0 (x) ← ψj′ (x) where ψj′ is a quantifier-free Σ(x)-formula for j = 1, ..., m Wm (m ≥ 0) and j=1 ψj′ is equivalent (by the elimination of quantifiers in T ) to the formula n ^ ∃y.( ψi (xki ) ∧ ψ0 (x)), i=1 where y is the tuple of variables occurring in the body of the rule but not in the head. The least fix-point of a set of constraint Datalog rules can be computed by the algorithm in Figure 1. The function constraintFixPoint terminates when all derivable new facts are implied by previously derived facts.1 The requirement 1 The test at line 8 can be reduced (by eliminating quantifiers) to a T -satisfiability test on a conjunction of two quantifier-free Σ-formulae. To understand how, consider two facts r1 (x) ← ψ1 (x) and r2 (y) ← ψ2 (y). To check that latter is a logical consequence of the former (modulo T ), it is sufficient to check the T -validity of ∀x.ψ1 (x) → ∀y.ψ2 (y). This, reasoning by refutation, is equivalent to check the T -unsatisfiability of the negation of the previous formula, which, by eliminating quantifiers can be reduced to a quantifier-free formula. 114 that T admits elimination of quantifiers is not sufficient to guarantee termination of the algorithm in Figure 1. We need the following notion which is adapted from [16]. Definition 4. Let T be a Σ-theory and x be a finite set of variables. If for every (possibly infinite) T -satisfiable set S of Σ(x)-constraint, there exists a finite subset Sf in ⊆ S such that for every ψ in S, there exists a ψ ′ in Sf in for which ∀x.ψ(x) → ψ ′ (x) is T -valid, then T is said to be constraint compact. If Sf in is effectively computable from S, then T is said to be effectively constraint compact (in the following, when we say ‘constraint compact’, we in fact always mean ‘effectively constraint compact’). Constraint compactness is a sufficient condition for the termination of the algo- rithm in Figure 1. Theorem 2 ([16]). Let T be a constraint compact theory. Then, the algorithm in Figure 1 terminates for every query. The sketch of the proof is as follows. By contradiction, assume that the algorithm does not terminate. Thus, every iteration of the loop produces at least one fact which is not a consequence of those which were already derived. Since there are only finitely many different predicate symbols, there must be at least one such symbol for which the algorithm derives an infinite set S of facts. But, T is assumed to be constraint compact and so we can replace S with a finite subset Sfin such that the facts in S \ Sfin are consequence of those in Sfin . As the algorithm checks for the logical consequence of newly derived facts before adding it to the resulting set of derived facts, all the facts in S \ Sfin should not be considered: a contradiction. Interestingly, locally finite theories are also constraint compact. Proposition 1. If T is an (effectively) locally finite theory, then it is also (ef- fectively) constraint compact. Proof. Let S be a set of constraints over a finite set x of variables. Then, since the theory T is locally finite, there exists a finite set RTerms of Σ(x)-terms such that for every term u in S, there exists t ∈ RTerms, T |= u = t. By using the terms in RTerms and the finite signature Σ, it is possible to build only finitely many distinct Σ(x)-atoms ψ1 (x), ..., ψn (x) (also called representative atoms) such that for any further Σ(x)-atom ψ(x), there exists i ∈ {1, ..., n}, T |= ∀x.ψi (x) ↔ ψ(x). If we consider a Σ(x)-constraint φ(x), it is possible to collect all the atoms occurring in it, say {α1 , ..., αm }; for each αj (for j = 1, ..., m), find the equivalent representative atom ψkj , and then substitute each αj with the corresponding ψkj in φ and then eliminating duplicates. The result will be a constraint taken from the finitely many distinct conjunctions of literals built out of the finitely many representative atoms ψ1 (x), ..., ψn (x). This can be generalized to sets of constraints in the obvious way. Clearly, this implies the constraint compactness of T . ⊓ ⊔ As an immediate consequence of the last two facts we obtain the following result. 115 Corollary 1. Let T be an (effectively) locally finite theory. Then, the algorithm in Figure 1 terminates for every query. This will be a crucial ingredient in the reduction of goal reachability and log validity to the satisfiability of the BSR(Σ R ) class. 5.2 Reduction to satisfiability We first consider the log validity problem as goal reachability can be reduced to this problem. Lemma 2. Let T be a locally finite Σ-theory admitting elimination of quanti- fiers. Furthermore, let (past , τ, ω, π) be a cumulative and policy-aware transducer over (Σ, R), db be a database over DB , and Log ⊆ In ∪ Out. The log validity problem for (past , τ, ω, π) reduces to the T -satisfiability problem of an effectively computable BSR(Σ R )-formula. Proof. The proof is along the lines of Theorem 3.1 in [17] for Spocus transducers. A log L1 , ..., Ln is valid if there exists an input sequence I1 , ..., In generating it. We view I1 , ..., In as database over the set of predicate symbols obtained by making n copies of each relation r ∈ In yielding r1 , ..., rn . To state that the input sequence I1 , ..., In yields the log L1 , ..., Ln , we require that the relations in I1 , ..., In recorded by the log have the values specified by L1 , ..., Ln together with the relations in Out determined by I1 , ..., In . For input relations, the situation is easy (we analyze concrete cases as the generalization are straightforward). For example, suppose that Lj requires that a tuple c belongs to the relation rj in the input sequence; this can be stated by the following BSR(R) formula: ∃x.(rj (x) ∧ x = c). (2) For example, if the log specifies that rj contains the tuples c and d, the inclusion of rj in the log is state by the following BSR(R) formula: ∀x.(rj (x) → (x = c ∨ x = d)). (3) For relation in the output sequence, we need to resort to Corollary 1 above: the fix-point algorithm in Figure 1 terminates on all queries and generates a (finite) set of constraint facts. Then, for each fact of the form rj (x) ← ψji (x) in the output set, we take the disjunction of each ψji for i ≥ 0. Let ϕj (x) be the resulting quantifier-free Σ(x)-formula. The remainder is similar to the case of input relations. Requiring that a tuple belongs to the relation rj in the output sequence can be expressed by a formula similar to (2) except that rj is replaced by ϕj obtained by the fix-point computation. Similarly, if the log specifies that rj contains the two tuples c and d, the inclusion of rj in the log is stated by a formula which is similar to (3) except that rj is replaced by ϕj obtained by the fix-point computation. It is easy to see that the resulting formula is a BSR(R) formula and its satisfiability is equivalent to the existence of an input sequence yielding the desired log. ⊓ ⊔ 116 Before stating our main decidability results for log validity and goal reachability of cumulative and policy-aware transducers, we observe the following. An imme- diate consequence of Lemma 2 and Theorem 1 is the decidability of both verifi- cation problems for cumulative and policy-aware transducers whose Σ-theory is locally finite and admits elimination of quantifiers. Unfortunately, local finiteness and elimination of quantifiers are difficult to reconcile as the former is obtained for theories with simple “algebraic structure” while the latter is satisfied for those with rich “algebraic structure”. To illustrate, we consider three theories. The theory whose signature contains only constants c1 , ..., cn and axiomatized by the following sentences: ci 6= cj for i, j = 1, ..., n and i 6= j ∀x.x = c1 ∨ · · · x = cn , is both locally finite and admits elimination of quantifier. The theory Tlo of linear orders whose signature contains the binary relation  and axiomatized by the following sentences: ∀x, y, z.(x  y ∧ y  z → x  z) ∀x, y.(x  y ∧ y  x → x = y) ∀x.x  x ∀x, y.(x  y ∨ y  x) is locally finite, the SMT(Tlo ) problem is decidable, but it does not admit elim- ination of quantifiers. The theory Tdlo of dense linear orders is the theory over the same signature of Tlo and obtained by adding the following two sentences to the set of axioms of Tlo : ∀x, y.(x ≺ y → ∃z.(x ≺ z ∧ z  y)) and ∃x, y.x 6= y , where t1 ≺ t2 abbreviates t1  t2 ∧ t1 6= t2 for t1 , t2 variables. The SMT(Tdlo ) problem is decidable, Tdlo admits elimination of quantifiers, but it is not locally finite. Fortunately, it is sometimes possible to reconcile local finiteness and quan- tifier elimination. For the theories Tlo and Tdlo , it is possible to show that a quantifier-free formula is Tlo -satisfiable iff it is Tdlo -satisfiable. So, to check the Tlo -satisfiability of BSR(Σ R ) sentences, we can use its being locally finite to ap- ply the instantiation procedure underlying the proof of Lemma 2 while we can use the fact that Tdlo admits elimination of quantifiers to obtain the termination of the fix-point algorithm in Figure 1 (as we know that the Tlo -satisfiability of quantifier-free formulae is preserved). This phenomenon is not an accident but an application of the notion of model completeness of a theory. A Σ-theory T is model complete iff, for all models M, N of T , if M is a sub-structure of N , then M is also an elementary sub-structure of N , i.e. for every Σ-formula ϕ(x) and all elements a, we have that M |= ϕ(a) iff N |= ϕ(a). Intuitively, the elementary sub-model M of a model N preserves the set of satis- fiable first-order formulae (and hence of quantifier-free formulae in particular). It is possible to show that Tdlo is model complete (see, e.g., [4]). The key property 117 (see Remark 3.5.6 in [4]) is that M is a sub-model of the set of universal sen- tences which are logical consequences of Tdlo (if T is a theory, the set of universal sentences which are T -valid is denoted with T ∀ ) iff M is a sub-model of some ∀ model of Tdlo . (The validity of universal sentences is the dual of the satisfiability ∀ ∀ of quantifier-free formulae.) Now, since Tdlo = Tlo (see [10]), we have formally justified the use of Tlo in Lemma 2 and that of Tdlo to eliminate quantifiers in the algorithm of Figure 1. Notice that the class of models of Tlo is closed under sub-structures since its axioms are universal sentences (see, e.g., [4]). Further- more, these observations constitute the sketch of the proof of our first (main) decidability result about the verification of cumulative and policy-aware trans- ducers. Theorem 3. Let (past , τ, ω, π) be a cumulative and policy-aware transducer over (Σ, R), db be a database over DB , and Log ⊆ In ∪ Out. If (a) T is a locally fi- nite theory such that the SMT(T ) problem is decidable and its class of models is closed under sub-structures, (b) T∗ ⊇ T is a model complete theory admitting elimination of quantifiers, and (c) T ∀ = T∗∀ , then the log validity problem for (past , τ, ω, π) is decidable. We are now ready to state and prove our second (main) decidability result. Theorem 4. Let (past , τ, ω, π) be a cumulative and policy-aware transducer over (Σ, R), db be a database over DB , and Log ⊆ In ∪ Out. If (a) T is a locally fi- nite theory such that the SMT(T ) problem is decidable and its class of models is closed under sub-structures, (b) T∗ ⊇ T is a model complete theory admitting elimination of quantifiers, and (c) T ∀ = T∗∀ , then the goal reachability problem for (past , τ, ω, π) is decidable. Proof. Again the proof is along the lines of Theorem 3.2 in [17] for Spocus trans- ducers. First of all, observe that only runs of length two should be considered. To understand why this is so, consider an input sequence I1 , ..., In with n > 2. Since outputs depend only on the current input, the database, and the state relations (storing the union of all previous inputs), the last output in the run of the transducer on I1 , ..., In is the same as the last output on the run of the same transducer on the following input sequence of length 2: (I1 ∪ · · · In−1 ), In . At this point, the problem is reduced to that of the satisfiability of a BSR(R) with the same technique described in the proof of Lemma 2. ⊓ ⊔ 6 Conclusion We have introduced a class of policy-aware relational transducers. We have stud- ied the hypotheses under which log validity and goal reachability are decidable for this class of transducers by a reduction to the satisfiability problem of an extension of the BSR class of formulae. There are two main lines of future work. First, we intend to study the de- cidability of those verification problems involving two or more transducers such as containment and equivalence [17]. Second, it would be interesting to see how 118 to implement the algorithm for checking the satisfiability of the extension of the BSR class on top of state-of-the-art theorem provers or Satisfiability Modulo Theories solvers. The latter, in particular, with the recent interest in developing decision procedures for the BSR class (see, e.g., [5]) and techniques for quantifier instantiation (see, e.g., [9]) seem to be ideal candidates. Acknowledgments. The work presented in this paper was partially supported by the FP7-ICT-2007-1 Project no. 216471, “AVANTSSAR: Automated Validation of Trust and Security of Service-oriented Architectures.” We thank Silvio Ghilardi and the mem- bers of the AVANTSSAR project for many stimulating discussions. References 1. Franz Baader and Silvio Ghilardi. Connecting many-sorted theories. Journal of Symbolic Logic, 72:535–583, 2007. 2. M. Y. Becker, C. Fournet, and A. D. Gordon. Security Policy Assertion Language (SecPAL). http://research.microsoft.com/en-us/projects/SecPAL/. 3. M. Y. Becker and S. Nanz. The Role of Abduction in Declarative Authorization Policies. In 10th International Symposium on Practical Aspects of Declarative Languages (PADL), 2008. 4. C.-C. Chang and J. H. Keisler. Model Theory. North-Holland, Amsterdam-London, third edition, 1990. 5. L. de Moura and N. Bjørner. Deciding effectively propositional logic using dpll and substitution set. In Int. Joint Conference on Automated Reasoning, 2008. 6. J. Van den Bussche. Constraint Databases, chapter Constraint databases, queries, and query languages. Springer, 2000. 7. J. DeTreville. Binder, a logic-based security language. In IEEE Symposium on Security and Privacy, Los Alamitos, CA, USA, 2002. IEEE Computer Society. 8. H. B. Enderton. A Mathematical Introduction to Logic. Academic Press, New York-London, 1972. 9. Y. Ge, C. Barrett, and C. Tinelli. Solving quantified verification conditions using satisfiability modulo theories. Annals of Mathematics and Artificial Intelligence, 2009. To appear. 10. S. Ghilardi. Model theoretic methods in combined constraint satisfiability. Journal of Automated Reasoning, 33(3-4), 2004. 11. Y. Gurevich and I. Neeman. DKAL: Distributed-knowledge authorization lan- guage. In Proceedings of CSF 2008, pages 149–162. IEEE Computer Society, 2008. 12. N. Li and J. C. Mitchell. Datalog with constraints: a foundation for trust manage- ment langauges. In PADL’03, pages 58–73, 2003. 13. S. Ranise. Towards Verification of Security-Aware E-Services. Extended version of the paper with the same title in FTP’09. 14. M. Spielmann. Verification of relational transducers for electronic commerce. In 19th ACM Symp. on Princ. of DB Systems (PODS). ACM Press, 2000. 15. C. Tinelli and C. G. Zarba. Combining non-stably infinite theories. Journal of Automated Reasoning, 34(3), 2005. 16. D. Toman. Computing the well-founded semantics for constraint extensions of datalog. In 2nd Int. WS on Constraint Database Systems, volume 1191 of LNCS, pages 64–79, 1997. 17. S. Abitebouland V. Vianu, B. Fordham, and Y. Yesha. Relational Transducers for Electronic Commerce. J. of Comp. and Sys. Sciences, 61:236–269, 2000. 119