-

Verification of Conjunctive-Query Based Semantic Artifacts?

Babak Bagheri Hariri

0 1

Diego Calvanese

calvaneseg@inf.unibz.it 0 1

Giuseppe De Giacomo

Riccardo De Masellis

0 0 Dip. di Informatica e Sistemistica Sapienza Universita` di Roma Via Ariosto , 25, 00185 Rome , Italy 1 KRDB Research Centre Free University of Bozen-Bolzano I-39100 Bolzano , Italy

We introduce semantic artifacts, which are a mechanism that provides both a semantically rich representation of the information on the domain of interest in terms of an ontology, including the underlying data, and a set of actions to change such information over time. In this paper, the ontology is specified as a DL-Lite TBox together with an ABox that may contain both (known) constants and unknown individuals (labeled nulls, represented as Skolem terms). Actions are specified as sets of conditional effects, where conditions are based on conjunctive queries over the ontology (TBox and ABox), and effects are expressed in terms of new ABoxes. In this setting, which is obviously not finite state, we address the verification of temporal/dynamic properties expressed in -calculus. Notably, we show decidability of verification, under a suitable restriction inspired by the notion of acyclicity in data exchange.

The artifact-centric approach to service modeling, introduced recently, considers both data and processes as first-class citizens in service design and analysis. Artifacts are advocated as a sort of middle ground between a conceptual formalization of a dynamic system and an actual implementation of the system itself [ 1 ]. The verification of temporal properties in the presence of data represents a significant challenge for the research community, since the system becomes infinite-state, and hence the usual analysis based on model checking of finite-state systems [ 2 ] is impossible in general. Recently, there have been some advancements on this issue, considering suitably constrained relational database settings for the data component (which acts also as a data storage for the process), see e.g., [ 3,4 ].

In this paper, we follow the line of [ 3 ], and rely on the work in knowledge representation to propose a more conceptual treatment of artifacts. We do so by enriching artifacts with a semantic layer constituted by a full-fledged ontology expressed in description logics. In particular, our semantic artifacts include an ontology specified in DL-LiteR [ 5 ], which is the core of the web ontology language OWL2 QL1, and it is particularly well suited for data management. The TBox of the ontology captures intensional information ? This work has been supported by the EU FP7-ICT Project ACSI (257593). 1 http://www.w3.org/TR/owl2-profiles/ on the domain of interest, similarly to conceptual data models, such as UML class diagram, though as a software component to be used at runtime. The ABox, which acts as the artifact state, maintains the data of interest, which are accessed by relying on query answering through ontologies. As a query language, we use unions of conjunctive queries, possibly composing their certain answers through full FOL constructs [ 6 ]. A semantic artifact has associated actions whose execution changes the state of the artifact, i.e., its ABox. Such actions are specified as sets of conditional effects, where conditions are queries over the ontology and effects are expressed in terms of new ABoxes. To capture data that are acquired from external users/environments during the execution of actions, such ABoxes may contain special constants called labeled nulls, represented as Skolem terms. These represent individuals that the artifact does not know, but on which it knows some facts. Actions have no pre-condition, instead processes over the semantic artifact are used to specify which actions can be executed at each step. We model such processes as condition/action rules, where the condition is again expressed as a query over the ontology.

In this setting, which is obviously not finite state, we address the verification of temporal/dynamic properties expressed in -calculus [ 7 ], where atomic formulas are queries over the ontology that can refer only to known individuals. Unsurprisingly, we show that even for very simple semantic artifacts and dynamic properties verification is undecidable. However, we also show that for a very rich class of semantic artifacts, verification is indeed decidable and reducible to finite-state model checking. To obtain this result, we rely on recent results on the finiteness of the chase of tuple-generating dependencies in the data exchange literature [ 8 ]. 2

Preliminaries

As ontology language, we make use of DL-LiteR, a member of the DL-Lite family [ 5 ], and hence a tractable DL particularly suited for dealing with ontologies (or KBs) with very large ABoxes, which can be managed through relational database technology. DLLiteR is also the core of the standard web ontology language OWL2 QL. In DL-LiteR, concepts and roles are formed according to the following syntax:

B ::= C ::=

N j 9U B j :B j 9U .B

U ::= V ::=

P j P U j :U N , B, and C respectively denote a concept name, a basic concept, and an arbitrary concept. P , P , U , and V respectively denote a role name, an inverse role, a basic role, and an arbitrary role. A DL-LiteR ontology is a pair (T; A), where T is a TBox, i.e., a finite set of (concept and role) inclusion assertions of the forms B v C and U v V ; and A is an ABox, i.e., a finite set of facts (also called membership assertions) of the forms N (c1) and P (c1; c2), where N and P occur in T , and c1 and c2 are constants. We denote with CA the set of constants appearing in A. The semantics of a DL-LiteR ontology is the usual one for DLs, see [ 5 ]. Notice that in DL-LiteR the unique name assumption is immaterial, since there is no way of deducing facts about equality. We say that A is consistent wrt T if (T; A) is satisfiable, i.e., admits at least one model.

A union of conjunctive queries (UCQ) q over a DL-LiteR TBox T and constants C is a FOL formula of the form 9y1.conj 1(x; y1) _ _ 9yn.conj n(x; yn), with free variables x, existentially quantified variables y1; : : : ; yn. Each conj i(x; yi) in q is a conjunction of atoms of the form N (z), P (z; z0), z = z0, where N and P respectively denote a concept and a role name occurring in T , and z, z0 are constants in C or variables in x or yi, for some i 2 f1; : : : ; ng. Given constants C (typically CA), the (certain)answers to q over (T; A) wrt C is the set ansC(q; T; A) of substitutions2 of the free variables of q with constants in C such that q evaluates to true in every model of (T; A). We also consider an extension of UCQs, called ECQs, which are the range-restricted queries of the query language EQL-Lite(UCQ) [ 6 ], that is, the FOL query language whose atoms are UCQs evaluated according to the certain answer semantics above. Formally, an ECQ over T and C is a possibly open formula of the form

Q ::= q j Q1 ^ Q2 j :Q j 9x.Q; where q denotes a UCQ over T and C, with the proviso that every free variable of a negative subquery, i.e., of the form :Q, must occur in a positive subquery (to guarantee range-restriction). Given constants C the answer to Q over (T; A) wrt C, is the set ansC(Q; T; A) of tuples of constants in C obtained by evaluating the FOL formula Q in the standard way, while considering each UCQ q appearing in it as a primitive predicate with extension ansC(q; T; A). For the connection with epistemic logic, see [ 6 ]. 3

Semantic Artifacts

A semantic artifact S = (T; A0; R) is a stateful device constituted by the information ontology (T; A0) and the action specification R.

– T is a DL-LiteR TBox, fixed once and for all, and capturing the intensional knowledge about the domain modeled by the semantic artifact. – A0 is a DL-LiteR ABox, which expresses extensional information, and constitutes the initial state of the artifact. We call proper constants the constants CA0 in A0, and labeled nulls all new constants introduced by action execution. – R is a set of actions, which change the state of the semantic artifact, i.e., the extensional information component.

An action is constituted by a signature and an effect specification. The action signature is constituted by a name and a list of individual input parameters. Such parameters need to be substituted by constants for the execution of the action. Given a substitution for the input parameters, we denote by the action with actual parameters.3

The effect specification consists of a set fe1; : : : ; eng of effects, which are assumed to take place simultaneously. Their formalization is inspired by the notion of mappings in–daqti+a e^xcQhianigsea[q8u].eSrypeocviefircTallayn,danCAef0fewcittheixhaass tfhreeefovarmriaqbi+les^, tQhait mayAi0incwluhdeeres:ome of the input parameters as query constants, qi+ is a UCQ, and Qi is an arbitrary ECQ whose free variables are included in those of qi+, namely in x. Intuitively, qi + selects the tuples to instantiate the effect, and Qi filters aways some of them. 2 As customary, we can view each substitution simply as a tuple of constants, assuming some ordering of the free variables of q. 3 We disregard a specific treatment of the output to the user, and assume instead that the user can freely pose queries to the ontology and thus extract implicit or explicit information from the states through which the semantic artifact evolves.

Loan

< closed 1..* < owes < peer

Customer In Debt Customer

Gold

Customer – A0i is a set of facts over T , which include as constants: constants in CA0 , parameters, free variables of qi+, and implicitly existentially quantified variables.

Given a state A of S , and a substitution for the parameters of the action , the effect ei extracts from A the set ans CA ((qi+ ^ Qi ) ; T ; A) of tuples of constants (proper constants and labeled nulls), and for each such tuple asserts a set A0i of facts obtained from A0i by applying the substitution for the free variables of qi+. For each existentially quantified variable z in A0i , a labeled null is introduced having the form fz;ei (x) , where fz;ei (x) is a Skolem function, depending on the existential variable z and the effect ei, having as arguments the free variables x of qi+. We denote by ei (A) the overall set of facts, i.e., ei (A) = S 2ansCA (Qi ;T;A) A0i . The overall effect of the action with parameter substitution over A is a new state do( ; T ; A) = S1 i n ei (A) for S . Notice that do( ; T ; A) may be inconsistent wrt T . In this case, the action with over A is not executable.

Let’s make some observations on such actions. The effects of an action are naturally a form of update of the previous state, and not of belief revision [ 9 ]. That is, we never learn new facts on the state in which an action is executed, but only on the state resulting from the action execution. We also observe that existentially quantified variables introduced by actions effects are used as witnesses of values chosen by the external user/environment when executing the action. We assume that such a choice depends only on the specific effects and the information retrieved by the query in the premises. We model such a choice by introducing suitable labeled nulls generated by appropriate Skolem functions. Finally, we observe that we do not make any persistence (or frame) assumption in our formalization [ 10 ]. In principle at every move we substitute the whole old state, i.e., ABox, with a new one. On the other hand, it should be clear that we can easily write effect specifications that copy big chunks of the old state into the new one. For example, P (x; y) P (x; y) copies the entire set of assertions involving the role P . Example 1. Let us consider a semantic artifact S = (T ; A0; R), where T is the DL-LiteR formalization of the UML diagram in Figure 1, which can be described as follows. A bank considers two specific types of customers: in-debt customers have a loan with the bank, while gold customers have access to special privileges. In-debt customers may have closed their loan. A relationship peer(c; p) between two customers denotes that p can vouch for c. The initial state is A0 = fGold(john); Cust(ann); peer(mark; john)g. R includes the following actions (we use brackets to isolate UCQs in ECQs):

GetLoan(c) : f [9p:peer(c; p) ^ Gold(p)]

fowes(c; newl (c))g;

CopyAll g

That is, customer c gets a loan provided that s/he has a peer that is “gold”. We use CopyAll as a shortcut to denote effects that copy all concepts and roles.

CloseAllLoans(c) : f [owes(c; l)] fclosed(c; l)g; CopyAll g

That is, customer c closes all his/her loans which are moved to the closed relation.

UpdateDebts : f [owes(x; l)] ^ :[closed(x; l)] fowes(x; l)g; [InDebt(x)] ^ 8l:[owes(x; l)] [closed(x; l)]

CopyAllExceptOwesClosedInDebt g fCust(x)g; That is, “remove” from owes those tuples that are in closed, and remove in-debt customers whose loans are all closed from InDebt, keeping them in Cust. CopyAllExceptOwesClosedInDebt copies everything except owes, closed, and InDebt. 4

Processes

Notice that semantic artifacts include information and actions to change such information. However, they do not say anything about how or when to apply a certain action. In other words, semantic artifact are agnostic to processes that use them. Processes over a semantic artifact S = (T ; A0; R) are (possibly nondeterministic) programs that use the state of S (accessed through T ) to store their (intermediate and final) computation results, and the actions in R as atomic instructions. The state A can be arbitrarily queried through query answering over T , while it can be updated only through the actions in R. There are many ways to specify processes over S . Here we adopt a rule-based specification.

A condition/action rule for a semantic artifact S is an expression of the form Q 7! , where is an action in R and Q is an ECQ over T and CA0 , whose free variables are exactly the parameters of . Such a rule expresses that, for each tuple for which condition Q holds, the action with actual parameters can be executed.

A process is a finite set = f 1; : : : ; ng of rules. Notice that processes don’t force the execution of actions but constrain them: the user of the process will be able to choose any of the actions that the rules forming the process allow. Notice also that our processes inherit entirely their states from the semantic artifact S , see e.g., [ 1 ]. Other choices are also possible: the process could maintain its own state besides the one of the semantic artifact. As long as such an additional state is finite, or embeddable into the semantic artifact itself, the results here would easily extend to such a case.

The execution of a process over a semantic artifact S is defined as follows: we start from the initial state A0 of the artifact, and for each rule Q 7! in , evaluate Q, and for each tuple returned execute , obtaining a new state A0 = do( ; T ; A0) if A0 is consistent wrt T (i.e., is actually executable), and so on. In this way we build a transition system ( ; S ) whose states represent possible artifact states and where each transition represents the execution of an instantiated action that is allowed according to the process. ( ; S ) captures the behavior of the process over the artifact S . In principle we can model-check such a transition system to verify dynamic properties. This is exactly what we are going to do next. However, one has to consider that in general such a transition system is infinite, so the classical results on model checking [ 2 ], which are developed for finite transition systems, do not apply.

Example 2. Referring to the example above, a process might include the following rules: – [Cust(x)] ^ :[9y:owes(x; y)] 7! GetLoan(x), i.e., a customer can get a loan if she does not have one already; – 9y:([owes(x; y)] ^ :[closed(x; y)]) 7! CloseAllLoans(x), i.e., a customer that owes loans that are not closed, can close them all at once; – true 7! UpdateDebts, i.e., it is always possible to perform UpdateDebts.

Verification Formalism

We turn to the verification formalism to be used to specify dynamic properties of processes running over semantic artifacts. Several choices are possible. Here we focus on a variant of -calculus [ 7 ], which is one of the most powerful temporal logics that subsumes both linear time logics, such as LTL and PSL, and branching time logics such as CTL and CTL* [ 2 ]. In particular, we introduce a variant of -calculus, called L that conforms with the basic assumption of our formalism, namely the use of ECQs to talk about the semantic information contained in the state of the artifact. Formally, given a semantic artifact S = (T; A0; R), formulas of L over S have the following form: ::= Q j : j 1 ^ 2 j 9x: j 2 j 3 j Z: j Z: j Z where Q is an ECQ over T and CA0 (but not labeled nulls), and Z is a predicate variable.

The symbols and can be considered as quantifiers, and we make use of the notions of scope, bound and free occurrences of variables, closed formulas, etc. The definitions of these notions are the same as in first-order logic, treating and as quantifiers. In fact, we are interested only in closed formulas as specification of temporal properties to verify. For formulas of the form Z: and Z: , we require the syntactic monotonicity of wrt Z: every occurrence of the variable Z in must be within the scope of an even number of negation signs. In -calculus, given the requirement of syntactic monotonicity, the least fixpoint Z: and the greatest fixpoint Z: always exist. In order to define the meaning of such formulas we resort to transition systems. Let A = ( ; S) be the transition system for a process over a semantic artifact S = (T; A0; R). We denote by A the states of A. Let V be a predicate and individual variable valuation on A, i.e., a mapping from the predicate variables Z to subsets of the states A, and from individual variables to constants in CA0 . Then, we assign meaning to -calculus formulas by associating to ( ; S) and V an extension function ( )VA, which maps -calculus formulas to subsets of A. The extension function ( )VA is defined inductively as follows:

A j ansCA0 (QV; T; A)g = fA 2 = ZV = (Q)VA (Z)A (: V)VA A ( A)VA ( 1 ^ 2)VA = ( 1)VA \A ( 2)VA ((93x:)VA)VA (2 )A

V A ( Z: )V ( Z: )VA A j E = Sf( )V[x=c] j c 2 CA0 g = fA 2 A j 9A0: A )A A0 and A0 2 ( )VAg = fA 2 A j 8A0: A )A A0 implies A0 2 ( )VAg = TfE A j ( )VA[Z=E] E g = SfE ( )VA[Z=E]g Here A )A A0 holds iff there exists a rule Q 7! in such that there exists a 2 ansCA (Q; T; A) and A0 = do( ; T; A0). That is, there exist a rule in that can fire on A and produce an instantiated action , which applied on A has A0 as effect.

Intuitively, the extension function ( )A assigns to the various -calculus constructs

V the following meanings. The boolean connectives have the expected meaning, while quantification is restricted to constants of CA0 , which are the only constants that the ECQs in the formula can retrieve. We also use the usual FOL abbreviations. The extension of 3 consists of the states A such that for some state A0 with A )A A0, we have that holds in A0. While the extension of 2 consists of the states A such that for all states A0 with A )A A0, we have that holds in A0. The extension of Z: is the smallest subset E of A such that, assigning to Z the extension E , the resulting extension of is contained in E . That is, the extension of X: is the least fixpoint of the operator

E :( )V[Z=E] (here V[Z=E ] denotes the predicate valuation obtained from V by forcing the valuation of Z to be E ). Similarly, the extension of X: is the greatest subset E of A such that, assigning to X the extension E , the resulting extension of contains A E . That is, the extension of X: is the greatest fixpoint of the operator E :( )V[X=E]. When is a closed formula, ( )VA does not depend on V, and we denote it by A.

The reasoning problem we are interested in is model checking, i.e., verify whether a L closed formula holds for the process over the semantic artifact S. Formally, such a problem is defined as checking whether A0 2 A, that is, whether is true in the root of the initial state A0 of transition system ( ; S).

Example 3. Continuing on our running example, we consider the following simple safety property: It is always true that gold customers in A0 remain so. This property can be written as:

Z:([Gold(x)] ^ 2Z)): This formula is true, since no action (among the ones above) removes individuals from being Gold.

Next, we consider a simple liveness property: It is possible to reach a state in which a gold customer is also an in-debt customer.

Z:([9x:Gold(x) ^ InDebt(x)] _ 3Z) This formula is true, because the ontology implies that if x participates to owes then x is an instance of InDebt; and we can reach a state in which 9x:Gold(x) ^ owes(x; y) holds by firing the action GetLoan(john), which is allowed by the process. 6

Homomorphism and Bisimulation

We want to understand when two ABoxes A1 and A2 over a common DL-LiteR TBox T provide the same answers to all EQL queries. Given two relational structures I1 and I2 over the same set of relations, and a set C of constants, a C-homomorphism h from I1 to I2 is a mapping from the domain of I1 to the domain of I2 that preserves constants in C and relations, i.e., h(cI1 ) = cI2 for each c 2 C, and if (d1; : : : ; dn) 2 rI1 , then (h(d1); : : : ; h(dn)) 2 rI2 , for each relation r. Then, we say that there is a C-homomorphism from A1 to A2 wrt T , denoted A1 !CT A2, iff there is a C-homomorphism from IA1 to each model of (T; A2), where IA1 is the structure whose domain is CA1 , whose constants are interpreted as themselves, and N IA1 = fc j N (c) 2 A1g for each concept name N , similarly for role names. This property can be checked by resorting to query answering over ontologies. For the ABox A1, let QA1 be the boolean conjunctive query obtained as the conjunction of all facts in A1, in which the constants not in C are treated as existentially quantified variables. Lemma 1. Given a DL-LiteR TBox T , two ABoxes A1 and A2 over T , and a set C of constants, we have that A1 !CT A2 iff ansC(QA1 ; T; A2) = true. Proof (sketch). We remind that a DL-LiteR ontology (T; A) has a unique (up to renaming of domain elements) canonical-model [ 5 ], which is the model that has a CAhomomorphism to each model of (T; A). By composing homomorphisms, we have that A1 !CT A2 iff there is a C-homomorphism from IA1 to the canonical model of (T; A2). The claim then follows by considering that there is a C-homomorphism from IA1 to the canonical model of (T; A2) iff ansC(QA1 ; T; A2) = true [ 11,5 ]. tu A1 and A2 are said C-homomorphically equivalent wrt T if A1 !CT A2 and A2 !CT A1. Theorem 1. Let C be a set of constants, and A1, A2 two ABoxes that are Chomomorphically equivalent wrt a TBox T . Then, for every ECQ Q over T using only constants in C, we have that ansC(Q; T; A1) = ansC(Q; T; A2).

Next we want to capture when two states of a single transition system or more generally of two transition systems, possibly obtained by applying different processes to different semantic artifacts sharing the same TBox and constants in the initial state, can be considered behaviourally equivalent, in the sense that they satisfy exactly the same

L formulas. To formally capture such an equivalence, we make use of the notion of bisimulation [ 12 ], suitably extended to deal with query answering over ontologies.

Given two artifact transition systems A1 = ( 1; S1) (with states A1 ) and A2 = ( 2; S2) (with states A2 ) such that S1 = (T; A10; R1) and S2 = (T; A20; R2) share the same TBox T and the same constants C = CA10 = CA20 , a bisimulation is a relation B A1 A2 such that: (A1; A2) 2 B implies that: 1. A1 and A2 are C-homomorphically equivalent wrt T ; 2. if A1 )A1 A01 then there exists A02 such that A2 )A2 A02 and (A01; A02) 2 B; 3. if A2 )A2 A02 then there exists A01 such that A1 )A1 A01 and (A01; A02) 2 B. We say that two states A1 and A2 are bisimilar, if there exists a bisimulation B such that (A1; A2) 2 B. Two transition systems A1 with initial state A10 and A2 with initial state A20 are bisimilar if (A10; A20) 2 B.

The following theorem states that the formula evaluation in L is indeed invariant wrt bisimulation, so we can equivalently check any bisimilar transition systems. Theorem 2. Let A1 and A2 be two transition systems obtained from two semantic artifacts sharing the same TBox and constants. Then, for two states A1 of A1 and A2 of A2 (including the initial ones) are bisimilar iff for all L closed formulas over the two semantic artifacts, we have that A1 2 ( )A1 iff A2 2 ( )A2 .

Proof. The proof is analogous to the standard proof of bisimulation invariance of calculus [ 7 ], though taking into account our specific definition of bisimulation, using Theorem 1 to guarantee that ECQs are evaluated identically over bisimilar states. tu 7

Undecidability and Decidability

We now show that, not surprisingly, verification in the infinite state setting we considered is undecidable, but that it becomes decidable under some interesting conditions inspired by the recent literature on data exchange [ 8 ]. Our results rely on the possibility of building special semantic artifacts that we call “inflationary approximates”. For such special artifacts there exists a tight correspondence between the application of an action and a step in the chase of a set of tuple-generating dependencies (TGDs) [ 13,8 ]

Given a semantic artifact S = (T; A0; R), its inflationary approximate is the semantic artifact S+ = (T +; A0; R+) defined as follows. T + is obtained from T by dropping all assertions involving negation on the right-hand side, thus obtaining a TBox formed by positive inclusions only. R+ is formed by one action specification + for each action specification 2 R, where + is obtained from by: – removing all input parameters from the signature – note that the variables in qi+ that used to be parameters in , become free variables in +; – substituting each effect ei : qi+ ^ Qi A0i with ei : qi+ A0i – note that we need to preserve the Skolem functions name in the transformation; – adding effects to copy all concept and role names, namely adding an effect N (x) N (x) for each concept name N of T , and an effect P (x; y) P (x; y) for each role name P of T .

Observe that executing actions in S+ can never give rise to an inconsistency, since T + does not contain any negative information [ 5 ].

We also consider the generic process >, in which all condition/action rules have the trivially true condition. Hence, > allows for executing every action at every step. With these notions in place, it is easy to prove that verification in this setting is undecidable. Theorem 3.

L model checking of processes over semantic artifacts is undecidable. Proof (sketch). We show that it is already undecidable to check, given a semantic artifact S;+ = (;; A0; R), of the form of an inflationary approximate of an artifact with an empty TqBisoxa,baonodlethaen tUraCnsQit.ioWnesyosbtseemrvAe t=hat t(he>se;Sto;+f)a,lwlhaectthioenrsAi0n 2S+ Zca(nq _be3seZe)nAa,swahesreet of TGDs, indeed it suffices to consider one TGD for each disjunct in the UCQ on the left-hand side of an effect specification. So, we can reduce to the above model checking problem the problem of answering boolean UCQs in a relational database under a set of TGDs, which is undecidable [ 14 ]

Next, we observe a notable property of the transition system ( >; S+) generated by the generic process + of a semantic artifact S. Namely, each do( +; >T;o)veisr tahmeionnfloattoionnicaroypearpaptroor.xTimhaisteimSplies that by repeatedly applying such operators starting from the ABox A0 in S+, we get at the limit (possibly transfinite) a single ABox Amax , which is the least fixpoint of such operators taken collectively [ 15,16 ]. Such an ABox contains, every fact generated by repeatedly executing actions from the inflationary approximate S+, that is every state A+ of ( >; S+) is such that A+ is contained in Amax . More interestingly, we show next that Amax contains also every A generated by repeatedly executing actions from the original S. Lemma 2. Let S = (T; A0; R) be a semantic artifact and a process over S. Then every state A of the transition system ( ; S) is a subset of Amax defined as above. Proof (sketch). We can show by induction that, for every sequence of actions 1 1; 2 2; : : : ; n n generated by the process starting from A0, the resulting state do( n n; T; do( do( 2 2; T; do( 1 1; T; A0)) )) is a subset of the corresponding resulting state do( n+T +; do( do( 2+; T +; do( 1+; T +; A0)) )) of the inflationary approx., which is a subset of Amax . tu

In other words, Amax generated by the generic process > running over the inflationary approximate S+ of a semantic artifact S, bounds all states A that any process can generate by running on S. Hence if for any reason Amax is finite, then the transition system ( ; S) is finite. Hence, being model checking of finite transition system decidable (in fact polynomial in the size of the transition system), we get that also model checking of ( ; S) is decidable.

To get finiteness guarantees on Amax , we take advantage of the correspondence between action execution and TGDs chase steps, as in the proof of Theorem 3. We build on this correspondence by further considering that in DL-LiteR, every UCQ q over a TBox can be rewritten as a new UCQ rew (q) over the same alphabet, to be evaluated over the ABox considered as a relational database [ 5 ] (that is dropping the TBox).

In the literature for data exchange, several conditions that guarantee the finiteness of the chase of TGDs have been considered [ 17,18 ]. Here we focus on the original notion of weakly-acyclic TGDs [ 17 ]. Weak-acyclicity is a syntactic notion that involves the so-called dependency graph of the set of TGDs. Informally, a set D of TGDs is weakly-acyclic if there are no cycles in the dependency graph of D involving “existential” relation positions. The key property of weakly-acyclic TGDs is that chasing a relational database with them (i.e., applying them in all possible ways) generates a set of facts (a database) that is finite. See [ 17 ] for details.

Given a semantic artifact S = (T; A0; R) and its inflationary approximate S+ = (T +; A0; R+), we define the set E+ of effect specifications that includes an effect rew (qi+) Ai for each effect qi+ ; Ai of an action + 2 R+. Notice that the set E+ ; can be seen as a set of TGDs. We say that a semantic artifact S is weakly-acyclic if the set E+ seen as a set of TGDs is weakly-acyclic. (Note that the semantic artifact in our ; example is trivially weakly-acyclic.) Lemma 3. Let S = (T; A0; R) be a weakly-acyclic semantic artifact and S+ its inflationary approximate. Then Amax computed as above for S+ is finite.

Proof (sketch). We have to show that starting from A0 we get to the least fixpoint Amax of the do( +; T; ) operator in a finite number of steps. To do so, we follow the line of the proof of finiteness of chase of weakly-acyclic TGDs in [ 17 ], and show that the number of Skolem terms generated by the effects of actions is bounded by a polynomial in the size of A0. Differently from [ 17 ], we cannot rely on the notion of homomorphism to stop firing actions, but have to use membership of the new set of facts in the previous ones. tu

As a direct consequence of Lemma 2 and Lemma 3, for weakly-acyclic semantic artifacts verification is decidable.

Theorem 4. decidable.

L model checking of processes over weakly-acyclic semantic artifacts is Proof (sketch). The result follows by observing that every state generated by the execution of any process over a weakly-acyclic semantic artifact S is a subset of Amax , which by Lemma 3 is finite. Hence, we can apply a model checking procedure for -calculus on finite-state systems [ 7 ]. tu Note that the proof of Theorem 4 is giving us a single exponential upper bound (in the size of A0) for L model checking involving weakly-acyclic semantic artifacts. In this paper we have studied verification of processes over semantic artifacts. We obtain an interesting decidability result by relying on the notion of inflationary approximate, which allows for a connection with the theory of chase of TGDs in relational databases. We close by observing that while we fully used the ontology for querying the artifact state, we use it in a limited way when updating the state, namely only to guarantee consistency. Ontology update has its own semantic and computational difficulties, see e.g., [ 19 ], which our approach sidesteps. However, if one could introduce a suitable notion of inflationary approximate in that case, the approach presented here could be used to devise decidable cases.

1. Cohn , D. , Hull , R.: Business artifacts: A data-centric approach to modeling business operations and processes . IEEE Bull. on Data Engineering 32 ( 3 ) ( 2009 ) 3 - 9

2. Clarke , E.M. , Grumberg , O. , Peled , D.A. : Model checking . The MIT Press ( 1999 )

3. Cangialosi , P. , De Giacomo , G. , De

Masellis

, R. , Rosati , R.: Conjunctive artifact-centric services . In: Proc. of ICSOC 2010 . Volume 6470 of LNCS ., Springer ( 2010 ) 318 - 333

4. Damaggio , E. , Deutsch , A. , Vianu , V. : Artifact systems with data dependencies and arithmetic . In: Proc. of ICDT 2011 . ( 2011 ) 66 - 77

5. Calvanese , D. , De Giacomo , G. , Lembo , D. , Lenzerini , M. , Rosati , R. : Tractable reasoning and efficient query answering in description logics: The DL-Lite family . J. of Automated Reasoning 39 ( 3 ) ( 2007 ) 385 - 429

6. Calvanese , D. , De Giacomo , G. , Lembo , D. , Lenzerini , M. , Rosati , R.: EQL-Lite: Effective first-order query processing in description logics . In: Proc. of IJCAI 2007 . ( 2007 ) 274 - 279

7. Emerson , E.A. : Automated temporal reasoning about reactive systems . In: Logics for Concurrency: Structure versus Automata . Volume 1043 of LNCS . Springer ( 1996 ) 41 - 101

8. Kolaitis , P.G.: Schema mappings, data exchange, and metadata management . In: Proc. of PODS 2005 . ( 2005 ) 61 - 75

9. Katsuno , H. , Mendelzon , A. : On the difference between updating a knowledge base and revising it . In: Proc. of KR'91 . ( 1991 )

10. Reiter , R. : Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems . The MIT Press ( 2001 )

11. Chandra , A.K. , Merlin , P.M.: Optimal implementation of conjunctive queries in relational data bases . In: Proc. of STOC'77 . ( 1977 ) 77 - 90

12. Milner , R.: An algebraic definition of simulation between programs . In: Proc. of IJCAI'71 . ( 1971 ) 481 - 489

13. Abiteboul , S. , Hull , R. , Vianu , V. : Foundations of Databases. Addison Wesley ( 1995 )

14. Beeri , C. , Vardi , M.Y.: The implication problem for data dependencies . In: Proc. of ICALP'81 . Volume 115 of LNCS., Springer ( 1981 ) 73 - 85

15. Tarski , A. : A lattice-theoretical fixpoint theorem and its applications . Pacific J. of Mathematics 5 ( 2 ) ( 1955 ) 285 - 309

16. Gurevich , Y. , Shelah , S. : Fixed-point extensions of first order logic . Annals of Pure and Applied Logics 32 ( 1986 ) 265 - 280

17. Fagin , R. , Kolaitis , P.G. , Miller , R.J. , Popa , L. : Data exchange: Semantics and query answering . Theor. Comp. Sci . 336 ( 1 ) ( 2005 ) 89 - 124

18. Meier , M. , Schmidt , M. , Lausen , G.: On chase termination beyond stratification . PVLDB 2 ( 1 ) ( 2009 ) 970 - 981

19. Calvanese , D. , Kharlamov , E. , Nutt , W. , Zheleznyakov , D. : Evolution of DL-Lite knowledge bases . In: Proc. of ISWC 2010 . Volume 6496 of LNCS ., Springer ( 2010 ) 112 - 128