=Paper= {{Paper |id=None |storemode=property |title=Verification of Conjunctive-Query Based Semantic Artifacts |pdfUrl=https://ceur-ws.org/Vol-745/paper_3.pdf |volume=Vol-745 |dblpUrl=https://dblp.org/rec/conf/dlog/HaririCGM11 }} ==Verification of Conjunctive-Query Based Semantic Artifacts== https://ceur-ws.org/Vol-745/paper_3.pdf
                Verification of Conjunctive-Query Based
                           Semantic Artifacts?

                         Babak Bagheri Hariri1 , Diego Calvanese1 ,
                     Giuseppe De Giacomo2 , and Riccardo De Masellis2
            1
          KRDB Research Centre                             Dip. di Informatica e Sistemistica
    Free University of Bozen-Bolzano                         Sapienza Università di Roma
         I-39100 Bolzano, Italy                           Via Ariosto, 25, 00185 Rome, Italy
{bagheri,calvanese}@inf.unibz.it                          lastname@dis.uniroma1.it



         Abstract. We introduce semantic artifacts, which are a mechanism that provides
         both a semantically rich representation of the information on the domain of interest
         in terms of an ontology, including the underlying data, and a set of actions to
         change such information over time. In this paper, the ontology is specified as a
         DL-Lite TBox together with an ABox that may contain both (known) constants
         and unknown individuals (labeled nulls, represented as Skolem terms). Actions are
         specified as sets of conditional effects, where conditions are based on conjunctive
         queries over the ontology (TBox and ABox), and effects are expressed in terms of
         new ABoxes. In this setting, which is obviously not finite state, we address the
         verification of temporal/dynamic properties expressed in µ-calculus. Notably, we
         show decidability of verification, under a suitable restriction inspired by the notion
         of acyclicity in data exchange.


1      Introduction
The artifact-centric approach to service modeling, introduced recently, considers both
data and processes as first-class citizens in service design and analysis. Artifacts are
advocated as a sort of middle ground between a conceptual formalization of a dynamic
system and an actual implementation of the system itself [1]. The verification of temporal
properties in the presence of data represents a significant challenge for the research
community, since the system becomes infinite-state, and hence the usual analysis based
on model checking of finite-state systems [2] is impossible in general. Recently, there
have been some advancements on this issue, considering suitably constrained relational
database settings for the data component (which acts also as a data storage for the
process), see e.g., [3,4].
    In this paper, we follow the line of [3], and rely on the work in knowledge representa-
tion to propose a more conceptual treatment of artifacts. We do so by enriching artifacts
with a semantic layer constituted by a full-fledged ontology expressed in description
logics. In particular, our semantic artifacts include an ontology specified in DL-LiteR [5],
which is the core of the web ontology language OWL2 QL1 , and it is particularly well
suited for data management. The TBox of the ontology captures intensional information
?
     This work has been supported by the EU FP7-ICT Project ACSI (257593).
 1
     http://www.w3.org/TR/owl2-profiles/
2        B. Bagheri Hariri, D. Calvanese, G. De Giacomo, and R. De Masellis

on the domain of interest, similarly to conceptual data models, such as UML class
diagram, though as a software component to be used at runtime. The ABox, which acts
as the artifact state, maintains the data of interest, which are accessed by relying on
query answering through ontologies. As a query language, we use unions of conjunctive
queries, possibly composing their certain answers through full FOL constructs [6]. A
semantic artifact has associated actions whose execution changes the state of the artifact,
i.e., its ABox. Such actions are specified as sets of conditional effects, where conditions
are queries over the ontology and effects are expressed in terms of new ABoxes. To
capture data that are acquired from external users/environments during the execution of
actions, such ABoxes may contain special constants called labeled nulls, represented as
Skolem terms. These represent individuals that the artifact does not know, but on which
it knows some facts. Actions have no pre-condition, instead processes over the semantic
artifact are used to specify which actions can be executed at each step. We model such
processes as condition/action rules, where the condition is again expressed as a query
over the ontology.
     In this setting, which is obviously not finite state, we address the verification of
temporal/dynamic properties expressed in µ-calculus [7], where atomic formulas are
queries over the ontology that can refer only to known individuals. Unsurprisingly, we
show that even for very simple semantic artifacts and dynamic properties verification
is undecidable. However, we also show that for a very rich class of semantic artifacts,
verification is indeed decidable and reducible to finite-state model checking. To obtain
this result, we rely on recent results on the finiteness of the chase of tuple-generating
dependencies in the data exchange literature [8].

2    Preliminaries
As ontology language, we make use of DL-LiteR , a member of the DL-Lite family [5],
and hence a tractable DL particularly suited for dealing with ontologies (or KBs) with
very large ABoxes, which can be managed through relational database technology. DL-
LiteR is also the core of the standard web ontology language OWL2 QL. In DL-LiteR ,
concepts and roles are formed according to the following syntax:
             B ::= N | ∃U                                   U ::= P | P −
             C ::= B | ¬B | ∃U .B                           V ::= U | ¬U
N , B, and C respectively denote a concept name, a basic concept, and an arbitrary
concept. P , P − , U , and V respectively denote a role name, an inverse role, a basic role,
and an arbitrary role. A DL-LiteR ontology is a pair (T, A), where T is a TBox, i.e., a
finite set of (concept and role) inclusion assertions of the forms B v C and U v V ;
and A is an ABox, i.e., a finite set of facts (also called membership assertions) of the
forms N (c1 ) and P (c1 , c2 ), where N and P occur in T , and c1 and c2 are constants.
We denote with CA the set of constants appearing in A. The semantics of a DL-LiteR
ontology is the usual one for DLs, see [5]. Notice that in DL-LiteR the unique name
assumption is immaterial, since there is no way of deducing facts about equality. We say
that A is consistent wrt T if (T, A) is satisfiable, i.e., admits at least one model.
    A union of conjunctive queries (UCQ) q over a DL-LiteR TBox T and constants C
is a FOL formula of the form ∃y 1 .conj 1 (x, y 1 ) ∨ · · · ∨ ∃y n .conj n (x, y n ), with free
                           Verification of Conjunctive-Query Based Semantic Artifacts             3

variables x, existentially quantified variables y 1 , . . . , y n . Each conj i (x, yi ) in q is a
conjunction of atoms of the form N (z), P (z, z 0 ), z = z 0 , where N and P respectively
denote a concept and a role name occurring in T , and z, z 0 are constants in C or variables
in x or yi , for some i ∈ {1, . . . , n}. Given constants C (typically CA ), the (certain-
)answers to q over (T, A) wrt C is the set ans C (q, T, A) of substitutions2 θ of the free
variables of q with constants in C such that qθ evaluates to true in every model of (T, A).
We also consider an extension of UCQs, called ECQs, which are the range-restricted
queries of the query language EQL-Lite(UCQ) [6], that is, the FOL query language
whose atoms are UCQs evaluated according to the certain answer semantics above.
Formally, an ECQ over T and C is a possibly open formula of the form
                           Q ::= q | Q1 ∧ Q2 | ¬Q | ∃x.Q,
where q denotes a UCQ over T and C, with the proviso that every free variable of a
negative subquery, i.e., of the form ¬Q, must occur in a positive subquery (to guarantee
range-restriction). Given constants C the answer to Q over (T, A) wrt C, is the set
ans C (Q, T, A) of tuples of constants in C obtained by evaluating the FOL formula Q in
the standard way, while considering each UCQ q appearing in it as a primitive predicate
with extension ans C (q, T, A). For the connection with epistemic logic, see [6].

3    Semantic Artifacts
A semantic artifact S = (T, A0 , R) is a stateful device constituted by the information
ontology (T, A0 ) and the action specification R.
  – T is a DL-LiteR TBox, fixed once and for all, and capturing the intensional knowl-
     edge about the domain modeled by the semantic artifact.
  – A0 is a DL-LiteR ABox, which expresses extensional information, and constitutes
     the initial state of the artifact. We call proper constants the constants CA0 in A0 , and
     labeled nulls all new constants introduced by action execution.
  – R is a set of actions, which change the state of the semantic artifact, i.e., the
     extensional information component.
    An action ρ is constituted by a signature and an effect specification. The action signa-
ture is constituted by a name and a list of individual input parameters. Such parameters
need to be substituted by constants for the execution of the action. Given a substitution θ
for the input parameters, we denote by ρθ the action with actual parameters.3
    The effect specification consists of a set {e1 , . . . , en } of effects, which are assumed
to take place simultaneously. Their formalization is inspired by the notion of mappings
in data exchange [8]. Specifically, an effect ei has the form qi+ ∧ Q−      i     A0i where:
      +       −
  – qi ∧ Qi is a query over T and CA0 with x as free variables, that may include some
     of the input parameters as query constants, qi+ is a UCQ, and Q−          i is an arbitrary
     ECQ whose free variables are included in those of qi+ , namely in x. Intuitively, qi+
     selects the tuples to instantiate the effect, and Q−  i filters aways some of them.
 2
   As customary, we can view each substitution simply as a tuple of constants, assuming some
   ordering of the free variables of q.
 3
   We disregard a specific treatment of the output to the user, and assume instead that the user can
   freely pose queries to the ontology and thus extract implicit or explicit information from the
   states through which the semantic artifact evolves.
4        B. Bagheri Hariri, D. Calvanese, G. De Giacomo, and R. De Masellis

                                                            Customer
                                                 < peer
                                    < closed
                                                     In Debt         Gold
                           Loan
                                   1..* < owes      Customer       Customer



                              Fig. 1. A semantic artifact ontology


  – A0i is a set of facts over T , which include as constants: constants in CA0 , parameters,
     free variables of qi+ , and implicitly existentially quantified variables.
    Given a state A of S, and a substitution σ for the parameters of the action ρ, the
effect ei extracts from A the set ans CA ((qi+ ∧ Q− i )σ, T, A) of tuples of constants (proper
constants and labeled nulls), and for each such tuple θ asserts a set A0i σθ of facts obtained
from A0i σ by applying the substitution θ for the free variables of qi+ . For each existentially
quantified variable z in A0i σ, a labeled null is introduced having the form fz,ei (x)σθ,
where fz,ei (x) is a Skolem function, depending on the existential variable z and the
effect ei , having as argumentsSthe free variables x of qi+ . We denote by ei σ(A) the overall
                                                      0
set of facts, i.e., ei σ(A) = θ∈ans C (Qi σ,T,A) Ai σθ. The overall effect of the action ρ
                                        A                                  S
with parameter substitution σ over A is a new state do(ρσ, T, A) = 1≤i≤n ei σ(A) for
S. Notice that do(ρσ, T, A) may be inconsistent wrt T . In this case, the action ρ with σ
over A is not executable.
    Let’s make some observations on such actions. The effects of an action are naturally a
form of update of the previous state, and not of belief revision [9]. That is, we never learn
new facts on the state in which an action is executed, but only on the state resulting from
the action execution. We also observe that existentially quantified variables introduced by
actions effects are used as witnesses of values chosen by the external user/environment
when executing the action. We assume that such a choice depends only on the specific
effects and the information retrieved by the query in the premises. We model such a
choice by introducing suitable labeled nulls generated by appropriate Skolem functions.
Finally, we observe that we do not make any persistence (or frame) assumption in our
formalization [10]. In principle at every move we substitute the whole old state, i.e.,
ABox, with a new one. On the other hand, it should be clear that we can easily write
effect specifications that copy big chunks of the old state into the new one. For example,
P (x, y)      P (x, y) copies the entire set of assertions involving the role P .
Example 1. Let us consider a semantic artifact S = (T, A0 , R), where T is the DL-LiteR
formalization of the UML diagram in Figure 1, which can be described as follows. A bank
considers two specific types of customers: in-debt customers have a loan with the bank, while
gold customers have access to special privileges. In-debt customers may have closed their loan.
A relationship peer(c, p) between two customers denotes that p can vouch for c. The initial state
is A0 = {Gold(john), Cust(ann), peer(mark, john)}. R includes the following actions (we use
brackets to isolate UCQs in ECQs):

       GetLoan(c) : { [∃p.peer(c, p) ∧ Gold(p)]            {owes(c, newl (c))},    CopyAll }

That is, customer c gets a loan provided that s/he has a peer that is “gold”. We use CopyAll as a
shortcut to denote effects that copy all concepts and roles.

              CloseAllLoans(c) : { [owes(c, l)]           {closed(c, l)},     CopyAll }
                            Verification of Conjunctive-Query Based Semantic Artifacts         5

That is, customer c closes all his/her loans which are moved to the closed relation.
         UpdateDebts : { [owes(x, l)] ∧ ¬[closed(x, l)]   {owes(x, l)},
                         [InDebt(x)] ∧ ∀l.[owes(x, l)] ⊃ [closed(x, l)]       {Cust(x)},
                         CopyAllExceptOwesClosedInDebt }
That is, “remove” from owes those tuples that are in closed, and remove in-debt customers whose
loans are all closed from InDebt, keeping them in Cust. CopyAllExceptOwesClosedInDebt
copies everything except owes, closed, and InDebt.


4     Processes
Notice that semantic artifacts include information and actions to change such information.
However, they do not say anything about how or when to apply a certain action. In
other words, semantic artifact are agnostic to processes that use them. Processes over a
semantic artifact S = (T, A0 , R) are (possibly nondeterministic) programs that use the
state of S (accessed through T ) to store their (intermediate and final) computation results,
and the actions in R as atomic instructions. The state A can be arbitrarily queried through
query answering over T , while it can be updated only through the actions in R. There
are many ways to specify processes over S. Here we adopt a rule-based specification.
    A condition/action rule π for a semantic artifact S is an expression of the form
Q 7→ ρ, where ρ is an action in R and Q is an ECQ over T and CA0 , whose free variables
are exactly the parameters of ρ. Such a rule expresses that, for each tuple θ for which
condition Q holds, the action ρ with actual parameters θ can be executed.
    A process is a finite set Π = {π1 , . . . , πn } of rules. Notice that processes don’t
force the execution of actions but constrain them: the user of the process will be able to
choose any of the actions that the rules forming the process allow. Notice also that our
processes inherit entirely their states from the semantic artifact S, see e.g., [1]. Other
choices are also possible: the process could maintain its own state besides the one of the
semantic artifact. As long as such an additional state is finite, or embeddable into the
semantic artifact itself, the results here would easily extend to such a case.
    The execution of a process Π over a semantic artifact S is defined as follows: we
start from the initial state A0 of the artifact, and for each rule Q 7→ ρ in Π, evaluate Q,
and for each tuple θ returned execute ρθ, obtaining a new state A0 = do(ρθ, T, A0 ) if
A0 is consistent wrt T (i.e., ρθ is actually executable), and so on. In this way we build a
transition system Υ (Π, S) whose states represent possible artifact states and where each
transition represents the execution of an instantiated action that is allowed according
to the process. Υ (Π, S) captures the behavior of the process Π over the artifact S. In
principle we can model-check such a transition system to verify dynamic properties. This
is exactly what we are going to do next. However, one has to consider that in general
such a transition system is infinite, so the classical results on model checking [2], which
are developed for finite transition systems, do not apply.
Example 2. Referring to the example above, a process Π might include the following rules:
    – [Cust(x)] ∧ ¬[∃y.owes(x, y)] 7→ GetLoan(x), i.e., a customer can get a loan if she does not
      have one already;
    – ∃y.([owes(x, y)] ∧ ¬[closed(x, y)]) 7→ CloseAllLoans(x), i.e., a customer that owes loans
      that are not closed, can close them all at once;
    – true 7→ UpdateDebts, i.e., it is always possible to perform UpdateDebts.
6       B. Bagheri Hariri, D. Calvanese, G. De Giacomo, and R. De Masellis

5   Verification Formalism
We turn to the verification formalism to be used to specify dynamic properties of
processes running over semantic artifacts. Several choices are possible. Here we focus
on a variant of µ-calculus [7], which is one of the most powerful temporal logics that
subsumes both linear time logics, such as LTL and PSL, and branching time logics such
as CTL and CTL* [2]. In particular, we introduce a variant of µ-calculus, called µL that
conforms with the basic assumption of our formalism, namely the use of ECQs to talk
about the semantic information contained in the state of the artifact. Formally, given a
semantic artifact S = (T, A0 , R), formulas of µL over S have the following form:

             Φ ::= Q | ¬Φ | Φ1 ∧ Φ2 | ∃x.Φ | 2Φ | 3Φ | µZ.Φ | νZ.Φ | Z

where Q is an ECQ over T and CA0 (but not labeled nulls), and Z is a predicate variable.
    The symbols µ and ν can be considered as quantifiers, and we make use of the
notions of scope, bound and free occurrences of variables, closed formulas, etc. The
definitions of these notions are the same as in first-order logic, treating µ and ν as
quantifiers. In fact, we are interested only in closed formulas as specification of temporal
properties to verify. For formulas of the form µZ.Φ and νZ.Φ, we require the syntactic
monotonicity of Φ wrt Z: every occurrence of the variable Z in Φ must be within the
scope of an even number of negation signs. In µ-calculus, given the requirement of
syntactic monotonicity, the least fixpoint µZ.Φ and the greatest fixpoint νZ.Φ always
exist. In order to define the meaning of such formulas we resort to transition systems.
Let A = Υ (Π, S) be the transition system for a process Π over a semantic artifact
S = (T, A0 , R). We denote by ΣA the states of A. Let V be a predicate and individual
variable valuation on A, i.e., a mapping from the predicate variables Z to subsets of the
states ΣA , and from individual variables to constants in CA0 . Then, we assign meaning to
µ-calculus formulas by associating to Υ (Π, S) and V an extension function (·)A     V , which
maps µ-calculus formulas to subsets of ΣA . The extension function (·)A         V is defined
inductively as follows:

            (Q)A
               V        = {A ∈ ΣA | ans CA0 (QV , T, A)}
            (Z)A
               V        = Z V ⊆ ΣA
            (¬Φ)AV      = ΣA − (Φ)A V
                      A
                          (Φ1 )A
            (Φ1 ∧ Φ2 )V = S             A
                               V ∩ (Φ2 )V
                   A             A
            (∃x.Φ)V     = {(Φ)V[x/c] | c ∈ CA0 }
            (3Φ)VA
                        = {A ∈ ΣA | ∃A0 . A ⇒A A0 and A0 ∈ (Φ)A      V}
                 A                       0           0           0
            (2Φ)V       = {A  ∈ Σ A | ∀A   . A ⇒A  A   implies A   ∈ (Φ)A
                                                                        V}
                   A                         A
                          T
            (µZ.Φ)V = {E ⊆ ΣA | (Φ)V[Z/E] ⊆ E}
            (νZ.Φ)A     = {E ⊆ ΣA | E ⊆ (Φ)A
                          S
                   V                             V[Z/E] }

Here A ⇒A A0 holds iff there exists a rule Q 7→ ρ in Π such that there exists a
θ ∈ ans CA (Q, T, A) and A0 = do(ρθ, T, A0 ). That is, there exist a rule in Π that can
fire on A and produce an instantiated action ρθ, which applied on A has A0 as effect.
     Intuitively, the extension function (·)AV assigns to the various µ-calculus constructs
the following meanings. The boolean connectives have the expected meaning, while
quantification is restricted to constants of CA0 , which are the only constants that the ECQs
                           Verification of Conjunctive-Query Based Semantic Artifacts           7

in the formula can retrieve. We also use the usual FOL abbreviations. The extension of
3Φ consists of the states A such that for some state A0 with A ⇒A A0 , we have that Φ
holds in A0 . While the extension of 2Φ consists of the states A such that for all states
A0 with A ⇒A A0 , we have that Φ holds in A0 . The extension of µZ.Φ is the smallest
subset Eµ of ΣA such that, assigning to Z the extension Eµ , the resulting extension of Φ
is contained in Eµ . That is, the extension of µX.Φ is the least fixpoint of the operator
λE.(Φ)A V[Z/E] (here V[Z/E] denotes the predicate valuation obtained from V by forcing
the valuation of Z to be E). Similarly, the extension of νX.Φ is the greatest subset Eν
of ΣA such that, assigning to X the extension Eν , the resulting extension of Φ contains
Eν . That is, the extension of νX.Φ is the greatest fixpoint of the operator λE.(Φ)A
                                                                                   V[X/E] .
                                    A                                              A
When Φ is a closed formula, (Φ)V does not depend on V, and we denote it by Φ .
     The reasoning problem we are interested in is model checking, i.e., verify whether
a µL closed formula Φ holds for the process Π over the semantic artifact S. Formally,
such a problem is defined as checking whether A0 ∈ ΦA , that is, whether Φ is true in the
root of the initial state A0 of transition system Υ (Π, S).
Example 3. Continuing on our running example, we consider the following simple safety prop-
erty: It is always true that gold customers in A0 remain so. This property can be written as:

                           ∀x.([Gold(x)] ⊃ νZ.([Gold(x)] ∧ 2Z)).

This formula is true, since no action (among the ones above) removes individuals from being Gold.
    Next, we consider a simple liveness property: It is possible to reach a state in which a gold
customer is also an in-debt customer.

                            µZ.([∃x.Gold(x) ∧ InDebt(x)] ∨ 3Z)

This formula is true, because the ontology implies that if x participates to owes then x is an
instance of InDebt; and we can reach a state in which ∃x.Gold(x) ∧ owes(x, y) holds by firing
the action GetLoan(john), which is allowed by the process.


6    Homomorphism and Bisimulation
We want to understand when two ABoxes A1 and A2 over a common DL-LiteR TBox
T provide the same answers to all EQL queries. Given two relational structures I1
and I2 over the same set of relations, and a set C of constants, a C-homomorphism
h from I1 to I2 is a mapping from the domain of I1 to the domain of I2 that
preserves constants in C and relations, i.e., h(cI1 ) = cI2 for each c ∈ C, and if
(d1 , . . . , dn ) ∈ rI1 , then (h(d1 ), . . . , h(dn )) ∈ rI2 , for each relation r. Then, we
say that there is a C-homomorphism from A1 to A2 wrt T , denoted A1 →CT A2 , iff
there is a C-homomorphism from IA1 to each model of (T, A2 ), where IA1 is the
structure whose domain is CA1 , whose constants are interpreted as themselves, and
N IA1 = {c | N (c) ∈ A1 } for each concept name N , similarly for role names. This
property can be checked by resorting to query answering over ontologies. For the ABox
A1 , let QA1 be the boolean conjunctive query obtained as the conjunction of all facts in
A1 , in which the constants not in C are treated as existentially quantified variables.
Lemma 1. Given a DL-LiteR TBox T , two ABoxes A1 and A2 over T , and a set C of
constants, we have that A1 →CT A2 iff ans C (QA1 , T, A2 ) = true.
8       B. Bagheri Hariri, D. Calvanese, G. De Giacomo, and R. De Masellis

Proof (sketch). We remind that a DL-LiteR ontology (T, A) has a unique (up to re-
naming of domain elements) canonical-model [5], which is the model that has a CA -
homomorphism to each model of (T, A). By composing homomorphisms, we have that
A1 →CT A2 iff there is a C-homomorphism from IA1 to the canonical model of (T, A2 ).
The claim then follows by considering that there is a C-homomorphism from IA1 to the
canonical model of (T, A2 ) iff ans C (QA1 , T, A2 ) = true [11,5].                t
                                                                                   u

A1 and A2 are said C-homomorphically equivalent wrt T if A1 →CT A2 and A2 →CT A1 .

Theorem 1. Let C be a set of constants, and A1 , A2 two ABoxes that are C-
homomorphically equivalent wrt a TBox T . Then, for every ECQ Q over T using
only constants in C, we have that ans C (Q, T, A1 ) = ans C (Q, T, A2 ).

    Next we want to capture when two states of a single transition system or more
generally of two transition systems, possibly obtained by applying different processes to
different semantic artifacts sharing the same TBox and constants in the initial state, can
be considered behaviourally equivalent, in the sense that they satisfy exactly the same
µL formulas. To formally capture such an equivalence, we make use of the notion of
bisimulation [12], suitably extended to deal with query answering over ontologies.
    Given two artifact transition systems A1 = Υ (Π1 , S1 ) (with states ΣA1 ) and A2 =
Υ (Π2 , S2 ) (with states ΣA2 ) such that S1 = (T, A10 , R1 ) and S2 = (T, A20 , R2 ) share
the same TBox T and the same constants C = CA10 = CA20 , a bisimulation is a relation
B ⊆ ΣA1 × ΣA2 such that: (A1 , A2 ) ∈ B implies that:
 1. A1 and A2 are C-homomorphically equivalent wrt T ;
 2. if A1 ⇒A1 A01 then there exists A02 such that A2 ⇒A2 A02 and (A01 , A02 ) ∈ B;
 3. if A2 ⇒A2 A02 then there exists A01 such that A1 ⇒A1 A01 and (A01 , A02 ) ∈ B.
We say that two states A1 and A2 are bisimilar, if there exists a bisimulation B such that
(A1 , A2 ) ∈ B. Two transition systems A1 with initial state A10 and A2 with initial state
A20 are bisimilar if (A10 , A20 ) ∈ B.
    The following theorem states that the formula evaluation in µL is indeed invariant
wrt bisimulation, so we can equivalently check any bisimilar transition systems.

Theorem 2. Let A1 and A2 be two transition systems obtained from two semantic
artifacts sharing the same TBox and constants. Then, for two states A1 of A1 and A2 of
A2 (including the initial ones) are bisimilar iff for all µL closed formulas Φ over the
two semantic artifacts, we have that A1 ∈ (Φ)A1 iff A2 ∈ (Φ)A2 .

Proof. The proof is analogous to the standard proof of bisimulation invariance of µ-
calculus [7], though taking into account our specific definition of bisimulation, using
Theorem 1 to guarantee that ECQs are evaluated identically over bisimilar states.     t
                                                                                      u


7    Undecidability and Decidability
We now show that, not surprisingly, verification in the infinite state setting we considered
is undecidable, but that it becomes decidable under some interesting conditions inspired
by the recent literature on data exchange [8]. Our results rely on the possibility of
building special semantic artifacts that we call “inflationary approximates”. For such
                            Verification of Conjunctive-Query Based Semantic Artifacts             9

special artifacts there exists a tight correspondence between the application of an action
and a step in the chase of a set of tuple-generating dependencies (TGDs) [13,8]
     Given a semantic artifact S = (T, A0 , R), its inflationary approximate is the seman-
tic artifact S + = (T + , A0 , R+ ) defined as follows. T + is obtained from T by dropping
all assertions involving negation on the right-hand side, thus obtaining a TBox formed
by positive inclusions only. R+ is formed by one action specification ρ+ for each action
specification ρ ∈ R, where ρ+ is obtained from ρ by:
  – removing all input parameters from the signature – note that the variables in qi+ that
     used to be parameters in ρ, become free variables in ρ+ ;
  – substituting each effect ei : qi+ ∧ Q−  i    A0i with ei : qi+    A0i – note that we need
     to preserve the Skolem functions name in the transformation;
  – adding effects to copy all concept and role names, namely adding an effect N (x)
     N (x) for each concept name N of T , and an effect P (x, y)            P (x, y) for each
     role name P of T .
Observe that executing actions in S + can never give rise to an inconsistency, since T +
does not contain any negative information [5].
     We also consider the generic process Π> , in which all condition/action rules have the
trivially true condition. Hence, Π> allows for executing every action at every step. With
these notions in place, it is easy to prove that verification in this setting is undecidable.
Theorem 3. µL model checking of processes over semantic artifacts is undecidable.
Proof (sketch). We show that it is already undecidable to check, given a semantic artifact
S∅+ = (∅, A0 , R), of the form of an inflationary approximate of an artifact with an empty
TBox, and the transition system A = Υ (Π> , S∅+ ), whether A0 ∈ µZ(q ∨ 3Z)A , where
q is a boolean UCQ. We observe that the set of all actions in S + can be seen as a set
of TGDs, indeed it suffices to consider one TGD for each disjunct in the UCQ on the
left-hand side of an effect specification. So, we can reduce to the above model checking
problem the problem of answering boolean UCQs in a relational database under a set of
TGDs, which is undecidable [14]                                                         t
                                                                                        u
    Next, we observe a notable property of the transition system Υ (Π> , S + ) generated
by the generic process Π> over the inflationary approximate S + of a semantic artifact
S. Namely, each do(ρ+ , T, ·) is a monotonic operator. This implies that by repeatedly
applying such operators starting from the ABox A0 in S + , we get at the limit (possibly
transfinite) a single ABox Amax , which is the least fixpoint of such operators taken col-
lectively [15,16]. Such an ABox contains, every fact generated by repeatedly executing
actions from the inflationary approximate S + , that is every state A+ of Υ (Π> , S + ) is
such that A+ is contained in Amax . More interestingly, we show next that Amax contains
also every A generated by repeatedly executing actions from the original S.
Lemma 2. Let S = (T, A0 , R) be a semantic artifact and Π a process over S. Then
every state A of the transition system Υ (Π, S) is a subset of Amax defined as above.
Proof (sketch). We can show by induction that, for every sequence of actions
ρ1 θ1 , ρ2 θ2 , . . . , ρn θn generated by the process Π starting from A0 , the resulting state
do(ρn θn , T, do(· · · do(ρ2 θ2 , T, do(ρ1 θ1 , T, A0 )) · · · )) is a subset of the corresponding
                                               +           +
resulting state do(ρ+          +                    +            +
                            n T , do(· · · do(ρ2 , T , do(ρ1 , T , A0 )) · · · )) of the inflationary
approx., which is a subset of Amax .                                                               t
                                                                                                   u
10      B. Bagheri Hariri, D. Calvanese, G. De Giacomo, and R. De Masellis

    In other words, Amax generated by the generic process Π> running over the infla-
tionary approximate S + of a semantic artifact S, bounds all states A that any process Π
can generate by running on S. Hence if for any reason Amax is finite, then the transition
system Υ (Π, S) is finite. Hence, being model checking of finite transition system de-
cidable (in fact polynomial in the size of the transition system), we get that also model
checking of Υ (Π, S) is decidable.
    To get finiteness guarantees on Amax , we take advantage of the correspondence
between action execution and TGDs chase steps, as in the proof of Theorem 3. We build
on this correspondence by further considering that in DL-LiteR , every UCQ q over a
TBox can be rewritten as a new UCQ rew (q) over the same alphabet, to be evaluated
over the ABox considered as a relational database [5] (that is dropping the TBox).
    In the literature for data exchange, several conditions that guarantee the finiteness
of the chase of TGDs have been considered [17,18]. Here we focus on the original
notion of weakly-acyclic TGDs [17]. Weak-acyclicity is a syntactic notion that involves
the so-called dependency graph of the set of TGDs. Informally, a set D of TGDs is
weakly-acyclic if there are no cycles in the dependency graph of D involving “existential”
relation positions. The key property of weakly-acyclic TGDs is that chasing a relational
database with them (i.e., applying them in all possible ways) generates a set of facts (a
database) that is finite. See [17] for details.
    Given a semantic artifact S = (T, A0 , R) and its inflationary approximate S + =
(T , A0 , R+ ), we define the set E∅+ of effect specifications that includes an effect
   +

rew (qi+ )    Ai for each effect qi+     Ai of an action ρ+ ∈ R+ . Notice that the set E∅+
can be seen as a set of TGDs. We say that a semantic artifact S is weakly-acyclic if the
set E∅+ seen as a set of TGDs is weakly-acyclic. (Note that the semantic artifact in our
example is trivially weakly-acyclic.)
Lemma 3. Let S = (T, A0 , R) be a weakly-acyclic semantic artifact and S + its infla-
tionary approximate. Then Amax computed as above for S + is finite.
Proof (sketch). We have to show that starting from A0 we get to the least fixpoint Amax
of the do(ρ+ , T, ·) operator in a finite number of steps. To do so, we follow the line
of the proof of finiteness of chase of weakly-acyclic TGDs in [17], and show that the
number of Skolem terms generated by the effects of actions is bounded by a polynomial
in the size of A0 . Differently from [17], we cannot rely on the notion of homomorphism
to stop firing actions, but have to use membership of the new set of facts in the previous
ones.                                                                                   t
                                                                                        u
     As a direct consequence of Lemma 2 and Lemma 3, for weakly-acyclic semantic
artifacts verification is decidable.
Theorem 4. µL model checking of processes over weakly-acyclic semantic artifacts is
decidable.
Proof (sketch). The result follows by observing that every state generated by the exe-
cution of any process Π over a weakly-acyclic semantic artifact S is a subset of Amax ,
which by Lemma 3 is finite. Hence, we can apply a model checking procedure for
µ-calculus on finite-state systems [7].                                             t
                                                                                    u
Note that the proof of Theorem 4 is giving us a single exponential upper bound (in the
size of A0 ) for µL model checking involving weakly-acyclic semantic artifacts.
                            Verification of Conjunctive-Query Based Semantic Artifacts            11

8    Conclusions
In this paper we have studied verification of processes over semantic artifacts. We obtain
an interesting decidability result by relying on the notion of inflationary approximate,
which allows for a connection with the theory of chase of TGDs in relational databases.
We close by observing that while we fully used the ontology for querying the artifact
state, we use it in a limited way when updating the state, namely only to guarantee
consistency. Ontology update has its own semantic and computational difficulties, see
e.g., [19], which our approach sidesteps. However, if one could introduce a suitable
notion of inflationary approximate in that case, the approach presented here could be
used to devise decidable cases.

References
 1. Cohn, D., Hull, R.: Business artifacts: A data-centric approach to modeling business operations
    and processes. IEEE Bull. on Data Engineering 32(3) (2009) 3–9
 2. Clarke, E.M., Grumberg, O., Peled, D.A.: Model checking. The MIT Press (1999)
 3. Cangialosi, P., De Giacomo, G., De Masellis, R., Rosati, R.: Conjunctive artifact-centric
    services. In: Proc. of ICSOC 2010. Volume 6470 of LNCS., Springer (2010) 318–333
 4. Damaggio, E., Deutsch, A., Vianu, V.: Artifact systems with data dependencies and arithmetic.
    In: Proc. of ICDT 2011. (2011) 66–77
 5. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning
    and efficient query answering in description logics: The DL-Lite family. J. of Automated
    Reasoning 39(3) (2007) 385–429
 6. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: EQL-Lite: Effective
    first-order query processing in description logics. In: Proc. of IJCAI 2007. (2007) 274–279
 7. Emerson, E.A.: Automated temporal reasoning about reactive systems. In: Logics for
    Concurrency: Structure versus Automata. Volume 1043 of LNCS. Springer (1996) 41–101
 8. Kolaitis, P.G.: Schema mappings, data exchange, and metadata management. In: Proc. of
    PODS 2005. (2005) 61–75
 9. Katsuno, H., Mendelzon, A.: On the difference between updating a knowledge base and
    revising it. In: Proc. of KR’91. (1991)
10. Reiter, R.: Knowledge in Action: Logical Foundations for Specifying and Implementing
    Dynamical Systems. The MIT Press (2001)
11. Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in relational
    data bases. In: Proc. of STOC’77. (1977) 77–90
12. Milner, R.: An algebraic definition of simulation between programs. In: Proc. of IJCAI’71.
    (1971) 481–489
13. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison Wesley (1995)
14. Beeri, C., Vardi, M.Y.: The implication problem for data dependencies. In: Proc. of ICALP’81.
    Volume 115 of LNCS., Springer (1981) 73–85
15. Tarski, A.: A lattice-theoretical fixpoint theorem and its applications. Pacific J. of Mathematics
    5(2) (1955) 285–309
16. Gurevich, Y., Shelah, S.: Fixed-point extensions of first order logic. Annals of Pure and
    Applied Logics 32 (1986) 265–280
17. Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: Semantics and query answering.
    Theor. Comp. Sci. 336(1) (2005) 89–124
18. Meier, M., Schmidt, M., Lausen, G.: On chase termination beyond stratification. PVLDB
    2(1) (2009) 970–981
19. Calvanese, D., Kharlamov, E., Nutt, W., Zheleznyakov, D.: Evolution of DL-Lite knowledge
    bases. In: Proc. of ISWC 2010. Volume 6496 of LNCS., Springer (2010) 112–128