Dealing with Inconsistencies due to Class Disjointness in SPARQL Update Albin Ahmeti1,3 , Diego Calvanese2 , Axel Polleres3 , and Vadim Savenkov3 1 Vienna University of Technology, Favoritenstraße 9, 1040 Vienna, Austria 2 Faculty of Computer Science, Free University of Bozen-Bolzano, Bolzano, Italy 3 Vienna University of Economics and Business, Welthandelsplatz 1, 1020 Vienna, Austria Abstract. The problem of updating ontologies has received increased attention in recent years. In the approaches proposed so far, either the update language is restricted to (sets of) atomic updates, or, where the full SPARQL Update language is allowed, the TBox language is restricted to RDFS where no inconsistencies can arise. In this paper we discuss directions to overcome these limitations. Starting from a DL-Lite fragment covering RDFS and concept/class disjointness axioms, we define two semantics for SPARQL Update: under cautious semantics, incon- sistencies are resolved by rejecting all updates potentially introducing conflicts; under brave semantics, instead, conflicts are overridden in favor of new informa- tion where possible. The latter approach builds upon existing work on the evolution of DL-Lite knowledge bases, setting it in the context of generic SPARQL updates. 1 Introduction This paper initiates the study of SPARQL updates in the context of potential inconsis- tencies: as a minimalistic ontology language allowing for inconsistencies, we consider RDFSΒ¬ , an extension of RDFS [8] with class disjointness axioms of the form {𝑃 disjointWith 𝑄} from OWL [10], sometimes referred to as negative inclusions or NIs [4], as the corresponding description logic encoding of this statement is 𝑃 βŠ‘ ¬𝑄. As a running example, we assume a triple store 𝐺 with an RDFSΒ¬ ontology (TBox) 𝒯 encoding an educational domain, asserting a range restriction plus mutual disjointness of the concepts like professor and student (we use Turtle syntax [2], in which dw abbreviates OWL’s disjointWith keyword, and dom and rng respectively stand for the domain and range keywords of RDFS). 𝒯 = {:studentOf dom :Student. :studentOf rng :Professor. :Professor dw :Student. } Consider the following SPARQL update [6] request 𝑒 in the context of the TBox 𝒯 : INSERT {?X :studentOf ?Y} WHERE {?X :attendsClassOf ?Y} Consider an ABox with data on student tutors that happen to attend each other’s classes: π’œ1 = {:jimmy :attendsClassOf :ann. :ann :attendsClassOf :jimmy}. Here, 𝑒 would create two assertions :jimmy :studentOf :ann and :ann :studentOf :jimmy. Due to the range and domain constraints in 𝒯 , these assertions result in clashes both for Jimmy and for Ann. Note that all inconsistencies Table 1. DL-LiteRDFSΒ¬ assertions vs. RDF(S), where 𝐴, 𝐴′ denote concept (or, class) names, 𝑃 , 𝑃 β€² denote role (or, property) names, 𝛀 is the set of IRI constants (excl. the OWL/RDF(S) vocabulary) and π‘₯, 𝑦 ∈ 𝛀 . For RDF(S), we use abbreviations (rsc, sp, dom, rng, a) as introduced in [11]. TBox RDFSΒ¬ TBox RDFSΒ¬ TBox RDFSΒ¬ ABox RDFSΒ¬ β€² β€² β€² β€² 1. 𝐴 βŠ‘ 𝐴 𝐴 sc 𝐴. 3. βˆƒπ‘ƒ βŠ‘ 𝐴 𝑃 dom 𝐴. 5. 𝐴 βŠ‘ ¬𝐴 𝐴 dw 𝐴. 6. 𝐴(π‘₯) π‘₯ a 𝐴. 2. 𝑃 β€² βŠ‘ 𝑃 𝑃 β€² sp 𝑃 . 4. βˆƒπ‘ƒ βˆ’ βŠ‘ 𝐴 𝑃 rng 𝐴. 7. 𝑃 (π‘₯, 𝑦) π‘₯ P 𝑦. are in the new data, and thus we say that 𝑒 is intrinsically inconsistent for the particular ABox π’œ1 . Our solution for such updates will be to discard problematic assignments but keep those that cause no clashes. Now, let π’œ2 be the ABox {:jimmy :attendsClassOf :ann. :jimmy a :Professor}. It is clear that after the update 𝑒, the ABox will become inconsistent, due to the property assertion :jimmy :studentOf :ann, implying that Jimmy is both a professor and a student which contradicts the TBox disjointness axiom. In contrast to the previous case, the clash now is between the prior knowledge and the new data. We propose two update semantics, extending our previous work [1] for dealing with such inconsistencies and provide rewriting algorithms for implementing them using the basic constructs of the SPARQL language (e.g., making use of the UNION, MINUS, FILTER, and OPTIONAL operators). In the remainder of the paper, after some short preliminaries (Sec. 2) we discuss checking of intrinsic inconsistencies in Sec. 3, and then in Sec. 4 we present two semantics for dealing with general inconsistencies in the context of materialized triple stores. An overview of related work and concluding remarks can be found in Sec. 5. 2 Preliminaries We introduce basic notions about RDF graphs, RDFSΒ¬ ontologies, and SPARQL queries. In this paper we use RDF and DL notation interchangeably, treating RDF graphs that do not use non-standard RDFSΒ¬ vocabulary [12] as sets of TBox and ABox assertions. Definition 1 (RDFSΒ¬ ABox, TBox, triple store). We call a set 𝒯 of inclusion asser- tions of the forms 1–5 in Table 1 an (RDFSΒ¬ ) TBox, a set π’œ of assertions of the forms 6–7 in Table 1 an (RDF) ABox, and the union 𝐺 = 𝒯 βˆͺ π’œ an (RDFSΒ¬ ) triple store. Definition 2 (Interpretation, satisfaction, model, consistency). An interpretation ⟨π›₯ℐ , ·ℐ ⟩ consists of a non-empty set π›₯ℐ and an interpretation function ·ℐ , which maps – each atomic concept 𝐴 to a subset 𝐴ℐ of π›₯ℐ , ?𝐢 sc ?𝐷. ?𝑆 a ?𝐢. ?𝑃 dom ?𝐢. ?𝑆 ?𝑃 ?𝑂. ?𝑆 a ?𝐷. ?𝑆 a ?𝐢. ?𝑃 sp ?𝑄. ?𝑆 ?𝑃 ?𝑂. ?𝑃 rng ?𝐢. ?𝑆 ?𝑃 ?𝑂. ?𝑆 a ?𝐢,?𝐷. ?𝐢 dW ?𝐷. ?𝑆 ?𝑄 ?𝑂. ?𝑂 a ?𝐢. βŠ₯ Fig. 1. Minimal RDFS rules from [11] plus class disjointness β€œclash” rule from OWL2 RL [10]. – each negation of atomic concept 𝐴 to (¬𝐴ℐ ) = π›₯ℐ βˆ– 𝐴ℐ , – each atomic role 𝑃 to a binary relation 𝑃 ℐ over π›₯ℐ , and – each element of 𝛀 to an element of π›₯ℐ . For expressions βˆƒπ‘ƒ and βˆƒπ‘ƒ βˆ’ , the interpretation function is defined as (βˆƒπ‘ƒ )ℐ = {π‘₯ ∈ π›₯ℐ | βˆƒπ‘¦.(π‘₯, 𝑦) ∈ 𝑃 ℐ } resp. (βˆƒπ‘ƒ βˆ’ )ℐ = {𝑦 ∈ π›₯ℐ | βˆƒπ‘₯.(π‘₯, 𝑦) ∈ 𝑃 ℐ }. An interpretation ℐ satisfies an inclusion assertion 𝐸1 βŠ‘ 𝐸2 (of one of the forms 1–5 in Table 1), if 𝐸1ℐ βŠ† 𝐸2ℐ . Analogously, ℐ satisfies ABox assertions of the form 𝐴(π‘₯), if π‘₯ℐ ∈ 𝐴ℐ , and of the form 𝑃 (π‘₯, 𝑦), if (π‘₯ℐ , 𝑦 ℐ ) ∈ 𝑃 ℐ . An interpretation ℐ is called a model of a triple store 𝐺 (resp., a TBox 𝒯 , an ABox π’œ), denoted ℐ |= 𝐺 (resp., ℐ |= 𝒯 , ℐ |= π’œ), if ℐ satisfies all assertions in 𝐺 (resp., 𝒯 , π’œ). Finally, 𝐺 is called consistent, if it does not entail both 𝐢(π‘₯) and ¬𝐢(π‘₯) for any concept 𝐢 and constant π‘₯ ∈ 𝛀 , where entailment is defined as usual. As in [1], we treat only restricted SPARQL queries corresponding to (unions of) conjunctive queries without non-distinguished variables over DL ontologies: Definition 3 (BGP, CQ, UCQ, query answer). A conjunctive query (CQ) π‘ž, or basic graph pattern (BGP), is a set of atoms of the form 6–7 from Table 1, where now π‘₯, 𝑦 ∈ 𝛀 βˆͺ 𝒱.4 A union of conjunctive queries (UCQ) 𝑄, or UNION pattern, is a set of CQs. We denote with 𝒱(π‘ž) (or 𝒱(𝑄)) the set of variables from 𝒱 occurring in π‘ž (resp., 𝑄). An answer (under RDFSΒ¬ Entailment) to a CQ π‘ž over a triple store 𝐺 is a substitution πœƒ of the variables in 𝒱(π‘ž) with constants in 𝛀 such that every model of 𝐺 satisfies all facts in π‘žπœƒ. We denote the set of all such ⋃︀ answers with ans rdfs (π‘ž, 𝐺) (or simply ans(π‘ž, 𝐺)). The set of answers to a UCQ 𝑄 is π‘žβˆˆπ‘„ ans(π‘ž, 𝐺). We also recall from [1], that query answering in the presence of ontologies can be done either by rule-based pre-materialization of the ABox or by query rewriting. Let rewrite(π‘ž, 𝒯 ) be the UCQ resulting from applying PerfectRef [3] (or, equivalently, the stripped-down version from [1, Alg.1]) to a query π‘ž and let 𝐺 = 𝒯 βˆͺ π’œ be a triple store. Furthermore, let mat(𝐺) be the triple store obtained from exhaustive application of the inference rules in Fig. 1 on a consistent triple store 𝐺, andβ€”analogouslyβ€”let chase(π‘ž, 𝒯 ) refer to β€œmaterialization” w.r.t. 𝒯 applied to a CQ π‘ž. The next result transfers from [1] to consistent RDFSΒ¬ stores. Proposition 1. Let 𝐺 = 𝒯 βˆͺ π’œ be a consistent triple store, and π‘ž a CQ. Then, ans(π‘ž, 𝐺) = ans(rewrite(π‘ž, 𝒯 ), π’œ) = ans(π‘ž, mat(𝐺)). We have used this previously to define the semantics of SPARQL update operations. Definition 4 (SPARQL update operation, simple update of a triple store). Let 𝑃𝑑 and 𝑃𝑖 be BGPs, and 𝑃𝑀 a BGP or UNION pattern. Then an update operation 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) has the form DELETE 𝑃𝑑 INSERT 𝑃𝑖 WHERE 𝑃𝑀 Let 𝐺 = 𝒯 βˆͺ π’œ be a triple store. Then the simple update of⋃︀ 𝐺 w.r.t. 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) is defined as 𝐺𝑒(𝑃𝑑 ,𝑃𝑖 ,𝑃𝑀 ) = (𝐺 βˆ– π’œπ‘‘ ) βˆͺ π’œπ‘– , where π’œπ‘‘ = πœƒβˆˆans(𝑃𝑀 ,𝐺) gr(𝑃𝑑 πœƒ), ⋃︀ π’œπ‘– = πœƒβˆˆans(𝑃𝑀 ,𝐺) gr(𝑃𝑖 πœƒ), and gr(𝑃 ) denotes the set of ground triples in pattern 𝑃 . 4 𝒱 is a countably infinite set of variables (written as ’?’-prefixed alphanumeric strings). For the sake of readability of the algorithms, we assume that all update operations 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) in this paper contain no constants in the BGPs 𝑃𝑑 and 𝑃𝑖 , and that all property assertions (?𝑋 p ?π‘Œ ) in 𝑃𝑑 have distinct variables ?𝑋 and ?π‘Œ . Enhancing our results to updates with constants and variable equalities in 𝑃𝑑 and 𝑃𝑖 is not difficult, but requires distinguishing special cases: e.g., instead of replacing the variable 𝑦 in a pattern 𝑄 by 𝑧, the expression 𝑄 FILTER(𝑦 = 𝑧) can be used in the case when 𝑦 is a constant; instead of 𝑄(𝑦) MINUS 𝑃 for a variable 𝑦, 𝑄 FILTER NOT EXISTS 𝑃 should be used for ground 𝑄. We call a triple store or (ABox) materialized if in each state it always guarantees πΊβˆ–π’― = mat(𝐺)βˆ–mat(𝒯 ). In the present paper, we will always focus on β€œmaterialization preserving” semantics for SPARQL update operations, which we dubbed Semmat 2 in [1] and which preserves a materialized triple store. We recall the intuition behind Semmat 2 , given an update 𝑒 = (𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ): (i) delete the instantiations of 𝑃𝑑 plus all their causes; (ii) insert the instantiations of 𝑃𝑖 plus all their effects. Definition 5 (Semmat 2 [1]). Let 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) be an update operation. Then Semmat 𝐺𝑒(𝑃𝑑2,𝑃𝑖 ,𝑃𝑀 ) = 𝐺𝑒(𝑃 caus ,𝑃 eff ,{𝑃𝑀 }{𝑃 fvars }) 𝑑 𝑖 𝑑 caus Here, given a pattern 𝑃 , 𝑃 = flatten(rewrite(𝑃, 𝒯 )); 𝑃 eff = chase(𝑃, 𝒯 ) and 𝑃 = {?𝑣 a rdfs:Resource |?𝑣 ∈ 𝒱(𝑃 caus ) βˆ– 𝒱(𝑃 )}, where flatten(Β·) computes the fvars set of all triples occurring in the UCQ π‘Ÿπ‘’π‘€π‘Ÿπ‘–π‘‘π‘’(𝑃, 𝒯 ), which in our case allows us to obtain all possible β€œcauses” of each atom in 𝑃𝑑 , and β€œ?v a rdfs:Resource” is a shortcut for a pattern that binds ?𝑣 to any π‘₯ ∈ 𝛀 occurring in 𝐺. We refer to [1] for further details, but stress that as such, Semmat 2 is not able to detect or deal with inconsistencies arising from 𝑃𝑖 and 𝐺. In what follows, we will discuss how this can be remedied. 3 Checking Consistency of a SPARQL Update Within previous work on the evolution of DL-Lite knowledge bases [4], updates given in the form of pairs of ABoxes π’œπ‘‘ , π’œπ‘– have been studied. However, whereas such update might be viewed to fit straightforwardly to the corresponding π’œπ‘‘ , π’œπ‘– in Def. 4, in [4] it is assumed that π’œπ‘– is consistent with the TBox, and thus one only needs to consider how to deal with inconsistencies between the update and the old state of the knowledge base. This a priori assumption may be insufficient for SPARQL updates though, where concrete values for inserted triples are obtained from variable bindings in the WHERE clause, and depending on the bindings, the update can be either consistent or not. This is demonstrated by the update 𝑒 from Sec. 1 which, when applied to the ABox π’œ4 , results in an inconsistent set π’œπ‘– of insertions . We call this intrinsic inconsistency of an update relative to a triple store 𝐺 = 𝒯 βˆͺ π’œ. Definition 6. Let 𝐺 be a triple store. The update 𝑒 is said to be intrinsically consistent w.r.t. 𝐺 if the set of new assertions π’œπ‘– from Def. 4 generated by applying 𝑒 to 𝐺, taken in isolation from the ABox of 𝐺, does not contradict the TBox of 𝐺. Otherwise, the update is said to be intrinsically inconsistent w.r.t. 𝐺. Algorithm 1: constructing a SPARQL ASK query to check intrinsic inconsistency (for the definition of 𝑃𝑖eff , cf. Def. 5) Input: RDFSΒ¬ TBox 𝒯 , SPARQL update 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) Output: A SPARQL ASK query returning 𝑇 π‘Ÿπ‘’π‘’ if 𝑒 is intrinsically inconsistent eff 1 if βŠ₯ ∈ 𝑃𝑖 then 2 return ASK {} //𝑒 contains clashes in itself, i.e., is inconsistent for any triple store 3 else 4 π‘Š := { FILTER(False)}; //neutral element w.r.t. union 5 foreach pair of triple patterns (?𝑋 a 𝑃 ), (?π‘Œ a 𝑅) in 𝑃𝑖eff do 6 if 𝑃 βŠ‘ ¬𝑅 ∈ 𝒯 then 7 π‘Š := π‘Š UNION {{𝑃𝑀 πœƒ1 [?𝑋 ↦→?𝑍]} . {𝑃𝑀 πœƒ2 [?π‘Œ ↦→?𝑍]}} for a fresh ?𝑍 8 return ASK WHERE {π‘Š } Intrinsic inconsistency of the update differs crucially from the inconsistency w.r.t. the old state of the knowledge base, illustrated by the ABox π’œ2 from Sec. 1. This latter case can be addressed by adopting an update policy that prefers newer assertions in case of conflicts, as studied in the context of DL-Lite KB evolutions [4], which we will discuss in Sec. 4 below. Intrinsic inconsistencies however are harder to deal with, since there is no cue which assertion should be discarded in order to avoid the inconsistency. Our proposal here is thus to discard all mutually inconsistent pairs of insertions. We first present an algorithm for checking intrinsic inconsistency by means of SPARQL ASK queries and then a safe rewriting algorithm. This rewriting is based on an observation that clashing triples can be introduced by a combination of two bindings of variables in the WHERE clause, as the example in the Sec. 1 (the ABox π’œ1 ) illustrates. To handle such cases, two copies of the WHERE clause 𝑃𝑀 are created by the rewriting in Algorithms 1 and 2, for each pair of disjoint concepts according to the TBox of the triple store. These algorithms use notation described in Rem. 1 below. Remark 1. Our rewriting algorithms rely on producing fresh copies of the WHERE clause. Assume πœƒ, πœƒ1 , πœƒ2 , . . . to be substitutions replacing each variable in a given formula with a distinct fresh one. For a substitution 𝜎, we also define πœƒ[𝜎] resp. πœƒπ‘– [𝜎] to be an extension of 𝜎, renaming each variable at positions not affected by 𝜎 with a distinct fresh one. For instance, let 𝐹 be a triple (?𝑍 :studentOf ?π‘Œ ). Now, 𝐹 πœƒ makes a variable disjoint copy of 𝐹 : ?𝑍1 :studentOf ?π‘Œ1 for fresh ?𝑍1 , ?π‘Œ1 . 𝐹 [?𝑍 ↦→?𝑋] is just a substitution of ?𝑍 by ?𝑋 in 𝐹 . Finally, 𝐹 πœƒ[?𝑍 ↦→?𝑋] results in ?𝑋 :studentOf ?π‘Œ2 for fresh ?π‘Œ2 . We assume that all occurrences of 𝐹 πœƒ[𝜎] stand for syntactically the same query, but that 𝐹 πœƒ[𝜎1 ] and 𝐹 πœƒ[𝜎2 ], for distinct 𝜎1 and 𝜎2 , can only have variables in π‘Ÿπ‘Žπ‘›π‘”π‘’(𝜎1 ) ∩ π‘Ÿπ‘Žπ‘›π‘”π‘’(𝜎2 ) in common. That is, the choice of fresh variables is defined by the parameterizing substitution 𝜎. Now, the possibility of unifying two variables ?𝑋 and ?π‘Œ in 𝑃𝑀 on a given triple store can be tested with the query {𝑃𝑀 πœƒ1 [?𝑋 ↦→?𝑍]}{𝑃𝑀 πœƒ2 [?π‘Œ ↦→?𝑍]} where πœƒ1 and πœƒ2 are variable renamings as in Rem. 1 and ?𝑍 is a fresh variable. In order to check the intrinsic consistency of an update, this condition should be evaluated for every pair of variables of 𝑃𝑀 , the unification of which leads to a clash. A SPARQL ASK query based on this idea is produced by Alg. 1. Note that it suffices to Algorithm 2: Safe rewriting safe(𝑒) Input: RDFSΒ¬ TBox 𝒯 , SPARQL update 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) Output: SPARQL update safe(𝑒) eff 1 if βŠ₯ ∈ 𝑃𝑖 then 2 return 𝑒(𝑃𝑑 , 𝑃𝑖 , FILTER(False)) 3 π‘Š := { FILTER(False)}; //neutral element w.r.t. union 4 foreach pair of triple patterns (?𝑋 a 𝑃 ), (?π‘Œ a 𝑅) in 𝑃𝑖eff do 5 if 𝑃 βŠ‘ ¬𝑅 ∈ 𝒯 then 6 //cf. Rem. 1 for notation πœƒ[. . .] 7 π‘Š := π‘Š UNION {𝑃𝑀 πœƒ1 [?𝑋 ↦→?π‘Œ ]} UNION {𝑃𝑀 πœƒ2 [?π‘Œ ↦→?𝑋]}} 8 return 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 MINUS {π‘Š }) check only triples of the form {?𝑋 a ?𝐢} at line 5 of Alg. 1, since disjointness conditions can only be formulated for concepts, according to the syntax in Table 1. Furthermore, since we are taking the facts in 𝑃𝑖eff extended by all facts implied by 𝒯 , at line 6 of Alg. 1 it suffices to check the disjointness conditions explicitly mentioned in 𝒯 and not all those which are implied by 𝒯 . Example 1. Consider the update 𝑒 from Sec. 1, in which the INSERT clause 𝑃𝑖 can create clashing triples. To identify potential clashes, Alg. 1 first applies the infer- ence rule for the range constraint, and computes 𝑃𝑖eff = {?𝑋 a :Student . ?π‘Œ a :Professor}. Now both variables ?𝑋, ?π‘Œ occur in the triples of type (6) from Table 1 with clashing concept names. The following ASK query is produced by Alg. 1. ASK WHERE { ?X :attendsClassOf ?Y . ?Y :attendsClassOf ?X1 } (In this and subsequent examples we omit the trivial FILTER(False) union branch used in rewritings to initialize variables with disjunctive conditions, such as π‘Š in Alg. 1) Suppose that an insert is not intrinsically consistent for a given triple store. One solution would be to discard it completely, should the above ASK query return True. Another option which we consider here is to only discard those variable bindings from the WHERE clause, which make the INSERT clause 𝑃𝑖 inconsistent. This is the task of the safe rewriting safe(Β·) in Alg. 2, removing all variable bindings that participate in a clash between different triples of 𝑃𝑖 . Let 𝑃𝑀 be a WHERE clause, in which the variables ?𝑋 and ?π‘Œ should not be unified to avoid clashes. With πœƒ1 , πœƒ2 being β€œfresh” variable renamings as in Rem. 1, Alg. 2 uses the union of 𝑃𝑀 πœƒ1 [?𝑋 ↦→?π‘Œ ] and 𝑃𝑀 πœƒ2 [?π‘Œ ↦→?𝑋] to eliminate unsafe bindings that send ?𝑋 and ?π‘Œ to a same value. Example 2. Alg. 2 extends the WHERE clause of the update 𝑒 from Sec. 1 as follows: INSERT{?X :studentOf ?Y} WHERE{?X :attendsClassOf ?Y MINUS{{?X1 :attendsClassOf ?X} UNION {?Y :attendsClassOf ?Y2}}} Note that the safe rewriting can make the update void. For instance, safe(𝑒) has no effect on the ABox π’œ1 from Sec. 1, since there is no cue, which of :jimmy :attendsClassOf :ann, :ann :attendsClassOf :jimmy needs to be dis- missed to avoid the clash. However, if we extend this ABox with assertions both satisfy- ing the WHERE clause of 𝑒 and not causing undesirable variable unifications, safe(𝑒) would make insertions based on such bindings. For instance, adding the fact :bob :attendsClassOf :alice to π’œ1 would assert :bob :studentOf :alice as a result of safe(𝑒). A rationale for using MINUS rather than FILTER NOT EXISTS in Alg. 2 (and also in a rewriting in forthcoming Sec. 4) can be illustrated by an update in which variables in the INSERT and DELETE clauses are bound in different branches of a UNION: DELETE {?V a :Professor} INSERT {?X :studentOf ?Y} WHERE {{?X :attendsClassOf ?Y} UNION {?V :attendsClassOf ?W}} A safe rewriting of this update (abbreviating :attendsClassOf as :aCo) is DELETE {?V a :Professor} INSERT {?X :studentOf ?Y} WHERE { {{?X :aCo ?Y} UNION {?V :aCo ?W}} MINUS{ {{?X1 :aCo ?X} UNION {?V1 :aCo ?W1}} UNION {{?Y :aCo ?Y2} UNION {?V2 :aCo ?W2}} } } It can be verified that with FILTER NOT EXISTS in place of MINUS this update makes no insertions on all triple stores: the branches {?V1 :aCo ?W1} and {?V2 :aCo ?W2} are satisfied whenever {?X :aCo ?Y} is, making FILTER NOT EXISTS eval- uate to False whenever {?X :aCo ?Y} holds. We conclude this section by formalizing the intuition of update safety. For a triple store 𝐺 and an update 𝑒 = (𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ), let J𝑃𝑀 K𝑒𝐺 denote the set of variable bind- ings computed by the query β€œ SELECT ?𝑋1 , . . . , ?π‘‹π‘˜ WHERE 𝑃𝑀 ” over 𝐺, where ?𝑋1 , . . . , ?π‘‹π‘˜ are the variables occurring in 𝑃𝑖 or in 𝑃𝑑 . Theorem 1. Let 𝒯 be a TBox, let 𝑒 be a SPARQL update (𝑃𝑖 , 𝑃𝑑 , 𝑃𝑀 ), and let query π‘žπ‘’ and update safe(𝑒) = (𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀′ ) result from applying Alg. 1 resp. Alg. 2 to 𝑒 and 𝒯 . Then, the following properties hold for an arbitrary RDFSΒ¬ triple store 𝐺 = 𝒯 βˆͺ π’œ: (1) π‘žπ‘’ (𝐺) = True iff βˆƒπœ‡, πœ‡β€² ∈ J𝑃𝑀 K𝑒𝐺 s.t. πœ‡(𝑃𝑖 ) ∧ πœ‡β€² (𝑃𝑖 ) ∧ 𝒯 |= βŠ₯; (2) J𝑃𝑀 K𝑒𝐺 βˆ– J𝑃𝑀′ K𝑒𝐺 = {πœ‡ ∈ J𝑃𝑀 K𝑒𝐺 | βˆƒπœ‡β€² ∈ J𝑃𝑀 K𝑒𝐺 s.t. πœ‡(𝑃𝑖 ) ∧ πœ‡β€² (𝑃𝑖 ) ∧ 𝒯 |= βŠ₯}. 4 Materialization Preserving Update Semantics In this section we discuss resolution of inconsistencies between triples already in the triple store and newly inserted triples. Our baseline requirement for each update seman- tics is formulated as the following property. Definition 7 (Consistency-preserving). Let 𝐺 be a triple store and 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) an update. A materialization preserving update semantics Sem is called consistency pre- serving in RDFSΒ¬ if the evaluation of update 𝑒, i.e., 𝐺Sem 𝑒(𝑃𝑑 ,𝑃𝑖 ,𝑃𝑀 ) , results in a consistent triple store. Our two consistency preserving semantics are respectively called brave and cautious. The brave semantics always gives priority to newly inserted triples by discarding all pre-existing information that contradicts the update. The cautious semantics is exactly the opposite, discarding inserts that are inconsistent with facts already present in the triple store; i.e., the cautious semantics never deletes facts unless explicitly required by the DELETE clause of the SPARQL update. Algorithm 3: Brave semantics Semmat brave Input: Materialized triple store 𝐺 = 𝒯 βˆͺ π’œ, SPARQL update 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) Semmat Output: 𝐺𝑒(𝑃𝑑brave,𝑃𝑖 ,𝑃𝑀 ) β€² caus 1 𝑃𝑑 := 𝑃𝑑 ; eff 2 foreach triple pattern (?𝑋 a 𝐢) in 𝑃𝑖 do 3 foreach 𝐢 β€² s.t. 𝐢 βŠ‘ ¬𝐢 β€² ∈ 𝒯 or 𝐢 β€² βŠ‘ ¬𝐢 ∈ 𝒯 do 4 if (?𝑋 a 𝐢 β€² ) ∈ / 𝑃𝑑′ then 5 𝑃𝑑 := 𝑃𝑑 . {?𝑋 a 𝐢 β€² }caus β€² β€² 6 return 𝐺𝑒(𝑃 β€² ,𝑃 eff ,{𝑃 }𝑃 fvars ) 𝑑 𝑖 𝑀 𝑑 Both semantics rely upon incremental update semantics Semmat 2 , introduced in Sec. 2, which we aim to extend to take into account class disjointness. Note that for the present section we assume updates to be intrinsically consistent, which can be checked or enforced beforehand in a preprocessing step by the safe rewriting discussed in Sec. 3. In this section, we lift our definition of update operation to include also updates (𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) with 𝑃𝑀 produced by the safe rewriting Alg. 2 from some update satisfying Def. 4. What remains to be defined is the handling of clashes between newly inserted triples and triples already present in the triple store. The intuitions of our semantics for a SPARQL update 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) in the context of an RDFSΒ¬ TBox are as follows: – brave semantics Semmat brave : (i) delete all instantiations of 𝑃𝑑 and their causes, plus all the non-deleted triples in 𝐺 clashing with instantiations of triples in 𝑃𝑖 to be inserted, again also including the causes of these triples; (ii) insert the instantiations of 𝑃𝑖 plus all their effects. – cautious semantics Semmat caut : (i) delete all instantiations of 𝑃𝑑 and their causes; (ii) insert all instantiations of 𝑃𝑖 plus all their effects, unless they clash with some non-deleted triples in 𝐺: in this latter case, perform neither deletions nor insertions. For a SPARQL update 𝑒, we will define rewritings of 𝑒 implementing the above seman- tics, which can be shown to be materialization preserving and consistency preserving. 4.1 Brave Semantics The rewriting in Alg. 3 implements the brave update semantics Semmat brave ; it can be viewed as combining the idea of FastEvol [4] with Semmat 2 to handle inconsistencies by giving priority to triples that ought to be inserted, and deleting all those triples from the store that clash with the new ones. The DELETE clause 𝑃𝑑′ of the rewritten update is initialized with the set 𝑃𝑑 of triples from the input update 𝑒. Rewriting ensures that also all β€œcauses” of deleted facts are removed from the store, since otherwise deleted triples will be re-inserted by materialization. To this end, line 1 of Alg. 3 adds to 𝑃𝑑′ all facts from which 𝑃𝑑 can be derived. Then, for each triple implied by 𝑃𝑖 (that is, for each triple in 𝑃𝑖eff ) the algorithm computes clashing patterns and adds them to the DELETE clause 𝑃𝑑′ , along with their causes. Note that it suffices to only consider disjointness assertions that are syntactically contained in 𝒯 (and not all that are implied by 𝒯 ), since we assume that the triple store is materialized. Finally, the WHERE clause of the rewritten update is extended to satisfy the syntactic restriction that all variables in 𝑃𝑑′ must have matches in the WHERE clause: bindings of β€œfresh” variables introduced to 𝑃𝑑′ by eventual domain or range constraints are provided by the part 𝑃𝑑fvars , cf. Def. 5 and Ex. 3 below. The rewritten update is evaluated over the triple store, computing its new materialized and consistent state. Example 3. Ex. 2 in Sec. 3 provided a safe rewriting safe(𝑒) of the update 𝑒 from Sec. 1. According to Alg. 3, this safe update is rewritten to: DELETE {?X a :Professor . ?X1 :studentOf ?X . ?Y a :Student . ?Y :studentOf ?Y1} INSERT {?X :studentOf ?Y . ?X a :Student . ?Y a :Professor} WHERE {{?X :attendsClassOf ?Y MINUS{{?X2 :attendsClassOf ?X} UNION {?Y :attendsClassOf ?Y2}}} OPTIONAL {?X1 :studentOf ?X} OPTIONAL {?Y :studentOf ?Y1} } The DELETE clause removes potential clashes for the inserted triples. Note that also property assertions implying clashes need to be deleted, and the respective triples in 𝑃𝑑′ contain fresh variables ?𝑋1 and ?π‘Œ 1. These variables have to be bound in the WHERE clause, and therefore 𝑃𝑑fvars adds two optional clauses to 𝑃𝑀 of safe(𝑒), which is a computationally reasonable implementation of the concept 𝑃 fvars from Def. 5. Theorem 2. Alg. 3, given a SPARQL update 𝑒 and a consistent materialized triple store 𝐺 = 𝒯 βˆͺ π’œ, computes a new consistent and materialized state w.r.t. brave semantics. 4.2 Cautious Semantics Unlike Semmat mat brave , its cautious version Semcaut always gives priority to triples that are already present in the triple store, and dismisses any inserts that are inconsistent with it. We implement this semantics as follows: (i) the DELETE command does not generate inconsistencies and thus is assumed to be always possible; (ii) the update is actually executed only if the triples introduced by the INSERT clause do not clash with state of the triple graph after all deletions have been applied. Cautious semantics thus treats insertions and deletions asymmetrically: the former depend on the latter but not the other way round. The rationale is that deletions never cause inconsistencies and can remove clashes between the old and the new data. As in the case of brave semantics, cautious semantics is implemented using rewriting, presented in Alg. 4. First, the algorithm issues an ASK query to check that no clashes will be generated by the INSERT clause, provided that the DELETE part of the update is executed. If no clashes are expected, in which case the ASK query returns False, the brave update from the previous section is applied. For a safe update 𝑒 = (𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ), the ASK query is generated as follows. For each triple pattern {?𝑋 a 𝐢} among the effects of 𝑃𝑖 , at line 3 Alg. 4 enumerates all concepts 𝐢 β€² that are explicitly mentioned as disjoint with 𝐢 in 𝒯 . As in the case of brave semantics, this syntactic check is sufficient due to the assumption that the update is applied to a materialized store; by the same reason also no property assertions need to be taken into account. Algorithm 4: Cautious semantics Semmat caut Input: Materialized triple store 𝐺 = 𝒯 βˆͺ π’œ, SPARQL update 𝑒(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑀 ) Semmat Output: 𝐺𝑒(𝑃𝑑caut ,𝑃𝑖 ,𝑃𝑀 ) 1 π‘Š := { FILTER(False)} // neutral element w.r.t. union eff 2 foreach (?𝑋 a 𝐢) ∈ 𝑃𝑖 do 3 foreach 𝐢 β€² s.t. 𝐢 βŠ‘ ¬𝐢 β€² ∈ 𝒯 or 𝐢 β€² βŠ‘ ¬𝐢 ∈ 𝒯 do βˆ’ 4 𝛩𝐢 β€² := { FILTER(False)} 5 foreach (?π‘Œ a 𝐢 β€² ) ∈ 𝑃𝑑caus do βˆ’ βˆ’ 6 𝛩𝐢 β€² := 𝛩𝐢 β€² UNION {𝑃𝑀 πœƒ[?π‘Œ ↦→?𝑋]} βˆ’ 7 π‘Š := π‘Š UNION {{?𝑋 a 𝐢 β€² } MINUS {𝛩𝐢 β€² }} 8 𝑄 := ASK WHERE {{𝑃𝑀 }.{π‘Š }}; 9 if 𝑄(𝐺) then 10 return 𝐺 11 else Semmat 12 return 𝐺𝑒(𝑃𝑑brave ,𝑃𝑖 ,𝑃𝑀 ) For each concept 𝐢 β€² disjoint from 𝐢, we need to check that a triple matching the pattern {?𝑋 a 𝐢 β€² } is in the store 𝐺 and will not be deleted by 𝑒. Deletion happens if there is a pattern {?π‘Œ a 𝐢 β€² } ∈ 𝑃𝑑caus such that the variable ?π‘Œ can be bound to the same value as ?𝑋 in the WHERE clause 𝑃𝑀 . Line 6 of Alg. 4 produces such a check, using a copy of 𝑃𝑀 , in which the variable ?π‘Œ is replaced by ?𝑋 and all other variables are replaced with distinct fresh ones. Since there can be several such triple patterns in 𝑃𝑑caus , testing for clash elimination via the DELETE clause requires a disjunctive graph pattern βˆ’ β€² 𝛩𝐢 β€² constructed at line 6 and combined with {?𝑋 a 𝐢 } using MINUS at line 7. Finally, the resulting pattern is appended to the list π‘Š of clash checks using UNION . As a result, {𝑃𝑀 }.{π‘Š } queries for triples that are not deleted by 𝑒 and clash with an instantiation of some class membership assertion {?𝑋 a 𝐢} ∈ 𝑃𝑖eff . Theorem 3. Alg. 4, given a SPARQL update 𝑒 and a consistent materialized triple store 𝐺 = 𝒯 βˆͺ π’œ, computes a new consistent and materialized state w.r.t. cautious semantics. Example 4. Alg. 4 rewrites the safe update safe(𝑒) from Ex. 2 as follows: ASK WHERE{{?X :attendsClassOf ?Y MINUS{{?X1 :attendsClassOf ?X} UNION {?Y :attendsClassOf ?Y2}}} .{{?Y a :Student} UNION {?X a :Professor}}} Now, consider an update 𝑒′ having both INSERT and DELETE clauses: DELETE {?Y a :Professor} INSERT{?X a :Student} WHERE {?X :attendsClassOf ?Y} The update 𝑒′ inserts a single class membership fact and thus is always intrinsically consistent. The ASK query in Alg. 4 takes the DELETE clause of 𝑒′ into account: ASK WHERE {{?X :attendsClassOf ?Y} .{{?X a :Professor} MINUS {?Z :attendsClassOf ?X }}} 5 Discussion and Conclusions In this paper, we have taken a step further from our previous work, in combining SPARQL Update and RDFS entailment by also adding class/concept disjointness as a first step towards dealing with inconsistencies in the context of SPARQL Update. As discussed throughout the paper, previous approaches to handle inconsistencies in DL KB evolution (e.g., [4, 5, 9]) have assumed that the set of ABox assertions to be inserted is intrinsically consistent w.r.t. the TBox, and thus inconsistencies are treated only w.r.t. the old state of the knowledge base. As we have shown, this assumption is not trivially verifiable in the context of SPARQL updates, where DELETE/INSERT atoms are instantiated by a WHERE clause, and clashing triples could be instantiated within the same INSERT operation. We have addressed this problem by providing means to check whether a SPARQL update is intrinsically consistent and defining a safe rewriting that removes intrinsic clashes during inserts on-the-fly. Next, taken that the problem of intrinsic consistency is solved, we have demonstrated how to extend the approach of [4] to SPARQL updates. We have defined a materialization and consistency preserving rewriting for SPARQL updates that essentially combines the ideas of [4] and our previous work on SPARQL updates under RDFS for materialized triple stores [1], dealing with clashes due to class disjointness axioms in a brave manner. That is, we overwrite inconsistent information in the triple store in favor of information being inserted. Alternatively, we have also defined a dual consistency-preserving update semantics that on the contrary discards insertions that would lead to inconsistencies. Besides practical evaluation of the proposed algorithms, we plan to further extend our work towards increasing coverage of more expressive logics and OWL profiles, namely additional axioms from OWL 2 RL or OWL 2 QL [10]. Also, it could be useful to investigate further semantics, allowing for compromises between fully discarding the inconsistent old data and refusing the entire update due to clashes, and lift our methods to work with stores that are not fully materialized. The consideration of negative information is an important issue also in other related works on knowledge base updates: for instance, the seminal work on database view maintenance by Gupta et al. [7] is also used in the context of materialized views using Datalog rules with stratified negation. Likewise, let us mention the work of Winslett [13] on formula-based semantics to updates, where negation is also considered. Acknowledgements. This work was supported by the Vienna Science and Technology Fund (WWTF), project ICT12-SEE, and EU IP project Optique (Scalable End-user Access to Big Data), grant agreement n. FP7-318338. References 1. Ahmeti, A., Calvanese, D., Polleres, A.: Updating RDFS aboxes and tboxes in SPARQL. In: Proc. of the 13th Int. Semantic Web Conf. (ISWC). Lecture Notes in Computer Science, vol. 8796, pp. 441–456. Springer (2014) 2. Beckett, D., Berners-Lee, T., Prud’hommeaux, E., Carothers, G.: RDF 1.1 Turtle – Terse RDF Triple Language. W3C Recommendation, World Wide Web Consortium (Feb 2014), available at http://www.w3.org/TR/turtle/ 3. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. of Automated Reasoning 39(3), 385–429 (2007) 4. Calvanese, D., Kharlamov, E., Nutt, W., Zheleznyakov, D.: Evolution of DL-Lite knowledge bases. In: Proc. of the 9th Int. Semantic Web Conf. (ISWC). Lecture Notes in Computer Science, vol. 6496, pp. 112–128. Springer (2010) 5. De Giacomo, G., Lenzerini, M., Poggi, A., Rosati, R.: On instance-level update and erasure in description logic ontologies. J. of Logic and Computation 19(5), 745–770 (2009) 6. Gearon, P., Passant, A., Polleres, A.: SPARQL 1.1 update. W3C Recommendation, World Wide Web Consortium (Mar 2013), available at http://www.w3.org/TR/ sparql11-update/ 7. Gupta, A., Mumick, I.S., Subrahmanian, V.S.: Maintaining views incrementally. In: Proc. of the ACM SIGMOD Int. Conf. on Management of Data. pp. 157–166 (1993) 8. Hayes, P., Patel-Schneider, P.: RDF 1.1 semantics. W3C Recommendation, World Wide Web Consortium (Feb 2014), available at http://www.w3.org/TR/rdf11-mt/ 9. Liu, H., Lutz, C., Milicic, M., Wolter, F.: Updating description logic ABoxes. In: Doherty, P., Mylopoulos, J., Welty, C.A. (eds.) Proc. of the 10th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR). pp. 46–56. AAAI Press (2006) 10. Motik, B., Grau, B.C., Horrocks, I., Wu, Z., Fokoue, A., Lutz, C.: OWL 2 Web Ontology Language profiles (second edition). W3C Recommendation, World Wide Web Consortium (Dec 2012), available at http://www.w3.org/TR/owl2-profiles/ 11. MuΓ±oz, S., PΓ©rez, J., GutiΓ©rrez, C.: Minimal deductive systems for RDF. In: Proc. of the 4th European Semantic Web Conf. (ESWC). Lecture Notes in Computer Science, vol. 4519, pp. 53–67. Springer (2007) 12. Polleres, A., Hogan, A., Delbru, R., Umbrich, J.: RDFS & OWL reasoning for linked data. In: Reasoning Web. Semantic Technologies for Intelligent Data Access – 9th Int. Summer School Tutorial Lectures (RW), Lecture Notes in Computer Science, vol. 8067, pp. 91–149. Springer (2013) 13. Winslett, M.: Updating Logical Databases. Cambridge University Press (2005)