=Paper=
{{Paper
|id=None
|storemode=property
|title=SPARQL Update under RDFS Entailment in Fully Materialized and Redundancy-Free Triple Stores
|pdfUrl=https://ceur-ws.org/Vol-1059/ordring2013-paper3.pdf
|volume=Vol-1059
|dblpUrl=https://dblp.org/rec/conf/semweb/AhmetiP13
}}
==SPARQL Update under RDFS Entailment in Fully Materialized and Redundancy-Free Triple Stores==
SPARQL Update under RDFS Entailment in Fully Materialized and Redundancy-Free Triple Stores Albin Ahmeti1 and Axel Polleres2 1 Vienna University of Technology, Favoritenstraße 9, 1040 Vienna, Austria 2 Vienna University of Economics and Business, Welthandelsplatz 1, 1020 Vienna, Austria Abstract. Processing the dynamic evolution of RDF stores has recently been stan- dardized in the SPARQL 1.1 Update specification. However, computing answers entailed by ontologies in triple stores is usually treated orthogonal to updates. Even the W3C’s recent SPARQL 1.1 Update language and SPARQL 1.1 Entail- ment Regimes specifications explicitly exclude a standard behavior how SPARQL endpoints should treat entailment regimes other than simple entailment in the context of updates. In this paper, we take a first step to close this gap, by drawing from query rewriting techniques explored in the context of DL-Lite. We define a fragment of SPARQL basic graph patterns corresponding to (the RDFS fragment of) DL-Lite and the corresponding SPARQL update language discussing possible semantics along with potential strategies for implementing them. We treat both (i) reduced RDF Stores, that is, redundancy-free RDF stores that do not store any RDF triples (corresponding to DL Lite ABox statements) entailed by others already, and (ii) materialized RDF stores, which store all entailed triples explicitly. 1 Introduction SPARQL provides a standard for accessing structured Data on the Web and the avail- ability of such a standard may well be called one of the key factors to the success and increasing adoption of RDF and semantic technologies. Still, in its first iteration the SPARQL specification has neither defined any standard way how to treat entailments with respect to RDFS and OWL ontologies, nor provided standard means how to update dynamic RDF data. Both these gaps have been addressed within the recent SPARQL 1.1 specification, which both provides means to define query answers under ontological entailments (SPARQL 1.1 Entailment Regimes [6]) as well as an update language to update RDF data stored in a triple store (SPARQL 1.1 Update [5]).3 Nonetheless, these specifications do not provide a standard behavior how SPARQL endpoints should treat entailment regimes other than simple entailment in the context of updates. The main issue of updates under entailments is how updates shall deal with implied statements: – What does it mean if an implied triple is explicitly (re-)inserted (or, resp., deleted)? – Which (if any) additional triples should be inserted, (or, resp., deleted) upon updates? 3 For the sake of simplicity we do not take into account NAMED graphs in this paper, which is why we speak of “triple stores” as opposed to “graph stores”. In the context of RDFS one might further ask about how to deal with axiomatic triples such as rdf:type a rdf:Property., issues around the treatment of blank nodes [12], or, in the context of OWL, the treatment of inconsistencies arising through updates; for the sake of this present paper, we do not treat these additional issues separately, but focus on a deliberately minimal treatment of RDFS inferences along the lines of [14]. As it turns out, even in this confined setting, updates as defined in the SPARQL 1.1 Update specification impose non-trivial challenges; particularly, specific issues arise through the interplay of INSERT, DELETE, and WHERE clauses within a single SPARQL update operation, which – to the best of our knowledge – have not been considered in the literature in combination as of yet. Example 1. As a running example, let us assume a RDF triple store 𝐺, containing data about family relationships along with a RDFS ontology 𝑂𝑓 𝑎𝑚 comprising the following axioms (in Turtle syntax). :hasFather rdfs:subPropertyOf :hasParent. :hasMother rdfs:subPropertyOf :hasParent. :Father rdfs:subClassOf :Parent. :Mother rdfs:subClassOf :Parent. :hasFather rdfs:range :Father; rdfs:domain :Child. :hasMother rdfs:range :Mother; rdfs:domain :Child. :hasParent rdfs:range :Parent; rdfs:domain :Child. Assuming now that the following data assertions are also in the RDF Store 𝐺 :joe :hasParent :jack. :joe :hasMother :jane. the following query should return both :jack and :jane as (RDFS entailed) answers: SELECT ?Y WHERE { :joe :hasParent ?Y. } A SPARQL endpoint that only considers simple entailment though, would only return :jack as an answer. The intended behavior for the query in Example 1 is typically achieved by either (i) computing entailed answers at query runtime, which can actually be achieved by query rewriting techniques, or (ii) by materializing all implied triples in the store, normally at loading time. That is, on the one hand, borrowing from query rewriting techniques from DL-Lite (such as e.g. PerfectRef [3], several refinements of which have been proposed in the literature) one can reformulate such a query to return also the implied answers. A mini- malist query reformulation algorithm – that is, a down-stripped version of PerfectRef [3], covering the essential entailments of RDFS – is shown in Alg. 1. Example 2 (cont’d). If we interpret the WHERE clause in the query in Example 1 as a conjunctive query, and the RDFS axioms from 𝑂𝑓 𝑎𝑚 as DL TBox, the resulting UCQ as from Alg. 1 can be just translated back into SPARQL as follows. SELECT ?Y WHERE {{ :joe :hasParent ?Y. } UNION { :joe :hasFather ?Y. } UNION { :joe :hasMother ?Y. }} Indeed this query returns both parents, :jane and :jack. While the rewritten query is exponential in the worst case wrt. the size of the ontology, for moderate size ontologies, such a rewriting approach is quite feasible. On the other hand, an alternative is to materialize all inferences (for instance produced by the minimalist inference rules set for RDFS from [14]) in the triple store, such that the original query can be used ’as is’. Example 3 (cont’d). The materialized version of 𝐺 would contain the following triples – for conciseness we only show assertional implied triples here, that is triples from rules 2)𝑏), 3)𝑏), 4)𝑎)+𝑏) from [14]. :joe :hasParent :jack. :joe a :Child. :jack a :Parent. :joe :hasMother :jane; :hasParent :jane. :jane a :Mother, :Parent. Obviously, on such an endpoint, the original query from Example 1 would already return the expected results. However, when it comes to updating the RDF store in terms of SPARQL 1.1 Update, things become less clear. Example 4 (cont’d). The following operation tries to delete an implied class membership whereas it tries to (re-)insert another implied class membership. DELETE{?X a :Child.} INSERT {?Y a :Mother.} WHERE {?X :hasMother ?Y.} Various existing triple stores offer different solutions to these problems, ranging from – probably most commonly – ignoring entailments in updates altogether to keeping explicit and implicit triples separate and recomputing all entailments upon updates. That is, in the former case (ignoring entailments) updates only refer to explicitly asserted information, which may result in non-intuitive behaviors, whereas the latter case (re- materialization) may be very costly – while still not eliminating all non-intuitive cases, as we will see. The problem is aggravated by no systematic approach to which implied triples to store explicitly in a triple store and which not. As an example, take for instance Table 1. DL-Lite axioms vs. RDF(S) Algorithm 1: rewrite DLRDFS RDFS Input: Conjunctive query 𝑞, TBox 𝒯 1 𝐴1 ⊑ 𝐴2 A1 rdfs : subClassOf A2 . 2 ∃𝑃 ⊑ 𝐴 P rdfs : domain A. Output: Union (set) of conjunctive queries 3 ∃𝑃 − ⊑ 𝐴 P rdfs : range A. 1 𝑃 := {𝑞} 4 𝑃1 ⊑ 𝑃2 P1 rdfs : subPropertyOf P2 . 2 repeat 5 𝐴(𝑥) x rdf : type A. 3 𝑃 ′ := 𝑃 6 𝑃 (𝑥, 𝑦) x P y. 4 foreach 𝑞 ∈ 𝑃 ′ do 5 foreach 𝑔 in 𝑞 do // expansion Table 2. Semantics of gr(𝑔, 𝐼) in Alg.1 6 foreach inclusion axiom 𝐼 in 𝒯 do 𝑔 𝐼 gr(𝑔/𝐼) 7 if 𝐼 is applicable {︀ to 𝑔 then }︀ 8 𝑃 := 𝑃 ∪ 𝑞[𝑔/ gr(𝑔, 𝐼)] 𝐴(𝑥) 𝐵⊑𝐴 𝐵(𝑥) ′ 𝐴(𝑥) ∃𝑃 ⊑ 𝐴 𝑃 (𝑥, _) 9 until 𝑃 = 𝑃 𝐴(𝑥) ∃𝑃 − ⊑ 𝐴 𝑃 (_, 𝑥) 10 return 𝑃 𝑃1 (𝑥, 𝑦) 𝑃2 ⊑ 𝑃1 𝑃2 (𝑥, 𝑦) Here, ‘_’ stands for a “fresh” variable. DBpedia which stores certain entailed triples, but not all (cf. the example in [16]). In this paper we try to argue for a more systematic approach for dealing with updates in the context of RDFS entailments. More specifically, we will distinguish between two kinds of triple stores, that is (i) reduced RDF Stores, that is, redundancy-free RDF stores that do not store any assertional (ABox) triples entailed by others already, and (ii) materialized RDF stores, which store all entailed ABox triples explicitly. Next, we propose alternative update samentics that preserve the respective types (i) and (ii) of triple stores, and discuss possible implementation strategies, partially inspired by query rewriting techniques that have become popular in the context of ontology-based data access (OBDA) [11] and DL-Lite [3]. For the sake of this paper we will assume RDFS ontologies to be static and only assertional statements (ABoxes) to be updated. Along these lines, we will define a fragment of the SPARQL 1.1 Update language, that allows to Add/Delete assertional ABox statements, but not terminological statements (using the RDFS vocabulary). As already shown in [8], erasure of ABox statements is deterministic in the context of RDFS, but insertion and particularly the interplay of DELETE/INSERT in SPARQL 1.1 Update has not been considered therein. The remainder of this paper continues with preliminaries (RDFS, SPARQL, DL-Lite, SPARQL update operations) in Section 2. We then present alternative update semantics along with possible implementation strategies in Section 3, and subsequently discuss future and related work (Section 4) before we conclude (Section 5). 2 Preliminaries Since we will draw from ideas coming from OBDA and DL-Lite, we will use RDF graphs and RDFS ontologies compatible with DL-Lite ontologies in the present paper. Definition 1 (RDFS ontology, ABox, TBox, triple store). Let 𝐴 be an atomic concept (or class, resp.) name and 𝑃 be an atomic role (or, property, resp.) name, and 𝛤 be a set of constants which in context of RDF(S) coincides with the set 𝐼 of IRIs. We assume the IRIs used for concepts, roles, and constants to be disjoint from IRIs of the RDFS and OWL vocabularies.4 Let 𝑥, 𝑦 ∈ 𝛤 , then we call a set of inclusion axioms of the forms 1-4 in Tab. 1 a RDFS ontology, or (RDFS) TBox, whereas we call a set of assertions of the forms 5+6 in Tab. 1 an (RDF) ABox, where we view RDF and DL notation interchangeably. Finally, we call a container 𝐺 = 𝒯 ∪ 𝒜 holding a TBox 𝒯 and an ABox 𝒜 an (RDFS) triple store. That is, more informally, we treat any RDF graph consisting of triples without non-standard RDFS vocabulary use as a pair of ABox and TBox triples, where, for the purposes of this paper, we will consider the latter fixed/immutable. Further to this, we will treat conjunctive queries (with only distinguished variables) and SPARQL basic graph patterns (BGPs) interchangeably in this paper as follows. Note that we do not provide details on more complex patterns (OPTIONAL, NOT EXISTS, FILTER, etc.) in 4 That is, we assume no “non-standard use” [16] of these vocabularies. We may also assume concept names, role names and constants mutually disjoint (or distinguish them in the sense of “punning” [13] based on their position in atoms or RDF triples). SPARQL 1.1 which are defined on top of BGP matching, but refer the reader to [9] for details on the semantics of these patterns. Definition 2 (BGP, CQ, UCQ). A conjunctive query (CQ) 𝑞, or basic graph pattern (BGP), is a set of atoms of the forms 5+6 from Tab. 1, where 𝑥, 𝑦 are either constants from 𝛤 or variables (written as alphanumeric strings preceded by ’?’). A union of conjunctive queries (UCQ) 𝑄, or UNION pattern, is a set of conjunctive queries. By 𝑉 𝑎𝑟(𝑞) (or 𝑉 𝑎𝑟(𝑄), resp.) we denote the set of variables occurring in 𝑞 (or 𝑄, resp.). This definition allows only restricted forms of general SPARQL BGPs that corre- spond to conjunctive queries along the lines of Description Logics; that is, we do not allow, e.g., queries with variables in predicate positions nor “terminological” queries, e.g. {?X rdfs:subClassOf ?Y.}. We neither consider blank nodes separately in this paper.5 By these restrictions, we can treat query answering and BGP matching in SPARQL analogously and define it in terms of interpretations and models (as usual in Description Logics) as follows. Definition 3 (Interpretation, satisfaction, model). An interpretation ℐ = ⟨𝛥ℐ , ·ℐ ⟩ consists of a non-empty set 𝛥ℐ called the object domain, and an interpretation function ·ℐ which maps – each atomic concept 𝐴 to a subset of the domain 𝐴ℐ ⊆ 𝛥ℐ , – each atomic role 𝑃 to a binary relation over the domain 𝑃 ℐ ⊆ 𝛥ℐ × 𝛥ℐ , – each element of 𝛤 to an element of 𝛥ℐ . For concept descriptions ⃒ the interpretation }︀ function is defined as follows: ℐ – (∃𝑅) = 𝑥 ∈ 𝛥ℐ ⃒ ∃𝑦.(𝑥, 𝑦) ∈ 𝑅ℐ {︀ ℐ ⃒ – (∃𝑅− ) = 𝑦 ∈ 𝛥ℐ ⃒ ∃𝑥.(𝑥, 𝑦) ∈ 𝑅ℐ {︀ }︀ An interpretation ℐ satisfies an inclusion axiom 𝐸1 ⊑ 𝐸2 (of one of the forms 1-4 in Tab. 1) if 𝐸1ℐ ⊆ 𝐸2ℐ . Analogously, ℐ satisfies an ABox assertion of the form – 𝐴(𝑥) if 𝑥ℐ ∈ 𝐴ℐ – 𝑃 (𝑥, 𝑦) if (𝑥ℐ , 𝑦 ℐ ) ∈ 𝑃 ℐ Finally, an interpretation ℐ is called a model of a triple store 𝐺 = 𝒯 ∪ 𝒜, denoted ℐ |= 𝐺, if ℐ |= 𝒯 and ℐ |= 𝒜, that is, if ℐ satisfies all terminological axioms in 𝒯 and all ABox assertions in 𝒜. Definition 4 (Query answers). For a CQ 𝑞 (or, UCQ 𝑄, resp.) and a triple store 𝐺, a substitution 𝜃 from variables in 𝑉 𝑎𝑟(𝑞) to constants in 𝛤 such that 𝑞𝜃 is true (or, there exists a 𝑞 ∈ 𝑄 with 𝑞𝜃 is true) in every model of 𝐺 is called an answer (under RDFS Entailment) to 𝑞, and we denote the set of all answers by 𝑎𝑛𝑠RDFS (𝑞, 𝐺) (or, 𝑎𝑛𝑠RDFS (𝑄, 𝐺), resp.). Since we only treat RDFS entailment, but no other entailment regimes in this paper, in the following we will drop the subscript and simply write 𝑎𝑛𝑠 instead of 𝑎𝑛𝑠RDFS . The following results follow immediately from e.g. [8, 14] & [3] and show that query answering under RDF can be done by either query rewriting or materialization. 5 Blank nodes in a triple store may be considered as constants and we do not allow blank nodes in queries, which does not affect the expressivity of SPARQL. Proposition 1. Let 𝐺 = 𝒯 ∪ 𝒜 be a triple store, 𝑞 be a CQ. Further, let 𝑟𝑒𝑤𝑟𝑖𝑡𝑒(𝑞, 𝒯 ) be as per Alg. 1 above, whereas by 𝑚𝑎𝑡𝑒𝑟𝑖𝑎𝑙𝑖𝑧𝑒(𝐺) = 𝒯 ∪ 𝒜′ we denote the triple store obtained from exhaustive application of the following inference rules on 𝐺:6 ?𝑃 rdfs:subPropertyOf ?𝑄. ?𝑆 ?𝑃 ?𝑂. ?𝐶 rdfs:subClassOf ?𝐷. ?𝑆 rdf:type ?𝐶. ?𝑆 ?𝑄 ?𝑂. ?𝑆 rdf:type ?𝐷. ?𝑃 rdfs:domain ?𝐶. ?𝑆 ?𝑃 ?𝑂. ?𝑃 rdfs:range ?𝐶. ?𝑆 ?𝑃 ?𝑂. ?𝑆 rdf:type ?𝐶. ?𝑂 rdf:type ?𝐶. Then, 𝑎𝑛𝑠(𝑞, 𝐺) = 𝑎𝑛𝑠(𝑟𝑒𝑤𝑟𝑖𝑡𝑒(𝑞, 𝒯 ), 𝐺 ∖ 𝒯 ) = 𝑎𝑛𝑠(𝑞, 𝑚𝑎𝑡𝑒𝑟𝑖𝑎𝑙𝑖𝑧𝑒(𝐺) ∖ 𝒯 ). Various existing triple stores (such as e.g. BigOWLIM [2]) have the option to perform materialization directly upon loading data. Accordingly, we will call materialized triple stores all triple stores that in each state always guarantee 𝐺 = 𝑚𝑎𝑡𝑒𝑟𝑖𝑎𝑙𝑖𝑧𝑒(𝐺). On the other extreme, this suggests alternatively to discuss triple stores that do not store any redundant ABox triples. By 𝑟𝑒𝑑𝑢𝑐𝑒(𝐺) we will denote the hypothetical operator that produces the reduced “core” of 𝐺, and we call a triple store reduced if 𝐺 = 𝑟𝑒𝑑𝑢𝑐𝑒(𝐺). While we do not provide the algorithm to compute 𝑟𝑒𝑑𝑢𝑐𝑒(𝐺), we note that this core is uniquely determined in our setting whenever 𝒯 is acyclic (which is usually a safe assumption)7 ; it could be naïvely computed by exhaustively “marking” each triple that can be inferred from applying any of the rules in Prop. 1 and subsequently removing all marked elements of 𝒜. The following observation follows trivially. Lemma 1. Let 𝐺 = 𝒯 be a triple store with an empty ABox, then 𝐺 is both reduced and materialized. Finally, we introduce the notion of a SPARQL update operation. Definition 5 (SPARQL update operation). Let 𝑃𝑑 , 𝑃𝑖 be BGPs and 𝑃𝑤 a BGP or UNION pattern. Then an update operation 𝑢(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑤 ) has the form DELETE 𝑃𝑑 INSERT 𝑃𝑖 WHERE 𝑃𝑤 Intuitively, the semantics of executing 𝑢(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑤 ) on 𝐺, denoted as 𝐺𝑢(𝑃𝑑 ,𝑃𝑖 ,𝑃𝑤 ) is defined interpreting both 𝑃𝑑 and 𝑃𝑖 as “templates” to be instantiated with the solutions of 𝑎𝑛𝑠(𝑃𝑤 , 𝐺), resulting in sets of ABox statements 𝒜𝑑 and 𝒜𝑖 to be deleted from and inserted into 𝐺, respectively. A naïve update semantics follows straightforwardly. Definition 6 (Naïve update semantics). Let 𝐺 = 𝒯 ∪𝒜 be a triple store, 𝑢(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑤 ) be an update operation, then 𝐺𝑢(𝑃𝑑 ,𝑃𝑖 ,𝑃𝑤 ) = (𝐺 ∖ 𝒜𝑑 ) ∪ 𝒜𝑖 , where ⋃︁ ⋃︁ 𝒜𝑑 = 𝑃𝑑 𝜃 and 𝒜𝑖 = 𝑃𝑖 𝜃 . 𝜃∈𝑎𝑛𝑠(𝑃𝑤 ,𝐺) 𝜃∈𝑎𝑛𝑠(𝑃𝑤 ,𝐺) As easily seen, this naïve semantics neither preserves materialized nor reduced triple stores; we leave it to the reader to confirm this observation on the update from Example 4 both on the reduced triple store from Example 1 and, resp., on the materialized triple store from Example 3. 6 These rules correspond to rules 2)𝑏), 3)𝑏), 4)𝑎)+𝑏) from [14]; since we neither consider blank nodes nor terminological inferences, we do not need the further rules from [14]. 7 We note that even in the case when the TBox is cyclic we could define a deterministic way to remove redundancies, e.g. by preserving the lexicographically smallest ABox statements only within a cycle. That is, given TBox 𝐴 ⊑ 𝐵 ⊑ 𝐶 ⊑ 𝐴 and ABox 𝐴(𝑥), 𝐶(𝑥); we would delete 𝐶(𝑥) and retain 𝐴(𝑥) only, to preserve reducedness. 3 Materialized/Reduced Preserving Update Semantics In this section, we will investigate alternative update semantics under RDFS entailment that either preserve materialized or reduced stores and discuss in how far these semantics can – similar to query answering – be implemented in off-the-shelf SPARQL 1.1 triple stores by simple rewriting techniques. We will look into the following possible semantics: Sem𝑚𝑎𝑡 0 : As a baseline for a materialized-preserving semantics, we discuss to apply the naïve semantics, followed by (re-)materialization of the whole triple store. Sem𝑚𝑎𝑡 1 : Another materialized-preserving semantics could follow the intention to – delete the instantiations of 𝑃𝑑 plus all their causes; – insert the instantiations of 𝑃𝑖 plus all their effects. Sem𝑚𝑎𝑡 2 : This semantics is also materialized-preserving but extends Sem𝑚𝑎𝑡 1 with the intention to additionally (recursively) delete “dangling” effects for instantiations of 𝑃𝑑 , that is, effect of to-be-deleted triples that would not be implied any longer by any non-deleted triples after deletion. Sem𝑟𝑒𝑑 0 : Again, the baseline for a reduced-preserving semantics would be to apply the naïve semantics, followed by (re-)reducing the triple store. Sem𝑟𝑒𝑑 1 : This reduced-preserving semantics extends Sem0 𝑟𝑒𝑑 by additionally deleting the causes of instantiations of 𝑃𝑑 . The definitions of semantics Sem𝑚𝑎𝑡 0 and Sem𝑟𝑒𝑑 0 are straightforward. Definition 7 (Baseline materialized- and reduced-preserving update semantics). Let 𝐺 and 𝑢(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑤 ) be as in Def. 6 above. Then, analogously to Def. 6, we define Sem𝑚𝑎𝑡 0 as Sem𝑚𝑎𝑡 𝐺𝑢(𝑃0𝑑 ,𝑃𝑖 ,𝑃𝑤 ) = 𝑚𝑎𝑡𝑒𝑟𝑖𝑎𝑙𝑖𝑧𝑒(𝐺𝑢(𝑃𝑑 ,𝑃𝑖 ,𝑃𝑤 ) ) and Sem𝑟𝑒𝑑 0 as Sem𝑟𝑒𝑑 𝐺𝑢(𝑃0𝑑 ,𝑃𝑖 ,𝑃𝑤 ) = 𝑟𝑒𝑑𝑢𝑐𝑒(𝐺𝑢(𝑃𝑑 ,𝑃𝑖 ,𝑃𝑤 ) ) Let us proceed with a quick “reality-check” on these two baseline semantics by means of our running example. Example 5. By referring back to the update from Example 4: as easily seen, neither Sem𝑚𝑎𝑡 0 (executed on the materialized triple store of Example 3), nor Sem𝑟𝑒𝑑 0 (executed on the reduced triple store of Example 1) would have any effect. As this behavior is quite arguable, which is why we will proceed with discussing how the other intentions could be implemented or, resp., what implications such alternative update semantics would have. Alternative Materialized-Preserving Semantics. Let us first look into materialized- preserving semantics in more detail. As it turns out, Sem𝑚𝑎𝑡 1 can be achieved fairly straightforwardly, building upon similar rewriting ideas as query rewriting. However, as we will argue, there might be reasons why one wants to adopt the other semantics. As we have seen, in the setting of RDFS we can use Alg. 1 rewrite to expand a CQ 𝑞 to a UCQ that considers all its “causes”. A slight variation can of course be used to compute the set of all causes, that is, in the most naïve fashion by just “flattening” the set of sets returned by Alg. 1 to a simple set; we denote this flattening operation on a set of sets 𝑆 as 𝑓 𝑙𝑎𝑡𝑡𝑒𝑛(𝑆). Likewise, we can easily define a “turned around” rewriting that computes all the “effects” of a pattern by modifying Alg. 1 such that it simply considers Table 2 with the first and third column “swapped” (and ‘_’ just matching any variable or constant).8 Let us call the resulting algorithm 𝑟𝑒𝑤𝑟𝑖𝑡𝑒𝑒𝑓 𝑓 . Likewise, we could have defined 𝑟𝑒𝑤𝑟𝑖𝑡𝑒𝑒𝑓 𝑓 algorithm as a modification of 𝑚𝑎𝑡𝑒𝑟𝑖𝑎𝑙𝑖𝑧𝑒(𝐺) applied to BGPs instead of to the graph itself.9 For the sake of completeness, in the same sense we can also analogously to 𝑟𝑒𝑑𝑢𝑐𝑒(𝐺) define an algorithm 𝑟𝑒𝑑𝑢𝑐𝑒(𝑃, 𝒯 ) which takes a pattern as input and reduces it wrt. a TBox 𝒯 . Using these considerations, we can thus define both rewritings that consider all causes, as well as rewritings that consider all effects: Definition 8 (Cause/Effect rewriting). Given a BGP insert or delete template 𝑃 for an update operation over the triple store 𝐺 = 𝒯 ∪ 𝒜, we define the all-causes-rewriting of 𝑃 as 𝑃 𝑐𝑎𝑢𝑠 = 𝑓 𝑙𝑎𝑡𝑡𝑒𝑛(𝑟𝑒𝑤𝑟𝑖𝑡𝑒(𝑃, 𝒯 )); likewise, we define the all-effects-rewriting of 𝑃 as 𝑃 𝑒𝑓 𝑓 = 𝑓 𝑙𝑎𝑡𝑡𝑒𝑛(𝑟𝑒𝑤𝑟𝑖𝑡𝑒𝑒𝑓 𝑓 (𝑃, 𝒯 )). This leads (almost) straightforwardly to a rewriting-based definition of Sem𝑚𝑎𝑡 1 : Definition 9. Let 𝑢(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑤 ) be an update operation, then Sem𝑚𝑎𝑡 𝐺𝑢(𝑃1𝑑 ,𝑃𝑖 ,𝑃𝑤 ) = 𝐺𝑢(𝑃 𝑐𝑎𝑢𝑠 ,𝑃 𝑒𝑓 𝑓 ,{𝑟𝑒𝑤𝑟𝑖𝑡𝑒(𝑃𝑤 )}{𝑃 𝑓 𝑣𝑎𝑟𝑠 }) 𝑑 𝑖 𝑑 𝑐𝑎𝑢𝑠𝑃 where 𝑃𝑑𝑓 𝑣𝑎𝑟𝑠 = {?x a rdfs:Resource. | for each ?x ∈ 𝑉 𝑎𝑟(𝑃𝑑 𝑑 ) ∖ 𝑉 𝑎𝑟(𝑃𝑑 )}. The only tricky part in this definition is the rewriting of the WHERE clause, where 𝑟𝑒𝑤𝑟𝑖𝑡𝑒(𝑃𝑤 ) is joined10 with a new pattern 𝑃𝑑𝑓 𝑣𝑎𝑟𝑠 that binds the “free” variables (i.e., the new variables denoted by ‘_’ in Tab. 2, which were introduced by Alg. 1) in the rewrit- ten DELETE clause, 𝑃𝑑𝑐𝑎𝑢𝑠 . Here ?x a rdfs:Resource. stands as a shortcut for the UCQ {{?x ?x𝑝 ?x𝑜 } UNION {?x𝑠 ?x ?x𝑜 } UNION {?x𝑠 ?x𝑝 ?x}} which binds any term occurring in the triple store. Example 6. Getting back to the materialized version of the triple store from Example 3, the update from Example 4 would, according to Sem𝑚𝑎𝑡1 be rewritten to DELETE { ?X a :Child. ?X :hasFather ?x1. ?X :hasMother ?x2. ?X :hasParent ?x3. } INSERT { ?Y a :Mother. ?Y a :Parent. } WHERE { { ?X :hasMother ?Y. } { ?x1 a rdfs:Resource. ?x2 a rdfs:Resource. ?x3 a rdfs:Resource.} } 8 We note that this could be viewed equivalently as simply applying the inference rules from Prop. 1 exhaustively to 𝑞. 9 Please note that it is not our intention to provide optimized algorithms here, but just to convey the feasibility of this rewriting. 10 A sequence of ’{}’-delimited patterns in SPARQL corresponds to a join, where such joins can again be nested with UNIONs, with the obvious semantics, for details cf. [9]. with the resulting triple store containing the following triples. :jane a :Mother, :Parent. :jack a :Father, :Parent. It is easy to argue that Sem𝑚𝑎𝑡1 is materialized-preserving. However, this semantics might still result in potentially non-intuitive behaviors. For instance, in this semantics the following subsequent calls of INSERTs and DELETEs might leave “traces” as shown by the following example. Example 7. Assuming 𝐺 = 𝑂𝑓 𝑎𝑚 from Example 1 with an empty ABox. Under Sem𝑚𝑎𝑡 1 semantics the following sequence of updates would leave as a trace the re- sulting triples as in Example 6, which would not be the case under the naïve semantics. DELETE{} INSERT {:joe :hasMother :jane;:hasFather :jack} WHERE{}; DELETE {:joe :hasMother :jane;:hasFather :jack} INSERT{} WHERE{}; Sem𝑚𝑎𝑡 2 tries to address the issue of such “traces”, but can no longer be formulated by such a relatively straightforward rewriting. For the present, preliminary paper we leave out a detailed definition/implementation capturing the intention of Sem𝑚𝑎𝑡 2 ; an obvious starting point would be the “Delete and Rederive” algorithm [7], but whether Sem𝑚𝑎𝑡 2 can be implemented by rewriting to SPARQL update operations following the naïve semantics is open. We emphasize though, that a semantics that achieves the intention of Sem𝑚𝑎𝑡 2 would still potentially run into arguable cases, since it might run into removing explicitly added assertions, whenever removed assertions cause these, as shown by the following example. Example 8. Assuming a materialized-preserving update semantics implementing the intention of Sem𝑚𝑎𝑡 2 , the behaviour of the following update sequence is left open: DELETE {} INSERT {:x a :Parent.} WHERE {}; DELETE {} INSERT {:x a :Father.} WHERE {}; DELETE {:x a :Father.} INSERT {} WHERE {}; Note that the second and third update operation in this example show that even insertion and immediate subsequent deletion of a single triple is not idempotent for Sem𝑚𝑎𝑡 2 . In a strict reading of Sem𝑚𝑎𝑡 2 ’s intention, :x a :Parent. would count as a dangling effect, since it is an effect of the triple in the second insertion with no other causes in the triple store, and thus should be removed upon the third update operation. Nonetheless, implementations of (materialized) triple stores may make a distinction between implicit and explicitly inserted triples (e.g. by storing explicit and implicit triples in separate graph); however, we consider the distinction between implicit triples and explicitly inserted ones non-trivial in the context of SPARQL 1.1 Update: for instance, is a triple inserted based upon implicit bindings in the WHERE clause of an INSERT statement to be considered “explicitly inserted” or not? The authors of the present paper tend towards avoiding such distinction, but we have more discussion of such rather philosophical aspects of possible SPARQL update semantics on our agenda; for now, let us rather turn our attention to the potential alternatives for reduced-preserving semantics. Alternative Reduced-Preserving Semantics. Again, similar to Sem𝑚𝑎𝑡 2 , for both the baseline semantics Sem𝑟𝑒𝑑 0 and Sem 𝑟𝑒𝑑 1 we leave it open whether it can be implemented by rewriting to SPARQL update operations following the naïve semantics, i.e., without the need to apply 𝑟𝑒𝑑𝑢𝑐𝑒(𝐺) over the whole triple store after each update; a strategy to avoid calling 𝑟𝑒𝑑𝑢𝑐𝑒(𝐺) would roughly include the following steps: – delete the instantiations 𝑃𝑑 plus all the effects of instantiations of 𝑃𝑖 , which will be implied anyways upon the new insertion, thus preserving reduced; – insert instantiations of 𝑃𝑖 only if they are not implied, that is, neither already implied by the current state of 𝐺 or all their causes in 𝐺 were to be deleted. We leave further investigation of whether these steps can be cast into one or several update requests directly by rewriting techniques to future work. Rather, we show that we can capture the intention of Sem𝑟𝑒𝑑 1 by a relatively straight- forward extension of the baseline semantics. Definition 10 (Sem𝑟𝑒𝑑 1 ). Let 𝑢(𝑃𝑑 , 𝑃𝑖 , 𝑃𝑤 ) be an update operation, then Sem𝑟𝑒𝑑 𝐺𝑢(𝑃1𝑑 ,𝑃𝑖 ,𝑃𝑤 ) = 𝑟𝑒𝑑𝑢𝑐𝑒(𝐺𝑢(𝑃 𝑐𝑎𝑢𝑠 ,𝑃𝑖 ,{𝑟𝑒𝑤𝑟𝑖𝑡𝑒(𝑃𝑤 )}{𝑃 𝑓 𝑣𝑎𝑟𝑠 }) ) 𝑑 𝑑 where 𝑃𝑑𝑐𝑎𝑢𝑠 and 𝑃𝑑𝑓 𝑣𝑎𝑟𝑠 are as before. Example 9. Getting back to the reduced version of the triple store from Example 1, the update from Example 4 would, according to Sem𝑟𝑒𝑑1 be rewritten to DELETE { ?X a :Child. ?X :hasFather ?x1. ?X :hasMother ?x2. ?X :hasParent ?x3. } INSERT { ?Y a :Mother. } WHERE { { ?X :hasMother ?Y. } { ?x1 a rdfs:Resource. ?x2 a rdfs:Resource. ?x3 a rdfs:Resource.} } with the resulting triple store after (re-)reduction containing the following triple. :jane a :Mother. Note that in a reduced store there is no need to delete effects of 𝑃𝑑 , which make the considerations that lead us to Sem𝑚𝑎𝑡 2 irrelevant for a reduced-preserving semantics, as shown in following example. Example 10. Under Sem𝑟𝑒𝑑 1 , as opposed to Sem1 𝑚𝑎𝑡 , the update sequence of Example 7 would leave no traces. However, note that the deletion of :joe :hasParent :jack in Example 9 might be viewed as non-intuitive, plus the update sequence of Example 8 would likewise result in an empty ABox, thus neither guaranteeing idempotence of single triple insertion followed by immediate deletion. We finally observe that the rewriting for Sem𝑟𝑒𝑑 1 is almost identical to Sem𝑚𝑎𝑡 1 , but reduced-preservation depends on a separate post-processing step not readily available in off-the-shelf triple stores, whereas Sem𝑚𝑎𝑡 1 could be directly executed by rewriting on any triple store, preserving materialization. 4 Further Related Work and Possible Future Directions There has already been work on updates over TBox-es in the context of tractable ontologies such as RDFS [8] and DL-Lite [4]. Even though such approaches give deterministic answers in general while computing updates, they are either limited to DELETEs or INSERTs, but not both at the same time nor in combination with templates filled by WHERE clauses, as in SPARQL 1.1; that is, these approaches are not based on BGP matching but rather on a set of ABox assertions to be updated known a priori. Pairing both DELETE and INSERT, as in our case, poses new challenges which we tried to fill the gap by introducing different rewriting semantics and taking into account both materialized and reduced stores. In the future, we plan to extend our work in the context of DL-Lite where we could reuse off-the-shelf, thoroughly studied rewriting techniques, and at the same time benefiting from a more expressive query language. Expanding beyond the simple language at the intersection of RDFS and DL-Lite towards more features of DL-Lite or coverage of less restricted RDF graphs and BGPs would impose new challenges: for instance, consistency checking and consistency-preserving updates (such as treated in [4]) do not yet play a role in the setting of RDFS; we are looking forward to investigate extensions in these directions. Besides, we are planning to implement all the proposed semantics and to test them against existing triple stores. As for further related works, in the context of reduced stores, let us also refer to [15], where the cost of redundancy elimination under various (rule-based) entailment regimes – including RDFS – is discussed in detail. In the area of database theory, there has been a lot of work on updating logical databases: Winslett [17] distinguishes between model-based and formula-based updates; our approach clearly falls in the latter category, more concretely, it could be viewed as sets of propositional knowledge base updates [10] generated by SPARQL instantiating DELETE/INSERT templates. Moreover, in the domain of updates over views, the “Delete and Rederive” algorithm [7] is closely related to our discussion about “dangling” effects (particularly, to our proposed semantics Sem𝑚𝑎𝑡 2 ) where the tuples to be deleted from (recursive) materialized views are determined if only there are no tuples in relational stores that map to the corresponding tuples in the view, i.e. there is no alternative derivation for the tuples in the view; as mentioned above, it is on our agenda to investigate whether such behavior can be achieved in terms of update-rewriting techniques rather than full re-derivations. Let us further note that in the more applied area of databases, there are obvious parallels between some of our considerations and CASCADE DELETEs in SQL (that is, deletions under foreign key constraints), in the sense that we trigger additional deletions of causes/effects in some of the proposed updated semantics discussed herein. Finally, it is worth mentioning that similar work has been done also in classical AI, e.g., in [1], where the main idea is to choose the maximal subset of propositions from a set that do not imply a certain proposition (which is going to be deleted), hereby coinciding with the delete rewriting of our Sem𝑚𝑎𝑡 1 semantics. 5 Conclusions We have discussed the semantics of SPARQL 1.1 Update in the context of RDFS. To the best of our knowledge, this is the first work to discuss in depth how to combine SPARQL 1.1 Entailment Regimes with the new SPARQL 1.1 Update language. While we acknowledge that our results are at a preliminary stage and we have been operating on a very restricted language in this paper, we could demonstrate that even in such a restricted setting (only capturing minimal RDFS entailments, only allowing ABox updates, restricting BGPs) the definition of a SPARQL 1.1 Update semantics under entailments is a non-trivial task. While we proposed several possible semantics, neither of those might seem intuitive for all possible use cases. So, while a “one-size-fits-all” update semantics might not exist, we believe that this is a timely topic that deserves increased attention. Particularly, in the light of the increased importance that RDF and SPARQL are experiencing also in dynamic domains where data is continuously updated (dealing with dynamics in Linked Data, querying sensor data or stream reasoning) further research in this area is needed. Acknowledgments This work has been funded by the Vienna Science and Technology Fund (WWTF, project ICT12-015), and by the Vienna PhD School of Informatics. References 1. Alchourron, C.E., Gardenfors, P., Makinson, D.: On the logic of theory change: Contraction functions and their associated revision functions. Theoria 48, 14–37 (1982) 2. Bishop, B., Kiryakov, A., Ognyanoff, D., Peikov, I., Tashev, Z., Velkov, R.: OWLIM: A family of scalable semantic repositories. Semantic Web 2(1), 33–42 (2011) 3. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: The DL-Lite family. Journal of Automated Reasoning 39(3), 385–429 (2007) 4. Calvanese, D., Kharlamov, E., Nutt, W., Zheleznyakov, D.: Evolution of DL-Lite knowledge bases. In: ISWC. pp. 112–128 (2010) 5. Gearon, P., Passant, A., Polleres, A.: SPARQL 1.1 Update (Mar 2013), W3C Recommendation 6. Glimm, B., Ogbuji, C., Hawke, S., Herman, I., Parsia, B., Polleres, A., Seaborne, A.: SPARQL 1.1 Entailment Regimes (Mar 2013), W3C Recommendation 7. Gupta, A., Mumick, I.S., Subrahmanian, V.S.: Maintaining views incrementally. In: ACM SIGMOD Int’l Conf. on Management of Data. pp. 157–166. ACM (1993) 8. Gutierrez, C., Hurtado, C., Vaisman, A.: Updating RDFS: from theory to practice. In: 8th Extended Semantic Web Conf. (ESWC 2011) (2011) 9. Harris, S., Seaborne, A.: SPARQL 1.1 Query Language (Mar 2013), W3C Recommendation 10. Katsuno, H., Mendelzon, A.O.: A unified view of propositional knowledge base updates. In: IJCAI. pp. 1413–1419 (1989) 11. Kontchakov, R., Rodriguez-Muro, M., Zakharyaschev, M.: Ontology-based data access with databases: A short course. In: Reasoning Web 2013, LNCS, vol. 8067, pp. 194–229. Springer (2013) 12. Mallea, A., Arenas, M., Hogan, A., Polleres, A.: On Blank Nodes. In: Proceedings of the 10th International Semantic Web Conference (ISWC 2011). LNCS, vol. 7031. Springer (Oct 2011) 13. Motik, B.: On the Properties of Metamodeling in OWL. Journal of Logic and Computation 17(4), 617–637 (2007) 14. Muñoz, S., Pérez, J., Gutiérrez, C.: Minimal deductive systems for RDF. In: ESWC. pp. 53–67 (2007) 15. Pichler, R., Polleres, A., Skritek, S., Woltran, S.: Complexity of redundancy detection on RDF graphs in the presence of rules, constraints, and queries. Semantic Web – Interoperability, Usability, Applicability 4(4) (2013) 16. Polleres, A., Hogan, A., Delbru, R., Umbrich, J.: RDFS & OWL reasoning for linked data. In: Reasoning Web 2013, LNCS, vol. 8067, pp. 91–149. Springer (Jul 2013) 17. Winslett, M.: Updating Logical Databases. Cambridge University Press (2005)