Introduction

Nested Constructs vs. Sub-Selects in SPARQL?

Axel Polleres

Juan Reutter

Egor V. Kostylev

1 0 Pontificia Universidad Cato ́lica de Chile and Center for Semantic Web research , CL 1 University of Oxford , UK 2 Vienna University of Economics and Business , AT

The issue of subqueries in SPARQL has appeared in different papers as an extension point to the original SPARQL query language. Particularly, nested CONSTRUCT in FROM clauses are a feature that has been discussed as a potential input for SPARQL 1.1 which was resolved to be left out in favour of select subqueries under the-unproven-conjecture that such subqueries can express nested construct queries. In this paper, we show that it is indeed possible to unfold nested SPARQL construct queries into subqueries in SPARQL 1.1; our transformation, however, requires an exponential blowup in the nesting depth. This suggests that nested construct queries are indeed a useful syntactic feature in SPARQL that cannot compactly be replaced by subqueries.

Introduction

SPARQL as a standard query language for the Semantic Web data has been the subject of many theoretical and practical works in over the last years. Upon version 1.0 of its standardisation, works have been published on its expressivity [ 2, 8, 10, 11 ] and on adding features, among others value creation and subqueries [ 2, 9 ].

These features have been taken on board in version 1.1 of the SPARQL specification. The first feature corresponds to an extension of SPARQL select queries, allowing SELECT clauses to assign values (i.e., blank nodes, literals or IRIs) to particular variables, by means of a new keyword AS. But more importantly, SPARQL 1.1 now permits the use of subqueries, that is, select queries can be arbitrarily nested within WHERE clauses [4, Section 12].

To showcase these operators, let us start from the following simple SPARQL 1.0 query that essentially replaces all object values in a given graph with blank nodes:

CONSTRUCT { ?S ?P []} WHERE { ?S ?P ?O }.

In SPARQL 1.1 we could assign blank nodes explicitly by means of a subquery: CONSTRUCT { ?S ?P ?Y } WHERE

{ SELECT ?S ?P ( bnode() AS ?Y ) WHERE { ?S ?P ?O } }, which can be written more compactly4 as follows ? This work was supported by Jubila¨umsstiftung WU Wien as well as by the WWTF project “SEE”, and also by Iniciativa Cient´ıfica Milenio Grant 120004 4 The BIND keyword is just syntactic sugar for writing a sub-SELECT query which just assigns a single expression to a variable more concisely.

CONSTRUCT { ?S ?P ?Y } WHERE { ?S ?P ?O BIND( bnode() AS ?Y ) }.

While nested select queries are allowed in SPARQL 1.1, other forms of subqueries such as nested CONSTRUCT queries in FROM clauses, as proposed in [ 2, 9 ], were not included to the new specification under the—unproven—conjecture that select subqueries can express nested construct queries.5

For instance, consider the following query Q that uses nested constructs:

The intention of this query is to construct first an RDF document where a person U is a friend of V if U knows V and U works together with V. This is done by means of the inner construct subquery of Q. The resulting graph is then queried by the WHERE clause of the outer query, which creates a graph of good friends: X is a good friend of Y if X and Y are friends and have another friend in common. With a little work one can see that query Q is indeed equivalent to the following SPARQL 1.1 query: CONSTRUCT { ?X ex:goodFriend ?Y } WHERE { { SELECT ?X ex:friends AS ?W1 ?Y

WHERE { ?X foaf:knows ?Y . ?X ex:worksWith ?Y } } . { SELECT ?X ex:friends AS ?W2 ?Z

WHERE { ?X foaf:knows ?Z . ?X ex:worksWith ?Z } } . { SELECT ?Y ex:friends AS ?W3 ?Z

WHERE { ?Y foaf:knows ?Z . ?Y ex:worksWith ?Z } } }. In essence, we have replaced and rewritten each of the triples in the outer WHERE clause for the conditions in the inner construct query that give rise to these triples, as it is for instance commonly done in query rewriting using views [ 1 ].

But can this idea be generalised for all nested construct queries? That is, is it true that given any SPARQL query Q that uses nested CONSTRUCT one can find a query in SPARQL 1.1 (without nested CONSTRUCT) that is equivalent to Q? This question was considered in [ 7 ], with a positive answer for a restricted case of SPARQL queries where one disallows blank nodes in templates and only uses the functionalities in SPARQL 1.0. On the other hand, it was shown that there are some queries with nested CONSTRUCT that cannot be expressed as a SPARQL 1.0 query (this follows from the close connection between construct queries and data exchange [ 3 ]). However, the results in [ 7 ] consider a much weaker language, because they do not consider any of the SPARQL 1.1 operators, and in particular the binding of variables in SELECT subqueries.

In this paper we show that by allowing full SPARQL 1.1 one can indeed express nested construct queries, that is, in other words, the language of SPARQL 1.1 construct 5 https://www.w3.org/2009/sparql/track/issues/7 queries is composable. However, our result, just as [ 7 ], assumes set semantics, contrary to bag semantics in the specification. Also, our rewriting, even if polynomial when just one construct query is nested in another, is exponential in the depth of nesting. Moreover, it is quite cumbersome and unintuitive. Nevertheless, this is the first step in proving the conjecture that nested SELECT queries are indeed the right notion for subqueries in SPARQL. We do believe that looking at set semantics is a natural and important first step in the direction of this work.

The plan of this paper is as follows: after formalisation of SPARQL 1.1 in Section 2, we define the notion of composability in Section 3, present our rewriting in Section 4, and conclude in Section 5. 2

SPARQL Algebra

We next recapitulate the SPARQL algebra as well as basic notions on RDF. RDF Graphs Let I, L, and B be countably infinite pairwise disjoint sets of IRIs, literals, and blank nodes, respectively, where literals include numbers, strings, and Boolean values true and false. The set of (RDF) terms T is I [ L [ B. An (RDF) triple is an element (s; p; o) of (I [ B) I T where s is called the subject, p the predicate, and o the object. An (RDF) graph is a finite set of RDF triples.

SPARQL Algebra Syntax We concentrate on the core of SPARQL 1.1 and build upon the formalisation by [ 8 ]. We distinguish three types of syntactic building blocks— filter expressions, patterns, and queries, built over terms T and an infinite set V = f?x; ?y; : : :g of variables, disjoint from T as defined next.

(Filter) expressions are defined inductively as follows: – all variables in V and all terms in I [ L are expressions, – bNode() and IRINode(b), for b 2 B, are expressions, – if ?x is a variable in V then bound(?x) is an expression, – if R1 is an expression then so are isBlank(R1), isIRI(R1) and isLiteral(R1), – if also R2 is an expression then so are (R1 = R2), (R1 < R2), (:R1), (R1 ^ R2), – if also R3 is an expression, then so is if(R1; R2; R3).

We use expressions R1 _ R2, R1 ! R2 and R1 $ R2 as abbreviations for :(:R1 ^ :R2), :R1 _ R2 and (R1 ^ R2) _ (:R1 ^ :R2), respectively.

A triple pattern is a triple in (I [ L [ V) (I [ V) (I [ L [ V). Then, (graph) patterns are inductively defined as follows: – a BGP, that is a possibly empty set of triple patterns, is a pattern; – if P1 and P2 are patterns, then so are P1 AND P2 and P1 UNION P2; – if also R is an expression, then P1 FILTER R and P1 OPTR P2 are patterns; – if also ?x is a variable not appearing in P1, then P1 BIND R AS ?x is a pattern; – if also X is a set of variables, then SELECT X WHERE R is a pattern.

If a pattern is of the form SELECT X WHERE R then the set of its free variables is X; otherwise, it is the union of the free variables of all its immediate subpatterns and, in case of P1 BIND R AS ?x, of the singleton set f?xg.

Finally, there are several types of queries in SPARQL. Most commonly used and studied are select queries, which are just patterns in our formalisation. In this paper, however, we concentrate on construct queries of the form

CONSTRUCT H WHERE P;

where H is a set of triples from (T [ V) (I [ V) (T [ V), called template, and P is a pattern. For S a pattern, condition, template, etc., let IRI(S), blank(S) and var(S) denote all the IRIs, blank nodes and variables, respectively, that appear in S.

The standard includes several other constructs besides the ones defined above. Some of them, such as subqueries in expressions (i.e., Exists in filters), VALUES and MINUS are easily (and polynomially) expressible, under set semantics, adopted here (see below), via the core operators [ 5, 6 ]. Others, such as the named graph operator GRAPH and property paths, are immaterial to our research and omitted for brevity. Finally, note that filter expression IRINode(b), whose intention is to create a fresh IRI for each blank node b, does not appear in SPARQL 1.1. However, this operator can be expressed in our formalisation, provided it additionally include the string concatenation operator as well as casting operators from strings to IRIs and back, which are present in the specification.

To distinguish SPARQL 1.1 as in the standard and our formalisation we denote SC the language of construct queries as in the latter.

SPARQL Algebra Semantics The semantics of patterns in SC is defined in terms of (solution) mappings, that is, partial functions from variables V to terms T. The domain of , denoted dom( ), is the set of variables over which is defined. Mappings 1 and 2 are compatible, written 1 2, if 1(?x) = 2(?x) for each ?x in dom( 1) \ dom( 2). If 1 2, then 1 [ 2 is the mapping obtained by extending 1 according to 2 on all the variables in dom( 2) n dom( 1).

The evaluation [[R]] of an expression R in SC with respect to a mapping is a value in T [ ferrorg, where error is a special symbol not in T, defined as follows: – [[?x]] is (?x) if ?x 2 dom( ) and error otherwise, while [[`]] is ` for ` 2 I[L; – [[bNode()]] is a fresh blank node that does not appear in the queried graph G, while [[IRINode(b)]] = g(b), where g is an injective function from B to InIRI(G); – [[bound(?x)]] is true if ?x 2 dom( ) and false otherwise; – [[isBlank(R1)]] , [[isIRI(R1)]] , and [[isLiteral(R1)]] are true if [[R1]] 2 B, [[R1]] 2 I, and [[R1]] 2 L, respectively, and false otherwise; – [[R1 R2]] , for a comparison operator , is [[R1]] [[R2]] if [[R1]] and [[R2]] are both not error and of suitable types, or error otherwise; – [[:R1]] is true if [[R1]] = false, it is false if [[R1]] = true, and it is error otherwise, while [[R1 ^ R2]] is true if [[R1]] = [[R2]] = true, it is false if [[R1]] or [[R2]] is false, and it is error otherwise; – [[if(R1; R2; R3)]] = [[R2]] if [[R1]] = true, it is [[R3]] if [[R1]] = false, and error otherwise.

The semantics of patterns over a graph G is defined as follows, where (P ) is the pattern obtained from P by replacing its variables according to : – [[P ]]G = : var(P ) ! T j (P ) G for a BGP P ; – [[P1 AND P2]]G = j 1 2 [[P1]]G; 2 2 [[P2]]G; = 1 [ 2 ; – [[P1 UNION P2]]G = [[P1]]G [ [[P2]]G; – [[P1 FILTER R]]G = j 2 [[P1]]G; [[R]] = true ; – [[P1 OPTR P2]]G = [[P1 AND P2]]G [ j 2 [[P1]]G; 8 2 2 [[P2]]G: 6 2 or [[R]] [ 2 = false ; – [[P BIND R AS ?x]]G = 0 j 2 [[P ]]G; 0 = [ f?x 7! [[R]] g; [[R]] 6= error [ j 2 [[P ]]G; [[R]] = error ; – [[SELECT X FROM P1]]G = j = 0jX ; 0 2 [[P1]]Gg, where 0jX is a restriction of 0 to X.

To define the semantics of construct queries in SC fix, for every template H and graph G, a family F (H; G) of renaming functions. This family contains, for every mapping from var(H) to T, an injective function f : blank(H) ! B n blank(G). These functions must have pairwise disjoint ranges (i.e., there are no b and b0 such that f 1 (b) = f 2 (b0) for different 1 and 2). Then, [[CONSTRUCT H WHERE P ]]G = f (f (t)) j

2 [[G]]P ; t is a triple in H and (f (t)) is well-formedg; where f is the corresponding renaming function for in F (H; G). Here, a triple is well-formed if it is indeed an RDF triple, that is, does not have a blank node as predicate, literal as subject, etc.6

Note that we adopt set semantics, contrary to bag (multi-set) semantics in the specification. We leave the consideration of bag semantics for future work. 3

Definitions and Problem Statement

In this paper we address the question of composability of SPARQL.

Definition 1. A query language L with the same input and output domains D is composable if for any queries q1; q2 2 L there is a query q 2 L such that q2(q1(D)) = q(D) for any D 2 D, where q0(D0) is the output of a query q0 on an input D0.

The language of SPARQL select queries does not satisfy the requirements in this definition, because its input domain is RDF graphs and its output domain is sets of mappings. However, the language of SPARQL construct queries does satisfy the requirements, because its inputs, the set of RDF graphs, are its outputs as well; therefore, this is the subject of inquiry of this paper.

Note, however, that whether [[Q2]][[Q1]]G = [[Q]]G holds, for construct queries Q1, Q2 and Q, depends on the particular blank-node generating functions in the definitions of the semantics of bNode() and construct templates. Since the intuitive meaning of blank nodes is to represent existentially quantified named nulls whose exact names are immaterial, we silently consider RDF graphs and sets of mappings up to isomorphism, that is, up to bijective renaming of blank nodes. In this way checking whether [[Q2]][[Q1]]G is equivalent to [[Q]]G under such renaming does not depend on the particular choice of the generating functions.

As it is mentioned in the introduction, the question of composability of construct queries was considered in [ 7 ], and it was shown, using techniques from data exchange [ 3 ], that these queries are not composable. However, the language considered 6 Note that this can be achieved by extending P with the respective FILTER expressions. in this previous work is different from ours, because it includes neither the BIND operator nor the bNode() and IRINode(b) functions. The main result of this paper is the following theorem, which shows that the negative result does not extend to SC . Theorem 1. The language SC of construct queries is composable. 4

Composability of Construct Queries

Consider the following queries in language SC described in Section 2.

Q1 = CONSTRUCT H1 WHERE P1; Q2 = CONSTRUCT H2 WHERE P2:

Q = CONSTRUCT H2 WHERE P;

In the rest of the paper we explain how to rewrite these queries to just one query also in SC , such that [[Q2]][[Q1]]G = [[Q]]G for any graph G. Since we apply Q2 to the result of Q1, we call these queries the outer and the inner, respectively.

Note that the template of the rewriting Q is the template of the outer query Q2. The pattern P of Q is defined as

Punif (Pblank; Prew)

for patterns Punif, Pblank, and Prew, with the former taking the latter two as parameters (i.e., subpatterns). Next we define these patterns and study their properties. It is important to mention that rewriting Q is always of polynomial size in the size of Q1 and Q2 (but the rewriting is exponential in the number of queries we nest in this fashion).

Without loss of generality we assume that all local (not free) variables in different SELECT subpatterns of P1 and P2 have different names; and that var(P1) \ var(P2) = ; (we can always achieve this by renaming). In what follows we denote the variables in var(P1) and in var(P2) by ?x and ?y, respectively (both possibly with subscripts). We also assume that P2 does not use any BIND sub-patterns; it is possible to cover the general case by a construction similar to the one described below, but the notation becomes more elaborated.

We start with Pblank. The idea of this pattern is to produce the same set of mappings as P1 except that each of them is extended to new variables bound to fresh blank nodes, one variable for each blank node in H1. To this end, let ?b1; : : : ; ?bl be fresh variables, for l the size of blank(H1). Then

Pblank = P1 BIND bNode() AS ?b1 : : : BIND bNode() AS ?bl:

The following property of Pblank follows from the definition of the BIND operator (recall that we consider sets of mappings up to renaming of blank nodes).

Lemma 1. For each mapping , let

Then, for any graph G, we have that map each ?bi, 1 i

l, to a fresh blank node. [[Pblank]]G = f 1 j 1 = [ ; 2 [[P1]]Gg:

Having Pblank at hand we define pattern Prew. Its intention is to be a rewriting of P2 such that it works not over the result of Q1, but over the original input graph. For any input graph, its answer, that is, its output set of mappings, coincides with the answer of P2, except that, first, the mappings are extended to some additional variables (e.g., all projections are discarded), and, second, instead of blank nodes constructed by Q1 the answer to Prew has the corresponding IRIs IRINode(b) in the ranges of its mappings. Note, however, that blank nodes b here are from template H1 and not the blank nodes f (b) in the result of Q1, so it could be several f (b) for each IRINode(b) (more precisely, one for each in the evaluation of P ). Replacing IRINode(b) with f (b) is done by the third part Punif of P , which takes into account both patterns Pblank and Prew.

Consider any occurrence p of a triple pattern (s1; s2; s3) in pattern P2. Note that we consider each occurrence, not just each triple pattern: for example the two occurrences in (?x; ?y; ?z) AND (?x; ?y; ?z) are considered separately. Let p be a renaming function that maps each free variable ?x of P1 to a fresh variable ?xp and let p be a renaming function that maps each ?y 2 var(p) to a fresh variable ?yp. Assuming that p extends to IRIs and literals as identity, let bplank further extend p to each blank node b as IRINode(b). For each triple t = (r1; r2; r3) in template H1 let Psub(t; p) = P 3, with patterns P i, i = 1; 2; 3, defined as follows, taking P 0 as p(P1): – if si is a variable in V that is different from all s1; : : : ; si 1 then – if si is a variable that is equal to some sj for 1 j < i then P i = P i 1 BIND blank(ri) AS p(si); p P i = P i 1 FILTER bplank(ri) = p(si); – finally, if si 2= V then

P i = P i 1 FILTER bplank(ri) = si: For example, if p = (?y1; ?y2; ?y3) and t = (?x1; ?x2; b) then Psub(t; p) is p(P1) BIND ?x1p AS ?yp1 BIND ?x2p AS ?yp2 BIND IRINode(b) AS ?yp3; while if p = (?y1; `; ?y1) and t = (?x1; ?x2; ?x3) then Psub(t; p) is

p(P1) BIND ?x1p AS ?yp1 FILTER ?x2p = ` FILTER ?x3p = ?yp1:

Let p = (s1; s2; s3) be an occurrence of triple pattern in P2 as above, and H1 = t1; : : : ; tm. Then let Pp be the pattern (Psub(t1; p) UNION : : : UNION Psub(tm; p)) FILTER bound(?y1) ^ ^ bound(?yk); where ?y1; : : : ; ?yk are all the variables among s1, s2, and s3.

Patterns Pp are rewritings of all particular occurrences p of triple patterns in pattern P2. The rewritings of different occurences, however, have no variables in common by construction. Therefore, we introduce a condition Rjoin(p1; p2; ?y) for any two occurrences of triple patterns p1 and p2 in P2 that share a variable ?y, that is defined as follows, for all the blank nodes b1; : : : ; bl in blank(H1): :bound(?yp1 ) _ :bound(?yp2 ) _ (?yp1 6= IRINode(b1) ^ : : : ^ ?yp1 6= IRINode(bl) ^ ?yp1 = ?yp2 ) _ ((?yp1 = IRINode(b1) _ : : : _ ?yp1 = IRINode(bl)) ^ ?yp1 = ?yp2 ^ ?x1p1 ?x1p2 ^ ^ ?xpn1 ?xpn2 ); where ?x1; : : : ; ?xn are all the free variables of P1 and ?x1 ?x2 is the condition (bound(?x1) $ bound(?x2)) ^ (bound(?x1) ! ?x1 = ?x2): The idea of Rjoin(p1; p2; ?y) is as follows: if both of ?yp1 and ?yp2 are defined and mapped to usual IRIs, literals or blank nodes, then they should be the same; if they are defined and mapped to some IRINode(bi), then they should be not only the same, but also be associated to exactly the same mapping in the answer to the inner pattern.

We now define the rewriting Prew of P2 by structural induction, and start with conditions. To this end, the rewriting Rew(R0) of a condition R0 in P2 is obtained from R0 by the following two steps, for b1; : : : ; bl all the blank nodes of H1 and p1; : : : ; pk all the occurrences of triple patterns of P2 that mention ?y: 1. replace each isBlank(?y) by the expression isBlank(?y) _ ?y = IRINode(b1) _ _ ?y = IRINode(bl); 2. replace each variable ?y by the expression if(bound(?yp1 ); ?yp1 ; if(bound(?yp2 ); ?yp2 ; : : : ; if(bound(?ypk 1 ); ?ypk 1 ; ?ypk ) : : :)): Finally, the rewriting Rew(P 0) for any subpattern P 0 of P2 (including occurrences of triple patterns) is defined as follows: – if P 0 is the empty BGP P; then Rew(P 0) = P;, – if P 0 is a singleton BGP fpg then Rew(P 0) = Pp, – if P 0 is a BGP fp1; : : : ; pkg for k > 1 then Rew(P 0) = Rew(fp1g AND AND fpkg) (see below), – if P 0 = P10 ANDP20 then Rew(P 0) = (Rew(P10)ANDRew(P20)) FILTER R, where R is a conjunction of Rjoin(p1; p2; ?y) for each triple pattern p1 in P10 and each triple pattern p2 in P20 such that p1 and p2 have a common variable, as well as for each their common variable ?y, – if P 0 = P10 OPTR0 P20 then Rew(P 0) = Rew(P10) OPTRew(R0)^R Rew(P20), where

R is as in the case of AND, – if P 0 = P10 UNION P20 then Rew(P 0) = Rew(P10) UNION Rew(P20), – if P 0 = P10 FILTER R0 then Rew(P 0) = Rew(P10) FILTER Rew(R0); – if P 0 = SELECT X WHERE P10 then Rew(P 0) = Rew(P10).

The first important property of Prew is given in the following lemma. Let P2 be obtained from P2 by replacing each subpattern SELECT X WHERE P 0 by P 0, that is, by disregarding all projections (recall that we assume local variables in different subpatterns to have different names).

Lemma 2. For any graph G, mapping 2 [[Prew]]G, variable ?y 2 Vars(P2 ) and triple patterns p1; p2 of P2 that mention ?y, if both ?yp1 and ?yp2 are bound by then (?yp1 ) = (?yp2 ).

Let us extend, for technical reasons, the set of terms to all pairs [b; 1] for all blank nodes b in H1 and mappings 1. Then, having this definition and Lemma 2 at hand, we can “gather” every mapping 2 [[Prew]]G to its gathering mapping as follows, for every ?y 2 Vars(P2 ): – ?y is bound in if and only if one of ?yp1 ; : : : ; ?ypk is bound in , where p1; : : : ; pk are all triple patterns of P2 that mention ?y; – if i is such that ?ypi is bound and ?ypi = IRINode(b) for some b 2 blank(H1) then (?y) = [b; ( jXpi )], where Xpi = x1pi ; : : : ; xpni is the copy of the free variables x1; : : : ; xn of P1 corresponding to pi and is the renaming of each xjpi to xj ; – if i is such that ?ypi is bound and ?ypi 6= IRINode(b) for any b 2 blank(H1) then (?y) = (?ypi ).

The second property of Prew is as follows.

Lemma 3. For any graph G, a mapping is a gathering of some 2 [[Prew]]G if and only if there exists 2 2 [[P2 ]][[Q1]]G such that, for every ?y 2 Vars(P2 ), – if 2(?y) is a blank node f 1 (b) created by H1 with respect to a mapping 1 2 [[P1]]G then (?y) = [b; 1], – otherwise 2(?y) = (?y).

Having this lemma at hand, we can define the pattern Punif (Pblank; Prew). To this end, let Y = ?y1; : : : ; ?ym be all the free variables of P2 and let P i, 1 i m, be defined as follows, taking P 0 = Pblank AND Prew:

P i = (P i 1 BIND V?yi AS ?yi) FILTER R?yi ; where V?yi is the following condition, for b1; : : : ; bl all the blank nodes of H1: if(S?yi = IRINode(b1); ?b1; : : : if(S?yi = IRINode(bl); ?bl; S?yi ) : : :); S?yi is as follows, for p1; : : : ; pk all the triple patterns in P2 that uses ?yi: if(bound(?ypi1 ); ?ypi1 ; : : : if(bound(?ypik 1 ); ?ypik 1 ; ?ypik ) : : :); and R?yi is as follows, for free variables ?x1; : : : ; ?xn of P1 (which appear in Pblank): (bound(?ypi1 ) ^ (?ypi1 = IRINode(b1) _ : : : _ ?ypi1 = IRINode(bl))

! (?x1 ?x1p1 ^ ^ ?xn ?xpn1 )) ^ (bound(?ypik ) ^ (?ypik = IRINode(b1) _ : : : _ ?ypik = IRINode(bl))

! (?x1 ?x1pk ^ ^ ?xn ?xpnk )): Then we take the following pattern as the pattern P in resulting query Q: ^

Punif (Pblank; Prew) = SELECT Y WHERE P m:

We have the following lemma, whose immediate corollary is Theorem 1. Lemma 4. For any construct queries Q1 and Q2 and the query Q constructed on the base of Q1 and Q2 it holds that [[Q2]][[Q1]]G = [[Q]]G for any graph G.

Conclusion

For space reasons we cannot give a complete example of our construction, but we have uploaded one at http://web.ing.puc.cl/˜jreutter/construct/.

Our results confirm the fact that nested CONSTRUCT queries are, in theory, an unnecessary feature of SPARQL, at least concerning set semantics. However, our construction introduces a blowup when translating from nested CONSTRUCT to SPARQL 1.1, which can be exponential in the depth of nesting. We conjecture that this blowup is unavoidable in the worst case. But even if this blowup was not important, the rewriting appears to be a very deep and technical translation, and unfortunately it is not easy to discover the intentions of nested CONSTRUCT queries simply by looking at their translation. For this reason we believe that nested CONSTRUCT queries are actually a desired feature of SPARQL. As a future work we would like to investigate update queries in SPARQL. Since we conjecture that update queries can be expressed as nested construct queries, it would be conceivable that the former are also a replaceable (but welcome) addition of SPARQL 1.1.

1. Abiteboul , S. , Hull , R. , Vianu , V. : Foundations of databases , vol. 8 . Addison-Wesley Reading ( 1995 )

2. Angles , R. , Gutierrez , C. : Subqueries in SPARQL . In: AMW' 11 ( 2011 )

3. Fagin , R. , Kolaitis , P.G. , Popa , L. , Tan , W.C. : Composing schema mappings: Second-order dependencies to the rescue . ACM Trans. Database Syst . 30 ( 4 ), 994 - 1055 ( 2005 )

4. Harris , S. , Seaborne , A. : SPARQL 1.1 Query Language . W3C Rec ., W3C (Mar 2013 )

5. Kaminski , M. , Kostylev , E.V. ,

Cuenca

Grau , B. : Semantics and Expressive Power of Subqueries and Aggregates in SPARQL 1.1 . In: WWW' 16 ( 2016 )

6. Kontchakov , R. , Kostylev , E.V. : On expressibility of non-monotone operators in sparql . In: KR'16 ( 2016 )

7. Kostylev , E.V. , Reutter , J.L. , Ugarte , M.: CONSTRUCT Queries in SPARQL . In: ICDT'15 . vol. 31 ( 2015 )

8. Pe´rez, J., Arenas , M. , Gutierrez , C. : Semantics and Complexity of SPARQL . ACM Transactions on Database Systems 34 ( 3 ) ( 2009 )

9. Polleres , A. , Scharffe , F. , Schindlauer , R.: SPARQL+ + for mapping between RDF vocabularies . In: ODBASE'07 . pp. 878 - 896 ( 2007 )

10. Polleres , A. , Wallner , J.P. : On the relation between SPARQL1.1 and answer set programming . Journal of Applied Non-Classical Logics 23 ( 1-2 ), 159 - 212 ( 2013 )

11. Zhang , X., Van den Bussche, J.: On the primitivity of operators in SPARQL . Inf. Process. Lett. 114 ( 9 ), 480 - 485 ( 2014 )