<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Nested Constructs vs. Sub-Selects in SPARQL?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Axel Polleres</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan Reutter</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Egor V. Kostylev</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Pontificia Universidad Cato ́lica de Chile and Center for Semantic Web research</institution>
          ,
          <addr-line>CL</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Oxford</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Vienna University of Economics and Business</institution>
          ,
          <addr-line>AT</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The issue of subqueries in SPARQL has appeared in different papers as an extension point to the original SPARQL query language. Particularly, nested CONSTRUCT in FROM clauses are a feature that has been discussed as a potential input for SPARQL 1.1 which was resolved to be left out in favour of select subqueries under the-unproven-conjecture that such subqueries can express nested construct queries. In this paper, we show that it is indeed possible to unfold nested SPARQL construct queries into subqueries in SPARQL 1.1; our transformation, however, requires an exponential blowup in the nesting depth. This suggests that nested construct queries are indeed a useful syntactic feature in SPARQL that cannot compactly be replaced by subqueries.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        SPARQL as a standard query language for the Semantic Web data has been the subject
of many theoretical and practical works in over the last years. Upon version 1.0 of
its standardisation, works have been published on its expressivity [
        <xref ref-type="bibr" rid="ref10 ref11 ref2 ref8">2, 8, 10, 11</xref>
        ] and on
adding features, among others value creation and subqueries [
        <xref ref-type="bibr" rid="ref2 ref9">2, 9</xref>
        ].
      </p>
      <p>These features have been taken on board in version 1.1 of the SPARQL
specification. The first feature corresponds to an extension of SPARQL select queries, allowing
SELECT clauses to assign values (i.e., blank nodes, literals or IRIs) to particular
variables, by means of a new keyword AS. But more importantly, SPARQL 1.1 now permits
the use of subqueries, that is, select queries can be arbitrarily nested within WHERE
clauses [4, Section 12].</p>
      <p>To showcase these operators, let us start from the following simple SPARQL 1.0
query that essentially replaces all object values in a given graph with blank nodes:</p>
      <p>CONSTRUCT { ?S ?P []} WHERE { ?S ?P ?O }.</p>
      <p>In SPARQL 1.1 we could assign blank nodes explicitly by means of a subquery:
CONSTRUCT { ?S ?P ?Y } WHERE</p>
      <p>{ SELECT ?S ?P ( bnode() AS ?Y ) WHERE { ?S ?P ?O } },
which can be written more compactly4 as follows
? This work was supported by Jubila¨umsstiftung WU Wien as well as by the WWTF project
“SEE”, and also by Iniciativa Cient´ıfica Milenio Grant 120004
4 The BIND keyword is just syntactic sugar for writing a sub-SELECT query which just assigns
a single expression to a variable more concisely.</p>
      <p>CONSTRUCT { ?S ?P ?Y } WHERE { ?S ?P ?O BIND( bnode() AS ?Y ) }.</p>
      <p>
        While nested select queries are allowed in SPARQL 1.1, other forms of subqueries
such as nested CONSTRUCT queries in FROM clauses, as proposed in [
        <xref ref-type="bibr" rid="ref2 ref9">2, 9</xref>
        ], were
not included to the new specification under the—unproven—conjecture that select
subqueries can express nested construct queries.5
      </p>
      <p>For instance, consider the following query Q that uses nested constructs:</p>
      <p>The intention of this query is to construct first an RDF document where a person U
is a friend of V if U knows V and U works together with V. This is done by means of
the inner construct subquery of Q. The resulting graph is then queried by the WHERE
clause of the outer query, which creates a graph of good friends: X is a good friend of
Y if X and Y are friends and have another friend in common. With a little work one can
see that query Q is indeed equivalent to the following SPARQL 1.1 query:
CONSTRUCT { ?X ex:goodFriend ?Y } WHERE {
{ SELECT ?X ex:friends AS ?W1 ?Y</p>
      <p>WHERE { ?X foaf:knows ?Y . ?X ex:worksWith ?Y } } .
{ SELECT ?X ex:friends AS ?W2 ?Z</p>
      <p>WHERE { ?X foaf:knows ?Z . ?X ex:worksWith ?Z } } .
{ SELECT ?Y ex:friends AS ?W3 ?Z</p>
      <p>
        WHERE { ?Y foaf:knows ?Z . ?Y ex:worksWith ?Z } } }.
In essence, we have replaced and rewritten each of the triples in the outer WHERE
clause for the conditions in the inner construct query that give rise to these triples, as it
is for instance commonly done in query rewriting using views [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        But can this idea be generalised for all nested construct queries? That is, is it true
that given any SPARQL query Q that uses nested CONSTRUCT one can find a query
in SPARQL 1.1 (without nested CONSTRUCT) that is equivalent to Q? This
question was considered in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], with a positive answer for a restricted case of SPARQL
queries where one disallows blank nodes in templates and only uses the
functionalities in SPARQL 1.0. On the other hand, it was shown that there are some queries with
nested CONSTRUCT that cannot be expressed as a SPARQL 1.0 query (this follows
from the close connection between construct queries and data exchange [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]). However,
the results in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] consider a much weaker language, because they do not consider any
of the SPARQL 1.1 operators, and in particular the binding of variables in SELECT
subqueries.
      </p>
      <p>
        In this paper we show that by allowing full SPARQL 1.1 one can indeed express
nested construct queries, that is, in other words, the language of SPARQL 1.1 construct
5 https://www.w3.org/2009/sparql/track/issues/7
queries is composable. However, our result, just as [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], assumes set semantics, contrary
to bag semantics in the specification. Also, our rewriting, even if polynomial when
just one construct query is nested in another, is exponential in the depth of nesting.
Moreover, it is quite cumbersome and unintuitive. Nevertheless, this is the first step
in proving the conjecture that nested SELECT queries are indeed the right notion for
subqueries in SPARQL. We do believe that looking at set semantics is a natural and
important first step in the direction of this work.
      </p>
      <p>The plan of this paper is as follows: after formalisation of SPARQL 1.1 in Section 2,
we define the notion of composability in Section 3, present our rewriting in Section 4,
and conclude in Section 5.
2</p>
    </sec>
    <sec id="sec-2">
      <title>SPARQL Algebra</title>
      <p>We next recapitulate the SPARQL algebra as well as basic notions on RDF.
RDF Graphs Let I, L, and B be countably infinite pairwise disjoint sets of IRIs,
literals, and blank nodes, respectively, where literals include numbers, strings, and Boolean
values true and false. The set of (RDF) terms T is I [ L [ B. An (RDF) triple is an
element (s; p; o) of (I [ B) I T where s is called the subject, p the predicate, and
o the object. An (RDF) graph is a finite set of RDF triples.</p>
      <p>
        SPARQL Algebra Syntax We concentrate on the core of SPARQL 1.1 and build
upon the formalisation by [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. We distinguish three types of syntactic building blocks—
filter expressions, patterns, and queries, built over terms T and an infinite set V =
f?x; ?y; : : :g of variables, disjoint from T as defined next.
      </p>
      <p>(Filter) expressions are defined inductively as follows:
– all variables in V and all terms in I [ L are expressions,
– bNode() and IRINode(b), for b 2 B, are expressions,
– if ?x is a variable in V then bound(?x) is an expression,
– if R1 is an expression then so are isBlank(R1), isIRI(R1) and isLiteral(R1),
– if also R2 is an expression then so are (R1 = R2), (R1 &lt; R2), (:R1), (R1 ^ R2),
– if also R3 is an expression, then so is if(R1; R2; R3).</p>
      <p>We use expressions R1 _ R2, R1 ! R2 and R1 $ R2 as abbreviations for :(:R1 ^
:R2), :R1 _ R2 and (R1 ^ R2) _ (:R1 ^ :R2), respectively.</p>
      <p>A triple pattern is a triple in (I [ L [ V) (I [ V) (I [ L [ V). Then, (graph)
patterns are inductively defined as follows:
– a BGP, that is a possibly empty set of triple patterns, is a pattern;
– if P1 and P2 are patterns, then so are P1 AND P2 and P1 UNION P2;
– if also R is an expression, then P1 FILTER R and P1 OPTR P2 are patterns;
– if also ?x is a variable not appearing in P1, then P1 BIND R AS ?x is a pattern;
– if also X is a set of variables, then SELECT X WHERE R is a pattern.</p>
      <p>If a pattern is of the form SELECT X WHERE R then the set of its free variables is
X; otherwise, it is the union of the free variables of all its immediate subpatterns and,
in case of P1 BIND R AS ?x, of the singleton set f?xg.</p>
      <p>Finally, there are several types of queries in SPARQL. Most commonly used and
studied are select queries, which are just patterns in our formalisation. In this paper,
however, we concentrate on construct queries of the form</p>
      <sec id="sec-2-1">
        <title>CONSTRUCT H</title>
      </sec>
      <sec id="sec-2-2">
        <title>WHERE P;</title>
        <p>where H is a set of triples from (T [ V) (I [ V) (T [ V), called template, and P
is a pattern. For S a pattern, condition, template, etc., let IRI(S), blank(S) and var(S)
denote all the IRIs, blank nodes and variables, respectively, that appear in S.</p>
        <p>
          The standard includes several other constructs besides the ones defined above. Some
of them, such as subqueries in expressions (i.e., Exists in filters), VALUES and MINUS
are easily (and polynomially) expressible, under set semantics, adopted here (see
below), via the core operators [
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ]. Others, such as the named graph operator GRAPH
and property paths, are immaterial to our research and omitted for brevity. Finally, note
that filter expression IRINode(b), whose intention is to create a fresh IRI for each blank
node b, does not appear in SPARQL 1.1. However, this operator can be expressed in our
formalisation, provided it additionally include the string concatenation operator as well
as casting operators from strings to IRIs and back, which are present in the specification.
        </p>
        <p>To distinguish SPARQL 1.1 as in the standard and our formalisation we denote SC
the language of construct queries as in the latter.</p>
        <p>SPARQL Algebra Semantics The semantics of patterns in SC is defined in terms
of (solution) mappings, that is, partial functions from variables V to terms T. The
domain of , denoted dom( ), is the set of variables over which is defined.
Mappings 1 and 2 are compatible, written 1 2, if 1(?x) = 2(?x) for each ?x in
dom( 1) \ dom( 2). If 1 2, then 1 [ 2 is the mapping obtained by extending
1 according to 2 on all the variables in dom( 2) n dom( 1).</p>
        <p>The evaluation [[R]] of an expression R in SC with respect to a mapping is a
value in T [ ferrorg, where error is a special symbol not in T, defined as follows:
– [[?x]] is (?x) if ?x 2 dom( ) and error otherwise, while [[`]] is ` for ` 2 I[L;
– [[bNode()]] is a fresh blank node that does not appear in the queried graph G,
while [[IRINode(b)]] = g(b), where g is an injective function from B to InIRI(G);
– [[bound(?x)]] is true if ?x 2 dom( ) and false otherwise;
– [[isBlank(R1)]] , [[isIRI(R1)]] , and [[isLiteral(R1)]] are true if [[R1]] 2 B,
[[R1]] 2 I, and [[R1]] 2 L, respectively, and false otherwise;
– [[R1 R2]] , for a comparison operator , is [[R1]] [[R2]] if [[R1]] and [[R2]]
are both not error and of suitable types, or error otherwise;
– [[:R1]] is true if [[R1]] = false, it is false if [[R1]] = true, and it is error
otherwise, while [[R1 ^ R2]] is true if [[R1]] = [[R2]] = true, it is false if
[[R1]] or [[R2]] is false, and it is error otherwise;
– [[if(R1; R2; R3)]] = [[R2]] if [[R1]] = true, it is [[R3]] if [[R1]] = false,
and error otherwise.</p>
        <p>The semantics of patterns over a graph G is defined as follows, where (P ) is the
pattern obtained from P by replacing its variables according to :
– [[P ]]G = : var(P ) ! T j (P ) G for a BGP P ;
– [[P1 AND P2]]G = j 1 2 [[P1]]G; 2 2 [[P2]]G; = 1 [ 2 ;
– [[P1 UNION P2]]G = [[P1]]G [ [[P2]]G;
– [[P1 FILTER R]]G = j 2 [[P1]]G; [[R]] = true ;
– [[P1 OPTR P2]]G = [[P1 AND P2]]G [ j 2 [[P1]]G; 8 2 2 [[P2]]G: 6
2 or [[R]] [ 2 = false ;
– [[P BIND R AS ?x]]G = 0 j 2 [[P ]]G; 0 = [ f?x 7! [[R]] g; [[R]] 6=
error [ j 2 [[P ]]G; [[R]] = error ;
– [[SELECT X FROM P1]]G = j = 0jX ; 0 2 [[P1]]Gg, where 0jX is a
restriction of 0 to X.</p>
        <p>To define the semantics of construct queries in SC fix, for every template H and
graph G, a family F (H; G) of renaming functions. This family contains, for every
mapping from var(H) to T, an injective function f : blank(H) ! B n blank(G).
These functions must have pairwise disjoint ranges (i.e., there are no b and b0 such that
f 1 (b) = f 2 (b0) for different 1 and 2). Then,
[[CONSTRUCT H WHERE P ]]G =
f (f (t)) j</p>
        <p>2 [[G]]P ; t is a triple in H and (f (t)) is well-formedg;
where f is the corresponding renaming function for in F (H; G). Here, a triple is
well-formed if it is indeed an RDF triple, that is, does not have a blank node as predicate,
literal as subject, etc.6</p>
        <p>Note that we adopt set semantics, contrary to bag (multi-set) semantics in the
specification. We leave the consideration of bag semantics for future work.
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Definitions and Problem Statement</title>
      <p>In this paper we address the question of composability of SPARQL.</p>
      <p>Definition 1. A query language L with the same input and output domains D is
composable if for any queries q1; q2 2 L there is a query q 2 L such that q2(q1(D)) = q(D)
for any D 2 D, where q0(D0) is the output of a query q0 on an input D0.</p>
      <p>The language of SPARQL select queries does not satisfy the requirements in this
definition, because its input domain is RDF graphs and its output domain is sets of
mappings. However, the language of SPARQL construct queries does satisfy the
requirements, because its inputs, the set of RDF graphs, are its outputs as well; therefore,
this is the subject of inquiry of this paper.</p>
      <p>Note, however, that whether [[Q2]][[Q1]]G = [[Q]]G holds, for construct queries Q1,
Q2 and Q, depends on the particular blank-node generating functions in the
definitions of the semantics of bNode() and construct templates. Since the intuitive meaning
of blank nodes is to represent existentially quantified named nulls whose exact names
are immaterial, we silently consider RDF graphs and sets of mappings up to
isomorphism, that is, up to bijective renaming of blank nodes. In this way checking whether
[[Q2]][[Q1]]G is equivalent to [[Q]]G under such renaming does not depend on the
particular choice of the generating functions.</p>
      <p>
        As it is mentioned in the introduction, the question of composability of construct
queries was considered in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and it was shown, using techniques from data
exchange [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], that these queries are not composable. However, the language considered
6 Note that this can be achieved by extending P with the respective FILTER expressions.
in this previous work is different from ours, because it includes neither the BIND
operator nor the bNode() and IRINode(b) functions. The main result of this paper is the
following theorem, which shows that the negative result does not extend to SC .
Theorem 1. The language SC of construct queries is composable.
4
      </p>
    </sec>
    <sec id="sec-4">
      <title>Composability of Construct Queries</title>
      <p>Consider the following queries in language SC described in Section 2.</p>
      <p>Q1 = CONSTRUCT H1 WHERE P1;
Q2 = CONSTRUCT H2 WHERE P2:</p>
      <p>Q = CONSTRUCT H2 WHERE P;</p>
      <p>In the rest of the paper we explain how to rewrite these queries to just one query
also in SC , such that [[Q2]][[Q1]]G = [[Q]]G for any graph G. Since we apply Q2 to the
result of Q1, we call these queries the outer and the inner, respectively.</p>
      <p>Note that the template of the rewriting Q is the template of the outer query Q2. The
pattern P of Q is defined as</p>
      <sec id="sec-4-1">
        <title>Punif (Pblank; Prew)</title>
        <p>for patterns Punif, Pblank, and Prew, with the former taking the latter two as parameters
(i.e., subpatterns). Next we define these patterns and study their properties. It is
important to mention that rewriting Q is always of polynomial size in the size of Q1 and Q2
(but the rewriting is exponential in the number of queries we nest in this fashion).</p>
        <p>Without loss of generality we assume that all local (not free) variables in different
SELECT subpatterns of P1 and P2 have different names; and that var(P1) \ var(P2) =
; (we can always achieve this by renaming). In what follows we denote the variables
in var(P1) and in var(P2) by ?x and ?y, respectively (both possibly with subscripts).
We also assume that P2 does not use any BIND sub-patterns; it is possible to cover
the general case by a construction similar to the one described below, but the notation
becomes more elaborated.</p>
        <p>We start with Pblank. The idea of this pattern is to produce the same set of mappings
as P1 except that each of them is extended to new variables bound to fresh blank nodes,
one variable for each blank node in H1. To this end, let ?b1; : : : ; ?bl be fresh variables,
for l the size of blank(H1). Then</p>
        <p>Pblank = P1 BIND bNode() AS ?b1 : : : BIND bNode() AS ?bl:</p>
        <p>The following property of Pblank follows from the definition of the BIND operator
(recall that we consider sets of mappings up to renaming of blank nodes).</p>
      </sec>
      <sec id="sec-4-2">
        <title>Lemma 1. For each mapping , let</title>
        <p>Then, for any graph G, we have that
map each ?bi, 1
i</p>
        <p>l, to a fresh blank node.
[[Pblank]]G = f 1 j 1 =
[
;
2 [[P1]]Gg:</p>
        <p>Having Pblank at hand we define pattern Prew. Its intention is to be a rewriting of P2
such that it works not over the result of Q1, but over the original input graph. For any
input graph, its answer, that is, its output set of mappings, coincides with the answer of
P2, except that, first, the mappings are extended to some additional variables (e.g., all
projections are discarded), and, second, instead of blank nodes constructed by Q1 the
answer to Prew has the corresponding IRIs IRINode(b) in the ranges of its mappings.
Note, however, that blank nodes b here are from template H1 and not the blank nodes
f (b) in the result of Q1, so it could be several f (b) for each IRINode(b) (more
precisely, one for each in the evaluation of P ). Replacing IRINode(b) with f (b) is done
by the third part Punif of P , which takes into account both patterns Pblank and Prew.</p>
        <p>Consider any occurrence p of a triple pattern (s1; s2; s3) in pattern P2. Note that
we consider each occurrence, not just each triple pattern: for example the two
occurrences in (?x; ?y; ?z) AND (?x; ?y; ?z) are considered separately. Let p be a renaming
function that maps each free variable ?x of P1 to a fresh variable ?xp and let p be a
renaming function that maps each ?y 2 var(p) to a fresh variable ?yp. Assuming that
p extends to IRIs and literals as identity, let bplank further extend p to each blank node
b as IRINode(b). For each triple t = (r1; r2; r3) in template H1 let Psub(t; p) = P 3,
with patterns P i, i = 1; 2; 3, defined as follows, taking P 0 as p(P1):
– if si is a variable in V that is different from all s1; : : : ; si 1 then
– if si is a variable that is equal to some sj for 1
j &lt; i then
P i = P i 1 BIND
blank(ri) AS p(si);
p
P i = P i 1 FILTER bplank(ri) =
p(si);
– finally, if si 2= V then</p>
        <p>P i = P i 1 FILTER bplank(ri) = si:
For example, if p = (?y1; ?y2; ?y3) and t = (?x1; ?x2; b) then Psub(t; p) is
p(P1) BIND ?x1p AS ?yp1 BIND ?x2p AS ?yp2 BIND IRINode(b) AS ?yp3;
while if p = (?y1; `; ?y1) and t = (?x1; ?x2; ?x3) then Psub(t; p) is</p>
        <p>p(P1) BIND ?x1p AS ?yp1 FILTER ?x2p = ` FILTER ?x3p = ?yp1:</p>
        <p>Let p = (s1; s2; s3) be an occurrence of triple pattern in P2 as above, and H1 =
t1; : : : ; tm. Then let Pp be the pattern
(Psub(t1; p) UNION : : : UNION Psub(tm; p)) FILTER bound(?y1) ^
^ bound(?yk);
where ?y1; : : : ; ?yk are all the variables among s1, s2, and s3.</p>
        <p>Patterns Pp are rewritings of all particular occurrences p of triple patterns in
pattern P2. The rewritings of different occurences, however, have no variables in common
by construction. Therefore, we introduce a condition Rjoin(p1; p2; ?y) for any two
occurrences of triple patterns p1 and p2 in P2 that share a variable ?y, that is defined as
follows, for all the blank nodes b1; : : : ; bl in blank(H1):
:bound(?yp1 ) _ :bound(?yp2 ) _
(?yp1 6= IRINode(b1) ^ : : : ^ ?yp1 6= IRINode(bl) ^ ?yp1 = ?yp2 ) _
((?yp1 = IRINode(b1) _ : : : _ ?yp1 = IRINode(bl)) ^ ?yp1 = ?yp2 ^
?x1p1 ?x1p2 ^ ^ ?xpn1
?xpn2 );
where ?x1; : : : ; ?xn are all the free variables of P1 and ?x1
?x2 is the condition
(bound(?x1) $ bound(?x2)) ^ (bound(?x1) ! ?x1 = ?x2):
The idea of Rjoin(p1; p2; ?y) is as follows: if both of ?yp1 and ?yp2 are defined and
mapped to usual IRIs, literals or blank nodes, then they should be the same; if they are
defined and mapped to some IRINode(bi), then they should be not only the same, but
also be associated to exactly the same mapping in the answer to the inner pattern.</p>
        <p>We now define the rewriting Prew of P2 by structural induction, and start with
conditions. To this end, the rewriting Rew(R0) of a condition R0 in P2 is obtained from R0
by the following two steps, for b1; : : : ; bl all the blank nodes of H1 and p1; : : : ; pk all
the occurrences of triple patterns of P2 that mention ?y:
1. replace each isBlank(?y) by the expression
isBlank(?y) _ ?y = IRINode(b1) _
_ ?y = IRINode(bl);
2. replace each variable ?y by the expression
if(bound(?yp1 ); ?yp1 ; if(bound(?yp2 ); ?yp2 ; : : : ; if(bound(?ypk 1 ); ?ypk 1 ; ?ypk ) : : :)):
Finally, the rewriting Rew(P 0) for any subpattern P 0 of P2 (including occurrences of
triple patterns) is defined as follows:
– if P 0 is the empty BGP P; then Rew(P 0) = P;,
– if P 0 is a singleton BGP fpg then Rew(P 0) = Pp,
– if P 0 is a BGP fp1; : : : ; pkg for k &gt; 1 then Rew(P 0) = Rew(fp1g AND AND
fpkg) (see below),
– if P 0 = P10 ANDP20 then Rew(P 0) = (Rew(P10)ANDRew(P20)) FILTER R, where R
is a conjunction of Rjoin(p1; p2; ?y) for each triple pattern p1 in P10 and each triple
pattern p2 in P20 such that p1 and p2 have a common variable, as well as for each
their common variable ?y,
– if P 0 = P10 OPTR0 P20 then Rew(P 0) = Rew(P10) OPTRew(R0)^R Rew(P20), where</p>
        <p>R is as in the case of AND,
– if P 0 = P10 UNION P20 then Rew(P 0) = Rew(P10) UNION Rew(P20),
– if P 0 = P10 FILTER R0 then Rew(P 0) = Rew(P10) FILTER Rew(R0);
– if P 0 = SELECT X WHERE P10 then Rew(P 0) = Rew(P10).</p>
        <p>The first important property of Prew is given in the following lemma. Let P2 be
obtained from P2 by replacing each subpattern SELECT X WHERE P 0 by P 0, that
is, by disregarding all projections (recall that we assume local variables in different
subpatterns to have different names).</p>
        <p>Lemma 2. For any graph G, mapping 2 [[Prew]]G, variable ?y 2 Vars(P2 ) and
triple patterns p1; p2 of P2 that mention ?y, if both ?yp1 and ?yp2 are bound by then
(?yp1 ) = (?yp2 ).</p>
        <p>Let us extend, for technical reasons, the set of terms to all pairs [b; 1] for all blank
nodes b in H1 and mappings 1. Then, having this definition and Lemma 2 at hand, we
can “gather” every mapping 2 [[Prew]]G to its gathering mapping as follows, for
every ?y 2 Vars(P2 ):
– ?y is bound in if and only if one of ?yp1 ; : : : ; ?ypk is bound in , where p1; : : : ; pk
are all triple patterns of P2 that mention ?y;
– if i is such that ?ypi is bound and ?ypi = IRINode(b) for some b 2 blank(H1) then
(?y) = [b; ( jXpi )], where Xpi = x1pi ; : : : ; xpni is the copy of the free variables
x1; : : : ; xn of P1 corresponding to pi and is the renaming of each xjpi to xj ;
– if i is such that ?ypi is bound and ?ypi 6= IRINode(b) for any b 2 blank(H1) then
(?y) = (?ypi ).</p>
        <p>The second property of Prew is as follows.</p>
        <p>Lemma 3. For any graph G, a mapping is a gathering of some 2 [[Prew]]G if and
only if there exists 2 2 [[P2 ]][[Q1]]G such that, for every ?y 2 Vars(P2 ),
– if 2(?y) is a blank node f 1 (b) created by H1 with respect to a mapping 1 2
[[P1]]G then (?y) = [b; 1],
– otherwise 2(?y) = (?y).</p>
        <p>Having this lemma at hand, we can define the pattern Punif (Pblank; Prew). To this end,
let Y = ?y1; : : : ; ?ym be all the free variables of P2 and let P i, 1 i m, be defined
as follows, taking P 0 = Pblank AND Prew:</p>
        <p>P i = (P i 1 BIND V?yi AS ?yi) FILTER R?yi ;
where V?yi is the following condition, for b1; : : : ; bl all the blank nodes of H1:
if(S?yi = IRINode(b1); ?b1; : : : if(S?yi = IRINode(bl); ?bl; S?yi ) : : :);
S?yi is as follows, for p1; : : : ; pk all the triple patterns in P2 that uses ?yi:
if(bound(?ypi1 ); ?ypi1 ; : : : if(bound(?ypik 1 ); ?ypik 1 ; ?ypik ) : : :);
and R?yi is as follows, for free variables ?x1; : : : ; ?xn of P1 (which appear in Pblank):
(bound(?ypi1 ) ^ (?ypi1 = IRINode(b1) _ : : : _ ?ypi1 = IRINode(bl))</p>
        <p>! (?x1 ?x1p1 ^ ^ ?xn ?xpn1 )) ^
(bound(?ypik ) ^ (?ypik = IRINode(b1) _ : : : _ ?ypik = IRINode(bl))</p>
        <p>! (?x1 ?x1pk ^ ^ ?xn ?xpnk )):
Then we take the following pattern as the pattern P in resulting query Q:
^</p>
        <p>Punif (Pblank; Prew) = SELECT Y WHERE P m:</p>
        <p>We have the following lemma, whose immediate corollary is Theorem 1.
Lemma 4. For any construct queries Q1 and Q2 and the query Q constructed on the
base of Q1 and Q2 it holds that [[Q2]][[Q1]]G = [[Q]]G for any graph G.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>For space reasons we cannot give a complete example of our construction, but we have
uploaded one at http://web.ing.puc.cl/˜jreutter/construct/.</p>
      <p>Our results confirm the fact that nested CONSTRUCT queries are, in theory, an
unnecessary feature of SPARQL, at least concerning set semantics. However, our
construction introduces a blowup when translating from nested CONSTRUCT to SPARQL
1.1, which can be exponential in the depth of nesting. We conjecture that this blowup
is unavoidable in the worst case. But even if this blowup was not important, the
rewriting appears to be a very deep and technical translation, and unfortunately it is not easy
to discover the intentions of nested CONSTRUCT queries simply by looking at their
translation. For this reason we believe that nested CONSTRUCT queries are actually
a desired feature of SPARQL. As a future work we would like to investigate update
queries in SPARQL. Since we conjecture that update queries can be expressed as nested
construct queries, it would be conceivable that the former are also a replaceable (but
welcome) addition of SPARQL 1.1.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abiteboul</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hull</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vianu</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Foundations of databases</article-title>
          , vol.
          <volume>8</volume>
          .
          <string-name>
            <surname>Addison-Wesley Reading</surname>
          </string-name>
          (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Angles</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gutierrez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Subqueries in SPARQL</article-title>
          . In: AMW'
          <volume>11</volume>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Fagin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kolaitis</surname>
            ,
            <given-names>P.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Popa</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>W.C.</given-names>
          </string-name>
          :
          <article-title>Composing schema mappings: Second-order dependencies to the rescue</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .
          <volume>30</volume>
          (
          <issue>4</issue>
          ),
          <fpage>994</fpage>
          -
          <lpage>1055</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seaborne</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>SPARQL 1.1 Query Language</article-title>
          .
          <source>W3C Rec</source>
          .,
          <source>W3C (Mar</source>
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kaminski</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kostylev</surname>
            ,
            <given-names>E.V.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Cuenca</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
          :
          <article-title>Semantics and Expressive Power of Subqueries and Aggregates in SPARQL 1.1</article-title>
          . In: WWW'
          <volume>16</volume>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kostylev</surname>
            ,
            <given-names>E.V.</given-names>
          </string-name>
          :
          <article-title>On expressibility of non-monotone operators in sparql</article-title>
          .
          <source>In: KR'16</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Kostylev</surname>
            ,
            <given-names>E.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reutter</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ugarte</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>CONSTRUCT Queries in SPARQL</article-title>
          .
          <source>In: ICDT'15</source>
          . vol.
          <volume>31</volume>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. Pe´rez, J.,
          <string-name>
            <surname>Arenas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gutierrez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Semantics and Complexity of SPARQL</article-title>
          .
          <source>ACM Transactions on Database Systems</source>
          <volume>34</volume>
          (
          <issue>3</issue>
          ) (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Polleres</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scharffe</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schindlauer</surname>
          </string-name>
          , R.: SPARQL+
          <article-title>+ for mapping between RDF vocabularies</article-title>
          .
          <source>In: ODBASE'07</source>
          . pp.
          <fpage>878</fpage>
          -
          <lpage>896</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Polleres</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wallner</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          :
          <article-title>On the relation between SPARQL1.1 and answer set programming</article-title>
          .
          <source>Journal of Applied Non-Classical Logics</source>
          <volume>23</volume>
          (
          <issue>1-2</issue>
          ),
          <fpage>159</fpage>
          -
          <lpage>212</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Zhang</surname>
          </string-name>
          , X., Van den Bussche, J.:
          <article-title>On the primitivity of operators in SPARQL</article-title>
          . Inf. Process. Lett.
          <volume>114</volume>
          (
          <issue>9</issue>
          ),
          <fpage>480</fpage>
          -
          <lpage>485</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>