<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Disentangling the Notion of Dataset in SPARQL</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel Hernandez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudio Gutierrez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Center for Semantic Web Research, Department of Computer Science, Universidad de Chile</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The notion of dataset in SPARQL seems to be neither a simple nor a well-defined notion. In this paper we first review the literature, current documentation and SPARQL engines to show the subtleties behind this apparently simple notion and some of the ambiguities of its specification. Then we present formal specifications and algorithms to deal with them in practice.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The concept of dataset in SPARQL is introduced in several different parts of the
W3C documentation (for example in [
        <xref ref-type="bibr" rid="ref10 ref12 ref2 ref3 ref4 ref7 ref9">2, 3, 4, 7, 9, 10, 12</xref>
        ]). The specification
is spread around, leaves open issues, contains subtleties that result in manifold
interpretations, and, in some corner cases, even contradicts itself.
      </p>
      <p>A first source of misunderstandings is triggered by the two related, but
different, notions of default dataset and dataset description. A dataset is defined
by the SPARQL and RDF standards as a set {G0, (u1, G1), …, (un, Gn)}
where each Gi is an RDF graph and each ui is an IRI. G0 is called the default
graph and each pair (ui, Gi) is called a named graph. The default dataset is the
one a SPARQL endpoint uses when no explicit dataset description is provided
in a query request. The mentioned documentation proposes no single model
for dataset descriptions. Indeed, the standard defines two formats for
dataset descriptions: (i) in the SPARQL grammar, as a sequence of ‘FROM u’ and
‘FROM NAMED u’ clauses where u is an IRI, and (ii) in the HTTP request, with
the query string parameters ‘default-graph-uri’ and ‘named-graph-uri’. If a
dataset description is provided, a dataset must be generated from it. The IRI
u in the dataset description clause ‘FROM NAMED u’ is assumed to be a reference
to a resource that serializes a graph Gu. Thus, the reference is indirect. In
contrast, in the resulting dataset, u names the graph Gu directly.</p>
      <p>The second source of misunderstandings comes from the use of blank nodes.
In our experience the following concepts related to blank nodes are difficult to
understand or have subtleties when applied to datasets: the scoping graph, the
merge operation used when composing the default graph, and the scope limited to files
that is specified for blank node labels in RDF and SPARQL. Finally, there are
extensions that allow blank nodes as names for named graphs and literals as
subjects of RDF triples.</p>
      <p>Structure of this paper. The paper is structured in two main sections. In
Section 1 we thoroughly analyze current misunderstandings in the documentation
regarding the definition and use of datasets. In Section 2 we propose a formal
model for dataset descriptions and give algorithms to build and use the query
dataset.</p>
    </sec>
    <sec id="sec-2">
      <title>Datasets in the literature and engines</title>
      <p>Can blank nodes be used as names of named graphs? According to the SPARQL
specification, “An RDF dataset is a set {G, (u1, G1), (u2, G2), …, (un, Gn)} where
G and each Gi are graphs, and each ui is an IRI. Each ui is distinct. G is called
the default graph. (ui, Gi), are called named graphs.” [4, §18.1.3]. Thus, blank
nodes as names of graphs are not allowed in SPARQL. Restricting names to IRIs
is consistent with the SPARQL need to use names to retrieve graphs from the
Web. How should we interpret the clause ‘FROM _:b’? We cannot use a blank node to
identify a resource on the Web or a graph in the default dataset. The SPARQL
grammar does not allow the clause ‘GRAPH _:b { P }’; if it were allowed, then
‘_:b’ would not reference a specific graph in the dataset because blank nodes
in the query and in the data are in different scopes.</p>
      <p>Despite the problems that blank nodes introduce in SPARQL, the RDF
specification allows blank nodes as names of graphs<sup>1</sup>: “An RDF dataset is a
collection of RDF graphs, and comprises: (i) Exactly one default graph, being an RDF
graph. The default graph does not have a name and MAY be empty. (ii) Zero or
more named graphs. Each named graph is a pair consisting of an IRI or a blank
node (the graph name), and an RDF graph. Graph names are unique within an
RDF dataset.” [7, §4]. Another specification, TriG, also assumes the
possibility of blank nodes as names of graphs: “In a TriG document a graph
IRI or blank node may be used as label for more than one graph statements.
The graph label of a graph statement may be omitted. In this case the graph is
considered the default graph of the RDF Dataset.” [2, §2.2].</p>
      <p>
        Let s be an endpoint and u be the name of the graph Gu in the default
dataset of s. Then, according to the SPARQL 1.1 Graph Store Protocol [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], the
IRI s?graph=u allows retrieving the graph Gu. What happens if u is a blank
node? According to RDF, a blank node must never be used as a name to access
a resource, because it “has no intrinsic name.” [7, §9]. Thus, providing an IRI
for Gu based on a blank node u contradicts the RDF semantics.
      </p>
      <p>Despite these issues, Jena 2.0 and Virtuoso 6.1 support blank nodes as names
of dataset graphs.</p>
      <p>
        Can an RDF triple have a literal as subject? According to the RDF specification,
“An RDF triple consists of three components: the subject, which is an IRI or a
blank node; the predicate, which is an IRI; and the object, which is an IRI, a
literal or a blank node.” [7, §3.1]. Thus, an RDF triple must not have a literal as
subject. However, the definition of triple pattern suggests that RDF triples may
include literals as subjects, as triple patterns do: “A triple pattern is member of
the set: (T ∪ V) × (I ∪ V) × (T ∪ V).” [4, §18.1.5]. Note that T denotes the set
I ∪ B ∪ L, called the set of RDF terms. Indeed, the inclusion of literals as triple
subjects has been accepted by the RDF core Working Group: “[The RDF core Working
Group] noted that it is aware of no reason why
literals should not be subjects and a future WG with a less restrictive
charter may extend the syntaxes to allow literals as the subjects of
statements.” (Should the subjects of RDF statements be allowed to be literals?,
http://www.w3.org/2000/03/rdf-tracking/#rdfms-literalsubjects).
In our experience, SPARQL implementations do not support literals as triple
subjects. Jena 2.0 and Virtuoso 6.1 raise an error when uploading files with
literals in the subject position.
      </p>
      <p>
        Footnote 1: Note that the concept of dataset was not defined in the earlier RDF 1.0 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] but was
included in RDF 1.1 [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], after the inclusion of RDF datasets in SPARQL.
      </p>
      <p>The default dataset. The SPARQL specification states: “If a query provides
such a dataset description, then it is used in place of any dataset that the query
service would use if no dataset description is provided in a query.” [4, §13.2].
Thus, every SPARQL endpoint may provide a default dataset to be used in the
absence of a dataset description. Note that the dataset description can be
defined not only in the query but also in parameters of the request: “The RDF
dataset may also be specified in a SPARQL protocol request, in which case the
protocol description overrides any description in the query itself.” [4, §13.2]. The
description can be included in three forms: as a parameter in the IRI of the HTTP
request, as a parameter in the body of the HTTP request, or as dataset clauses in
the query [3, §2.1].</p>
      <p>
        Angles and Gutierrez [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] have interpreted the specification in a different way.
They assumed that the default dataset has no named graphs and an empty
default graph, i.e., the default dataset is always {∅}. This interpretation follows
the principle of running queries against the Web so that the evaluation of a
query does not depend on the particular SPARQL endpoint that evaluates it.
In contrast, in the specification, a query may be evaluated against the default
dataset of the SPARQL endpoint where the query is submitted.
      </p>
      <p>SPARQL federation allows using more than one dataset in the same query.
When a query includes a ‘SERVICE s { P }’ clause, the SPARQL endpoint
identified as s may evaluate the graph pattern P against the default dataset of s.
Indeed, the specification [10, §3.2] states that the ‘SERVICE’ clause generates a request to
the endpoint identified as s with the query Q = ‘SELECT * WHERE { P }’. Thus,
as the query Q has no dataset description, the dataset used to evaluate P is the
default dataset of the endpoint identified by s.</p>
      <p>Note that as subselects cannot include dataset descriptions, all dereferencing
of graphs from the Web must be done by the endpoint that receives the whole
query. Thus, endpoints that are used to delegate the evaluation of graph patterns
can only use their own default datasets.</p>
      <p>Jena 2.0 and Virtuoso 6.1 follow the SPARQL specification, i.e., they use the
default dataset in the absence of a dataset description and in federated queries.</p>
      <p>Can graphs in RDF datasets share blank nodes? The RDF specification is clear
in allowing blank nodes to be shared across graphs: “Blank nodes can be shared
between graphs in an RDF dataset.” [7, §4]. The TriG language for serializing
datasets supports sharing blank nodes across graphs: “BlankNodes sharing the
same label in differently labeled graph statements are considered to be the same
BlankNode.” [2, §2.3.1]. Dataset engines such as Jena and Virtuoso preserve
the identity of blank nodes when loading dataset serializations that share blank
nodes across named graphs.</p>
      <p>
        Despite the clarity in the standards and the assumptions made by engine
developers, this question has been a source of misunderstandings. Mallea,
Arenas, Hogan and Polleres [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] assumed that blank nodes cannot be shared across
graphs. In a later work [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] they recognize their mistake: “This clarification may
serve as a corrigendum for our previous paper in which we stated that blank nodes
cannot be shared across graphs in SPARQL [48]. This statement is misleading
in that although blank nodes cannot be shared across scoping graphs, they can be
shared across named graphs”. Perez, Arenas and Gutierrez [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] assumed “for the
sake of the simplicity” that blank nodes were not shared by graphs in an RDF
dataset.
      </p>
      <p>This misunderstanding comes from the principle that blank nodes are scoped
to files (which agrees with the specification) and the assumption that each
graph must be contained in its own file (which is wrong). “Blank node identifiers
are local identifiers that are used in some concrete RDF syntaxes or RDF store
implementations. They are always locally scoped to the file or RDF store, and are
not persistent or portable identifiers for blank nodes. Blank node identifiers are
not part of the RDF abstract syntax, but are entirely dependent on the concrete
syntax or implementation. The syntactic restrictions on blank node identifiers,
if any, therefore also depend on the concrete RDF syntax or implementation.
Implementations that handle blank node identifiers in concrete syntaxes need to
be careful not to create the same blank node from multiple occurrences of the
same blank node identifier except in situations where this is supported by the
syntax.” [7, §3.2]. “A blank node is a node that is not a URI reference or a
literal. In the RDF abstract syntax, a blank node is just a unique node that can
be used in one or more RDF statements, but has no intrinsic name.” [7, §3.2].</p>
      <p>The SPARQL specification states that, to evaluate a query, files referenced
in the dataset description must be retrieved to build the dataset. Nothing is
said about graphs that are included in the default dataset. “If a query provides
more than one FROM clause, providing more than one IRI to indicate the default
graph, then the default graph is the RDF merge of the graphs obtained from
representations of the resources identified by the given IRIs.” [4, §13.2.1]. “The
FROM NAMED syntax suggests that the IRI identifies the corresponding graph, but
the relationship between an IRI and a graph in an RDF dataset is indirect. The
IRI identifies a resource, and the resource is represented by a graph (or, more
precisely: by a document that serializes a graph).” [4, §13.2.2].</p>
      <p>The use of the merge operation to combine graphs into the default graph
suggests that blank nodes can be shared by the RDF files dereferenced when
interpreting the dataset description. Moreover, merge is not applied to named
graphs, a fact that suggests that blank nodes coming from different files can be
shared across named graphs of the dataset resulting from the evaluation of a dataset
description.</p>
      <p>What happens if a dataset description references only one remote file as
part of the default graph? Must the merge operation be applied over a
single graph? How is the dataset description ‘FROM u FROM u’ evaluated? Must the
graph resulting from dereferencing u be merged with itself?</p>
      <p>
        The way in which the merge operation is applied is another source of
misunderstandings. “In an RDF merge, blank nodes in the merged graph are not shared
with blank nodes from the graphs being merged.” [4, §13.1]. A different definition
of merge was provided by Angles and Gutierrez [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]: “The merge of graphs,
denoted G1 + G2, is the graph G1 ∪ G2′ where G2′ is the graph obtained from G2
by renaming its blank nodes to avoid clashes with those in G1.” According to the
SPARQL specification, G1 + G2 shares blank nodes with neither G1 nor G2, whereas
under the definition provided by Angles and Gutierrez it keeps the blank nodes of G1.
      </p>
      <p>We identify three strategies to avoid blank node clashes: (i) rename blank
nodes when interpreting files, ensuring that no blank nodes are shared; (ii) use
the merge operator to combine the graphs that compose the default graph; (iii)
use both strategies (i) and (ii).</p>
      <p>Strategy (i) is not mentioned in the SPARQL specification, but in our
experience many people think that it must be used as a direct consequence of
the fact that blank nodes are scoped to files. Strategy (i) is sufficient to
ensure that blank nodes are scoped to files. In contrast, strategy (ii)
is not sufficient, as blank nodes in named graphs are not renamed and hence can
be shared. The use of merge in strategies (ii) and (iii) may introduce spurious
identities for blank nodes. For example, let u be an IRI that references a file
that serializes the graph Gu. Let us consider a dataset description that contains
both clauses ‘FROM u’ and ‘FROM NAMED u’. Then, a blank node that occurs in Gu
may be renamed with a fresh blank node in the default graph when applying the
merge operation, losing its link with its occurrence in the named graph (u, Gu).</p>
      <p>
        The documentation about merge is not clear. The SPARQL 1.1 Service
Description specification [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] introduces the property UnionDefaultGraph to
indicate that a service uses the union of the named graphs as the default graph. A
property to describe default graphs as a merge of named graphs is not
introduced for service descriptions. This seems preferable, as in the specifications of
SPARQL and RDF 1.1 named graphs are allowed to share blank nodes. However,
it seems contrary to the use of merge to construct the default graph described
by several ‘FROM’ clauses in the query, as the SPARQL specification states.
      </p>
      <p>Neither Jena nor Virtuoso allows referencing remote graphs in the dataset
description. Thus, they never dereference an IRI. Moreover, the merge operation
is never performed. Indeed, if (u1, G1) and (u2, G2) are two named graphs in the
default dataset of an endpoint identified as s, then the default graph specified
by the dataset description ‘FROM u1 FROM u2’ is G1 ∪ G2. Thus, no blank node
renaming is done by Jena and Virtuoso at the moment of evaluating queries.</p>
      <p>The task of renaming blank nodes to ensure their file scope is
performed by Jena and Virtuoso at the moment of loading RDF files. Both engines
rename all blank nodes of the form ‘_:bn’ with fresh blank nodes and set fresh
blank nodes for nodes that have no label. However, Virtuoso 6.1 uses identifiers such as
&lt;nodeID://b01&gt; as blank node identifiers. An identifier of this kind returns true
when evaluating the function ‘isBlank(b)’ but (being an IRI) is not scoped. This
use of blank node identifiers in Virtuoso does not follow the standard. Indeed, a
blank node identifier can be used in the query to refer to a blank node in the data.
In contrast, the standard limits the scope of blank nodes to basic graph
patterns and disallows the sharing of blank nodes across basic graph patterns
in queries. Jena follows the standard, raising the error “blank node reuse is not
allowed in this point” if a blank node occurs in more than one basic graph pattern.</p>
      <p>How must an IRI that occurs several times in the dataset
description of a query be interpreted? The SPARQL specification leaves open whether
an IRI that occurs in more than one ‘FROM’ or ‘FROM NAMED’ clause must
be dereferenced once or several times. “The actions required to construct the dataset
are not determined by the dataset description alone. If an IRI is given twice in
a dataset description, either by using two FROM clauses, or a FROM clause and
a FROM NAMED clause, then it does not assume that exactly one or exactly two
attempts are made to obtain an RDF graph associated with the IRI. Therefore,
no assumptions can be made about blank node identity in triples obtained from
the two occurrences in the dataset description. In general, no assumptions can be
made about the equivalence of the graphs.” [4, §12.2.3]. Note that if an RDF file
is dereferenced twice, the file may change between the attempts to get it, resulting
in different files. Thus, the results depend on the policy of the service to handle
different versions of an RDF file.</p>
      <p>Let us consider the dataset description ‘FROM NAMED u FROM NAMED u’. The
specification states that names must not be repeated in the dataset. “Each ui
[the IRI that names a graph] is distinct.” [4, §18.1.3]. “Graph names are unique
within an RDF dataset.” [7, §4]. “Using the same IRI in two or more FROM NAMED
clauses results in one named graph with that IRI appearing in the dataset.” [4,
§13.2.2]. Thus the question arises: which graph should be used if u is
dereferenced twice (and the two copies are not equal)?</p>
      <p>Let us consider the dataset description ‘FROM u FROM u’ and let Gu be the
graph obtained by dereferencing u (assuming that only one request for u
was issued). If the default graph is the union of Gu with itself, then the default
graph of the dataset will be Gu. In contrast, if merge is applied, the default
graph G0 will be a graph that duplicates every triple containing a blank node in
Gu. Thus Gu will entail G0, but G0 will not be the same graph as Gu.</p>
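      <p>The difference between union and merge for ‘FROM u FROM u’ can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of the specifications: triples are plain tuples, blank nodes are strings starting with "_:", and merge renames only the blank nodes of its second operand (the variant quoted above from Angles and Gutierrez). The names rename_apart, union and merge are hypothetical helpers.</p>
      <preformat><![CDATA[
```python
from itertools import count


def is_blank(term):
    # Assumption: blank nodes are represented as strings with the "_:" prefix.
    return isinstance(term, str) and term.startswith("_:")


def rename_apart(graph, fresh):
    """Return a copy of graph whose blank nodes are replaced by fresh ones.

    Assumes that no label of the form "_:fresh<n>" already occurs in the data.
    """
    renaming = {}

    def rename(term):
        if not is_blank(term):
            return term
        if term not in renaming:
            renaming[term] = "_:fresh%d" % next(fresh)
        return renaming[term]

    return {(rename(s), p, rename(o)) for (s, p, o) in graph}


def union(g1, g2):
    # Plain set union: blank nodes with the same label stay identical.
    return g1 | g2


def merge(g1, g2, fresh):
    # Merge variant that renames the blank nodes of the second operand only.
    return g1 | rename_apart(g2, fresh)


g_u = {
    ("_:b", "http://example.org/p", "http://example.org/a"),
    ("http://example.org/a", "http://example.org/q", "http://example.org/c"),
}

g0_union = union(g_u, g_u)           # the same graph as g_u
g0_merge = merge(g_u, g_u, count())  # the blank node triple is duplicated
```
]]></preformat>
      <p>Under these assumptions the union leaves the graph unchanged, while the merge adds a renamed copy of the single triple that contains a blank node; triples without blank nodes are not duplicated.</p>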
      <p>Note that neither Jena nor Virtuoso handles this issue because they do not
support dereferencing IRIs that are not names of named graphs in their default
datasets.</p>
    </sec>
    <sec id="sec-3">
      <title>The notion of dataset</title>
      <p>The Web is constituted by a finite set of HTTP servers. A server can send
HTTP requests and responses to another server. Servers can be disconnected
from or added to the Web. The internal configuration of a server may change, so the
same request can be answered with different responses if the server configuration
changes between the requests.</p>
      <p>We will assume that servers follow the REST design principles, so a GET
request does not change the internal configuration of the server that receives
it. As a SPARQL client that sends only GET requests has no control over
changes in the data, the order and repetition of requests cannot be used by
the client to ensure a result.</p>
      <p>In the Web, there is a finite set of files serializing RDF graphs. Every file is
accessible by sending an HTTP request GET u to an HTTP server. The request
GET u may result in a response with the file referenced by u if the server
associates u with a file.</p>
      <p>In the Web, there is a finite set of particular types of HTTP servers, called
SPARQL endpoints, that provide access to datasets via requests that contain
SPARQL queries. A SPARQL endpoint is identified by an IRI that can be used
to send queries. Let Q be a SPARQL query; then the request GET s?query=Q is
the SPARQL GET request that sends the query Q to the endpoint identified as
s. A successful response with a file serializing the result of Q is sent back if the
query is accepted and if no error occurs during the query execution. The result
is interpreted as a Boolean value, a sequence of mappings or an RDF graph.</p>
      <p>A SPARQL GET request may have the following query string parameters:
‘query’ (exactly 1), ‘default-graph-uri’ (0 or more) and ‘named-graph-uri’
(0 or more). The SPARQL protocol also includes the possibility to send queries
via URL-encoded POST or via POST directly. In this paper we will focus on
SPARQL queries sent using the HTTP method GET.</p>
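      <p>As an illustration, the query string of a SPARQL GET request can be assembled with the Python standard library. This is only a sketch: the function name sparql_get_url and the example.org IRIs are hypothetical, and no request is actually sent.</p>
      <preformat><![CDATA[
```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sparql_get_url(endpoint, query, default_graphs=(), named_graphs=()):
    """Build the IRI of a SPARQL GET request (hypothetical helper).

    'query' occurs exactly once; the two dataset parameters may occur
    zero or more times.
    """
    params = [("query", query)]
    params += [("default-graph-uri", u) for u in default_graphs]
    params += [("named-graph-uri", u) for u in named_graphs]
    return endpoint + "?" + urlencode(params)


url = sparql_get_url(
    "http://example.org/sparql",
    "SELECT * WHERE { ?s ?p ?o }",
    default_graphs=["http://example.org/g1", "http://example.org/g2"],
    named_graphs=["http://example.org/g3"],
)

# Decoding the query string recovers the parameters that the endpoint sees.
decoded = parse_qs(urlsplit(url).query)
```
]]></preformat>
      <p>Repeating a parameter name is how several default or named graphs are passed in one request; the decoded form keeps one list of values per parameter.</p>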
      <p>The description above roughly indicates how the protocols necessary to
evaluate SPARQL queries work. In what follows, we present a formal model that
simplifies and makes explicit such protocols.</p>
      <sec id="sec-3-1">
        <title>A data model for SPARQL</title>
        <p>Let I, B, L and V be infinite disjoint sets containing the IRIs, the blank nodes,
the literals and the variables. Let IB and IBL be the sets I ∪ B and I ∪ B ∪ L,
respectively. A triple is a tuple in IB × I × IBL. A graph is a set of triples. A dataset
is a set {G0, (u1, G1), …, (un, Gn)} where G0, …, Gn are graphs, u1, …, un are
different IRIs and n ≥ 0. G0 is called the default graph and the rest are called
the named graphs. A dataset description is a pair (A, B) where A and B are sets
of IRIs. A mapping is a partial function with a finite domain in V and range in
IBL. There is a function that takes a pair (u, t), where u is an IRI and t is a
positive real number representing time, and returns either null, a graph or a
dataset. Intuitively, the value of this function at (u, t) is the data that u makes
accessible at the instant t.
We call endpoints the IRIs that make datasets accessible during a period.</p>
        <p>Note that most definitions presented above were taken from the standards
and the literature, except the model of the dataset description. The definition of
a dataset description as a pair of sets (instead of a list of dataset clauses) gives
an unequivocal semantics to the order of dataset clauses and to the repetition of
dataset clauses in the syntax of the dataset description: neither affects the
description. We argued previously
that assuming order or considering repetitions makes no sense if the client has
no control over how the data changes over time.</p>
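        <p>The effect of modelling a dataset description as a pair of sets can be sketched in Python. The representation of dataset clauses as (keyword, IRI) pairs and the function name description_from_clauses are assumptions made for this illustration only.</p>
        <preformat><![CDATA[
```python
def description_from_clauses(clauses):
    """Map a list of dataset clauses to a dataset description (A, B).

    Each clause is a pair ("FROM", iri) or ("FROM NAMED", iri), listed in
    the order in which it appears in the query (an assumed representation).
    """
    a = frozenset(iri for kind, iri in clauses if kind == "FROM")
    b = frozenset(iri for kind, iri in clauses if kind == "FROM NAMED")
    return (a, b)


U = "http://example.org/u"
V = "http://example.org/v"

# Same clauses, different order, and one repetition of 'FROM u'.
d1 = description_from_clauses([("FROM", U), ("FROM NAMED", V), ("FROM", U)])
d2 = description_from_clauses([("FROM NAMED", V), ("FROM", U)])
```
]]></preformat>
        <p>Both clause lists yield the same description, so order and repetition in the query syntax cannot influence how the dataset is built.</p>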
      </sec>
      <sec id="sec-3-2">
        <title>Algorithms for the dataset</title>
        <p>
          A request is a pair of IRIs (a, b) at time t. A response is a tuple (a, b, r) where
a and b are IRIs and r is null (interpreted as a not-found error message<sup>2</sup>),
a graph, a sequence of mappings or a Boolean value. A query IRI is an IRI
u that has the format s?p where the prefix s is called the IRI endpoint and the
query string p must contain the parameter ‘query’ and zero or more
parameters ‘default-graph-uri’ and ‘named-graph-uri’. The procedure to generate
the response to a request is described by Algorithm 1. Note that this
algorithm allows retrieving the whole dataset through the endpoint IRI. This is not
supported by the current standard. However, it is possible to download it with
queries or with the graph store protocol [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Algorithm 1: Response(a, b)</title>
        <p>Data: A request (a, b) sent at an instant t.</p>
        <p>Result: The response of the request (a, b).</p>
        <preformat>if b is a query IRI s?p and the data accessible at s at time t is a dataset then
    return (s, a, r) where r is the result of evaluating the query and parameters
    indicated in the query string p in the endpoint s, that is, using the dataset
    accessible at s at time t as the default dataset.
else
    return (b, a, r) where r is the data accessible at b at time t.
end</preformat>
        <p>The procedure to generate the dataset description is presented in Algorithm
2. The endpoint must first check the parameters in the request. If no dataset
description is provided in such parameters, it looks for the dataset description in the
query.</p>
        <p>Each IRI u in a dataset description must be associated with a graph Gu. If
the default dataset contains a graph named u, then Gu is the graph associated
with u in the dataset. Otherwise, the graph Gu is the result of renaming the blank
nodes with fresh labels in the response of the request (s, u). This procedure is
formally described in Algorithm 3. Finally, Algorithm 4 describes the procedure
to be used when answering the query.</p>
        <p>Footnote 2: Every request must result in a response that occurs after the request. In the HTTP
protocol, a request may result in manifold unsuccessful results (a time out, an
internal error, etc.). For the sake of simplicity, in our model we consider an
unsuccessful result as a null result.</p>
        <sec id="sec-3-3-1">
          <title>Algorithm 2: DatasetDescription(a, s?p)</title>
          <p>Data: A request (a, s?p) where s?p is a query IRI.</p>
          <p>Result: The dataset description of (a, s?p).</p>
          <preformat>Let Q be the value that occurs in p as the parameter ‘query’.
Let A and B be the sets of IRIs that occur in p associated with the parameters
‘default-graph-uri’ and ‘named-graph-uri’, respectively.
if A ∪ B ≠ ∅ then
    return (A, B)
else if there is at least one ‘FROM’ or ‘FROM NAMED’ clause in Q then
    Let A and B be the sets of IRIs that occur in Q in the clauses ‘FROM’ and
    ‘FROM NAMED’, respectively.
    return (A, B)
else
    the request has no dataset description.
end</preformat>
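          <p>Algorithm 2 can be sketched in Python as follows. The sketch makes simplifying assumptions: dataset clauses are located with a regular expression over absolute IRIs in angle brackets (a real implementation would use a SPARQL parser, since the expression also matches inside string literals and comments), the sender a of the request is omitted because the algorithm does not use it, and the function name dataset_description is hypothetical.</p>
          <preformat><![CDATA[
```python
import re
from urllib.parse import parse_qs, urlencode, urlsplit

# Simplifying assumption: 'FROM <iri>' and 'FROM NAMED <iri>' are found
# textually; this is not a SPARQL parser.
DATASET_CLAUSE = re.compile(r"\bFROM\s+(NAMED\s+)?<([^<>\s]*)>", re.IGNORECASE)


def dataset_description(query_iri):
    """Return the description (A, B) of a query IRI s?p, or None if absent."""
    params = parse_qs(urlsplit(query_iri).query)
    a = frozenset(params.get("default-graph-uri", []))
    b = frozenset(params.get("named-graph-uri", []))
    if a or b:
        # Protocol parameters override any description in the query.
        return (a, b)
    query = params["query"][0]  # a query IRI must carry the 'query' parameter
    clauses = DATASET_CLAUSE.findall(query)
    if clauses:
        a = frozenset(iri for named, iri in clauses if not named)
        b = frozenset(iri for named, iri in clauses if named)
        return (a, b)
    return None  # the request has no dataset description


base = "http://example.org/sparql?"
q = ("SELECT * FROM <http://example.org/g1> "
     "FROM NAMED <http://example.org/g2> WHERE { ?s ?p ?o }")

in_query = dataset_description(base + urlencode([("query", q)]))
in_params = dataset_description(
    base + urlencode([("query", q), ("default-graph-uri", "http://example.org/g3")])
)
no_description = dataset_description(
    base + urlencode([("query", "SELECT * WHERE { ?s ?p ?o }")])
)
```
]]></preformat>
          <p>In the second call the protocol parameter wins, so the clauses written in the query are ignored altogether; in the third call the caller falls back to the default dataset of the endpoint.</p>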
        </sec>
        <sec id="sec-3-3-2">
          <title>Algorithm 3: DatasetFromDescription(A, B)</title>
          <p>Data: A dataset description (A, B).</p>
          <p>Result: The dataset built from the description (A, B).</p>
          <preformat>Let G ← ⋃ Gi for all graphs Gi in the default dataset (the scoping graph); let
G0 ← ∅ (the default graph); let D ← {G0} (the resulting dataset); let D′ be the
default dataset.
for u ∈ A ∪ B do
    if there exists (u, G′) ∈ D′ then
        Let Gu ← G′
    else
        Send the request (s, u) and let (u, s, r) be its response.
        if r is a graph then
            Let G′ be the result of replacing all blank nodes in r with blank
            nodes that do not occur in G;
            Let Gu ← G′
        end
    end
    if u ∈ A then
        Let G0 ← G0 ∪ Gu
    end
    if u ∈ B then
        Let D ← D ∪ {(u, Gu)}
    end
end
return D</preformat>
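          <p>A possible Python reading of Algorithm 3 is sketched below. Several details are assumptions made for the sake of a runnable example: graphs are sets of tuples, blank nodes are strings starting with "_:", a dataset is a pair (default graph, dictionary of named graphs), fetch is a hypothetical callback standing in for the request (s, u), and IRIs whose request yields no graph are skipped.</p>
          <preformat><![CDATA[
```python
from itertools import count


def is_blank(term):
    # Assumption: blank nodes are strings starting with "_:".
    return isinstance(term, str) and term.startswith("_:")


def blank_nodes(graph):
    return {t for (s, p, o) in graph for t in (s, o) if is_blank(t)}


def dataset_from_description(a, b, default_dataset, fetch):
    """Build (default_graph, named_graphs) from the description (a, b).

    default_dataset is a pair (graph, {iri: graph}); fetch(u) is a
    hypothetical callback returning the graph serialized at u, or None.
    """
    d0_default, d0_named = default_dataset
    # Blank nodes of the scoping graph G; fresh labels must avoid them.
    used = blank_nodes(d0_default).union(*map(blank_nodes, d0_named.values()))
    counter = count()

    def fresh():
        while True:
            label = "_:f%d" % next(counter)
            if label not in used:
                used.add(label)
                return label

    def rename_apart(graph):
        renaming = {}

        def rename(term):
            if not is_blank(term):
                return term
            if term not in renaming:
                renaming[term] = fresh()
            return renaming[term]

        return {(rename(s), p, rename(o)) for (s, p, o) in graph}

    g0, named = set(), {}
    for u in sorted(a | b):
        if u in d0_named:
            g_u = d0_named[u]      # graph of the default dataset: no request
        else:
            r = fetch(u)
            if r is None:
                continue           # assumption: unsuccessful requests are skipped
            g_u = rename_apart(r)  # blank nodes are scoped to the fetched file
        if u in a:
            g0 |= g_u              # union, not merge
        if u in b:
            named[u] = g_u
    return g0, named


P = "http://example.org/p"
local, remote = "http://example.org/local", "http://example.org/remote"
default_dataset = (set(), {local: {("_:b", P, "http://example.org/a")}})
web = {remote: {("_:b", P, "http://example.org/c")}}
requested = []


def fetch(u):
    requested.append(u)
    return web.get(u)


g0, named = dataset_from_description({local, remote}, {local}, default_dataset, fetch)
```
]]></preformat>
          <p>In this run only the remote IRI is dereferenced. Its blank node is renamed, so it cannot clash with the local one, whereas the blank node of the local graph keeps its identity in both the default graph and the named graph.</p>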
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>Algorithm 4: Dataset(a, s?p)</title>
        <p>Data: A request (a, s?p) where s?p is a query IRI and the data accessible at s at time t is a dataset.</p>
        <p>Result: The dataset of the request at t.</p>
        <preformat>if DatasetDescription(a, s?p) is defined then
    (A, B) ← DatasetDescription(a, s?p);
    return DatasetFromDescription(A, B)
else
    return the dataset accessible at s at time t.
end</preformat>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>We have proposed a formal model that defines the dataset that is used to evaluate
every graph pattern in a query. The model was designed to be as close as possible
to the RDF and SPARQL specifications, to formalize the current natural-language
specification, and to clarify some of its ambiguities.</p>
      <p>Acknowledgements: The authors acknowledge funding from the Millennium Nucleus Center
for Semantic Web Research under Grant NC120004.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Angles</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          .
          <article-title>SQL Nested Queries in SPARQL</article-title>
          .
          <source>In Proceedings of the Alberto Mendelzon Workshop (AMW)</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          .
          <article-title>RDF 1.1 TriG: RDF Dataset Language</article-title>
          . Recommendation, World Wide Web Consortium,
          <year>February 2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Feigenbaum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. T.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. G.</given-names>
            <surname>Clark</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Torres</surname>
          </string-name>
          .
          <article-title>SPARQL 1.1 Protocol</article-title>
          . Recommendation, World Wide Web Consortium,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Harris</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Seaborne</surname>
          </string-name>
          .
          <article-title>SPARQL 1.1 Query Language</article-title>
          . Recommendation, World Wide Web Consortium,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hogan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Arenas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mallea</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          .
          <article-title>Everything you always wanted to know about blank nodes</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          ,
          <volume>27–28</volume>
          (
          <issue>0</issue>
          ):
          <fpage>42</fpage>
          –
          <lpage>69</lpage>
          ,
          <year>2014</year>
          . ISSN 1570-8268. doi: http://dx.doi.org/10.1016/j.websem.2014.06.004. URL http://www.sciencedirect.com/science/article/pii/S1570826814000481. Semantic Web Challenge 2013.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Klyne</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Carroll</surname>
          </string-name>
          .
          <article-title>Resource Description Framework (RDF): Concepts and Abstract Syntax</article-title>
          . Recommendation, World Wide Web Consortium,
          <year>February 2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Klyne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carroll</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>McBride</surname>
          </string-name>
          .
          <article-title>RDF 1.1 Concepts and Abstract Syntax</article-title>
          . Recommendation, World Wide Web Consortium,
          <year>February 2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mallea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Arenas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hogan</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          .
          <article-title>On blank nodes</article-title>
          . In L. Aroyo, C. Welty, H. Alani, J. Taylor, A. Bernstein, L. Kagal, N. Noy, and E. Blomqvist, editors,
          <source>The Semantic Web – ISWC 2011</source>
          , volume
          <volume>7031</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>421</fpage>
          –
          <lpage>437</lpage>
          . Springer Berlin Heidelberg,
          <year>2011</year>
          . ISBN 978-3-642-25072-9. doi: 10.1007/978-3-642-25073-6_27. URL http://dx.doi.org/10.1007/978-3-642-25073-6_27.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Ogbuji</surname>
          </string-name>
          .
          <article-title>SPARQL 1.1 Graph Store Protocol</article-title>
          . Recommendation, World Wide Web Consortium, Mar.
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>E.</given-names>
            <surname>Prud'hommeaux</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Buil-Aranda</surname>
          </string-name>
          .
          <article-title>SPARQL 1.1 Federated Query</article-title>
          . Recommendation, World Wide Web Consortium,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Arenas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          .
          <article-title>Semantics of SPARQL</article-title>
          . Technical Report TR/DCC-2006-17, Universidad de Chile,
          <year>October 2006</year>
          . URL http://users.dcc.uchile.cl/~jperez/papers/sparql_semantics.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G. T.</given-names>
            <surname>Williams</surname>
          </string-name>
          .
          <article-title>SPARQL 1.1 Service Description</article-title>
          . Recommendation, World Wide Web Consortium, Mar.
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>