<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>E cient approximate SPARQL querying of Web of Linked Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>B.R.Kuldeep Reddy</string-name>
          <email>brkreddy@cse.iitm.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>P.Sreenivasa Kumar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Institute of Technology Madras</institution>
          ,
          <addr-line>Chennai</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The web of linked data represents a globally distributed dataspace which can be queried using the SPARQL query language. However, with the growth in size and complexity of the web of linked data, it becomes impractical for the user to know enough about its structure and semantics for the user queries to produce enough answers. This problem is addressed in the paper by making use of ontologies available on the web of linked data to produce approximate results. The existing approach, which generates multiple relaxed queries and executes them sequentially one by one, is improved by integrating the approximation steps with the query execution itself. Thus, by performing query relaxation on-the- y at runtime, the shared data between relaxed queries are not fetched repeatedly, resulting in signi cant performance bene ts. Further opportunities for optimization during query execution are identi ed and are used to prune relaxation steps which do not produce results. The implementation of our approach demonstrates its e cacy.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        The traditional World Wide Web has allowed sharing of documents among users
on a global scale. The documents are generally represented in HTML, XML
formats and are accessed using URL and HTTP protocols creating a global
information space. However, in the recent years the web has evolved towards a
web of data [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] as the conventional web's data representation sacri ces much
of its structure and semantics [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and the links between documents are not
expressive enough to establish the relationship between them. This has lead
to the emergence of the global data space known as Linked Data[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>Linked data basically interconnects pieces of data from di erent sources
utilizing the existing web infrastructure. The data published is machine readable
that means it is explicitly de ned. Instead of using HTML, linked data uses
RDF format to represent data. The connection between data is made by typed
statements in RDF which clearly de nes the relationship between them resulting
in a web of data.</p>
      <p>
        Berners-Lee outlined a set of Linked Data Principles for publishing data on
the Web [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] in a way that all published data becomes part of a single global data
space:
      </p>
    </sec>
    <sec id="sec-2">
      <title>1. Use URIs as names for things</title>
      <p>2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information, using the
standards(RDF, SPARQL)
4. Include links to other URIs, so that they can discover more things</p>
      <p>
        The RDF model describes data in the form of subject, predicate and object
triples. The subject and object of a triple can be both URIs that each identify
an entity, or a URI and a string value respectively. The predicate denotes the
relationship between the subject and object, and is also represented by a URI.
SPARQL is the query language proposed by W3C recommendation to query
RDF data [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. A SPARQL query basically consists of a set of triple patterns.
It can have variables in the subject,object or predicate positions in each of the
triple pattern. The solution consists of binding these variables to entities which
are related with each other in the RDF model according to the query structure.
      </p>
      <p>
        There have been a number of approaches proposed to query the web of linked
data. One direction has been to crawl the web by following RDF links and build
an index of discovered data. The queries are then executed against these indexes.
This approach is followed by Sindice[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], Swoogle[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Another approach has been
to follow the federated query processing concepts [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], as in DARQ[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which
decomposes a SPARQL query in subqueries, forwards these subqueries to multiple,
distributed query services, and, nally, integrates the results of the subqueries.
Another execution approach for evaluating SPARQL queries on linked data is
proposed in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. It is basically a run-time approach which executes the query
by asynchronously traversing RDF links to discover data sources at run-time.
SPARQL query execution takes place by iteratively dereferencing URIs to fetch
their RDF descriptions from the web and building solutions from the retrieved
data. The SPARQL query execution according to [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is explained with an
example below.
      </p>
      <p>Example. The SPARQL query shown in Figure 1 searches for Professors
employed by the university who have authored a publication. The query
execution begins by fetching the RDF description of the university by dereferencing its
URI. The fetched RDF description is then parsed to gather a list of all of its
publications. Parsing is done by looking for triples that match the rst pattern in the
query. The object URIs in the matched triples form the list of publications in the
university. Lets say &lt;http://site1/publ1.rdf&gt;, &lt;http://site2/publ2.rdf&gt;,
&lt;http://publ3/Mary.rdf&gt; were found to be the papers. The query execution
proceeds by fetching the RDF descriptions corresponding to the three
publications. Lets say rst publ1's graph is retrieved. It is parsed to check for triples
matching the second query pattern and it is found that publ1 was authored by
John &lt;http://site4/John.rdf&gt;. John's details are again fetched and the third
triple pattern in the query is searched in the graph to see whether he is of type
Professor and if he is, the result of query is formed and displayed as output.
Publ1's and Publ2's graphs and their author details would also be retrieved and
the query execution proceeded in a way similar to Publ1's.</p>
      <p>
        Consider the situation where the retrieved list publications authored by the
professors may not meet the requirements of the user in which case query
conditions can be relaxed to produce more results. For example, instead of looking
for only Professors, the query can be generalized by searching for all types of
people including lectures,graduate students etc. The algorithm presented in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
generates relaxed SPARQL queries and executes them sequentially one after
another to generate approximate answers. The algorithm was designed to work on
centralized RDF repositories and the approach is extended to the web of linked
data in this paper. The relaxed SPARQL queries formed share many query
conditions in common, which are not utilized to optimize the queries. Especially in a
distributed environment, like the web of linked data, avoiding repeated fetching
of data shared across the queries results in signi cant performance bene ts.
      </p>
      <p>Example. Figure 2 gives the two similar queries formed after the query term
Professor is replaced by Faculty and Person terms using RDFS ontology, which
are then executed sequentially. However, the rst two predicates are common
between the two queries, therefore to achieve e ciency the information
corresponding to them can be fetched once. Hence, instead of generating the two
queries the execution of the original query can continue by dereferencing the
URIs corresponding to Publ1 and its author and retrieving their RDF
descriptions, and the check performed by the third predicate to see whether the author
is a Professor or not is only replaced by Faculty and Person at the last step.
This allows the retrieval of the shared data only once instead of twice had the
existing approach been followed.</p>
      <p>
        The goal of this paper is to perform approximate SPARQL querying of the
web of linked data. This paper extends the approach presented in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for relaxed
querying of centralized RDF repositories to the context of web of linked data and
along with the execution approach presented in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] takes into account di erent
namespaces being used. The idea of delaying query relaxation to run-time is
introduced in order to optimize the query performance and the various other
optimization opportunities present are also recognized and used.
2
      </p>
      <sec id="sec-2-1">
        <title>Similarity Measures</title>
        <p>
          In [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] the similarity measures were de ned to allow ranked approximate answers.
However the measures were designed for centralized RDF repositories and
considered only one ontology. In the context of linked data, each user publishing
data has the freedom to de ne his own ontology, but according to the principles
of linked data it has to be mapped to existing ontologies, therefore we assume
such mappings exist for the purposes of this paper.
        </p>
        <p>A triple pattern can be replaced by terms in the ontology in a number of ways.
Therefore, there is a need to attach a score to each relaxation which can be used
to rank them to ensure the quality of results. The score given to each
relaxation measures the similarity of it to the original triple pattern. Highest scoring
relaxation are executed rst followed by others in the decreasing order of the
similarity score. For example, we would rank the relaxation from (?X,type,professor)
to (?X,type,faculty) higher than (?X,type,professor) to (?X,type,person) as the
former is more similar to the original triple pattern. A SPARQL query consists
of a basic graph pattern which in turn consists of triple patterns. Therefore, the
score associated with an answer to a SPARQL query is computed by aggregating
the scores of relaxed triple patterns. Each triple pattern consists of a subject,
predicate and object parts, and each of them can be potentially relaxed. Their
aggregated score gives the score of the triple pattern.</p>
        <p>Similarity between nodes In a triple pattern t1, if the subject/object node
belongs to class c1 in the RDFS ontology and is relaxated to class c2 using the
ontology we use the idea of Least Common Ancestor to compute the similarity
of the two triple patterns. The Least Common Ancestor denotes the depth of
the common ancestor superclass of the two classes from the root in the RDFS
ontology.</p>
        <p>score(c1; c2) =
2</p>
        <sec id="sec-2-1-1">
          <title>Depth(LCA(c1; c2)) depth(c1) + depth(c2)</title>
          <p>Similarity between predicates In a triple pattern t1, if the predicate
belongs to class p1 in the RDFS ontology and is relaxed to class p2 using the
ontology we use the idea of Least Common Ancestor to compute the similarity of
the two triple patterns similar to that done for subject/object nodes. The Least
Common Ancestor denotes the depth of the common ancestor superproperty of
the two classes from the root in the RDFS ontology.</p>
          <p>score(p1; p2) =
2</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>Depth(LCA(p1; p2)) depth(p1) + depth(p2)</title>
          <p>Similarity between triple patterns If the triple pattern t1-(s1; p1; o1)
is relaxed to t2-(s2; p2; o2) we aggregate the similarity scores of the triple
pattern constituents to compute the overall similarity score of relaxed triple pattern.
similarity(t1; t2) = score(s1; s2) + score(p1; p2) + score(o1; o2)</p>
          <p>Score of an answer The bindings of the relaxed SPARQL queries form the
answers to the original SPARQL query. Since the original query is relaxed in a
number of ways we need a measure to rank the relevant answers to ensure the
quality of results. Thus, we de ne the score of each relevant answer as the
similarity of its corresponding relaxed SPARQL query from which it is produced to
the original SPARQL query. The similarity between the two queries is obtained
by combining the similarities of the triple patterns in them. Suppose the answer
A is obtained from query Q0 (t01; t02; t03:::t0n) which was formed after the original
query Q(t1; t2; t3::::tn) was relaxed.</p>
          <p>score(A) = Pn</p>
          <p>i=1 similarity(ti; t0i)
3</p>
          <p>
            Query Processing Algorithms
[
            <xref ref-type="bibr" rid="ref10">10</xref>
            ] presents an approach to generate relaxed SPARQL queries from the
original SPARQL query using RDFS ontology. It produces many relaxed versions
and assigns scores to them based on the similarity to the original query. The
relaxed queries are then executed one by one sequentially in the descending
order of their scores to get ranked approximate answers. However, the SPARQL
queries generated have many query conditions in common. Therefore, the
sequential execution approach of all the queries involves needlessly fetching the
same data repeatedly. In this section we present an optimized query processing
algorithm where relaxed queries are generated and answered on-the- y during
query execution resulting in signi cant performance bene ts.
          </p>
          <p>
            Algorithm 1 describes the approach presented in [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ] that can be extended
to produce approximate answers in the web of linked data. Lines 2-7 denote the
steps taken to generate multiple relaxed queries. The relaxation procedure is
described as a graph, called a relaxation graph here. First the given query is put
as a root in the relaxation graph. Then each triple is relaxed one-by-one and the
new query produced as a result is inserted as a child node of the query node
in the relaxation graph that led to it being produced. Each triple relaxation is
accompanied by computing its relaxation score and this score is attached to its
corresponding relaxed query. This process is repeated till all possible relaxed
queries are generated. Lines 11-18 execute the relaxed queries produced earlier
sequentially one by one. To generate ranked approximate results and ensure
the quality of answers the relaxed queries are executed in the descending order
of their similarity scores with the original query. The relaxed query with the
maximum score is executed rst following which the next query to be executed
is chosen with the highest score amongst its children and so on.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Algorithm 1: Existing Approach</title>
      <p>Input : :Query Q</p>
      <p>Output: :Approximate answers
1 relaxationGraph =
2 Insert Q as root in relaxationGraph
3 while Q 6= do
4 foreach Triple ti in Q do
5 Relax ti to t0i
6 compute the score of approximation
7 Insert Q0i as a succeeding node of Q in relaxationGraph
8</p>
      <p>Q ( QsiblingNode or QsucceedingNode</p>
      <p>Figure 3 describes the execution of two queries of gure 2. The two queries
are generated from the query of gure 1 as described by algorithm 1. The query
in gure 1 nds the professors in the university who have authored a publication.
To get approximate answers, the query is relaxed by producing two queries in
which the query condition professor is replaced by faculty and persons. The left
box in gure 4 shows the execution of left query in gure 2 and similarly for the
box on the right. As we can see, many of the URIs dereferenced are the same
in both the cases. For both of them, the query execution takes place by rst
dereferencing the university's URI to retrieve its RDF graph. Then the details
of its publications publ1,publ2 and publ3 followed by its authors, John,Peter
and Mary, are fetched. The existing approach repeats this process twice for each
of the relaxed query when instead we can fetch the shared information once and
then perform the relaxation. This motivates us to integrate the approximation
process with the execution of the query and is described in algorithm 2.</p>
      <p>Algorithm 2 describes the proposed approach in this paper for e cient
approximate answering. Lines 3-16 repeat for each query predicate in the given
query. It begins with the seed, fetching its RDF graph. Then presence of the
query predicate is checked for in the fetched RDF graph. If it is present the
relaxation score for the predicate in the graph is given the maximum value of
1.0. Predicates belonging to di erent namespaces are assumed to be mapped
in accordance with the linked data principles. Otherwise, using the metrics
described in the earlier section the similarity score for each predicate in the RDF
graph is computed. The predicates are then sorted in the descending order of
their scores. The query execution proceeds by updating the seed with the object
URIs of the predicates, which are then dereferenced to retrieve their graphs.
Further similarities are computed and this process is repeated till a set of leaf
values are produced. The path from the root to the leaf values in lines 17-19
along the relaxed predicates gives the approximate answers.</p>
      <p>Figure 4 shows the query execution with the proposed approach for the query
in gure 1. The query execution takes place by fetching the university's details,
the details its publications and their authors just once. Once the publication's
authors details have been retrieved the third predicate checking whether the
person is of type professor can be relaxed to check for all people in the university
like lecturers and graduate students. Thus in e ect the relaxation mechanism has
been delayed to be performed on-the- y at run-time and by doing so the shared
data is not fetched repeatedly which results in signi cant performance bene ts.
4</p>
      <sec id="sec-3-1">
        <title>Optimizations</title>
        <p>The query processing described in the last section works by relaxing the query
on-the- y during query execution. This approach serves well to optimize the
query but there are further opportunities that arise during query execution that
can be exploited to optimize the query. To do so the vocabulary(RDFS/OWL)
describing the resources which gives the domains and ranges of various predicates
as well as the subclass/superclass hierarchy details of all classes is considered.
There are two cases that arise.</p>
        <p>Case1: If a predicate p is replaced by p0 with the subsequent predicate q,
and that range(p0 ) \ domain(q) is NULL the current relaxation of p is pruned
as it will not produce results. There may be a situation when the subsequent
predicate q is relaxed to q0 and range(p0 ) \ domain(q0 ) is not NULL in which
case some results are missed. Therefore, a minimum threshold for the score of
relaxation is maintained, and if the intersection of range(p0 ) \ domain(q) is
NULL the relaxation is pruned only if the score is below the threshold.</p>
        <p>Case2: If an object o is replaced by o0 with the subsequent predicate q, and
that o0 \ domain(q) is NULL the current relaxation can be pruned as it will not
produce results. There may be a situation when the subsequent predicate q is
relaxed to q0 and o0 \ domain(q0 ) is not NULL in which some results are missed.
Therefore, a minimum threshold is maintained for the score of relaxation, and if
the intersection of o0 \ domain(q) is NULL the relaxation is pruned only if the
score is below the threshold.</p>
        <p>Algorithm 2: Proposed Approach</p>
        <p>Input : :Query Q</p>
        <p>Output: :Approximate answers
1 let be the threshold
2 seed = intial set of uris
3 foreach queryP redicatek in Q do
4 while seed 6= do
5 foreach seedi do
6 Dereference seedi and retrieve its RDF graph R
7 Remove seedi from seed
8 foreach predicate pj in R do
9 if pj matches the corresponding query predicate
queryP redicatek then
relaxScore(pj) = 1
if pjobject isbound then</p>
        <p>compute relaxScore(pjobject ) with queryP redicatekobject
13
14</p>
        <p>compute the relaxScore(pj) with queryP redicatek
Sort all pj in the descending order of their relaxScores.
foreach pj do
if relaxScore(pj) &gt; then
if pjobject isnotbound then</p>
        <p>seed ( seed [ pjobject
foreach seedi in seed do</p>
        <p>Retrieve the path p from seedi to root</p>
        <p>Return p as the approximate answer</p>
        <p>Figures 5 illustrates the two cases. The gure on the left shows the query
during whose execution the predicate "worksForUniversity" is relaxed to
"worksFor". If there is a predicate "worksForCompany" in the retrieved RDF graph
of the entity and as it is a subproperty of "worksFor" the query condition is
relaxed to "worksForCompany". But the domain of the predicate succeeding it,
that is "hasNumberOfStudents", is the class of universities whereas the range
of the predicate "worksinCompany" is the class of Companies whose
intersection is NULL. Thus this relaxation is pruned. But there is a possibility that the
relaxation of the next predicate is from "hasNumberofStudents" to
"hasnumberofEmployees". In which case the domain of the new relaxed predicate is the
class of companies whose intersection with the range of earlier relaxed
predicate is again the class of companies. Hence, if the rst relaxation had not been
discarded results could have been produced. To handle this situation, the score
of relaxation is taken into account. If the score is above a certain prede ned
threshold, the relaxation is allowed and the query execution proceeds as usual.
The gure on the right shows the query during whose execution the object node
"paper" is relaxed to the class of "books". However, the next predicate
"publishedinConference" has the class of papers as it domain. Hence, the relaxation
to class of books produces a NULL set and can be pruned.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Algorithm 3: Optimizations</title>
      <p>Input : :Query Q</p>
      <p>Output: :Decision on whether to continue with current approximation
1 let t denote the triple being handled, which is approximated to t0
2 let q be the predicate succeeding t
3 let be the threshold on the score of approximation
4 if predicate p is relaxed to p0 then
5 if range(p0 ) \ domain(q) == NULL then
6 if score(t) &lt; then
7 try di erent relaxation of p
9
10
11
8 if object node o is relaxed to o0 then
if o0 \ domain(q) == NULL then
if score(t) &lt; then</p>
      <p>try di erent relaxation of o
5</p>
      <sec id="sec-4-1">
        <title>Experiments</title>
        <p>
          The experiments were conducted on a Pentium 4 machine running windows XP
with 1 GB main memory. All the programs were written in Java. The synthetic
data used for the simulations was generated with the LUBM benchmark data
generator [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. The LUBM benchmark is basically an university ontology that
describes all the constituents of a university like its faculty,courses,students etc.
The synthetic data is represented as a web of linked data with 200,890 nodes
denoting entities and 500,595 edges denoting the relationships between them.
The e cacy of the proposed idea was demonstrated by executing a set of queries
in gure 6 used in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] on the simulated web of linked data of a university and
comparing the results with the existing approach. Each of the query below can be
relaxed in a number of ways and the existing approach generates relaxed queries
and executes them sequentially one by one whereas in contrast the proposed
approach integrates the process of relaxation with the query execution to produce
approximate answers. The time taken to execute the query is proportional to the
number of URIs resolved to fetch their RDF descriptions during the course of
query execution. Therefore, this paper uses the reduction in the number of URIs
fetched as a metric to judge the results as the web of linked data was simulated
on a single machine.
        </p>
        <p>Query 1 searches for the teaching assistants of a particular course who have
a masters degree from a particular university. Approximate answers are
generated by relaxing the constraints step by step on the teaching assistant that is
the teaching assistant can handle any course and have a master's degree from
any university. Query 2 searches for assistant professors who teach a graduate
course. Approximate answers are produced by relaxing the conditions in steps
to look for all faculty who teach any course. Query 3 looks for assistant professor
advisors who have a particular research interest. The query is again relaxed in
steps by searching for all the people in the university who have any research
interest. Query 4 searches for advisors who are professors and work for a
particular university. Approximate answers are produced by looking for advisors who
can be any type of faculty and who work for any university. Query 5 searches
for professors who have authored a journal article. Approximate answers are
produced step by step by looking for all persons including graduate students
who have authored any type of paper. Figure 7 shows the number of URIs of
entities dereferenced with the existing and the proposed approaches. The query
performance improves by 75% for query 1, 80% for query 2, 83% for query 3,
78% for query 4 and 67% for query 5.
6</p>
      </sec>
      <sec id="sec-4-2">
        <title>Conclusions And Future Work</title>
        <p>The paper presented an approach towards allowing approximate querying of the
web of linked data. The proposed idea produces approximate answers by
relaxing the query conditions on-the- y during query execution using the ontologies
available on the web of linked data, in contrast with existing approach which
generates multiple relaxed queries and executes them sequentially. The
advantage of proposed approach is that it is able to avoid fetching the shared data
between the relaxed queries repeatedly, which results in signi cant performance
bene ts as shown in the experiments. Future work includes investigation of other
schemes, like top-k systems, towards producing approximate answers.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Franklin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>From databases to dataspaces: A new abstraction for information management</article-title>
          .
          <source>SIGMOD Record</source>
          <volume>34</volume>
          (
          <year>2005</year>
          )
          <volume>27</volume>
          {
          <fpage>33</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Linked data { the story so far</article-title>
          .
          <source>International Journal on Semantic Web and Information Systems</source>
          <volume>5</volume>
          (
          <issue>3</issue>
          ) (
          <year>2009</year>
          )
          <volume>1</volume>
          {
          <fpage>22</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Linked data - design issues</article-title>
          .
          <source>web page</source>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Prud'hommeaux</surname>
          </string-name>
          , E.,
          <string-name>
            <surname>Seaborne</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>SPARQL query language for RDF. W3C recommendation</article-title>
          ,
          <source>World Wide Web Consortium</source>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Tummarello</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Delbru</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oren</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Sindice.com: Weaving the open linked data</article-title>
          . (
          <year>2008</year>
          )
          <volume>552</volume>
          {
          <fpage>565</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Finin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joshi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cost</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peng</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reddivari</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Doshi</surname>
            ,
            <given-names>V.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sachs</surname>
          </string-name>
          , J.:
          <article-title>Swoogle a semantic web search and metadata engine</article-title>
          .
          <source>In: Proc. 13th ACM Conf. on Information and Knowledge Management. (Nov</source>
          .
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Larson</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          :
          <article-title>Federated database systems for managing distributed, heterogeneous, and autonomous databases</article-title>
          .
          <source>ACM Comput. Surv</source>
          .
          <volume>22</volume>
          (
          <issue>3</issue>
          ) (
          <year>1990</year>
          )
          <volume>183</volume>
          {
          <fpage>236</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Quilitz</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leser</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Querying distributed rdf data sources with sparql</article-title>
          . In Hauswirth,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Koubarakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Bechhofer</surname>
          </string-name>
          , S., eds.
          <source>: Proceedings of the 5th European Semantic Web Conference. LNCS</source>
          , Berlin, Heidelberg, Springer Verlag (
          <year>June 2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hartig</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Freytag</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          :
          <article-title>Executing SPARQL queries over the web of linked data</article-title>
          .
          <source>In: ISWC 2009: Proceedings of the 8th International Semantic Web Conference</source>
          , Chantilly,
          <string-name>
            <surname>VA</surname>
          </string-name>
          , USA. (
          <year>2009</year>
          )
          <volume>293</volume>
          {
          <fpage>309</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , Liu,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <surname>X.</surname>
          </string-name>
          :
          <article-title>Computing relaxed answers on rdf databases</article-title>
          . In Bailey, J.,
          <string-name>
            <surname>Maier</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schewe</surname>
            ,
            <given-names>K.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thalheim</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
          </string-name>
          , X.S., eds.
          <source>: WISE</source>
          . Volume
          <volume>5175</volume>
          of Lecture Notes in Computer Science., Springer (
          <year>2008</year>
          )
          <volume>163</volume>
          {
          <fpage>175</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , He in, J.:
          <article-title>Lubm: A benchmark for owl knowledge base systems</article-title>
          .
          <source>J. Web Sem</source>
          .
          <volume>3</volume>
          (
          <issue>2</issue>
          -3) (
          <year>2005</year>
          )
          <volume>158</volume>
          {
          <fpage>182</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>