<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Initial Usage Analysis of DBpedia's Triple Pattern Fragments</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ruben Verborgh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Erik Mannens</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rik Van de Walle</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ghent University - iMinds</institution>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Queryable Linked Data is available through several interfaces, including SPARQL endpoints and Linked Data documents. Recently, the popular DBpedia dataset was made available through a Triple Pattern Fragments interface, which aims to improve query availability by dividing query execution between clients and servers. In this paper, we present an initial usage analysis of this interface, based on 4 months of usage data of the English DBpedia Triple Pattern Fragments interface and on availability data measured by an external party (Pingdom). In 4 months' time, the server had an availability of 99.999%, handling 4,455,813 requests, more than a quarter of which were served from cache. These numbers provide promising evidence that Triple Pattern Fragments are a viable strategy for live applications on top of public queryable datasets.</p>
      </abstract>
      <kwd-group>
        <kwd>Linked Data</kwd>
        <kwd>Linked Data Fragments</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        DBpedia [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is currently the best-known dataset within the Semantic Web
community. It consists of hundreds of millions of RDF triples automatically generated from
the free Wikipedia encyclopedia. Such large Linked Datasets come with important
challenges, most prominently: how do we provide scalable queryable access to them? The
traditional answer has been to set up a public SPARQL endpoint [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], but such endpoints
suffer from low availability rates [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Yet reliable access is a prerequisite to build
applications on top of a queryable DBpedia interface.
      </p>
      <p>
        In October 2014, the DBpedia community opened a Triple Pattern Fragments interface
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] maintained by the authors of this paper. This interface is designed to allow high
availability on the server side, while still enabling live querying on the client side. Queries
take more time and bandwidth, because they are mostly executed by the client, but the
timings are consistent so that building applications on top of a public DBpedia interface
becomes realistic.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>
        In this section, we will briefly discuss existing Web APIs to Linked Datasets. Linked Data
Fragments (LDF, [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]) were introduced as a uniform view to capture the characteristics of
any Linked Data Web API. The common aspect of all interfaces is that, in one way or
another, they offer specific parts of a dataset. Each part is referred to as a Linked Data
Fragment, consisting of:
data: the triples of the dataset that match an interface-specific selector;
metadata: triples that describe the fragment itself;
controls: hyperlinks and/or hypermedia forms that lead to other fragments.
      </p>
      <sec id="sec-2-1">
        <title>File-based datasets</title>
        <p>So-called data dumps are conceptually the simplest APIs: the data consists of all
triples in the dataset. They are combined into a (usually compressed) archive and
published at a single URL. Sometimes the archive contains metadata, but controls—with
the possible exception of HTTP URIs in RDF triples—are not present. Query execution is
the clients' responsibility.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Linked Data documents</title>
        <p>
          Datasets published through the Linked Data principles [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] are available as individual
documents per subject, which can be retrieved by performing an HTTP GET request on the
subject's URL (“dereferencing”). Each such document is a fragment, in which the data
consists of triples related to that subject, the metadata set might contain properties such
as author and publication data, and the controls consist of links to other Linked Data
documents. Querying is possible through strategies such as link traversal [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>SPARQL endpoints</title>
        <p>
          SPARQL endpoints [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] allow executing SPARQL queries [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] on a dataset through HTTP.
A SPARQL fragment's data consists of triples matching the query (assuming the CONSTRUCT
form); the metadata and control sets are empty. Query execution is performed entirely by
the server, and because each client can ask highly individualized requests, the reusability
of fragments is low. This, combined with complexity of SPARQL query execution, likely
contributes to the low availability of public SPARQL endpoints [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>Triple Pattern Fragments</title>
        <p>
          The Triple Pattern Fragments interface [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] has been designed to minimize
server-side processing, while at the same time enabling efficient live querying on the client side.
A fragment's data consists of all triples that match a specific triple pattern, and can
possibly be paged. Each fragment page mentions the estimated total number of matches
to allow for query planning, and contains hypermedia controls to find all other Triple
Pattern Fragments of the same dataset. Since requests are less individualized, fragments
are more likely to be reused across clients, which increases the benefits of caching [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
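        <p>As an illustration of the fragment model described above, the following minimal Python sketch selects the triples that match a triple pattern and returns them one page at a time, together with a match count. It is not the server's implementation: the in-memory list, the function names, the page size and the placeholder resources are assumptions made for this example, and the count here is exact whereas the interface only promises an estimate.</p>
        <preformat><![CDATA[
```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


def matches(triple, pattern):
    """A pattern component of None acts as a variable and matches anything."""
    return all(p is None or p == t for t, p in zip(triple, pattern))


def fragment_page(triples, pattern, page=1, page_size=2):
    """Return one page of a fragment plus the total match count as metadata."""
    matching = [t for t in triples if matches(t, pattern)]
    start = (page - 1) * page_size
    return {
        "data": matching[start:start + page_size],
        "total_matches": len(matching),
        "has_next_page": start + page_size < len(matching),
    }


# Hypothetical placeholder data, not actual DBpedia triples.
DATASET = [
    Triple("dbr:A", "dbo:birthPlace", "dbr:Slovenia"),
    Triple("dbr:B", "dbo:birthPlace", "dbr:Slovenia"),
    Triple("dbr:C", "dbo:birthPlace", "dbr:Slovenia"),
    Triple("dbr:A", "rdf:type", "dbo:Person"),
]

page = fragment_page(DATASET, (None, "dbo:birthPlace", "dbr:Slovenia"))
```
]]></preformat>
        <p>A client can use the match count to decide which pattern of a larger query to evaluate first, and the next-page indicator plays the role of a hypermedia control.</p>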
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Deployment and analysis setup</title>
      <sec id="sec-3-1">
        <title>Server specifications</title>
        <p>The official DBpedia Triple Pattern Fragments interface is hosted on a virtual machine from
the Amazon Elastic Compute Cloud (EC2). We opted for a c3.2xlarge machine
configuration, which has the following characteristics:
virtual CPUs: 8;
memory: 15 GB;
hard disk space: 2 × 80 GB;
price: $0.478 per hour (dedicated instance in Ireland).</p>
        <p>We would like to stress that the above specifications are higher than our purpose requires;
as a result, the server is currently mostly idle. However, Amazon does
not allow customization of machines, and while lighter configurations exist, they come with
lower disk throughput and/or bandwidth.</p>
        <sec id="sec-3-1-1">
          <title>The machine runs the following software:</title>
          <p>
            operating system: Ubuntu Linux 14.04 LTS
Web server: nginx 1.4.6
application server: Linked Data Fragments server 1.1.4 on top of Node.js 0.10.36
The nginx server acts as a reverse proxy and cache. All requests first reach nginx, which
checks whether a response is present in the cache based on a unique identifier consisting
of the request URI and the value of the HTTP Accept header. If so, it is sent to the client; if
not, the request is forwarded to the application server. The application server then parses
the request, and retrieves the DBpedia data from an HDT file [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ] that is loaded into
memory. It is then serialized in a format according to the Accept header, sent to the client,
and stored in the cache.
          </p>
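          <p>The request flow described above can be sketched in a few lines of Python. This is only an illustration of a cache keyed on the request URI and the Accept header; it is not the nginx configuration used on the server, and all function names are assumptions made for this example.</p>
          <preformat><![CDATA[
```python
def make_cached_handler(render):
    """Wrap `render(uri, accept)` with a cache keyed on (URI, Accept)."""
    cache = {}

    def handle(uri, accept):
        key = (uri, accept)  # same fragment in another format is a separate entry
        if key in cache:
            return cache[key], "HIT"
        body = render(uri, accept)  # stands in for the application server
        cache[key] = body
        return body, "MISS"

    return handle


calls = []


def render(uri, accept):
    """Hypothetical application server: records each call it receives."""
    calls.append((uri, accept))
    return f"{accept} representation of {uri}"


handle = make_cached_handler(render)
first = handle("/2014/en", "text/turtle")
second = handle("/2014/en", "text/turtle")
third = handle("/2014/en", "text/html")
```
]]></preformat>
          <p>In this sketch, the second request never reaches the application server, whereas the third one does because it asks for a different representation of the same URI.</p>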
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Analysis setup</title>
        <p>All incoming requests are logged line by line in a file by the nginx Web server. Note that
logging does not happen on the application server, as this server only receives those
requests that are not handled by the cache. Each log line contains the following fields:
 client IP address
 request URI
 value of the Accept header
 value of the Referer header
 value of the User-Agent header
 local server time
 response size
 response cache status
 response HTTP status code</p>
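        <p>To make the structure of such a log entry concrete, the sketch below models the nine fields as a Python record and counts successful requests per user agent. The tab-separated layout, the field names and the sample lines are assumptions made for this example; the actual log format of the server may differ.</p>
        <preformat><![CDATA[
```python
from collections import Counter
from typing import NamedTuple


class LogEntry(NamedTuple):
    client_ip: str
    request_uri: str
    accept: str
    referer: str
    user_agent: str
    local_time: str
    response_size: int
    cache_status: str
    status_code: int


def parse_line(line):
    """Parse one tab-separated line (hypothetical layout, fields in listed order)."""
    ip, uri, accept, referer, agent, time, size, cache, status = (
        line.rstrip("\n").split("\t")
    )
    return LogEntry(ip, uri, accept, referer, agent, time, int(size), cache, int(status))


def requests_per_user_agent(lines):
    """Count successful (HTTP 200) requests per User-Agent value."""
    entries = (parse_line(line) for line in lines)
    return Counter(e.user_agent for e in entries if e.status_code == 200)


# Invented sample lines using documentation-only IP addresses.
SAMPLE = [
    "192.0.2.1\t/2014/en\ttext/turtle\t-\tclient-a\t2014-11-01T00:00:00\t512\tMISS\t200\n",
    "192.0.2.2\t/2014/en\ttext/html\t-\tclient-b\t2014-11-01T00:00:01\t2048\tHIT\t200\n",
    "192.0.2.1\t/2014/xx\ttext/turtle\t-\tclient-a\t2014-11-01T00:00:02\t128\tMISS\t404\n",
]
counts = requests_per_user_agent(SAMPLE)
```
]]></preformat>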
        <sec id="sec-3-2-1">
          <title>The resulting access logs are hosted publicly.</title>
          <p>Additionally, the availability of the HTTP interface is monitored by the external third-party
service Pingdom, because public availability can of course not reliably be monitored by the
Web server itself. Pingdom performs an HTTP request once every minute for the ?s
rdf:type ?o fragment and notes whether a response was successfully received. If no
timely response arrives, the server is assumed to be unavailable. The results are
available in an online interface.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Usage analysis</title>
      <p>In this section, we look for answers to the following basic usage questions:</p>
      <sec id="sec-4-1">
        <title>How many requests were issued?</title>
        <p>Which clients made these requests?
What types of content were those clients interested in?
Where did the requests originate from?
What kind of triple patterns were requested?
How effective has the cache been?
What period of time was the server (un-)available?</p>
        <p>We focused on requests with an HTTP 200 OK response only, in order to remove the (very
minimal) noise from invalid requests against the interface.</p>
        <sec id="sec-4-1-1">
          <title>Number of requests</title>
          <p>The server logs reveal a total of 4,455,813 requests for Triple Pattern Fragments of the
English DBpedia version (URLs starting with http://fragments.dbpedia.org/2014/en)
during the four considered months, or an average of 1,113,953 requests per month.
November 2014 was responsible for twice the average monthly traffic. The majority of this
initial traffic originates from a machine within Ghent University (recognizable by the
157.193.0.0/16 IP address block), which was used to stress test the new server and
measure the execution times of various simple and complex queries. In the other months,
the traffic was much more varied.</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>User agents</title>
          <p>By extracting and parsing the value of the HTTP User-Agent header sent by clients, we
were able to see what kinds of clients were interested in DBpedia's Triple Pattern
Fragments. The vast majority of requests (70.1%) were performed by the Node.js Triple
Pattern Fragments client, which executes SPARQL queries by requesting triple patterns.
This client can either be used in a standalone manner, or as a library for other
applications. We cannot distinguish between access made by the standalone client and
access made by other software packages that use the client as a library. To mitigate this
in the future, we should suggest that such other software packages use their own user
agent identifier.</p>
          <p>
            The second and third most active clients were crawlers from the search engines Google
and Baidu respectively. This is especially remarkable because it contrasts with SPARQL
endpoints, which belong to the so-called “deep Web” [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ]: in order to access data, a user
must write a SPARQL query in an HTML form. The only SPARQL endpoint resources that
are accessible on the Web are SPARQL queries that are explicitly linked from another page
(such as this one). While the Triple Pattern Fragments specification only demands the
presence of a hypermedia form (which would thus also hide fragments in the deep Web),
the server implementation explicitly links to relevant fragments. For instance, the subjects
born in Slovenia fragment links to fragments for the birthplace predicate, Slovenia, and all
individual subjects born in Slovenia. This allows people and crawlers to browse the
interface similarly to how Linked Data documents are navigated. An added value of Triple
Pattern Fragments is that all resources can be followed within the interface, not only those
resources that share the URI space of the current document (as is the case with Linked
Data documents).
          </p>
          <p>The statistics also reveal browser usage, mainly through the Chrome browser. The Accept
header tells us that, out of 218,004 requests by various Chrome versions, 216,828 were
performed by the in-browser version of the Triple Pattern Fragments client during the
execution of SPARQL queries; 782 requests consist of HTML pages viewed by humans.
Note the 172,785 requests performed by the Pingdom bot to monitor the availability of the
interface. Not shown in the graph are 53,848 requests from Apache-HttpAsyncClient,
which likely originate from the early-stage Java Triple Pattern Fragments client. Finally,
the Perl client deserves an honorable mention with 19,287 requests.</p>
        </sec>
        <sec id="sec-4-1-3">
          <title>Requested content types</title>
          <p>The Triple Pattern Fragments interface exposes the same fragments through the same
URLs, regardless of content type. This is achieved through HTTP content negotiation. For
instance, the fragment “subjects born in Slovenia” has the URL
http://fragments.dbpedia.org/2014/en?predicate=http%3A%2F%2Fdbpedia.org%2Fontology%2FbirthPlace&amp;object=http%3A%2F%2Fdbpedia.org%2Fresource%2FSlovenia.
In order to retrieve an HTML
representation, a client should send an HTTP GET request with an Accept header that
prefers HTML; the same goes for other content types such as Turtle or JSON. We analyzed
the requested representations by looking at clients' most preferred options.
Since each type of client usually consumes a specific format, the distribution of clients
strongly influences the requested content types. It is therefore expected that the content
type requested by the Triple Pattern Fragments client (both standalone and in-browser)
prevails. We indeed see that the majority of requests (58.8%) have a preference for Turtle,
which used to be this client's preferred format up to version 1.2.1. From version 1.2.2
onwards, support for the quad-based serialization format TriG was added, which uses
graphs to separate fragment data from metadata and controls. Hence, we also see a large
number (20.3%) of TriG requests, and this format is expected to dominate in the future as older
client versions disappear.</p>
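          <p>The following Python sketch shows how the example URL above can be assembled from a triple pattern, and how a request preferring a given content type can be prepared. The helper names and the <monospace>subject</monospace> parameter name are assumptions made for this example; only the <monospace>predicate</monospace> and <monospace>object</monospace> parameters appear in the URL above. A real client would normally discover the URL pattern through the hypermedia controls of a fragment instead of hard-coding it.</p>
          <preformat><![CDATA[
```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://fragments.dbpedia.org/2014/en"


def fragment_url(subject=None, predicate=None, obj=None):
    """Build a fragment URL from the bound components of a triple pattern."""
    params = {"subject": subject, "predicate": predicate, "object": obj}
    bound = {name: value for name, value in params.items() if value is not None}
    return BASE + "?" + urlencode(bound) if bound else BASE


def fragment_request(url, accept="text/turtle"):
    """Prepare (but do not send) a GET request that prefers the given media type."""
    return Request(url, headers={"Accept": accept})


url = fragment_url(
    predicate="http://dbpedia.org/ontology/birthPlace",
    obj="http://dbpedia.org/resource/Slovenia",
)
request = fragment_request(url, accept="text/html")
```
]]></preformat>
          <p>Because the URL stays the same for every representation, only the Accept header changes when a client asks for Turtle, TriG or HTML.</p>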
          <p>Browsers of course prefer HTML variants for displaying to humans. Some clients (mostly
crawlers) indicated they had no specific preference (*/*). The Pingdom bot does not
indicate any Accept header and is the major contributor to the none category. Other
requested formats include JSON (for which JSON-LD representations were returned) and
other RDF formats such as N-Quads and N-Triples.</p>
        </sec>
        <sec id="sec-4-1-4">
          <title>Geographic location</title>
          <p>In order to determine the geographic origins of requests, we performed automated
lookups on the client IP addresses. As indicated earlier, stress tests by Ghent University
were performed during the interface's first weeks. This is visible in the large portion
originating from Belgium (56.2%). 188,891 Belgian requests (4.24% of all requests) did
not originate from within Ghent University. After France, clients from the United States
and China were popular visitors, mostly through search engine crawlers.</p>
          <p>In total, the interface received traffic from 47 countries, 17 of which sent at least
1,000 requests.</p>
        </sec>
        <sec id="sec-4-1-5">
          <title>Requested triple patterns</title>
          <p>If an interface allows highly specific queries, like SPARQL endpoints do, we expect a great
variety of requests on the server side. This variety also gives detailed insight into the kind of
goals clients have. Since the Triple Pattern Fragments interface is deliberately
simpler, we expect to see more repeated queries, but less insight into how these smaller
queries contribute to a goal for the client.</p>
          <p>Unsurprisingly, the fragment requested every minute by Pingdom (?s rdf:type ?o) as
part of its availability monitoring process is most popular. This is closely followed by the all
fragment. Since this is the most generic fragment of the dataset, clients in practice often
use it to start their more complex process; i.e., it is the first form they fill out. However,
any fragment can in theory be used as a starting point, as the Triple Pattern Fragments
specification requires all of them to contain the same hypermedia controls. Since the all
fragment is a straightforward starting point, the 153,214 requests to this fragment
give a rough indication of the total number of SPARQL queries that were executed by
clients. As clients might also perform other tasks, this number is likely inaccurate (and
it will only become less reliable as more clients are developed). For instance, the Referer
header values reveal that 8,955 requests originated from the UDUVUDU DBpedia Viewer,
which visualizes topics from DBpedia. Finally, we note a high number of &lt;s&gt;
rdfs:subClassOf ?o requests for specific instances of &lt;s&gt;. They were caused by the stress
testing queries we issued, which contained rdfs:subClassOf constructs.</p>
          <p>Other than the three cases above, no obvious patterns were found in the requests.</p>
        </sec>
        <sec id="sec-4-1-6">
          <title>Cache effectiveness</title>
          <p>A premise of the Triple Pattern Fragments interface is that clients partly reuse the same
fragments to achieve different but similar goals. With SPARQL endpoints, clients instead
send highly specialized requests; overlapping information between them cannot be reused
on the HTTP interface level. With Triple Pattern Fragments, the number of unique requests
is smaller, so the cache can work more effectively.</p>
          <p>The nginx reverse proxy server has been configured to cache requested fragments for
a maximum time of 1 hour. Uniqueness of requests is determined by a combination of
URL and Accept header. As such, the Triple Pattern Fragments server generates each
unique response at most once per hour; all subsequent requests are handled by the
cache. Furthermore, the proxy server sets the expiration date of responses to 7 days in
the future. Clients that have a built-in cache themselves, such as browsers, are thereby
suggested to only repeat a request for a resource after a week. Note that the standalone
client does not have a persistent cache; therefore, each invocation of that client results in
new resource accesses.</p>
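          <p>The two time limits mentioned above can be illustrated with a short Python sketch: a one-hour lifetime inside the proxy cache and a seven-day expiration advertised to clients. The status labels follow the vocabulary nginx uses for cache status, but the functions themselves are assumptions made for this example and do not reproduce the proxy's internal logic.</p>
          <preformat><![CDATA[
```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

SERVER_TTL = timedelta(hours=1)    # how long the proxy reuses a stored response
CLIENT_EXPIRY = timedelta(days=7)  # expiration advertised to clients


def cache_status(stored_at, now, bypass=False):
    """Classify a request the way a TTL-based proxy cache would."""
    if bypass:
        return "BYPASS"
    if stored_at is None:
        return "MISS"
    return "HIT" if now - stored_at <= SERVER_TTL else "EXPIRED"


def expires_header(now):
    """HTTP-date for the Expires header, 7 days after `now` (must be UTC)."""
    return format_datetime(now + CLIENT_EXPIRY, usegmt=True)


now = datetime(2014, 11, 1, 12, 0, tzinfo=timezone.utc)
status_fresh = cache_status(now - timedelta(minutes=30), now)
status_stale = cache_status(now - timedelta(hours=2), now)
```
]]></preformat>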
          <p>In total, 28.1% of responses were served from the nginx cache. This means that between
a quarter and a third of all responses were needed again by the same client or other
clients within the hour. A minority of 3.5% had been present in the cache for longer than
1 hour, so new versions needed to be fetched. Finally, a few requests (1,278) explicitly
asked to bypass the cache. So while the majority of requests was not cached, the caching
mechanism was able to reduce the load on the application server by 28.1%. Since the
dataset in this case is static, and the number of fragments finite, we could set a higher (or
even infinite) cache timeout. At the moment, however, there is no necessity to do so.</p>
          <p>One of the main goals of the Triple Pattern Fragments interface is to maximize availability,
in order to allow building applications on public, live queryable Linked Data sources.
During the period of November 2014 to February 2015, a fragment was retrieved from the
server every minute to verify availability. This amounts to a total of 120 days ×
1,440 minutes per day = 172,800 minutes. According to Pingdom, only 1 of those
requests did not receive a timely answer; this occurred on 2014/11/26 at 3:52pm CET.
What exactly happened at this moment is unclear; nothing particular shows up in the
server logs at this point, except an unexpected gap between two Pingdom requests at
14:50 and 14:52 UTC (server time). In the absence of any evidence of a software malfunction
at that moment, we presume it was an Amazon-related outage.</p>
          <p>In any case, the above allows us to calculate the availability during the observed
period of 4 months precisely. Dividing the minutes of availability by the total number of minutes
gives 172,799 / 172,800 = 99.99942…%. This amounts to an availability level of “5
nines”, which corresponds to at most about 26 seconds of downtime per month on average (assuming months of 30
days).</p>
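          <p>The arithmetic behind these figures can be checked with a few lines of Python. The variable names are chosen for this example; the dates and the single failed check are taken from the text above.</p>
          <preformat><![CDATA[
```python
from datetime import date

# Observation period: November 2014 up to and including February 2015.
start, end = date(2014, 11, 1), date(2015, 3, 1)  # end is exclusive
total_minutes = (end - start).days * 24 * 60
failed_checks = 1  # reported by Pingdom
availability = (total_minutes - failed_checks) / total_minutes

# Downtime allowed by "five nines" (99.999%) over a 30-day month, in seconds.
seconds_per_month = 30 * 24 * 60 * 60
five_nines_budget = seconds_per_month / 100_000

print(f"{total_minutes} minutes, availability {availability:.7%}")
print(f"five nines allows {five_nines_budget:.2f} s of downtime per month")
```
]]></preformat>
          <p>The period spans 120 days, so one failed check out of 172,800 leaves the measured availability slightly above the five-nines threshold.</p>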
          <p>Note that the total number of requests logged from Pingdom (172,785 as indicated above)
is 15 short of the expected total of 172,800. Since Pingdom did not report any other
outages, we are unsure about the cause. Slightly incomplete logging could be
a straightforward explanation, for instance, if Pingdom dropped the connection before the
full response was received.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>When the official Triple Pattern Fragments interface for DBpedia was released, we mostly
heard three types of questions:</p>
      <sec id="sec-5-1">
        <title>1. Will this interface be used?</title>
        <p>2. If so, how will clients use it?
3. Will the availability of this interface be sufficient for live application usage?</p>
        <p>The analysis in this paper allows us to formulate a preliminary answer to all three of
them.</p>
        <p>First of all, the interface has indeed been used, as evidenced by more than 4 million
requests in the course of its first 4 months. Most of this usage came from the client-side
SPARQL query executor we previously built for the Triple Pattern Fragments interface, but
we also saw third-party clients such as a Perl client and a DBpedia viewer. Search engine
crawlers also consumed the interface with ease. Relatively few people browsed the
interface directly, as it is of course targeted at machines. It does raise the question
whether it makes sense to improve accessibility for people. Client IP addresses from
47 countries show that usage is spreading geographically.</p>
        <p>
          Second, while the analysis provides us with some insights about how the interface is used,
more high-level patterns are absent. On the one hand, this is a blessing for privacy:
clients only ask generic questions, and they themselves can combine this to answers for
more complex questions in any way they see fit. On the other hand, it makes it harder to
understand what kind of usage is popular, and for which use cases we could or might need
to optimize. This process could be facilitated if we explicitly ask clients to provide
feedback [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. For now, we are in the dark as to precisely what SPARQL queries—and other
tasks—clients have executed. Having more information would allow us to compare this
with, for instance, the logs of the public DBpedia SPARQL endpoint. At the same time, we
should realize that not all clients of Triple Pattern Fragments interfaces necessarily have
the evaluation of SPARQL queries as a task or subtask.
        </p>
        <p>Third, the 99.999% availability of the server removes any doubt that the Triple Pattern
Fragments interface is sufficiently reliable for live applications. We must, however, remark
two things here. While 4 million requests is a large quantity for a young interface, it is still
nowhere near full capacity. The server is still mostly idling, so in order to really find out its
limits, more requests are necessary. Also, the number of requests cannot be compared to
that of a SPARQL endpoint, as in many cases, more requests are necessary to achieve the
same goal. When talking about availability, we therefore need to mention expressivity too.
The goal of the Triple Pattern Fragments interface is to reliably balance both.</p>
        <p>Our conclusion is that applications now have a reliable interface to query the public
DBpedia dataset. Therefore, we seem to have overcome one of the main obstacles that
could hold developers back from building applications on top of live Linked Data. An important
question remains: is this enough? Now that reliable access is possible, what excuses
remain for not building intelligent Linked Data clients? It seems the next move should be
made by application developers, given that the data and the tools are now really there,
99.999% of the time. We should keep our eyes, ears, and minds open to the demands of this
community to help evolve the concept of Semantic Web applications from vision to reality.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>The described research activities were funded by Ghent University, iMinds, the Institute for
the Promotion of Innovation by Science and Technology in Flanders, the Fund for Scientific
Research Flanders, and the European Union.</p>
      <p>Pingdom graciously provided us with availability monitoring. The geographic analysis was
performed using GeoLite data created by MaxMind. Special thanks to Dimitris Kontokostas
from the DBpedia Association for giving us the opportunity to host DBpedia as Triple
Pattern Fragments.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Linked Data - the story so far</article-title>
          .
          <source>International Journal on Semantic Web and Information Systems</source>
          <volume>5</volume>
          (
          <issue>3</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          (
          <year>Mar 2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobilarov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Becker</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>DBpedia - a crystallization point for the web of data</article-title>
          .
          <source>Journal of Web Semantics</source>
          <volume>7</volume>
          (
          <issue>3</issue>
          ),
          <fpage>154</fpage>
          -
          <lpage>165</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Buil-Aranda</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Umbrich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vandenbussche</surname>
            ,
            <given-names>P.Y.</given-names>
          </string-name>
          :
          <article-title>SPARQL Web-querying infrastructure: Ready for action?</article-title>
          In:
          <source>Proceedings of the 12th International Semantic Web Conference</source>
          (
          <year>Nov 2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Feigenbaum</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>G.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>K.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Torres</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>SPARQL 1.1 protocol</article-title>
          . Recommendation,
          <source>World Wide Web Consortium</source>
          (
          <year>Mar 2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Fernández</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martínez-Prieto</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gutiérrez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polleres</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arias</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Binary RDF representation for publication and exchange (HDT)</article-title>
          .
          <source>Journal of Web Semantics</source>
          <volume>19</volume>
          ,
          <fpage>22</fpage>
          -
          <lpage>41</lpage>
          (Mar
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seaborne</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>SPARQL 1.1 query language</article-title>
          .
          Recommendation,
          <source>World Wide Web Consortium</source>
          (Mar
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hartig</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>An overview on execution strategies for Linked Data queries</article-title>
          .
          <source>Datenbank-Spektrum</source>
          <volume>13</volume>
          (
          <issue>2</issue>
          ),
          <fpage>89</fpage>
          -
          <lpage>99</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Madhavan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ko</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kot</surname>
            ,
            <given-names>Ł.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ganapathy</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rasmussen</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halevy</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Google's Deep Web crawl</article-title>
          .
          <source>Proceedings of the VLDB Endowment</source>
          <volume>1</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>1241</fpage>
          -
          <lpage>1252</lpage>
          (Aug
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Verborgh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>The Lonesome LOD Cloud</article-title>
          . In:
          <source>Proceedings of the Fourth Workshop on Usage Analysis and the Web of Data</source>
          (May
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Verborgh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartig</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Meester</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haesendonck</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Vocht</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vander Sande</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Colpaert</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannens</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van de Walle</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Querying datasets on the Web with high availability</article-title>
          .
          In:
          <source>Proceedings of the International Semantic Web Conference</source>
          .
          <series>Lecture Notes in Computer Science</series>
          , vol.
          <volume>8796</volume>
          , pp.
          <fpage>180</fpage>
          -
          <lpage>196</lpage>
          (Oct
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>