<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>April</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Will Linked Data Benefit from Inverse Link Traversal?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Position Paper</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ansgar Scherp Kiel University, Germany Leibniz Information Center for Economics</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Stefan Scheglmann University of Koblenz-Landau, Germany Institute for Web Science and Techologies</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2014</year>
      </pub-date>
      <volume>8</volume>
      <issue>2014</issue>
      <abstract>
        <p>Query execution using link-traversal is a promising approach for retrieving and accessing data on the web. However, this approach finds its limitation when it comes to query patterns such as ?s rdf:type ex:Employee, where one does not know the subject URI. Such queries are quite useful for di erent application needs. In this paper, we conduct an empirical analysis on the use of such patterns in SPARQL query logs. We present di erent solution approaches to extend the current Linked Open Data principles with the ability for inverse link traversal. We discuss the advantages and disadvantages of the di erent approaches.</p>
      </abstract>
      <kwd-group>
        <kwd>Querying of Linked Open Data</kwd>
        <kwd>Link Traversal</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>The four principles of Linked Open Data (LOD) as
published in 20061 have gained widespread adoption in research,
industries, as well as governmental and non-profit
organizations. Today, we find various approaches for
publishing linked data such as dedicated data stores like DBpedia,
SPARQL engines with a SPARQL endpoint, plain RDF files,
as well as data embedded into websites such as RDFa. After
the “early years” of individually implemented solutions and
1http://www.w3.org/DesignIssues/LinkedData.html last
visit 16th February, 2014
proofs of concepts, we now more and more see the
development of common frameworks and standardization e orts
around the four LOD principles. One example is the
current W3C working draft on a Linked Data Platform (LDP)2;3,
which aims at providing further clarifications and extensions
to the four Linked Data Principles defined by Tim
BernersLee. It describes a generic infrastructure for providing LOD
resources as well as modifying them. The principle idea is
to o er so-called LDP resources (LDPR) that can be
dereferenced in order to obtain information about it (read), modify
or delete the resource, or create a new resource. In addition
to the notion of LDPR, the draft also suggests the concept of
LDP collections (LDPC). A LDPC represents a set of
“samesubject, same-predicate triples”, which can be accessed and
modified through a common URI. Another example of a
standardization e ort is the SPARQL Graph Protocol4, a W3C
recommendation that operates on the level of graphs as entities
for creation, modification, and deletion. Both specifications
overlap in certain points and seek to align their e orts as
much as possible.5</p>
      <p>
        We very much appreciate these e orts as they allow for
developing more robust and mature LOD services. However,
we believe that still improvements are needed regarding an
e cient access and retrieval of linked data. Searching on
linked data can in principle be categorized into two di erent
approaches: a) Crawling all the data in a first step and
subsequently indexing it to make it available to the users [
        <xref ref-type="bibr" rid="ref12 ref15 ref18 ref19 ref5 ref6 ref9">6, 12,
19, 15, 5, 18, 9</xref>
        ] and b) On-the-fly retrieval of the data
according to the follow-your-nose-principle [
        <xref ref-type="bibr" rid="ref10 ref11">11, 10</xref>
        ]. While the first
approach adopts the typical search scenario as it is known
from the web, the second approach comes more natural with
the LOD approach. This link traversal based query
execution seems one of the most promising approaches to execute
queries over the Web of Linked Data. No fixed, predefined
set of relevant data sources has to be defined and relevant
sources can be discovered during query execution. Hartig et
al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] defined the semantics of link-traversal query
execution on the web of data. However, it has its limitations. In
2http://www.w3.org/TR/2013/WD-ldp-20130730/ last visit
16th February, 2014
3https://dvcs.w3.org/hg/ldpwg/raw-file/default/
ldp-bp/ldp-bp.html last visit 16th February, 2014
4http://www.w3.org/TR/sparql11-http-rdf-update/ last
visit 16th February, 2014
5A detailed discussion on the overlaps and resulting
conflicts can be found here: http://www.w3.org/2012/ldp/
wiki/Main_Page#Linked_Data_Platform_.28LDP.29_vs_
SPARQL_Graph_Store_HTTP_Protocol_.28GSP.29 last visit
16th February, 2014
order to illustrate this, let us consider a simple example taken
from Hartig and Freytag [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] as shown in Listing 1 below.
      </p>
      <sec id="sec-1-1">
        <title>Listing 1: Example query on LOD (from [11]).</title>
        <p>1 SELECT ?p ?l WHERE {
2 &lt;http :// bob.name &gt; &lt;http ://.../ knows &gt; ?p.
3 ?p &lt;http ://.../ currentProject &gt; ?pr.
4 ?pr &lt;http ://.../ label &gt; ?l.
5 }</p>
        <p>
          The query specifies the information need to obtain a list of
project names that friends of &lt;http://bob.name&gt; work on. This
query can be executed over LOD as shown by the authors [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
Still, executing queries on LOD has significant limitations
when it comes even to slighty di erent information needs.
For example, we might not know a-priori the existence of
Bob or even if we know about him as person, we might not
be aware of &lt;http://bob.name&gt; being the URI representing
him. Thus, we like to execute a query on LOD as shown in
Listing 2 where we do not know the subject URI of the first
query pattern. Such a query is useful, e. g., on data providers
such as the fictitious media company Big Lynx described in
the Linked Data book by Heath and Bizer6 to first determine
which employees are working with the company, before finding
out further information about each person.
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>Listing 2: Query requiring inverse traversal.</title>
        <p>1 SELECT ?s WHERE
2 ?s &lt;http ://.../ type &gt;
3 &lt;http ://.../ BigLynx/Employee &gt; .
4 }</p>
        <p>However, this and similar queries cannot be executed on
the LOD today as it requires an inverse link traversal from the
Big Lynx specific vocabulary defining the concept Employee
to the resource URIs that are typed with this concept. Please
note that our discussion of inverse traversal mainly addresses
the rdf:type triples. However, the ideas can be applied in
principle to triples with any kind of properties. We refer to
this at some parts of the paper.</p>
        <p>Below, we first further illustrate and motivate the need
for an local inverse link traversal of rdf:type predicates on
LOD. Subsequently, we conduct an empirical analysis on the
use of SPARQL queries involving query patterns such as the
one shown in Listing 2. We present possible approaches to
extend the current LOD principles with the ability for inverse
link traversal and discuss its advantages and disadvantages.
Finally, we present related work before we conclude.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>QUERIES ON THE SEMANTIC WEB</title>
      <p>In order to motivate the need for an inverse link-traversal
for querying LOD, we first look into further detail how queries
on the Semantic Web are executed. To this end, we consider
and discuss the following two questions:
(1) Why is it not su cient to fall back to other query paradigms
like SPARQL in cases where we need to resolve query
patterns such as the example in Listing 2.
(2) Is this pattern important enough and would the local
execution generate enough benefit to justify such an intervention
into existing standards?</p>
      <p>First, we address Question 1: As already mentioned, one
might argue that one should fall back to SPARQL in cases
6http://linkeddatabook.com/editions/1.0/ last visit
16th February, 2014
where link traversal based query execution is not su cient.
But this might not always be possible. According to the state
of the LOD cloud7 in 2011, only 68% of the data sources in the
LOD cloud provide SPARQL endpoint to allow expressive
queries to be asked against the datasets. The remaining 31%
are only accessible by link traversal queries.</p>
      <p>
        But also the stability of provided SPARQL endpoints is of
interest. Buil-Aranda et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] studied the current state of
available public SPARQL endpoints. They conducted
various experiments to test discoverability, interoperability,
performance, and availability. According to them availability
of SPARQL endpoints is still an issue. In their experiments,
they monitored 427 public SPARQL endpoints which are
registered at DataHub8 over a 27-month long experiment.
Regarding the availability, they conclude that only around 32%
of the endpoints reach an uptime of 99 100%, 59% an uptime
over 75%, and 29:3% are available less than 5% of the time.
      </p>
      <p>
        In order to answer Question 2, we rely on SPARQL query
logs. Di erent analyses about the current usage of SPARQL
already provide a well substantiated line of argumentation.
For instance, Gallego et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] analyzed the USEWOD2011
dataset. Their results show that more than 90% of the queries
consist of less than 3 query patterns. SPARQL query patterns
with the subject unbound while given predicate and object
corresponds to our rdf:type pattern. They are the third most
used type of patterns, with 7% in DBpedia and 46% in
Semantic Web Dog Food (SWDF)9 conference metadata. Möller
et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] applied similar analysis on a di erent dataset. They
conducted an analysis on query logs of SWDF, DBpedia,
DBtune, and RKBExplorer (RKB). In their analysis they came
to comparable results, 90% consist of less than three triple
patterns. The query pattern with unbound subject is used in
43% of the SWDF queries, 68% of the DBTune queries, 68%
of the RKB queries, and 50% of the DBpedia queries.
      </p>
      <p>Nethertheless, we conduced our own analysis, in order
to show the frequent use of rdf:type in SPARQL query
patterns. We extracted SELECT queries taken from the
USEWOD201310 data challenge files. The USEWOD2013 queries
where taken from Apache CLF11 logs of four di erent linked
open data sources, Linked Geo Data (LGD)12, DBpedia13,
bio2RDF14, and Semantic Web Dog Food (SWDF) conference
metadata. Table 1 shows the results of our analysis.</p>
      <p>We could show that around 11% of all SELECT queries
from the aforementioned logs contain at least one pattern
with rdf:type as predicate. This ratio ranges from less than
1% in the bio2rdf queries up to 14:1% in the DBpedia queries.
Please note that there is also the possibility that a triple pattern
like ?pr rdf:type ex:ResearchProject is contained in queries
like that one in Listing 1. In such cases, the query does not
7http://lod-cloud.net/state/ last visit 16th February,
2014
8http://datahub.io/en/dataset?res_format=api\
%2Fsparql last visit 16th February, 2014
9http://data.semanticweb.org last visit 16th February,
2014
10http://data.semanticweb.org/usewod/2013/ last visit
16th February, 2014
11Common Log Format, an informal standardhttp://en.
wikipedia.org/wiki/Common_Log_Format last visit 16th
February, 2014
12http://linkedgeodata.org/About last visit 16th February,
2014
13http://DBpedia.org/About last visit 16th February, 2014
14http://bio2rdf.org/ last visit 16th February, 2014
necessarily need the possibility to inverse traversal of rdf:type
links. However, investigating these cases was beyond scope
of our analysis.</p>
    </sec>
    <sec id="sec-3">
      <title>POSSIBLE SOLUTION APPROACHES</title>
      <p>In Linked Data, each resource has an explicit URI which
can be resolved in order to retrieve additional information
about this entity. However, how can one do this for RDF
types, i. e., how to dereference RDF type URIs? Here, two
principal challenges have to be addressed:</p>
      <p>RDF types used in linked data sources are often not
defined by the source itself, e. g., foaf:Person. Thus, the
provider hosting the vocabulary does not know
anything about the data sources that are using the
vocabulary to describe the resources.</p>
      <p>In addition, resolving RDF types such as foaf:Person in
order to retrieve its instances can potentially lead to a
very large amount of results. In an extreme case, i. .e,
when we aim at resolving the RDF type rdf:Resource,
we end up retrieving all instances.</p>
      <p>
        In order to address the challenges, one cannot operate on a
global level, i. e., on the entire Web of Linked Data. Rather,
we aim to provide resources of RDF types only on a local
level, i. e., resources hosted by data sources that are run by a
single organization or hosted on a single pay-level domain.
Thus, when resolving a RDF type such as foaf:Person, a data
provider operating a specific pay-level domain may return
a reference containing information about all instances it
defines using this type, e. g., by the semantic pingback
mechanism [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. In summary, we can come up with the following
solution approaches based on common REST practices to
allow inverse link traversal on Linked Data:
Approach 1: Simple URI Queries. It is possible that linked
data providers extend their servers with mechanisms that
allow for HTTP GET requests with embedded queries. This is
inspired by the W3C standardization of Media Fragments15,
i. e., the definition of a unique URI scheme to access
fragments of media assets such as images and videos on the web.
A media fragment URI specification supports a lightweight
mechanism to embed queries in such URIs. According to the
standard every URI consists of four parts:
      </p>
      <sec id="sec-3-1">
        <title>Listing 3: Media Fragment URI Schema</title>
        <p>1 &lt;scheme name &gt; : &lt;hierarchical part &gt;
2 [ ? &lt;query &gt; ] [ # &lt;fragment &gt; ]
15http://www.w3.org/TR/media-frags/
#standardisation-URI-queries last visit 16th
February, 2014
For inverse rdf:type traversal like in Listing 2 such an URI with
an embedded query might look like the examples displayed
in Listing 4. Query (1) introduces “instanceOf” as keyword
to indicate inverse of rdf:type. This would allow to retrieve
all instances of a specific RDF type. Query (2) extends this
to “subjectOf” and allows to specify with another parameter
the property URI like here the rdf:type. This allows to embed
queries for arbitrary inverse properties.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Listing 4: Embedded Queries</title>
        <p>(1) http ://ex.com/ instanceOf?</p>
        <p>&lt;http ://.../ BigLynx/Employee &gt;
(2) http ://ex.com/subjectOf?
p=&lt;http ://.../ #type &gt;&amp;
o=&lt;http ://.../ BigLynx/Employee &gt;</p>
        <p>In order to address the problem that a data provider hosts a
very large number of resources of specific RDF type, the data
can be retrieved in a paginated manner. For example, the
simple HTTP GET query can be extended by determining a limit
and an o set for the data, e. g., http://ex.com/instancesOf?
http://.../BigLynx/Employee&amp;offset=100&amp;limit=40.</p>
        <p>This retrieves forty instances starting at an o set of 100.</p>
        <p>The queries above could be provided as new features
similar to the URI lookup endpoint feature of VoiD.16 This feature
allows to determine a dedicated URI that can be used for
search by appending the keywords to this URI.</p>
        <p>Approach 2: Dedicated Schema URI. Instead of
providing a query mechanism, information about rdf:type entities
could also be made accessible by a specific URI per type.
Depending on the number of types in a dataset, this could lead
to a potentially large number of entity-URIs. An alternative
solution is that linked data providers could provide a special
kind of RDF schema document. This schema document has to
be made available under an generally accepted schema URI,
e. g., http://example.org/.well-known/schema/instance-types.17
A schema document consists of triples in the form
&lt;entitiesURI&gt; rdf:type &lt;type-URI&gt;, thereby the entities-URI provides
a direct lookup mechanism to find instances of the specific
type. For example, assuming two entities of foaf:Person on
a server, namely http://bob.name and http://tim.name, its
entities document would consist of two triples, cf. Listing 5.
If this list is very long, i. e., there exist a large number of
instances, a pagination approach can be implemented by
connecting multiple RDF schema documents via rdf:List.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Listing 5: Entities document for foaf:Person</title>
        <p>&lt;http :// bob.name &gt; rdf:type &lt;foaf:Person &gt;.
&lt;http :// tim.name &gt; rdf:type &lt;foaf:Person &gt;.</p>
        <p>A globally agreed URI schema like
http://example.org/.wellknown/schema allows to instantly access any kind of relevant
schema information for the entities in a data source such as
instances of specific types. However, it is debatable whether
standardization e orts should agree on generally accepted
URIs for storing schema information. Thus, another way
16http://www.w3.org/TR/2011/NOTE-void-20110303/
#lookup last visit 16th February, 2014
17Please note the use of the path .well-known/ for modeling a
“well-known location” for common data like our schema data
as proposed in RFC 5785 available from: https://tools.
ietf.org/html/rfc5785
would be to add a link to an instances list to a VoID metadata
description of the datasets.</p>
        <p>Implementation Issues.</p>
        <p>Finally, we have to think about how the response to a query
(Approach 1) or the type-entity documents (Approach 2) can
be generated. In general, Linked Data can be provided in
di erent means. One can distinguish three di erent ways of
publishing and accessing LOD on the web:
(a) Linked data sources using a dedicated RDF infrastructure,
like a triple store in the back-end. These sources may
provide a SPARQL endpoint to enable for complex queries. An
example of such a source is DBpedia.
(b) Linked data may also be provided by sources as a set
of di erent RDF files or data dumps. An example for this
could be a web server just providing a couple of FOAF18
documents, in addition to personal information published
on hosted websites.
(c) Linked data could even be embedded into existing (X)HTML
documents of a website using RDF in attributes (RDFa)19.</p>
        <p>For sources of type (a) the response to an inverse link
traversal could just be generated by executing a query
corresponding to the request directly on the triple store. Sources of type
(b) and (c) need a more sophisticated mechanism. Servers
providing linked data in that way have to somehow collect
the information first. If the linked data is provided in
multiple RDF files, e.g. like on a FOAF server, we might extend the
server by an indexing mechanism, which provides all RDF
files and generates the response set. In order to also support
embedded RDF, this indexing mechanism has to be extended
so that it can provide arbitrary files that allow to embed RDF
and identify and extract the desired information from those
files.</p>
        <p>Should an indexing service generate the pages
dynamically or should we go for static index pages? The former
would be space saving and more robust regarding changes
in the data but has several disadvantages in terms of
computational power and response times. The latter is clearly more
space consuming. Depending on the data, the worse case
leads to redundancy in the URIs that is linear to the
number of used types. It is also less robust regarding changes in
the data. There might be indices a ected by the change in
the data and those have to be updated. But this method has
clear advantages regarding performance and response times,
because all required answer sets already exist. The concrete
decision on how such an indexing service is finally
implemented depends on the concrete use case and the data. A
static indexing brings some advantages if space is not critical
but computational power, or the data is static and changes
infrequent, or the amount of entities per type is reasonable, or
there are strong requirements regarding the response times.
In other cases, e.g., for frequently changing data or datasets
with very large amounts of entities per type, dynamic
indexing might be more convenient. We could also think of
adaptive indexing strategies, e.g. computing only those indexes
dynamically which lead to a high amount of redundancy and
precompute indexes for frequently requested types.
18The friend of a friend ontology http://www.foaf-project.
org/ last visit 16th February, 2014
19The RDFa Primer http://www.w3.org/TR/
xhtml-rdfa-primer/ last visit 16th February, 2014
4.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>RELATED WORK</title>
      <p>
        We find a plethora of di erent approaches for searching
Linked Open Data (and the Semantic Web in general) such
as Swoogle [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], SemSearch [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], Sig.ma [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], Sindice [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ],
Falcons [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], Hermes [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], and LODatio [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].20 Despite particular
features and di erences these systems have, they all have in
common that they index a crawled set of semantic data. As
such, they require the semantic data readily at hand for search
in order to provide answers to the users’ queries. LODatio
does not store the instance data itself but solely relies on a
schema-level index. This schema-level index allows to find
sources of information on the web of data without keeping
the original data. However, it still requires an initial crawl of
LOD to compute the schema-level index.
      </p>
      <p>
        Among the di erent approaches for semantic search, there
are also some that exploit information from existing query
logs. Examples are Adeyanju et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] who use query logs
for query term recommendations such as generalizations and
specializations. Meij et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] align query logs with concepts
from DBpedia and conduct recommendations based on the
number of concepts linking to or from these concepts.
Overall, query logs provide useful information about the actual
information needs a LOD client has. However, they are not
used by LOD providers to compute some a-priori available
indices that ease the consumption of their data. Finally, we
find tools that solely rely on exploring the data in direct
communication with the LOD providers. A well known example
is the Tabulator system by Tim Berners-Lee [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] that allows
besides viewing and browsing also editing and thus updating
the data. Also Marbles21 allows for browsing the LOD graph
by following the outgoing edges. Given a user provided URI,
it retrieves all information about it by dereferencing it. In
addition, Marbles queries search engines such as Sindice and
Falcons and follows RDF predicates like owl:sameAs and
rdfs:seeAlso to gain further information about a resource.
While the data is retrieved live, Marbles can be considered
as hybrid solution as it not only makes use of other
semantic search engines but also makes the graph data persistent
across user sessions to allow for a more e cient access to
the data. However, it does not allow to browse the data
providers, e. g., by inverse type links or some simple
lookup features to find the instances of particular RDF types on
a LOD data source. A comprehensive overview of further
LOD components and clients can be found online and
includes Linked Data browsers, crawlers, data extractors, and
mashup frameworks.22.
      </p>
      <p>
        Hartig et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] formalized traversal-based query
execution on LOD, provide semantics for queries and analyse
characteristics of such queries. Examples of graph traversal
languages for RDF data are nSPARQL [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], a language with
focus on navigation through RDF data. With NautiLOD [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
Fionda et al. introduced a declarative language which
allows to specify portions of the web of data, defines routes
and instructs agents to perform actions. They also introduce
20For a comprehensive overview, please refer to: http:
//www.w3.org/wiki/TaskForces/CommunityProjects/
LinkingOpenData/SemanticWebSearchEngines last visit
16th February, 2014
21http://mes.github.io/marbles/ last visit 16th February,
2014
22http://www.w3.org/wiki/TaskForces/
CommunityProjects/LinkingOpenData/SemWebClients
last visit 16th February, 2014
swget, a tool that allows for the use of NautiLOD on the
web of data. GuLP [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is a query execution language and
framework based on the link-traversal paradigm, which can
declaratively include preferential attachment into its queries
to order the answer in terms of node/edge attributes. With
SQUIN [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], Hartig presented a query execution framework
that integrates the traversal of data links into the result
construction process. All these approaches would clearly benefit
from our proposed extension.
      </p>
      <p>The VoID Vocabulary23 allows for describing Linked Datasets
in terms of providing metadata information such as a SPARQL
endpoint location, example resources, and the number of
triples, entities, classes, and properties in the data set.
However, this does not help in identifying resources of a specific
type. Most relevant to our work is the VoiD feature to denote
root resources (named using the void:rootResource property).
Here, it is assumed that the “entire dataset can be crawled by
resolving the root resource(s) and recursively following links
to other URIs in the retrieved RDF responses”. Thus, in
principle it is possible to list all relevant entities here. However,
it cannot be stated that specific resources are of a particular
RDF type. In addition, finding the relevant resources requires
that the data has to be crawled first.</p>
    </sec>
    <sec id="sec-5">
      <title>CONCLUSIONS</title>
      <p>In our opinion traversal-based query execution is one of the
most promising approaches to execute queries over the Web
of Linked Data. The enormous freedom of this approach,
its flexibility and reliability, the independence from
centralized indexes, the on-the-fly discovery of new sources and
its integration with common Web protocols, has the
potential to make this paradigm the preferred mechanism to
execute queries on Linked Data. From our point of view this
paradigm has still some weaknesses. The analysis of SPARQL
query logs has shown that inverse traversal of links in RDF
data is crucial in order to make full use of the potential of
Linked Data. From our point of view, the possibility to
inverse traversal of rdf:type links would imply a tremendous
extension to the possibilities of traversal-based query
execution. We propose a set of ideas to ongoing standardization
e orts of linked data and hope that these ideas will lead to
a discussion in the community. Part of our future work is
to conduct some experimental research that actually
investigates the merits and drawbacks of the approaches outlined
in the paper.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>I. A.</given-names>
            <surname>Adeyanju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.-D. Albakour</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          <string-name>
            <surname>Kruschwitz</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. De Roeck</surname>
            , and
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Fasli</surname>
          </string-name>
          .
          <article-title>Adaptation of the concept hierarchy model with search logs for query recommendation on intranets</article-title>
          .
          <source>In SIGIR</source>
          , pages
          <fpage>5</fpage>
          -
          <lpage>14</lpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Arias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Fernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Martínez-Prieto</surname>
          </string-name>
          , and P.
          <string-name>
            <surname>de la Fuente</surname>
          </string-name>
          .
          <article-title>An Empirical Study of Real-World SPARQL Queries</article-title>
          . CoRR, abs/1103.5043,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hollenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Presbrey</surname>
          </string-name>
          , E. Prud'hommeaux, and
          <string-name>
            <surname>M. M. C.</surname>
          </string-name>
          <article-title>Schraefel</article-title>
          .
          <article-title>Tabulator redux: Browsing and writing linked data</article-title>
          .
          <source>In LDOW</source>
          , volume
          <volume>369</volume>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Buil-Aranda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hogan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Umbrich</surname>
          </string-name>
          , and P.-Y. Vandenbusshe.
          <article-title>Sparql web-querying infrastructure: Ready for action</article-title>
          ?
          <source>In Proceedings of the 12th International Semantic Web Conference</source>
          , Sydney, Australia,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] G. Cheng, W. Ge, and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qu</surname>
          </string-name>
          .
          <article-title>Falcons: searching and browsing entities on the semantic web</article-title>
          .
          <source>In WWW</source>
          , pages
          <fpage>1101</fpage>
          -
          <lpage>1102</lpage>
          . ACM,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. W.</given-names>
            <surname>Finin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Cost</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Reddivari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Doshi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Sachs</surname>
          </string-name>
          .
          <article-title>Swoogle: a search and metadata engine for the semantic web</article-title>
          .
          <source>In CIKM. ACM</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>V.</given-names>
            <surname>Fionda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          , and G. PirrŸ.
          <article-title>Semantic navigation on the web of data: specification of routes, web fragments and actions</article-title>
          . In A.
          <string-name>
            <surname>Mille</surname>
            ,
            <given-names>F. L.</given-names>
          </string-name>
          <string-name>
            <surname>Gandon</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Misselis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Rabinovich</surname>
          </string-name>
          , and S. Staab, editors,
          <source>WWW</source>
          , pages
          <fpage>281</fpage>
          -
          <lpage>290</lpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>V.</given-names>
            <surname>Fionda</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Pirrò</surname>
          </string-name>
          .
          <article-title>Querying graphs with preferences</article-title>
          .
          <source>In CIKM</source>
          , pages
          <fpage>929</fpage>
          -
          <lpage>938</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Gottron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Scherp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Krayer</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Peters</surname>
          </string-name>
          .
          <article-title>Lodatio: using a schema-level index to support users infinding relevant sources of linked data</article-title>
          .
          <source>In K-CAP</source>
          , pages
          <fpage>105</fpage>
          -
          <lpage>108</lpage>
          . ACM,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>O.</given-names>
            <surname>Hartig</surname>
          </string-name>
          .
          <article-title>Squin: a traversal based query execution system for the web of linked data. In K. A</article-title>
          .
          <string-name>
            <surname>Ross</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Srivastava</surname>
          </string-name>
          , and D. Papadias, editors,
          <source>SIGMOD Conference</source>
          , pages
          <fpage>1081</fpage>
          -
          <lpage>1084</lpage>
          . ACM,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>O.</given-names>
            <surname>Hartig</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.-C.</given-names>
            <surname>Freytag</surname>
          </string-name>
          .
          <article-title>Foundations of traversal based query execution over linked data</article-title>
          .
          <source>In HT</source>
          , pages
          <fpage>43</fpage>
          -
          <lpage>52</lpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. S.</given-names>
            <surname>Uren</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Motta</surname>
          </string-name>
          .
          <article-title>Semsearch: A search engine for the semantic web</article-title>
          .
          <source>In EKAW</source>
          , volume
          <volume>4248</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>238</fpage>
          -
          <lpage>245</lpage>
          . Springer,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>E.</given-names>
            <surname>Meij</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hollink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Huurnink</surname>
          </string-name>
          , and
          <string-name>
            <surname>M. de Rijke</surname>
          </string-name>
          .
          <article-title>Learning semantic query suggestions</article-title>
          .
          <source>In ISWC</source>
          , pages
          <fpage>424</fpage>
          -
          <lpage>440</lpage>
          . Springer,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>K.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hausenblas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Grimnes</surname>
          </string-name>
          .
          <article-title>Learning from linked open data usage: Patterns &amp; metrics</article-title>
          .
          <source>In Proceedings of the WebSci10: Extending the Frontiers of Society On-Line</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>E.</given-names>
            <surname>Oren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Delbru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Catasta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Stenzhorn</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Tummarello.</surname>
          </string-name>
          <article-title>Sindice.com: a document-oriented lookup index for open linked data</article-title>
          .
          <source>IJMSO</source>
          ,
          <volume>3</volume>
          (
          <issue>1</issue>
          ):
          <fpage>37</fpage>
          -
          <lpage>52</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Arenas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          . nSPARQL:
          <article-title>A navigational language for RDF</article-title>
          . J. Web Sem.,
          <volume>8</volume>
          (
          <issue>4</issue>
          ):
          <fpage>255</fpage>
          -
          <lpage>270</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Tramp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Frischmuth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ermilov</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          .
          <article-title>Weaving a Social Data Web with Semantic Pingback</article-title>
          .
          <source>In Proceedings of the EKAW</source>
          <year>2010</year>
          , pages
          <fpage>135</fpage>
          -
          <lpage>149</lpage>
          , Berlin / Heidelberg,
          <year>October 2010</year>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>T.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Haase</surname>
          </string-name>
          .
          <article-title>Hermes: Data web search on a pay-as-you-go integration infrastructure</article-title>
          .
          <source>J. Web Sem</source>
          .,
          <volume>7</volume>
          (
          <issue>3</issue>
          ):
          <fpage>189</fpage>
          -
          <lpage>203</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>G.</given-names>
            <surname>Tummarello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Catasta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Danielczyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Delbru</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Decker</surname>
          </string-name>
          . Sig.ma:
          <article-title>Live views on the web of data</article-title>
          .
          <source>J. Web Sem</source>
          .,
          <volume>8</volume>
          (
          <issue>4</issue>
          ):
          <fpage>355</fpage>
          -
          <lpage>364</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>