=Paper=
{{Paper
|id=Vol-1184/paper8
|storemode=property
|title=Will Linked Data Benefit from Inverse Link Traversal?
|pdfUrl=https://ceur-ws.org/Vol-1184/ldow2014_paper_08.pdf
|volume=Vol-1184
|dblpUrl=https://dblp.org/rec/conf/www/ScheglmannS14
}}
==Will Linked Data Benefit from Inverse Link Traversal?==
<pdf width="1500px">https://ceur-ws.org/Vol-1184/ldow2014_paper_08.pdf</pdf>
<pre>
       Will Linked Data Benefit from Inverse Link Traversal?
                                                    Position Paper
                          Stefan Scheglmann                                     Ansgar Scherp
               University of Koblenz-Landau, Germany                       Kiel University, Germany
              Institute for Web Science and Technologies          Leibniz Information Center for Economics
                       schegi@uni-koblenz.de                            asc@informatik.uni-kiel.de


ABSTRACT                                                        proofs of concepts, we now more and more see the devel-
Query execution using link-traversal is a promising approach    opment of common frameworks and standardization efforts
for retrieving and accessing data on the web. However, this     around the four LOD principles. One example is the cur-
approach reaches its limits when it comes to query patterns     rent W3C working draft on a Linked Data Platform (LDP)2,3,
such as ?s rdf:type ex:Employee, where one does not know        which aims at providing further clarifications and extensions
the subject URI. Such queries are quite useful for different    to the four Linked Data Principles defined by Tim Berners-
application needs. In this paper, we conduct an empirical       Lee. It describes a generic infrastructure for providing LOD
analysis on the use of such patterns in SPARQL query logs.      resources as well as modifying them. The principal idea is
We present different solution approaches to extend the cur-     to offer so-called LDP resources (LDPR) that can be derefer-
rent Linked Open Data principles with the ability for inverse   enced in order to obtain information about them (read), modify
link traversal. We discuss the advantages and disadvantages     or delete the resource, or create a new resource. In addition
of the different approaches.                                    to the notion of LDPR, the draft also suggests the concept of
                                                                LDP collections (LDPC). An LDPC represents a set of “same-
                                                                subject, same-predicate triples”, which can be accessed and
Categories and Subject Descriptors                              modified through a common URI. Another example of a stan-
H.1 [Information Systems Applications]: Models and Prin-        dardization effort is the SPARQL Graph Protocol4 , a W3C rec-
ciples; H.4 [Information Systems Applications]: Miscella-       ommendation that operates on the level of graphs as entities
neous                                                           for creation, modification, and deletion. Both specifications
                                                                overlap in certain points and seek to align their efforts as
                                                                much as possible.5
General Terms                                                      We very much appreciate these efforts as they allow for de-
Design, Measurement                                             veloping more robust and mature LOD services. However,
                                                                we believe that improvements are still needed regarding
Keywords                                                        efficient access to and retrieval of linked data. Searching on
                                                                linked data can in principle be categorized into two different
Querying of Linked Open Data; Link Traversal                    approaches: a) Crawling all the data in a first step and sub-
                                                                sequently indexing it to make it available to the users [6, 12,
1.    INTRODUCTION                                              19, 15, 5, 18, 9] and b) On-the-fly retrieval of the data accord-
   The four principles of Linked Open Data (LOD) as pub-        ing to the follow-your-nose-principle [11, 10]. While the first
lished in 20061 have gained widespread adoption in research,    approach adopts the typical search scenario as it is known
industries, as well as governmental and non-profit organi-      from the web, the second approach comes more naturally with
zations. Today, we find various approaches for publish-         the LOD approach. This link traversal based query execu-
ing linked data such as dedicated data stores like DBpedia,     tion seems one of the most promising approaches to execute
SPARQL engines with a SPARQL endpoint, plain RDF files,         queries over the Web of Linked Data. No fixed, predefined
as well as data embedded into websites such as RDFa. After      set of relevant data sources has to be defined and relevant
the “early years” of individually implemented solutions and     sources can be discovered during query execution. Hartig et
                                                                al. [11] defined the semantics of link-traversal query execu-
1
  http://www.w3.org/DesignIssues/LinkedData.html last           tion on the web of data. However, it has its limitations. In
visit 16th February, 2014
                                                                2
                                                                  http://www.w3.org/TR/2013/WD-ldp-20130730/ last visit
                                                                16th February, 2014
                                                                3
                                                                  https://dvcs.w3.org/hg/ldpwg/raw-file/default/
                                                                ldp-bp/ldp-bp.html last visit 16th February, 2014
                                                                4
                                                                  http://www.w3.org/TR/sparql11-http-rdf-update/ last
                                                                visit 16th February, 2014
                                                                5
                                                                  A detailed discussion on the overlaps and resulting con-
                                                                flicts can be found here: http://www.w3.org/2012/ldp/
                                                                wiki/Main_Page#Linked_Data_Platform_.28LDP.29_vs_
Copyright is held by the author/owner(s).                       SPARQL_Graph_Store_HTTP_Protocol_.28GSP.29 last visit
LDOW2014 April 8, 2014, Seoul, Korea.                           16th February, 2014
order to illustrate this, let us consider a simple example taken     where link traversal based query execution is not sufficient.
from Hartig and Freytag [11] as shown in Listing 1 below.            But this might not always be possible. According to the state
                                                                     of the LOD cloud7 in 2011, only 68% of the data sources in the
                                                                     LOD cloud provide a SPARQL endpoint to allow expressive
       Listing 1: Example query on LOD (from [11]).
     SELECT ?p ?l WHERE {                                            queries to be asked against the datasets. The remaining 31%
       <http://bob.name> <http://.../knows> ?p.                      are only accessible by link traversal queries.
       ?p <http://.../currentProject> ?pr.                              But also the stability of provided SPARQL endpoints is of
       ?pr <http://.../label> ?l.                                    interest. Buil-Aranda et al. [4] studied the current state of
     }                                                               available public SPARQL endpoints. They conducted vari-
   The query specifies the information need to obtain a list of      ous experiments to test discoverability, interoperability, per-
project names that friends of <http://bob.name> work on. This        formance, and availability. According to them, availability
query can be executed over LOD as shown by the authors [11].         of SPARQL endpoints is still an issue. In their experiments,
Still, executing queries on LOD has significant limitations          they monitored 427 public SPARQL endpoints which are reg-
when it comes to even slightly different information needs.          istered at DataHub8 over a 27-month long experiment. Re-
For example, we might not know a priori about the existence of       garding the availability, they conclude that only around 32%
Bob, or even if we know about him as a person, we might not          of the endpoints reach an uptime of 99−100%, 59% an uptime
be aware of <http://bob.name> being the URI representing             over 75%, and 29.3% are available less than 5% of the time.
him. Thus, we would like to execute a query on LOD as shown in          In order to answer Question 2, we rely on SPARQL query
Listing 2 where we do not know the subject URI of the first          logs. Different analyses about the current usage of SPARQL
query pattern. Such a query is useful, e. g., on data providers      already provide a well substantiated line of argumentation.
such as the fictitious media company Big Lynx described in           For instance, Gallego et al. [2] analyzed the USEWOD2011
the Linked Data book by Heath and Bizer6 to first determine          dataset. Their results show that more than 90% of the queries
which employees are working with the company, before finding         consist of fewer than 3 query patterns. SPARQL query patterns
out further information about each person.                           with the subject unbound but predicate and object given
                                                                     correspond to our rdf:type pattern. They are the third most
                                                                     used type of patterns, with 7% in DBpedia and 46% in Se-
        Listing 2: Query requiring inverse traversal.                mantic Web Dog Food (SWDF)9 conference metadata. Möller
     SELECT ?s WHERE {
                                                                     et al. [14] applied a similar analysis on a different dataset. They
       ?s <http://.../type>
         <http://.../BigLynx/Employee> .                             conducted an analysis on query logs of SWDF, DBpedia, DB-
     }                                                               tune, and RKBExplorer (RKB). In their analysis they came
                                                                     to comparable results: 90% of the queries consist of fewer than three triple
   However, this and similar queries cannot be executed on           patterns. The query pattern with unbound subject is used in
the LOD today as it requires an inverse link traversal from the      43% of the SWDF queries, 68% of the DBTune queries, 68%
Big Lynx specific vocabulary defining the concept Employee           of the RKB queries, and 50% of the DBpedia queries.
to the resource URIs that are typed with this concept. Please           Nevertheless, we conducted our own analysis in order
note that our discussion of inverse traversal mainly addresses       to show the frequent use of rdf:type in SPARQL query pat-
the rdf:type triples. However, the ideas can be applied in           terns. We extracted SELECT queries taken from the USE-
principle to triples with any kind of properties. We refer to        WOD201310 data challenge files. The USEWOD2013 queries
this in some parts of the paper.                                     were taken from Apache CLF11 logs of four different linked
   Below, we first further illustrate and motivate the need          open data sources, Linked Geo Data (LGD)12 , DBpedia13 ,
for a local inverse link traversal of rdf:type predicates on         bio2RDF14 , and Semantic Web Dog Food (SWDF) conference
LOD. Subsequently, we conduct an empirical analysis on the           metadata. Table 1 shows the results of our analysis.
use of SPARQL queries involving query patterns such as the              We could show that around 11% of all SELECT queries
one shown in Listing 2. We present possible approaches to            from the aforementioned logs contain at least one pattern
extend the current LOD principles with the ability for inverse       with rdf:type as predicate. This ratio ranges from less than
link traversal and discuss its advantages and disadvantages.         1% in the bio2rdf queries up to 14.1% in the DBpedia queries.
Finally, we present related work before we conclude.                 Please note that there is also the possibility that a triple pattern
                                                                     like ?pr rdf:type ex:ResearchProject is contained in queries
2.    QUERIES ON THE SEMANTIC WEB                                    like that one in Listing 1. In such cases, the query does not
   In order to motivate the need for an inverse link-traversal       7
                                                                        http://lod-cloud.net/state/ last visit 16th February,
for querying LOD, we first look in more detail at how queries         2014
on the Semantic Web are executed. To this end, we consider            8
                                                                        http://datahub.io/en/dataset?res_format=api\
and discuss the following two questions:                              %2Fsparql last visit 16th February, 2014
(1) Why is it not sufficient to fall back to other query paradigms    9
                                                                        http://data.semanticweb.org last visit 16th February,
like SPARQL in cases where we need to resolve query pat-              2014
                                                                     10
terns such as the example in Listing 2.                                 http://data.semanticweb.org/usewod/2013/ last visit
(2) Is this pattern important enough and would the local exe-         16th February, 2014
                                                                     11
cution generate enough benefit to justify such an intervention          Common Log Format, an informal standard, http://en.
into existing standards?                                              wikipedia.org/wiki/Common_Log_Format last visit 16th
                                                                      February, 2014
   First, we address Question 1: As already mentioned, one           12
                                                                        http://linkedgeodata.org/About last visit 16th February,
might argue that one should fall back to SPARQL in cases              2014
                                                                     13
6
  http://linkeddatabook.com/editions/1.0/             last   visit      http://DBpedia.org/About last visit 16th February, 2014
                                                                     14
16th February, 2014                                                     http://bio2rdf.org/ last visit 16th February, 2014
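
The kind of log analysis described in Section 2 can be approximated along the
following lines. This is a minimal sketch under assumptions that are not stated
in the paper: each Common Log Format line is assumed to carry the SPARQL query
in a "query" URL parameter, and a simple regular expression is assumed to be
good enough to spot rdf:type predicates (written as rdf:type, as the full IRI,
or as the SPARQL keyword a). All function names are illustrative and are not
taken from the authors' tooling.

```python
import re
from urllib.parse import urlsplit, parse_qs

RDF_TYPE_IRI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Request part of a Common Log Format line, e.g. "GET /sparql?query=... HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')

# Heuristic: rdf:type as prefixed name, as full IRI, or as the keyword "a".
# Assumption: no attempt is made to exclude matches inside string literals.
TYPE_RE = re.compile(
    r"rdf:type\b|<" + re.escape(RDF_TYPE_IRI) + r">|(?<=\s)a(?=\s)"
)


def extract_query(log_line):
    """Return the decoded SPARQL query of a log line, or None if absent."""
    match = REQUEST_RE.search(log_line)
    if match is None:
        return None
    params = parse_qs(urlsplit(match.group(1)).query)
    values = params.get("query")  # assumed parameter name
    return values[0] if values else None


def is_select(query):
    """True if the query text contains the SELECT keyword."""
    return re.search(r"\bSELECT\b", query, re.IGNORECASE) is not None


def uses_rdf_type(query):
    """True if the query text contains at least one rdf:type predicate."""
    return TYPE_RE.search(query) is not None


def type_ratio(log_lines):
    """Return (#SELECT queries, #SELECT queries with rdf:type, ratio in %)."""
    selects = typed = 0
    for line in log_lines:
        query = extract_query(line)
        if query is None or not is_select(query):
            continue
        selects += 1
        if uses_rdf_type(query):
            typed += 1
    ratio = 100.0 * typed / selects if selects else 0.0
    return selects, typed, ratio
```

A text-level heuristic like this counts queries that mention rdf:type anywhere;
it does not check whether the subject of the pattern is unbound, which would
require parsing the query.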
     DataSet       #SELECT         #Queries w.        Ratio         For inverse rdf:type traversal like in Listing 2 such a URI with
                    Queries           rdf:type            %         an embedded query might look like the examples displayed
     LGD             1,639,889          139,788         8.52        in Listing 4. Query (1) introduces “instanceOf” as keyword
     DBpedia        27,089,369        3,819,413        14.10        to indicate inverse of rdf:type. This would allow to retrieve
     SWDF           11,157,747          556,783         4.99        all instances of a specific RDF type. Query (2) extends this
     bio2rdf           179,370                 1        0.00        to “subjectOf” and allows to specify with another parameter
     total          40,066,375        4,515,985        11.27        the property URI like here the rdf:type. This allows to embed
                                                                    queries for arbitrary inverse properties.
Table 1: SPARQL query statistics on the USEWOD2013
dataset
                                                                                    Listing 4: Embedded Queries
                                                                    (1) http://ex.com/instanceOf?
necessarily need the possibility of inverse traversal of rdf:type                     <http://.../BigLynx/Employee>
links. However, investigating these cases was beyond the scope      (2) http://ex.com/subjectOf?
of our analysis.                                                                      p=<http://.../#type>&
                                                                                      o=<http://.../BigLynx/Employee>

3.     POSSIBLE SOLUTION APPROACHES                                    In order to address the problem that a data provider hosts a
  In Linked Data, each resource has an explicit URI which           very large number of resources of specific RDF type, the data
can be resolved in order to retrieve additional information         can be retrieved in a paginated manner. For example, the sim-
about this entity. However, how can one do this for RDF             ple HTTP GET query can be extended by determining a limit
types, i. e., how to dereference RDF type URIs? Here, two           and an offset for the data, e. g., http://ex.com/instancesOf?
principal challenges have to be addressed:                          http://.../BigLynx/Employee&offset=100&limit=40.
                                                                    This retrieves forty instances starting at an offset of 100.
     • RDF types used in linked data sources are often not de-         The queries above could be provided as new features simi-
       fined by the source itself, e. g., foaf:Person. Thus, the    lar to the URI lookup endpoint feature of VoID.16 This feature
       provider hosting the vocabulary does not know any-           allows to determine a dedicated URI that can be used for
       thing about the data sources that are using the vocabu-      search by appending the keywords to this URI.
       lary to describe the resources.
     • In addition, resolving RDF types such as foaf:Person in      Approach 2: Dedicated Schema URI. Instead of provid-
       order to retrieve its instances can potentially lead to a    ing a query mechanism, information about rdf:type entities
       very large number of results. In an extreme case, i. e.,     could also be made accessible by a specific URI per type. De-
       when we aim at resolving the RDF type rdfs:Resource,         pending on the number of types in a dataset, this could lead
       we end up retrieving all instances.                          to a potentially large number of entity-URIs. An alternative
                                                                    solution is that linked data providers could provide a special
In order to address the challenges, one cannot operate on a         kind of RDF schema document. This schema document has to
global level, i. e., on the entire Web of Linked Data. Rather,      be made available under a generally accepted schema URI,
we aim to provide resources of RDF types only on a local            e. g., http://example.org/.well-known/schema/instance-types.17
level, i. e., resources hosted by data sources that are run by a    A schema document consists of triples in the form <entities-
single organization or hosted on a single pay-level domain.         URI> rdf:type <type-URI>, thereby the entities-URI provides
Thus, when resolving an RDF type such as foaf:Person, a data        a direct lookup mechanism to find instances of the specific
provider operating a specific pay-level domain may return           type. For example, assuming two entities of foaf:Person on
a reference containing information about all instances it de-       a server, namely http://bob.name and http://tim.name, its en-
fines using this type, e. g., by the semantic pingback mecha-       tities document would consist of two triples, cf. Listing 5.
nism [17]. In summary, we can come up with the following            If this list is very long, i. e., there exist a large number of in-
solution approaches based on common REST practices to al-           stances, a pagination approach can be implemented by con-
low inverse link traversal on Linked Data:                          necting multiple RDF schema documents via rdf:List.

Approach 1: Simple URI Queries. It is possible that linked
data providers extend their servers with mechanisms that al-                Listing 5: Entities document for foaf:Person
                                                                    <http://bob.name> rdf:type foaf:Person .
low for HTTP GET requests with embedded queries. This is            <http://tim.name> rdf:type foaf:Person .
inspired by the W3C standardization of Media Fragments15 ,
i. e., the definition of a unique URI scheme to access frag-        A globally agreed URI schema like http://example.org/.well-
ments of media assets such as images and videos on the web.         known/schema allows to instantly access any kind of relevant
A media fragment URI specification supports a lightweight           schema information for the entities in a data source such as
mechanism to embed queries in such URIs. According to the           instances of specific types. However, it is debatable whether
standard every URI consists of four parts:                          standardization efforts should agree on generally accepted
                                                                    URIs for storing schema information. Thus, another way
           Listing 3: Media Fragment URI Schema                     16
     <scheme name> : <hierarchical part>                               http://www.w3.org/TR/2011/NOTE-void-20110303/
                      [ ? <query> ] [ # <fragment> ]                 #lookup last visit 16th February, 2014
                                                                    17
                                                                       Please note the use of the path .well-known/ for modeling a
15
 http://www.w3.org/TR/media-frags/                                   “well-known location” for common data like our schema data
#standardisation-URI-queries last visit 16th Febru-                  as proposed in RFC 5785 available from: https://tools.
ary, 2014                                                            ietf.org/html/rfc5785
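
The two URI forms of Listing 4, together with the offset/limit pagination
described for Approach 1, can be sketched as follows. This is only an
illustration: the keywords instanceOf and subjectOf and the parameters p, o,
offset, and limit come from the text, whereas the percent-encoding of the
embedded IRIs, the function names, and the concrete type IRI used below
(http://example.org/BigLynx/Employee, a placeholder for the elided
http://.../BigLynx/Employee) are assumptions.

```python
from urllib.parse import quote, urlencode

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def instance_of_uri(base, type_uri):
    """Query (1): all instances of one RDF type, IRI embedded as bare query."""
    # Assumption: the IRI is percent-encoded so that '#' and '?' stay inside
    # the query part instead of starting a fragment or a second query.
    return base.rstrip("/") + "/instanceOf?" + quote(type_uri, safe="")


def subject_of_uri(base, predicate, obj, offset=None, limit=None):
    """Query (2): subjects for a given predicate/object, optionally paginated."""
    params = [("p", predicate), ("o", obj)]
    if offset is not None:
        params.append(("offset", str(offset)))
    if limit is not None:
        params.append(("limit", str(limit)))
    return base.rstrip("/") + "/subjectOf?" + urlencode(params)


def page_uris(base, predicate, obj, total, limit):
    """URIs of all pages needed to cover `total` results, `limit` per page."""
    return [
        subject_of_uri(base, predicate, obj, offset=start, limit=limit)
        for start in range(0, total, limit)
    ]
```

A client that does not know the total number of instances would instead request
pages with increasing offsets until a page comes back with fewer than limit
entries.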
would be to add a link to an instances list to a VoID metadata       4.   RELATED WORK
description of the datasets.                                            We find a plethora of different approaches for searching
                                                                     Linked Open Data (and the Semantic Web in general) such
Implementation Issues.                                               as Swoogle [6], SemSearch [12], Sig.ma [19], Sindice [15], Fal-
   Finally, we have to think about how the response to a query       cons [5], Hermes [18], and LODatio [9].20 Despite particular
(Approach 1) or the type-entity documents (Approach 2) can           features and differences these systems have, they all have in
be generated. In general, Linked Data can be provided in             common that they index a crawled set of semantic data. As
different means. One can distinguish three different ways of         such, they require the semantic data readily at hand for search
publishing and accessing LOD on the web:                             in order to provide answers to the users’ queries. LODatio
(a) Linked data sources using a dedicated RDF infrastructure,        does not store the instance data itself but solely relies on a
like a triple store in the back-end. These sources may pro-          schema-level index. This schema-level index allows to find
vide a SPARQL endpoint to enable for complex queries. An             sources of information on the web of data without keeping
example of such a source is DBpedia.                                 the original data. However, it still requires an initial crawl of
(b) Linked data may also be provided by sources as a set             LOD to compute the schema-level index.
of different RDF files or data dumps. An example for this               Among the different approaches for semantic search, there
could be a web server just providing a couple of FOAF18              are also some that exploit information from existing query
documents, in addition to personal information published             logs. Examples are Adeyanju et al. [1] who use query logs
on hosted websites.                                                  for query term recommendations such as generalizations and
(c) Linked data could even be embedded into existing (X)HTML         specializations. Meij et al. [13] align query logs with concepts
documents of a website using RDF in attributes (RDFa)19 .            from DBpedia and conduct recommendations based on the
   For sources of type (a) the response to an inverse link traver-   number of concepts linking to or from these concepts. Over-
sal could just be generated by executing a query correspond-         all, query logs provide useful information about the actual
ing to the request directly on the triple store. Sources of type     information needs a LOD client has. However, they are not
(b) and (c) need a more sophisticated mechanism. Servers             used by LOD providers to compute some a-priori available
providing linked data in that way have to somehow collect            indices that ease the consumption of their data. Finally, we
the information first. If the linked data is provided in multi-      find tools that solely rely on exploring the data in direct com-
ple RDF files, e.g. like on a FOAF server, we might extend the       munication with the LOD providers. A well known example
server by an indexing mechanism, which provides all RDF              is the Tabulator system by Tim Berners-Lee [3] that allows be-
files and generates the response set. In order to also support       sides viewing and browsing also editing and thus updating
embedded RDF, this indexing mechanism has to be extended             the data. Also Marbles21 allows for browsing the LOD graph
so that it can provide arbitrary files that allow to embed RDF       by following the outgoing edges. Given a user provided URI,
and identify and extract the desired information from those          it retrieves all information about it by dereferencing it. In ad-
files.                                                               dition, Marbles queries search engines such as Sindice and
   Should an indexing service generate the pages dynami-             Falcons and follows RDF predicates like owl:sameAs and
cally or should we go for static index pages? The former             rdfs:seeAlso to gain further information about a resource.
would be space saving and more robust regarding changes              While the data is retrieved live, Marbles can be considered
in the data but has several disadvantages in terms of compu-         as hybrid solution as it not only makes use of other seman-
tational power and response times. The latter is clearly more        tic search engines but also makes the graph data persistent
space consuming. Depending on the data, the worst case               across user sessions to allow for a more efficient access to
leads to redundancy in the URIs that is linear to the num-           the data. However, it does not allow to browse the data
ber of used types. It is also less robust regarding changes in       providers, e. g., by inverse type links or some simple look-
the data. There might be indices affected by the change in           up features to find the instances of particular RDF types on
the data and those have to be updated. But this method has           a LOD data source. A comprehensive overview of further
clear advantages regarding performance and response times,           LOD components and clients can be found online and in-
because all required answer sets already exist. The concrete         cludes Linked Data browsers, crawlers, data extractors, and
decision on how such an indexing service is finally imple-           mashup frameworks.22
mented depends on the concrete use case and the data. A                 Hartig et al. [11] formalized traversal-based query exe-
static indexing brings some advantages if space is not critical      cution on LOD, provide semantics for queries and analyse
but computational power, or the data is static and changes           characteristics of such queries. Examples of graph traversal
infrequently, or the number of entities per type is reasonable, or   languages for RDF data are nSPARQL [16], a language with
there are strong requirements regarding the response times.          focus on navigation through RDF data. With NautiLOD [7],
In other cases, e.g., for frequently changing data or datasets       Fionda et al. introduced a declarative language which al-
with very large amounts of entities per type, dynamic index-         lows to specify portions of the web of data, defines routes
ing might be more convenient. We could also think of adap-           and instructs agents to perform actions. They also introduce
tive indexing strategies, e.g. computing only those indexes
dynamically which lead to a high amount of redundancy and            20
                                                                        For a comprehensive overview, please refer to: http:
precompute indexes for frequently requested types.                    //www.w3.org/wiki/TaskForces/CommunityProjects/
                                                                      LinkingOpenData/SemanticWebSearchEngines last visit
                                                                      16th February, 2014
                                                                     21
                                                                        http://mes.github.io/marbles/ last visit 16th February,
18
   The friend of a friend ontology http://www.foaf-project.           2014
 org/ last visit 16th February, 2014                                 22
                                                                        http://www.w3.org/wiki/TaskForces/
19
   The       RDFa         Primer      http://www.w3.org/TR/           CommunityProjects/LinkingOpenData/SemWebClients
 xhtml-rdfa-primer/ last visit 16th February, 2014                    last visit 16th February, 2014
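
The static indexing option discussed under Implementation Issues can be
illustrated with a small inverse index from RDF types to their instances. This
is a minimal sketch, not the authors' implementation: it assumes that the
triples of a source of type (b) or (c) have already been extracted into plain
(subject, predicate, object) tuples of IRI strings, and all function names are
hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_type_index(triples):
    """Precompute a static index: type IRI -> sorted list of instance IRIs."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            index[obj].add(subject)
    # Sorting gives a stable order, which offset/limit pagination relies on.
    return {type_uri: sorted(subjects) for type_uri, subjects in index.items()}


def instances_of(index, type_uri, offset=0, limit=None):
    """Answer an inverse rdf:type lookup from the index, optionally paginated."""
    subjects = index.get(type_uri, [])
    end = None if limit is None else offset + limit
    return subjects[offset:end]


def entities_document(index, type_uri):
    """Serialize the instances of one type as N-Triples lines (cf. Listing 5)."""
    return "\n".join(
        f"<{subject}> <{RDF_TYPE}> <{type_uri}> ."
        for subject in index.get(type_uri, [])
    )
```

The index has to be rebuilt, or at least patched, whenever the underlying files
change, which is the robustness cost of static indexing noted in the text. A
dynamic variant would run the same filter over the triples at request time
instead of storing the result.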
swget, a tool that allows for the use of NautiLOD on the             [4] C. Buil-Aranda, A. Hogan, J. Umbrich, and P.-Y.
web of data. GuLP [8] is a query execution language and                  Vandenbussche. SPARQL web-querying infrastructure:
framework based on the link-traversal paradigm, which can                Ready for action? In Proceedings of the 12th International
declaratively include preferential attachment into its queries           Semantic Web Conference, Sydney, Australia, 2013.
to order the answer in terms of node/edge attributes. With           [5] G. Cheng, W. Ge, and Y. Qu. Falcons: searching and
SQUIN [10], Hartig presented a query execution framework                 browsing entities on the semantic web. In WWW, pages
that integrates the traversal of data links into the result con-         1101–1102. ACM, 2008.
struction process. All these approaches would clearly benefit        [6] L. Ding, T. W. Finin, A. Joshi, R. Pan, R. S. Cost, Y. Peng,
from our proposed extension.                                             P. Reddivari, V. Doshi, and J. Sachs. Swoogle: a search
   The VoID Vocabulary23 allows for describing Linked Datasets           and metadata engine for the semantic web. In CIKM.
in terms of providing metadata information such as a SPARQL              ACM, 2004.
endpoint location, example resources, and the number of              [7] V. Fionda, C. Gutierrez, and G. PirrŸ. Semantic
triples, entities, classes, and properties in the data set. How-         navigation on the web of data: specification of routes,
ever, this does not help in identifying resources of a specific          web fragments and actions. In A. Mille, F. L. Gandon,
type. Most relevant to our work is the VoiD feature to denote            J. Misselis, M. Rabinovich, and S. Staab, editors, WWW,
root resources (named using the void:rootResource property).             pages 281–290. ACM, 2012.
Here, it is assumed that the “entire dataset can be crawled by       [8] V. Fionda and G. Pirrò. Querying graphs with
resolving the root resource(s) and recursively following links           preferences. In CIKM, pages 929–938, 2013.
to other URIs in the retrieved RDF responses”. Thus, in prin-
                                                                     [9] T. Gottron, A. Scherp, B. Krayer, and A. Peters. Lodatio:
ciple it is possible to list all relevant entities here. However,
                                                                         using a schema-level index to support users infinding
it cannot be stated that specific resources are of a particular
                                                                         relevant sources of linked data. In K-CAP, pages
RDF type. In addition, finding the relevant resources requires
                                                                         105–108. ACM, 2013.
that the data has to be crawled first.
                                                                    [10] O. Hartig. Squin: a traversal based query execution
                                                                         system for the web of linked data. In K. A. Ross,
5.       CONCLUSIONS                                                     D. Srivastava, and D. Papadias, editors, SIGMOD
   In our opinion traversal-based query execution is one of the          Conference, pages 1081–1084. ACM, 2013.
most promising approaches to execute queries over the Web           [11] O. Hartig and J.-C. Freytag. Foundations of traversal
of Linked Data. The enormous freedom of this approach,                   based query execution over linked data. In HT, pages
its flexibility and reliability, the independence from central-          43–52. ACM, 2012.
ized indexes, the on-the-fly discovery of new sources and           [12] Y. Lei, V. S. Uren, and E. Motta. Semsearch: A search
its integration with common Web protocols, has the poten-                engine for the semantic web. In EKAW, volume 4248 of
tial to make this paradigm the preferred mechanism to ex-                Lecture Notes in Computer Science, pages 238–245.
ecute queries on Linked Data. From our point of view this                Springer, 2006.
paradigm has still some weaknesses. The analysis of SPARQL          [13] E. Meij, M. Bron, L. Hollink, B. Huurnink, and
query logs has shown that inverse traversal of links in RDF              M. de Rijke. Learning semantic query suggestions. In
data is crucial in order to make full use of the potential of            ISWC, pages 424–440. Springer, 2009.
Linked Data. From our point of view, the possibility to in-         [14] K. Möller, M. Hausenblas, R. Cyganiak, and G. A.
verse traversal of rdf:type links would imply a tremendous               Grimnes. Learning from linked open data usage:
extension to the possibilities of traversal-based query execu-           Patterns & metrics. In Proceedings of the WebSci10:
tion. We propose a set of ideas to ongoing standardization               Extending the Frontiers of Society On-Line, 2010.
efforts of linked data and hope that these ideas will lead to       [15] E. Oren, R. Delbru, M. Catasta, R. Cyganiak,
a discussion in the community. Part of our future work is                H. Stenzhorn, and G. Tummarello. Sindice.com: a
to conduct some experimental research that actually investi-             document-oriented lookup index for open linked data.
gates the merits and drawbacks of the approaches outlined                IJMSO, 3(1):37–52, 2008.
in the paper.                                                       [16] J. Pérez, M. Arenas, and C. Gutierrez. nSPARQL: A
                                                                         navigational language for RDF. J. Web Sem.,
6.       REFERENCES                                                      8(4):255–270, 2010.
                                                                    [17] S. Tramp, P. Frischmuth, T. Ermilov, and S. Auer.
     [1] I. A. Adeyanju, D. Song, M.-D. Albakour,
                                                                         Weaving a Social Data Web with Semantic Pingback. In
         U. Kruschwitz, A. De Roeck, and M. Fasli. Adaptation
                                                                         Proceedings of the EKAW 2010, pages 135–149, Berlin /
         of the concept hierarchy model with search logs for
                                                                         Heidelberg, October 2010. Springer.
         query recommendation on intranets. In SIGIR, pages
         5–14. ACM, 2012.                                           [18] T. Tran, H. Wang, and P. Haase. Hermes: Data web
                                                                         search on a pay-as-you-go integration infrastructure. J.
     [2] M. Arias, J. D. Fernández, M. A. Martínez-Prieto, and
                                                                         Web Sem., 7(3):189–203, 2009.
         P. de la Fuente. An Empirical Study of Real-World
         SPARQL Queries. CoRR, abs/1103.5043, 2011.                 [19] G. Tummarello, R. Cyganiak, M. Catasta, S. Danielczyk,
                                                                         R. Delbru, and S. Decker. Sig.ma: Live views on the
     [3] T. Berners-Lee, J. Hollenbach, K. Lu, J. Presbrey,
                                                                         web of data. J. Web Sem., 8(4):355–364, 2010.
         E. Prud’hommeaux, and M. M. C. Schraefel. Tabulator
         redux: Browsing and writing linked data. In LDOW,
         volume 369, 2008.
23
 http://www.w3.org/TR/2011/NOTE-void-20110303/ last
visit 16th February, 2014

</pre>