<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>March</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Augmenting the Web of Data using Referers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hannes Mühleisen</string-name>
          <email>muehleis@inf.fu-berlin.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anja Jentzsch</string-name>
          <email>mail@anjajentzsch.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Freie Universität Berlin, Networked Information Systems Group</institution>
          ,
          <addr-line>Königin-Luise-Str. 24/26, 14195 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Freie Universität Berlin, Web-based Systems Group</institution>
          ,
          <addr-line>Garystr. 21, 14195 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <volume>29</volume>
      <issue>2011</issue>
      <abstract>
        <p>Linked Data relies on one central concept: typed links connect entities stored within data sets published by different individuals. Manual input and mapping are common techniques to create these links. We propose a novel method in which HTTP Referer information is used to create new links between Linked Data entities stored in different data sets. We evaluate our method using 27.86 million real-world log entries from web servers hosting Linked Data.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        The Web of Data forms a single global data space for the
very reason that its data sources are connected by links.
However, as the current state of the Linked Open Data
cloud<sup>1</sup> shows, most data sources are not sufficiently
interlinked: over 50% of them are interlinked with
only one or two other data sources<sup>2</sup>. Almost two thirds of
the data sources do not link back to all the data sources
they are linked from. This leads to a weakly interlinked and
often unidirectional graph of Linked Data, which impedes
applications relying on link traversal. In addition, duplicate
detection and linkage recording are crucial prerequisites for
data integration. While some fully automatic
tools for link discovery do exist [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], most tools generate
links semi-automatically based on user-defined link
specifications [
        <xref ref-type="bibr" rid="ref10 ref11 ref12">11, 12, 10</xref>
        ].
      </p>
      <p>In this paper, we propose a novel, fully automatic
approach for back link generation. Here, the "Referer" request
header defined by the HTTP specification is used to
discover remote documents containing Linked Data entities
linking to local entities. Since RDF links between entities
are typed, the type of a back link depends on the type of</p>
    </sec>
    <sec id="sec-2">
      <title>Footnotes: 1 http://lod-cloud.net; 2 http://lod-cloud.net/state</title>
      <sec id="sec-2-1">
        <title>2. LINK GENERATION APPROACH</title>
        <p>
          According to the Linked Data principles, links between
entities contained in different data sets and stored on different
servers are an integral part of Linked Data [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. These links
into other data sets are often used to provide background
information or to lead to other related entities. Apart from
using central databases such as Sindice [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], many Linked
Data tools and applications depend on considerable
quantities of links. For example, the SQUIN SPARQL query
processor uses link traversal to resolve patterns within a
query [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
        <p>Within the Resource Description Framework (RDF) data model,
links are directed and carry explicit semantics: each link
must be labeled with a machine-readable URI. Data
sets are managed and stored independently, and links
between entities are not part of any central meta-level
system but reside in the data set in which they were created. Figure 1
shows this principle: two entities, ds1:res1 and ds2:res2,
are linked from ds2:res2 to ds1:res1 using the link type
ex:p1 (1). The back link from ds1:res1 to ds2:res2 is not
required to be present, and its link type is unknown a priori.
Part (2) shows the physical storage of the entities and the link:
Dataset 1 contains the entity ds1:res1, and Dataset 2
contains both the entity ds2:res2 and the link. Should a
link-traversal-based tool encounter ds1:res1, it would have
no way of reaching ds2:res2 without the help of central
databases.</p>
        <p>[Figure 1: link ex:p1 from ds2:res2 to ds1:res1 with an unknown back link (?), and the storage of ds2:res2 and the link in Dataset 2.]</p>
        <p>Links between different data sets cannot, so far, be created
automatically without complex entity recognition schemes
or data structure conventions. Thus, link creation is often
based on human interaction, which is a tedious
process and is only practicable between two different data sets
at a time. An automatic or supporting process for link
generation would be desirable, even if only a subset of possible
links can be discovered. In the "classic" WWW, links are
often created on the basis of a link exchange: web authors
communicate the intent of linking to each other's sites, a
process that can be beneficial for both sites and their
visitors. The number of links is kept low so as not to distract
readers. For Linked Data entities, however, a large number
of links to other entities does not disrupt their use, as
these entities are mainly published for use by computer
programs. Hence, as the content is machine-readable, link exchange
can be performed automatically.</p>
        <p>
          The Linked Data specification defines the Hypertext
Transfer Protocol (HTTP) as the underlying data exchange protocol.
Linked Data entities are thus requested and served using
this protocol. The HTTP specification defines the Referer<sup>3</sup>
header field as part of HTTP requests [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. This field can be
set by the user agent to the URL of the resource from which
it was referred.
        </p>
        <p>"The Referer[sic] request-header field allows
the client to specify, for the server's benefit, the
address (URI) of the resource from which the
Request-URI was obtained [...] The Referer
request-header allows a server to generate lists of back
links to resources for interest, logging,
optimized caching, etc." [7, sec. 14.36]</p>
        <p>The value of the Referer header is commonly added to
request log files by standard web servers, for example by the
Apache HTTP Server. For human-only web sites, the
Referer values are currently analyzed mainly to track visitor
sources such as search engine queries. In the case of Linked
Data, the part of the Referer definition concerning lists of back
links is more relevant: if RDF crawlers and user agents set
this field correctly, a program could generate back links to local
entities from the web server's log files and increase the
overall connectivity of the Linked Data cloud.</p>
        <p>Footnote 3: This spelling is used in this paper to be consistent with the
HTTP specification.</p>
        <p>
          If Referer information is to be used to create links
between RDF entities, the link property URI has to be
determined first, as RDF does not allow untyped links. For
very generic cases, the RDF Schema (RDFS) specification
defines the rdfs:seeAlso property, which "indicates an
entity that might provide additional information" [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
However, the Linked Data specification allows the retrieval of
remote entities ("dereferencing") in order to gain more
information about that entity. The dereferenced remote RDF
document can then be processed into RDF statements,
possibly yielding the link property that was used to refer to a
local entity. Reconsider the situation depicted in Figure 1:
if a Referer value of ds2:res2 is logged for an HTTP
request to the server hosting ds1:res1 as part of Dataset 1,
an automatic process can retrieve the document describing
ds2:res2 to determine the property of the link
pointing to ds1:res1, in this case ex:p1.
        </p>
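        <p>The lookup described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: statements are assumed to be already parsed into plain (subject, predicate, object) tuples of URI strings, and the function name link_properties as well as the example.org style URIs are hypothetical.</p>
        <preformat>
```python
from typing import Iterable, List, Tuple

# A statement as a plain tuple of URI strings (an assumption of this sketch).
Triple = Tuple[str, str, str]


def link_properties(remote_triples: Iterable[Triple], local: str) -> List[str]:
    """Return the predicates of all remote statements whose object is the local entity."""
    return [pred for (_subj, pred, obj) in remote_triples if obj == local]


# Hypothetical content of the document retrieved from the Referer URL.
remote_doc = [
    ("http://ds2.example/res2", "http://example.org/p1", "http://ds1.example/res1"),
    ("http://ds2.example/res2", "http://example.org/name", "Resource 2"),
]

found = link_properties(remote_doc, "http://ds1.example/res1")
print(found)
```
        </preformat>
        <p>In this sketch only the first statement points to the local entity, so its predicate is the candidate link property for a back link.</p>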
        <p>
          One of the strengths of RDF is the possibility to describe
the vocabularies used to link entities in a machine-readable
and dereferenceable way as well. This description can be
encoded using either RDFS or the Web Ontology Language
(OWL) [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Using the owl:inverseOf property, a property
itself can define which property is to be used for back links.
For example, the link property hasChild could have the
inverse link property hasParent. Alternatively,
vocabularies can specify properties to be symmetric; for example, the
property hasFriend could be defined to be symmetric
(assuming a mainstream sociocultural environment). Should
a link property neither have an inverse link property nor be
defined to be symmetric, the remote statement linking the
local and remote entity can be included in the local data
set. Since agents can follow properties regardless of their
direction, these links can be useful to them as well.
        </p>
        <p>Figure 2 gives examples for both cases. In both pictures,
the dashed elements are new to the local data set. If the
inverse property is unknown, the remote statement is
included (1). If the inverse property is known, for example by
dereferencing the property URL, the correct link property
ex:p2, known to be the owl:inverseOf ex:p1, is included along with the
entity URL of the remote resource ds2:res2 (2).</p>
        <p>[Figure 2: (1) the remote statement linking ds2:res2 to ds1:res1 via ex:p1 is included locally; (2) the inverse property ex:p2 (owl:inverseOf ex:p1) links ds1:res1 and ds2:res2.]</p>
        <p>
          From these prerequisites, the automatic generation or
recommendation of back links in the Linked Data context is
possible. The following algorithm can be executed fully
automatically and, given that Referers are supplied by the user
agents, will generate new and meaningful links between
Linked Data entities in different data sets. In the
following, RDF statements are encoded as triples in the
notation (subject; predicate; object). Algorithm 1 details the
process of link (and statement) generation. After the
document pointed to by the Referer URL has been retrieved,
two cases are differentiated. If the response contains RDF
statements, they are checked for whether the local entity URL
occurs as subject or as object. If the local entity occurs as
a subject, the remote statement is returned. If the local
entity occurs as an object in one of the statements, three
cases are possible: First, the link property may be
symmetric; in this case it is used to create the connecting
statement (Line 11). Second, if the inverse property is known,
that property is used to create the new statement (Line 14).
Third, if neither is the case, the remote statement
is returned as well. For non-RDF documents, a string search
for the URI of the local entity within the remote document
is performed; if a match is found, an rdfs:seeAlso link is
created (Line 22), since this link property explicitly
allows linking to non-RDF resources [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>Algorithm 1: Link Generation from Referers</p>
        <preformat>
Require: Requested local entity URL u, Referer URL r
 1: rdoc ← retrieve(r)
 2: if isRDF(rdoc) then
 3:   statementSet ← parseRdf(rdoc)
 4:   for all s in statementSet do
 5:     if subject(s) == u then
 6:       return s
 7:     end if
 8:     if object(s) == u then
 9:       p ← predicate(s)
10:       if isSymmetric(p) then
11:         return (u; p; subject(s))
12:       end if
13:       if hasInverseProperty(p) then
14:         return (u; inverse(p); subject(s))
15:       end if
16:       n ← createNewLocalUrl()
17:       return s
18:     end if
19:   end for
20: else
21:   if contains(rdoc; u) then
22:     return (u; rdfs:seeAlso; r)
23:   end if
24: end if
        </preformat>
        <p>
          The statements generated by this algorithm can now be used
in a variety of ways. We propose two methods: First, the
statements could be handed over for review by another
software component or by the person responsible for the local data
set. Second, automatic inclusion into the data set is also
feasible. In this case, we recommend storing the statements
in a separate Named Graph, along with a machine-readable
provenance annotation, for example using the Provenance
Vocabulary [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
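        <p>A runnable sketch of Algorithm 1 might look as follows. It is an illustration under stated assumptions rather than the authors' code: retrieval and parsing are replaced by a caller-supplied retrieve function that returns a pair (triples or None, text), vocabulary knowledge is passed in as a set of symmetric properties and a dictionary of inverse properties, and generated back links are written with the local entity as subject, which is an assumption about the intended direction. All names and example URIs are hypothetical.</p>
        <preformat>
```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
# Hypothetical retrieval result: (parsed triples or None for non-RDF, raw text).
Document = Tuple[Optional[List[Triple]], str]

SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def generate_statement(
    u: str,
    r: str,
    retrieve: Callable[[str], Document],
    symmetric: Set[str],
    inverse_of: Dict[str, str],
) -> Optional[Triple]:
    """Return one statement linking local entity u and the Referer document r, if any."""
    triples, text = retrieve(r)
    if triples is not None:  # the response contained RDF statements
        for statement in triples:
            subj, pred, obj = statement
            if subj == u:
                return statement  # local entity is the subject: reuse as is
            if obj == u:
                if pred in symmetric:
                    return (u, pred, subj)
                if pred in inverse_of:
                    return (u, inverse_of[pred], subj)
                return statement  # no inverse known: include the remote statement
        return None
    if u in text:  # non-RDF document: plain string search
        return (u, SEE_ALSO, r)
    return None


# Hypothetical remote documents standing in for real HTTP retrieval.
LOCAL = "http://ds1.example/res1"
DOCS: Dict[str, Document] = {
    "http://ds2.example/doc": (
        [("http://ds2.example/res2", "http://example.org/p1", LOCAL)],
        "",
    ),
    "http://blog.example/post": (None, "see http://ds1.example/res1 for details"),
}

inverses = {"http://example.org/p1": "http://example.org/p2"}
back_link = generate_statement(LOCAL, "http://ds2.example/doc", DOCS.__getitem__, set(), inverses)
text_link = generate_statement(LOCAL, "http://blog.example/post", DOCS.__getitem__, set(), inverses)
print(back_link)
print(text_link)
```
        </preformat>
        <p>Passing the retrieval function and the vocabulary knowledge as parameters keeps the sketch free of network access, so the three branches can be exercised in isolation.</p>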
      </sec>
      <sec id="sec-2-2">
        <title>3. EVALUATION</title>
        <p>
          To answer our research question and validate our algorithm,
real log files from web servers hosting Linked Data sets were
analyzed. Two sets of log files were made available for the
USEWOD 2011 Data Challenge [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The first set of files
was created on the web server of the DBpedia project, the
second set on the web server hosting the Semantic Web Dog
Food project. Both servers used the Apache "combined"
log format<sup>4</sup>, which is the default setting. Each log entry
is represented by one line in the log file and
is similar to the following sample entry in the "combined"
format (line breaks added, not an actual log entry):
        </p>
        <preformat>
160.45.170.10 [07/Jan/2010:09:52:45 -0800]
"GET /resource/South_Bend,_Indiana HTTP/1.1"
303 40
"http://en.openei.org/wiki/South_Bend,_Indiana"
"Mozilla/4.0"
        </preformat>
        <p>The format is structured into fields for client IP address,
date and time, HTTP request method and URL, status code,
bytes transmitted for the response, "Referer" request header
field, and user agent (browser). In order to generate new
links, two things have to be determined: first, the URL of
the local resource that was requested, and second, the URL
of the remote resource the user agent visited before. This
data can be taken from the described log file format.</p>
        <p>In total, about 27.86 million log entries were parsed,
filtered, and checked for "interesting" Referer entries.
Filtering included the removal of log entries without the optional
Referer field, local redirects, and log entries with Referer
entries pointing to result pages of search engines such as
Google, Yahoo, etc. For all remaining entries, the Referer
URL was resolved, and the resulting HTML or RDF
document was searched for the URL of the local resource identifying
a local entity. Our requests expressed a preference for RDF
document responses using the HTTP Accept header. Thus,
this operation was defined to have four possible outcomes:</p>
        <list list-type="bullet">
          <list-item><p>Not found: the local resource was not found in the
remote document, neither in plain text nor in RDF.</p></list-item>
          <list-item><p>Text match: the local resource was found
in a plain text or HTML response.</p></list-item>
          <list-item><p>RDF subject match: the local resource was found in
a remote RDF statement as the subject entry.</p></list-item>
          <list-item><p>RDF object match: the local resource was found in a
remote RDF statement as the object entry. In this last
case, the properties used to link to the local resource
were also recorded.</p></list-item>
        </list>
        <p>
          For RDF matches (not considering possible links to HTML
documents), new statements linking the local and remote
resources were generated according to our algorithm. Then,
an additional request was performed on the local data set
to see whether it already contained this
statement. If this was not the case, the new statement could have
been added to the data set.
        </p>
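        <p>The parsing and filtering steps can be sketched as follows. This is a hedged illustration, not the evaluation code: the regular expression assumes the standard Apache "combined" layout including the identity and user fields (which the shortened sample above omits), the search engine list is a placeholder, and the function name interesting and the host dbpedia.org used in the usage lines are assumptions.</p>
        <preformat>
```python
import re
from typing import Optional, Tuple
from urllib.parse import urlparse

# Groups: 1 client IP, 2 timestamp, 3 method, 4 path, 5 status, 6 bytes,
# 7 Referer, 8 user agent. Identity and user fields are matched but not captured.
LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$'
)

# Placeholder list; the filter used in the evaluation is not specified in detail.
SEARCH_ENGINES = ("google.", "yahoo.")


def interesting(line: str, local_host: str) -> Optional[Tuple[str, str]]:
    """Return (local resource URL, Referer URL) for entries worth dereferencing."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    path, referer = match.group(4), match.group(7)
    host = urlparse(referer).hostname
    if host is None:  # missing Referer, logged as "-"
        return None
    if host == local_host:  # local redirect or internal navigation
        return None
    if any(name in host for name in SEARCH_ENGINES):
        return None
    return ("http://" + local_host + path, referer)


sample = (
    '160.45.170.10 - - [07/Jan/2010:09:52:45 -0800] '
    '"GET /resource/South_Bend,_Indiana HTTP/1.1" 303 40 '
    '"http://en.openei.org/wiki/South_Bend,_Indiana" "Mozilla/4.0"'
)
no_referer = '10.0.0.1 - - [07/Jan/2010:09:52:46 -0800] "GET /resource/Berlin HTTP/1.1" 303 40 "-" "curl/7.0"'

pair = interesting(sample, "dbpedia.org")
skipped = interesting(no_referer, "dbpedia.org")
print(pair)
print(skipped)
```
        </preformat>
        <p>Each returned pair corresponds to the two inputs of Algorithm 1, the requested local entity URL u and the Referer URL r.</p>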
        <p>The frequencies of the possible outcomes mentioned above
as well as the properties used for object matches can give
an indication of whether the additional links created using our
approach merit the additional effort of analyzing log files for
Referer entries.</p>
        <p>The most frequent properties used in object matches are
given in Table 2 for the two data sets. Entries with fewer than
ten occurrences are not included. Both the dereferencing
results and the statements generated for the respective
data sets are available online<sup>6</sup> in order to support further
analysis.</p>
        <p>Footnote 4: http://httpd.apache.org/docs/current/logs.html#combined</p>
        <sec id="sec-2-2-1">
          <title>Table 2 (DBpedia), columns: Property URI, Freq.</title>
          <list list-type="simple">
            <list-item><p>http://www.w3.org/2002/07/owl#sameAs</p></list-item>
            <list-item><p>http://dbpedia.org/ontology/wikiPageRedirects</p></list-item>
            <list-item><p>http://rdfs.org/sioc/ns#links_to</p></list-item>
            <list-item><p>http://www.rkbexplorer.com[..]#duplicate</p></list-item>
          </list>
        </sec>
        <sec id="sec-2-2-2">
          <title>Table 2 (SWDF), columns: Property URI, Freq.</title>
          <list list-type="simple">
            <list-item><p>http://www.w3.org/2002/07/owl#sameAs</p></list-item>
            <list-item><p>http://xmlns.com/foaf/0.1/knows</p></list-item>
            <list-item><p>http://www.w3.org[..]rdf-schema#seeAlso</p></list-item>
          </list>
          <p>Footnote 5: Accessible on 2011/03/10.</p>
          <p>Footnote 6: http://page.mi.fu-berlin.de/muehleis/ldow2011/</p>
          <p>Two main conclusions can be drawn from our evaluation:
First, the generation of new links between Linked Data
entities is indeed possible using log files that contain
Referer values. Second, the comparatively small number of
statements generated shows the failure of Linked Data clients and
crawlers to properly set the Referer header.</p>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>4. RELATED WORK</title>
        <p>
          Link discovery between data entities across data sets
requires linkage recording and duplicate detection techniques.
While there is a large amount of related work on these
topics in the database community [
          <xref ref-type="bibr" rid="ref15 ref5">15, 5</xref>
          ] as well as on ontology
matching in the knowledge representation community [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ],
the approaches for Linked Data are still limited at the
moment.
        </p>
        <p>
          The Silk Link Discovery Framework [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] is an identity
resolution framework which generates RDF links between data
items based on user-provided link specifications, which are
expressed using the Silk Link Specification Language. Silk
is available in different variants, one of them being Silk
Server. Silk Server can be used as an identity resolution
component within applications that consume Linked Data
from the Web. It provides an HTTP API for matching
instances from an incoming stream of RDF data.
        </p>
        <p>
          LIMES [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] is a link discovery framework for the Web of
Data. It is available as a web interface as well as a standalone
tool. It offers string metrics.
        </p>
        <p>
          LinQuer [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] is a tool for semantic link discovery over
relational data, based on string and semantic matching
techniques and their combinations. The LinQuer framework
rewrites linkage requirement queries into standard SQL
queries that can be run over relational data sources. LinQuer is
meant to be used together with wrappers that expose relational
databases as RDF, such as D2R Server or Virtuoso RDF Views.
        </p>
        <p>
          Raimond et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] propose a link discovery algorithm that
takes into account both the similarities of data entities on
the Web of Data and those of their neighbor entities. The
algorithm is implemented within the GNAT tool.
        </p>
        <p>The RKBExplorer sameAs service<sup>7</sup> provides a unified view
over different Linked Data sets by managing owl:sameAs
links to identify duplicate URIs. The links have to be
provided to the system from external sources, which also applies
to the related BackLink service.</p>
        <p>Most of the current approaches generate links
semi-automatically based on user-defined link specifications. This
requires data providers to keep up with new linking
possibilities and schemata. Furthermore, except for Silk Server and
RKBExplorer's sameAs service, the data sets to be linked have
to be specified manually. This does not scale with the growing
number of data sets on the Web of Data.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Footnote 7: http://www.rkbexplorer.com/sameAs/</title>
      <sec id="sec-3-1">
        <title>5. CONCLUSION</title>
        <p>Acting on the fourth Linked Data principle, namely the need
for cross-dataset links between Linked Data entities, we have
identified the Referer request header field defined by the
HTTP specification as a possible source for the automatic
creation of those links. However, the presence of a Referer
URL does not prove the presence of an existing link to a
local entity. Thus, our approach is based on applying the third
Linked Data principle, the possibility of dereferencing
arbitrary URLs, to the Referer URL. By retrieving the
document identified by the Referer, we were able to
ascertain the presence of a link from a remote entity to a local
entity along with the link type used. We were then also able
to determine the semantically correct back link property and
create a new locally stored back link leading from a local
entity to a remote entity.</p>
        <p>We have evaluated our fully automatic approach using log
entries from web servers hosting the DBpedia and Semantic
Web Dog Food data sets. In total, 27.86 million log entries
were analyzed, and 24,668 Referer URLs were dereferenced,
yielding 9,401 distinct results. From these results, we were
able to generate 643 new typed links. Our results show the
feasibility and practicability of automatic back link
generation for Linked Data entities using Referer information in
general and web server log files in particular.</p>
        <p>From our results, the failure of many Linked Data clients
and spider programs to add the Referer header field to their
requests was identified as the main factor limiting the
number of statements generated by our algorithm. We
therefore urge developers of Linked Data tools to set
the Referer request header to the resource in which the URL of
the document currently being retrieved was found, whenever
possible.</p>
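        <p>For client developers, setting the header is typically a one-line change. The following minimal sketch uses the Python standard library to build, without sending, a request that carries both a Referer and an Accept header; the function name build_request, the example URLs, and the choice of application/rdf+xml as the preferred media type are assumptions made for illustration.</p>
        <preformat>
```python
import urllib.request


def build_request(target_url: str, found_in: str) -> urllib.request.Request:
    """Prepare a request for target_url that names the document the URL was found in."""
    return urllib.request.Request(
        target_url,
        headers={
            "Referer": found_in,  # where the crawler discovered target_url
            "Accept": "application/rdf+xml",  # assumed preference for RDF
        },
    )


# Hypothetical crawl step: the link to the DBpedia entity was found in a remote document.
request = build_request("http://dbpedia.org/resource/Berlin", "http://ds2.example/doc")
print(request.get_header("Referer"))
```
        </preformat>
        <p>The request object can then be passed to urllib.request.urlopen by a crawler; the sketch stops before any network access.</p>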
      </sec>
      <sec id="sec-3-2">
        <title>5.1 Further Work</title>
        <p>
          Since our approach can be used to directly add statements
based on information loaded from remote sources, the
statements generated are susceptible to malicious requests
and malicious remote statements. For example, if an
attacker published RDF data linking a popular
DBpedia entity (e.g. dbpedia:Berlin) to an advertisement page
and then issued a request for this entity with that document
as Referer, the algorithm would automatically create a link
from the popular resource to the advertisement page. To
overcome this problem, one could evaluate provenance
information in order to establish and enforce a required trust
level before new links are created [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>We would also like to create a generic tool for Linked Data
server administrators, which they can use to automatically
process their log entries for interesting Referers, generate
new back links, and automatically publish these links again
in their local data set. Alternatively, the tool could also
display the new statements to an administrator for approval.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Acknowledgments</title>
        <p>This work has been partially supported by the "DigiPolis"
project funded by the German Federal Ministry of Education
and Research (BMBF), support code 03WKP07B. The
authors would like to thank the reviewers and their colleagues
R. Oldakowski and M. Luczak-Rösch for their insights.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Sean</given-names>
            <surname>Bechhofer</surname>
          </string-name>
          , Frank van Harmelen,
          <string-name>
            <surname>Jim Hendler</surname>
          </string-name>
          , et al.
          <source>OWL Web Ontology Language Reference</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Berendt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hollink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hollink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Luczak-Rösch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. H.</given-names>
            <surname>Möller</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Vallet</surname>
          </string-name>
          .
          <article-title>USEWOD2011 - 1st international workshop on usage analysis and the web of data</article-title>
          .
          <source>In 20th International World Wide Web Conference (WWW2011)</source>
          , Hyderabad, India,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Tim</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          .
          <source>Linked data</source>
          ,
          <year>2006</year>
          . http://www.w3.org/DesignIssues/LinkedData.html, accessed 2010-08-12.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Dan</given-names>
            <surname>Brickley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.V.</given-names>
            <surname>Guha</surname>
          </string-name>
          , and
          <string-name>
            <surname>Brian McBride</surname>
          </string-name>
          .
          <source>RDF Vocabulary Description Language</source>
          ,
          <year>02 2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Ahmed</surname>
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Elmagarmid</surname>
          </string-name>
          , Panagiotis G. Ipeirotis, and
          <string-name>
            <surname>Vassilios</surname>
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Verykios</surname>
          </string-name>
          .
          <article-title>Duplicate record detection: A survey</article-title>
          .
          <source>IEEE Trans. on Knowl. and Data Eng</source>
          .,
          <volume>19</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Jérôme</given-names>
            <surname>Euzenat</surname>
          </string-name>
          , Alfio Ferrara,
          <string-name>
            <surname>Christian Meilicke</surname>
          </string-name>
          , et al.
          <article-title>Results of the ontology alignment evaluation initiative 2010</article-title>
          .
          <source>In Proc. 5th ISWC workshop on ontology matching (OM)</source>
          ,
          <source>Shanghai (CN)</source>
          , pages
          <fpage>85</fpage>
          -
          <lpage>117</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Fielding</surname>
          </string-name>
          , Gettys, Mogul, Frystyk, Masinter, Leach, and
          <string-name>
            <surname>Berners-Lee</surname>
          </string-name>
          .
          <source>Hypertext Transfer Protocol - HTTP/1.1</source>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Olaf</given-names>
            <surname>Hartig</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Langegger</surname>
          </string-name>
          .
          <article-title>A database perspective on consuming linked data on the web</article-title>
          . Datenbank-Spektrum, Semantic Web Special Issue,
          <volume>10</volume>
          /
          <year>2010</year>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Olaf</given-names>
            <surname>Hartig</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jun</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <article-title>Publishing and consuming provenance metadata on the web of linked data</article-title>
          . In Deborah L.
          <string-name>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <surname>James Michaelis</surname>
          </string-name>
          , and Luc Moreau, editors,
          <source>IPAW</source>
          , volume
          <volume>6378</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>78</fpage>
          -
          <lpage>90</lpage>
          . Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Oktie</surname>
            <given-names>Hassanzadeh</given-names>
          </string-name>
          , Reynold Xin,
          <string-name>
            <given-names>Renee J</given-names>
            .
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Anastasios</given-names>
            <surname>Kementsietsidis</surname>
          </string-name>
          , et al.
          <article-title>Linkage query writer</article-title>
          .
          <source>PVLDB</source>
          ,
          <volume>2</volume>
          (
          <issue>2</issue>
          ):
          <fpage>1590</fpage>
          -
          <lpage>1593</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Robert</surname>
            <given-names>Isele</given-names>
          </string-name>
          , Anja Jentzsch, and
          <string-name>
            <given-names>Christian</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <article-title>Silk Server - Adding missing Links while consuming Linked Data</article-title>
          .
          <source>In 1st International Workshop on Consuming Linked Data (COLD 2010)</source>
          , Shanghai,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
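          <string-name>
            <given-names>Axel-Cyrille</given-names>
            <surname>Ngonga Ngomo</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sören</given-names>
            <surname>Auer</surname>
          </string-name>
          .
          <article-title>LIMES - a time-efficient approach for large-scale link discovery on the web of data</article-title>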
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Eyal</surname>
            <given-names>Oren</given-names>
          </string-name>
          , Renaud Delbru, Michele Catasta, Richard Cyganiak, et al.
          <article-title>Sindice.com: a document-oriented lookup index for open linked data</article-title>
          .
          <source>Int. J. of Metadata and Semantics and Ontologies</source>
          ,
          <volume>3</volume>
          :
          <fpage>37</fpage>
          -
          <lpage>52</lpage>
          , November 10
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Yves</surname>
            <given-names>Raimond</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Sutton</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Mark</given-names>
            <surname>Sandler</surname>
          </string-name>
          .
          <article-title>Automatic interlinking of music datasets on the semantic web</article-title>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>William</surname>
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Winkler</surname>
          </string-name>
          .
          <article-title>Overview of record linkage and current research directions</article-title>
          .
          <source>Technical report, Bureau of the Census</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>