<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Silk Server - Adding missing Links while consuming Linked Data</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Robert</forename><surname>Isele</surname></persName>
							<email>robertisele@googlemail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Freie Universität Berlin</orgName>
								<address>
									<addrLine>Web-based Systems Group Garystr. 21</addrLine>
									<postCode>14195</postCode>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Anja</forename><surname>Jentzsch</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Freie Universität Berlin</orgName>
								<address>
									<addrLine>Web-based Systems Group Garystr. 21</addrLine>
									<postCode>14195</postCode>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Christian</forename><surname>Bizer</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Freie Universität Berlin</orgName>
								<address>
									<addrLine>Web-based Systems Group Garystr. 21</addrLine>
									<postCode>14195</postCode>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Silk Server - Adding missing Links while consuming Linked Data</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">573E2CAFDB5B5326D28A39372DA43CBB</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T18:29+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Linked Data</term>
					<term>Link Discovery</term>
					<term>Identity Resolution</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Web of Linked Data is built upon the idea that data items on the Web are connected by RDF links. Sadly, the reality on the Web shows that Linked Data sources set some RDF links pointing at data items in related data sources, but they clearly do not set RDF links to all data sources that provide related data. In this paper, we present Silk Server, an identity resolution component, which can be used within Linked Data application architectures to augment Web data with additional RDF links. Silk Server is designed to be used with an incoming stream of RDF instances, produced for example by a Linked Data crawler. Silk Server matches the RDF descriptions of incoming instances against a local set of known instances and discovers missing links between them. Based on this assessment, an application can store data about newly discovered instances in its repository or fuse data that is already known about an entity with additional data about the entity from the Web. Afterwards, we report on the results of an experiment in which Silk Server was used to generate RDF links between authors and publications from the Semantic Web Dog Food Corpus and a stream of FOAF profiles that were crawled from the Web.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The Web of Linked Data <ref type="bibr" target="#b2">[3]</ref> is built upon two simple ideas: Structured data is published on the Web using dereferenceable URIs to represent data items, and related data items are connected using RDF links. In its present state, the Web of Linked Data contains only a fraction of the links that would be desirable<ref type="foot" target="#foot_0">1</ref>. According to Rodriguez <ref type="bibr" target="#b13">[14]</ref>, the Web of Data graph merely consists of two weakly connected components with a large diameter of 10 and an average path length of 3.4. A Linked Data application which wants to exploit the relationships between data items from different data sources thus might want to augment Web data with additional links before using it in the application context.</p><p>In order to tackle this problem, we provide the Silk Link Discovery Framework <ref type="bibr" target="#b14">[15]</ref>. Silk generates RDF links between data items based on user-provided link specifications which are expressed using the Silk Link Specification Language (Silk-LSL). Silk is provided in three different variants which address different use cases:</p><p>• Silk Single Machine is used to generate RDF links between two datasets on a single machine. • Silk MapReduce is based on Hadoop and enables Silk to scale out to very big datasets by distributing the link generation to multiple machines. • Silk Server can be used as an identity resolution component within applications that consume Linked Data from the Web.</p><p>This paper focuses on Silk Server, which has recently been added as a new component to the Silk Link Discovery Framework. Silk Single Machine and Silk MapReduce are described on the Silk homepage 2 .</p><p>Silk Server is designed to be used with an incoming stream of RDF instances, produced for example by a Linked Data crawler such as LDSpider 3 . 
Silk Server matches incoming instances against a local set of known instances and discovers missing links between them. Incoming instances which do not match a known instance are added to the local set of instances continuously. Based on this assessment, an application can store data about newly discovered instances in its repository or fuse data that is already known about an entity with additional data about the entity from the Web.</p><p>The main features of the Silk Server are:</p><p>• It runs as an HTTP server and offers a REST interface <ref type="bibr" target="#b8">[9]</ref> that allows applications to check whether an entity that has been discovered on the Web is already known to the system. If the entity is already known, Silk Server returns an RDF link pointing at the URI identifying the known entity.</p><p>• It provides a flexible, declarative language for specifying the conditions which determine whether an entity is already known to the system. • It achieves high performance by holding the data about all known instances in an in-memory cache, which is updated as soon as new instances are discovered. In addition, the performance can be further enhanced using a blocking feature.</p><p>The paper is structured as follows: Section 2 explains the role Silk Server can play within Linked Data application architectures. In Section 3, the architecture and workflow of the Silk Server are presented. Section 4 reports on the results of an experiment in which Silk Server was used to generate RDF links between the data about authors and publications from the Semantic Web Dog Food Corpus <ref type="bibr" target="#b11">[12]</ref> and a stream of FOAF 4 profiles that were crawled from the Web. Section 5 compares Silk Server with related work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Silk Server within Linked Data Application Architectures</head><p>This section discusses the role of Silk Server within Linked Data application architectures. Figure <ref type="figure" target="#fig_0">1</ref> gives an overview of the architecture of a fully-fledged Linked Data application which operates on top of the public Web of Linked Data <ref type="bibr" target="#b2">[3]</ref>. All data that is published on the Web according to the Linked Data principles becomes part of a giant global graph -the Web of Linked Data. This logical graph is depicted in the Web of Linked Data layer in Figure <ref type="figure" target="#fig_0">1</ref>. Applications that utilize this graph might implement the modules (or a subset of the modules) depicted in the Data Access, Integration and Storage Layer. In the following, we will describe the functionality of the different modules.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Web Data Access Module</head><p>The basic means to access Linked Data on the Web is to dereference HTTP URIs into RDF descriptions and to discover additional data by traversing RDF links. Such link traversal can for instance be implemented using readily available Linked Data crawlers such as LDSpider. In addition, the data access module might download RDF data set dumps or utilize SPARQL endpoints (for an overview of SPARQL-based distributed query architectures please refer to <ref type="bibr" target="#b9">[10]</ref>). Data set dumps and SPARQL endpoints might be discovered by the data access module by relying on VOID descriptions <ref type="bibr" target="#b0">[1]</ref> and Semantic Web Sitemaps <ref type="bibr" target="#b5">[6]</ref> published on the Web by the data sources.</p><p>URIs to identify the same entity in order to enable clients to directly retrieve data describing the entity from the different sources using the HTTP protocol. In addition, data sources might publish owl:sameAs links pointing at URIs that are used by other data sources to identify the same entity.</p><p>In contrast, it is often desirable for Linked Data applications to locally use only a single URI as the subject of all RDF statements about an entity while keeping track of the provenance of the statements. Thus, in addition to using the owl:sameAs statements that are part of the original Web data, applications might also employ a local identity resolution module, which generates additional owl:sameAs statements and interlinks newly discovered data about entities with data about them that is already known by the application. Silk Server provides this functionality and can thus be used as an identity resolution module within Linked Data applications.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Quality Evaluation Module</head><p>Due to the open nature of the Web, any Web data needs to be treated with suspicion, and Linked Data applications should thus consider RDF statements which they discover on the Web as claims by a specific source rather than as facts. In order to determine which claims to accept and trust, Linked Data applications should employ a data quality evaluation module. This module may filter RDF spam, prefer data from sources that are known for good quality and optionally resolve data conflicts <ref type="bibr" target="#b3">[4]</ref>. An overview of the different information quality assessment heuristics that can be used by the quality evaluation module is given in <ref type="bibr" target="#b1">[2]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Integrated Web Data</head><p>At the end of the processing pipeline, the cleaned Web data is stored in a repository together with provenance information to be used by the application layer. A commonly used model for representing Web data together with provenance information is Named Graphs <ref type="bibr" target="#b4">[5]</ref>. Different vocabularies for exposing provenance information are currently being compared by the W3C Provenance Incubator Group<ref type="foot" target="#foot_2">6</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">The Silk Server</head><p>Silk Server is an identity resolution component that can be used within Linked Data application architectures. It runs as an HTTP server and matches instances of an incoming RDF stream against a local set of known instances based on user-provided link specifications. In the following, we will describe the architecture of the Silk Server as well as the general linking workflow.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Architecture</head><p>The Silk Server is composed of the following three layers:</p><p>The in-memory instance cache forms the bottom layer, which holds all known instances and keeps track of newly discovered instances. For each instance, the values of all relevant properties which are later required for the comparison are stored. As soon as a new instance is discovered, it is added to the instance cache. This enables the server to generate links to the newly discovered instance in future requests. Currently, the instance cache is held in memory, but it can be replaced by a persistent cache in future versions of Silk. The current implementation of the instance cache can fit approximately 10 million instances into 8GB of main memory.</p><p>The Silk Linking Engine generates the links based on a set of link specifications and forms the central part of Silk Server. The details of the link generation process are covered in Section 3.3.</p><p>The REST interface enables applications to commit newly discovered resources and receive the generated links. New resources are accepted through an HTTP POST request using one of the supported RDF serialization formats, such as RDF/XML or N-Triples. The response contains all generated links, optionally including statements declaring unknown instances, i.e. instances for which no link could be generated. The server can process multiple requests in parallel.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Data Processing Workflow</head><p>The Silk Server workflow is divided into two phases: In the Setup phase, the server loads all data sets which are specified by the user-provided link specifications. For each link specification, one instance cache is used to hold the part of the data that is later required for matching instances.</p><p>The Service phase starts as soon as all data sets have been loaded. If an application discovers new instances on the Web, it issues a request to the server containing the newly found data. The request may contain multiple instances with different types. On receiving the request, the server matches the given instances with its link specifications. If a link specification can be applied to a specific instance, the server forwards it to the Silk Linking Engine. The Silk Linking Engine generates links for the given instances based on the corresponding link specifications as described in Section 3.3.</p><p>The generated links are processed by the server to find the set of instances which are not matched by any known instance. The instance cache is updated with the set of unmatched instances. Thus, in future requests the server will also generate links to the newly found unmatched instances.</p><p>After the update has been completed, the generated links along with statements containing the unmatched instances are returned.</p></div>
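<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Service phase described above can be illustrated with a minimal Python sketch. It is not the Silk implementation (which is written in Scala): the names InstanceCache and handle_request, the representation of an instance as a URI plus a property dictionary, and the pluggable matches function are all assumptions made for illustration only.</p>
```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an instance is a URI plus a dict of property values.
Instance = Tuple[str, Dict[str, str]]
Link = Tuple[str, str, str]  # (source URI, predicate, target URI)

SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


class InstanceCache:
    """Toy in-memory cache of known instances, keyed by URI."""

    def __init__(self, known: List[Instance]) -> None:
        self.instances: Dict[str, Dict[str, str]] = dict(known)

    def add(self, instance: Instance) -> None:
        uri, props = instance
        self.instances[uri] = props


def handle_request(
    cache: InstanceCache,
    incoming: List[Instance],
    matches: Callable[[Dict[str, str], Dict[str, str]], bool],
) -> Tuple[List[Link], List[str]]:
    """Match incoming instances against the cache, then add unmatched ones."""
    links: List[Link] = []
    unmatched: List[Instance] = []
    for uri, props in incoming:
        # Compare the incoming instance with every other known instance.
        targets = [
            known_uri
            for known_uri, known_props in cache.instances.items()
            if known_uri != uri and matches(props, known_props)
        ]
        if targets:
            links.extend((uri, SAME_AS, target) for target in targets)
        else:
            unmatched.append((uri, props))
    # The cache is updated only after all links of this request are generated.
    for instance in unmatched:
        cache.add(instance)
    return links, [uri for uri, _ in unmatched]
```
<p>In this sketch, an instance that is reported as unknown in one request becomes a link target in later requests, which mirrors the cache update step of the workflow.</p></div>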
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">The Silk Linking Engine</head><p>When receiving new instances to be matched, the Silk Linking Engine generates new buckets consisting of a provided instance and a set of instances from the cache. Each bucket is processed in three consecutive phases:</p><p>The optional Blocking phase partitions the incoming buckets into clusters. Since comparing every source resource to every single target resource results in n * m comparisons (n being the number of source resources, m the number of target resources), blocking can be used to reduce the number of comparisons. Blocking partitions similar data items into clusters, limiting the comparisons to items in the same cluster. For example, given a set of books to be compared, one could block the books by publisher in order to reduce the number of comparisons. In this case, only books from the same publisher will be compared.</p><p>The Link Generation phase reads the incoming buckets and computes a similarity value for each pair of instances. The incoming data items, which might be allocated to a cluster by the preceding blocking phase, are written to an internal cache. From the cache, pairs of data items are generated. If blocking is disabled, this will generate the complete Cartesian product of the two data sets. If blocking is enabled, only data items from the same cluster are compared. For each pair of data items, the link condition is evaluated, which computes a similarity value between 0 and 1. Each pair generates a preliminary link with a confidence according to the similarity of the source and target data item.</p><p>The Filtering phase filters the incoming links in two stages: In the first stage, all links with a lower confidence than the user-defined threshold are removed. In the second stage, all links which originate from the same subject are grouped together. 
If a limit is defined on the number of links per subject, only the links with the highest confidence are forwarded to the output.</p></div>
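<div xmlns="http://www.tei-c.org/ns/1.0"><p>The three phases can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Silk Linking Engine itself: data items are plain dictionaries with an "id" key, and the function names generate_links and filter_links as well as the block_key and similarity parameters are hypothetical.</p>
```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Item = Dict[str, str]
ScoredLink = Tuple[str, str, float]  # (source id, target id, confidence)


def generate_links(
    sources: List[Item],
    targets: List[Item],
    similarity: Callable[[Item, Item], float],
    block_key: Optional[Callable[[Item], Hashable]] = None,
) -> List[ScoredLink]:
    """Blocking (optional) followed by link generation."""
    if block_key is None:
        # No blocking: complete Cartesian product, i.e. n * m comparisons.
        pairs = [(s, t) for s in sources for t in targets]
    else:
        # Blocking: only items that share a cluster key are compared.
        clusters: Dict[Hashable, List[Item]] = defaultdict(list)
        for t in targets:
            clusters[block_key(t)].append(t)
        pairs = [(s, t) for s in sources for t in clusters.get(block_key(s), [])]
    # Every compared pair yields a preliminary link with a confidence in [0, 1].
    return [(s["id"], t["id"], similarity(s, t)) for s, t in pairs]


def filter_links(
    links: List[ScoredLink], threshold: float, limit: Optional[int] = None
) -> List[ScoredLink]:
    """Stage 1: drop links below the threshold. Stage 2: keep the best per source."""
    by_source: Dict[str, List[ScoredLink]] = defaultdict(list)
    for link in links:
        if link[2] >= threshold:
            by_source[link[0]].append(link)
    result: List[ScoredLink] = []
    for source_links in by_source.values():
        ranked = sorted(source_links, key=lambda link: link[2], reverse=True)
        result.extend(ranked if limit is None else ranked[:limit])
    return result
```
<p>For the book example above, passing a block_key that returns the publisher restricts the comparisons to books from the same publisher, while omitting it compares every source book with every target book.</p></div>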
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Implementation</head><p>Silk Server is implemented in Scala<ref type="foot" target="#foot_3">7</ref> and runs as a Servlet on the Jetty Web Server<ref type="foot" target="#foot_4">8</ref> . The REST interface has been realized using the Lift Web Framework<ref type="foot" target="#foot_5">9</ref> . The Silk Link Discovery Framework including Silk Server can be downloaded from the project homepage<ref type="foot" target="#foot_6">10</ref> under the terms of the Apache Software License.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Evaluation</head><p>This section reports on the results of an experiment in which we used Silk Server to generate RDF links between authors and publications from a Semantic Web Dog Food Corpus dump and a stream of FOAF profiles that we crawled from the Web. The Semantic Web Dog Food Corpus publishes information on people and publications from Semantic Web conferences. FOAF is a widely used vocabulary to describe persons, their connections, projects, publications and interests. Twitter is a social networking and microblogging website which provides user information as RDFa. Given these different sources of information on persons, the experiment aims at linking duplicate person descriptions. In the following, we first explain the Silk-LSL specification used by Silk Server in the experiment; we then describe the setup of the experiment and finally report on and discuss the results.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">The Link Specification used</head><p>Figure <ref type="figure" target="#fig_3">3</ref> contains the link configuration used in the experiment for linking data items describing the same person. The complete link configuration for discovering RDF links between persons as well as publications is available online<ref type="foot" target="#foot_7">11</ref>.</p><p>The involved data sources for this experiment are the Semantic Web Dog Food Corpus dump (line 5) and an RDF input stream (line 9).</p><p>A link configuration may contain several link specifications if links for different types of data items should be generated. Silk Server will set owl:sameAs links between duplicates as configured in line 16.</p><p>Link specifications contain link conditions which define the conditions that data entities must fulfill in order to be interlinked. Link conditions may apply similarity metrics to multiple property values of an entity or related entities. The resulting similarity scores can be combined and weighted using various similarity aggregation functions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Link Conditions</head><p>The link condition specifies how two data entities are compared for similarity. It consists of a number of comparison operators which are combined using aggregation functions.</p><p>A comparison operator evaluates two inputs and computes their similarity based on a user-defined metric. Silk provides several similarity metrics including string, numeric, date, and URI similarity. String comparison methods cover the most common ones like Jaro, Jaro-Winkler and Levenshtein. Silk can easily be enhanced with new metrics.</p><p>Multiple comparisons can be aggregated using a specific aggregation method by using the &lt;Aggregate&gt; directive.</p><p>In the given experiment's link condition, we compute similarity values for the FOAF names, homepages, and mailbox hash sums (lines 24 to 45). The overall similarity value of two data entities is the weighted average of the similarity values of all comparisons. To identify a person uniquely, either a homepage or a mailbox hash sum is required. Thus, two persons are considered equal if both names and either the homepage or the mailbox hash sum match.</p><p>Some comparison operators might be more relevant for the correct establishment of a link between two resources than others and can therefore be weighted higher. If no weight is supplied, a default weight of 1 will be assumed. As a person may be known under different names, matching homepages or mailbox hash sums are more important and therefore weighted higher (line 35).</p><p>Filtering: The generated links can be filtered by using the &lt;Filter&gt; directive. A threshold for the minimum similarity of two data items required to generate a link between them can be defined (line 47). The number of links originating from a single data item can be limited. Only the highest-rated links per source data item will remain after the filtering.</p></div>
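<div xmlns="http://www.tei-c.org/ns/1.0"><p>The weighted aggregation described in this section can be sketched as follows. The sketch is illustrative only and does not reproduce the actual link specification: the function names, the identifier weight of 2, and the use of the maximum to model "either the homepage or the mailbox hash sum" are assumptions; only the default weight of 1 is taken from the text.</p>
```python
from typing import List, Tuple


def weighted_average(scores: List[Tuple[float, float]]) -> float:
    """Aggregate (similarity, weight) pairs into one similarity value."""
    total_weight = sum(weight for _, weight in scores)
    if total_weight == 0:
        return 0.0
    return sum(sim * weight for sim, weight in scores) / total_weight


def person_similarity(
    name_sim: float,
    homepage_sim: float,
    mbox_sim: float,
    id_weight: float = 2.0,  # hypothetical weight; the text only says "higher"
) -> float:
    """Combine name similarity with the better of the two identifier matches."""
    # Assumption: "either homepage or mailbox hash" is modelled as a maximum.
    identifier_sim = max(homepage_sim, mbox_sim)
    # The name comparison uses the default weight of 1.
    return weighted_average([(name_sim, 1.0), (identifier_sim, id_weight)])
```
<p>With these assumed weights, a slightly different spelling of a name combined with an identical homepage still yields a high overall similarity, whereas a matching name without any matching identifier stays far below a typical threshold.</p></div>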
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Setup of the Experiment</head><p>For the experiment, we loaded the Semantic Web Dog Food Corpus into the Silk Server. The Semantic Web Dog Food Corpus contains profiles for 3,739 persons, of which 2,580 provide either a homepage or a mailbox hash, which is required to uniquely identify them. We have set up a Linked Data crawler which takes a number of FOAF profile URIs as seeds and follows linked profiles. The crawled documents are forwarded to Silk Server, which generates owl:sameAs links to known persons from the Semantic Web Dog Food Corpus. All generated links have been written to an output file which has been analyzed for the results presented in Section 4.3.</p><p>The crawler was also used to traverse the RDFa of Twitter accounts, for which the server identified the corresponding persons in the Semantic Web Dog Food Corpus, if any.</p><p>In order to show the flexibility of Silk Server, the link configuration was further enhanced to also match publications. For this purpose, the crawler was employed to also follow publication links in addition to FOAF profiles.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Results of the Experiment</head><p>Generated links to FOAF profiles: First, we evaluated how exhaustive the found links are. For this purpose, we exploited the fact that for 56 persons the Semantic Web Dog Food Corpus already sets links to their FOAF profile. For 51 of these persons, Silk Server was able to reconstruct links from the stream. For some persons, even multiple duplicated profiles could be identified. For example, in addition to Tom Heath's<ref type="foot" target="#foot_8">12</ref> official FOAF profile &lt;http://tomheath.com/id/me&gt;, Silk Server also identified him on &lt;http://www.eswc2006.org/people/#tom-heath&gt;. Because in some cases Silk Server found a link to a profile other than the one given in the data set, we checked all links manually for correctness. All generated links were found to be correct.</p><p>Next, we evaluated for how many persons in the Semantic Web Dog Food Corpus the server was able to generate links to a FOAF profile. In total, Silk Server was able to find profiles for 228 persons in the data set. Thus, Silk Server was able to discover links to the FOAF profiles of an additional 177 persons for which the Semantic Web Dog Food Corpus did not yet contain a link.</p><p>Generated links to Twitter accounts: For 89 persons in the Semantic Web Dog Food Corpus, Silk Server was able to find a corresponding Twitter account. Silk Server was able to detect more than one account for persons holding multiple accounts. For example, it found that Ralph Hodgson<ref type="foot" target="#foot_9">13</ref> not only uses the account http://twitter.com/ralphtq but also the account http://twitter.com/oegovnews.</p><p>Generated links to publications: For 37 publications in the Semantic Web Dog Food Corpus, Silk Server was able to find the corresponding publication in the Web of Data. 
The number of links is lower than the number of found FOAF profiles because many persons do not link their publications in their profile. One exception is the Digital Enterprise Research Institute (DERI), which publishes the metadata about all publications as RDF<ref type="foot" target="#foot_10">14</ref>.</p></div>
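<div xmlns="http://www.tei-c.org/ns/1.0"><p>The figures reported for the FOAF profiles imply the following simple arithmetic. The term recall is used here only as an informal label and assumes that the 56 links already present in the Semantic Web Dog Food Corpus are taken as the reference set; it is not a metric stated in the evaluation itself.</p>
```latex
\[
\text{recall on existing links} = \frac{51}{56} \approx 0.91,
\qquad
\text{newly linked persons} = 228 - 51 = 177 .
\]
```
</div>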
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Related Work</head><p>Discovering links between data items across data sets requires record linkage and duplicate detection techniques. There is a large body of related work on these topics within the database community <ref type="bibr">[16][7]</ref> as well as on ontology matching in the knowledge representation community <ref type="bibr" target="#b7">[8]</ref>.</p><p>Silk builds on the research results from within these communities. Silk can be used in scenarios where different types of links should be discovered between Web data sources which often make use of terms from different vocabularies.</p><p>Besides Silk, there are two related tools for generating RDF links: LinQuer <ref type="bibr" target="#b10">[11]</ref> is a tool for semantic link discovery over relational data, based on string and semantic matching techniques and their combinations. The LinQuer framework consists of LinQL, a declarative language that allows specification of linkage requirements in a wide variety of applications. The framework rewrites LinQL queries into standard SQL queries that can be run over relational data sources. LinQuer is meant to be used together with relational-database-to-RDF wrappers such as D2R Server<ref type="foot" target="#foot_11">15</ref> or Virtuoso RDF Views<ref type="foot" target="#foot_12">16</ref>.</p><p>Related work that also focuses on Linked Data includes Raimond et al. <ref type="bibr" target="#b12">[13]</ref>, who propose a link discovery algorithm that takes into account both the similarities of web resources and of their neighbors. 
The algorithm is implemented within the GNAT tool and has been evaluated for interlinking music-related data sets.</p><p>While LinQuer and GNAT only allow batch processing, Silk Server is the first identity resolution component that works in an on-demand fashion and can be used together with RDF data streams.</p><p>The EU-funded project OKKAM<ref type="foot" target="#foot_13">17</ref> offers an Entity Name System (ENS), which supports the storage and reuse of global entity identifiers. While OKKAM ENS contains several matching modules by default, it does not provide a flexible and comprehensive link specification language.</p><p>The RKBExplorer sameAs service<ref type="foot" target="#foot_14">18</ref> is targeted at providing a unified view over multiple data sources by managing owl:sameAs links to identify duplicate URIs. In contrast to Silk Server, the links are not generated based on user-defined link specifications, but must be provided to the system from external sources.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusion</head><p>Vint Cerf, one of the inventors of the Internet, said in his keynote speech at the 19th International World Wide Web Conference (WWW2010) that in the age of the Internet, where everything should be connected, he would also expect database management systems to automatically connect new records that are added to a database with all related entities that are already stored in the database. With Silk Server, we take a first step towards providing such functionality for the Linked Data context.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Schematic Architecture of Linked Data Applications</figDesc><graphic coords="3,134.77,230.39,345.83,266.39" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2</head><label>2</label><figDesc>Figure 2 illustrates the Silk Server workflow.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Silk Server Workflow</figDesc><graphic coords="6,134.77,116.83,345.83,106.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Example: Interlinking persons in FOAF profiles</figDesc></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/DataSets/LinkStatistics</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_1">http://www4.wiwiss.fu-berlin.de/bizer/r2r/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_2">http://www.w3.org/2005/Incubator/prov/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_3">http://scala-lang.org</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_4">http://jetty.codehaus.org</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_5">http://liftweb.net</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_6">http://www4.wiwiss.fu-berlin.de/bizer/silk/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_7">http://www4.wiwiss.fu-berlin.de/bizer/silk/linkspecs/persons_and_publications.xml</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_8">http://data.semanticweb.org/person/tom-heath</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="13" xml:id="foot_9">http://data.semanticweb.org/person/ralph-hodgson</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="14" xml:id="foot_10">http://www.deri.ie/publications/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="15" xml:id="foot_11">http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="16" xml:id="foot_12">http://virtuoso.openlinksw.com/whitepapers/relational%20rdf%20views%20mapping.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="17" xml:id="foot_13">http://www.okkam.org/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="18" xml:id="foot_14">http://www.rkbexplorer.com/sameAs/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Acknowledgments</head><p>This work was supported in part by Vulcan Inc. as part of its Project Halo (www.projecthalo.com) and by the EU FP7 project LOD2 - Creating Knowledge out of Interlinked Data (http://lod2.eu/, Ref. No. 257943).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Describing linked datasets</title>
		<author>
			<persName><forename type="first">K</forename><surname>Alexander</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Cyganiak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hausenblas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the 2nd Workshop on Linked Data on the Web (LDOW2009)</title>
				<meeting>2nd Workshop on Linked Data on the Web (LDOW2009)</meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Quality-driven information filtering using the wiqa policy framework</title>
		<author>
			<persName><forename type="first">Christian</forename><surname>Bizer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Richard</forename><surname>Cyganiak</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Web Semantics: Science, Services and Agents on the World Wide Web</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="10" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Linked data - the story so far</title>
		<author>
			<persName><forename type="first">Christian</forename><surname>Bizer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tom</forename><surname>Heath</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tim</forename><surname>Berners-Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Int. J. Semantic Web Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="1" to="22" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Data fusion</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bleiholder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Naumann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="41" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Named graphs</title>
		<author>
			<persName><forename type="first">J</forename><surname>Carroll</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bizer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hayes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Stickler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Web Semantics</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="247" to="267" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Semantic sitemaps: Efficient and flexible access to datasets on the semantic web</title>
		<author>
			<persName><forename type="first">R</forename><surname>Cyganiak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Delbru</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Stenzhorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Tummarello</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Decker</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th European Semantic Web Conference (ESWC2008)</title>
				<meeting>5th European Semantic Web Conference (ESWC2008)</meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Duplicate record detection: A survey</title>
		<author>
			<persName><forename type="first">Ahmed</forename><forename type="middle">K</forename><surname>Elmagarmid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Panagiotis</forename><forename type="middle">G</forename><surname>Ipeirotis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Vassilios</forename><forename type="middle">S</forename><surname>Verykios</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. on Knowl. and Data Eng</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="16" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Ontology matching</title>
		<author>
			<persName><forename type="first">Jérôme</forename><surname>Euzenat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pavel</forename><surname>Shvaiko</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2007">2007</date>
			<publisher>Springer-Verlag</publisher>
			<pubPlace>Heidelberg (DE)</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Architectural styles and the design of network-based software architectures</title>
		<author>
			<persName><forename type="first">Roy</forename><forename type="middle">T</forename><surname>Fielding</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2000">2000</date>
		</imprint>
		<respStmt>
			<orgName>University of California, Irvine</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD thesis</note>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">A database perspective on consuming linked data on the web</title>
		<author>
			<persName><forename type="first">Olaf</forename><surname>Hartig</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andreas</forename><surname>Langegger</surname></persName>
		</author>
		<imprint/>
	</monogr>
	<note>Datenbank Spektrum. to appear</note>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Linkage query writer</title>
		<author>
			<persName><forename type="first">Oktie</forename><surname>Hassanzadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Reynold</forename><surname>Xin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Renée</forename><forename type="middle">J</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Anastasios</forename><surname>Kementsietsidis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lipyeow</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Min</forename><surname>Wang</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Recipes for semantic web dog food - the eswc and iswc metadata projects</title>
		<author>
			<persName><forename type="first">Knud</forename><surname>Möller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tom</forename><surname>Heath</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Siegfried</forename><surname>Handschuh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">John</forename><surname>Domingue</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ISWC/ASWC</title>
				<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="802" to="815" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Automatic Interlinking of Music Datasets on the Semantic Web</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Raimond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Sutton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sandler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the 1st Linked Data on the Web Workshop</title>
				<meeting>1st Linked Data on the Web Workshop</meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">A graph analysis of the linked data cloud</title>
		<author>
			<persName><forename type="first">Marko</forename><forename type="middle">A</forename><surname>Rodriguez</surname></persName>
		</author>
		<idno>CoRR, abs/0903.0194</idno>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Discovering and maintaining links on the web of data</title>
		<author>
			<persName><forename type="first">Julius</forename><surname>Volz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christian</forename><surname>Bizer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Martin</forename><surname>Gaedke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Georgi</forename><surname>Kobilarov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Semantic Web Conference</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="650" to="665" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Overview of record linkage and current research directions</title>
		<author>
			<persName><forename type="first">William</forename><forename type="middle">E</forename><surname>Winkler</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
		</imprint>
		<respStmt>
			<orgName>Bureau of the Census</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical report</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
