<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>April</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Benchmarking the Performance of Linked Data Translation Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreas Schultz</string-name>
          <email>a.schultz@fu-berlin.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Carlos R. Rivero University of Sevilla</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Christian Bizer Freie Universität Berlin</institution>
          ,
          <addr-line>Germany berlin.de</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>David Ruiz University of Sevilla</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Freie Universität Berlin</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2012</year>
      </pub-date>
      <volume>16</volume>
      <issue>2012</issue>
      <abstract>
        <p>Linked Data sources on the Web use a wide range of different vocabularies to represent data describing the same type of entity. For some types of entities, like people or bibliographic records, common vocabularies have emerged that are used by multiple data sources. But even for representing data of these common types, different user communities use different competing common vocabularies. Linked Data applications that want to understand as much data from the Web as possible thus need to overcome vocabulary heterogeneity and translate the original data into a single target vocabulary. To support application developers with this integration task, several Linked Data translation systems have been developed. These systems provide languages to express declarative mappings that are used to translate heterogeneous Web data into a single target vocabulary. In this paper, we present a benchmark for comparing the expressivity as well as the runtime performance of data translation systems. Based on a set of examples from the LOD Cloud, we developed a catalog of fifteen data translation patterns and surveyed how often these patterns occur in the example set. Based on these statistics, we designed LODIB (the Linked Open Data Integration Benchmark), which aims to reflect the real-world heterogeneities that exist on the Web of Data. We apply the benchmark to test the performance of two data translation systems, Mosto and LDIF, and compare the performance of the systems with the SPARQL 1.1 CONSTRUCT query performance of the Jena TDB RDF store.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Categories and Subject Descriptors</title>
      <p>D.2.12 [Interoperability]: Data mapping;
H.2.5 [Heterogeneous Databases]: Data translation
Work partially done whilst visiting Freie Universität Berlin.</p>
    </sec>
    <sec id="sec-2">
      <title>1. INTRODUCTION</title>
      <p>
        The Web of Linked Data is growing rapidly and covers a wide
range of different domains, such as media, life sciences,
publications, governments, or geographic data [
        <xref ref-type="bibr" rid="ref13 ref4">4, 13</xref>
        ]. Linked
Data sources use vocabularies to publish their data, which
consist of more or less complex data models that are
represented using RDFS or OWL [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Some data sources try to
reuse as much from existing vocabularies as possible in
order to ease the integration of data from multiple sources [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
Other data sources use completely proprietary vocabularies
to represent their content or use a mixture of common and
proprietary terms [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Due to these facts, there exists heterogeneity amongst
vocabularies in the context of Linked Data. According to [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ],
on the one hand, 104 out of the 295 data sources in the
LOD Cloud only use proprietary vocabularies. On the other
hand, the rest of the sources (191) use common vocabularies
to represent some of their content, but also often extend and
mix common vocabularies with proprietary terms to
represent other parts of their content. Some examples of the use
of common vocabularies are the following: regarding
publications, 31.19% of the data sources use the Dublin Core
vocabulary, 4.75% use the Bibliographic Ontology, and 2.03% use the
Functional Requirements for Bibliographic Records; in the
context of people information, 27.46% of the data sources use the
Friend of a Friend vocabulary, 3.39% use the vCard ontology,
and 3.39% use the Semantically-Interlinked Online
Communities ontology; finally, regarding geographic data sets, 8.47%
of the data sources use the Geo Positioning vocabulary, and 2.03%
use the GeoNames ontology.
      </p>
      <p>
        To solve these heterogeneity problems, mappings are used
to perform data translation, i.e., exchanging data from the
source data set to the target data set [
        <xref ref-type="bibr" rid="ref19 ref21">19, 21</xref>
        ]. Data
translation, a.k.a. data exchange, is a major research topic in
the database community, and it has been studied for
relational, nested relational, and XML data models [
        <xref ref-type="bibr" rid="ref10 ref11 ref3">3, 10,
11</xref>
        ]. Current approaches to perform data translation rely on
two types of mappings that are specified at different levels,
namely: correspondences (modelling level) and executable
mappings (implementation level). Correspondences are
represented as declarative mappings that are then combined
into executable mappings, which consist of queries that are
executed over a source and translate the data into a
target [
        <xref ref-type="bibr" rid="ref18 ref19 ref7">7, 18, 19</xref>
        ].
      </p>
      <p>
        In the context of executable mappings, there exists a
number of approaches to define and also automatically generate
them. Qin et al. [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] devised a semi-automatic approach
to generate executable mappings that relies on data-mining;
Euzenat et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and Polleres et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] presented
preliminary ideas on the use of executable mappings in SPARQL
to perform data translation; Parreiras et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] presented
a Model-Driven Engineering approach that automatically
transforms handcrafted mappings in MBOTL (a mapping
language by means of which users can express executable
mappings) into executable mappings in SPARQL or Java;
Bizer and Schultz [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] proposed a SPARQL-like mapping
language called R2R, which is designed to publish expressive,
named executable mappings on the Web, and to flexibly
combine partial executable mappings to perform data
translation. Finally, Rivero et al. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] devised an approach called
Mosto to automatically generate executable mappings in
SPARQL based on constraints of the source and target data
models, and also correspondences between these data
models. In addition, translating amongst vocabularies by means
of mappings is one of the main research challenges in the
context of Linked Data, and it is expected that research
efforts on mapping approaches will be increased in the next
years [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. As a conclusion, a benchmark to test data
translation systems in this context seems highly relevant.
To the best of our knowledge, there exist two benchmarks
to test data translation systems: STBenchmark and
DTSBench. STBenchmark [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] provides eleven patterns that
occur frequently when integrating nested relational models,
which makes it difficult to extrapolate at least some of the
patterns to our context due to a number of inherent
differences between nested relational models and the
graph-based RDF data model that is used in the context of Linked
Data [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. DTSBench [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] allows testing data translation
systems in the context of Linked Data using synthetic data
translation tasks only, without taking real-world data from
Linked Data sources into account.
      </p>
      <p>
        In this paper, we present a benchmark to test data
translation systems in the context of Linked Data. Our
benchmark provides a catalogue of fifteen data translation
patterns, each of which is a common data translation problem in
the context of Linked Data. To motivate that these patterns
are common in practice, we have analyzed 84 random
examples of data translation in the Linked Open Data Cloud.
After this analysis, we have studied the distribution of the
patterns in these examples, and have designed LODIB, the
Linked Open Data Integration Benchmark, to reflect this
real-world heterogeneity that exists on the Web of Data.
The benchmark provides a data generator that produces
three different synthetic data sets, which reflect the pattern
distribution. These source data sets need to be translated
into a single target vocabulary by the system under test.
This generator allows us to scale source data and it also
automatically generates the expected target data, i.e., after
performing data translation over the source data. The data
sets reflect the same e-commerce scenario that we already
used for the BSBM benchmark [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
LODIB is designed to measure the following: 1)
Expressivity: the number of mapping patterns that can be expressed
in a specific data translation system; 2) Time performance:
the time needed to perform the data translation, i.e.,
loading the source file, executing the mappings, and serializing
the result into a target file. In this context, LODIB provides
a validation tool that examines whether the source data is
represented correctly in the target data set: the expected target
data for a particular scenario are produced by LODIB itself, and
they are compared with the target data obtained when
performing data translation using a particular system.
This paper is organized as follows: Section 2 presents the
mapping patterns of our benchmark; in Section 3, we
describe the 84 data translation examples from the LOD Cloud
that we have analyzed, and the counting of the occurrences
of mapping patterns in the examples; Section 4 deals with
the design of our benchmark; Section 5 describes the
evaluation of our benchmark with two data translation systems
(Mosto and LDIF), and compares their performance with
the SPARQL 1.1 performance of the Jena TDB RDF store;
Section 6 describes the related work on benchmarking in the
Linked Data context; and, finally, Section 7 recaps on our
main conclusions regarding LODIB.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2. MAPPING PATTERNS</title>
      <p>A mapping pattern represents a common data translation
problem that should be supported by any data translation
system in the context of Linked Data. Our benchmark
provides a catalogue of fifteen mapping patterns that we have
repeatedly discovered as we analyzed the heterogeneity
between different data sources in the Linked Open Data Cloud.
In the rest of this section, we present these patterns in
detail. Note that for vocabulary terms in concrete examples
we use the prefixes shown in Table 1.</p>
      <p>Rename Class (RC). Every source instance of a class C
is reclassified into the same instance of the renamed
class C’ in the target. An example of this pattern is
the renaming of class fb:location.citytown in Freebase
into class dbp:City in DBpedia.</p>
      <p>Rename Property (RP). Every source instance of a
property P is transformed into the same instance
of the renamed property P’ in the target. An example
is the renaming of property dbp:elevation in
DBpedia into property lgdo:ele in LinkedGeoData, in which
both properties represent the elevation of a geographic
location.</p>
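      <p>The two renaming patterns above can be sketched in a few lines of Python. This is a minimal illustration only: triples are modelled as plain (subject, predicate, object) tuples, and the instance names ex:Berlin and ex:Zugspitze as well as the elevation value are hypothetical and not taken from the analyzed data sets.</p>
      <preformat><![CDATA[
```python
# Triples are plain (subject, predicate, object) tuples. This is an
# illustrative simplification, not the implementation of any benchmarked system.
RDF_TYPE = "rdf:type"


def rename_class(triples, source_class, target_class):
    """RC: emit (s, rdf:type, C') for every source triple (s, rdf:type, C)."""
    return [(s, RDF_TYPE, target_class)
            for s, p, o in triples
            if p == RDF_TYPE and o == source_class]


def rename_property(triples, source_prop, target_prop):
    """RP: emit (s, P', o) for every source triple (s, P, o)."""
    return [(s, target_prop, o)
            for s, p, o in triples
            if p == source_prop]


# Hypothetical source data for illustration.
source = [
    ("ex:Berlin", RDF_TYPE, "fb:location.citytown"),
    ("ex:Zugspitze", "dbp:elevation", "2962"),
]
renamed_classes = rename_class(source, "fb:location.citytown", "dbp:City")
renamed_props = rename_property(source, "dbp:elevation", "lgdo:ele")
```
]]></preformat>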
      <p>Rename Class based on Property (RCP). This
pattern is similar to the Rename Class pattern but it is
based on the existence of a property. Every source
instance of a class C is reclassified into the same
instance of the renamed class C’ in the target, if
and only if, the source instance is related with
another instance of a property P. An example is the
renaming of class dbp:Person in DBpedia into class
fb:people.deceased_person in Freebase, if and only if,
an instance of dbp:Person is related with an instance
of property dbp:deathDate, i.e., if a deceased person in
Freebase exists, there must exist a person with a date
of death in DBpedia.</p>
      <p>Rename Class based on Value (RCV). This pattern
is similar to the previous pattern, but the
property instance must have a specific value v to
rename the source instance. An example is the
renaming of class gw:Person in GovWILD into class
fb:government.politician in Freebase, if and only if,
each instance of gw:Person is related with an instance
of property gw:profession and its value is the literal
“politician”. This means that only people whose
profession is politician in GovWILD are translated into
politicians in Freebase.</p>
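      <p>The two conditional renaming patterns (RCP and RCV) differ only in whether the object of the guarding property is checked. The following sketch, under the same tuple-based simplification, handles both with one function; the function name and the instances ex:alice, ex:bob, ex:carol and ex:dave are hypothetical.</p>
      <preformat><![CDATA[
```python
RDF_TYPE = "rdf:type"


def rename_class_if(triples, source_class, target_class, prop, value=None):
    """RCP when value is None, RCV otherwise.

    An instance of source_class is reclassified only if it is the subject of
    at least one triple with predicate prop (and, for RCV, object == value).
    """
    qualifying = {s for s, p, o in triples
                  if p == prop and (value is None or o == value)}
    return [(s, RDF_TYPE, target_class)
            for s, p, o in triples
            if p == RDF_TYPE and o == source_class and s in qualifying]


# Hypothetical source data for illustration.
source = [
    ("ex:alice", RDF_TYPE, "dbp:Person"),
    ("ex:alice", "dbp:deathDate", "1990-01-01"),
    ("ex:bob", RDF_TYPE, "dbp:Person"),
    ("ex:carol", RDF_TYPE, "gw:Person"),
    ("ex:carol", "gw:profession", "politician"),
    ("ex:dave", RDF_TYPE, "gw:Person"),
    ("ex:dave", "gw:profession", "lawyer"),
]
deceased = rename_class_if(
    source, "dbp:Person", "fb:people.deceased_person", "dbp:deathDate")
politicians = rename_class_if(
    source, "gw:Person", "fb:government.politician",
    "gw:profession", "politician")
```
]]></preformat>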
      <p>Reverse Property (RvP). This pattern is similar to the
Rename Property pattern, but the property instance
in the target is reversed, i.e., the subject is
interchanged with the object. An example is the reverse of
property fb:airports_operated in Freebase into property
dbp:operator in DBpedia, in which the former relates
an operator with an airport, and the latter relates an
airport with an operator.</p>
      <p>Resourcesify (Rsc). Every source instance of a property
P is split into a target instance of property P’ and
an instance of property Q. Both instances are
connected using a fresh resource, which establishes the
original connection of the instance of property P. Note
that the new target resource must be unique and
consistent with the definition of the target vocabulary.
An example is the creation of a new URI or blank
node when translating property dbp:runtime in
DBpedia into po:duration in BBC by creating a new instance
of property po:version.</p>
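      <p>Reverse Property and Resourcesify both change the shape of the graph rather than only the vocabulary terms. The sketch below illustrates them on tuples. Deriving the fresh blank-node label from a hash of the source triple is an assumption made here so that the output is deterministic; the pattern itself only requires the new resource to be unique. The instance names are hypothetical.</p>
      <preformat><![CDATA[
```python
import hashlib


def reverse_property(triples, source_prop, target_prop):
    """RvP: emit (o, P', s) for every source triple (s, P, o)."""
    return [(o, target_prop, s) for s, p, o in triples if p == source_prop]


def resourcesify(triples, source_prop, link_prop, value_prop):
    """Rsc: replace (s, P, o) by (s, Q, r) and (r, P', o) with a fresh r.

    The fresh resource r is a blank-node label derived from a hash of the
    source triple (an assumption of this sketch), so repeated runs agree.
    """
    out = []
    for s, p, o in triples:
        if p != source_prop:
            continue
        digest = hashlib.sha1(f"{s}|{p}|{o}".encode("utf-8")).hexdigest()[:12]
        fresh = f"_:b{digest}"
        out.append((s, link_prop, fresh))
        out.append((fresh, value_prop, o))
    return out


# Hypothetical source data for illustration.
source = [
    ("ex:operator", "fb:airports_operated", "ex:airport"),
    ("ex:film", "dbp:runtime", "5400"),
]
reversed_triples = reverse_property(
    source, "fb:airports_operated", "dbp:operator")
split_triples = resourcesify(
    source, "dbp:runtime", "po:version", "po:duration")
```
]]></preformat>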
      <p>Deresourcesify (DRsc). Every source instance of a
property P is renamed into a target instance of property P’,
if and only if, P is related to another source instance
of a property Q, that is, both instances use the same
resource. In this case, the source needs more instances
than the target to represent the same information. An
example of this pattern is that an airport in
DBpedia is related with its city served by property dbp:city,
and the name of this city is given as value of rdfs:label.
This is transformed into property lgdp:city_served in
LinkedGeoData, which relates an airport with its city
served (as literal).</p>
      <p>1:1 Value to Value (1:1). The value of every source
instance of a property P must be transformed by means
of a function into the value of a target instance of
property P’. An example is dbp:runtime in DBpedia, which
is transformed into movie:runtime in LinkedMDB, in
which the source is expressed in seconds and the target
in minutes.</p>
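      <p>Deresourcesify amounts to a join over the intermediate resource, while 1:1 Value to Value applies a function to each object value. A minimal sketch under the same tuple-based simplification follows; the instance names and the city label are hypothetical, and runtimes are modelled as Python integers for brevity.</p>
      <preformat><![CDATA[
```python
def deresourcesify(triples, link_prop, value_prop, target_prop):
    """DRsc: join (s, P, r) with (r, Q, v) and emit (s, P', v)."""
    values = {}
    for s, p, o in triples:
        if p == value_prop:
            values.setdefault(s, []).append(o)
    return [(s, target_prop, v)
            for s, p, r in triples if p == link_prop
            for v in values.get(r, [])]


def map_value(triples, source_prop, target_prop, fn):
    """1:1: emit (s, P', fn(o)) for every source triple (s, P, o)."""
    return [(s, target_prop, fn(o)) for s, p, o in triples if p == source_prop]


# Hypothetical source data for illustration.
source = [
    ("ex:airport", "dbp:city", "ex:city"),
    ("ex:city", "rdfs:label", "Sevilla"),
    ("ex:film", "dbp:runtime", 5400),  # seconds
]
served = deresourcesify(source, "dbp:city", "rdfs:label", "lgdp:city_served")
minutes = map_value(
    source, "dbp:runtime", "movie:runtime", lambda seconds: seconds // 60)
```
]]></preformat>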
      <p>Value to URI (VtU). Every source instance of a
property P is translated into a target instance of
property P’ and the source object value is transformed
into a URI in the target. An example of this pattern
is property grs:point in DBpedia, which is translated
into property fb:location.location.geolocation in
Freebase, and the value of every instance of grs:point is
transformed into a URI.</p>
      <p>URI to Value (UtV). This pattern is similar to the
previous one but the source instance relates to a URI
that is transformed into a literal value in the target.
An example of the URI to Value pattern is property
dbp:wikiPageExternalLink in DBpedia that is
translated into property fb:common.topic.official_website in
Freebase, and the URI of the source instance is
translated to a literal value in the target.</p>
      <p>Change Datatype (CD). Every source instance of a
datatype property P whose type is TYPE is renamed
into the same target instance of property P’ whose
type is TYPE’. An example of this pattern is property
fb:people.person.date_of_birth in Freebase whose type
is xsd:dateTime, which is translated into target
property dbp:birthDate in DBpedia whose type is xsd:date.</p>
      <p>Add Language Tag (ALT). In this pattern, every source
instance of a property P is translated into a target
instance of property P’ and a new language tag TAG is
added to the target literal. An example of this pattern
is that db:genericName in Drug Bank is renamed into
property rdfs:label in DBpedia and a new language tag
“@en” is added.</p>
      <p>Remove Language Tag (RLT). Every source instance
of a property P is translated into a target instance of
property P’ and the source instance has a language tag
TAG that is removed. An example is skos:altLabel in
DataGov Statistics, which has a language tag “@en”,
is translated into skos:altLabel in Ordnance Survey
and the language tag is removed.</p>
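      <p>The five preceding patterns (VtU, UtV, CD, ALT and RLT) all rename a property and rewrite only the object term. The sketch below makes this explicit with one generic function and one converter per pattern. The URI and Literal classes, the base URI used to mint new URIs, and all sample values are assumptions introduced for illustration.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
from urllib.parse import quote


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


def translate_objects(triples, source_prop, target_prop, convert):
    """Rename P to P' and rewrite each object term with convert."""
    return [(s, target_prop, convert(o))
            for s, p, o in triples if p == source_prop]


def value_to_uri(base):
    """VtU: mint a URI from the literal (base is a hypothetical scheme)."""
    return lambda lit: URI(base + quote(lit.lexical))


def uri_to_value(uri):
    """UtV: keep the URI string as a plain literal."""
    return Literal(uri.value)


def date_time_to_date(lit):
    """CD: xsd:dateTime to xsd:date."""
    day = datetime.fromisoformat(lit.lexical).date().isoformat()
    return Literal(day, datatype="xsd:date")


def add_language_tag(tag):
    """ALT: attach a language tag to the literal."""
    return lambda lit: Literal(lit.lexical, lang=tag)


def remove_language_tag(lit):
    """RLT: drop the language tag."""
    return Literal(lit.lexical)


# Hypothetical source data for illustration.
source = [
    ("ex:city", "grs:point", Literal("52.52 13.40")),
    ("ex:city", "dbp:wikiPageExternalLink", URI("http://example.org/home")),
    ("ex:person", "fb:people.person.date_of_birth",
     Literal("1926-10-18T00:00:00", datatype="xsd:dateTime")),
    ("ex:drug", "db:genericName", Literal("Aspirin")),
    ("ex:area", "skos:altLabel", Literal("Hampshire", lang="en")),
]
geo = translate_objects(source, "grs:point",
                        "fb:location.location.geolocation",
                        value_to_uri("http://example.org/geo/"))
site = translate_objects(source, "dbp:wikiPageExternalLink",
                         "fb:common.topic.official_website", uri_to_value)
born = translate_objects(source, "fb:people.person.date_of_birth",
                         "dbp:birthDate", date_time_to_date)
label = translate_objects(source, "db:genericName", "rdfs:label",
                          add_language_tag("en"))
alt = translate_objects(source, "skos:altLabel", "skos:altLabel",
                        remove_language_tag)
```
]]></preformat>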
      <p>N:1 Value to Value (N:1). A number of source instances
of properties P1, P2, . . . , Pn are translated into a
single target instance of property P’, and the value of the
target instance is computed by means of a function
over the values of the source instances. An example
of this pattern is that we concatenate the values of
properties foaf:givenName and foaf:surname in
DBpedia into property fb:type.object.name in Freebase.</p>
      <p>Aggregate (Agg). In this pattern, we count the number of
source instances of property P, which is translated into
a target instance of property Q. An example is
property fb:metropolitan_transit.transit_system.transit_lines
in Freebase whose values are aggregated into a single
value of dbp:numberOfLines for each city in DBpedia.</p>
      <p>Finally, we present a summary of these mapping patterns
in Table 2. The first column of this table stands for the
code of each pattern; the second and third columns establish
the triples to be retrieved in the source and the triples to
be constructed in the target using a SPARQL-like notation.
Note that properties are represented as P and Q, classes as
C, constant values as v, language tags as TAG, and data
types as TYPE.</p>
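      <p>The last two patterns of the catalogue, N:1 Value to Value and Aggregate, combine several source triples into one target triple. A minimal tuple-based sketch follows. The instance names are hypothetical, and the sketch assumes at most one value per property and subject for N:1 (a later value overwrites an earlier one).</p>
      <preformat><![CDATA[
```python
from collections import Counter, defaultdict


def n_to_one(triples, source_props, target_prop, fn):
    """N:1: combine the values of P1..Pn of each subject with fn.

    Subjects that lack any of the source properties are skipped.
    """
    by_subject = defaultdict(dict)
    for s, p, o in triples:
        if p in source_props:
            by_subject[s][p] = o
    return [(s, target_prop, fn(*(vals[p] for p in source_props)))
            for s, vals in by_subject.items()
            if all(p in vals for p in source_props)]


def aggregate_count(triples, source_prop, target_prop):
    """Agg: emit (s, Q, n), where n counts the triples (s, P, _)."""
    counts = Counter(s for s, p, o in triples if p == source_prop)
    return [(s, target_prop, n) for s, n in counts.items()]


TRANSIT_LINES = "fb:metropolitan_transit.transit_system.transit_lines"

# Hypothetical source data for illustration.
source = [
    ("ex:person", "foaf:givenName", "Ada"),
    ("ex:person", "foaf:surname", "Lovelace"),
    ("ex:city", TRANSIT_LINES, "ex:line1"),
    ("ex:city", TRANSIT_LINES, "ex:line2"),
]
names = n_to_one(source, ["foaf:givenName", "foaf:surname"],
                 "fb:type.object.name",
                 lambda given, family: f"{given} {family}")
line_counts = aggregate_count(source, TRANSIT_LINES, "dbp:numberOfLines")
```
]]></preformat>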
    </sec>
    <sec id="sec-4">
      <title>3. LODIB GROUNDING</title>
      <p>In order to base the LODIB Benchmark on realistic
real-world distributions of these mapping patterns, we analyzed
84 data translation examples from the LOD Cloud and
counted the occurrences of mapping patterns in these
examples. First, we selected different Linked Data sources by
exploring the LOD data set catalog maintained on CKAN
(http://thedatahub.org/group/lodcloud).
The criterion we followed was to choose sources that comprise
a great number of owl:sameAs links with other Linked Data
sources, i.e., more than 25,000. Furthermore, we tried to
select sources from the major domains represented in the LOD
Cloud. Therefore, the selected Linked Data sources are the
following: ACM (RKB Explorer), DBLP (RKB Explorer),
Dailymed, Drug Bank, DataGov Statistics, Ordnance
Survey, DBpedia, GeoNames, LinkedGeoData, LinkedMDB,
New York Times, Music Brainz, Sider, GovWILD,
ProductDB, and OpenLibrary. Note that, for each domain of
the LOD Cloud, there are at least two Linked Data sources
that contribute to our statistics, except for the domain of
user-generated content.</p>
      <p>
        After selecting these sources, we randomly selected 42
examples, each of which comprises a pair of instances that are
connected by an owl:sameAs link. For each of these examples,
we analyzed both directions: one instance is the source and
the other instance is the target, and vice versa. Therefore,
the total number of examples we analyzed was 84. Then,
we manually counted the number of mapping patterns that
are needed to translate between the representations of the
instances (neighboring instances were also considered to
detect more complex structural mismatches). These statistics
are publicly-available at [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
      </p>
      <p>In the next step, we computed the averages of our mapping
patterns grouped by the pair of source and target data set.
To compute them, in some cases, we analyzed the translation
of one single instance since the data set of the Linked Data
source comprises only a couple of classes, such as Drug Bank
or Ordnance Survey. In other cases, we analyzed more than
one instance since the data set comprises a large number of
classes, such as DBpedia or Freebase.</p>
      <p>Table 3 presents the statistics of the mapping patterns that
we have found in the LOD Cloud. The first two columns
stand for the source and target Linked Data data sets; the
following columns contain the averages of each mapping
pattern according to the source and the target, i.e., we count the
occurrences of mapping patterns in a number of examples
and compute the average. Note that, for certain data sets,
we analyzed several examples of the same type; therefore,
the final numbers of these columns are real numbers (not
integers). Finally, the last column contains the total number
of instances that we analyzed for each pair of Linked Data
data sets.</p>
      <p>On the one hand, Rename Class and Rename Property
mapping patterns appear in the vast majority of the analyzed
examples, since these patterns are very common in practice.
On the other hand, there are some patterns that are not so
common, e.g., Value to URI and URI to Value patterns
appear only once in all analyzed examples (between DBpedia
and Drug Bank). Table 4 presents the average occurrences of
the LODIB mapping patterns over all analyzed examples.</p>
    </sec>
    <sec id="sec-5">
      <title>4. LODIB DESIGN</title>
      <p>Based on the previously described statistics, we have
designed the LODIB Benchmark. The benchmark consists of
three different source data sets that need to be translated
by the system under test into a single target vocabulary.</p>
      <p>
        The data sets reflect the same e-commerce scenario
that we already used for the BSBM Benchmark [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The
data sets describe products, reviews, people, and some more
lightweight classes, such as product price, using different
source vocabularies. To translate the representation
of an instance in the source data sets to the target
vocabulary, data translation systems need to apply several of the
presented mapping patterns. The descriptions of these data
sets are publicly-available at the LODIB homepage [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
      </p>
      <p>These data sets take the previously computed averages of
Table 4 into account: we multiplied each average by a constant
(11) and divided the result by another constant (3, the total
number of data translation tasks, i.e., from each source data
set to the target data set). As a result, each of the three data
translation tasks comprises a number of mapping patterns,
which we present in Table 5; the total
number of mapping patterns for each task is 18.
</p>
      <p>[Table 3: Mapping patterns in Linked Data sources]</p>
      <p>
We have implemented a data generator to populate and scale
the three source data sets that we have specified in the
previous section, which is publicly-available at the LODIB
homepage [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. In the data generator, we defined a number of data
generation rules, and the generated data are scaled based on
the number of product instances that each data set contains.
      </p>
      <p>
        In our implementation, we use an extension of the language
used in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which allows defining particular value
generation rules for basic types, such as xsd:string or xsd:date. In
addition, missing properties often occur in the context of
the Web of Data; therefore, we also provide 44 statistical
distributions in our implementation to randomly
distribute properties, including Uniform, Normal, Exponential,
Zipf, Pareto and empirical distributions, to mention a few.
      </p>
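      <p>To illustrate the idea of scaling by the number of product instances while randomly omitting properties, the following sketch generates a toy source data set. It is not the LODIB data generator: the function name, the property src1:price, the value ranges, and the use of a single uniform presence probability are assumptions made for this example only.</p>
      <preformat><![CDATA[
```python
import random


def generate_products(count, price_probability, seed=42):
    """Generate toy source triples for `count` product instances.

    Every product gets a type and a name; the (hypothetical) property
    src1:price is emitted only with probability `price_probability`,
    which mimics missing properties on the Web of Data.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    triples = []
    for i in range(count):
        subject = f"src1-data:Product-{i}"
        triples.append((subject, "rdf:type", "src1:Product"))
        triples.append((subject, "src1:name", f"Product {i}"))
        if rng.random() < price_probability:
            price = f"{rng.uniform(1.0, 1000.0):.2f}"
            triples.append((subject, "src1:price", price))
    return triples


small = generate_products(10, 0.5)
large = generate_products(1000, 0.5)
```
]]></preformat>
      <p>Because the generator is seeded, the same parameters always yield the same triples, which is what makes it possible to produce the expected target data alongside the source data.</p>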
      <p>In this section, we provide examples on how a data
translation system needs to translate the data from the source to
the target vocabulary regarding the mapping patterns in the
three data translation tasks. Specifically, Figure 1 presents
a number of source triples that are translated into a number
of target triples. Note that we use prefixes src1:, src2:, src3:
and tgt: for referring to the source data sets and the single
target vocabulary of the data sets; and src1-data:,
src2-data:, src3-data: and tgt-data: for referring to the source and
target data. These examples are the following:</p>
      <p>Rename Class (RC). Class src1:Product needs to be
renamed into class tgt:Product, e.g., the
src1-data:Canon-Ixus-20010 product instance.</p>
      <p>Rename Property (RP). Property src1:name needs to
be renamed into property rdfs:label, e.g., the name of
src1-data:Canon-Ixus-20010 product instance.</p>
      <p>Rename Class based on Property (RCP). In this case,
class src1:Person needs to be renamed into class
tgt:Reviewer, if and only if, property src1:author
exists, e.g., src1-data:Smith-W person instance.</p>
      <p>Rename Class based on Value (RCV). In this
example, class src2:Product needs to be renamed into
class tgt:OutdatedProduct, if and only if, property
src2:outdated exists and has value “Yes”, e.g.,
src2-data:HTC-Wildfire-S product instance.</p>
      <p>Reverse Property (RvP). In this example, property
src1:author is reversed into property tgt:author, e.g.,
src1-data:Review-CI-001 review instance and
src1-data:Smith-W person instance are related and reversed
in the target.</p>
      <p>Resourcesify (Rsc). Property src1:birthDate needs to be
renamed into property tgt:birthDate and a new target
instance of property tgt:birth is needed, e.g., the date
of birth of src1-data:Smith-W person instance.</p>
      <p>Deresourcesify (DRsc). Property src2:revText needs to
be renamed into property tgt:text, if and only if, the
instance of property src2:revText is related to another
source instance of property src2:hasText, e.g., the text
of src2-data:Review-HTC-W-S review instance.</p>
      <p>1:1 Value to Value (1:1). Property src2:price needs to
be renamed into property tgt:productPrice, and the
value must be transformed by means of function
usDollarsToEuros, since the source price is represented
in US dollars and the target in Euros, e.g., the price
of src2-data:HTC-Wildfire-S product instance.</p>
      <p>Value to URI (VtU). In this example, we need to
rename property src1:personHomepage into property
tgt:personHomepage, and the values of the source
instances are transformed into URIs in the target, e.g.,
the homepage of src1-data:Smith-W person instance.</p>
      <p>URI to Value (UtV). In this example, we need to
rename property src2:productHomepage into property
tgt:productHomepage, and the URIs of the source
instances are transformed into values in the target, e.g.,
the homepage of src2-data:HTC-Wildfire-S product
instance.</p>
      <p>Change Datatype (CD). Property dc:date in the first
source needs to be translated into dc:date, and its
type is transformed from xsd:string into xsd:date, e.g.,
the date of src1-data:Review-CI-001 review instance.</p>
      <p>Add Language Tag (ALT). Property src2:mini-cv needs
to be renamed into property tgt:bio and a new language
tag “@en” is added in the target, e.g., the CV of
src2-data:Doe-J person instance.</p>
      <p>Remove Language Tag (RLT). Property src1:revText
needs to be renamed into property tgt:text and the
language tag of the source is removed, e.g., the text of
src1-data:Review-CI-001 review instance.</p>
      <p>N:1 Value to Value (N:1). Properties foaf:firstName and
foaf:surname in the second source need to be
translated into property tgt:name, and their values are
concatenated to compose the target value, e.g., the
first name and surname of src2-data:Doe-J person
instance.</p>
      <p>Aggregate (Agg). We count the number of instances
of source property src3:hasReview, and this
number needs to be translated as the value of property
tgt:totalReviews, e.g., the reviews of src3-data:VPCS
product instance.</p>
    </sec>
    <sec id="sec-6">
      <title>5. EXPERIMENTS</title>
      <p>
        The LODIB benchmark can be used to measure two
performance dimensions of a data translation system. First,
we report the expressivity of the data translation
system, that is, the number of mapping patterns that can be
expressed in each system. Second, we measure the
performance by taking the time to translate all source data sets to
the target representation. For our benchmark experiment,
we generated data sets in N-Triples format containing 25, 50,
75 and 100 million triples. For each data translation system
and data set, the time is measured starting with reading the
input data set file and ending when the output data set has
been completely serialized to one or more N-Triples files.
We have applied the benchmark to test the performance of
two data translation systems:
      </p>
      <p>
        Mosto. It is a tool to automatically generate executable
mappings amongst semantic-web ontologies [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. It is
based on an algorithm that relies on constraints such
as rdfs:domain of the source and target ontologies
to be integrated, and a number of 1-to-1
correspondences between TBox ontology entities [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The Mosto
tool also allows running these automatically generated
executable mappings using several semantic-web
technologies, such as Jena TDB, Jena SDB, or Oracle 11g.
For our tests, we configured Mosto to generate (Jena-specific)
SPARQL Construct queries. The data sets
were translated using these generated queries and Jena
TDB (version 0.8.10).
      </p>
      <p>
        LDIF. It is an ETL-like component for integrating data
from Linked Open Data sources [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. LDIF’s
integration pipeline includes one module for vocabulary
mapping, which executes mappings expressed in the
R2R [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] mapping language. All the R2R mappings
were written by hand. LDIF supports different runtime
profiles suited to different workloads. For
the smaller data sets we used the in-memory profile,
in which all the data is stored in memory. For the
100M data set we used the Hadoop version, which
was run in single-node (pseudo-distributed) mode on
the benchmarking machine, because the in-memory version
was not able to process a data set of this size.
      </p>
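      <p>The timing rule described above (the clock starts when the input file is read and stops once the output has been fully serialized) can be illustrated with a minimal Python sketch. It is not the harness used in the experiments: the names rename_property and timed_translation, the example URIs, and the toy rename-property mapping are assumptions introduced purely for illustration.</p>
      <preformat><![CDATA[
```python
import os
import tempfile
import time


def rename_property(line, mapping):
    """Toy translation: swap the predicate of one N-Triples line if it is mapped."""
    parts = line.split(" ", 2)  # subject, predicate, remainder (object and final dot)
    if len(parts) == 3 and parts[1] in mapping:
        parts[1] = mapping[parts[1]]
    return " ".join(parts)


def timed_translation(input_path, output_path, mapping):
    """Translate a file and return the elapsed wall-clock seconds."""
    start = time.perf_counter()  # clock starts before the input is opened
    with open(input_path, encoding="utf-8") as src, \
            open(output_path, "w", encoding="utf-8") as out:
        for line in src:
            out.write(rename_property(line, mapping))
    # Leaving the with-block closes the output file, so the data has been
    # handed to the operating system before the clock stops.
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical one-triple data set and mapping, for demonstration only.
    mapping = {"<http://example.org/src#name>": "<http://example.org/tgt#label>"}
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "source.nt")
        target = os.path.join(workdir, "target.nt")
        with open(source, "w", encoding="utf-8") as handle:
            handle.write('<http://example.org/p1> <http://example.org/src#name> "Widget" .\n')
        seconds = timed_translation(source, target, mapping)
        print(f"translated in {seconds:.6f} s")
```
]]></preformat>
      <p>Keeping both the read and the write inside the timed region mirrors the end-to-end definition of runtime given above; a real system would replace the toy function with its own mapping engine.</p>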
      <p>
        To allow other researchers to reproduce our results, the
configuration and all mappings used for Mosto and LDIF are
publicly available at the LODIB homepage [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. To put the
results of these two systems into the context of more
popular tools in the Linked Data space, we compared the
performance of both systems with the SPARQL 1.1 performance
of the Jena TDB RDF store (version 0.8.10). All the
mappings for Jena TDB were expressed as SPARQL 1.1
Construct queries, which we wrote by hand.
For loading the source data sets we used the more efficient
tdbloader2, which also generates data set statistics that are
used by the TDB optimizer.
      </p>
      <p>Table 6 gives an overview of the expressivity of the data
translation systems. All mapping patterns are expressible
in SPARQL 1.1, so all the mappings were actually executed on
Jena TDB. The current implementation of the Mosto tool
generates Jena-specific SPARQL Construct queries, which
could, in general, cover all the mapping patterns. However,
the goal of Mosto is to generate SPARQL
Construct queries automatically from constraints and
correspondences, without user intervention. A checkmark for
Mosto in Table 6 therefore means that the tool was able to automatically
generate executable mappings from the source and target
data sets and a number of correspondences amongst them.
Note that Mosto cannot deal with the RCP and RCV
mapping patterns since it does not allow the renaming of
classes based on conditional properties and/or values.
Furthermore, it does not support the Agg mapping pattern since
it cannot aggregate or count properties. In R2R it
is not possible to express aggregates, so no
aggregation mapping was executed on LDIF. To check whether
the source data has been correctly and fully translated, we
developed a validation tool that examines whether the source data
is represented correctly in the target data set. Using the
validation tool, we verified that all three systems produce
proper results.</p>
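      <p>The expressivity figures follow directly from the unsupported patterns named above. The short Python sketch below recomputes them; the dictionary layout and the function name expressivity are illustrative assumptions, while the catalogue size of fifteen and the unsupported patterns are taken from the text.</p>
      <preformat><![CDATA[
```python
TOTAL_PATTERNS = 15  # size of the LODIB mapping pattern catalogue

# Patterns each system could not express, as stated in the text.
UNSUPPORTED = {
    "Mosto": {"RCP", "RCV", "AGG"},
    "LDIF/R2R": {"AGG"},
    "SPARQL 1.1 / Jena TDB": set(),
}


def expressivity(unsupported, total=TOTAL_PATTERNS):
    """Return the number of expressible patterns per system."""
    return {system: total - len(missing) for system, missing in unsupported.items()}


if __name__ == "__main__":
    for system, count in expressivity(UNSUPPORTED).items():
        print(f"{system}: {count} of {TOTAL_PATTERNS} patterns")
```
]]></preformat>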
      <p>To compare the performance and the scaling behaviour of
the systems, we ran the benchmark on an Intel i7 950
(4 cores, 3.07 GHz, 1 x SATA HDD) machine with 24 GB of
RAM running Ubuntu 10.04.</p>
      <p>Table 7 summarizes the overall runtimes for each mapping
system and use case. Since Mosto and R2R were not able
to express all mapping patterns, we created three groups:
1) one that did not execute the RCV, RCP and AGG
mappings, 2) one without the AGG mapping and 3) one
executing the full set of mappings. The results show that Mosto
and Jena TDB have, as expected, similar runtime
performance because Mosto internally uses Jena TDB. LDIF,
on the other hand, is about twice as fast as Jena TDB and Mosto
on the smallest data set and about three times as fast on the
largest data set. One reason for the
differences could be that LDIF highly parallelizes its
workload, both in the in-memory and in the Hadoop version.</p>
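      <p>The three mapping groups can be read as a simple subset check: a system can take part in a group only if every pattern it cannot express has been left out of that group. The following Python sketch makes this explicit; the names GROUPS and runnable_groups are assumptions for illustration and are not part of the benchmark tooling.</p>
      <preformat><![CDATA[
```python
# Mappings left out of each group, following the three groups described above.
GROUPS = {
    1: {"RCP", "RCV", "AGG"},
    2: {"AGG"},
    3: set(),  # full set of mappings
}


def runnable_groups(unsupported, groups=GROUPS):
    """Return the groups in which a system with these unsupported patterns can run."""
    return [
        group
        for group, excluded in sorted(groups.items())
        if unsupported.issubset(excluded)
    ]


if __name__ == "__main__":
    print("Mosto:", runnable_groups({"RCP", "RCV", "AGG"}))
    print("LDIF/R2R:", runnable_groups({"AGG"}))
    print("Jena TDB:", runnable_groups(set()))
```
]]></preformat>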
    </sec>
    <sec id="sec-7">
      <title>6. RELATED WORK</title>
      <p>
        The most closely related benchmarks are STBenchmark [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
and DTSBench [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Alexe et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] devised STBenchmark,
a benchmark that is used to test data translation systems
in the context of nested relational models. This benchmark
provides eleven patterns that occur frequently in the
information integration context. Unfortunately, this benchmark
is not suitable in our context since semantic-web
technologies have a number of inherent differences with respect to
nested relational models [
        <xref ref-type="bibr" rid="ref14 ref15 ref2 ref25">2, 14, 15, 25</xref>
        ].
      </p>
      <p>
        Rivero et al. [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] devised DTSBench, a benchmark to test
data translation systems in the context of semantic-web
technologies that provides seven data translation patterns.
Furthermore, it provides seven parameters for
creating a variety of synthetic, domain-independent data
translation tasks to test such systems. This benchmark
is suitable for testing data translation amongst Linked Data
sources; however, the patterns that it provides are inspired
by the ontology evolution and information integration
contexts, not the Linked Data context. Therefore, it can
generate synthetic tasks based on these patterns, but not
real-world Linked Data translation tasks.
      </p>
      <p>
        There are other benchmarks in the literature that are
suitable for testing semantic-web technologies. However, they
cannot be applied to our context, since none of them focuses on
data translation problems, i.e., they do not provide source
and target data sets and a number of queries to perform data
translation. Furthermore, these benchmarks focus mainly on
SPARQL Select queries, which are not suitable for
data translation, rather than on SPARQL Construct queries.
Guo et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] presented LUBM, a benchmark to compare
systems that support semantic-web technologies, which
provides a single ontology, a data generator that
creates scalable synthetic data, and fourteen SPARQL
queries of the Select type. Wu et al. [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] reported their
experience implementing an inference
engine for Oracle. Bizer and Schultz [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] presented BSBM, a
benchmark to compare the performance of SPARQL queries
using native RDF stores and SPARQL-to-SQL query
rewriters. Schmidt et al. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] presented SP2Bench, a benchmark
to test SPARQL query management systems, which
comprises both a data generator and a set of benchmark queries
in SPARQL.
      </p>
    </sec>
    <sec id="sec-8">
      <title>7. CONCLUSIONS</title>
      <p>Linked Data sources try to reuse as many existing
vocabularies as possible in order to ease the integration of data
from multiple sources. Other data sources use completely
proprietary vocabularies to represent their content or use a
mixture of common terms and proprietary terms. As a result,
there is heterogeneity amongst vocabularies
in the context of Linked Data. Data translation, which
relies on executable mappings and consists of exchanging data
from a source data set to a target data set, helps solve these
heterogeneity problems.</p>
      <p>In this paper, we presented LODIB, a benchmark to test
data translation systems in the context of Linked Data. Our
benchmark provides a catalogue of fifteen data translation
patterns, each of which is a common data translation
problem. Furthermore, we analyzed 84 random examples of data
translation in the LOD Cloud and studied the
distribution of the patterns in these examples. Taking these results
into account, we devised three source data sets and one target data
set based on the e-commerce domain that reflect the
mapping pattern distribution. Each source data set comprises
one data translation task.</p>
      <p>[Table 7 fragment] SPARQL 1.1 / Jena TDB: 2,925; 6,858; 12,774.
* Hadoop version of LDIF run as a single-node cluster; the in-memory version ran out of memory.
1: without RCP, RCV and AGG mappings.
2: without AGG mapping.</p>
      <p>
Current benchmarks concerning data translation focus either on
nested relational models, which are not suitable for our
context since semantic-web technologies have a number of
inherent differences with respect to these models, or on the
general context of semantic-web technologies. To the best of
our knowledge, LODIB is the first benchmark that is based
on the real-world distribution of data translation patterns in the
LOD Cloud, and that is specifically tailored towards the
Linked Data context.</p>
      <p>In this paper, we compared three data translation systems,
Mosto, SPARQL 1.1/Jena TDB and R2R, by scaling the
three data translation tasks. In this context, Mosto is able to
deal with 12 out of the 15 mapping patterns described in this
paper, SPARQL 1.1/Jena TDB deals with 15 out of 15, and
R2R deals with 14 out of 15. Furthermore, the results show
that R2R outperforms both the Mosto and the SPARQL 1.1/Jena
TDB data translation systems when performing the three
data translation tasks. Our empirical study has shown that
only a small set of simple mapping patterns is needed
to translate data amongst data sets in the LOD Cloud. In this
context, the fifteen mapping patterns identified in this paper
were enough to cover the vast majority of data translation
problems when integrating these data sets.</p>
      <p>
        As the Web of Data grows, the task of translating data
amongst data sets moves into focus. We hope that the
LODIB benchmark will be considered useful by the
developers of the currently existing Linked Data translation systems
as well as of the systems to come. More information about
LODIB is publicly available at the homepage [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], such as
the exact specification of the benchmark data sets, the data
generator, examples of the mapping patterns, or the
statistics about these mappings that we found in the LOD Cloud.
      </p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgments</title>
      <p>Supported by the European Commission (FEDER), the
Spanish and the Andalusian R&amp;D&amp;I programmes (grants
P07-TIC-2602, P08-TIC-4100, TIN2008-04718-E,
TIN2010-21744, TIN2010-09809-E, TIN2010-10811-E, and
TIN2010-09988-E), and partially financed through funds received
from the European Community’s Seventh Framework
Programme (FP7) under Grant Agreement No. 256975 (LATC)
and Grant Agreement No. 257943 (LOD2).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Alexe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. C.</given-names>
            <surname>Tan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Velegrakis</surname>
          </string-name>
          .
          <article-title>STBenchmark: Towards a benchmark for mapping systems</article-title>
          .
          <source>PVLDB</source>
          ,
          <volume>1</volume>
          (
          <issue>1</issue>
          ):
          <fpage>230</fpage>
          -
          <lpage>244</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Angles</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutiérrez</surname>
          </string-name>
          .
          <article-title>Survey of graph database models</article-title>
          .
          <source>ACM Comput. Surv.</source>
          ,
          <volume>40</volume>
          (
          <issue>1</issue>
          ),
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Arenas</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Libkin</surname>
          </string-name>
          .
          <article-title>XML data exchange: Consistency and query answering</article-title>
          .
          <source>J. ACM</source>
          ,
          <volume>55</volume>
          (
          <issue>2</issue>
          ),
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          .
          <article-title>Linked Data - the story so far</article-title>
          .
          <source>Int. J. Semantic Web Inf. Syst.</source>
          ,
          <volume>5</volume>
          (
          <issue>3</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jentzsch</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          .
          <article-title>State of the LOD cloud</article-title>
          . Available at: http://www4.wiwiss.fu-berlin.de/lodcloud/state/#terms,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Schultz</surname>
          </string-name>
          .
          <article-title>The Berlin SPARQL benchmark</article-title>
          .
          <source>Int. J. Semantic Web Inf. Syst.</source>
          ,
          <volume>5</volume>
          (
          <issue>2</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>24</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Schultz</surname>
          </string-name>
          .
          <article-title>The R2R framework: Publishing and discovering mappings on the Web</article-title>
          .
          <source>In 1st International Workshop on Consuming Linked Data (COLD)</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Blum</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Cohen</surname>
          </string-name>
          .
          <article-title>Grr: Generating random RDF</article-title>
          .
          <source>In ESWC (2)</source>
          , pages
          <fpage>16</fpage>
          -
          <lpage>30</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Scharffe</surname>
          </string-name>
          .
          <article-title>Processing ontology alignments with SPARQL</article-title>
          .
          <source>In CISIS</source>
          , pages
          <fpage>913</fpage>
          -
          <lpage>917</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Kolaitis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Popa</surname>
          </string-name>
          .
          <article-title>Data exchange: semantics and query answering</article-title>
          .
          <source>Theor. Comput. Sci.</source>
          ,
          <volume>336</volume>
          (
          <issue>1</issue>
          ):
          <fpage>89</fpage>
          -
          <lpage>124</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Fuxman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. T. H.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Papotti</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Popa</surname>
          </string-name>
          .
          <article-title>Nested mappings: Schema mapping reloaded</article-title>
          .
          <source>In VLDB</source>
          , pages
          <fpage>67</fpage>
          -
          <lpage>78</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Heflin</surname>
          </string-name>
          .
          <article-title>LUBM: A benchmark for OWL knowledge base systems</article-title>
          .
          <source>J. Web Sem.</source>
          ,
          <volume>3</volume>
          (
          <issue>2-3</issue>
          ):
          <fpage>158</fpage>
          -
          <lpage>182</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <article-title>Linked Data: Evolving the Web into a Global Data Space</article-title>
          . Morgan &amp; Claypool,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          , and
          <string-name>
            <given-names>U.</given-names>
            <surname>Sattler</surname>
          </string-name>
          .
          <article-title>Bridging the gap between OWL and relational databases</article-title>
          .
          <source>J. Web Sem.</source>
          ,
          <volume>7</volume>
          (
          <issue>2</issue>
          ):
          <fpage>74</fpage>
          -
          <lpage>89</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Noy</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. C. A.</given-names>
            <surname>Klein</surname>
          </string-name>
          .
          <article-title>Ontology evolution: Not the same as schema evolution</article-title>
          .
          <source>Knowl. Inf. Syst.</source>
          ,
          <volume>6</volume>
          (
          <issue>4</issue>
          ):
          <fpage>428</fpage>
          -
          <lpage>440</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>F. S.</given-names>
            <surname>Parreiras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schenk</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Winter</surname>
          </string-name>
          .
          <article-title>Model driven specification of ontology translations</article-title>
          .
          <source>In ER</source>
          , pages
          <fpage>484</fpage>
          -
          <lpage>497</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Scharffe</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Schindlauer</surname>
          </string-name>
          .
          <article-title>SPARQL++ for mapping between RDF vocabularies</article-title>
          .
          <source>In ODBASE</source>
          , pages
          <fpage>878</fpage>
          -
          <lpage>896</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>H.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>LePendu</surname>
          </string-name>
          .
          <article-title>Discovering executable semantic mappings between ontologies</article-title>
          .
          <source>In ODBASE</source>
          , pages
          <fpage>832</fpage>
          -
          <lpage>849</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Rivero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Hernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ruiz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Corchuelo</surname>
          </string-name>
          .
          <article-title>Generating SPARQL executable mappings to integrate ontologies</article-title>
          .
          <source>In ER</source>
          , pages
          <fpage>118</fpage>
          -
          <lpage>131</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Rivero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Hernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ruiz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Corchuelo</surname>
          </string-name>
          .
          <article-title>Mosto: Generating SPARQL executable mappings between ontologies</article-title>
          .
          <source>In ER Workshops</source>
          , pages
          <fpage>345</fpage>
          -
          <lpage>348</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Rivero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Hernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ruiz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Corchuelo</surname>
          </string-name>
          .
          <article-title>On benchmarking data translation systems for semantic-web ontologies</article-title>
          .
          <source>In CIKM</source>
          , pages
          <fpage>1613</fpage>
          -
          <lpage>1618</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Rivero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schultz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <article-title>Linked Open Data Integration Benchmark (LODIB) specification</article-title>
          . Available at: http://www4.wiwiss.fu-berlin.de/bizer/lodib/,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hornung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lausen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Pinkel</surname>
          </string-name>
          .
          <article-title>SP2Bench: A SPARQL performance benchmark</article-title>
          .
          <source>In ICDE</source>
          , pages
          <fpage>222</fpage>
          -
          <lpage>233</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.</given-names>
            <surname>Schultz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Matteini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Isele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Becker</surname>
          </string-name>
          .
          <article-title>LDIF - Linked Data integration framework</article-title>
          .
          <source>In 2nd International Workshop on Consuming Linked Data (COLD)</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>M.</given-names>
            <surname>Uschold</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Grüninger</surname>
          </string-name>
          .
          <article-title>Ontologies and semantics for seamless connectivity</article-title>
          .
          <source>SIGMOD Record</source>
          ,
          <volume>33</volume>
          (
          <issue>4</issue>
          ):
          <fpage>58</fpage>
          -
          <lpage>64</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Eadon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. I.</given-names>
            <surname>Chong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kolovski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Annamalai</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Srinivasan</surname>
          </string-name>
          .
          <article-title>Implementing an inference engine for RDFS/OWL constructs and user-defined rules in Oracle</article-title>
          .
          <source>In ICDE</source>
          , pages
          <fpage>1239</fpage>
          -
          <lpage>1248</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>