<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Capturing Emerging Relations between Schema Ontologies on the Web of Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andriy Nikolov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Enrico Motta</string-name>
          <email>e.motta@open.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Knowledge Media Institute, The Open University</institution>
          ,
          <addr-line>Milton Keynes</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Semantic heterogeneity caused by the use of different ontologies to describe the same topics represents an obstacle for many data integration tasks on the Web of Data, in particular, discovering relevant repositories for interlinking and comparing repositories with respect to the coverage of specific domains. To facilitate these tasks, mappings between schema terms are needed alongside the links between instances. Currently, explicitly specified schema-level mappings are scarce in comparison with instance-level links. However, by analysing existing instance-level links it is possible to capture correspondences between classes to which these instances belong. In our experiments, we applied this approach on a large scale to generate schema-level mappings between several Linked Data repositories. The results of these experiments provide some interesting insights about the use of ontologies on the Web of Data and schema-level relations which emerge from existing data-level interlinks.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <label>1</label>
      <title>Introduction</title>
      <p>
        One of the main motivations behind large-scale data publishing using the Linked
Data approach [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is the possibility to integrate relevant information originally
published by different providers. This is achieved, in particular, by establishing
links between instances in different repositories. However, linking a new
repository to other datasets in the cloud remains a non-trivial task for a data publisher.
In order to tackle this task, several questions have to be answered, in particular:
In order to answer the first question, one needs to know what types of individuals
are stored in the datasets. With respect to the latter question, the choice of a
candidate third-party repository for establishing links depends on several factors,
in particular, their coverage (are all instances from the new repository mentioned
in a candidate repository?), popularity (which one is the commonly accepted
reference source for a specific type of data?), and level of detail (which repository
describes the most properties for instances of a particular class?).
      </p>
      <p>These questions can partially be answered with the help of meta-level
descriptions using the voiD ontology<sup>1</sup>. However, voiD descriptors may be insufficient
to compare some of the characteristics (e.g., whether the domain of food and
diets is better covered in Freebase or DBPedia). Moreover, voiD descriptors do not
always describe all relevant properties of datasets (e.g., dcterms:subject is not
always provided), and for some datasets they may not be available.</p>
      <p>One of the major obstacles which complicate this kind of analysis is schema
heterogeneity. It can be difficult to establish automatically that two repositories
describe the same kind of data, retrieve relevant data subsets from them, and
make a comparison, if these repositories use different terminology to describe
the same or semantically similar types of instances. For example, a
hypothetical repository describing a TV program may need to refer to descriptions of
movies, music pieces, and their performers. There are several repositories
available on the Web: e.g., specific sources describing the music topic (MusicBrainz,
Jamendo, etc.), the movie topic (LinkedMDB), as well as generic sources
covering both (DBPedia, Freebase). In order to compare how suitable these
repositories are as reference sources, it is useful to know which classes in the
respective ontologies contain overlapping data: e.g., music:MusicArtist and
dbpedia:MusicalArtist, linkedmdb:film and dbpedia:Movie, etc. A high-level
overview of schema-level correspondences, showing the coverage of
topics by available ontologies, would help the data publisher to make appropriate
choices.</p>
      <p>In this paper, we describe our work on constructing such a network of
class-level mappings for a subset of the Linked Data cloud. So far, several ontologies
used by popular Linked Data repositories have been enriched with mappings
connecting them to other ontologies (most notably, in the context of the UMBEL
project<sup>2</sup>). However, these mappings, constructed in a top-down way, only cover
a limited subset of the Web of Data and do not fully reflect the structure of the
repository network formed by instance-level links (e.g., such important
repositories as Freebase, RKBExplorer, and LinkedMDB are not covered). Given the
abundance of existing instance-level links, a bottom-up process where the
correspondences between classes are captured based on the links between sets of
their instances becomes a promising approach. We applied light-weight
instance-based ontology matching techniques to a snapshot of the Web of Data which
was proposed for the Billion Triple Challenge 2009 competition<sup>3</sup> and extracted
a large-scale network of ontology mappings. This network provides interesting
insights into the use of ontologies on the Web of Data and can be employed to
facilitate data integration.</p>
      <p>The rest of the paper is organised as follows: in section 2 we briefly outline
the ontology matching process we used to extract the mappings and discuss
our observations about its applicability and limitations. Then, in section 3 we
describe the resulting network of schema mappings we obtained. In section 4 we
overview relevant existing work. Finally, section 5 discusses the limitations of
our work and directions for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>1 http://semanticweb.org/wiki/VoiD</title>
    </sec>
    <sec id="sec-3">
      <title>2 http://www.umbel.org</title>
    </sec>
    <sec id="sec-4">
      <title>3 http://vmlion25.deri.ie/</title>
      <sec id="sec-4-1">
        <label>2</label>
        <title>Constructing the schema network</title>
        <p>The snapshot of the Web of Data which we used in our work was proposed
for the Billion Triple Challenge 2009 competition<sup>4</sup>. This is a large-scale dataset
containing about 1.14 billion statements. It contains the core portion of the
repositories published within the Linking Open Data (LOD) initiative, as well
as many smaller datasets retrieved using Semantic Web search engines, such
as Watson and Falcon-S. The LOD datasets included in the BTC repository,
such as DBPedia, Freebase, Bio2RDF, RKBExplorer, Geonames, and others, still
constitute the core of the Web of Data cloud and are commonly used to connect
other datasets. Thus, their schema ontologies are particularly interesting for
potential data integration scenarios.</p>
        <p>
          To derive the sets of mappings between these ontologies, we applied a
lightweight matching technique which computes the similarity between a pair of
classes based on the degree of overlap between their instance sets. Originally, we
used this approach to produce schema-level mappings in order to facilitate
further instance coreference resolution and discover previously missing links [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. An
advantage of using instance-based ontology matching techniques in the Linked
Data environment lies in their ability to capture interconnections between
ontologies which emerged from the way they are used by actual repositories rather
than the way they were originally designed.
        </p>
        <p>When two classes share at least one individual, we say that there is an overlap
relation between these classes. There are two common cases where an individual
becomes assigned to several classes defined in different ontologies:</p>
        <list list-type="bullet">
          <list-item><p>Declared coreference association. In this case, two individuals belonging to
different repositories are declared to be identical and linked via the owl:sameAs
property. This creates an overlap relation between the classes to which the
instances belong.</p></list-item>
          <list-item><p>Co-typing. In this case the publishers of a repository structure the data
using terms of several ontologies. In this way, one individual can be explicitly
assigned to several classes from different ontologies. One example is DBPedia,
which uses Yago and UMBEL ontologies in addition to its native DBPedia
ontology.</p></list-item>
        </list>
        <p>These two types of overlaps illustrate different aspects of the data structure.
Declared association-based overlap relations characterise the distribution of data
in different repositories and correspondences between sets of their individuals.
Co-typing-based mappings mostly highlight the choices of data publishers to
use specific vocabularies to annotate their data. To keep this distinction, in
this paper we analyse the declared association-based and co-typing-based overlap
mappings separately.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4 Dataset statistics can be found on http://vmlion25.deri.ie/ and http://gromgull.net/blog/category/semantic-web/billion-triple-challenge/.</title>
      <p>In order to generate all overlap relations present in the dataset, we used the
following procedure:</p>
      <list list-type="order">
        <list-item><p>Extract all rdf:type relations present in the dataset: A(I), where A is a class
and I is an instance of this class.</p></list-item>
        <list-item><p>For each class A, generate the set of its instances (extension): e(A) = {I | A(I)}.</p></list-item>
        <list-item><p>For each pair of classes A and B, generate the co-typing-based overlap set:
e<sub>c</sub>(A ∩ B) = {I | A(I), B(I)}. In total, this constituted about 3.6M
co-typing-based overlap mappings (we only considered intersections between classes
which did not share the same URI namespace).</p></list-item>
        <list-item><p>Extract all owl:sameAs relations present in the dataset (sameAs(I<sub>1</sub>, I<sub>2</sub>)) and
generate their transitive closure.</p></list-item>
        <list-item><p>Generate association-based overlap sets: e<sub>a</sub>(A ∩ B) = {I<sub>1</sub> | A(I<sub>1</sub>), B(I<sub>2</sub>),
sameAs(I<sub>1</sub>, I<sub>2</sub>)} (one sameAs relation corresponds to one element in the
set). In total, about 1M (992482) association-based overlap mappings were
produced.</p></list-item>
      </list>
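      <p>The procedure above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the experiments: the function names, the in-memory dictionaries, the namespace heuristic (cutting a URI at its last "#" or "/"), and the use of a union-find structure for the owl:sameAs transitive closure are all assumptions made for the example. Overlap sets are stored as sets of instances, so repeated sameAs statements for the same instance are counted once.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict
from itertools import combinations


def namespace(uri):
    """Assumed heuristic: the namespace is the URI up to its last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1]


def classes_by_instance(type_triples):
    """Steps 1-2 (inverted): map each instance I to the classes A with A(I)."""
    classes_of = defaultdict(set)
    for instance, cls in type_triples:
        classes_of[instance].add(cls)
    return classes_of


def co_typing_overlaps(type_triples):
    """Step 3: co-typing overlap sets for classes from different namespaces."""
    overlaps = defaultdict(set)
    for instance, classes in classes_by_instance(type_triples).items():
        for a, b in combinations(sorted(classes), 2):
            if namespace(a) != namespace(b):
                overlaps[(a, b)].add(instance)
    return overlaps


def same_as_clusters(same_as_pairs):
    """Step 4: transitive closure of owl:sameAs, computed with union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i1, i2 in same_as_pairs:
        parent[find(i1)] = find(i2)
    clusters = defaultdict(set)
    for x in list(parent):
        clusters[find(x)].add(x)
    return list(clusters.values())


def association_overlaps(type_triples, same_as_pairs):
    """Step 5: collect I1 for every A(I1), B(I2) with sameAs(I1, I2)."""
    classes_of = classes_by_instance(type_triples)
    overlaps = defaultdict(set)
    for cluster in same_as_clusters(same_as_pairs):
        for i1 in cluster:
            for i2 in cluster - {i1}:
                for a in classes_of.get(i1, ()):
                    for b in classes_of.get(i2, ()):
                        overlaps[(a, b)].add(i1)
    return overlaps
```
]]></preformat>
      <p>In this sketch, two instances linked only through a third one end up in the same cluster, so their classes receive an overlap entry in the same way as directly linked ones.</p>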
      <p>For association-based overlap sets we distinguished between a direct class link
(when their individuals were explicitly stated in the dataset as identical) and
an indirect link (when owl:sameAs relations were inferred using transitivity).
Indirect mappings occurred, in particular, when two repositories were connected
via a third one (e.g., MusicBrainz and Freebase via DBPedia). Both sets of
mappings were filtered to remove general-purpose concepts (such as OWL and
RDFS terms) and blank nodes. These two sets of mappings constitute the “raw
data” which were later analysed to retrieve valid semantic mappings.</p>
      <p>
        In our original work [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we used set similarity-based metrics to discover
relations between “strongly overlapping” classes in the ontologies. We used a
fuzzy notion of “strong overlap” instead of strict subsumption or equivalence
for two main reasons. First, in the Linked Data environment strict mappings are in
many cases impossible to derive: sometimes even strong semantic similarity
between concepts does not imply strict equivalence. For instance, the concept
dbpedia:Actor denotes professional actors (both cinema and stage), while the
concept movie:actor in LinkedMDB refers to any person who played a role in
a movie, including participants in documentaries, but excluding stage actors.
Second, such “strong overlap” relations are valuable because they often point to
semantically similar categories which to a large extent share the same instances.
While not always strictly logically correct, these relations are still valuable for
the goals we discussed in section 1: determining and comparing suitable sources
for linking.
      </p>
      <p>In order to capture the optimal parameters for distinguishing valid semantic
mappings, in the experiments described in this paper we employed a machine
learning approach. To construct a gold standard set, we randomly selected
a set of 6000 mappings (3000 association-based and 3000 co-typing-based ones)
and annotated them manually (“strong overlap” relations were assigned based on
subjective judgement). In these initial experiments, annotation was done by one
person. After that, we used this gold standard set to train a classification model
which would assign the relation type to any pair of overlapping classes. Our
goal was to find a suitable classifier to distinguish between valid subsumption
and equivalence mappings (owl:equivalentClass and rdfs:subClassOf) and other
mappings.</p>
      <p>For the classifier, we included the following features:</p>
      <list list-type="bullet">
        <list-item><p>ns1, ns2: namespaces of the two class URIs A and B respectively.</p></list-item>
        <list-item><p>|e(A ∩ B)|: the size of the set of instances belonging to both classes A and B.</p></list-item>
        <list-item><p>|e(A)|, |e(B)|: sizes of the instance sets for classes A and B respectively.</p></list-item>
        <list-item><p>δ(A, B), δ(B, A), where δ(X, Y) = |e(X ∩ Y)| / |e(X)|.</p></list-item>
        <list-item><p>direct (only for declared association-based links): a boolean value equal to
true for direct declared association-based mappings and false otherwise.</p></list-item>
      </list>
      <p>To test the resulting model, we used the standard 10-fold cross-validation
mechanism. After testing, we found that the J48 decision tree algorithm was able
to achieve the best performance (Table 1), so this learned classifier was then
applied to the whole dataset.</p>
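      <p>As an illustration of how such a feature vector could be assembled for one pair of classes, consider the following Python sketch. It assumes that the ratio feature is the share of one class's instances that fall into the overlap set, and that a namespace can be obtained by cutting a URI at its last "#" or "/". The function and key names are hypothetical and are not taken from the original experiments, where the classifier itself was a J48 decision tree.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def namespace(uri):
    """Assumed heuristic: the namespace is the URI up to its last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1]


def overlap_ratio(overlap_size, extension_size):
    """Share of a class's instances that fall into the overlap set."""
    return overlap_size / extension_size if extension_size else 0.0


def feature_row(a, b, ext_a, ext_b, overlap, direct=None):
    """Build the features for the class pair (A, B).

    ext_a, ext_b: instance sets of A and B; overlap: their overlap set.
    direct: only meaningful for declared association-based mappings.
    """
    row = {
        "ns1": namespace(a),
        "ns2": namespace(b),
        "overlap_size": len(overlap),
        "size_a": len(ext_a),
        "size_b": len(ext_b),
        "ratio_ab": overlap_ratio(len(overlap), len(ext_a)),
        "ratio_ba": overlap_ratio(len(overlap), len(ext_b)),
    }
    if direct is not None:
        row["direct"] = direct
    return row
```
]]></preformat>
      <p>For a pair where every instance of B also belongs to A but only half of the instances of A belong to B, the two ratios are 1.0 and 0.5, which is the kind of asymmetry a learned model could associate with a subsumption relation.</p>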
      <p>The resulting set of mappings was compared against the set of already
existing schema-level relations declared in the dataset. We discovered that the
majority of overlap mappings were not covered by explicitly defined axioms.
Only 3119 mappings (2162 and 957 for the declared association-based and
co-typing-based subsets respectively) were found to be defined as rdfs:subClassOf
and owl:equivalentClass (or could be inferred), which constituted less than 2.6%
and 1.4% of the number of mappings selected by the learned classifier in each
case.</p>
      <sec id="sec-5-1">
        <label>3</label>
        <title>Analysing the resulting mappings</title>
        <p>We applied the learned decision tree models (J48) to our two sets of mappings
containing declared association-based and co-typing-based overlap mappings. At
the next step, we filtered out redundant mappings: when a class A is found to be a
subclass of two classes B and B<sub>super</sub> where B ⊑ B<sub>super</sub> and the distance metrics
are equal (δ(A, B) = δ(A, B<sub>super</sub>)), then only the mapping A ⊑ B remains, and
the mapping A ⊑ B<sub>super</sub> is removed. The two resulting sets of mappings were then
used to construct networks connecting classes from different ontologies. The
characteristics of this network are discussed in section 3.1. Then, in order to
study the relations between whole vocabularies, we used the original mappings
between classes to generate a set of mapping-based links between ontologies.
This stage is described in section 3.2.</p>
        <table-wrap>
          <label>Table 2</label>
          <caption><p>Properties of the resulting networks of classes</p></caption>
          <table>
            <thead>
              <tr><th>Property</th><th>Declared association-based</th><th>Co-typing-based</th></tr>
            </thead>
            <tbody>
              <tr><td>Number of nodes</td><td>20365</td><td>35578</td></tr>
              <tr><td>Number of edges</td><td>82422</td><td>67620</td></tr>
              <tr><td>Maximum number of connections per node</td><td>5301</td><td>18137</td></tr>
              <tr><td>Node with the maximum number of connections</td><td>geonames:Feature</td><td>foaf:Person</td></tr>
              <tr><td>Average number of connections per node</td><td>8.09</td><td>3.80</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <sec id="sec-5-1-1">
          <label>3.1</label>
          <title>Links between classes</title>
          <p>We obtained two graphs where classes played the role of nodes and mappings
represented edges. The properties of these resulting networks of classes are given
in Table 2. To give an overview of the most important “hub” nodes in the
network, Table 3 lists the top 10 classes ranked by the number of connections
they are involved in.</p>
          <p>We can see that the “hub” nodes are classes which represent popular
concepts and are defined at a high level in the class hierarchy. The large number of
mappings per class is mostly caused by many rdfs:subClassOf relations. After
analysing the distribution of mappings per class, we found that in both cases it
follows a power law and most classes had only one mapping to another class.</p>
          <p>The declared association-based network derived from owl:sameAs links
between instances is more connected: the average number of mappings per class is
8.09 compared to 3.8 in the co-typing-based case, despite the fact that it
contains fewer nodes. This is possibly caused by the “data-level focus” of the LOD
initiative: the priority for a data repository owner is to generate instance-level
links to other repositories rather than reuse several different vocabularies for
data description. In this case, class-level mappings automatically derived from
owl:sameAs links can be particularly helpful for data integration tasks, because
they add new information which was not explicitly stated in any one repository.
On the other hand, the co-typing-based network illustrates the impact of
ontology popularity: although the graph has more nodes, it is less connected, and a
single class foaf:Person contributes to more than 25% of all mappings. From the
results we obtained, we can see the strong influence of DBPedia on the
resulting mappings. In the association-based set, 7 out of the top 10 nodes relate to
top-level entities from the YAGO ontology. High positions of geonames:Feature
and freebase:people.person are also largely due to the number of DBPedia and
YAGO classes modelling the respective topics. In the co-typing-based network,
we can see the strong presence of the FOAF and WordNet ontologies (largely
due to their high reuse in small-scale datasets even before the start of the LOD
initiative). Beyond that, all top nodes in the network were produced based on
DBPedia instances annotated using different schemas. It is interesting to see the
high position of the class dbpedia:FootballPlayer. The main reason for it is the
large number of YAGO classes (Wikipedia categories) describing this topic.</p>
          <table-wrap>
            <label>Table 3</label>
            <caption><p>Top 10 classes ranked by the number of connections they are involved in</p></caption>
            <table>
              <thead>
                <tr><th rowspan="2">Rank</th><th colspan="2">Declared association-based</th><th colspan="2">Co-typing-based</th></tr>
                <tr><th>Name</th><th>Edges</th><th>Name</th><th>Edges</th></tr>
              </thead>
              <tbody>
                <tr><td>1</td><td>geonames:Feature</td><td>5301</td><td>foaf:Person</td><td>18137</td></tr>
                <tr><td>2</td><td>freebase:people.person</td><td>2318</td><td>umbel:Person</td><td>4533</td></tr>
                <tr><td>3</td><td>yago:PhysicalEntity100001930</td><td>2230</td><td>dbpedia:Person</td><td>2478</td></tr>
                <tr><td>4</td><td>yago:Object100002684</td><td>2076</td><td>foaf:OnlineAccount</td><td>1983</td></tr>
                <tr><td>5</td><td>yago:Abstraction100002137</td><td>1759</td><td>dbpedia:FootballPlayer</td><td>1300</td></tr>
                <tr><td>6</td><td>yago:Whole100003553</td><td>1511</td><td>wordnet:Person</td><td>1237</td></tr>
                <tr><td>7</td><td>linkedmdb:film</td><td>1085</td><td>dbpedia:Album</td><td>996</td></tr>
                <tr><td>8</td><td>yago:LivingThing100004258</td><td>975</td><td>dbpedia:Species</td><td>920</td></tr>
                <tr><td>9</td><td>yago:Organism100004475</td><td>974</td><td>dbpedia:Artist</td><td>900</td></tr>
                <tr><td>10</td><td>yago:CausalAgent100007347</td><td>956</td><td>dbpedia:MusicalArtist</td><td>853</td></tr>
              </tbody>
            </table>
          </table-wrap>
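          <p>The connectivity figures quoted above can be checked against the node and edge counts reported for the two networks. The short Python sketch below assumes that every mapping is counted as one undirected edge, so that each edge contributes to the degree of two nodes. Under this assumption the averages come out at about 8.09 and 3.80, and the foaf:Person share of co-typing-based edges is above 25%. The function name is hypothetical.</p>
          <preformat preformat-type="code"><![CDATA[
```python
def average_degree(nodes, edges):
    """Average connections per node, counting each undirected edge twice."""
    return 2 * edges / nodes


# Node and edge counts as reported for the two class networks.
declared_avg = average_degree(nodes=20365, edges=82422)
co_typing_avg = average_degree(nodes=35578, edges=67620)

# Share of co-typing-based edges that involve foaf:Person (18137 edges).
foaf_share = 18137 / 67620

print(round(declared_avg, 2), round(co_typing_avg, 2), round(foaf_share, 3))
```
]]></preformat>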
          <p>When we merged the two mapping sets into one, we found that only a small
subset of mappings (3591) was shared between the two networks. The two types of evidence
we used produced complementary sets of mappings rather than duplicating each
other.</p>
        </sec>
        <sec id="sec-5-1-2">
          <label>3.2</label>
          <title>Mapping-based links between ontologies</title>
          <p>
            In order to capture the relations between different vocabularies used on the
Web of Data, we generated a set of mapping-based links between ontologies.
In accordance with [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ], we say that there is a mapping-based link between two
ontologies O<sub>1</sub> and O<sub>2</sub> if there exists a mapping between classes A and B such
that A ∈ O<sub>1</sub> and B ∈ O<sub>2</sub>. The classes were assigned to ontologies based on their
URI prefixes, and mappings between classes from the same pair of ontologies
were grouped together. Table 4 contains the details of the resulting graphs, and
Table 5 lists for each case the top 10 nodes sorted by the number of edges they are
connected to.
          </p>
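          <p>The grouping step described above can be sketched in Python as follows. The sketch assumes that an ontology is identified by the part of a class URI up to its last "#" or "/", that links are undirected, and that mappings between classes of the same ontology are skipped. These choices and the function names are illustrative assumptions, not details confirmed by the experiments.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def namespace(uri):
    """Assumed heuristic: the ontology prefix is the URI up to its last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1]


def ontology_links(class_mappings):
    """Group class-level mappings (A, B) by the unordered ontology pair."""
    links = defaultdict(list)
    for a, b in class_mappings:
        o1, o2 = namespace(a), namespace(b)
        if o1 != o2:  # a link needs two different ontologies
            links[frozenset((o1, o2))].append((a, b))
    return links


def connections_per_ontology(links):
    """Number of distinct ontologies each ontology is linked to."""
    degree = defaultdict(int)
    for pair in links:
        for ontology in pair:
            degree[ontology] += 1
    return dict(degree)
```
]]></preformat>
          <p>Sorting the result of the second function by value gives a ranking of ontologies by the number of mapping-based links, in the spirit of the per-ontology ranking discussed in this section.</p>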
          <p>The graphs constructed using declared association-based and co-typing-based
evidence are shown in Fig. 1 and Fig. 2. In the declared association-based graph
(Fig. 1), the main factor which influences the position of an ontology in the
graph is topic coverage. The top 5 “hub” ontologies with wide coverage do not
differ much in the number of connections: YAGO (29), Freebase (28),
UMBEL (27), OpenCYC (26), and DBPedia (23). The 6th and the 7th ranking
nodes (LinkedMDB and eurostat), which cover specific domains, have only 13
connections each. It is interesting to note that although Freebase is connected
to fewer repositories than DBPedia in the LOD cloud<sup>5</sup>, this does not have an
impact at the schema level. This is the effect of indirect owl:sameAs mappings
inferred by transitivity. Connections of domain-specific ontologies (such as
Music ontology or Geonames) point to other ontologies covering the same domain,
and, indirectly, to the underlying repositories which contain relevant data. This
makes them good starting points when the task is to find several datasets
relevant to a specific topic. Both networks contain several disjoint subgraphs (5
and 35 respectively), and in both cases the same pattern occurs: there exists one
large central cluster including the majority of nodes and several small ones
usually including a pair of ontologies (e.g., a cluster {http://purl.uniprot.org/core/,
http://bio2rdf.org/ns/uniprot#}). In Fig. 1, similarly to the data-level LOD
cloud, we can also observe the existence of two “communities” centered around
DBPedia and RKBExplorer. At the schema level these are centered around
YAGO and AKT ontologies. Both communities are connected via the FOAF
ontology (rdfs:subClassOf relations with the foaf:Person class). At the data level,
RKBExplorer and DBPedia are connected via two other repositories: DBLP
Hannover and DBLP Berlin. The reason for missing schema-level links between
AKT and the ontologies used in DBPedia was the omission of intermediate
owl:sameAs links on this route, which did not allow indirect declared
association-based class mappings to be produced.</p>
          <p>The co-typing-based network (Fig. 2) is substantially larger (746 nodes vs
53) and mainly connects ontologies used outside the LOD cloud (including even
legacy schemas like DAML+OIL). In this graph, the distribution of nodes
primarily illustrates ontology popularity: FOAF (504 connections) and WordNet (296)
get the most connections because they are reused in many datasets.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5 http://richard.cyganiak.de/2007/10/lod/</title>
      <sec id="sec-6-1">
        <label>4</label>
        <title>Related Work</title>
        <p>
          Originally, schema matching approaches in the database and Semantic Web
domains primarily focused on the task of matching two input schemas in isolation
from others [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. With the availability of public ontologies, schema matching
methods started to utilise external sources as background knowledge. One
approach proposed in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] matches two ontologies by linking them to an external
third one. Then, semantic relations defined in this external ontology are used
to infer mappings between entities of the two original ones. The SCARLET tool [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]
further elaborates this approach and employs a set of external ontologies, which
it searches and selects using the Watson ontology search server<sup>6</sup>.
        </p>
        <p>
          Recently, with the growing number of public repositories storing data about
overlapping domains, it became important to analyse the emerging network
of interconnections as a whole. The idMesh system [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] analysed the network of
instance-level owl:sameAs coreference links between semantic repositories with
the aim to identify spurious links and remove them. In [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] the authors used
light-weight matching techniques to create a large set of schema-level mappings
between ontologies from the BioPortal repository describing the medical
domain. Then, the authors analysed the resulting network to gain insights about
ontological coverage of the domain. We take a similar approach; however, our
primary interest is in schema mappings which emerge from existing data-level
links between repositories.
        </p>
      </sec>
      <sec id="sec-6-2">
        <label>5</label>
        <title>Conclusion and future work</title>
        <p>As mentioned in section 1, schema-level mappings can become a valuable asset
for the data publisher who wants to integrate a new repository into the Linked
Data environment: for example, given a new repository about music described
using the Music ontology, the pool of potential data sources to connect to would
include other datasets using the same ontology, but also repositories which
use ontologies mapped to it (DBPedia, Freebase, LinkedMDB, etc.). From
this pool the publisher can select the most comprehensive data source for her
needs.</p>
        <p>We consider the work described in this paper as our starting point in studying
the emerging relations between ontologies on the Web of Data. There are several
interesting future directions of research. First, our approach focused on
establishing mappings between classes while ignoring mappings between properties,
which are equally important in data integration scenarios. Mappings between
properties are needed to represent data from different ontologies in a uniform
way, which is necessary for applying coreference resolution tools or, in a more
general scenario, to present query results to the user.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6 http://watson.kmi.open.ac.uk/WatsonWUI/</title>
      <p>Second, in the context of our intended scenario (assisting the publisher in the
choice of appropriate points of linkage) the quality of mappings had relatively
low importance: a mapping is still useful if it connects two classes with a strong
degree of overlap, but no strict logical relation holds. This allowed us to use very
simple matching techniques to generate schema-level mappings. However, this
assumption does not hold for many actual data integration scenarios: in general,
a precise SPARQL query is not expected to return irrelevant results. Thus,
applying state-of-the-art ontology matching tools to discover high-quality schema
mappings in the Linked Data environment constitutes the second direction for
future work.</p>
      <sec id="sec-7-1">
        <label>6</label>
        <title>Acknowledgements</title>
        <p>Part of this research has been funded under the EC 7th Framework Programme,
in the context of the SmartProducts project (231204). The authors would like to
thank Paul Groth and Christophe Guéret for providing the 4store Amazon EC2
community server hosting the BTC dataset.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Linked data - the story so far</article-title>
          .
          <source>International Journal on Semantic Web and Information Systems (IJSWIS) 5</source>
          (
          <issue>3</issue>
          )
          <fpage>1</fpage>
          –
          <lpage>22</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Nikolov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uren</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motta</surname>
          </string-name>
          , E.,
          <string-name>
            <surname>de Roeck</surname>
          </string-name>
          , A.:
          <article-title>Overcoming schema heterogeneity between linked semantic repositories to improve coreference resolution</article-title>
          .
          <source>In: 4th Asian Semantic Web Conference (ASWC</source>
          <year>2009</year>
          ), Shanghai, China (
          <year>2009</year>
          )
          <fpage>332</fpage>
          –
          <lpage>346</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ghazvinian</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Noy</surname>
            ,
            <given-names>N.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jonquet</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Musen</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>What four million mappings can tell you about two hundred ontologies</article-title>
          .
          <source>In: 8th International Semantic Web Conference (ISWC</source>
          <year>2009</year>
          ), Washington DC, USA (
          <year>2009</year>
          )
          <fpage>229</fpage>
          –
          <lpage>242</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Rahm</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Do</surname>
            ,
            <given-names>H.H.</given-names>
          </string-name>
          :
          <article-title>Data cleaning: Problems and current approaches</article-title>
          .
          <source>IEEE Bulletin of the Technical Committee on Data Engineering</source>
          <volume>23</volume>
          (
          <issue>4</issue>
          ) (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Euzenat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shvaiko</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          : Ontology matching. Springer-Verlag, Heidelberg (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Aleksovski</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>M.C.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>ten Kate</surname>
          </string-name>
          , W., van
          <string-name>
            <surname>Harmelen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Matching unstructured vocabularies using a background ontology</article-title>
          .
          <source>In: 15th International Conference on Knowledge Engineering and Knowledge Management (EKAW</source>
          <year>2006</year>
          ). (
          <year>2006</year>
          )
          <fpage>182</fpage>
          –
          <lpage>197</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Sabou</surname>
          </string-name>
          , M.,
          <string-name>
            <surname>d'Aquin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motta</surname>
          </string-name>
          , E.:
          <article-title>Exploring the Semantic Web as background knowledge for ontology matching</article-title>
          .
          <source>Journal of Data Semantics XI</source>
          (
          <year>2008</year>
          )
          <fpage>156</fpage>
          –
          <lpage>190</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Cudré-Mauroux</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haghani</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jost</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aberer</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Meer</surname>
          </string-name>
          , H.:
          <article-title>idMesh: Graph-based disambiguation of linked data</article-title>
          .
          <source>In: 18th International World Wide Web Conference (WWW</source>
          <year>2009</year>
          ), Madrid, Spain, ACM
          (
          <year>2009</year>
          )
          <fpage>591</fpage>
          –
          <lpage>600</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>