<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards a Benchmark for Instance Matching</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alfio Ferrara</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Lorusso</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano Montanelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gaia Varese</string-name>
          <email>vareseg@dico.unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Università degli Studi di Milano, DICo</institution>
          ,
          <addr-line>10235 Milano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the general field of knowledge interoperability and ontology matching, instance matching is a crucial task for several applications, from identity recognition to data integration. The aim of instance matching is to detect instances referring to the same real-world object despite the differences among their descriptions. Algorithms and techniques for instance matching have been proposed in the literature, but the problem of their evaluation is still open. Furthermore, a widely recognized problem in the Semantic Web in general is the lack of evaluation data. While OAEI (Ontology Alignment Evaluation Initiative) has provided a reference benchmark for concept matching, evaluation data for instance matching are still scarce. In this paper, we provide a benchmark for instance matching, with the goal of taking into account the main requirements that instance matching algorithms should address.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The increasing popularity of Semantic Web technologies makes ontology
matching a crucial task. The aim of ontology matching [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is to (semi-)automatically
detect semantic correspondences between heterogeneous ontologies. It
can be performed at two different levels: schema matching and instance
matching. The objective of schema matching [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is to find a set of mappings between
concepts and properties in different ontologies, while the aim of instance
matching is to detect instances referring to the same real-world object. When
comparing different knowledge representations, the ontologies' schemas should first be merged,
in terms of the concepts and properties describing the domain. Then, mappings
between different descriptions (i.e., ontology instances) of the same object should
be discovered, in order to provide a data integration system
over Semantic Web sources.
      </p>
      <p>
        Instance matching is also crucial in projects like OKKAM [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], where the main
idea is that descriptions of real-world objects can be retrieved, uniquely
identified and shared over the Web.
      </p>
      <p>Most research has focused on schema-level matching, while the instance
matching problem has been mainly studied in the database field, where it is more
specifically called the record linkage problem [<xref ref-type="bibr" rid="ref4 ref5 ref6">4-6</xref>]. However, as shown in this paper,
instance matching raises new problems in comparison to record linkage and
requires specific technologies.</p>
    </sec>
    <sec id="sec-2">
      <title>The Instance Matching Problem</title>
      <p>The instance matching problem is defined as follows. Given two instances i1 and
i2, belonging to the same ontology or to different ontologies, instance matching
is defined as a function Im(i1, i2) → {0, 1}, where 1 denotes that i1 and
i2 refer to the same real-world object and 0 denotes that i1 and
i2 refer to different objects.</p>
      <p>In order to determine correctly whether two individuals refer to the same real-world
object, an instance matching algorithm should satisfy different kinds of
requirements. As shown in Figure 1, these can be divided into three main categories.</p>
      <p>[Figure 1: Requirements for instance matching (management of). Data value differences: typographical errors; use of different standard formats. Structural heterogeneity: use of different levels of depth for properties representation; use of different aggregation criteria for properties representation; missing values specification. Logical heterogeneity: instantiation on different subclasses of the same superclass; instantiation on disjoint classes; instantiation on different classes of a class hierarchy explicitly declared; instantiation on different classes of a class hierarchy implicitly declared; implicit values specification.]</p>
      <p>Data value differences. An instance matching algorithm is required to
recognize corresponding values as reliably as possible, even if the data contain errors or
are represented using different standard formats. This issue has been addressed
in record linkage research, and the problem of comparing the
property values of instances is the same as that of comparing the attribute values of records.</p>
      <p>Structural heterogeneity. Instances belonging to different ontologies can
differ not only in their property values, but also in their
structure. While in record linkage the structure of records is usually given, and schema
matching and record matching are different problems, in instance matching schema and
instances are more strictly related. Thus, besides being able to evaluate the
level of similarity between property values, instance matching techniques have
to cope with heterogeneous representations of individuals by identifying the pairs
of matching properties between the two instances considered.</p>
      <p>Logical heterogeneity. A problem specific to ontology matching, which is not
taken into consideration in the record linkage process, is the need to infer implicit
knowledge, typically related to the concept hierarchy within the ontologies.</p>
    </sec>
    <sec id="sec-3">
      <title>Design of a Benchmark for Instance Matching</title>
      <p>
        A widely recognized problem in the Semantic Web is the lack of evaluation data.
While OAEI (Ontology Alignment Evaluation Initiative)<sup>2</sup> [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] has provided a
reference benchmark for concept matching, evaluation data for instance matching
are still scarce. Further works dealing with concept matching evaluation are those
published at ESWC 2008 [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]. In particular, they argue that ontology
matching techniques cannot be evaluated in an application-independent way, since the
same matching technique can produce results of different quality depending on the
end-to-end application that exploits the alignments.
      </p>
      <p>In this paper, we provide a benchmark for instance matching. The aim of our
benchmark is to take into account all the main requirements presented in the
previous section and to provide a complete set of tests for the evaluation of instance matching
algorithms. A contribution of our work is not only the definition of
a specific benchmark, but also the definition of a semi-automatic procedure for
the generation of several different benchmarks. Figure 2 shows the overall
benchmark generation process. As an example of this general procedure,
we describe in the following a specific instantiation of it, namely the creation
of a specific benchmark for instance matching. That benchmark is available at
http://islab.dico.unimi.it/iimb/.</p>
      <sec id="sec-3-1">
        <title>Reference ABox Generation</title>
        <p>First of all, we chose a domain of interest (i.e., the domain of movie data), and we
created a reference ALCF(D) TBox for it, based on our knowledge of the
domain. The reference TBox is available at
http://islab.dico.unimi.it/ontologies/benchmark/imdbT.owl. It contains 15 named classes, 5 object properties and
13 datatype properties. The reference TBox is then populated by automatically
creating a reference ABox. Data are extracted from IMDb<sup>3</sup> by executing a query
parameterized by a variable X.</p>
        <p><sup>2</sup> http://oaei.ontologymatching.org/2007/benchmarks/.</p>
        <p><sup>3</sup> http://www.imdb.com/.</p>
        <p>[Figure 2: The benchmark generation process. A user query over IMDb populates the reference ABox (input); the Modifier then produces the modified ABoxes 1, 2, ..., n (output).]</p>
        <p>In the query, X is a variable specifying a word of our choice. Thus, all selected movies
contain the word X in their title. The corresponding individuals in the reference
ABox refer to similar objects, but each of them represents a distinct object
in the real world. As a consequence, each instance can be uniquely identified.
In order to obtain our reference ABox, we set X = Scarface. The reference ABox
obtained in that way contains 302 individuals, namely all the movie objects
matching the query and all the actors in the movie casts.</p>
        <p>Modified ABoxes generation. Once the reference ABox is created, we generate a set of modified ABoxes, each
consisting of a collection of instances obtained by modifying the corresponding
instances in the reference ABox. The transformations introduced in the benchmark ABoxes
fall into three main categories. In particular, each modification
category simulates a specific problem that can arise when comparing
ontology instances, namely the issues discussed in Section 2. Modifications belonging
to different categories are also combined within the same ABox.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Generating Instance Modifications</title>
      <p>In this section, we describe the Modifier module of our benchmark generation
procedure, that is, the way the modified ABoxes of the benchmarks are generated.
Given the reference ABox as input, and a user specification of all the
transformations to apply to it, the Modifier module automatically produces the
corresponding modified ABoxes. In the following, all the modifications that can be
applied to the reference ABox are presented.</p>
      <sec id="sec-4-1">
        <title>Data Value Differences</title>
        <p>The goal of this first category of modifications is to simulate the differences that
can be found, at the property value level, between instances referring to the same object.
These include typographical errors, the use of different standard formats to
represent the same value, or a combination of both within the same value.</p>
        <p>Typographical errors. Real data are often dirty, mainly because of
typographical errors made by humans while storing data.</p>
        <p>In order to simulate typographical errors, we use a function that takes as input
a datatype property value and produces as output a modified value. This kind
of transformation can be applied to any datatype property value (e.g., string
value, integer value, date value). The modifications to apply to the input value
are randomly chosen among the following:
- Insert character. A random character (or a random digit, if the property
has a numerical value) is inserted in the input value at a random position.
- Modify character. A random character (or a random digit, if the property
has a numerical value) is modified in the input value.
- Delete character. A random character (or a random digit, if the property
has a numerical value) is deleted from the input value.
- Exchange characters' position. The positions of two adjacent characters (or
two adjacent digits, if the property has a numerical value) are exchanged
in the input value.</p>
        <p>For example, the movie title "Scarface" can be transformed into the modified value
"Scrface", obtained by deleting a random character from the original string.
In addition, it is possible to specify the level of severity (i.e., low, medium or
high) with which such transformations are applied. In any case, the number of transformations
introduced in the input value is proportional to the value's length. If more than one
transformation is to be applied, the value can be
modified by combining different transformations.</p>
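        <p>The four typographical modifications and the severity levels described above can be sketched as follows. This is only an illustrative sketch: the function names apply_typo and apply_typos, the SEVERITY_RATE values and the choice of lowercase letters as replacement alphabet are assumptions, not part of the benchmark implementation.</p>
        <preformat>
```python
import random
import string

# Hypothetical fraction of characters to modify for each severity level.
SEVERITY_RATE = {"low": 0.05, "medium": 0.10, "high": 0.20}


def apply_typo(value: str, rng: random.Random) -> str:
    """Apply one randomly chosen typographical modification to value."""
    if not value:
        return value
    # Numerical values receive digits, all other values receive letters.
    alphabet = string.digits if value.isdigit() else string.ascii_lowercase
    operations = ["insert", "modify", "delete"]
    if len(value) >= 2:
        operations.append("exchange")  # needs two adjacent characters
    op = rng.choice(operations)
    if op == "insert":
        pos = rng.randrange(len(value) + 1)
        return value[:pos] + rng.choice(alphabet) + value[pos:]
    if op == "exchange":
        pos = rng.randrange(len(value) - 1)
        # Swap the characters at positions pos and pos + 1.
        return value[:pos] + value[pos + 1] + value[pos] + value[pos + 2:]
    pos = rng.randrange(len(value))
    if op == "modify":
        choices = [c for c in alphabet if c != value[pos].lower()]
        return value[:pos] + rng.choice(choices) + value[pos + 1:]
    # Remaining case: delete the character at position pos.
    return value[:pos] + value[pos + 1:]


def apply_typos(value: str, severity: str, rng: random.Random) -> str:
    """Apply a number of modifications proportional to the value's length."""
    count = max(1, round(len(value) * SEVERITY_RATE[severity]))
    for _ in range(count):
        value = apply_typo(value, rng)
    return value
```
        </preformat>
        <p>For instance, apply_typos("Scarface", "high", random.Random(7)) applies two modifications under these assumed rates. Exchanging two identical adjacent characters leaves the value unchanged, which a real generator may want to detect and retry.</p>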
        <p>Typographical modifications can be applied to "identifying properties",
"non-identifying properties" or both. That classification is based on the analysis of
the percentage of null and distinct values specified for the selected property. In
particular, properties with a high percentage of distinct values and a low
percentage of null values are classified as the most identifying.</p>
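        <p>As a rough illustration of this classification, the following sketch computes the share of null and distinct values of a property and applies two thresholds. The names property_profile and is_identifying and the threshold values are hypothetical, since the text only speaks of "high" and "low" percentages.</p>
        <preformat>
```python
def property_profile(values: list) -> tuple:
    """Return (null_ratio, distinct_ratio) for the values of one property.

    A missing value is represented by None.
    """
    total = len(values)
    if total == 0:
        return 1.0, 0.0
    non_null = [v for v in values if v is not None]
    null_ratio = (total - len(non_null)) / total
    distinct_ratio = len(set(non_null)) / total
    return null_ratio, distinct_ratio


def is_identifying(values: list, max_null: float = 0.1, min_distinct: float = 0.9) -> bool:
    """Classify a property as identifying (thresholds are assumptions)."""
    null_ratio, distinct_ratio = property_profile(values)
    return distinct_ratio >= min_distinct and max_null >= null_ratio
```
        </preformat>
        <p>Under these assumed thresholds, a property whose values are all present and all different is classified as identifying, while a property with few distinct values, such as a gender, is not.</p>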
        <p>Of course, the total amount of modifications applied to each modified ABox has
to change the reference ABox in such a way that it is still reasonable to consider
the two ABoxes semantically equivalent. In other words, a modified ABox is
included in the benchmark only if a human can understand that its instances
refer to the same real-world objects as those belonging to the
reference ABox. Thus, in order to evaluate the distance between the reference ABox
and each modified ABox, we introduce a measure that takes into account the
number of modifications applied to the ABox, the kind of properties
that have been modified (i.e., "identifying properties" or "non-identifying properties"),
and the level of severity of the modifications (i.e., low, medium or
high). However, this measure does not affect the instance matching results in
a deterministic way, since they depend on the weight that the tested algorithm
gives to each kind of modification. In any case, we assume that a modified ABox can
be considered semantically equivalent to the reference ABox only if it changes
no more than 20% of each instance description.</p>
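        <p>The 20% bound can be illustrated with a deliberately simplified check. The sketch below assumes that an instance description is a dictionary from property names to values and that the amount of change is the fraction of property values that differ; it does not model the weighting by property kind and severity mentioned above, whose exact form is not given here. The names changed_fraction and is_acceptable are hypothetical.</p>
        <preformat>
```python
def changed_fraction(reference: dict, modified: dict) -> float:
    """Fraction of the reference property values that differ in the modified instance."""
    if not reference:
        return 0.0
    changed = sum(1 for prop, value in reference.items() if modified.get(prop) != value)
    return changed / len(reference)


def is_acceptable(reference_abox: dict, modified_abox: dict, limit: float = 0.20) -> bool:
    """True if no instance description changes by more than the limit."""
    return all(
        limit >= changed_fraction(instance, modified_abox.get(name, {}))
        for name, instance in reference_abox.items()
    )
```
        </preformat>
        <p>With five properties per instance, for example, this simplified check accepts a modified ABox that alters one value of an instance and rejects one that alters two.</p>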
        <p>Use of different standard formats. The same data can be represented
in different ways within different sources.</p>
        <p>In order to simulate the use of different standards within different sources, we
use a function that takes as input a property value which allows standard
modifications (e.g., a person name) and produces as output a modified value, using a
different standard format. For example, the director name "De Palma, Brian"
can be transformed into the modified value "Brian De Palma", which is another
standard format for specifying a person name.</p>
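        <p>A minimal sketch of such a format change for person names is given below. It only covers the "Surname, Given" to "Given Surname" case used in the example; the function name is hypothetical.</p>
        <preformat>
```python
def surname_first_to_given_first(name: str) -> str:
    """Turn "Surname, Given" into "Given Surname".

    Values that do not follow the "Surname, Given" format are returned unchanged.
    """
    surname, sep, given = name.partition(",")
    if not sep or not given.strip():
        return name
    return f"{given.strip()} {surname.strip()}"
```
        </preformat>
        <p>Applied to "De Palma, Brian", the function returns "Brian De Palma"; a value that is already in the second format is left as it is.</p>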
      </sec>
      <sec id="sec-4-2">
        <title>Structural Heterogeneity</title>
        <p>Another kind of situation that is simulated in our instance matching benchmark
is the comparison between instances with different schemas. In fact, even
assuming that concept mappings are available, the same individual feature (i.e.,
each instance property) can be modeled in different ways. Moreover, different
descriptions of the same real-world object can specify different subsets,
possibly empty, of all the possible values for that property. Combinations of different
transformations belonging to this class of modifications are also applied in the
benchmark.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Use of different levels of depth for properties representation</title>
        <p>A first example of this class of heterogeneity is shown in Figure 3. The two instances
movie_1 and movie_2 both refer to the same film, but the movie title
property is modeled in two different ways. In fact, the title of movie_1 is
specified directly through a datatype property value, while the title of movie_2 is
specified through a reference to another individual which has a property with
the same title value (i.e., "Scarface"). In particular, in the first representation,
the property HasTitle is a datatype property, while in the second one it is an
object property and its value is the reference to the title_1 instance.</p>
        <p>[Figure 3: Two representations of the HasTitle property with value "Scarface", at different levels of depth.]</p>
        <p>In order to simulate the comparison between instances with different schemas,
we use a function that takes as input a datatype property and produces as
output an object property with the same name. Moreover, the function adds a
new attribute to the generated object property, whose value is the same as that of the
original datatype property.</p>
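        <p>The transformation can be sketched on a toy ABox represented as nested dictionaries. The representation, the function name deepen_property and the attribute name "value" given to the new individual are assumptions made for illustration only.</p>
        <preformat>
```python
def deepen_property(abox: dict, instance: str, prop: str, new_individual: str) -> dict:
    """Replace a datatype property value with a reference to a new individual.

    The input ABox is left untouched; a modified copy is returned.
    """
    result = {name: dict(props) for name, props in abox.items()}
    literal = result[instance][prop]
    # The object property keeps the original name and points to the new individual.
    result[instance][prop] = {"ref": new_individual}
    # The new individual carries the original literal (hypothetical attribute name).
    result[new_individual] = {"value": literal}
    return result
```
        </preformat>
        <p>Starting from an instance movie_1 whose HasTitle value is "Scarface", the call deepen_property(abox, "movie_1", "HasTitle", "title_1") yields an instance whose HasTitle refers to title_1, which in turn holds "Scarface".</p>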
      </sec>
      <sec id="sec-4-4">
        <title>Use of different aggregation criteria for properties representation</title>
        <p>In an analogous way, the name of a person can be stored entirely within a single
property, or it can be split into different properties such as, for example, Name
and Surname. Figure 4 shows two different ways of modeling the name "Pacino,
Al". In the first representation the whole value is stored within the property
Name, while in the second one the string is split into the two values "Pacino"
and "Al", assigned to the properties Surname and Name respectively.</p>
        <p>[Figure 4: Two representations of the same actor. actor_1: Name "Pacino, Al", Gender "M", Nickname "Sonny", DateOfBirth "1940-04-25". actor_2: Surname "Pacino", Name "Al", Gender "M", Nickname "Sonny", DateOfBirth "1940-04-25".]</p>
        <p>In order to simulate the comparison between properties modeled in different
ways, we use a function that takes as input a datatype property value that can
be split and produces as output two new datatype properties, each specifying a
different part of the original value.</p>
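        <p>A small sketch of this splitting step is shown below, with an instance represented as a dictionary. The function name split_name is hypothetical, and storing the family name under Surname and the given name under Name is an assumption of the sketch.</p>
        <preformat>
```python
def split_name(instance: dict, prop: str = "Name") -> dict:
    """Split a "Surname, Given" value into two datatype properties."""
    result = dict(instance)
    surname, sep, given = result[prop].partition(",")
    if not sep:
        return result  # the value cannot be split
    result["Surname"] = surname.strip()
    result[prop] = given.strip()
    return result
```
        </preformat>
        <p>For an instance with Name "Pacino, Al", the result has Name "Al" and Surname "Pacino", while all other properties are copied unchanged.</p>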
        <p>Missing values specification. A further example of structural heterogeneity
is shown in Figure 5. The two instances movie_1 and movie_2 both refer
to the same film, but the two descriptions specify different subsets of
values for the property Genre.</p>
        <p>In order to simulate the comparison between different sets of values for
the same property, we use a function that takes as input the set of values specified
for a selected property and produces as output a subset of it, possibly empty.
This kind of transformation can be applied to any property. Moreover, if a
property allows multiple values, it is possible to specify whether to delete all the values
of the selected property or a random number of them.</p>
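        <p>The value-dropping function can be sketched as follows. The name drop_values and its drop_all flag are hypothetical; when drop_all is not set, the sketch assumes that at least one value is deleted.</p>
        <preformat>
```python
import random


def drop_values(values: list, rng: random.Random, drop_all: bool = False) -> list:
    """Return a subset, possibly empty, of the values of one property."""
    if drop_all or not values:
        return []
    # Keep a random number of values, so that at least one value is deleted.
    keep = rng.randrange(len(values))
    kept_positions = sorted(rng.sample(range(len(values)), keep))
    # The original order of the surviving values is preserved.
    return [values[i] for i in kept_positions]
```
        </preformat>
        <p>Calling drop_values(genres, rng) on a multi-valued Genre property returns some of its values, while drop_values(genres, rng, drop_all=True) removes them all.</p>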
        <p>[Figure 5: Two descriptions of the same movie with the properties HasTitle ("Scarface"), Genre and Year ("1983"), specifying different subsets of values for Genre.]</p>
        <p>Finally, the instance matching process should take into account the need for some
kind of reasoning, in order to correctly find the instances to be compared. In
fact, individuals referring to the same entity can be instantiated in
different ways within different ontologies. In the following, we describe five kinds
of situations that we reproduce in our benchmark, and that can also be combined.
Each requires some kind of reasoning. Examples are shown in
Figure 6.</p>
        <p>[Figure 6 (TBox axioms): Movie ⊓ Product ⊑ ⊥; Movie ≡ ∀p.G; SubM ≡ ∀p.SubG; SubG ⊑ G.]</p>
      </sec>
      <sec id="sec-4-5">
        <title>Instantiation on different subclasses of the same superclass</title>
        <p>This transformation is obtained by instantiating identical individuals into different subclasses
of the same class. For example, in our benchmark, all the movie objects are
instances of class Movie in the reference ABox. In one of the modified
ABoxes, instead, we change the type of those individuals, making them instances of class
Film. Classes Movie and Film are both subclasses of Item. In Figure 6, movie_1
is an instance of Movie in the reference ABox, while it is an instance of Film in the
modified ABox. Instance matching algorithms are thus required to recognize
that those two instances refer to the same object, even if they belong to
different concepts.</p>
        <p>Instantiation on disjoint classes. This transformation is obtained by
instantiating identical individuals into disjoint classes. For example, in one of the modified
ABoxes, we change the type of all the movie objects, making them instances of
class Product. Classes Movie and Product are defined as disjoint classes in the
reference TBox. In Figure 6, movie_2 is an instance of Movie in the reference ABox,
while it is an instance of Product in the modified ABox. In this case, the
tested algorithms are expected to recognize that instances belonging to
disjoint classes cannot refer to the same real-world object, even if they seem
identical.</p>
      </sec>
      <sec id="sec-4-6">
        <title>Instantiation on different classes of a class hierarchy explicitly declared</title>
        <p>This transformation is obtained by instantiating identical individuals into
different classes on which an explicit class hierarchy is defined. For example, an
individual representing a movie can be classified as an instance of the general
concept Movie, as it is in the reference ABox, or it can be classified as an
instance of a more specific subclass of it, such as Action, Biography, Comedy or
Drama, depending on the value that the movie instance specifies for the property
Genre. In Figure 6, movie_3 is an instance of Movie in the reference ABox, while
it is an instance of its subclass Action in the modified ABox, since it is an action
movie. Instance matching algorithms are thus required to recognize that those
two instances refer to the same object, even if they belong to different
concepts within the class hierarchy. This explicit class hierarchy declaration can
be recognized using an RDFS reasoner.</p>
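        <p>The kind of reasoning needed for explicitly declared hierarchies and for disjoint classes can be illustrated without a reasoner library. The sketch below computes the closure of explicit subclass axioms and checks declared disjointness; it is not an RDFS or DL reasoner, and the names superclasses, is_subsumed and may_corefer are hypothetical. The example hierarchy uses the classes named in the text.</p>
        <preformat>
```python
def superclasses(cls: str, subclass_of: dict) -> set:
    """All classes reachable from cls through explicit subclass axioms, cls included."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def is_subsumed(sub: str, sup: str, subclass_of: dict) -> bool:
    """True if sub is declared, directly or transitively, a subclass of sup."""
    return sup in superclasses(sub, subclass_of)


def may_corefer(class_a: str, class_b: str, subclass_of: dict, disjoint: set) -> bool:
    """False if the two classes, or any of their superclasses, are declared disjoint."""
    for a in superclasses(class_a, subclass_of):
        for b in superclasses(class_b, subclass_of):
            if frozenset((a, b)) in disjoint:
                return False
    return True


# Toy TBox built from the classes mentioned in the text.
SUBCLASS_OF = {"Action": ["Movie"], "Movie": ["Item"], "Film": ["Item"]}
DISJOINT = {frozenset(("Movie", "Product"))}
```
        </preformat>
        <p>With this toy TBox, an instance of Action and an instance of Movie may refer to the same object, whereas an instance of Product cannot match an instance of Movie or of any of its subclasses.</p>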
      </sec>
      <sec id="sec-4-7">
        <title>Instantiation on different classes of a class hierarchy implicitly declared</title>
        <p>A further modification that we apply in the benchmark is the
instantiation of identical individuals into different classes on which an implicit class
hierarchy is defined. Such an implicit class hierarchy declaration can be obtained
through the use of restrictions. For example, the restrictions specified on classes
Movie and SubM in the reference TBox implicitly declare that SubM is a
subclass of Movie. In Figure 6, movie_4 is an instance of Movie in the reference ABox,
while it is an instance of SubM in the modified ABox. Instance matching algorithms
are thus required to recognize that those two instances refer to the same
object, even if they belong to different concepts which are not explicitly related.
This implicit class hierarchy declaration can be recognized using a DL reasoner.</p>
        <p>Implicit values specification. Another use of restrictions that requires a
reasoning process is the comparison between an explicitly specified value and an
implicitly specified one, that is, one specified using a hasValue restriction. This kind of situation
is simulated in our benchmark by adding a new type for each instance of the
modified ABox. This type is a class that (implicitly) specifies property values
through a hasValue restriction. In Figure 6, in the reference ABox, movie_5 is
an instance of Movie and its value for the property HasTitle is "Scarface"; in the
modified ABox, movie_5 is likewise an instance of Movie, but it is also an instance of
the restriction class that implicitly specifies the value "Scarface" for its HasTitle
property. Instance matching algorithms are thus required to recognize that those
two instances refer to the same object, even if some property values of
the modified instance are implicitly defined.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Benchmark at Work</title>
      <p>In this section, we describe how the generated benchmark is used to evaluate
instance matching algorithms. Each execution of the evaluation process takes as
input a pair of ABoxes, namely the reference ABox and one of the modified
ABoxes, and produces the set of instance mappings found by the tested
algorithm. The output alignment is then compared with the expected one,
which is provided together with each modified ABox. That reference alignment is
automatically generated by specifying a mapping for each pair of corresponding
instances, namely the instance belonging to the reference ABox and the one obtained
by applying to it one or more of the modifications discussed in Section 4.
Instance matching algorithms are evaluated according to the following
parameters.</p>
      <p>- Precision: the number of correct retrieved mappings / the number of
retrieved mappings.
- Recall: the number of correct retrieved mappings / the number of expected
mappings.
- F-measure: 2 · (precision · recall) / (precision + recall).
- Fall-out: the number of incorrect retrieved mappings / the number of
non-expected mappings.
- Execution time: the time taken by the tested algorithm to compare the two input
ABoxes. This parameter measures how well the tested algorithm scales.</p>
      <p>
As an example, the results obtained by two instance matching algorithms are
reported. Figure 7 shows the precision and recall of the two
algorithms over the generated benchmark, distinguishing the results
obtained in the three main classes of problems simulated in our benchmark
(i.e., data value differences, structural heterogeneity, logical heterogeneity), as well as
the results obtained executing each algorithm without any reasoner and
with a DL reasoner (i.e., Pellet). The results obtained comparing the reference
ABox with modified ABoxes simulating data value differences are higher than
those obtained in the other categories, since string matching techniques are
well established. The results obtained comparing the reference ABox with modified
ABoxes simulating structural heterogeneity are not very high, because neither the
first nor the second algorithm can manage the use of different aggregation criteria
for properties representation. The results obtained comparing the reference ABox
with modified ABoxes simulating logical heterogeneity are greatly affected by the
use of a reasoner.</p>
      <p>[Figure 7: Precision and recall of the two algorithms (vertical axis from 0.00 to 1.00) for the three classes of problems: data value differences, structural heterogeneity and logical heterogeneity.]</p>
      <p>Finally, in Figure 8, the overall results obtained executing the two algorithms
(with reasoner) on our benchmark are reported. The test was executed
on a Pentium 4 (2.00 GHz) with 512 MB of RAM.</p>
      <p>[Figure 8: Table with columns IM Algorithm, Precision, Recall, F-measure, Fall-out and Execution time, and rows algorithm 1 and algorithm 2.]</p>
      <p>
        For each pair of compared
instances, the first algorithm [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] analyzes all their property values, while the
second algorithm [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] checks only the values specified for the "most identifying"
properties. That is why the execution time of the first algorithm is greater than
that of the second one. Moreover, the recall of the second algorithm
is higher than that of the first one, because all the modifications
applied to "non-identifying" properties are ignored. A more detailed description
of the two algorithms is available in [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ].
      </p>
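      <p>The evaluation parameters defined in this section (precision, recall, F-measure and fall-out) can be computed from sets of mappings as in the following sketch. Representing a mapping as a pair of instance names and passing the set of all candidate pairs explicitly are assumptions of the sketch, as is the function name evaluate; execution time is not covered.</p>
      <preformat>
```python
def evaluate(found: set, expected: set, candidates: set) -> dict:
    """Compare the mappings found by an algorithm with the expected alignment.

    found: mappings retrieved by the tested algorithm.
    expected: mappings of the reference alignment.
    candidates: all instance pairs that could be mapped.
    """
    correct = found.intersection(expected)
    incorrect = found.difference(expected)
    non_expected = candidates.difference(expected)
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(expected) if expected else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    fall_out = len(incorrect) / len(non_expected) if non_expected else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "fall_out": fall_out,
    }
```
      </preformat>
      <p>Returning 0.0 when a denominator is empty is a convention chosen for the sketch; an evaluation tool may prefer to report such cases as undefined.</p>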
    </sec>
    <sec id="sec-6">
      <title>Concluding Remarks</title>
      <p>In this paper, we provided a benchmark for instance matching, taking into
account the main requirements that instance matching algorithms should address.
A contribution of our work is not only the definition of a specific benchmark, but
also the definition of a semi-automatic procedure for the generation of several
different benchmarks.</p>
      <p>Future work includes the creation of further benchmarks dealing with data
belonging to different sources and different domains. In particular, we would like to
create a benchmark in which data belonging to different sources but referring to
the same real-world objects are compared. For example, it could include a mapping
between movie descriptions in IMDb and Amazon. In that case, the expected
alignments have to be produced manually, so the benchmark cannot reach a size that is
significant for a real benchmark. However, it would be interesting to compare
the results obtained by the same algorithms on that benchmark and on our
semi-automatically generated one, in order to evaluate the quality of our
benchmark generation procedure itself.</p>
      <p>Another possible development would be the definition of a set of rules that
automatically choose the modifications to apply, for each modified ABox, to the
reference ABox.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Euzenat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shvaiko</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          : Ontology Matching. Springer-Verlag (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Shvaiko</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Euzenat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>A survey of schema-based matching approaches</article-title>
          .
          <source>Journal on Data Semantics (JoDS)</source>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bouquet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoermer</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Niederee</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mana</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Entity name system: The backbone of an open and scalable web of data</article-title>
          .
          <source>In: Proceedings of the IEEE International Conference on Semantic Computing, ICSC</source>
          <year>2008</year>
          , IEEE (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Fellegi</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sunter</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A theory for record linkage</article-title>
          .
          <source>J. Am. Statistical Assoc</source>
          . (
          <year>1969</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Winkler</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>The state of record linkage and current research problems</article-title>
          .
          <source>Technical report</source>
          , Statistical Research Division, U.S. Bureau of the Census, Washington, DC (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Gu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baxter</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vickers</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rainsford</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Record linkage: Current practice and future directions</article-title>
          .
          <source>Technical report, CSIRO Mathematical and Information Sciences, Canberra</source>
          , Australia (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Shvaiko</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Euzenat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Noy</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stuckenschmidt</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benjamins</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uschold</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Proceedings of the 1st International Workshop on Ontology Matching (OM-2006) collocated with the 5th International Semantic Web Conference (ISWC-2006), Athens, Georgia, USA, November 5, 2006</article-title>
          . In: OM. Volume
          <volume>225</volume>
          , CEUR-WS.org (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hollink</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Assem</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isaac</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schreiber</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Two variations on ontology alignment evaluation: Methodological issues</article-title>
          .
          <source>ESWC</source>
          <year>2008</year>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Isaac</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
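          <string-name>
            <surname>Matthezing</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van der Meij</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,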
          <string-name>
            <surname>Schlobach</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zinn</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Putting ontology alignment in context: Usage scenarios, deployment and evaluation in a library case</article-title>
          .
          <source>ESWC</source>
          <year>2008</year>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Castano</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferrara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montanelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Matching ontologies in open networked systems: Techniques and applications</article-title>
          .
          <source>Journal on Data Semantics (JoDS)</source>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Bruno</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castano</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferrara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lorusso</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Messa</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montanelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Ontology coordination tools: Version 2</article-title>
          .
          <source>Technical Report D4.7</source>
          , BOEMIE Project, FP6-027538, 6th EU Framework Programme (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>