<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A modest proposal for data interlinking evaluation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jérôme Euzenat</string-name>
          <email>Jerome.Euzenat@inria.fr</email>
          <aff>INRIA &amp; LIG</aff>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Data interlinking, i.e., finding links between different linked data sets, is similar to ontology matching in various respects and would benefit from widely accepted evaluations such as the Ontology Alignment Evaluation Initiative. Instance matching evaluation has been performed as part of OAEI since 2009. Yet, there have so far been few participants in IM@OAEI, and there is still no widely acknowledged benchmark for data interlinking activities. In order to secure more participation, we analyse the specificities of data interlinking and propose diverse modalities for evaluating it.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <p>The data interlinking problem can be described in the same way as the ontology
matching problem was: given two linked data sets, usually tied to their vocabularies (or
ontologies), generate a link set, i.e., a set of sameAs links between entities of the two
data sets.</p>
      <p>This work has been partially supported by the French National Research Agency (ANR)
under grant ANR-10-CORD-009 (Datalift). A longer version of this proposal is available at
ftp://ftp.inrialpes.fr/pub/exmo/publications/euzenat2012b.pdf</p>
      <p>Blocking and matching. Data interlinking is usually carried out in two steps:
blocking selects the pairs of entities worth comparing, so as to avoid comparing every
entity of one data set with every entity of the other; matching compares entities in the
same block in order to decide if they are the same.</p>
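      <p>As a toy illustration of the task (hypothetical data and a deliberately naive exact-label matcher, not an actual interlinking system), a link set of sameAs statements can be generated as follows:</p>

```python
# Hypothetical toy data sets: entity URI -> label.
dataset_a = {
    "http://a.example/paris": "Paris",
    "http://a.example/lyon": "Lyon",
}
dataset_b = {
    "http://b.example/city-1": "Paris",
    "http://b.example/city-2": "Grenoble",
}

def interlink(src, tgt):
    """Return a link set: pairs of entities judged to denote the same thing
    (here, by naive exact label comparison)."""
    return {(s, t) for s, s_label in src.items()
                   for t, t_label in tgt.items() if s_label == t_label}

links = interlink(dataset_a, dataset_b)
for s, t in links:
    # Emit each link as an owl:sameAs triple in N-Triples syntax.
    print(f"<{s}> <http://www.w3.org/2002/07/owl#sameAs> <{t}> .")
```

      <p>A real system would of course use richer similarity measures than label equality; the point is only the shape of the input (two data sets) and of the output (a set of sameAs links).</p>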
      <p>Since these two steps are clearly different and well identified, and since different
matching methods can be combined with different blocking techniques, it is useful to evaluate
the capacities of these two techniques independently.</p>
      <p>In particular, given a reference link set, it is possible to determine how many pairs
in the link set a particular blocking technique misses (blocking recall) and how many
unnecessary pairs a blocking technique requires comparing (blocking precision).
Similarly, a matching technique can be evaluated against a given block structure if
necessary: the evaluation is achieved by comparing its output with the part of the
reference alignment that can be found from the given blocks.</p>
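      <p>A minimal sketch of these two measures, assuming blocks are given as pairs of entity sets (one set from each data set) and the reference link set as a set of entity pairs:</p>

```python
from itertools import product

def candidate_pairs(blocks):
    """All entity pairs that blocking asks the matcher to compare:
    the cross-product of the two sides of each block."""
    pairs = set()
    for src_entities, tgt_entities in blocks:
        pairs |= set(product(src_entities, tgt_entities))
    return pairs

def blocking_recall(blocks, reference):
    """Fraction of reference links whose pair survives blocking."""
    return len(candidate_pairs(blocks) & reference) / len(reference)

def blocking_precision(blocks, reference):
    """Fraction of candidate pairs that are actual reference links."""
    cands = candidate_pairs(blocks)
    return len(cands & reference) / len(cands)

# Hypothetical example: the blocks below miss the ("a2", "b2") link.
blocks = [({"a1", "a2"}, {"b1"}), ({"a3"}, {"b2", "b3"})]
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3")}
recall = blocking_recall(blocks, reference)        # 2 of 3 links covered
precision = blocking_precision(blocks, reference)  # 2 of 4 candidates are links
```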
      <p>Scalability. Linked data has to deal with large amounts of data. Although this is also
the case in ontology matching, automatic data interlinking is really useful only with large
amounts of data. So, besides qualitative evaluation, it is critical to assess the behaviour
of interlinking tools as data sizes grow.</p>
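      <p>One way such a scalability test could be set up (a sketch under the assumption that data sets are lists of entities and that <code>interlink_tool</code> stands for the system under evaluation) is to retain decreasing fractions of a data set and time the tool on each:</p>

```python
import random
import time

def scaled_samples(entities, fractions=(1.0, 0.1, 0.01, 0.001), seed=42):
    """Yield (fraction, sample) pairs: scaled-down versions of a data set,
    retaining the whole set, one tenth, one hundredth, one thousandth."""
    rng = random.Random(seed)
    for f in fractions:
        k = max(1, int(len(entities) * f))
        yield f, rng.sample(list(entities), k)

def measure(interlink_tool, src, tgt):
    """Run the tool on each scaled sample and record wall-clock time."""
    timings = {}
    for f, sample in scaled_samples(src):
        start = time.perf_counter()
        interlink_tool(sample, tgt)
        timings[f] = time.perf_counter() - start
    return timings
```

      <p>Comparing the recorded timings across fractions then shows how each tool's runtime grows with data size.</p>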
      <p>Learning. Another effect of the size of linked data is that learning becomes more relevant,
mainly for two reasons:
– the size of the data makes it difficult to study in order to choose the best approach, and
much work remains to be done after extracting a training sample;
– the regularity of the data makes machine learning more effective.</p>
      <p>So, it is not surprising that learning methods are successful in data interlinking. This
provides an incentive to evaluate data interlinking techniques that use machine learning. For
that purpose, it is necessary to provide tests in which part of the reference link set is
given to the systems as a training set; the systems then have to deliver a complete link set.</p>
      <p>Instance and ontology matching. Data interlinking depends on the vocabularies
used in the data sets. Such a vocabulary may or may not be explicitly described by a schema
or an ontology. Some matchers may be specialised for some of these situations, and
it may be useful to recognise this by providing different evaluation tasks. In particular,
it seems useful to test the following configurations:
– without a published vocabulary (or ontology),
– with the same vocabulary or, alternatively, with aligned vocabularies,
– with different vocabularies and no alignment (as in the IIMB data set of IM@OAEI).</p>
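      <p>The training-set modality could be realised along the following lines (a sketch assuming the reference link set is a set of entity pairs; the split fraction and seed are arbitrary choices):</p>

```python
import random

def split_reference(reference, train_fraction=0.3, seed=0):
    """Partition a reference link set into a training part (published to
    participants) and a held-out part. Systems are still scored against
    the full reference, since they must deliver a complete link set."""
    links = sorted(reference)          # deterministic order before shuffling
    random.Random(seed).shuffle(links)
    cut = int(len(links) * train_fraction)
    return set(links[:cut]), set(links[cut:])
```

      <p>Only the training part is handed to learning-based systems; precision and recall of the returned link set are computed over the whole reference link set.</p>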
      <p>Conclusions. In conclusion, in addition to or in combination with the existing tasks
provided in IM@OAEI, such benchmarks should consider including:
– scalability tests (retaining one tenth, one hundredth, one thousandth of the data);
– training sets for tools using machine learning;
– a separate evaluation of blocking and matching, for users specifically interested in
only one of these aspects;
– tests with no ontology, with the same ontology, and with different ontologies.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>