<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Framework to Generate Reference Sets for Ontology Matching Algorithms</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gurpriy</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>sh Gh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>hin Lo</string-name>
          <email>sachin.lodhag@tcs.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>54B, Tata Research Development and Design Center, Tata Consultancy Services Ltd., Hadapsar Industrial Estate</institution>
          ,
          <addr-line>Hadapsar, Pune, Maharashtra -411013</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The performance of ontology matching algorithms is evaluated using F-measure, precision, and recall, which in turn rely on the availability of the ground truth. Typically, the ground truth generation process is manual, subjective, and time-consuming. Therefore, there is a need for a (semi-)automated approach that generates an unbiased reference set, an approximation of the ground truth. We propose a framework-based solution to generate a reference set and report encouraging results for the OAEI 2019 conference dataset.</p>
      </abstract>
      <kwd-group>
        <kwd>Reference Set</kwd>
        <kwd>Ontology Matching</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The performance of ontology matching algorithms is evaluated using the
F-measure, precision, and recall measures. These measures in turn rely on the
ground truth (gold standard) generated by a community of domain experts.
Typically, ground truth creation is a manual, subjective, and time-consuming
exercise. Due to its subjective nature, even the creation of a small ground
truth requires many domain experts to agree on a small set of pairs (e.g., some
ontology pairs of the conference dataset have fewer than 15 pairs in their ground
truth).</p>
      <p>
        Ground truth is a requirement in almost every scientific discipline to
validate ideas, theories, methods, etc. Therefore, many semi-automated approaches
have been proposed in various domains to generate it. Euzenat et al. propose a benchmark
generator framework to measure the meaningful properties of ontology matching
algorithms [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The objective of their framework is to generate a new benchmark
by supporting various alteration operations for any seed ontology. DBpediaNYD,
another such effort, has resulted in a machine-generated reference set (a silver
standard) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Jorn Hees has proposed a semi-automated approach to map the
Edinburgh Associative Thesaurus (EAT) to DBpedia entities [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Hees's approach
finds candidate mappings automatically through scores assigned to them
using the Wikipedia API. These mappings are then verified manually to generate
the final set of mappings. Harrow et al. have evaluated 11 matching systems on
biomedical ontologies to assess their relative performance with respect
to manually created mappings (gold standard), a set of mappings generated through
consensus (silver standard or a reference set), and unique mappings generated
by an individual participating system [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>Copyright © for this paper by its authors. Use permitted under Creative Commons
License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        Existing approaches do not address the bias introduced into
the reference set by the particular approach used to generate it. For
example, an algorithm that uses web search engines may get an unfair advantage in
an evaluation that uses DBpediaNYD as the reference set [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. An
unbiased reference set offers multiple advantages: i) it can be used to evaluate
a newly proposed ontology matching algorithm, ii) it can be used for training
purposes, and iii) it can serve as the starting point for generating the ground
truth.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Framework</title>
      <p>We propose a plug-and-play framework that exploits the properties of different
ontology matching algorithms to generate an unbiased reference set for an input
ontology matching algorithm and a pair of ontologies. Figure 1 outlines a
conceptual view of the proposed framework. The framework enables the right set
of ontology matching algorithms depending on the requirements specified by
the user (domain expert, ontology matching algorithm designer, etc.). For
example, if the user wants to generate a reference set for evaluating an
ontology matching algorithm that exploits the distance property between concepts
of the input ontologies, the framework enables only those ontology matching algorithms
that exploit different properties (e.g., concept equivalence through synonym
sets) to avoid bias in the reference set. Further, the user may choose to compute
confidence values for all or a subset of the concept pairs of the input ontologies.</p>
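      <p>As a minimal sketch of how algorithms might be enabled, the Python snippet below keeps a registry of plugged algorithms grouped by the property they exploit and enables only those outside the category being evaluated. The registry layout, the category keys, and the function name enabled_algorithms are illustrative assumptions rather than part of the framework description; the algorithm names are the ones used in the Experiments section.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List

# Hypothetical registry: category of exploited property -> plugged algorithms.
REGISTRY: Dict[str, List[str]] = {
    "deep_learning": ["word2vec", "fastText"],
    "wordnet": ["WuPalmer", "Lin"],
    "character": ["nGram", "MLCS"],
}


def enabled_algorithms(registry: Dict[str, List[str]], evaluated_category: str) -> List[str]:
    """Return the algorithms whose category differs from the evaluated one."""
    if evaluated_category not in registry:
        raise KeyError(f"unknown category: {evaluated_category}")
    return [
        name
        for category, names in registry.items()
        if category != evaluated_category
        for name in names
    ]


# Reference set intended for evaluating a character-based matcher:
print(enabled_algorithms(REGISTRY, "character"))
```
]]></preformat>
      <p>Excluding the whole category, rather than a single algorithm, keeps matchers that exploit the same property from voting on the reference set used to evaluate them.</p>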
      <p>To generate a reference set of the desired size and quality, it is necessary to filter
the alignment set with respect to threshold values on the size and the confidence
values computed by all framework algorithms. Algorithm 1 outlines the approach
to select the threshold on the confidence values for an ontology pair. The selection of the
threshold value is determined by two parameters: the cardinality of a set in
one-to-one matching form (generated after applying linear optimization, |algoSet|
in Algorithm 1) and a user-defined parameter in [0, 1] for the
minimum size of the reference set.</p>
      <p>Fig. 1. Conceptual View of Framework. [Figure panels: input ontologies O1 and O2 with an alignment algorithm; Alignment set (many-to-many), Align; linear optimization; Alignment set (one-to-one), Align<sub>lo</sub>; plugged ontology matching algorithms Algo1 to AlgoN; confidence values for enabled algorithms; Selection Function (SF); Reference Set with 1/0 entries per concept pair.]</p>
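      <p>The threshold selection of Algorithm 1 can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: an alignment is modelled as a dictionary from concept pairs to confidence values, filtering keeps pairs whose confidence is greater than or equal to the threshold, an empty set places no constraint, and the names select_threshold, filter_for_threshold, and min_fraction (standing in for the user-defined minimum-size parameter) are hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List, Optional, Tuple

# A one-to-one alignment: (concept in O1, concept in O2) -> confidence in [0, 1].
Alignment = Dict[Tuple[str, str], float]


def filter_for_threshold(threshold: float, algo_set: Alignment) -> Alignment:
    """Keep only the pairs whose confidence reaches the threshold (assumed >=)."""
    return {pair: cnf for pair, cnf in algo_set.items() if cnf >= threshold}


def select_threshold(algo_sets: List[Alignment], min_fraction: float) -> Optional[float]:
    """Return the last candidate threshold accepted for every algorithm.

    A candidate is accepted when, for every algorithm, the filtered set keeps
    at least `min_fraction` of its one-to-one set. Returns None if no
    candidate is accepted.
    """
    selected = None
    # Candidate thresholds 0.1, 0.2, ..., 1.0 (integer steps avoid float drift).
    for step in range(1, 11):
        threshold = step / 10
        accepted = True
        for algo_set in algo_sets:
            if not algo_set:
                continue  # assumption: an empty set places no constraint
            kept = filter_for_threshold(threshold, algo_set)
            if len(kept) / len(algo_set) < min_fraction:
                accepted = False
        if accepted:
            selected = threshold  # later (higher) accepted values overwrite earlier ones
    return selected
```
]]></preformat>
      <p>Because the candidates are visited in increasing order and each accepted value overwrites the previous one, the sketch returns the highest threshold at which every algorithm still retains the requested fraction of its pairs.</p>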
      <p>The Selection Function (SF) is one of the most important elements of the
framework. SF takes the n confidence values computed by the n chosen ontology matching
algorithms for a concept pair and produces a Boolean value. Formally,
<inline-formula><tex-math>SF : [0, 1]^{n} \rightarrow \{1, 0\}</tex-math></inline-formula>. Different implementations of SF are possible.
In its current avatar, the framework provides two implementations. The first
implementation uses the unanimity rule: all chosen algorithms must agree
on a concept pair for it to be included in the reference set. The second implementation
uses the majority rule: if at least 50% of the ontology matching algorithms
agree on a concept pair, it is included in the reference set.</p>
      <p>Algorithm 1 Algorithm to compute threshold value</p>
      <preformat>Require: Algoset, a superset containing the one-to-one matching sets of all framework
         algorithms for an ontology pair; the user-defined parameter for the
         minimum size of the reference set
for all threshold in [0.1, .., 1.0] do
    flag = true
    for all algoSet ∈ Algoset do
        filteredSet = filterForThreshold(threshold, algoSet)
        if (|filteredSet| / |algoSet|) &lt; user-defined parameter then
            flag = false
        end if
    end for
    if flag == true then
        setThresholdForOntoPair(threshold)
    end if
end for</preformat>
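      <p>The two Selection Function rules described in this section can be sketched in Python as below. The sketch assumes that an algorithm "agrees" on a concept pair when its confidence value is greater than or equal to the selected threshold; this reading, and the function names sf_unanimity and sf_majority, are assumptions made for illustration.</p>
      <preformat><![CDATA[
```python
from typing import Sequence


def count_votes(confidences: Sequence[float], threshold: float) -> int:
    """Number of algorithms whose confidence reaches the threshold."""
    return sum(1 for cnf in confidences if cnf >= threshold)


def sf_unanimity(confidences: Sequence[float], threshold: float) -> int:
    """Return 1 only if every chosen algorithm agrees on the concept pair."""
    n = len(confidences)
    return int(n > 0 and count_votes(confidences, threshold) == n)


def sf_majority(confidences: Sequence[float], threshold: float) -> int:
    """Return 1 if at least 50% of the chosen algorithms agree on the pair."""
    n = len(confidences)
    return int(n > 0 and 2 * count_votes(confidences, threshold) >= n)


# Confidence values of four algorithms for one concept pair, threshold 0.8:
pair_confidences = [0.9, 0.85, 0.6, 0.1]
print(sf_unanimity(pair_confidences, 0.8), sf_majority(pair_confidences, 0.8))
```
]]></preformat>
      <p>Both functions map n confidence values to 1 or 0, so either can be plugged in as SF without changing the rest of the pipeline.</p>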
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <p>We have conducted experiments on the OAEI 2019 conference dataset using
Python v3.7.3. We have evaluated our framework using six different ontology
matching algorithms, two each for the categories of deep learning (word2vec,
https://spacy.io/api/doc/, and fastText, https://fasttext.cc/docs/en/pretrained-vectors.html),
WordNet (WuPalmer and Lin, https://www.nltk.org/howto/wordnet.html), and
character (nGram and MLCS, https://pypi.org/project/strsim/).</p>
      <p>
        For the computation of the equality relation, classes are compared with classes
and properties with properties. Moreover, we first convert the output
of each ontology matching algorithm, which is in many-to-many form (Align), into
one-to-one matching form using linear optimization [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. It produces the
maximal matching that maximizes the overall confidence value of the one-to-one matching
form alignment (Align<sub>lo</sub>). We have chosen two ontology matching algorithms for
each category as they computed different confidence values for the same
concept pair (in some cases, the difference is as high as 0.2). With the user-defined
minimum-size parameter set to 0.1, we get
two different threshold values, 0.7 and 0.8, for different ontology pairs, as shown
in Table 1. We have excluded both ontology matching algorithms of the
category for which we want to generate the reference set.
      </p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Ontology pair</th><th>Threshold</th></tr>
          </thead>
          <tbody>
            <tr><td>cmt-Conference</td><td>0.8</td></tr>
            <tr><td>cmt-confOf</td><td>0.8</td></tr>
            <tr><td>cmt-edas</td><td>0.8</td></tr>
            <tr><td>cmt-ekaw</td><td>0.8</td></tr>
            <tr><td>cmt-iasted</td><td>0.8</td></tr>
            <tr><td>cmt-sigkdd</td><td>0.8</td></tr>
            <tr><td>Conference-confOf</td><td>0.8</td></tr>
            <tr><td>Conference-edas</td><td>0.8</td></tr>
            <tr><td>Conference-ekaw</td><td>0.8</td></tr>
            <tr><td>Conference-iasted</td><td>0.7</td></tr>
            <tr><td>Conference-sigkdd</td><td>0.8</td></tr>
            <tr><td>confOf-edas</td><td>0.8</td></tr>
            <tr><td>confOf-ekaw</td><td>0.8</td></tr>
            <tr><td>confOf-iasted</td><td>0.7</td></tr>
            <tr><td>confOf-sigkdd</td><td>0.7</td></tr>
            <tr><td>edas-ekaw</td><td>0.8</td></tr>
            <tr><td>edas-iasted</td><td>0.7</td></tr>
            <tr><td>edas-sigkdd</td><td>0.8</td></tr>
            <tr><td>ekaw-iasted</td><td>0.7</td></tr>
            <tr><td>ekaw-sigkdd</td><td>0.8</td></tr>
            <tr><td>iasted-sigkdd</td><td>0.8</td></tr>
          </tbody>
        </table>
      </table-wrap>
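      <p>To illustrate the objective of the one-to-one conversion, the Python sketch below selects, from a many-to-many alignment, the one-to-one subset with the highest total confidence. It deliberately swaps linear optimization for an exhaustive search over assignments, which is only practical for a handful of concepts; the dictionary representation and the function name to_one_to_one are assumptions made for illustration.</p>
      <preformat><![CDATA[
```python
from itertools import permutations
from typing import Dict, Tuple

# A many-to-many alignment: (concept in O1, concept in O2) -> confidence.
Alignment = Dict[Tuple[str, str], float]


def to_one_to_one(align: Alignment) -> Alignment:
    """Exhaustively pick the one-to-one subset of `align` with maximal total confidence."""
    left = sorted({c1 for c1, _ in align})
    right = sorted({c2 for _, c2 in align})
    # Pad with None so that any concept of O1 may remain unmatched.
    slots = right + [None] * len(left)
    best_total, best = -1.0, {}
    for chosen in permutations(slots, len(left)):
        # Keep only the assigned pairs that actually occur in the input alignment.
        candidate = {
            (c1, c2): align[(c1, c2)]
            for c1, c2 in zip(left, chosen)
            if c2 is not None and (c1, c2) in align
        }
        total = sum(candidate.values())
        if total > best_total:
            best_total, best = total, candidate
    return best


align = {("A", "X"): 0.9, ("A", "Y"): 0.8, ("B", "X"): 0.85, ("B", "Y"): 0.3}
print(to_one_to_one(align))
```
]]></preformat>
      <p>In this example the greedy choice (A, X) would force the weak pair (B, Y), whereas the optimum pairs A with Y and B with X; this is the behaviour that maximizing the overall confidence value is meant to capture.</p>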
      <p>Table 1 shows the F-measure values for the two different implementations of SF
discussed above. From Table 1, it is clear that our framework generates
a good-quality reference set (the maximum F-measure being around 88%). From the
F-measure values, we can conclude that not only does the SF selection strategy influence
the quality of the reference set, but the enabled algorithms (and therefore their
properties) also play an important role. This behavior is consistent and can be
observed for multiple ontology pairs of the conference dataset. The obtained results
point to an important direction for generating an unbiased reference set: the right
mix of ontology matching algorithms exploiting different properties, combined with the right
selection strategy.</p>
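      <p>For reference, the F-measure of a generated reference set against the ground truth can be computed as in the Python sketch below. It assumes the standard definitions of precision, recall, and their harmonic mean, and that both inputs are sets of concept pairs; the function name and the example pairs are hypothetical and are not taken from the conference dataset.</p>
      <preformat><![CDATA[
```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def precision_recall_f_measure(reference_set: Set[Pair], ground_truth: Set[Pair]):
    """Return (precision, recall, F-measure) of reference_set against ground_truth."""
    true_positives = len(reference_set & ground_truth)
    precision = true_positives / len(reference_set) if reference_set else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical pairs, for illustration only.
generated = {("a", "x"), ("b", "y"), ("c", "z")}
truth = {("a", "x"), ("b", "y"), ("d", "w"), ("e", "v")}
print(precision_recall_f_measure(generated, truth))
```
]]></preformat>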
      <p>Discussion and Future work: In its current avatar, the proposed
framework does not model Inter-Algorithm disagreement between ontology matching
algorithms exploiting similar or different properties. Modeling
Inter-Algorithm disagreement may further improve the quality of the generated
reference set and reduce the bias in it. The framework also does not account for the
impact on the reference set of the approach that generates the one-to-one matching
form. Both research questions require further investigation.</p>
      <p>
        The notion of bias accounted for by the proposed framework is based on the
property exploited by a given ontology matching algorithm. Therefore, that property
applies to all mappings of a reference set. The evaluation exercise of
Harrow et al. considers bias based on the similarity between two participating
ontology matching systems, and it is mapping specific [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. If two variants of the
same participating system vote for a mapping, it is counted only once.
      </p>
      <p>To generate output that can be used in real-world applications, domain
experts need to further validate the generated reference set. Our framework will
reduce the effort required of domain experts in generating a silver standard or
gold standard. More experiments are needed to further validate the framework
with respect to i) the diversity of ontology matching algorithms (e.g., hybrid
ontology matching approaches combining and exploiting different properties)
and ii) real-world ontologies.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Euzenat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosoiu</surname>
            ,
            <given-names>M.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trojahn</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Ontology matching benchmarks: generation, stability, and discriminability</article-title>
          .
          <source>Journal of Web Semantics</source>
          <volume>21</volume>
          ,
          <fpage>30</fpage>
          –
          <lpage>48</lpage>
          (
          <year>2013</year>
          ), https://doi.org/10.1016/j.websem.2013.05.002
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Harrow</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jimenez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Splendiani</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romacker</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Woollard</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alam-Faruque</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malone</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Waaler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Matching disease and phenotype ontologies in the ontology alignment evaluation initiative</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          <volume>8</volume>
          (
          <issue>1</issue>
          ),
          <fpage>55</fpage>
          (
          <year>2017</year>
          ), https://doi.org/10.1186/s13326-017-0162-9
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Hees</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bauer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Folz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Borth</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dengel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Edinburgh Associative Thesaurus as RDF and DBpedia mapping</article-title>
          .
          <source>In: European Semantic Web Conference</source>
          . pp.
          <fpage>17</fpage>
          –
          <lpage>20</lpage>
          . Springer (
          <year>2016</year>
          ), https://doi.org/10.1007/978-3-319-47602-5_4
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Matousek</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Gartner, B.:
          <article-title>Understanding and using linear programming</article-title>
          . Springer Science &amp; Business
          <string-name>
            <surname>Media</surname>
          </string-name>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>DBpediaNYD - a silver standard benchmark dataset for semantic relatedness in DBpedia</article-title>
          . In:
          <source>NLP-DBPEDIA@ISWC</source>
          . Citeseer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>