<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Learning reference alignments for ontology matching within and across domains</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Beatriz Lima</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ruben Branco</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>João Castanheira</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gustavo Fonseca</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Catia Pesquita</string-name>
          <email>clpesquita@fc.ul.pt</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dep. de Informática</institution>
          ,
          <addr-line>Faculdade de Ciências da Universidade de Lisboa</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LASIGE</institution>
          ,
          <addr-line>Faculdade de Ciências da Universidade de Lisboa</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Natural Language and Speech Group</institution>
          ,
          <addr-line>Faculdade de Ciências da Universidade de Lisboa</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Reference alignments are the standard approach for ontology alignment evaluation. However, building a reference alignment is time-consuming and usually depends on expert availability. Several strategies have been proposed to mitigate this issue, ranging from exploring external resources and building simulated alignment tasks to crowdsourcing. A simple approach is to take a consensus alignment built from the outputs of several ontology matching systems. We present a preliminary investigation that focuses on the generalization of machine learning models, trained on the output alignments of multiple systems for a task where a reference alignment is available, to other alignment tasks. Results show that while the consensus alignment works well for alignment tasks where several systems achieve a high performance and produce similar alignments, trained reference models are able to improve on the consensus both within and across domains.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology matching</kwd>
        <kwd>OAEI</kwd>
        <kwd>Machine Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The evaluation of ontology alignments typically relies on reference alignments,
which are automatically compared to the outputs of the alignment systems.
Reference alignments are commonly either manually curated by domain experts or
automatically generated. The first kind can be created manually from scratch
or manually validated given a set of automatically generated candidates [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
Although very reliable, they are difficult to obtain, as they are very time-consuming
and require domain expertise. To decrease the effort and associated cost, both
automated strategies and crowdsourcing have been used. Automated strategies
usually work with simulated data [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] or explore external resources [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        Crowdsourcing has been successfully employed; however, producing references
for complex domains is more difficult due to the lack of expertise
of crowdsourced workers [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. When the above options are not available, an easy
solution to evaluate competing systems is based on a consensus alignment. This
strategy is employed by the Disease and Phenotype track at the Ontology
Alignment Evaluation Initiative [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] with a consensus alignment built on three votes
(i.e., if a mapping is found by 3 different systems, it is considered correct). The
consensus is considered to be a partial reference alignment, and mappings that
are generated by a single system are then manually evaluated.
      </p>
      <p>Motivated by the difficulties in generating a reference alignment and inspired
by the consensus alignment strategy, we hypothesise that a machine learning
model trained on the output alignments of multiple systems for a task where a
reference alignment is available can be used to evaluate other alignment tasks.</p>
    </sec>
    <sec id="sec-2">
      <title>Methodology</title>
      <p>The alignments produced by the ontology matching (OM) tools that participated
in the Anatomy, Large BioMed and Conference tracks of OAEI 2019<sup>4</sup> were used
as data sources. In our proposed models, each instance corresponds to a mapping
in a given alignment task. Each feature encodes whether the given mapping
was present (1) or absent (0) in the output of one of the participating OM tools.
The reference alignment was used to produce the target class
and support supervised learning. Thus, the model learns to classify whether
a mapping between two ontologies is correct, based on the pattern of outputs of
the OM tools, using the reliable reference alignment as ground truth.</p>
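      <p>As a minimal sketch of the feature encoding described above, the following Python builds one binary vector per candidate mapping and labels it against the reference. The tool names, entity identifiers and the helper <monospace>build_instances</monospace> are hypothetical, and taking the candidate set to be the union of all tool outputs is an assumption made for illustration rather than a detail stated in the text.</p>
      <preformat>
```python
from typing import Dict, List, Set, Tuple

Mapping = Tuple[str, str]  # (source entity, target entity)


def build_instances(
    tool_outputs: Dict[str, Set[Mapping]],
    reference: Set[Mapping],
) -> Tuple[List[Mapping], List[List[int]], List[int]]:
    """Encode each candidate mapping as presence/absence per OM tool."""
    tools = sorted(tool_outputs)  # fixed feature order
    # Assumption: candidates are the mappings found by at least one tool.
    candidates = sorted(set().union(*tool_outputs.values()))
    features = [
        [1 if m in tool_outputs[t] else 0 for t in tools] for m in candidates
    ]
    # Target class: 1 if the mapping is in the reference alignment, else 0.
    labels = [1 if m in reference else 0 for m in candidates]
    return candidates, features, labels


# Hypothetical toy data: three tools and one reference alignment.
tool_outputs = {
    "toolA": {("m:heart", "h:Heart"), ("m:lung", "h:Lung")},
    "toolB": {("m:heart", "h:Heart"), ("m:tail", "h:Arm")},
    "toolC": {("m:heart", "h:Heart")},
}
reference = {("m:heart", "h:Heart"), ("m:lung", "h:Lung")}

candidates, X, y = build_instances(tool_outputs, reference)
for mapping, row, label in zip(candidates, X, y):
    print(mapping, row, label)
```
      </preformat>
      <p>Each row of <monospace>X</monospace> has one column per tool, so a classifier trained on it only sees the pattern of agreement between tools, not the entities themselves.</p>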
      <p>
        The Anatomy track consists of matching the Adult Mouse Anatomy [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and the
portion of the National Cancer Institute Thesaurus (NCI) covering human anatomy [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and it is supported by a manually curated reference alignment. Several
OM systems achieve a high performance in this track [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The Large BioMed track
comprises three ontologies: the Foundational Model of Anatomy (FMA) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ],
SNOMED CT [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and NCI. These ontologies are matched pairwise,
generating three possible alignments: FMA-NCI, FMA-SNOMED and NCI-SNOMED,
hereafter abbreviated as LB1, LB2 and LB3, respectively.
LB1 and LB2 cover the anatomical domain, whereas LB3 does not. The
reference alignment was extracted from an external resource [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The Conference
track [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] provides 16 ontologies from the conference organisation domain. Since
only 7 ontologies are covered by the existing reference alignment, we end up
with 21 result alignments, which correspond to the complete alignment space
between these ontologies. We randomly generated 3 different datasets (CF1, CF2,
CF3), each containing 18 alignments for training and 3 alignments
for testing. The alignments used for testing were cmt-ekaw, cmt-conference and
iasted-sigkdd in CF1; conference-confof, edas-sigkdd and iasted-sigkdd in CF2;
and confof-edas, cmt-ekaw and ekaw-sigkdd in CF3.
      </p>
      <p>4 http://oaei.ontologymatching.org/2019/</p>
      <p>
        A reference alignment only contains true positive mappings. Assuming its
completeness, every potential mapping that is not a part of the reference is
a false mapping. A traditional option to generate negative examples would be
a random sampling of entity pairs from each ontology that are not present in
the reference alignment. However, this would result mostly in instances with
all-zero features, which are uninformative, since most systems produce alignments
of cardinality near one. Instead, we take as negative examples all mappings that
at least one of the OM tools finds but which are not a part of the reference
alignment. To tackle the imbalance caused by this approach, we investigated
two different sampling strategies: SMOTE oversampling [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and undersampling
with TomekLinks [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
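      <p>The choice of negative examples can be illustrated with a small sketch. It contrasts the tool-found negatives described above with a randomly drawn entity pair that no tool proposes and which therefore receives an all-zero feature vector. All names and data are hypothetical, and the resampling step itself (SMOTE or Tomek links) is not shown.</p>
      <preformat>
```python
from typing import Dict, List, Set, Tuple

Mapping = Tuple[str, str]


def tool_found_negatives(
    tool_outputs: Dict[str, Set[Mapping]], reference: Set[Mapping]
) -> Set[Mapping]:
    """Mappings proposed by at least one tool but absent from the reference."""
    found = set().union(*tool_outputs.values())
    return found - reference


def feature_vector(
    mapping: Mapping, tool_outputs: Dict[str, Set[Mapping]]
) -> List[int]:
    """One 0/1 entry per tool, in alphabetical tool order."""
    return [1 if mapping in tool_outputs[t] else 0 for t in sorted(tool_outputs)]


# Hypothetical toy data.
tool_outputs = {
    "toolA": {("a1", "b1"), ("a2", "b2"), ("a3", "b9")},
    "toolB": {("a1", "b1"), ("a3", "b9"), ("a4", "b7")},
}
reference = {("a1", "b1"), ("a2", "b2")}

negatives = tool_found_negatives(tool_outputs, reference)
print("negatives:", sorted(negatives))

# An informative negative: both tools proposed it, yet it is not in the reference.
print(feature_vector(("a3", "b9"), tool_outputs))

# A randomly drawn non-reference pair that no tool proposed is uninformative.
print(feature_vector(("a5", "b5"), tool_outputs))
```
      </preformat>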
      <p>Three types of experiments were performed for each domain to verify different
properties. In Experiment 1, which worked as a baseline, we investigated
how well a model can be learned within a given alignment task. We performed
10-fold cross-validation, with a grid search for hyperparameter tuning over a
set of 8 machine learning approaches<sup>5</sup>. In Experiment 2, we investigated whether a
model trained on one or more tasks would generalize well to other tasks within the
same domain. To support this, features were extracted only from the OM tools that
participated in both training and test tasks. Experiment 3 aims to verify how
well the method generalises to ontologies in completely different domains. We
train on LargeBio data and test on Conference, and vice-versa, again using the
intersection of OM tools that participated in both tracks. For all experiments,
we also computed the majority vote and the consensus with vote=3 results.</p>
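      <p>The two voting baselines can be sketched as follows. The vote=3 consensus keeps mappings found by at least three systems; reading the majority vote as "found by more than half of the systems" is an assumption, as are the function names and the toy data. The F1-score is computed against the reference in the usual way, as the harmonic mean of precision and recall.</p>
      <preformat>
```python
from collections import Counter
from typing import Dict, Set, Tuple

Mapping = Tuple[str, str]


def consensus(tool_outputs: Dict[str, Set[Mapping]], min_votes: int) -> Set[Mapping]:
    """Keep mappings proposed by at least `min_votes` tools."""
    votes = Counter(m for output in tool_outputs.values() for m in output)
    return {m for m, n in votes.items() if n >= min_votes}


def majority_vote(tool_outputs: Dict[str, Set[Mapping]]) -> Set[Mapping]:
    # Assumption: "majority" means strictly more than half of the tools.
    return consensus(tool_outputs, len(tool_outputs) // 2 + 1)


def f1_score(predicted: Set[Mapping], reference: Set[Mapping]) -> float:
    """Harmonic mean of precision and recall against the reference."""
    true_positives = len(predicted.intersection(reference))
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: four tools and one reference alignment.
tool_outputs = {
    "t1": {("a", "x"), ("b", "y"), ("c", "z")},
    "t2": {("a", "x"), ("b", "y")},
    "t3": {("a", "x"), ("c", "w")},
    "t4": {("a", "x"), ("b", "y"), ("c", "w")},
}
reference = {("a", "x"), ("b", "y"), ("c", "z")}

three_votes = consensus(tool_outputs, min_votes=3)
print(sorted(three_votes), round(f1_score(three_votes, reference), 3))
print(sorted(majority_vote(tool_outputs)))
```
      </preformat>
      <p>In this toy case the correct mapping found by a single tool is lost by both baselines, which is the kind of situation a trained model could in principle recover from.</p>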
    </sec>
    <sec id="sec-3">
      <title>Results and Discussion</title>
      <p>Table 1 presents the results obtained for all three experiments, using the best
sampling strategy (oversampling) and machine learning approaches<sup>6</sup>. In the
biomedical domain, all cross-validation experiments achieved good performance
(0.8 to 0.915 average F1-score); however, in the Anatomy task, the Three votes
approach achieved the best result. In the second experiment, the model learned
in Anatomy achieved at best an F1-score of 0.697 in LargeBio, whereas the
model trained on LargeBio reached 0.938 in Anatomy. Nevertheless, the Three
votes consensus approach achieved a higher score in these two cases. However,
within the LB track, the ML models outperformed the consensus approach for the
LB1 and LB2 trained models. These results indicate that system strategies likely
differ between the Anatomy and LargeBio tracks. The greater complexity and
coverage of LB (which includes both anatomical and non-anatomical tasks) can
help explain these results. In the Conference domain, the results of the first
experiment were high overall, with ML approaches improving over the consensus. The
second experiment revealed that the ML approaches were able to outperform
the consensus in only one test case. As for the cross-domain experiments, we
can observe that, even though the LB dataset is much bigger than CF, models
trained in CF were able to generalise well to LB and vice-versa, and in both
cases surpass the consensus. One relevant aspect that may help explain these
results is the degree of agreement between OM systems. In the Anatomy task, the
average agreement<sup>7</sup> between systems is 0.75, whereas in LB1, LB2, and LB3
it is 0.35, 0.26 and 0.40, respectively. In Conference, the agreement ranges
between 0.51 and 0.86, with most tasks falling below 0.65. This indicates that when
systems have a high agreement, the consensus provides a good evaluation, but
when systems differ in their outputs, the ML approaches work best.</p>
      <p>5 Random Forest, K-Nearest Neighbors, Decision Tree, Multi-Layer Perceptron, Naive Bayes, Gradient Boosting, Logistic Regression and AdaBoost.</p>
      <p>6 The full table of results along with hyperparameter information can be found here: https://github.com/liseda-lab/ML4ReferenceAlignment</p>
      <p>
[Table 1. Column headers: Exp.; Train; Test; Gradient Boosting; AdaBoost; Logistic Regression; Decision Tree; Majority Vote; Three votes. Train and test sets listed: Anatomy, LB, LB1, LB2, LB3, LB2+3, LB1+3, LB1+2, CF, CF1, CF2, CF3.]</p>
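      <p>The agreement measure used in the discussion above, the average pairwise Jaccard similarity between OM system outputs, can be sketched as below. The function names and toy alignments are hypothetical, and treating two empty alignments as identical is an assumption of this sketch.</p>
      <preformat>
```python
from itertools import combinations
from typing import Dict, Set, Tuple

Mapping = Tuple[str, str]


def jaccard(a: Set[Mapping], b: Set[Mapping]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a.union(b)
    if not union:
        return 1.0  # assumption: two empty alignments count as identical
    return len(a.intersection(b)) / len(union)


def average_agreement(tool_outputs: Dict[str, Set[Mapping]]) -> float:
    """Mean Jaccard similarity over all unordered pairs of tools."""
    pairs = list(combinations(sorted(tool_outputs), 2))
    if not pairs:
        raise ValueError("need at least two tools")
    total = sum(jaccard(tool_outputs[p], tool_outputs[q]) for p, q in pairs)
    return total / len(pairs)


# Hypothetical toy data: three tools.
tool_outputs = {
    "t1": {("a", "x"), ("b", "y")},
    "t2": {("a", "x"), ("b", "y"), ("c", "z")},
    "t3": {("a", "x")},
}
print(round(average_agreement(tool_outputs), 3))
```
      </preformat>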
      <p>Our preliminary results highlight an opportunity to address the challenge
of incomplete reference alignments by training models with a partial reference.
Furthermore, they also show that in tasks where systems output dissimilar
alignments, a model trained on other alignment tasks, even from a different
domain, can provide a more complete evaluation than a consensus alignment.</p>
      <p>7 Computed as the average pairwise Jaccard similarity between the outputs of the OM systems.</p>
      <p>Acknowledgements. CP and BL are funded by the FCT through LASIGE
Research Unit, ref. UIDB/00408/2020 and ref. UIDP/00408/2020, and by project
SMILAX (ref. PTDC/EEI-ESS/4633/2014). CP is also funded by GADgET (ref.
DSAIPA/DS/0022/2018). RB is funded by PORTULAN CLARIN Research
Infrastructure through Lisboa 2020, Alentejo 2020 and FCT (PINFRA/22117/2016).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Chawla</surname>
            ,
            <given-names>N.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bowyer</surname>
            ,
            <given-names>K.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>L.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kegelmeyer</surname>
            ,
            <given-names>W.P.</given-names>
          </string-name>
          :
          <article-title>SMOTE: synthetic minority over-sampling technique</article-title>
          .
          <source>Journal of Artificial Intelligence Research</source>
          <volume>16</volume>
          ,
          <fpage>321</fpage>
          -
          <lpage>357</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Cheatham</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Conference v2.0: An uncertain version of the OAEI conference benchmark</article-title>
          . In: International Semantic Web Conference. pp.
          <fpage>33</fpage>
          -
          <lpage>48</lpage>
          . Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Donnelly</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>SNOMED-CT: The advanced terminology and coding system for eHealth</article-title>
          .
          <source>Studies in Health Technology and Informatics</source>
          <volume>121</volume>
          ,
          <fpage>279</fpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Dragisic</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ivanova</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lambrix</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Experiences from the anatomy track in the ontology alignment evaluation initiative</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          <volume>8</volume>
          (
          <issue>1</issue>
          ),
          <fpage>56</fpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ferrara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montanelli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Noessner</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stuckenschmidt</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Benchmarking matching applications on the semantic web</article-title>
          . In:
          <source>Extended Semantic Web Conference</source>
          . pp.
          <fpage>108</fpage>
          -
          <lpage>122</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Golbeck</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fragoso</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hendler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oberthaler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parsia</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>The national cancer institute's thesaurus and ontology</article-title>
          .
          <source>Journal of Web Semantics First Look 1</source>
          <volume>1 4</volume>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Harrow</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jimenez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Splendiani</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romacker</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Woollard</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alam-Faruque</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malone</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Waaler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Matching disease and phenotype ontologies in the ontology alignment evaluation initiative</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          <volume>8</volume>
          (
          <issue>1</issue>
          ),
          <fpage>55</fpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hayamizu</surname>
            ,
            <given-names>T.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mangan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corradi</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kadin</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ringwald</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>The adult mouse anatomical dictionary: a tool for annotating and integrating data</article-title>
          .
          <source>Genome Biology</source>
          <volume>6</volume>
          (
          <issue>3</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Jimenez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grau</surname>
            ,
            <given-names>B.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Exploiting the UMLS Metathesaurus in the ontology alignment evaluation initiative</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dragisic</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faria</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ivanova</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jimenez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lambrix</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pesquita</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>User validation in ontology alignment: functional assessment and impact</article-title>
          .
          <source>The Knowledge Engineering Review</source>
          <volume>34</volume>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mejino Jr.</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          :
          <article-title>A reference ontology for biomedical informatics: the foundational model of anatomy</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>36</volume>
          (
          <issue>6</issue>
          ),
          <fpage>478</fpage>
          -
          <lpage>500</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Tomek</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Two modifications of CNN</article-title>
          .
          <source>IEEE Transactions on Systems, Man, and Cybernetics</source>
          <volume>SMC-6</volume>
          (
          <issue>11</issue>
          ),
          <fpage>769</fpage>
          -
          <lpage>772</lpage>
          (
          <year>1976</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Zamazal</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Svatek</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>The ten-year OntoFarm and its fertilization within the onto-sphere</article-title>
          .
          <source>Journal of Web Semantics</source>
          <volume>43</volume>
          ,
          <fpage>46</fpage>
          -
          <lpage>53</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>