<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Tailoring Ontology Embeddings for Ontology Matching Tasks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sevinj Teymurova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ernesto Jiménez-Ruiz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tillman Weyde</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jiaoyan Chen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>City St George's, University of London</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>The University of Manchester</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Oslo</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Ontology alignment is crucial for achieving semantic interoperability as the number of ontologies representing the same domain keeps growing. This paper introduces OWL2Vec4OA, an extension of the OWL2Vec* ontology embedding system. Although OWL2Vec* is a robust method for ontology embedding, it is not specialized for ontology alignment tasks. OWL2Vec4OA addresses this limitation by using the confidence values of seed mappings to bias its random walk strategy.</p>
      </abstract>
      <kwd-group>
        <kwd>ontology alignment</kwd>
        <kwd>walking strategy</kwd>
        <kwd>ontology embeddings</kwd>
        <kwd>knowledge graph embeddings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction and System Overview</title>
      <p>of the input ontologies given a set of seed mappings, which enables the creation of sequences across
entities from different ontologies. The implementation of OWL2Vec4OA is available in our GitHub
repository: https://github.com/Sevinjt/OWL2Vec4OA</p>
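      <p>The idea of walking across two ontologies through seed mappings can be illustrated with a minimal sketch. It is not the OWL2Vec4OA implementation: the function names, the undirected adjacency-list graph, and the default weight of 1.0 for ordinary ontology edges are assumptions made for illustration. Seed mappings are added as extra edges whose weight is the mapping confidence (assumed positive), and each step of the walk samples a neighbour with probability proportional to the edge weight.</p>
      <preformat><![CDATA[
```python
import random


def build_joint_graph(edges_a, edges_b, seed_mappings, default_weight=1.0):
    """Return an adjacency list: node -> list of (neighbour, weight).

    edges_a, edges_b: lists of (entity, entity) pairs, one list per ontology.
    seed_mappings: list of (source_entity, target_entity, confidence).
    All names and the weighting scheme are hypothetical.
    """
    graph = {}

    def add_edge(u, v, weight):
        graph.setdefault(u, []).append((v, weight))
        graph.setdefault(v, []).append((u, weight))

    for u, v in edges_a + edges_b:
        add_edge(u, v, default_weight)
    # Seed mappings bridge the two ontologies; their confidence is the edge weight.
    for source, target, confidence in seed_mappings:
        add_edge(source, target, confidence)
    return graph


def biased_walk(graph, start, depth, rng):
    """Walk up to `depth` steps, choosing neighbours proportionally to edge weight."""
    walk = [start]
    current = start
    for _ in range(depth):
        neighbours = graph.get(current, [])
        if not neighbours:
            break  # dead end: return the walk collected so far
        nodes = [node for node, _weight in neighbours]
        weights = [weight for _node, weight in neighbours]
        current = rng.choices(nodes, weights=weights, k=1)[0]
        walk.append(current)
    return walk


# Toy example with made-up entity names.
edges_a = [("a:Disease", "a:Cancer")]
edges_b = [("b:Disorder", "b:Neoplasm")]
seeds = [("a:Cancer", "b:Neoplasm", 0.9)]
graph = build_joint_graph(edges_a, edges_b, seeds)
walk = biased_walk(graph, "a:Disease", depth=3, rng=random.Random(0))
print(walk)
```
]]></preformat>
      <p>In this toy graph a walk starting at a:Disease can reach b:Neoplasm only through the seed mapping, so the resulting sequence mixes entities from both ontologies. Sequences of this kind would then be passed to a Word2Vec-style model.</p>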
    </sec>
    <sec id="sec-2">
      <title>2. Results, Discussion and Future Work</title>
      <p>Our study evaluated the performance of OWL2Vec4OA across multiple biomedical ontology alignment
tasks from the local matching setting of the OAEI's Bio-ML track [19]. Each local matching task consists
of ranking a correct mapping against a pool of incorrect candidate mappings.</p>
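      <p>Ranking tasks of this kind are commonly summarized with Mean Reciprocal Rank (MRR) and Hits@K. The following sketch shows how both can be computed from the 1-based rank that the correct mapping obtains in each candidate pool. The function names and the example ranks are illustrative and are not taken from the Bio-ML evaluation code.</p>
      <preformat><![CDATA[
```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test mappings (ranks are 1-based)."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test mappings whose correct candidate is ranked in the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Made-up ranks of the correct mapping in four candidate pools.
example_ranks = [1, 2, 4, 1]
print(mean_reciprocal_rank(example_ranks))  # (1 + 0.5 + 0.25 + 1) / 4 = 0.6875
print(hits_at_k(example_ranks, 1))          # 2 of 4 pools ranked first = 0.5
```
]]></preformat>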
      <p>In this work, mappings are scored and ranked according to the cosine similarity of the computed
URI embeddings for the two entities in the mapping. We applied OWL2Vec4OA and OWL2Vec* to compute
the embeddings, fixing the Word2Vec hyperparameters to 70 epochs and an embedding
dimension of 100. OWL2Vec4OA clearly outperformed its
predecessor OWL2Vec* on all the tested ontology pairs, indicating that the OWL2Vec4OA embeddings
are better suited for ontology alignment tasks. For OMIM-ORDO, OWL2Vec4OA showed substantial
improvement at walk depth 2, with Mean Reciprocal Rank (MRR) increasing from 0.074 to 0.586 and
Hits@1 improving from 0.018 to 0.533. For NCIT-DOID, OWL2Vec4OA achieved its best performance at
walk depth 4, with MRR rising from 0.105 to 0.609 and Hits@1 from 0.035 to 0.442.
SNOMED-NCIT-N exhibited the largest improvement: at walk depth 4, MRR increased from 0.055 to 0.805
and Hits@1 from 0.011 to 0.747. For SNOMED-NCIT-P, the best results were observed at
walk depth 2, with MRR increasing from 0.079 to 0.436 and Hits@1 from 0.018 to 0.342. Walk depth
strongly influenced performance, and the best depth varied across ontology pairs: shorter walks (depth 2
or 3) performed better for OMIM-ORDO and SNOMED-NCIT-P, while
NCIT-DOID and SNOMED-NCIT-N benefited from longer walks. Computation time varied with the
ontology pair and the walk depth, with greater depths consistently requiring more time than shorter
ones.</p>
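      <p>The cosine-based scoring described above can be sketched as follows. The two-dimensional vectors and entity names are made up for illustration (the experiments use 100-dimensional embeddings), and the helper names are hypothetical. Each candidate target entity is scored against the source entity, and the candidates are sorted by decreasing similarity.</p>
      <preformat><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(source_vec, candidates):
    """candidates: dict mapping target URI -> embedding.

    Returns a list of (uri, score) pairs sorted by decreasing cosine similarity.
    """
    scored = [(uri, cosine(source_vec, vec)) for uri, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings: one correct target and two incorrect candidates.
source = [1.0, 0.0]
pool = {
    "b:Fracture": [0.0, 1.0],
    "b:Neoplasm": [0.9, 0.1],
    "b:Disorder": [0.5, 0.5],
}
ranking = rank_candidates(source, pool)
correct_rank = [uri for uri, _score in ranking].index("b:Neoplasm") + 1
print(ranking, correct_rank)
```
]]></preformat>
      <p>The 1-based rank of the correct target obtained this way is the quantity from which MRR and Hits@1 are computed.</p>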
      <p>We plan to extend our work as follows: (i) train a machine learning model with OWL2Vec4OA
embeddings, similar to approaches like LogMap-ML and Hao et al. [20]; (ii) perform additional experiments
to better understand the impact of the walk depth with different strategies to create entity sequences
(i.e., focusing on concepts and/or avoiding OWL constructs); and (iii) create an end-to-end ontology
alignment system to participate in the OAEI campaign.</p>
      <p>[Table: OWL2Vec* vs. OWL2Vec4OA results for OMIM-ORDO, NCIT-DOID, SNOMED-NCIT-N and SNOMED-NCIT-P]</p>
    </sec>
    <sec id="sec-3">
      <title>Acknowledgments</title>
      <p>This research is funded by the Ministry of Education and Science of Azerbaijan Republic with support
from City St George's, University of London. This work has also been partially supported by the
Academy of Medical Sciences Network Grant (Neurosymbolic AI for Medicine, NGR1\1857) and the
project "XAI4SOC: Explainable Artificial Intelligence for Healthy Aging and Social Wellbeing" funded by
the Agencia Estatal de Investigación (AEI), the Spanish Ministry of Science, Innovation and Universities
and the European Social Funds (PID2021-123152OB-C22).</p>
      <p>[4] C. Xiang, T. Jiang, B. Chang, Z. Sui, ERSOM: A structural ontology matching approach using automatically learned entity representation, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015, pp. 2419–2429.</p>
      <p>[5] P. Kolyvakis, A. Kalousis, D. Kiritsis, DeepAlignment: Unsupervised ontology matching with refined word vectors, in: Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1-6 June 2018, 2018.</p>
      <p>[6] V. Iyer, A. Agarwal, H. Kumar, VeeAlign: Multifaceted Context Representation Using Dual Attention for Ontology Alignment, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, 2021, pp. 10780–10792. doi:10.18653/V1/2021.EMNLP-MAIN.842.</p>
      <p>[7] J. Hao, C. Lei, V. Efthymiou, A. Quamar, F. Özcan, Y. Sun, W. Wang, MEDTO: Medical data to ontology matching using hybrid graph neural networks, in: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &amp; Data Mining, 2021, pp. 2946–2954.</p>
      <p>[8] J. Chen, E. Jiménez-Ruiz, I. Horrocks, D. Antonyrajah, A. Hadian, J. Lee, Augmenting ontology alignment by semantic embedding and distant supervision, in: The Semantic Web: ESWC, Springer, 2021, pp. 392–408.</p>
      <p>[9] F. Gosselin, A. Zouaq, SORBET: A Siamese Network for Ontology Embeddings Using a Distance-Based Regression Loss and BERT, in: International Semantic Web Conference, Springer, 2023, pp. 561–578.</p>
      <p>[10] Y. He, J. Chen, D. Antonyrajah, I. Horrocks, BERTMap: A BERT-Based Ontology Alignment System, in: Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022.</p>
      <p>[11] J. Chen, Y. He, Y. Geng, E. Jiménez-Ruiz, H. Dong, I. Horrocks, Contextual semantic embeddings for ontology subsumption prediction, World Wide Web 26 (2023) 2569–2591.</p>
      <p>[12] S. Hertling, H. Paulheim, OLaLa: Ontology Matching with Large Language Models, in: K. B. Venable, D. Garijo, B. Jalaian (Eds.), Proceedings of the 12th Knowledge Capture Conference (K-CAP), ACM, 2023, pp. 131–139. URL: https://doi.org/10.1145/3587259.3627571. doi:10.1145/3587259.3627571.</p>
      <p>[13] S. Teymurova, E. Jiménez-Ruiz, T. Weyde, J. Chen, OWL2Vec4OA: Tailoring Knowledge Graph Embeddings for Ontology Alignment, submitted to a conference, 2024. URL: http://arxiv.org/abs/2408.06310.</p>
      <p>[14] J. Chen, P. Hu, E. Jiménez-Ruiz, O. M. Holter, D. Antonyrajah, I. Horrocks, OWL2Vec*: Embedding of OWL ontologies, Machine Learning 110 (2021) 1813–1845.</p>
      <p>[15] E. Jiménez-Ruiz, B. Cuenca Grau, LogMap: Logic-Based and Scalable Ontology Matching, in: The Semantic Web – ISWC 2011, volume 7031, 2011.</p>
      <p>[16] D. Faria, C. Pesquita, E. Santos, M. Palmonari, I. F. Cruz, F. M. Couto, The AgreementMakerLight ontology matching system, in: On the Move to Meaningful Internet Systems, volume 8185 of Lecture Notes in Computer Science, Springer, 2013, pp. 527–541. URL: https://doi.org/10.1007/978-3-642-41030-7_38. doi:10.1007/978-3-642-41030-7_38.</p>
      <p>[17] M. Cochez, P. Ristoski, S. P. Ponzetto, H. Paulheim, Biased graph walks for RDF graph embeddings, in: R. Akerkar, A. Cuzzocrea, J. Cao, M. Hacid (Eds.), Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics, ACM, 2017, pp. 21:1–21:12. URL: https://doi.org/10.1145/3102254.3102279. doi:10.1145/3102254.3102279.</p>
      <p>[18] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, Advances in Neural Information Processing Systems 26 (2013).</p>
      <p>[19] Y. He, J. Chen, H. Dong, E. Jiménez-Ruiz, A. Hadian, I. Horrocks, Machine Learning-Friendly Biomedical Datasets for Equivalence and Subsumption Ontology Matching, in: 21st International Semantic Web Conference, volume 13489 of Lecture Notes in Computer Science, Springer, 2022, pp. 575–591. URL: https://doi.org/10.1007/978-3-031-19433-7_33. doi:10.1007/978-3-031-19433-7_33.</p>
      <p>[20] Z. Hao, W. Mayer, J. Xia, G. Li, L. Qin, Z. Feng, Ontology alignment with semantic and structural embeddings, J. Web Semant. 78 (2023) 100798. URL: https://doi.org/10.1016/j.websem.2023.100798. doi:10.1016/J.WEBSEM.2023.100798.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. A. N.</given-names>
            <surname>Pour</surname>
          </string-name>
          , et al.,
          <article-title>Results of the ontology alignment evaluation initiative 2022</article-title>
          ,
          <source>in: Proceedings of the 17th International Workshop on Ontology Matching (OM)</source>
          , volume
          <volume>3324</volume>
          <source>of CEUR Workshop Proceedings</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>84</fpage>
          -
          <lpage>128</lpage>
          . URL: https://ceur-ws.org/Vol-3324/oaei22_paper0.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. A. N.</given-names>
            <surname>Pour</surname>
          </string-name>
          , et al.,
          <article-title>Results of the Ontology Alignment Evaluation Initiative 2023</article-title>
          ,
          <source>in: Proceedings of the 18th International Workshop on Ontology Matching (OM)</source>
          , volume
          <volume>3591</volume>
          <source>of CEUR Workshop Proceedings</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>139</lpage>
          . URL: https://ceur-ws.org/Vol-3591/oaei23_paper0.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Otero-Cerdeira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. J.</given-names>
            <surname>Rodríguez-Martínez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gómez-Rodríguez</surname>
          </string-name>
          ,
          <article-title>Ontology matching: A literature review</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>42</volume>
          (
          <year>2015</year>
          )
          <fpage>949</fpage>
          -
          <lpage>971</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Sui</surname>
          </string-name>
          ,
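          <article-title>ERSOM: A structural ontology matching approach using automatically learned entity representation</article-title>
          ,
          <source>in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>2419</fpage>
          -
          <lpage>2429</lpage>
          .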
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>