
Towards Tailoring Ontology Embeddings for Ontology Matching Tasks

Sevinj Teymurova

Ernesto Jiménez-Ruiz (0, 2)

Tillman Weyde

Jiaoyan Chen (1)

(0) City St George's, University of London, UK
(1) The University of Manchester, UK
(2) University of Oslo, Norway

Ontology alignment is becoming crucial for achieving semantic interoperability as the number of ontologies representing the same domain increases. This paper introduces OWL2Vec4OA, an enhancement of the OWL2Vec* ontology embedding system. Although OWL2Vec* is a robust method for ontology embedding, it is not specialized for ontology alignment tasks. OWL2Vec4OA addresses this limitation by using confidence values from seed mappings to bias its random walk strategy.

Keywords: ontology alignment, walking strategy, ontology embeddings, knowledge graph embeddings

1. Introduction and System Overview

of the input ontologies given a set of seed mappings, which enables the creation of sequences across entities from different ontologies. The implementation of OWL2Vec4OA is available on our GitHub repository: https://github.com/Sevinjt/OWL2Vec4OA
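The idea of biasing walks with seed-mapping confidences can be illustrated with a minimal sketch. It is not taken from the OWL2Vec4OA code base: the function names (`build_graph`, `biased_walk`), the toy entity identifiers, the weight of 1.0 for edges inside an ontology, and the treatment of all edges as undirected are assumptions made only for illustration.

```python
import random
from collections import defaultdict


def build_graph(edges, seed_mappings):
    """Return adjacency lists of (neighbour, weight) pairs.

    edges: (source, target) pairs inside either ontology; each gets
        weight 1.0 and is treated as undirected (illustrative assumptions).
    seed_mappings: (entity_a, entity_b, confidence) triples that link
        entities of the two ontologies; the confidence is the edge weight.
    """
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append((dst, 1.0))
        graph[dst].append((src, 1.0))
    for a, b, conf in seed_mappings:
        if conf <= 0:
            continue  # a mapping without positive confidence can never be followed
        graph[a].append((b, conf))
        graph[b].append((a, conf))
    return graph


def biased_walk(graph, start, depth, rng):
    """Walk up to `depth` steps from `start`.

    Each step picks a neighbour with probability proportional to its edge
    weight, so high-confidence mappings are crossed more often.
    """
    walk = [start]
    current = start
    for _ in range(depth):
        neighbours = graph.get(current)
        if not neighbours:
            break  # dead end or unknown entity
        nodes, weights = zip(*neighbours)
        current = rng.choices(nodes, weights=weights, k=1)[0]
        walk.append(current)
    return walk


# Toy example with hypothetical entities from two ontologies "a" and "b".
edges = [("a:Disease", "a:Cancer"), ("b:Disorder", "b:Neoplasm")]
seeds = [("a:Cancer", "b:Neoplasm", 0.9)]
graph = build_graph(edges, seeds)
walk = biased_walk(graph, "a:Disease", depth=3, rng=random.Random(0))
print(walk)
```

Walks that cross a mapping edge place entities from both ontologies in the same sequence, which is what lets a sequence-based embedding model relate them.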

2. Results, Discussion and Future Work

Our study evaluated the performance of OWL2Vec4OA across multiple biomedical ontology alignment tasks from the local matching setting of the OAEI’s Bio-ML track [19]. Each local matching task consists of ranking a correct mapping among a pool of incorrect mappings.

In this work, mappings are scored and ranked according to the cosine similarity of the computed URI embeddings for the entities in the mapping. We applied OWL2Vec4OA and OWL2Vec* to compute the embeddings, fixing the Word2Vec hyperparameters: the number of epochs was set to 70 and the embedding dimension to 100. OWL2Vec4OA demonstrated significant improvements over its predecessor OWL2Vec* for all the tested ontology pairs, indicating that the OWL2Vec4OA embeddings are better suited for ontology alignment tasks. For OMIM-ORDO, OWL2Vec4OA showed a substantial improvement at walk depth 2, with Mean Reciprocal Rank (MRR) increasing from 0.074 to 0.586 and Hits@1 improving from 0.018 to 0.533. For NCIT-DOID, OWL2Vec4OA achieved its best performance at walk depth 4, with MRR rising from 0.105 to 0.609 and Hits@1 from 0.035 to 0.442. SNOMED-NCIT-N exhibited the most dramatic improvement: at walk depth 4, MRR increased from 0.055 to 0.805 and Hits@1 from 0.011 to 0.747. For SNOMED-NCIT-P, significant improvements were observed at walk depth 2, with MRR increasing from 0.079 to 0.436 and Hits@1 from 0.018 to 0.342.

Walk depth significantly influenced performance across the different ontology pairs. Shorter walks (depth 2 or 3) performed better for pairs such as OMIM-ORDO and SNOMED-NCIT-P, while NCIT-DOID and SNOMED-NCIT-N benefited from longer walks. Computation time varied with the ontology pair and the walk depth, with deeper walks consistently requiring more time than shallower ones.
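The ranking-based evaluation described above can be sketched as follows. This is a simplified illustration rather than the evaluation code used for the reported results: the helper names (`cosine`, `rank_of_correct`, `mrr_and_hits`) and the toy two-dimensional vectors are assumptions, and ties are broken by candidate order.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_of_correct(source_vec, candidates, correct):
    """1-based rank of `correct` when candidates are sorted by descending similarity.

    candidates: dict mapping a candidate entity name to its embedding.
    """
    ordered = sorted(
        candidates,
        key=lambda name: cosine(source_vec, candidates[name]),
        reverse=True,
    )
    return ordered.index(correct) + 1


def mrr_and_hits(ranks, k=1):
    """Mean Reciprocal Rank and Hits@k over a list of 1-based ranks."""
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = sum(1 for r in ranks if r <= k) / len(ranks)
    return mrr, hits


# Toy example: one source entity and a pool of two hypothetical candidates.
pool = {"b:Neoplasm": [1.0, 0.0], "b:Disorder": [0.0, 1.0]}
rank = rank_of_correct([1.0, 0.1], pool, correct="b:Neoplasm")
mrr, hits_at_1 = mrr_and_hits([rank, 2, 4], k=1)
```

MRR rewards a correct mapping that appears near the top of the ranking even when it is not first, whereas Hits@1 only counts the cases where it is ranked first.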

We plan to extend our work as follows: (i) train a machine learning model with OWL2Vec4OA embeddings, similar to approaches such as LogMap-ML and Hao et al. [20]; (ii) perform additional experiments to better understand the impact of the walk depth with different strategies to create entity sequences (i.e., focusing on concepts and/or avoiding OWL constructs); and (iii) create an end-to-end ontology alignment system to participate in the OAEI campaign.

[Table: results of OWL2Vec* and OWL2Vec4OA for the ontology pairs OMIM-ORDO, NCIT-DOID, SNOMED-NCIT-N and SNOMED-NCIT-P. Only the column headers survive.]

Acknowledgments

This research is funded by the Ministry of Education and Science of Azerbaijan Republic with support from City St George’s, University of London. This work has also been partially supported by the Academy of Medical Sciences Network Grant (Neurosymbolic AI for Medicine, NGR1\1857) and the project "XAI4SOC: Explainable Artificial Intelligence for Healthy Aging and Social Wellbeing" funded by the Agencia Estatal de Investigación (AEI), the Spanish Ministry of Science, Innovation and Universities and the European Social Funds (PID2021-123152OB-C22).

References

[1] M. A. N. Pour, et al., Results of the ontology alignment evaluation initiative 2022, in: Proceedings of the 17th International Workshop on Ontology Matching (OM), volume 3324 of CEUR Workshop Proceedings, 2022, pp. 84–128. URL: https://ceur-ws.org/Vol-3324/oaei22_paper0.pdf.
[2] M. A. N. Pour, et al., Results of the Ontology Alignment Evaluation Initiative 2023, in: Proceedings of the 18th International Workshop on Ontology Matching (OM), volume 3591 of CEUR Workshop Proceedings, 2023, pp. 97–139. URL: https://ceur-ws.org/Vol-3591/oaei23_paper0.pdf.
[3] Otero-Cerdeira, F. J. Rodríguez-Martínez, Gómez-Rodríguez, Ontology matching: A literature review, Expert Systems with Applications 42 (2015) 949–971.
[4] Xiang, Jiang, Chang, Sui, ERSOM: A structural ontology matching approach using automatically learned entity representation, in: Proceedings of the 2015 conference on empirical methods in natural language processing, 2015, pp. 2419–2429.
[5] P. Kolyvakis, A. Kalousis, D. Kiritsis, DeepAlignment: Unsupervised ontology matching with refined word vectors, in: Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1-6 June 2018, 2018.
[6] V. Iyer, A. Agarwal, H. Kumar, VeeAlign: Multifaceted Context Representation Using Dual Attention for Ontology Alignment, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, 2021, pp. 10780–10792. doi:10.18653/V1/2021.EMNLP-MAIN.842.
[7] J. Hao, C. Lei, V. Efthymiou, A. Quamar, F. Özcan, Y. Sun, W. Wang, Medto: Medical data to ontology matching using hybrid graph neural networks, in: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021, pp. 2946–2954.
[8] J. Chen, E. Jiménez-Ruiz, I. Horrocks, D. Antonyrajah, A. Hadian, J. Lee, Augmenting ontology alignment by semantic embedding and distant supervision, in: The Semantic Web: ESWC, Springer, 2021, pp. 392–408.
[9] F. Gosselin, A. Zouaq, SORBET: A Siamese Network for Ontology Embeddings Using a Distance-Based Regression Loss and BERT, in: International Semantic Web Conference, Springer, 2023, pp. 561–578.
[10] Y. He, J. Chen, D. Antonyrajah, I. Horrocks, BERTMap: A BERT-Based Ontology Alignment System, in: Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022.
[11] J. Chen, Y. He, Y. Geng, E. Jiménez-Ruiz, H. Dong, I. Horrocks, Contextual semantic embeddings for ontology subsumption prediction, World Wide Web 26 (2023) 2569–2591.
[12] S. Hertling, H. Paulheim, OLaLa: Ontology Matching with Large Language Models, in: K. B. Venable, D. Garijo, B. Jalaian (Eds.), Proceedings of the 12th Knowledge Capture Conference (K-CAP), ACM, 2023, pp. 131–139. URL: https://doi.org/10.1145/3587259.3627571. doi:10.1145/3587259.3627571.
[13] S. Teymurova, E. Jiménez-Ruiz, T. Weyde, J. Chen, OWL2Vec4OA: Tailoring Knowledge Graph Embeddings for Ontology Alignment, in: Submitted to a Conference, 2024. Paper available here: http://arxiv.org/abs/2408.06310.
[14] J. Chen, P. Hu, E. Jimenez-Ruiz, O. M. Holter, D. Antonyrajah, I. Horrocks, OWL2vec*: Embedding of OWL ontologies, Machine Learning 110 (2021) 1813–1845.
[15] E. Jimenez-Ruiz, B. Cuenca Grau, LogMap: Logic-Based and Scalable Ontology Matching, The Semantic Web – ISWC (2011) vol 7031.
[16] D. Faria, C. Pesquita, E. Santos, M. Palmonari, I. F. Cruz, F. M. Couto, The agreementmakerlight ontology matching system, in: On the Move to Meaningful Internet Systems, volume 8185 of Lecture Notes in Computer Science, Springer, 2013, pp. 527–541. URL: https://doi.org/10.1007/978-3-642-41030-7_38. doi:10.1007/978-3-642-41030-7_38.
[17] M. Cochez, P. Ristoski, S. P. Ponzetto, H. Paulheim, Biased graph walks for RDF graph embeddings, in: R. Akerkar, A. Cuzzocrea, J. Cao, M. Hacid (Eds.), Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics, ACM, 2017, pp. 21:1–21:12. URL: https://doi.org/10.1145/3102254.3102279. doi:10.1145/3102254.3102279.
[18] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, Advances in neural information processing systems 26 (2013).
[19] Y. He, J. Chen, H. Dong, E. Jiménez-Ruiz, A. Hadian, I. Horrocks, Machine Learning-Friendly Biomedical Datasets for Equivalence and Subsumption Ontology Matching, in: 21st International Semantic Web Conference, volume 13489 of Lecture Notes in Computer Science, Springer, 2022, pp. 575–591. URL: https://doi.org/10.1007/978-3-031-19433-7_33. doi:10.1007/978-3-031-19433-7_33.
[20] Z. Hao, W. Mayer, J. Xia, G. Li, L. Qin, Z. Feng, Ontology alignment with semantic and structural embeddings, J. Web Semant. 78 (2023) 100798. URL: https://doi.org/10.1016/j.websem.2023.100798. doi:10.1016/J.WEBSEM.2023.100798.