TOM Matcher Results for OAEI 2021

Daniel Kossack1[0000-0002-8649-3357], Niklas Borg1[0000-0002-8081-6653], Leon Knorr1[0000-0003-4117-2629], and Jan Portisch1,2[0000-0001-5420-0663]

1 SAP SE, Walldorf, Germany
{daniel.tobias.kossack, niklas.borg, leon.knorr, jan.portisch}@sap.com
2 Data and Web Science Group, University of Mannheim, Germany
jan@informatik.uni-mannheim.de

Abstract. This paper presents the matching system TOM together with its results in the Ontology Alignment Evaluation Initiative 2021 (OAEI 2021). This is the first participation of TOM in the OAEI. Very recently, transformers achieved remarkable results in the natural language processing community on a variety of tasks. The TOM matching system exploits a zero-shot transformer-based language model to calculate a confidence for each candidate correspondence. The matcher uses the pre-trained transformer model paraphrase-TinyBERT-L6-v2.

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Keywords: Ontology Matching · Ontology Alignment · Language Models · Transformers

1 Presentation of the System

1.1 State, purpose, general statement

Transformers for Ontology Matching (TOM) is a matching system that uses a transformer-based language model to calculate a confidence for a pair of entities. The matcher is implemented as a pipeline of subsequent steps using pre-defined matching modules of the Matching EvaLuation Toolkit (MELT) [6], a framework for ontology matching and evaluation. In particular, the new transformer extension of MELT [7] is used in the implementation of this matcher. The matcher was implemented and packaged as a Docker image implementing the new MELT Web Interface (https://dwslab.github.io/melt/matcher-packaging/web#web-interface-http-matching-interface).

1.2 Specific Techniques Used

Transformer-based language models
Transformers are artificial neural networks with a specific architecture [13]. Especially in the domain of natural language processing, transformers achieved good results on a variety of tasks [13,4]. The tasks a transformer is capable of carrying out depend on its architecture and the task it was trained on. To perform well, transformers need to be trained on large amounts of data. Many researchers and organizations train their transformers on textual data from various domains. After training the model (so-called pre-training), the transformer models can be fine-tuned for specific tasks or domains. Typically, the fine-tuning process is computationally cheap compared to the initial pre-training; however, it requires task-specific training data.

The TOM research project explored the capabilities of such pre-trained models for ontology matching. TOM does not use fine-tuned models, but there is a fine-tuned version of TOM, named Fine-TOM (F-TOM) [9].

There are several Python libraries, such as the transformers library by Hugging Face (https://huggingface.co) [14] and the sentence-transformers library by the Ubiquitous Knowledge Processing Lab (https://www.sbert.net) [12], that provide access to a variety of pre-trained transformers. We evaluated the performance of several pre-trained models, inter alia, GPT-2 [11], BERT [4], and different Sentence-BERT models [12], on the Anatomy, Conference, and Knowledge Graph tracks. Based on this evaluation, we decided to use the Sentence-BERT model paraphrase-TinyBERT-L6-v2 [12] for our submitted matcher since it achieved the best results.
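To illustrate how such a pre-trained model is accessed, the following sketch loads paraphrase-TinyBERT-L6-v2 through the sentence-transformers library and compares two words mentioned later in this paper (heart and cardiac). This is a minimal, hypothetical usage example under our own assumptions and not the TOM code itself.

```python
from sentence_transformers import SentenceTransformer, util

# Load the pre-trained Sentence-BERT model used by TOM (zero-shot, no fine-tuning).
model = SentenceTransformer("paraphrase-TinyBERT-L6-v2")

# Encode two terms and compare their embeddings via cosine similarity.
embeddings = model.encode(["heart", "cardiac"])
similarity = util.cos_sim(embeddings[0], embeddings[1]).item()

# Semantically related terms receive a noticeably higher score than unrelated ones.
print(similarity)
```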
TOM Pipeline
TOM consists of five components that are arranged in a pipeline, shown in Figure 1. The arrows indicate how the ontologies and alignments are passed between the components. Each component can be conceptually regarded as a matcher; this design pattern is proposed by MELT. For chaining the components, the class MatcherYAAAJena was used. First, the ontologies are aligned with string matching methods. Since such correspondences are typically of high precision, the resulting alignment of this step is directly added to the final alignment. The transformer component in MELT is, by default, implemented as a filter. Therefore, candidates have to be generated which are then filtered by the transformer; in this case, a string overlap metric is used. The transformer component adds a confidence to each candidate. After the transformer matcher has calculated the confidence for each candidate pair, the alignment is filtered by a threshold. Based on an evaluation on different OAEI tracks, we decided to set the threshold to 0.8. Since the alignments in most OAEI datasets are typically one-to-one, we use an efficient implementation of the Hungarian method, known as Maximum Weight Bipartite Matching (MWBM) [3]. The motivation behind the matching components and their details are described below.

Fig. 1. High-level view of the TOM matching process. O1 and O2 represent the input ontologies; the final alignment is referred to as A.

String Matcher
The string matcher is used to find very obvious correspondences. To do that, it compares the inputs word by word and assigns a confidence of 1.0 to the obvious correspondences. Therefore, the resource-consuming transformer matcher does not have to evaluate those pairs of elements again. We use the default string matcher implementation of MELT (class SimpleStringMatcher).

Candidate Generator
The candidate generator is used to generate an input alignment for the transformer matcher. It creates a cross product of both ontologies, so that the alignment consists of all possible pairs of elements. For large ontologies, this alignment would be too large to be processed by the transformer matcher within an appropriate time frame. Therefore, the candidate generator excludes the correspondences that were already found by the string matcher as well as obvious non-correspondences. The latter are identified through simple string operations: the candidate generator splits the labels into single words and stores them in sets. The sets are then compared; if more than 50 percent of the words are equal, the correspondence is further processed by the transformer matcher. If fewer than 50 percent of the words are equal, no similarity needs to be calculated for this pair. Finally, the alignment is passed to the transformer matcher.
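The word-overlap rule of the candidate generator can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the whitespace tokenization, and the choice to relate the overlap to the shorter label are ours and not necessarily the exact MELT implementation.

```python
def is_transformer_candidate(label_1: str, label_2: str, min_overlap: float = 0.5) -> bool:
    """Keep a pair as a transformer candidate only if more than half
    of its words are shared (illustrative re-implementation)."""
    words_1 = set(label_1.lower().split())
    words_2 = set(label_2.lower().split())
    if not words_1 or not words_2:
        return False
    shared = words_1 & words_2
    # Assumption: the overlap ratio is taken relative to the shorter label.
    return len(shared) / min(len(words_1), len(words_2)) > min_overlap

# "great" and "vein" are shared, so this pair is forwarded to the transformer matcher.
print(is_transformer_candidate("great vein of heart", "Great Cardiac Vein"))  # True
# No shared words, so no transformer similarity is computed for this pair.
print(is_transformer_candidate("great vein of heart", "parietal cortex"))     # False
```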
Transformer Matcher
The transformer matcher iterates over the alignment and passes each pair of elements to the Sentence-BERT library [12]. For each pair, it receives a similarity value s ∈ [0, 1] which is used as the assigned confidence that this pair is a match. The transformer matcher does not change the size of the alignment but only adds confidences. To access a transformer via the Sentence-BERT library, the transformer matcher starts a Python server in the background, since the transformer libraries are implemented in Python while the matching pipeline is implemented in Java. The communication between the Java project and the Python server works via an HTTP Application Programming Interface (API), which is represented by the horizontal arrows in Figure 1. After the Python server has received the pairs of elements and calculated the similarity values with the cosine similarity function of scikit-learn [10] (https://github.com/scikit-learn/scikit-learn/blob/844b4be24/sklearn/metrics/pairwise.py#L1211), it returns the list of confidences to the transformer matcher class. The transformer matcher then writes these confidences into the current alignment.

Threshold Filter
The threshold filter takes the alignment and cuts off all correspondences with a low confidence. To do that, the filter uses a threshold between zero and one: all correspondences with a confidence lower than the threshold are excluded from the alignment. The submitted matcher uses a threshold t = 0.8, which yielded good results for all evaluated tracks. We use the default filter implementation of MELT (class ThresholdFilter).

Max Weight Bipartite Extractor
Up to this step in the pipeline, the alignment may still contain multiple correspondences for a single ontological element. Therefore, the max weight bipartite extractor converts the current state to a one-to-one alignment. We use the default class MaxWeightBipartiteExtractor of MELT.
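The combined effect of these two post-processing steps can be illustrated on a small confidence matrix. The sketch below uses SciPy's linear_sum_assignment as a stand-in for MELT's MaxWeightBipartiteExtractor; it is not the implementation used by TOM, and the confidence values are invented for the example. Only the threshold of 0.8 is taken from the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def threshold_and_extract(scores: np.ndarray, threshold: float = 0.8):
    """Threshold filtering followed by a one-to-one (maximum-weight) extraction.

    scores[i, j] is the confidence assigned to the pair
    (source entity i, target entity j).
    """
    # Threshold filter: correspondences below the threshold are discarded.
    filtered = np.where(scores >= threshold, scores, 0.0)

    # Maximum-weight bipartite matching; linear_sum_assignment minimizes cost,
    # so the confidences are negated.
    rows, cols = linear_sum_assignment(-filtered)

    # Keep only assignments that survived the threshold filter.
    return [(int(i), int(j), float(filtered[i, j]))
            for i, j in zip(rows, cols) if filtered[i, j] > 0.0]

# Invented example with three source and three target entities.
scores = np.array([
    [0.93, 0.40, 0.10],
    [0.35, 0.88, 0.82],
    [0.20, 0.85, 0.30],
])
print(threshold_and_extract(scores))  # [(0, 0, 0.93), (1, 2, 0.82), (2, 1, 0.85)]
```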
2 Results

This section discusses the results of TOM for the tracks of OAEI 2021 on which the matcher was able to produce results. These include the Anatomy [1], Conference [2,15], and Knowledge Graph tracks [8,5].

2.1 Anatomy Track

TOM achieved a higher F-measure than the OAEI Baseline (0.866 vs. 0.766). It is noticeable that TOM improved the recall (0.808 vs. 0.622), but its precision is lower than that of the OAEI Baseline (0.933 vs. 0.997). Hence, there are non-obvious matches that cannot be found with string-based matching but are found by TOM. Examples are the matches Parietal Lobe of the Brain & parietal cortex with a confidence of 0.8202 and great vein of heart & Great Cardiac Vein with a confidence of 0.8794. Those are found because of the ability of transformers to detect semantic similarities between two phrases or words such as heart and cardiac, even though they are spelled differently.

2.2 Conference Track

Also on the Conference track, TOM is able to find more correspondences than the OAEI Baseline: the recall is higher (0.48 vs. 0.41) and so is the F-measure (0.57 vs. 0.53), while the precision is somewhat lower (0.69 vs. 0.76).

3 General Comments

We thank the OAEI organizers for their support and commitment.

4 Conclusion

In this paper, we presented the TOM matching system and its results in the OAEI 2021. We can conclude that transformer-based language models are able to improve performance in the task and process of ontology matching. The developed matching system achieves an overall better F-measure than the baseline matchers and improves the recall. It is important to note that the highest possible recall is determined by the candidate generator's alignment: the transformer matcher works only with this alignment, so it cannot achieve a higher recall.

The research also showed that the presented architecture and the implementation in Java and Python are appropriate approaches to use transformers for ontology matching. Most of the matching components used are available via the MELT framework so that other developers can re-use them. The Docker packaging allows submitting any implementation without set-up efforts on the organizer side, which is also beneficial for matcher developers, who do not have to worry about the execution of their system. This is the first OAEI participation of TOM and the system can be greatly improved in the future, for example by using fine-tuned models or by improving the candidate generation pipeline. The reported results motivate further research in the area of transformer-based ontology matching.

References

1. Bodenreider, O., Hayamizu, T.F., Ringwald, M., de Coronado, S., Zhang, S.: Of mice and men: Aligning mouse and human anatomies. In: AMIA 2005, American Medical Informatics Association Annual Symposium, Washington, DC, USA, October 22-26, 2005. AMIA (2005), https://knowledge.amia.org/amia-55142-a2005a-1.613296/t-001-1.616182/f-001-1.616183/a-012-1.616655/a-013-1.616652

2. Cheatham, M., Hitzler, P.: Conference v2.0: An uncertain version of the OAEI conference benchmark. In: Mika, P., Tudorache, T., Bernstein, A., Welty, C., Knoblock, C.A., Vrandecic, D., Groth, P., Noy, N.F., Janowicz, K., Goble, C.A. (eds.) The Semantic Web - ISWC 2014 - 13th International Semantic Web Conference, Riva del Garda, Italy, October 19-23, 2014. Proceedings, Part II. Lecture Notes in Computer Science, vol. 8797, pp. 33–48. Springer (2014). https://doi.org/10.1007/978-3-319-11915-1_3

3. Cruz, I.F., Antonelli, F.P., Stroe, C.: Efficient selection of mappings and automatic quality-driven combination of matching methods. In: Shvaiko, P., Euzenat, J., Giunchiglia, F., Stuckenschmidt, H., Noy, N.F., Rosenthal, A. (eds.) Proceedings of the 4th International Workshop on Ontology Matching (OM-2009) collocated with the 8th International Semantic Web Conference (ISWC-2009), Chantilly, USA, October 25, 2009. CEUR Workshop Proceedings, vol. 551. CEUR-WS.org (2009), http://ceur-ws.org/Vol-551/om2009_Tpaper5.pdf

4. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. CoRR abs/1810.04805 (2018), http://arxiv.org/abs/1810.04805

5. Hertling, S., Paulheim, H.: The knowledge graph track at OAEI - gold standards, baselines, and the golden hammer bias. In: Harth, A., Kirrane, S., Ngomo, A.N., Paulheim, H., Rula, A., Gentile, A.L., Haase, P., Cochez, M. (eds.) The Semantic Web - 17th International Conference, ESWC 2020, Heraklion, Crete, Greece, May 31-June 4, 2020, Proceedings. Lecture Notes in Computer Science, vol. 12123, pp. 343–359. Springer (2020). https://doi.org/10.1007/978-3-030-49461-2_20

6. Hertling, S., Portisch, J., Paulheim, H.: MELT - matching evaluation toolkit. In: Acosta, M., Cudré-Mauroux, P., Maleshkova, M., Pellegrini, T., Sack, H., Sure-Vetter, Y. (eds.) Semantic Systems. The Power of AI and Knowledge Graphs - 15th International Conference, SEMANTiCS 2019, Karlsruhe, Germany, September 9-12, 2019, Proceedings. Lecture Notes in Computer Science, vol. 11702, pp. 231–245. Springer (2019). https://doi.org/10.1007/978-3-030-33220-4_17

7. Hertling, S., Portisch, J., Paulheim, H.: Matching with transformers in MELT. CoRR abs/2109.07401 (2021), https://arxiv.org/abs/2109.07401

8. Hofmann, A., Perchani, S., Portisch, J., Hertling, S., Paulheim, H.: DBkWik: Towards knowledge graph creation from thousands of wikis. In: Nikitina, N., Song, D., Fokoue, A., Haase, P. (eds.) Proceedings of the ISWC 2017 Posters & Demonstrations and Industry Tracks co-located with the 16th International Semantic Web Conference (ISWC 2017), Vienna, Austria, October 23-25, 2017. CEUR Workshop Proceedings, vol. 1963. CEUR-WS.org (2017), http://ceur-ws.org/Vol-1963/paper540.pdf
9. Knorr, L., Portisch, J.: Fine-TOM matcher results for OAEI 2021. In: OM@ISWC 2021 (2021), to appear

10. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830 (2011)

11. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2018), https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf

12. Reimers, N., Gurevych, I.: Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics (2019), http://arxiv.org/abs/1908.10084

13. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017), http://arxiv.org/abs/1706.03762

14. Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Brew, J.: HuggingFace's Transformers: State-of-the-art natural language processing. CoRR abs/1910.03771 (2019), http://arxiv.org/abs/1910.03771

15. Zamazal, O., Svátek, V.: The ten-year OntoFarm and its fertilization within the onto-sphere. J. Web Semant. 43, 46–53 (2017). https://doi.org/10.1016/j.websem.2017.01.001