GraphMatcher: A Graph Representation Learning Approach for Ontology Matching

Sefika Efeoglu
Free University of Berlin, Department of Computer Science, Takustraße 9, 14195 Berlin, Germany

Abstract
Ontology matching is defined as finding a relationship or correspondence between two or more entities in two or more ontologies. To solve the interoperability problem of domain ontologies, semantically similar entities in these ontologies must be found and aligned before merging them. GraphMatcher, developed in this study, is an ontology matching system that uses a graph attention approach to compute a higher-level representation of a class together with its surrounding terms. GraphMatcher obtained remarkable results in the Ontology Alignment Evaluation Initiative (OAEI) 2022 conference track. Its code is available at https://github.com/sefeoglu/gat_ontology_matching.

Keywords
graph attention, graph representation, ontology matching

1. Presentation of the system

GraphMatcher is a new ontology matching system based on graph representation learning: it leverages graph attention [1] and introduces a new neighbourhood aggregation algorithm that enriches the contextual information of the centre class or property.

1.1. Proposal and general statement

Ontology matching is the task of finding a relationship or correspondence between two or more entities in two or more independent ontologies. Alignments between two ontologies fall into two cases: (i) simple alignment and (ii) complex alignment [2]. Simple alignment maps class names according to word-based similarity, while complex alignment considers the meaning of two classes to decide whether they are similar [2]. To capture the meaning of a class or property, its contextual information is needed; in this case, we must decide which neighbours contribute to that contextual information.

Many logic- and algorithm-based ontology matching tools, such as LogMap [3] and AML [4], solve this interoperability problem of domain ontologies with algorithmic and logic-based approaches. In addition to these approaches, DeepAlignment [5], VeeAlign [6], and the convolutional neural networks of Bento et al. (2020) [7] apply machine learning (ML) to matching. Nevertheless, according to the conference track results of the OAEI (http://oaei.ontologymatching.org/), these ML approaches cannot outperform traditional tools such as AML and LogMap. The weakness of these ML approaches might be due to the lack of contextual information about the property and class. Another limitation lies in representing the ontology's data as a convolutional grid: in an image, each pixel has the same number of neighbouring pixels, whereas each class in an ontology has a varying number of neighbouring terms, like an arbitrary graph [1]. The most appropriate way of representing the data in an ontology is therefore the arbitrary graph.

Since the ontology represents its data as an arbitrary graph, we aim to develop a graph representation learning model based on a graph attention mechanism [1] over Siamese networks [7, 8, 6] to find semantically similar concepts within the ontologies. The graph attention mechanism computes a higher-level representation of a concept and its surrounding concepts (features). The model then computes similarity scores between concept pairs from the two ontologies being aligned and derives the concept alignments, as sketched below.
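To make this idea concrete before the detailed description, the following minimal PyTorch sketch shows a Siamese setup: one shared encoder embeds a concept together with its neighbourhood for each ontology, and cosine similarity scores the candidate alignment. It is illustrative only, not the released GraphMatcher code; the ConceptTower module, the 512/128 dimensions, and the mean-pooling are our assumptions, with mean-pooling standing in for the attention-based aggregation described in Section 1.2.

```python
# Illustrative sketch (not the paper's implementation): a shared "tower"
# encodes a concept with its neighbourhood context, and cosine similarity
# between the two towers' outputs scores a candidate alignment.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptTower(nn.Module):
    """Maps a concept embedding and its neighbour embeddings to one context vector."""
    def __init__(self, dim: int = 512, hidden: int = 128):
        super().__init__()
        self.proj = nn.Linear(dim, hidden)

    def forward(self, concept: torch.Tensor, neighbours: torch.Tensor) -> torch.Tensor:
        # concept: (dim,), neighbours: (n, dim); mean-pooling stands in for
        # the attention-based aggregation of Section 1.2.
        context = torch.cat([concept.unsqueeze(0), neighbours], dim=0).mean(dim=0)
        return self.proj(context)

tower = ConceptTower()  # shared weights: the Siamese property
src = tower(torch.randn(512), torch.randn(4, 512))  # concept from ontology A
tgt = tower(torch.randn(512), torch.randn(6, 512))  # concept from ontology B
score = F.cosine_similarity(src, tgt, dim=0)        # above a threshold -> alignment
print(float(score))
```

The essential design choice is the shared tower: because both concepts are encoded by the same weights, the cosine score is symmetric and comparable across all candidate pairs.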
1.2. Specific techniques used

GraphMatcher utilises a graph representation learning approach built on graph attention [1]. Its network consists of five layers; the main contribution is the adaptation of graph attention to the Siamese network in the third layer.

Figure 1: The proposed network is an application of heterogeneous graph attention on Siamese networks to find similar classes. The order of the layers in the network is similar to VeeAlign [6], since an extended version of this work also increased the performance of VeeAlign with our neighbourhood aggregation algorithm in "Efeoglu, S. (2021). A Deep Learning Approach for Domain-Specific Ontology Construction. University of Potsdam. [Master's Thesis]".

1.2.1. Preprocessing

Data preprocessing is one of the most significant parts of developing an ML model and is required to explain the variability of features in a sample. In this study, we handle the preprocessing of an ontology in six steps: (i) ontology parsing, (ii) tokenization, (iii) resolving abbreviations, (iv) removing stop words, (v) neighbourhood aggregation for creating the context, and (vi) computing the embeddings of the terms. A minimal sketch of steps (ii)-(iv) is given below.
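The following sketch illustrates the tokenization, abbreviation, and stop-word steps on ontology labels; the abbreviation table and stop-word list here are hypothetical stand-ins, not the actual resources used by GraphMatcher.

```python
# Illustrative sketch of label preprocessing (steps ii-iv); the abbreviation
# table and stop-word list are stand-ins, not the paper's actual resources.
import re

ABBREVIATIONS = {"PC": "program committee"}          # hypothetical example entry
STOP_WORDS = {"of", "the", "a", "an", "has", "is"}   # minimal stand-in list

def preprocess_label(label: str) -> list[str]:
    """Split a class/property label into clean lowercase tokens."""
    # Tokenize camelCase / snake_case names such as 'PaperAuthor' or 'has_author'.
    tokens = re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+",
                        label.replace("_", " "))
    # Expand known abbreviations, then lowercase and drop stop words.
    expanded = []
    for tok in tokens:
        expanded.extend(ABBREVIATIONS.get(tok, tok).split())
    return [t.lower() for t in expanded if t.lower() not in STOP_WORDS]

print(preprocess_label("has_PCMember"))  # -> ['program', 'committee', 'member']
```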
1.2.2. Embedding layer

We use the pre-trained Universal Sentence Encoder (USE) to obtain the embedding vectors of classes and properties. According to benchmark results [9], the USE outperforms the BERT encoder in semantic sentence encoding; therefore, we chose the USE for the embeddings.

1.2.3. Heterogeneous graph attention layer

The graph attention introduced in graph attention networks was originally used to cluster and classify citation graphs [1]. It provides both inductive and transductive learning approaches. In our system, the inductive learning approach computes contextual information by applying attention centred on the centre class of the neighbourhood graph. The centre class has five homogeneous graphs consisting of its neighbouring terms, and together these five graphs form one heterogeneous graph.

Figure 2: How graph attention is applied to a homogeneous graph. The relationship between the centre class and its neighbouring classes is the same within a homogeneous graph, such as 'subClassOf'. In this graph, the 'ObjectProperty' node represents the restrictions belonging to the definition of a child or parent class. Bag-of-words vectors represent the centre class and its neighbouring classes. $\vec{h}'_1$ shows the contextual information of $\vec{h}_1$ with respect to the 'subClassOf' relation after applying graph attention, and $a_{ij}$ denotes the $\alpha_{ij}$ of Equation 5.

The relation between the centre class and the other terms in each homogeneous graph is the same; for example, the relation between the centre node and the other terms in Figure 2 is 'subClassOf'. However, homogeneous graphs with 'equivalentClass' or 'subClassOf' relationships might also contain terms for restrictions defined with 'ObjectProperty' during neighbourhood aggregation. The figure illustrates how the centre class' contextual information is computed by graph attention in one of its homogeneous neighbourhood graphs. The main difference between the original graph attention approach and ours is that the attention mechanism is applied to a heterogeneous graph containing five homogeneous subgraphs.

The attention in this layer is computed as follows [1]. The set of input features (the centre class and its neighbouring terms) is denoted in Eq. 1, where $\vec{h}_i \in \mathbb{R}^F$:

$h = \{\vec{h}_1, \vec{h}_2, \ldots, \vec{h}_N\}$   (1)

The layer converts the input features into a new, higher-level feature representation defined in Eq. 2, where $\vec{h}'_i \in \mathbb{R}^{F'}$:

$h' = \{\vec{h}'_1, \vec{h}'_2, \ldots, \vec{h}'_N\}$   (2)

The new features are computed by Eq. 3, where $\sigma$ is a nonlinear activation function such as sigmoid or softmax, $W^k \in \mathbb{R}^{F' \times F}$ is a learnable weight matrix whose shared linear transformation is applied to every feature, and $K$ is the number of independent attention heads ($K = 5$ in our system):

$\vec{h}'_i = \big\Vert_{k=1}^{K} \, \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} \vec{h}_j\Big)$   (3)

A shared attention mechanism $a : \mathbb{R}^{F'} \times \mathbb{R}^{F'} \rightarrow \mathbb{R}$, a single layer using self-attention, computes the attention coefficients:

$e_{ij} = a(W\vec{h}_i, W\vec{h}_j)$   (4)

To obtain the $\alpha_{ij}$ used in Eq. 3, the coefficients are normalised as in Eq. 5, where $\vec{a} \in \mathbb{R}^{2F'}$ is the weight vector parameterising the attention mechanism and $\Vert$ denotes concatenation:

$\alpha_{ij} = \mathrm{softmax}_j(e_{ij}) = \dfrac{\exp\big(\mathrm{LeakyReLU}(\vec{a}^{\,T} [W\vec{h}_i \,\Vert\, W\vec{h}_j])\big)}{\sum_{k \in \mathcal{N}_i} \exp\big(\mathrm{LeakyReLU}(\vec{a}^{\,T} [W\vec{h}_i \,\Vert\, W\vec{h}_k])\big)}$   (5)

Applying Eq. 3 in this layer yields the higher-level representation of the class' neighbours, namely its contextual information. A sketch of this computation follows.
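The PyTorch sketch below implements one attention head of Eqs. 3-5 over a centre class and its neighbours, with five heads concatenated as in Eq. 3. It illustrates the mechanism rather than reproducing the released GraphMatcher code; the tensor shapes, module names, and the 0.2 LeakyReLU slope (the default from [1]) are our choices.

```python
# A minimal single-head sketch of Eqs. 3-5 (illustrative, not the released
# GraphMatcher implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionHead(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # shared linear map W
        self.a = nn.Linear(2 * out_dim, 1, bias=False)   # attention vector a in R^{2F'}

    def forward(self, h_centre: torch.Tensor, h_neigh: torch.Tensor) -> torch.Tensor:
        # h_centre: (F,) centre-class features; h_neigh: (N, F) neighbour features.
        wh_i = self.W(h_centre)            # (F',)
        wh_j = self.W(h_neigh)             # (N, F')
        # Eq. 4 with GAT's parameterisation: e_ij = LeakyReLU(a^T [Wh_i || Wh_j])
        pairs = torch.cat([wh_i.expand_as(wh_j), wh_j], dim=-1)   # (N, 2F')
        e = F.leaky_relu(self.a(pairs).squeeze(-1), negative_slope=0.2)
        alpha = torch.softmax(e, dim=0)    # Eq. 5: normalise over the neighbourhood
        # Eq. 3, one head: sigma(sum_j alpha_ij * W h_j)
        return torch.sigmoid(alpha.unsqueeze(-1).mul(wh_j).sum(dim=0))

# K = 5 independent heads, their outputs concatenated as in Eq. 3.
heads = nn.ModuleList(GraphAttentionHead(512, 64) for _ in range(5))
h_i, h_js = torch.randn(512), torch.randn(8, 512)
h_prime = torch.cat([head(h_i, h_js) for head in heads])  # (5 * 64,)
print(h_prime.shape)
```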
1.2.4. Output layer

The output layer performs down-sampling (dimensionality reduction) of the contextual information, which is the concatenation of the class embedding and the higher-level representations of the class' neighbours.

1.2.5. Cosine similarity layer

The cosine similarity layer measures the cosine similarity between the outputs of the previous layer for the two input concepts.

1.3. Adaptations made for the evaluation

The GraphMatcher framework is developed in Python with PyTorch and Ontospy, and is packaged for SEALS using MELT [10].

1.4. Parameter settings

The model's parameters are a learning rate of 0.01, 5 epochs, a weight decay of 0.001, and a batch size of 16. (These parameters do not give the optimum model; the best model has different parameters, but we mistakenly submitted the model with these parameters to the conference track challenge.) The decision threshold is computed from false positive alignments in the validation data, as the VeeAlign [6] system proposes; VeeAlign's approach is used directly with the permission of its first author. An illustrative sketch of such threshold selection follows.
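As an illustration of selecting a threshold from validation alignments, the sketch below picks the cosine cut-off that best separates true alignments from false positives on validation pairs. It is a simplified stand-in for VeeAlign's exact procedure, which we do not reproduce here; the F1-based selection rule is our assumption.

```python
# Illustrative threshold selection from validation scores (a stand-in for
# VeeAlign's procedure): choose the cut-off that maximises F1 on validation.
def select_threshold(scores: list[float], labels: list[int]) -> float:
    """Return the score threshold maximising F1 on the validation pairs."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(scores)):           # each observed score is a candidate
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t

# Example: cosine scores for validation concept pairs and their gold labels.
print(select_threshold([0.91, 0.84, 0.40, 0.77], [1, 1, 0, 0]))  # -> 0.84
```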
2. Results

The conference track consists of sixteen ontologies, but only seven of them have reference alignments (twenty-one reference alignment cases). These reference alignments are utilised as ground truth for the true positive alignments, while negative alignment cases are computed by oversampling from all possible class and property alignments. Since we apply a supervised machine learning approach, only these seven ontologies of the dataset can be used.

Table 1: GraphMatcher results in the conference track.

Evaluation   Precision   F0.5-measure   F1-measure   F2-measure   Recall
ra1-M1       0.82        0.77           0.71         0.65         0.62
ra1-M2       0.65        0.51           0.39         0.32         0.28
ra1-M3       0.80        0.74           0.67         0.60         0.57
ra2-M1       0.78        0.73           0.66         0.60         0.57
ra2-M2       0.65        0.50           0.39         0.32         0.28
ra2-M3       0.76        0.70           0.62         0.56         0.53
rar2-M1      0.77        0.73           0.67         0.62         0.59
rar2-M2      0.65        0.52           0.40         0.33         0.29
rar2-M3      0.75        0.70           0.63         0.58         0.55

Table 1 shows the GraphMatcher results in the conference track. GraphMatcher performs better on the M1 variants than on the M2 variants in terms of F1-measure; consequently, its F1-measure also decreases in the M3 variants, which combine class and property alignments. The model's weakness therefore lies in the M2 variants, namely property alignments.

3. General comments

3.1. Comments on the results

GraphMatcher is a new ontology matching system participating in OAEI 2022 and is evaluated in the conference track. It demonstrates remarkable performance in the M1 and M3 evaluation variants in terms of F1-measure, even though its performance in the M2 evaluation variant is not high; however, none of the other matchers shows remarkable results in the M2 variant either. GraphMatcher is also evaluated on the uncertain reference alignments of the OAEI 2022 conference track (http://oaei.ontologymatching.org/2022/results/conference/index.html). There it has the highest F1-measure among all matchers evaluated in this track, 72% under both the discrete and the continuous metrics. This means that GraphMatcher's confidence values are higher than those of the other matchers evaluated in the OAEI 2022 conference track.

Table 2: The matchers' performances on the rar2-M3 reference alignments.

Matcher        Precision   F0.5-measure   F1-measure   F2-measure   Recall
LogMap         0.76        0.71           0.64         0.59         0.56
GraphMatcher   0.75        0.70           0.63         0.58         0.55
SEBMatcher     0.79        0.70           0.60         0.52         0.48
ATMatcher      0.69        0.64           0.59         0.54         0.51
ALIN           0.82        0.70           0.57         0.48         0.44
LogMapLt       0.68        0.62           0.56         0.50         0.47
LSMatch        0.83        0.69           0.55         0.46         0.41
AMD            0.82        0.68           0.55         0.46         0.41
KGMatcher+     0.83        0.67           0.52         0.43         0.38
ALIOn          0.66        0.44           0.30         0.22         0.19
Matcha         0.37        0.20           0.12         0.08         0.07

3.2. Improvements

GraphMatcher should be improved for matching properties, since it does not perform well in the M2 evaluation variants. Its current version does not apply graph attention to align properties because properties, especially datatype properties, lack neighbours: object and datatype properties might not have enough neighbouring terms in the ontology to construct contextual information. In the next version, a property's context might be enriched with external information so that graph attention can also be applied to align properties. In addition, we will train the model with its optimum parameter settings in a future version.

4. Conclusion

In this study, we have introduced a new ontology matching system called GraphMatcher. GraphMatcher adapts graph attention to the homogeneous subgraphs of the centre class' neighbours to obtain contextual information about the centre class; the graph attention computes a higher-level representation of each class together with its surrounding classes and properties. The results demonstrate promising performance in the M1 and M3 evaluation variants. Future work will focus on increasing its performance in the M2 evaluation variants.

References

[1] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph attention networks, 2018. arXiv:1710.10903.
[2] É. Thiéblin, O. Haemmerlé, N. Hernandez, C. Trojahn, Survey on complex ontology matching, Semantic Web 11 (2020) 689-727.
[3] E. Jiménez-Ruiz, B. Cuenca Grau, LogMap: Logic-based and scalable ontology matching, in: L. Aroyo, C. Welty, H. Alani, J. Taylor, A. Bernstein, L. Kagal, N. Noy, E. Blomqvist (Eds.), The Semantic Web - ISWC 2011, Springer, Berlin, Heidelberg, 2011, pp. 273-288.
[4] D. Faria, C. Pesquita, E. Santos, M. Palmonari, I. F. Cruz, F. M. Couto, The AgreementMakerLight ontology matching system, in: R. Meersman, H. Panetto, T. Dillon, J. Eder, Z. Bellahsene, N. Ritter, P. De Leenheer, D. Dou (Eds.), On the Move to Meaningful Internet Systems: OTM 2013 Conferences, Springer, Berlin, Heidelberg, 2013, pp. 527-541.
[5] P. Kolyvakis, A. Kalousis, D. Kiritsis, DeepAlignment: Unsupervised ontology matching with refined word vectors, in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), ACL, 2018, pp. 787-798.
[6] V. Iyer, A. Agarwal, H. Kumar, VeeAlign: A supervised deep learning approach to ontology alignment, 2020.
[7] A. Bento, A. Zouaq, M. Gagnon, Ontology matching using convolutional neural networks, in: Proceedings of the 12th Language Resources and Evaluation Conference, ELRA, Marseille, France, 2020, pp. 5648-5653. URL: https://www.aclweb.org/anthology/2020.lrec-1.693.
[8] J. Chen, J. Gu, ADOL: a novel framework for automatic domain ontology learning, The Journal of Supercomputing (2021).
[9] H. Hassan, G. Sansonetti, F. Gasparetti, A. Micarelli, J. Beel, BERT, ELMo, USE and InferSent sentence encoders: The panacea for research-paper recommendation?, in: RecSys Workshops, 2019.
[10] S. Hertling, J. Portisch, H. Paulheim, MELT - Matching EvaLuation Toolkit, in: M. Acosta, P. Cudré-Mauroux, M. Maleshkova, T. Pellegrini, H. Sack, Y. Sure-Vetter (Eds.), Semantic Systems. The Power of AI and Knowledge Graphs, Springer International Publishing, Cham, 2019, pp. 231-245.