GraphMatcher: A Graph Representation Learning Approach for Ontology Matching

Sefika Efeoglu
Free University of Berlin, Department of Computer Science, Takustraße 9, 14195 Berlin, Germany

Abstract
Ontology matching is defined as finding a relationship or correspondence between two or more entities in two or more ontologies. To solve the interoperability problem of domain ontologies, semantically similar entities in these ontologies must be found and aligned before merging them. GraphMatcher, developed in this study, is an ontology matching system that uses a graph attention approach to compute a higher-level representation of a class together with its surrounding terms. GraphMatcher obtained remarkable results in the Ontology Alignment Evaluation Initiative (OAEI) 2022 conference track. Its code is available at https://github.com/sefeoglu/gat_ontology_matching.

Keywords
graph attention, graph representation, ontology matching

1. Presentation of the system

GraphMatcher is a new ontology matching system based on graph representation learning: it leverages graph attention [1] and introduces a new neighbourhood aggregation algorithm that enriches the contextual information of the centre class or property.

1.1. Proposal and general statement

Ontology matching is the task of finding a relationship or correspondence between two or more entities in two or more independent ontologies. Alignments between two ontologies fall into two cases: (i) simple alignment and (ii) complex alignment [2]. Simple alignment maps class names according to word-based similarity, while complex alignment considers the meaning of two classes to decide whether they are similar [2]. To capture the meaning of a class or property, its contextual information is needed; in this case, we must decide which neighbours contribute to that contextual information.

Many logic- and algorithm-based ontology matching tools, such as LogMap [3] and AML [4], solve this interoperability problem of domain ontologies with algorithmic and logic-based approaches. In addition to these approaches, DeepAlignment [5], VeeAlign [6], and the convolutional neural networks of Bento et al. (2020) [7] apply machine learning (ML) to matching. Nevertheless, according to the conference track results of the OAEI (http://oaei.ontologymatching.org/), these ML approaches cannot outperform traditional tools such as AML and LogMap. The weakness of these ML approaches might be due to the lack of contextual information about the property and class. Another limitation lies in representing the ontology's data as a convolutional grid: in an image, each pixel has the same number of neighbouring pixels, whereas each class in an ontology has a varying number of neighbouring terms, like an arbitrary graph [1]. The most appropriate way of representing the data in an ontology is therefore the arbitrary graph.

Since the ontology represents its data as an arbitrary graph, we aim to develop a graph representation learning model based on a graph attention mechanism [1] over Siamese networks [7, 8, 6] to find semantically similar concepts within the ontologies. The graph attention mechanism computes a higher-level representation of a concept and its surrounding concepts (features). The model then computes similarity scores between concept pairs from the two ontologies being aligned and derives the concept alignments, as sketched below.
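To make this idea concrete before the detailed description, the following minimal PyTorch sketch shows a Siamese setup: one shared encoder embeds a concept together with its neighbourhood for each ontology, and cosine similarity scores the candidate alignment. It is illustrative only, not the released GraphMatcher code; the ConceptTower module, the 512/128 dimensions, and the mean-pooling are our assumptions, with mean-pooling standing in for the attention-based aggregation described in Section 1.2.

```python
# Illustrative sketch (not the paper's implementation): a shared "tower"
# encodes a concept with its neighbourhood context, and cosine similarity
# between the two towers' outputs scores a candidate alignment.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptTower(nn.Module):
    """Maps a concept embedding and its neighbour embeddings to one context vector."""
    def __init__(self, dim: int = 512, hidden: int = 128):
        super().__init__()
        self.proj = nn.Linear(dim, hidden)

    def forward(self, concept: torch.Tensor, neighbours: torch.Tensor) -> torch.Tensor:
        # concept: (dim,), neighbours: (n, dim); mean-pooling stands in for
        # the attention-based aggregation of Section 1.2.
        context = torch.cat([concept.unsqueeze(0), neighbours], dim=0).mean(dim=0)
        return self.proj(context)

tower = ConceptTower()  # shared weights: the Siamese property
src = tower(torch.randn(512), torch.randn(4, 512))  # concept from ontology A
tgt = tower(torch.randn(512), torch.randn(6, 512))  # concept from ontology B
score = F.cosine_similarity(src, tgt, dim=0)        # above a threshold -> alignment
print(float(score))
```

The essential design choice is the shared tower: because both concepts are encoded by the same weights, the cosine score is symmetric and comparable across all candidate pairs.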
1.2. Specific techniques used

GraphMatcher utilises a graph representation learning approach built on graph attention [1]. Its network consists of five layers; the main contribution is the adaptation of graph attention to the Siamese network in the third layer.

Figure 1: The proposed network is an application of heterogeneous graph attention on Siamese networks to find similar classes. The order of the layers in the network is similar to VeeAlign [6], since an extended version of this work also increased the performance of VeeAlign with our neighbourhood aggregation algorithm in "Efeoglu, S. (2021). A Deep Learning Approach for Domain-Specific Ontology Construction. University of Potsdam. [Master's Thesis]".

1.2.1. Preprocessing

Data preprocessing is one of the most significant parts of developing an ML model and is required to explain the variability of features in a sample. In this study, we handle the preprocessing of an ontology in six steps: (i) ontology parsing, (ii) tokenization, (iii) resolving abbreviations, (iv) removing stop words, (v) neighbourhood aggregation for creating the context, and (vi) computing the embeddings of the terms. A minimal sketch of steps (ii)-(iv) is given below.
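The following sketch illustrates the tokenization, abbreviation, and stop-word steps on ontology labels; the abbreviation table and stop-word list here are hypothetical stand-ins, not the actual resources used by GraphMatcher.

```python
# Illustrative sketch of label preprocessing (steps ii-iv); the abbreviation
# table and stop-word list are stand-ins, not the paper's actual resources.
import re

ABBREVIATIONS = {"PC": "program committee"}          # hypothetical example entry
STOP_WORDS = {"of", "the", "a", "an", "has", "is"}   # minimal stand-in list

def preprocess_label(label: str) -> list[str]:
    """Split a class/property label into clean lowercase tokens."""
    # Tokenize camelCase / snake_case names such as 'PaperAuthor' or 'has_author'.
    tokens = re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+",
                        label.replace("_", " "))
    # Expand known abbreviations, then lowercase and drop stop words.
    expanded = []
    for tok in tokens:
        expanded.extend(ABBREVIATIONS.get(tok, tok).split())
    return [t.lower() for t in expanded if t.lower() not in STOP_WORDS]

print(preprocess_label("has_PCMember"))  # -> ['program', 'committee', 'member']
```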
1.2.2. Embedding layer

We use the pre-trained Universal Sentence Encoder (USE) to obtain the embedding vectors of classes and properties. According to benchmark results [9], the USE outperforms the BERT encoder in semantic sentence encoding; therefore, we chose the USE for the embeddings.

1.2.3. Heterogeneous graph attention layer

The graph attention introduced in graph attention networks was originally used to cluster and classify citation graphs [1]. It provides both inductive and transductive learning approaches. In our system, the inductive learning approach computes contextual information by applying attention centred on the centre class of the neighbourhood graph. The centre class has five homogeneous graphs consisting of its neighbouring terms, and together these five graphs form one heterogeneous graph.

Figure 2: How graph attention is applied to a homogeneous graph. The relationship between the centre class and its neighbouring classes is the same within a homogeneous graph, such as 'subClassOf'. In this graph, the 'ObjectProperty' node represents the restrictions belonging to the definition of a child or parent class. Bag-of-words vectors represent the centre class and its neighbouring classes. $\vec{h}'_1$ shows the contextual information of $\vec{h}_1$ with respect to the 'subClassOf' relation after applying graph attention, and $a_{ij}$ denotes the $\alpha_{ij}$ of Equation 5.

The relation between the centre class and the other terms in each homogeneous graph is the same; for example, the relation between the centre node and the other terms in Figure 2 is 'subClassOf'. However, homogeneous graphs with 'equivalentClass' or 'subClassOf' relationships might also contain terms for restrictions defined with 'ObjectProperty' during neighbourhood aggregation. The figure illustrates how the centre class' contextual information is computed by graph attention in one of its homogeneous neighbourhood graphs. The main difference between the original graph attention approach and ours is that the attention mechanism is applied to a heterogeneous graph containing five homogeneous subgraphs.

The attention in this layer is computed as follows [1]. The set of input features (the centre class and its neighbouring terms) is denoted in Eq. 1, where $\vec{h}_i \in \mathbb{R}^F$:

$h = \{\vec{h}_1, \vec{h}_2, \ldots, \vec{h}_N\}$   (1)

The layer converts the input features into a new, higher-level feature representation defined in Eq. 2, where $\vec{h}'_i \in \mathbb{R}^{F'}$:

$h' = \{\vec{h}'_1, \vec{h}'_2, \ldots, \vec{h}'_N\}$   (2)

The new features are computed by Eq. 3, where $\sigma$ is a nonlinear activation function such as sigmoid or softmax, $W^k \in \mathbb{R}^{F' \times F}$ is a learnable weight matrix whose shared linear transformation is applied to every feature, and $K$ is the number of independent attention heads ($K = 5$ in our system):

$\vec{h}'_i = \big\Vert_{k=1}^{K} \, \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} \vec{h}_j\Big)$   (3)

A shared attention mechanism $a : \mathbb{R}^{F'} \times \mathbb{R}^{F'} \rightarrow \mathbb{R}$, a single layer using self-attention, computes the attention coefficients:

$e_{ij} = a(W\vec{h}_i, W\vec{h}_j)$   (4)

To obtain the $\alpha_{ij}$ used in Eq. 3, the coefficients are normalised as in Eq. 5, where $\vec{a} \in \mathbb{R}^{2F'}$ is the weight vector parameterising the attention mechanism and $\Vert$ denotes concatenation:

$\alpha_{ij} = \mathrm{softmax}_j(e_{ij}) = \dfrac{\exp\big(\mathrm{LeakyReLU}(\vec{a}^{\,T} [W\vec{h}_i \,\Vert\, W\vec{h}_j])\big)}{\sum_{k \in \mathcal{N}_i} \exp\big(\mathrm{LeakyReLU}(\vec{a}^{\,T} [W\vec{h}_i \,\Vert\, W\vec{h}_k])\big)}$   (5)

Applying Eq. 3 in this layer yields the higher-level representation of the class' neighbours, namely its contextual information. A sketch of this computation follows.
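The PyTorch sketch below implements one attention head of Eqs. 3-5 over a centre class and its neighbours, with five heads concatenated as in Eq. 3. It illustrates the mechanism rather than reproducing the released GraphMatcher code; the tensor shapes, module names, and the 0.2 LeakyReLU slope (the default from [1]) are our choices.

```python
# A minimal single-head sketch of Eqs. 3-5 (illustrative, not the released
# GraphMatcher implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionHead(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # shared linear map W
        self.a = nn.Linear(2 * out_dim, 1, bias=False)   # attention vector a in R^{2F'}

    def forward(self, h_centre: torch.Tensor, h_neigh: torch.Tensor) -> torch.Tensor:
        # h_centre: (F,) centre-class features; h_neigh: (N, F) neighbour features.
        wh_i = self.W(h_centre)            # (F',)
        wh_j = self.W(h_neigh)             # (N, F')
        # Eq. 4 with GAT's parameterisation: e_ij = LeakyReLU(a^T [Wh_i || Wh_j])
        pairs = torch.cat([wh_i.expand_as(wh_j), wh_j], dim=-1)   # (N, 2F')
        e = F.leaky_relu(self.a(pairs).squeeze(-1), negative_slope=0.2)
        alpha = torch.softmax(e, dim=0)    # Eq. 5: normalise over the neighbourhood
        # Eq. 3, one head: sigma(sum_j alpha_ij * W h_j)
        return torch.sigmoid(alpha.unsqueeze(-1).mul(wh_j).sum(dim=0))

# K = 5 independent heads, their outputs concatenated as in Eq. 3.
heads = nn.ModuleList(GraphAttentionHead(512, 64) for _ in range(5))
h_i, h_js = torch.randn(512), torch.randn(8, 512)
h_prime = torch.cat([head(h_i, h_js) for head in heads])  # (5 * 64,)
print(h_prime.shape)
```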
1.2.4. Output layer

The output layer performs down-sampling (dimensionality reduction) of the contextual information, which is the concatenation of the class embedding and the higher-level representations of the class' neighbours.

1.2.5. Cosine similarity layer

The cosine similarity layer measures the cosine similarity between the outputs of the previous layer for the two input concepts.

1.3. Adaptations made for the evaluation

The GraphMatcher framework is developed in Python with PyTorch and Ontospy, and is packaged for SEALS using MELT [10].

1.4. Parameter settings

The model's parameters are a learning rate of 0.01, 5 epochs, a weight decay of 0.001, and a batch size of 16. (These parameters do not give the optimum model; the best model has different parameters, but we mistakenly submitted the model with these parameters to the conference track challenge.) The decision threshold is computed from false positive alignments in the validation data, as the VeeAlign [6] system proposes; VeeAlign's approach is used directly with the permission of its first author. An illustrative sketch of such threshold selection follows.
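As an illustration of selecting a threshold from validation alignments, the sketch below picks the cosine cut-off that best separates true alignments from false positives on validation pairs. It is a simplified stand-in for VeeAlign's exact procedure, which we do not reproduce here; the F1-based selection rule is our assumption.

```python
# Illustrative threshold selection from validation scores (a stand-in for
# VeeAlign's procedure): choose the cut-off that maximises F1 on validation.
def select_threshold(scores: list[float], labels: list[int]) -> float:
    """Return the score threshold maximising F1 on the validation pairs."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(scores)):           # each observed score is a candidate
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t

# Example: cosine scores for validation concept pairs and their gold labels.
print(select_threshold([0.91, 0.84, 0.40, 0.77], [1, 1, 0, 0]))  # -> 0.84
```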
2. Results

The conference track consists of sixteen ontologies, but only seven of them have reference alignments (twenty-one reference alignment cases). These reference alignments are utilised as ground truth for the true positive alignments, while negative alignment cases are computed by oversampling from all possible class and property alignments. Since we apply a supervised machine learning approach, only these seven ontologies of the dataset can be used.

Table 1: GraphMatcher results in the conference track.

Evaluation   Precision   F0.5-measure   F1-measure   F2-measure   Recall
ra1-M1       0.82        0.77           0.71         0.65         0.62
ra1-M2       0.65        0.51           0.39         0.32         0.28
ra1-M3       0.80        0.74           0.67         0.60         0.57
ra2-M1       0.78        0.73           0.66         0.60         0.57
ra2-M2       0.65        0.50           0.39         0.32         0.28
ra2-M3       0.76        0.70           0.62         0.56         0.53
rar2-M1      0.77        0.73           0.67         0.62         0.59
rar2-M2      0.65        0.52           0.40         0.33         0.29
rar2-M3      0.75        0.70           0.63         0.58         0.55

Table 1 shows the GraphMatcher results in the conference track. GraphMatcher performs better on the M1 variants than on the M2 variants in terms of F1-measure; consequently, its F1-measure also decreases in the M3 variants, which combine class and property alignments. The model's weakness therefore lies in the M2 variants, namely property alignments.

3. General comments

3.1. Comments on the results

GraphMatcher is a new ontology matching system participating in OAEI 2022 and is evaluated in the conference track. It demonstrates remarkable performance in the M1 and M3 evaluation variants in terms of F1-measure, even though its performance in the M2 evaluation variant is not high; however, none of the other matchers shows remarkable results in the M2 variant either. GraphMatcher is also evaluated on the uncertain reference alignments of the OAEI 2022 conference track (http://oaei.ontologymatching.org/2022/results/conference/index.html). There it has the highest F1-measure among all matchers evaluated in this track, 72% under both the discrete and the continuous metrics. This means that GraphMatcher's confidence values are higher than those of the other matchers evaluated in the OAEI 2022 conference track.

Table 2: The matchers' performances on the rar2-M3 reference alignments.

Matcher        Precision   F0.5-measure   F1-measure   F2-measure   Recall
LogMap         0.76        0.71           0.64         0.59         0.56
GraphMatcher   0.75        0.70           0.63         0.58         0.55
SEBMatcher     0.79        0.70           0.60         0.52         0.48
ATMatcher      0.69        0.64           0.59         0.54         0.51
ALIN           0.82        0.70           0.57         0.48         0.44
LogMapLt       0.68        0.62           0.56         0.50         0.47
LSMatch        0.83        0.69           0.55         0.46         0.41
AMD            0.82        0.68           0.55         0.46         0.41
KGMatcher+     0.83        0.67           0.52         0.43         0.38
ALIOn          0.66        0.44           0.30         0.22         0.19
Matcha         0.37        0.20           0.12         0.08         0.07

3.2. Improvements

GraphMatcher should be improved for matching properties, since it does not perform well in the M2 evaluation variants. Its current version does not apply graph attention to align properties because properties, especially datatype properties, lack neighbours: object and datatype properties might not have enough neighbouring terms in the ontology to construct contextual information. In the next version, a property's context might be enriched with external information so that graph attention can also be applied to align properties. In addition, we will train the model with its optimum parameter settings in a future version.

4. Conclusion

In this study, we have introduced a new ontology matching system called GraphMatcher. GraphMatcher adapts graph attention to the homogeneous subgraphs of the centre class' neighbours to obtain contextual information about the centre class; the graph attention computes a higher-level representation of each class together with its surrounding classes and properties. The results demonstrate promising performance in the M1 and M3 evaluation variants. Future work will focus on increasing its performance in the M2 evaluation variants.

References

[1] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph attention networks, 2018. arXiv:1710.10903.
[2] É. Thiéblin, O. Haemmerlé, N. Hernandez, C. Trojahn, Survey on complex ontology matching, Semantic Web 11 (2020) 689-727.
[3] E. Jiménez-Ruiz, B. Cuenca Grau, LogMap: Logic-based and scalable ontology matching, in: L. Aroyo, C. Welty, H. Alani, J. Taylor, A. Bernstein, L. Kagal, N. Noy, E. Blomqvist (Eds.), The Semantic Web - ISWC 2011, Springer, Berlin, Heidelberg, 2011, pp. 273-288.
[4] D. Faria, C. Pesquita, E. Santos, M. Palmonari, I. F. Cruz, F. M. Couto, The AgreementMakerLight ontology matching system, in: R. Meersman, H. Panetto, T. Dillon, J. Eder, Z. Bellahsene, N. Ritter, P. De Leenheer, D. Dou (Eds.), On the Move to Meaningful Internet Systems: OTM 2013 Conferences, Springer, Berlin, Heidelberg, 2013, pp. 527-541.
[5] P. Kolyvakis, A. Kalousis, D. Kiritsis, DeepAlignment: Unsupervised ontology matching with refined word vectors, in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), ACL, 2018, pp. 787-798.
[6] V. Iyer, A. Agarwal, H. Kumar, VeeAlign: A supervised deep learning approach to ontology alignment, 2020.
[7] A. Bento, A. Zouaq, M. Gagnon, Ontology matching using convolutional neural networks, in: Proceedings of the 12th Language Resources and Evaluation Conference, ELRA, Marseille, France, 2020, pp. 5648-5653. URL: https://www.aclweb.org/anthology/2020.lrec-1.693.
[8] J. Chen, J. Gu, ADOL: a novel framework for automatic domain ontology learning, The Journal of Supercomputing (2021).
[9] H. Hassan, G. Sansonetti, F. Gasparetti, A. Micarelli, J. Beel, BERT, ELMo, USE and InferSent sentence encoders: The panacea for research-paper recommendation?, in: RecSys Workshops, 2019.
[10] S. Hertling, J. Portisch, H. Paulheim, MELT - Matching EvaLuation Toolkit, in: M. Acosta, P. Cudré-Mauroux, M. Maleshkova, T. Pellegrini, H. Sack, Y. Sure-Vetter (Eds.), Semantic Systems. The Power of AI and Knowledge Graphs, Springer International Publishing, Cham, 2019, pp. 231-245.