<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Athens, Greece
$ fernando.zhapacamacho@kaust.edu.sa (F. Zhapa-Camacho); robert.hoehndorf@kaust.edu.sa (R. Hoehndorf)
 https://ferzcam.github.io/ (F. Zhapa-Camacho); https://leechuck.de/ (R. Hoehndorf)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Evaluating Diferent Methods for Semantic Reasoning Over Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fernando Zhapa-Camacho</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Robert Hoehndorf</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computational Bioscience Research Center</institution>
          ,
          <addr-line>Computer</addr-line>
          ,
          <institution>Electrical &amp; Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology</institution>
          ,
          <addr-line>4700 KAUST, 23955 Thuwal</addr-line>
          ,
          <country country="SA">Saudi Arabia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Reasoning over knowledge bases such as Semantic Web ontologies enables the discovery of new facts from existing knowledge. Knowledge-enhanced machine learning has motivated the development of neuro-symbolic reasoners, which enable faster but approximate computation of new facts or entailments. Neuro-symbolic methods generate vector representations (embeddings) of entities in a knowledge base. We analyze some ontology embedding methods, by implementing them as neuro-symbolic reasoners and evaluating their predictive performance on the datasets and tasks provided by the Semantic Reasoning Evaluation Challenge 2023. We explore two types of embedding methods: graph-based and modeltheoretic. Regarding graph-based embeddings, we evaluated the impact of diferent combinations of graph representation of ontologies with knowledge graph embedding methods. For model-theoretic embeddings, which create models for theories, we evaluate the impact of using several models, enabling approximate semantic entailment.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;approximate entailment</kwd>
        <kwd>neuro-symbolic AI</kwd>
        <kwd>ontology embedding</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>• We evaluate the diference between graph-based and geometric model-based embedding
methods.
• We analyze the impact of total and injective graph-embedding methods on axiom
prediction.
• We analyze the efect of implementing multiple models to achieve approximate semantic
entailment.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <p>A Description Logic (DL) language consists of a signature Σ = ( C, R, I), where C is a set of
concept names, R is a set of role names and I is a set of individual names. We focus on two
DL languages: ℒ and ℰ ℒ. In the language ℒ, concept descriptions can be constructed
inductively; all concept names are concept descriptions. If  and  are concept descriptions
and  a role name, then  ⊓ ,  ⊔ , ¬, ∃., and ∀. are concept descriptions. If  and
 are concept descriptions,  and  individual names,  a role name, then  ⊑ , (, ), and
() are axioms in ℒ. Axioms  ⊑  are called general concept inclusions and when , 
are concept names, we will refer to them as subsumption axioms. Axioms () are called class
assertion axioms, and when  is a concept name, we will refer to them as membership axioms.
A set of axioms is an ℒ theory.</p>
      <p>An interpretation ℐ of an ℒ theory consists of an interpretation domain ∆ ℐ and an
interpretation function · ℐ . For every concept name  ∈ C, ℐ ⊆ ∆ ℐ ; individual name  ∈ I,
ℐ ∈ ∆ ℐ ; role name  ∈ R, ℐ ∈ ∆ ℐ × ∆ ℐ ; and, inductively:
⊥ℐ = ∅
⊤ℐ = Δℐ</p>
      <p>(¬)ℐ = Δℐ∖ℐ
( ⊓ )ℐ = ℐ ∩ ℐ ( ⊔ )ℐ = ℐ ∪ ℐ
(∃.)ℐ = {︁ ∈ Δℐ | ∃.((, ) ∈ ℐ ∧  ∈ ℐ)}︁
(∀.)ℐ = {︁ ∈ Δℐ | ∀.((, ) ∈ ℐ →  ∈ ℐ)}︁</p>
      <p>Given an interpretation ℐ, an axiom  ⊑  holds if and only if ℐ ⊆ ℐ . Similarly, axioms
() and (, ) hold if and only if ℐ ∈ ℐ and (ℐ , ℐ ) ∈ ℐ , respectively. If every axiom
holds under ℐ, then ℐ is a model for the theory.</p>
      <p>In the language ℰ ℒ, all axioms can be rewritten equivalently using four normal forms [9]:
 ⊑ ,  ⊓  ⊑ ,  ⊑ ∃.,
∃. ⊑ 
where , ,  are concept names and the concept ⊥ can only occur in the right side of normal
forms 1,3,4. An ℰ ℒ theory is a subset of a ℒ theory.</p>
      <p>←</p>
      <p>ℎ 
→ ℎ ←</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Graph-based embeddings</title>
        <p>A graph-based embedding method is a two-step process [10] (Figure 1). The first step is a graph
projection function from ontologies into graphs. The second step is performed by a Knowledge
Graph Embedding (KGE) method that maps nodes and edge labels to vectors in R.</p>
        <p>As graph projection methods we chose OWL2Vec* [2] and CatE [7]. CatE was designed to
encode the ℒ language and OWL2Vec* can encode more complex DL languages than ℒ.
OWL2Vec* defines a set of axiom patterns to generate graph edges. The set of patterns might
not cover all the axioms existing in the ontology and therefore the projection might not be a
total function. On the other hand, CatE represents concepts and axioms as diagrams using their
category-theoretical representation. The categorical representation ensures the projection of
any concept description and therefore it is a total function. In OWL2Vec*, subsumption axioms
 ⊑  are mapped to the triple (, , ) and membership axioms () are mapped
into triples (, , ). The CatE projection does not only generate edges from axioms
but also from complex concept descriptions. However, class assertion axioms are not originally
supported. To extend CatE to assertion axioms, we use the same representation that OWL2Vec*
utilizes.</p>
        <p>Once the graph is generated as a set of triples (ℎ, , ) where ℎ,  are nodes and  are edge
labels standing for relations, we use several KGE methods to embed graph nodes and relations
into R. KGE methods can be grouped, depending on their scoring method, into
distancebased (TransE [11], TransD [12]), similarity-based (DistMult [13]) and neural-network-based
(ConvE [14]). For CatE, we also used another KGE method that supports the encoding of
hierarchical structures (OrderE [15]). For example, in TransE, the scoring function is:
(ℎ, , ) = −‖ (ℎ) + () − ()‖
(1)
where () is the vector representation of .</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Model-theoretic embeddings</title>
        <p>Model-theoretic embeddings generate models for a DL theory. An axiom (i.e.,  ⊑  or
()) is semantically entailed by a DL theory  if it is true in all models of  . We used three
methods to generate model-theoretic embeddings: ELEmbeddings [3], ELBoxEmbeddings [4]
and Box2EL [5]. These methods are designed to generate models for theories in the ℰ ℒ language
and they difer by the geometric shapes used for generating models. ELEmbeddings represent
concept descriptions as -dimensional balls, and axioms  ⊑  are enforced by the loss
function:
⊑(, ) = max(0, ‖() − ()‖ + () − () −  )
(2)
where (· ) represent the center of a ball and (· ) is its radius. Equation 2 enforces the ball
representing  to be inside the ball representing , with margin parameter  .</p>
        <p>In ELBoxEmbeddings, concepts are represented as -dimensional boxes, and axioms  ⊑ 
are encoded in the loss function:
⊑(, ) = max(0, ‖() − ()‖ +   () −   () −  )
(3)
where   (· ) is the ofset of a box.</p>
        <p>Box2EL, which also utilizes boxes to encode concepts, implements a generic loss function for
axioms  ⊑ :
⊑(, ) =
{︃max(0, ‖() − ()‖ +   () −   () −  ) if  ̸= ⊥
max(0,   () + 1) otherwise
(4)
we call Equations 2, 3, 4 inclusion losses.</p>
        <p>Class assertion axioms () are formulated as {} ⊑ . Thus, to encode class assertions, we
could still use Equations 2, 3, 4. However, we use the following formulations:
()(, ) =  (() · () + ())
()(, ) =  (() · () + (  ()))
(5)
(6)
where Equation 5 is used in ELEmbeddings and Equation 6 is used in ELBoxEmbeddings and
Box2EL, where the ofset of a box is a vector in R. The formulations in Equations 5, 6 obtained
better performance and are more computationally eficient than using inclusion losses. We call
Equations5, 6 similarity losses.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Computing entailment of axioms</title>
        <p>In the case of graph-based generated embeddings, we formulated the entailment computation of
axioms  ⊑  and () as link prediction of edges (, , ) and (, , ),
as shown in Figure 1 (right to left direction). A scoring function (, , ) or
(, , ) is used to compute the plausibility of an edge belonging to the graph.
The scoring function is given by the KGE method.</p>
        <p>For model-theoretic methods, we use the inclusion and similarity losses as scoring functions
for subsumption and membership axioms, respectively. To enable approximate semantic
entailment, we generate multiple models and aggregate the scores produced by all models. We analyze
diferent aggregator functions such as minimum, maximum, arithmetic mean and median.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Experimental setup</title>
        <p>We evaluated five ontology embedding methods over the datasets provided for the SemREC2023
challenge: ORE1, ORE2, ORE3, OWL2Bench1, OWL2Bench2 and CaLiGraph10e4. Due to
computational requirements (memory and time required), we were unable to evaluate CaLiGraph10e5.</p>
        <p>For all methods, we used an embedding size of 256 and a batch size of 128. We applied cyclic
learning rate [16] with a maximum learning rate of 0.001 and minimum learning rate of 0.0001.</p>
        <p>For all the experiments we report Mean Reciprocal Rank (MRR), Mean Rank (MR), Median
Rank (MedR), Hits@{1, 3, 10, 100} and ROCAUC (AUC). The best results are presented in
boldface and the second best results are underlined. All the experiments were performed using
a NVIDIA TITAN X (12GB). Due to time constraints and the large number of methods and
datasets, we were unable to tune hyperparameters.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Implementation</title>
        <p>We relied on mOWL [8] for the implementation of the methods. mOWL is a Python library
that implements many ontology embedding methods and functionalities regarding neural
reasoning algorithms. For graph-based methods, we used mOWL functionalities such as the
OWL2Vec* graph projection. We used the original CatE implementation, which is not part of
mOWL yet but it wraps its functionalities. Furthermore, we used PyKEEN [17] implementations
of several knowledge graph embedding methods. To implement model-theoretic embedding
methods, we used mOWL implementations of ELEmbeddings, ELBoxEmbeddings and Box2EL.
Currently, mOWL only allows generating one model structure. Therefore, we wrapped mOWL
functionalities to enable the generation of multiple models with diferent aggregation strategies.</p>
        <p>All the source code and data is available at https://github.com/bio-ontology-research-group/
semrec2023.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <sec id="sec-4-1">
        <title>4.1. Graph-based vs. Model-theoretic embeddings (RQ1)</title>
        <p>OWL2Vec* 0.1868
CatE 0.1599
ELEm 0.1385
ELBox 0.1650
Box2EL 0.1657
OWL2Vec* 0.2440
CatE 0.0906
ELEm 0.1524
ELBox 0.1706
Box2EL 0.1631
OWL2Vec* 0.0686
CatE 0.0091
ELEm 0.0304
ELBox 0.0608
Box2EL 0.0369
OWL2Vec* 0.0665
CatE 0.0102
ELEm 0.0297
ELBox 0.0583
Box2EL 0.0357
OWL2Vec* 0.0770
CatE 0.0082
ELEm 0.0332
ELBox 0.0610
Box2EL 0.0352
OWL2Vec*
CatE
ELEm
ELBox
Box2EL
7.0
8.5
9.0
13.0
6.0</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Analysis of diferent graph-based methods (RQ2)</title>
        <p>In Table 3 we show the results of subsumption prediction, where OWL2Vec* dominates over
CatE. More specifically, the combination OWL2Vec*+DistMult is the best setting for all datasets.
On the other hand, when predicting membership axioms (Table 4), the results are
datasetdependent and for the smaller datasets such as OWL2Bench1 and OWL2Bench2 OWL2Vec*
performs better than CatE in most metrics. However, for larger datasets, CatE outperforms
OWL2Vec* in ROCAUC and Mean Rank, indicating can concentrate the predictions of positive
samples in high positions. Furthermore, except for ORE2, CatE also dominates Hits@k metrics,
and Median Rank, suggesting that CatE is more robust to outlier predictions in these datasets.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Approximate semantic entailment (RQ3)</title>
        <p>Semantic entailment is defined as a relation between models:  is semantically entailed by
 ,  |= , if and only if all models of  are also models of :  ( ) ⊆
 ({}). We
can approximate semantic entailment by generating finite subsets of  ( ) using
modelgenerating embedding methods. To evaluate the impact of applying approximate semantic
entailment, we evaluated model-theoretic methods with one model and compared those methods
with ten models. For both tasks, subsumption (Table 5) and membership (Table 6) prediction,
generating multiple models improves the predictive performance in all metrics. Furthermore,
box-based methods (ELBox [4] and Box2EL [5]) outperform ELEmbeddings [3], which is a
ball-based geometric method. When generating multiple models, one question to solve is how to
aggregate the predictions of all the models to generate a single score. Experiments suggest that
applying the arithmetic mean is the most stable option to obtain better performance. However,
in small datasets such as OWL2Bench1 and OWL2Bench2, other aggregator operations such as
maximum, minimum or median obtain the best performance. In theory, if  |= , the minimum
degree of truth over all models should be the appropriate aggregator as only this ensures that
 is true in all models  ( ) (or the subset that has been sampled). In practice, however,
the model-generating algorithms do not always generate accurate or “good” models where
statements that should be true are actually true; therefore, other aggregators achieve better
results as they can efectively ignore models that are degenerate.</p>
        <p>Method
OWL2Bench1
OWL2Bench2
ORE1
ORE2
ORE3</p>
        <p>OWL2Vec* + TransE 0.3263
OWL2Vec* + TransD 0.1868
OWL2Vec* + DistMult 0.5230
OWL2Vec* + ConvE 0.2812
CatE + TransE 0.2603
CatE + TransD 0.0714
CatE + DistMult 0.1491
CatE + ConvE 0.1029
CatE + OrderE 0.1599
OWL2Vec* + TransE 0.3187
OWL2Vec* + TransD 0.2440
OWL2Vec* + DistMult 0.4689
OWL2Vec* + ConvE 0.2623
CatE + TransE 0.1295
CatE + TransD 0.0335
CatE + DistMult 0.1899
CatE + ConvE 0.1008
CatE + OrderE 0.0906
0.0 0.6202 0.9356 0.9772
0.0 0.5913 0.8809 0.9693
2.0 0.0215 0.6141 0.9636
2.0 0.0377 0.7105 0.9698
0.0 0.5650 0.8046 0.9658
1.0 0.3530 0.7906 0.9750
2.0 0.0000 0.6583 0.8673
3.0 0.0307 0.4967 0.9715
2.0 0.2011 0.6746 0.8861
1.0 0.4724 0.9473 0.9838
0.0 0.6135 0.8756 0.9781
2.0 0.0165 0.6288 0.9675
2.0 0.0300 0.7259 0.9752
1.0 0.3351 0.7244 0.9497
3.0 0.0714 0.3626 0.9781
2.0 0.0000 0.6606 0.8954
3.0 0.0241 0.4855 0.9752
3.0 0.0929 0.3428 0.9094
ELEm 1
ELEm 10 (mean)
ELBox 1
ELBox 10 (max)
Box2EL 1
Box2EL 10 (min)</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>Results of membership prediction of diferent model-theoretic embedding methods.tab:subsumption
We have evaluated diferent ontology embedding methods using the datasets provided for
the SemREC challenge. Our experiments show that diferent kinds of methods (graph-based,
model-theoretic) have diferent properties when predicting subsumption or membership axioms.
Furthermore, for graph-based methods, we evaluated diferent graph projections with several
KGE embedding methods. We also analyzed the impact of generating multiple models for
model-theoretic methods and the diference between diferent aggregation strategies.
Results of membership prediction of diferent model-theoretic embedding methods.</p>
      <p>MedR</p>
      <p>bbaa199.
[1] M. Kulmanov, F. Z. Smaili, X. Gao, R. Hoehndorf, Semantic similarity and machine
learning with ontologies, Briefings in Bioinformatics 22 (2020). doi: 10.1093/bib/bbaa199,
[2] J. Chen, P. Hu, E. Jimenez-Ruiz, O. M. Holter, D. Antonyrajah, I. Horrocks,
OWL2Vec*: embedding of OWL ontologies, Machine Learning (2021). doi:10.1007/
s10994-021-05997-6.</p>
      <p>Intelligence, 2019.
[3] M. Kulmanov, W. Liu-Wei, Y. Yan, R. Hoehndorf, El embeddings: Geometric construction
of models for the description logic el ++, in: International Joint Conference on Artificial
[4] X. Peng, Z. Tang, M. Kulmanov, K. Niu, R. Hoehndorf, Description logic el++ embeddings
with intersectional closure, 2022. arXiv:2202.14018.
[5] M. Jackermeier, J. Chen, I. Horrocks, Box2el: Concept and role box embeddings for the
description logic el++, 2023. arXiv:2301.11118.
[6] Z. Tang, T. Hinnerichs, X. Peng, X. Zhang, R. Hoehndorf, FALCON: Faithful neural semantic
entailment over alc ontologies, 2023. arXiv:2208.07628.
[7] F. Zhapa-Camacho, R. Hoehndorf, Cate: Embedding alc ontologies using
categorytheoretical semantics, 2023. arXiv:2305.07163.
[8] F. Zhapa-Camacho, M. Kulmanov, R. Hoehndorf, mOWL: Python library for machine
learning with biomedical ontologies, Bioinformatics 39 (2022) btac811. URL: https://doi.
org/10.1093/bioinformatics/btac811. doi:10.1093/bioinformatics/btac811.
[9] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, P. F. Patel-Schneider (Eds.), The
Description Logic Handbook: Theory, Implementation, and Applications, Cambridge University
Press, 2003.
[10] F. Zhapa-Camacho, R. Hoehndorf, From axioms over graphs to vectors, and back again:
evaluating the properties of graph-based ontology embeddings, 2023. arXiv:2303.16519.
[11] A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, O. Yakhnenko, Translating embeddings
for modeling multi-relational data, in: Advances in Neural Information Processing Systems,
volume 26, Curran Associates, Inc., 2013.
[12] G. Ji, S. He, L. Xu, K. Liu, J. Zhao, Knowledge graph embedding via dynamic mapping
matrix, in: Annual Meeting of the Association for Computational Linguistics, 2015. URL:
https://api.semanticscholar.org/CorpusID:11202498.
[13] B. Yang, W. tau Yih, X. He, J. Gao, L. Deng, Embedding entities and relations for learning
and inference in knowledge bases, 2015. arXiv:1412.6575.
[14] T. Dettmers, P. Minervini, P. Stenetorp, S. Riedel, Convolutional 2d knowledge graph
embeddings, ArXiv abs/1707.01476 (2017). URL: https://api.semanticscholar.org/CorpusID:
4328400.
[15] I. Vendrov, R. Kiros, S. Fidler, R. Urtasun, Order-embeddings of images and language (2016).</p>
      <p>ArXiv:1511.06361 [cs].
[16] L. N. Smith, Cyclical learning rates for training neural networks, 2017.</p>
      <p>arXiv:1506.01186.
[17] M. Ali, M. Berrendorf, C. T. Hoyt, L. Vermue, S. Sharifzadeh, V. Tresp, J. Lehmann, PyKEEN
1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings, Journal
of Machine Learning Research 22 (2021) 1–6. URL: http://jmlr.org/papers/v22/20-825.html.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>