=Paper=
{{Paper
|id=Vol-3432/paper46
|storemode=property
|title=Neural Class Expression Synthesis
|pdfUrl=https://ceur-ws.org/Vol-3432/paper46.pdf
|volume=Vol-3432
|authors=N'Dah Jean Kouagou,Stefan Heindorf,Caglar Demir,Axel-Cyrille Ngonga Ngomo
|dblpUrl=https://dblp.org/rec/conf/nesy/KouagouHDN23
}}
==Neural Class Expression Synthesis==
N’Dah Jean Kouagou, Stefan Heindorf, Caglar Demir and Axel-Cyrille Ngonga Ngomo
Department of Computer Science, Paderborn University, Warburger Str. 100, 33098 Paderborn, Germany
Abstract
Many applications require explainable node classification in knowledge graphs. Towards this end, a
popular “white-box” approach is class expression learning: Given sets of positive and negative nodes, class
expressions in description logics are learned that separate the positive from the negative nodes. Most
existing approaches are search-based: they generate many candidate class expressions and select the
best one. However, they often take a long time to find suitable class expressions. In this paper, we cast
class expression learning as a translation problem and propose a new family of class expression learning
approaches which we dub neural class expression synthesizers. Training examples are “translated” into
class expressions in a fashion akin to machine translation. Consequently, our synthesizers are not
subject to the runtime limitations of search-based approaches. We study three instances of this novel
family of approaches based on LSTMs, GRUs, and set transformers (STs), respectively. An evaluation of
our approach on four benchmark datasets suggests that it can effectively synthesize high-quality class
expressions with respect to the input examples in approximately one second on average. Moreover, a
comparison to state-of-the-art approaches suggests that we achieve better F-measures on large datasets.
For reproducibility purposes, we provide our implementation as well as pretrained models in our public
GitHub repository at https://github.com/dice-group/NeuralClassExpressionSynthesis
Keywords
Neural network, Concept learning, Class expression learning, Learning from examples, NCES
Class expression learning (CEL) approaches learn a class expression that describes the individuals
provided as positive examples while excluding those provided as negative examples. They are applied in a wide range of domains, including ontology
engineering, bio-medicine, and Industry 4.0. Several methods have been proposed to address
CEL. The state of the art consists of approaches based on refinement operators [1, 2], and
evolutionary algorithms [3]. However, the majority of these approaches suffer from scalability
issues because they explore an infinite space of concepts for each learning problem. We propose
a new family of self-supervised neuro-symbolic approaches dubbed neural class expression
synthesis (NCES) for CEL. NCES instances [4] view CEL as a machine translation
problem: they translate from the language of example embeddings into that of description logics
(or any other logic, for that matter).
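To illustrate the task, consider the following toy learning problem over a hypothetical family knowledge base; the individuals and the solution are made up for illustration and are not taken from our benchmarks:
<pre>
Positive examples:  {anna, maria}            (the mothers)
Negative examples:  {hans, peter, julia}     (fathers and a childless person)
Target class expression:  Female ⊓ ∃hasChild.⊤
</pre>
A synthesizer receives the two example sets and directly outputs the class expression as a token sequence, instead of searching through candidate expressions.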
Overall, neural class expression synthesizers work as follows: First, a given knowledge base
is converted into a set of triples (s, p, o) and then embedded into a continuous vector space such
as ℝ^d. Next, learning problems are generated automatically from the input knowledge base
using a refinement operator and an instance checker. Finally, the synthesizers are trained to
translate the embeddings of positive/negative examples into the corresponding class expressions
in the training data. If necessary, the second and third steps can be iterated until a stopping
criterion (e.g., convergence) is fulfilled.
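To make the third step concrete, below is a minimal sketch of such a synthesizer in PyTorch. Everything here is a simplifying assumption for illustration: a toy vocabulary, a trainable lookup table standing in for pretrained knowledge-graph embeddings, mean pooling as the permutation-invariant set summary, and a single LSTM decoder. It is not the implementation from our repository.
<syntaxhighlight lang="python">
# Sketch: train a synthesizer to map example sets to class-expression tokens.
import torch
import torch.nn as nn

VOCAB = ["<pad>", "<start>", "<end>", "Female", "⊓", "∃", "hasChild", ".", "⊤"]
PAD, START, END = 0, 1, 2

class Synthesizer(nn.Module):
    def __init__(self, n_individuals: int, d: int = 32):
        super().__init__()
        self.emb = nn.Embedding(n_individuals, d)   # stand-in for KG embeddings
        self.tok = nn.Embedding(len(VOCAB), d)      # decoder token embeddings
        self.proj = nn.Linear(2 * d, d)             # fuse positive/negative summaries
        self.lstm = nn.LSTM(d, d, batch_first=True)
        self.out = nn.Linear(d, len(VOCAB))

    def forward(self, pos, neg, target):
        # Permutation-invariant summary of each example set via mean pooling.
        summary = torch.cat([self.emb(pos).mean(1), self.emb(neg).mean(1)], dim=-1)
        h0 = torch.tanh(self.proj(summary)).unsqueeze(0)          # (1, batch, d)
        # Teacher forcing: feed the gold expression shifted by one position.
        dec, _ = self.lstm(self.tok(target[:, :-1]), (h0, torch.zeros_like(h0)))
        return self.out(dec)                                      # logits over VOCAB

model = Synthesizer(n_individuals=100)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)

# One toy training problem: random example ids, gold = Female ⊓ ∃hasChild.⊤
pos = torch.randint(0, 100, (1, 5))
neg = torch.randint(0, 100, (1, 5))
gold = torch.tensor([[START] + [VOCAB.index(t) for t in
                     ["Female", "⊓", "∃", "hasChild", ".", "⊤"]] + [END]])

for _ in range(100):
    opt.zero_grad()
    logits = model(pos, neg, gold)
    loss = loss_fn(logits.reshape(-1, len(VOCAB)), gold[:, 1:].reshape(-1))
    loss.backward()
    opt.step()
</syntaxhighlight>
At inference time, the decoder would instead generate tokens greedily from <start> until <end>, so the runtime is a single forward pass rather than a search.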
Table 1: Evaluation results per approach and dataset. The star (*) indicates a statistically significant difference (Wilcoxon test) between NCES and the best search-based approach. NCES uses the ensemble of GRU, LSTM, and ST.
{| class="wikitable"
|-
! rowspan="2" | Approach !! colspan="4" | F1 (%) ↑ !! colspan="4" | Runtime (sec.) ↓
|-
! Carcinogen. !! Mutagenesis !! Sem. Bible !! Vicodi !! Carcinogen. !! Mutagenesis !! Sem. Bible !! Vicodi
|-
| CELOE || 37.92±44.25 || 82.95±33.48 || 93.18±17.52* || 35.66±42.06 || 239.58±132.59 || 92.46±125.69 || 135.30±139.95 || 289.95±103.63
|-
| EvoLearner || 91.48±14.30 || 93.27±12.95 || 91.88±10.14 || 92.74±10.28 || 54.73±25.86 || 48.00±31.38 || 17.16±9.20 || 213.78±81.03
|-
| NCES || 97.06±13.06* || 91.39±22.91 || 87.11±24.05 || 95.51±12.14* || 0.27±0.00* || 0.31±0.00* || 0.15±0.00* || 0.15±0.00*
|}
We compared NCES to state-of-the-art algorithms for CEL, including CELOE [1] and
EvoLearner [3], on the popular benchmarks Carcinogenesis, Mutagenesis, Semantic Bible, and
Vicodi. We measured the performance of each approach in terms of runtime and solution quality,
i.e., how many positive/negative examples are covered/ruled out by the computed solution
(F1-measure; a sketch of this computation follows below). Table 1 gives the results of our
experiments on a total of 380 learning problems. These results suggest that, after training,
NCES instances are over 300 times faster on average than search-based approaches. Moreover,
they perform particularly well on the largest datasets, Carcinogenesis and Vicodi, with up to
5.5% absolute improvement in F-measure.
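For concreteness, the quality criterion can be computed as follows; this is a minimal sketch, and the example sets are made up for illustration:
<syntaxhighlight lang="python">
# F1-measure of a candidate class expression, computed from which
# positive/negative examples its instance set covers.
def f1(instances: set, positives: set, negatives: set) -> float:
    tp = len(instances & positives)   # positives covered by the solution
    fp = len(instances & negatives)   # negatives wrongly covered
    fn = len(positives - instances)   # positives missed
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Made-up sets: the solution covers all 4 positives and 1 of 6 negatives.
print(f1({"a", "b", "c", "d", "x"},
         {"a", "b", "c", "d"},
         {"u", "v", "w", "x", "y", "z"}))   # ~0.889
</syntaxhighlight>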
Acknowledgments
This work is part of a project that has received funding from the European
Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie
grant No 860801 and the European Union’s Horizon Europe research and innovation programme
under the grant No 101070305. This work has also been supported by the Ministry of Culture and
Science of North Rhine-Westphalia (MKW NRW) within the project SAIL under the grant No
NW21-059D and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation):
TRR 318/1 2021 – 438445824.
References
[1] J. Lehmann, S. Auer, L. Bühmann, S. Tramp, Class expression learning for ontology
engineering, J. Web Semant. 9 (2011) 71–81.
[2] N. J. Kouagou, S. Heindorf, C. Demir, A.-C. Ngonga Ngomo, Learning concept lengths
accelerates concept learning in ALC, in: ESWC, volume 13261 of LNCS, Springer, 2022, pp. 236–252.
[3] S. Heindorf, L. Blübaum, N. Düsterhus, T. Werner, V. N. Golani, C. Demir,
A.-C. Ngonga Ngomo, EvoLearner: Learning description logics with evolutionary algorithms,
in: WWW, ACM, 2022, pp. 818–828.
[4] N. J. Kouagou, S. Heindorf, C. Demir, A.-C. Ngonga Ngomo, Neural class expression
synthesis, in: ESWC, volume 13870 of LNCS, Springer, 2023, pp. 209–226.