<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Automatic Ontology Extension for the ChEBI Ontology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Simon Flügel</string-name>
          <email>simon.fluegel@uos.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Computer Science, Osnabrück University</institution>
          ,
          <addr-line>Friedrich-Janssen Str. 1, 49076 Osnabrück</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>In the life sciences, the amount of available knowledge has increased drastically over the last decades. Reference ontologies are an essential tool for organising and making this knowledge accessible. However, since they are designed and maintained manually, extending them is costly and keeping up with scientific progress is almost impossible. In this PhD project, we develop methods that automatically extend the coverage of reference ontologies while remaining faithful to developers' intentions, using the example of the Chemical Entities of Biological Interest (ChEBI) ontology. In particular, we are interested in neural-symbolic integration methods that combine Machine Learning with axiomatic knowledge from the ontology. The research project focuses on three avenues: Firstly, we examine how to represent chemical structures in Machine Learning methods, in particular Graph Neural Networks. Secondly, we axiomatise ontology classes in monadic second-order logic (MSOL) and first-order logic (FOL) and integrate these axiomatisations with OWL ontologies. And thirdly, we study the direct injection of ontology axioms into the training process of Machine Learning methods. The overarching goal of this work is to provide ontology developers and domain experts with a suite of tools that lighten the load of manual ontology development and broaden the scope of reference ontologies without lowering quality standards.</p>
      </abstract>
      <kwd-group>
        <kwd>ChEBI</kwd>
        <kwd>ontology extension</kwd>
        <kwd>neural-symbolic integration</kwd>
        <kwd>OWL</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Ontologies in the biomedical domain, such as those of the OBO Foundry [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], are maintained by manual
curation. While this ensures high quality standards, it also limits the growth of ontologies. The ChEBI
(Chemical Entities of Biological Interest) ontology [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] for example features 61,867 fully annotated
compounds (as of version 242, released in June 2025). Comparing this with chemical databases such as
PubChem [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which has 121 million entries (as of July 2025), it becomes clear that ChEBI cannot
come close to full coverage of its domain, at least not with manual curation alone.
      </p>
      <p>Therefore, this research project focuses on developing and improving methods for automated ontology
extension, i.e., the addition of content to ChEBI, or the application of ChEBI to new data.</p>
      <p>A major challenge for ontology extension is staying consistent with the manually curated ontology.
Each ontology term is the result of a consensus between experts and refers to background knowledge
that is not always made explicit. This requires neural-symbolic techniques that can integrate knowledge
from symbolic sources, e.g., the OWL axioms of the ontology, as well as sub-symbolic sources, i.e., the
considerable amount of chemicals that are already annotated by ChEBI.</p>
      <p>
        Given the diversity of the chemical domain, we believe there is no one-size-fits-all solution for
ontology extension. Instead, our goal is to provide an ensemble of different approaches. In previous
work, various Machine Learning methods have been applied to this task, including LSTMs [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and
Transformer models [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the latter of which performs best on the ontology extension task. While
Transformer models are able to predict a large number of classes simultaneously (up to 1,332 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]), they
are by nature data-dependent. In previous experiments, only classes with at least 50 or 100 molecules
were selected. Less populated classes, which make up the majority of ChEBI classes, have not yet been
covered. A user study has determined that users see the lack of more specific classifications as a major
drawback [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Also, not all classes can be learned equally well. For instance, classes
with complex ring structures pose a particular challenge [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        This author has also participated in preliminary work on a logic-based approach to ontology extension
using SMILES [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. SMILES is a string representation language for molecules. Many ChEBI classes that
represent groups of molecules have been annotated with their defining substructure. In order to use
these substructures for ontology extension, SMILES strings have been translated into first-order logic
(FOL) axioms which can then be used for classification [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>In this project, we aim to answer the following research questions:
1. Can a graph representation of molecules and feature augmentation improve the performance of
Machine Learning methods for ontology extension?
2. How can complex chemical definitions be formalised? And how can such formalisations be used
for ontology extension?
3. Can ontology extension be improved by directly infusing the training process of Machine Learning
methods with ontology axioms?</p>
    </sec>
    <sec id="sec-2">
      <title>2. Molecule Representation</title>
      <p>The first question that has to be addressed when using Machine Learning for the chemical domain is:
How do we get molecular structures into a model? And which model provides the best representation
for a molecule?</p>
      <p>
        Fingerprints encode molecules as fixed-length vectors, taking the structure and physicochemical
properties into account. Depending on the application, e.g., toxicity prediction or virtual screening,
different types of fingerprints have been developed [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. String representations such as SMILES or
SELFIES [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] perform a traversal of the molecular graph, encoding each atom with a sequence of letters,
with additional symbols for bonds, branches and ring structures.
      </p>
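      <p>The fixed-length idea can be illustrated with a toy sketch (purely illustrative, not one of the published fingerprint schemes): structural features of a variable-size molecule are hashed into a bit vector of constant length. Here, character trigrams of the SMILES string stand in for the atom environments that real fingerprints such as extended-connectivity fingerprints would hash.</p>

```python
import zlib

def toy_fingerprint(smiles: str, n_bits: int = 64) -> list[int]:
    """Toy hashed fingerprint: folds a molecule of arbitrary size into a
    fixed-length bit vector by hashing character trigrams of its SMILES
    string. Real fingerprints hash structural features such as atom
    environments; this only illustrates the fixed-length principle."""
    bits = [0] * n_bits
    for i in range(len(smiles) - 2):
        feature = smiles[i:i + 3]                      # crude "substructure" feature
        bits[zlib.crc32(feature.encode()) % n_bits] = 1  # hash folds features into a fixed range
    return bits

# Both a small and a larger molecule yield a vector of the same length.
fp = toy_fingerprint("C=1C=CC=CC1")
print(len(fp))
```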
      <p>
        In previous work, both fingerprints for classical Machine Learning methods and SMILES strings
for Transformer models have been used [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. However, these representations are not optimal: Chemical
structures come in many different shapes and sizes. Thus, fitting them into fixed-length or sequential
representations makes it harder for a model to understand the original molecule. Take for instance
benzene rings (cf. Figure 1), which can be described by the following SMILES string: C=1C=CC=CC1.
In the structure, the first and the last C are direct neighbours, while in the representation, they are
at opposite ends. Their connection can only be inferred from the "1"s that act as ring-opening and
ring-closure symbols. While learning this dependency is not a problem for small ring sizes, larger rings
with hundreds of atoms or complex ring structures pose a challenge for sequence-based models. Note
that SMILES usually allows different text representations for the same molecule. For instance, c1ccccc1
is also a valid description of a benzene ring. However, all of these face the same issue: they have to force
a circular structure into a linear form.
      </p>
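      <p>The ring-closure issue can be made concrete with a minimal parser sketch. This is a deliberately simplified toy (single-letter atoms and single-digit ring labels only, bond symbols ignored), not a full SMILES parser: it recovers the bond between the first and last carbon of the benzene string only via the ring-closure digits.</p>

```python
def ring_adjacency(smiles: str):
    """Minimal sketch for a toy SMILES subset: letters are atoms, digits
    are ring-closure labels, other symbols (e.g. '=') are skipped.
    Returns the bonds as (atom_index, atom_index) pairs."""
    bonds, open_rings = [], {}
    prev = None
    atom = -1
    for ch in smiles:
        if ch.isalpha():                     # a new atom, bonded to its predecessor
            atom += 1
            if prev is not None:
                bonds.append((prev, atom))
            prev = atom
        elif ch.isdigit():                   # ring-closure label: open or close a ring
            if ch in open_rings:
                bonds.append((open_rings.pop(ch), prev))
            else:
                open_rings[ch] = prev
    return bonds

bonds = ring_adjacency("C=1C=CC=CC1")
# Atoms 0 and 5 sit at opposite ends of the string, yet the ring-closure
# digit makes them direct neighbours in the molecular graph.
print((0, 5) in bonds)
```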
      <p>
        We hypothesise that a graph representation in which each atom is translated to a node and each bond
becomes an edge would avoid such problems and facilitate the learning process. This representation
also requires a new model architecture, namely Graph Neural Networks (GNNs) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. So far, preliminary
experiments have been conducted which show that, for many ChEBI classes, GNNs perform better on
the ontology extension task than Transformer models while using fewer resources. Our next steps
will be to further optimise our GNN architecture using feature augmentation and a more sophisticated
graph structure (e.g., with additional nodes for ring structures or functional groups). Also, we are going
to investigate specialised pre-training methods for GNNs. This will result in an ensemble architecture
where predictions from several models (including both Transformers and GNNs) are aggregated into a
single classification.
      </p>
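      <p>The core operation of a GNN layer, aggregating information over direct neighbours, can be sketched in plain Python (an illustration of the principle only, not our actual architecture; real layers add learned weight matrices and nonlinearities):</p>

```python
def message_passing_step(adj, features):
    """One round of neighbourhood aggregation: each node's new feature
    is the mean of its neighbours' features and its own. Stacking such
    rounds lets information travel along bonds, so ring neighbours stay
    adjacent regardless of any linearised ordering."""
    new_features = []
    for node, neighbours in enumerate(adj):
        gathered = [features[n] for n in neighbours] + [features[node]]
        new_features.append(sum(gathered) / len(gathered))
    return new_features

# Benzene skeleton: six atoms, each bonded to its two ring neighbours.
adj = [[1, 5], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
feats = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # mark atom 0
# After one step, only atom 0 and its direct neighbours (1 and 5) are non-zero.
print(message_passing_step(adj, feats))
```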
    </sec>
    <sec id="sec-3">
      <title>3. Class Axiomatisation</title>
      <p>In our work with ChEBI, we have identified several areas which are underspecified. For example, about
15 thousand compounds are classified as belonging to the peptide class. This class has a subhierarchy
which allows for a more fine-grained classification, e.g., into oligo- and polypeptides (peptides with
either “few” or “many” amino acids). These two subclasses, by their definition, form a partition of
the peptide domain. This means that all peptide compounds could be classified into one of the two
subclasses. However, there are 7,500 direct children of peptide which do not have an is a axiom to one of
the subclasses. Such missing axioms make learning more difficult for Machine Learning models. For one,
there is simply less data to train on, i.e., fewer positive samples for the classes oligopeptide and polypeptide.
But there are also more mislabelled negative samples, i.e., molecules that chemically are oligo- or
polypeptides, but have a negative label in the Machine Learning dataset.</p>
      <p>
        Therefore, in this project, the peptide class and 13 of its related classes or subclasses (cf. Figure 2) have been
selected to study a rule-based approach to ontology extension [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In collaboration with domain experts,
we have developed and refined natural language definitions for these classes. While natural-language
definitions already exist in ChEBI for most classes, they do not cover some edge cases and presume
chemical background knowledge. By formalising these definitions and developing a methodology for
automatic classification, we were able to test our definitions and compare them against the current
ChEBI classification. This has led to further refinement of our definitions or, in some cases, to the
identification of errors and inconsistencies in ChEBI.
      </p>
      <p>
        The classification methodology that is necessary for this process is based on an MSOL axiomatisation.
While [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] already identified MSOL as a suitable language for chemical class definitions (in their case,
fullerenes), this project has determined that peptides require monadic second-order definitions as well.
This is in contrast to OWL and FOL, which are more commonly used in ontology development, but in
which we cannot express the concept of peptides.
      </p>
      <p>This raises a new issue: How can one use second-order definitions in classification tasks? We address
this issue with a methodology that translates the MSOL definitions into a FOL model checking problem
in which some components are calculated algorithmically. Here, the central idea is that reasoning over
the whole ontology is not necessary. Instead, each model checking problem has a single molecule as its
domain. With this domain, which consists of the molecule’s atoms and the algorithmically supplied
components, model checking becomes feasible on the scale of ChEBI.</p>
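      <p>The single-molecule domain idea can be sketched as follows. The definition below is a hypothetical toy predicate ("contains a nitrogen bonded to a carbon"), standing in for the far richer MSOL definitions and algorithmically supplied components used in the actual methodology; what matters is that the check quantifies only over one molecule's atoms, never over the whole ontology.</p>

```python
def check_definition(atoms, bonds):
    """Model checking over a single-molecule domain: atoms maps an atom
    index to its element symbol, bonds is a set of index pairs. The toy
    'class definition' holds if some bond connects an N to a C."""
    return any(
        {atoms[a], atoms[b]} == {"N", "C"}
        for a, b in bonds
    )

# A small amide-like fragment: N-C-C=O.
fragment = {0: "N", 1: "C", 2: "C", 3: "O"}
bonds = {(0, 1), (1, 2), (2, 3)}
print(check_definition(fragment, bonds))  # an N-C bond is present
```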
      <p>As a third step, the MSOL and FOL axiomatisations have been used to verify the trustworthiness of
a purely algorithmic classifier. With the algorithmic method, we were able to classify all 121 million
compounds of PubChem, which goes significantly beyond the scope of ChEBI.</p>
      <p>In future work, we will generalise our methodology and extend it towards other areas of ChEBI.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Injecting Axioms into Training</title>
      <p>One of the drawbacks of our current Machine Learning pipeline is that it is agnostic about the OWL
axiomatisation of ChEBI. The only axioms that are used are the subsumption relations which connect
label classes to compound classes. From this, the dataset is constructed, taking each compound as a
sample with a list of positive or negative labels. Importantly, transitive parents are used as positive
labels as well. Thus, given a peptide for instance, the model not only has to predict the peptide class,
but also the amide class, the organic class and so on.</p>
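      <p>The label construction described above amounts to taking the ancestor closure of each annotated class under subsumption. A plain-Python sketch with a made-up three-class fragment of the hierarchy:</p>

```python
def ancestors(cls, parents):
    """All transitive superclasses of cls under the subsumption relation,
    computed by a simple graph traversal."""
    result = set()
    stack = [cls]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result

# Hypothetical fragment of the hierarchy: peptide is_a amide is_a organic molecule.
parents = {"peptide": ["amide"], "amide": ["organic molecule"]}
# A compound annotated as a peptide receives all transitive parents as positive labels.
labels = {"peptide"} | ancestors("peptide", parents)
print(sorted(labels))
```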
      <p>This ignores the subsumption relations between label classes (e.g., we can infer from the ontology
that each peptide is organic and an amide). It also ignores disjointness and other relations between
classes. Therefore, a model might predict that a molecule is a peptide and not an amide, or that it is
both organic and inorganic. While it will still receive a loss for such predictions during training, this
loss teaches the ontology axioms only by example, not as a general rule.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], we introduced a fuzzy loss that combines a standard cross-entropy loss with additional loss
terms based on a fuzzy logic interpretation of subsumption and disjointness axioms. This allows the
model to learn relations between label classes directly instead of inferring them from samples. We
evaluated the fuzzy loss for different fuzzy implications and parameter configurations (cf. Figure 3).
Overall, we were able to improve the consistency of predictions significantly compared to a regular
cross-entropy loss.
      </p>
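      <p>As an illustration of how such axioms become loss terms, the sketch below uses the Łukasiewicz implication and t-norm. This is only a minimal stand-in: the loss evaluated in [15] combines several fuzzy implications with a cross-entropy term and balancing parameters.</p>

```python
def subsumption_loss(p_sub, p_super):
    """Fuzzy loss term for 'sub SubClassOf super' under the Lukasiewicz
    implication I(a, b) = min(1, 1 - a + b), taking loss = 1 - I(a, b).
    It is zero whenever the subclass score does not exceed the
    superclass score, i.e., whenever the prediction is consistent."""
    return max(0.0, p_sub - p_super)

def disjointness_loss(p_a, p_b):
    """Fuzzy loss term for 'A DisjointWith B' via the Lukasiewicz t-norm
    T(a, b) = max(0, a + b - 1): penalises predicting both at once."""
    return max(0.0, p_a + p_b - 1.0)

# Inconsistent: peptide scored 0.9, but its superclass amide only 0.2.
print(subsumption_loss(0.9, 0.2))   # positive loss
# Consistent: organic 0.8 and inorganic 0.1 are not both asserted.
print(disjointness_loss(0.8, 0.1))  # zero loss
```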
      <p>In addition, we performed experiments with an additional unsupervised learning task. This exploits the
fact that the fuzzy loss works without labels: Even if the correct classification is unknown, we can tell if
a classification is consistent or not. Data from PubChem was used to augment our labelled dataset with
additional unlabelled samples. The goal was to improve the out-of-distribution generalisation abilities
of the model by drawing data from a wider distribution than the original dataset.</p>
      <p>
        The fuzzy loss has two main drawbacks. For one, in our experiments, adding fuzzy loss terms
was detrimental to model performance. While a balancing technique reduces the performance gap
between the baseline and the fuzzy loss models, a successful application of the fuzzy loss would require
further performance improvements on the classification task. Also, in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], only subsumption and
disjointness axioms have been taken into account, leaving out other axioms in ChEBI.
      </p>
      <p>Expanding the fuzzy loss method to other axiom types poses a new challenge: How does the axiom
correspond to the loss function? Unlike for subsumption and disjointness axioms, where this is relatively
straightforward (if A and B are disjoint, a model should not predict A and B at the same time), other
axiom types are more complex. For instance, ChEBI uses the object property has functional parent for
relations between classes where one can be derived from the other by functional modification. This does
not give us clear rules we can use for a loss function. Given the knowledge that, let's say, penicillin has
the functional parent 6-aminopenicillanic acid, we can derive no statement about individual molecules
that belong to either of the classes. One way to change that would be to train a separate model on the
prediction of functional modifications, which has been relegated to future work.</p>
      <p>For now, this project focuses on a promising axiom type that draws information from a different
source: has part relations are used to relate classes to chemical parts which they contain. Since the
chemical structures are provided by ChEBI, we can identify their parts and compare them to the axioms.
Take for example the class carboxylic acid. It has an axiom carboxylic acid has part some carboxy group.
If a model predicted that a given molecule without a carboxy group is a carboxylic acid, we would
know that this prediction is wrong. We can identify the carboxy group based on the ChEBI class carboxy
group, which, despite not being a molecule itself, is annotated with a SMILES string. This SMILES string
describes the substructure we have to identify in the molecule.</p>
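      <p>Such an axiom yields a hard consistency check on predictions, sketched below. Note that <code>has_substructure</code> is only a stand-in here: a real check must solve subgraph isomorphism on the molecular graph, whereas this toy version merely compares element counts, which suffices for the example but is not chemically sound.</p>

```python
from collections import Counter

def has_substructure(molecule_atoms, part_atoms):
    """Stand-in for a real substructure matcher: checks only that the
    molecule contains at least the part's element counts. A genuine
    matcher must solve subgraph isomorphism instead."""
    mol, part = Counter(molecule_atoms), Counter(part_atoms)
    return all(mol[el] >= n for el, n in part.items())

def prediction_consistent(predicted_class, molecule_atoms, has_part):
    """Reject a class prediction if any required part is absent from
    the molecule's structure."""
    return all(
        has_substructure(molecule_atoms, part_atoms)
        for part_atoms in has_part.get(predicted_class, [])
    )

# Toy version of the axiom: carboxylic acid has part some carboxy group (-COOH).
has_part = {"carboxylic acid": [["C", "O", "O", "H"]]}
methane = ["C", "H", "H", "H", "H"]
# Methane contains no oxygen, so a carboxylic-acid prediction is flagged as wrong.
print(prediction_consistent("carboxylic acid", methane, has_part))
```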
      <p>While ChEBI already includes a subhierarchy for groups with more than 3,000 members, the parthood
axioms linking them to molecule classes are relatively sparse. We aim to expand the coverage of ChEBI
in this area, focussing specifically on the classes on which Machine Learning models are trained.</p>
      <p>In a separate approach, we also plan to use parthood relations for feature augmentation. For individual
samples, we can directly annotate molecules with the groups that are part of them. This does not
require a class-level axiomatisation. Instead, the hypothesis is that the groups themselves are chemically
relevant substructures. Since the groups in ChEBI have been selected by expert curators, giving this
knowledge to Machine Learning models might induce them to learn concepts in a way similar to how
experts usually see them, namely in terms of functional groups.</p>
      <p>
        Finally, this project will investigate the use of Logical Neural Networks (LNNs, [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]), an architecture
designed specifically to represent logic formulas. It is differentiable and captures logical contradiction
directly in the loss function. Instead of a single prediction score, it outputs bounds which allow for
a well-founded interpretation of predictions. For example, a large gap between the lower and upper
bounds represents uncertainty, while a lower bound that is higher than the upper bound indicates
inconsistency. LNNs have the potential to provide a stronger integration between the ontology and
sub-symbolic training and to yield more interpretable results.
      </p>
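      <p>This bound-based reading of predictions can be summarised in a few lines (our paraphrase of the interpretation above, with a hypothetical gap threshold; not the LNN implementation itself):</p>

```python
def interpret_bounds(lower, upper, gap_threshold=0.5):
    """Classify an LNN-style (lower, upper) truth-bound pair: crossed
    bounds signal inconsistency, a wide gap signals uncertainty, and
    tight consistent bounds yield a decided prediction."""
    if lower > upper:
        return "inconsistent"
    if upper - lower > gap_threshold:
        return "uncertain"
    return "decided"

print(interpret_bounds(0.9, 0.95))  # tight, consistent bounds
print(interpret_bounds(0.1, 0.9))   # wide gap: the model is uncertain
print(interpret_bounds(0.8, 0.3))   # lower above upper: logical contradiction
```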
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work has been funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)
under grant number 522907718.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Jackson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Matentzoglu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Overton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Vita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Balhoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. L.</given-names>
            <surname>Buttigieg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Carbon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Courtot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Diehl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Dooley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. D.</given-names>
            <surname>Duncan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. L.</given-names>
            <surname>Harris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Haendel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Natale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Osumi-Sutherland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ruttenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Schriml</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J. Stoeckert</given-names>
            <surname>Jr.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. A.</given-names>
            <surname>Vasilevsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Walls</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J. M.</given-names>
            <surname>Mungall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <article-title>OBO Foundry in 2021: Operationalizing open data principles to evaluate ontologies</article-title>
          ,
          <source>Database</source>
          (
          <year>2021</year>
          )
          <fpage>baab069</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hastings</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Owen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dekker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ennis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Muthukrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Turner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Swainston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Steinbeck</surname>
          </string-name>
          ,
          <article-title>ChEBI in 2016: Improved services and an expanding collection of metabolites</article-title>
          ,
          <source>Nucleic Acids Research</source>
          <volume>44</volume>
          (
          <year>2016</year>
          )
          <fpage>D1214</fpage>
          -
          <lpage>D1219</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gindulyte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Shoemaker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Thiessen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zaslavsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. E.</given-names>
            <surname>Bolton</surname>
          </string-name>
          ,
          <article-title>PubChem 2025 update</article-title>
          ,
          <source>Nucleic Acids Res</source>
          .
          <volume>53</volume>
          (
          <year>2025</year>
          )
          <fpage>D1516</fpage>
          -
          <lpage>D1525</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hastings</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Glauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Memariani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mossakowski</surname>
          </string-name>
          ,
          <article-title>Learning chemistry: Exploring the suitability of machine learning for the task of structure-based chemical ontology classification</article-title>
          ,
          <source>Journal of Cheminformatics</source>
          <volume>13</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Glauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Memariani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mossakowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hastings</surname>
          </string-name>
          ,
          <article-title>Interpretable ontology extension in chemistry</article-title>
          ,
          <source>Semantic Web</source>
          <volume>15</volume>
          (
          <year>2024</year>
          )
          <fpage>937</fpage>
          -
          <lpage>958</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Glauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Flügel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wosny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mossakowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Memariani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schwerdt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hastings</surname>
          </string-name>
          ,
          <article-title>Chebifier: Automating semantic classification in ChEBI to accelerate data-driven discovery</article-title>
          ,
          <source>Digital Discovery</source>
          <volume>3</volume>
          (
          <year>2024</year>
          )
          <fpage>896</fpage>
          -
          <lpage>907</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Weininger</surname>
          </string-name>
          ,
          <string-name>
            <surname>SMILES,</surname>
          </string-name>
          <article-title>a chemical language and information system</article-title>
          ,
          <source>Journal of Chemical Information and Computer Sciences</source>
          <volume>28</volume>
          (
          <year>1988</year>
          )
          <fpage>31</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Flügel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Glauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hastings</surname>
          </string-name>
          ,
          <article-title>When one logic is not enough: Integrating first-order annotations in OWL ontologies</article-title>
          ,
          <source>Semantic Web</source>
          <volume>16</volume>
          (
          <year>2025</year>
          )
          <elocation-id>SW-243440</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Concepts and applications of chemical fingerprint for hit and lead screening</article-title>
          ,
          <source>Drug Discovery Today</source>
          <volume>27</volume>
          (
          <year>2022</year>
          )
          <fpage>103356</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Krenn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Häse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nigam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Friederich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Aspuru-Guzik</surname>
          </string-name>
          ,
          <article-title>SELFIES: A robust representation of semantically constrained graphs with an example application in chemistry</article-title>
          , arXiv preprint arXiv:1905.13741 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Scarselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Tsoi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagenbuchner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Monfardini</surname>
          </string-name>
          ,
          <article-title>The graph neural network model</article-title>
          ,
          <source>IEEE Transactions on Neural Networks</source>
          <volume>20</volume>
          (
          <year>2008</year>
          )
          <fpage>61</fpage>
          -
          <lpage>80</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Flügel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Glauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mossakowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          ,
          <article-title>ChemLog: Making MSOL viable for ontological classification and learning</article-title>
          ,
          <source>in: International Joint Conference on Learning and Reasoning</source>
          ,
          <year>2025</year>
          , in submission.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>O.</given-names>
            <surname>Kutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hastings</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mossakowski</surname>
          </string-name>
          ,
          <article-title>Modelling highly symmetrical molecules: Linking ontologies and graphs</article-title>
          ,
          <source>in: International Conference on Artificial Intelligence: Methodology, Systems, and Applications</source>
          , Springer,
          <year>2012</year>
          , pp.
          <fpage>103</fpage>
          -
          <lpage>111</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Friedman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Van den Broeck</surname>
          </string-name>
          ,
          <article-title>A semantic loss function for deep learning with symbolic knowledge</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>5502</fpage>
          -
          <lpage>5511</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Flügel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Glauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mossakowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          ,
          <article-title>A fuzzy loss for ontology classification</article-title>
          ,
          <source>in: International Conference on Neural-Symbolic Learning and Reasoning</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>101</fpage>
          -
          <lpage>118</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>R.</given-names>
            <surname>Riegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Luus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Makondo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. Y.</given-names>
            <surname>Akhalwaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Barahona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Sharma</surname>
          </string-name>
          , et al.,
          <article-title>Logical neural networks</article-title>
          , arXiv preprint arXiv:2006.13155 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>