<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A-LIOn - Alignment Learning through Inconsistency negatives of the aligned Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sarah M. Alghamdi</string-name>
          <email>sarah.alghamdi.1@kaust.edu.sa</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fernando Zhapa-Camacho</string-name>
          <email>fernando.zhapacamacho@kaust.edu.sa</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Robert Hoehndorf</string-name>
          <email>robert.hoehndorf@kaust.edu.sa</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computational Bioscience Research Center</institution>
          ,
          <institution>Computer, Electrical &amp; Mathematical Sciences and Engineering Division</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>King Abdul-Aziz University, Faculty of Computing and Information Technology</institution>
          ,
          <addr-line>Rabigh, 25732, Kingdom of Saudi Arabia</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>King Abdullah University of Science and Technology</institution>
          ,
          <addr-line>4700 KAUST, 23955 Thuwal</addr-line>
          ,
          <country country="SA">Saudi Arabia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Ontologies play an important role in sharing and reusing knowledge. Several ontologies are often developed to describe a particular domain, each from the different perspective of its community of developers and users. This has led to multiple ontologies that cover the same or different domains and differ from each other to varying degrees. Ontology alignment addresses this problem by identifying correspondences between semantically related elements of two or more ontologies.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontologies</kwd>
        <kwd>Ontology Alignments</kwd>
        <kwd>Ontology matching</kwd>
        <kwd>Inconsistency negatives</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>2. Proposed Methods</title>
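      <p>An alignment between a source and a target ontology is a set <inline-formula><tex-math>A</tex-math></inline-formula> of concept pairs <inline-formula><tex-math>(c_s, c_t) \in C_s \times C_t</tex-math></inline-formula> that are considered as being equivalent or standing in a subclass relation within certain contexts.</p>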
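      <p>A graph is defined as a tuple <inline-formula><tex-math>G = (E, R, T)</tex-math></inline-formula>, where <inline-formula><tex-math>E</tex-math></inline-formula> is a set of entity names, <inline-formula><tex-math>R</tex-math></inline-formula> is a set of relation names, and <inline-formula><tex-math>T \subseteq E \times R \times E</tex-math></inline-formula> is a set of triples of the form <inline-formula><tex-math>(h, r, t)</tex-math></inline-formula>.</p>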
      <p>A projection of an ontology into a graph is a mapping <inline-formula><tex-math>p : O \rightarrow G</tex-math></inline-formula> that maps ontology classes to graph nodes, ontology roles to graph relations, and ontology axioms to graph triples, following a particular set of rules.</p>
      <p>Our method A-LIOn combines different matching techniques and consists of four main
components (see Figure 1):
• Learning lexical matching seeds.
• Graph construction from source and target ontology.
• Graph embedding and transformation learning.
• Consistency checking.</p>
      <p>These components cover element-wise, structure-wise, and formal semantics learning
techniques. (a) Element-wise techniques consider the entities in the ontology in isolation in order to
find alignments, disregarding the fact that they are part of the ontology’s structure. This means
that we use only the information belonging to an ontology class itself, such as its textual annotations
and labels. (b) Structure-wise techniques, on the other hand, analyze the entities as part of
their structure. In our case, we focus on the adjacency structure within the ontology and extract
it in the form of a graph. (c) Finally, the semantic component employs
formal semantics learning techniques and logical inference to identify correspondences and
repair inconsistencies.</p>
      <sec id="sec-1-1">
        <title>2.1. Learning Lexical matching seeds</title>
        <p>To begin learning an ontology alignment, we need some known-to-be-positive seed alignments.
We chose to align the classes of both ontologies that have the same IRI, or that have lexically matched labels
and relative IRIs. For lexical matching, we use fuzzy lexical matching, a method for
approximate string matching that returns a score representing the similarity of one
string to another. We begin by requiring an exact match and then decrease the score threshold
iteratively until a sufficient number of seeds is obtained or a minimal accepted threshold is
reached. The number of matching seeds required is a parameter of our method.</p>
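        <p>The threshold relaxation described above can be illustrated with a minimal sketch. It is not the A-LIOn implementation: the similarity score comes from Python's standard <monospace>difflib.SequenceMatcher</monospace>, and the function name <monospace>learn_seeds</monospace>, the parameters <monospace>n_seeds</monospace>, <monospace>min_threshold</monospace> and <monospace>step</monospace>, and the example IRIs are assumptions made for illustration.</p>
        <preformat>
```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity in [0, 1]; 1.0 means identical labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def learn_seeds(source_labels, target_labels, n_seeds, min_threshold=0.8, step=0.05):
    """Relax the similarity threshold until enough seed pairs are found.

    source_labels / target_labels map a class IRI to its label.
    Returns the seed pairs and the threshold at which they were accepted.
    """
    scored = [
        (similarity(s_label, t_label), s_iri, t_iri)
        for s_iri, s_label in source_labels.items()
        for t_iri, t_label in target_labels.items()
    ]
    threshold = 1.0  # start with exact matches only
    while True:
        seeds = sorted((s, t) for score, s, t in scored if score >= threshold)
        # Stop when enough seeds exist or the minimal threshold is reached.
        if len(seeds) >= n_seeds or min_threshold >= threshold:
            return seeds, threshold
        threshold = max(min_threshold, round(threshold - step, 10))


# Hypothetical toy labels, not taken from any OAEI track.
source = {"src#Heart": "heart", "src#Kidney": "kidney"}
target = {"tgt#Heart": "Heart", "tgt#Kidneys": "kidneys"}
seeds, threshold = learn_seeds(source, target, n_seeds=2, min_threshold=0.9)
print(seeds, threshold)
```
        </preformat>
        <p>In this toy run only the pair of "heart" labels matches exactly; the second seed is accepted once the threshold has been lowered far enough to admit the near-match between "kidney" and "kidneys".</p>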
      </sec>
      <sec id="sec-1-2">
        <title>2.2. Ontology Projection</title>
        <p>We project each ontology as a graph in order to learn structure-level information from the
source and the target ontologies. We evaluate two graph construction techniques:
• Subsumption hierarchy: in this method, we use only the subclass axioms asserted
between the ontology classes to generate a directed graph for the source ontology and
the target ontology. We evaluated this technique for the Anatomy, Conference, Biodiversity
and Ecology, and Material Sciences and Engineering tracks.</p>
        <p>[Figure 1: diagram with a Neural and a Symbolic part. Labels: Source ontology; Target ontology; Graph projection; Lexical Seed Learning; Transformation learning; Embedding learning; min ‖M·c1 − c1‖2; min ‖Mr·c1 + r1 − Mr·c2‖2; Merging ontologies + Alignment; Is consistent? (OWL reasoner); yes: Return Alignments; no: Find inconsistent alignments (Explanations of unsatisfiability); Add as negatives: min ‖(M·cs − ct) − (M·cns − cnt)‖2.]</p>
        <p>
• OWL Projection: This method was proposed in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] where OWL axioms are transformed
directly into edges in the graph, and complex axioms are approximated in the graph to
avoid the use of blank nodes. Despite the fact that this transformation method does not
preserve exact logical relations, it enables correlation and learning alignments between
classes of the source and target ontologies as well as within the same ontology. We
evaluated this technique using the Phenotype ontology alignment.
        </p>
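        <p>As an illustration of the subsumption hierarchy projection, the following minimal sketch maps asserted subclass axioms to a graph of entities, relations and triples. The function name <monospace>project_subsumption</monospace>, the relation label <monospace>subClassOf</monospace> and the example classes are assumptions made for illustration; the OWL projection of [1] covers many more axiom types and is not shown.</p>
        <preformat>
```python
def project_subsumption(subclass_axioms):
    """Project asserted subclass axioms into a graph (entities, relations, triples).

    subclass_axioms is an iterable of (subclass, superclass) name pairs.
    """
    relation = "subClassOf"  # single edge label used by this projection
    triples = {(sub, relation, sup) for sub, sup in subclass_axioms}
    entities = {node for head, _, tail in triples for node in (head, tail)}
    return entities, {relation}, triples


# Hypothetical axioms: Heart and Lung are asserted subclasses of Organ.
axioms = [("Heart", "Organ"), ("Lung", "Organ")]
E, R, T = project_subsumption(axioms)
print(sorted(T))
```
        </preformat>
        <p>Applying the same function to the source and the target ontology yields the two directed graphs that are embedded in the next step.</p>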
      </sec>
      <sec id="sec-1-3">
        <title>2.3. Transformation Learning</title>
        <p>After projecting an ontology, the result is a graph. Depending on the chosen projection method
(Section 2.2), these graphs would encode the taxonomical structure or relational information
found in the ontologies.</p>
        <p>
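          In our method, we start with two ontologies <inline-formula><tex-math>O_s</tex-math></inline-formula> (source) and <inline-formula><tex-math>O_t</tex-math></inline-formula> (target), which, after applying
the graph projection, become two graphs <inline-formula><tex-math>G_s</tex-math></inline-formula> and <inline-formula><tex-math>G_t</tex-math></inline-formula>, respectively. Several graph alignment methods can align two graphs from a small
number of seed alignments; we follow the method in [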
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
        <p>
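          To learn representations of the two graphs <inline-formula><tex-math>G_s</tex-math></inline-formula> and <inline-formula><tex-math>G_t</tex-math></inline-formula>, we define two vector spaces <inline-formula><tex-math>V_s</tex-math></inline-formula> and <inline-formula><tex-math>V_t</tex-math></inline-formula>, where
the entities (nodes and edges) of each graph will be processed separately. To learn the graph
embeddings we rely on knowledge graph embedding methods such as TransR [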
          <xref ref-type="bibr" rid="ref3">3</xref>
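          ], optimizing
the following loss function:
          <disp-formula><label>(1)</label><tex-math>\mathcal{L}_s = \lVert M_{r_s} \cdot h_s + r_s - M_{r_s} \cdot t_s \rVert_2</tex-math></disp-formula>
for each relation <inline-formula><tex-math>r_s</tex-math></inline-formula> in the source graph where the triple <inline-formula><tex-math>(h_s, r_s, t_s)</tex-math></inline-formula> exists.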
        </p>
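        <p><disp-formula><label>(2)</label><tex-math>\mathcal{L}_t = \lVert M_{r_t} \cdot h_t + r_t - M_{r_t} \cdot t_t \rVert_2</tex-math></disp-formula>
for each relation <inline-formula><tex-math>r_t</tex-math></inline-formula> in the target graph where the triple <inline-formula><tex-math>(h_t, r_t, t_t)</tex-math></inline-formula> exists.</p>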
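        <p>Simultaneously, we use a transformation <inline-formula><tex-math>M : V_s \rightarrow V_t</tex-math></inline-formula> that maps the entities of the seeds
we found earlier (Section 2.1) from the source embedding space to the target space, using the
following loss:
        <disp-formula><tex-math>\mathcal{L}_M = \lVert M \cdot c_s - c_t \rVert_2</tex-math></disp-formula>
for each seed pair <inline-formula><tex-math>(c_s, c_t)</tex-math></inline-formula>.</p>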
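        <p>The losses of this section can be made concrete with a small, dependency-free sketch. It evaluates the TransR-style term of Equations (1) and (2) for a single triple and a seed transformation term for a single seed pair. The form of the seed term, the Euclidean distance between the transformed source embedding and the target embedding, is an assumption based on the labels of Figure 1; all function names and toy vectors are illustrative, and no training loop is shown.</p>
        <preformat>
```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def l2(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def transr_loss(M_r, h, r, t):
    """|| M_r h + r - M_r t ||_2 for one triple (h, r, t)."""
    mh, mt = matvec(M_r, h), matvec(M_r, t)
    return l2([a + b - c for a, b, c in zip(mh, r, mt)])


def seed_loss(M, c_s, c_t):
    """|| M c_s - c_t ||_2 for one seed pair (assumed form of the seed loss)."""
    return l2([a - b for a, b in zip(matvec(M, c_s), c_t)])


identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]

# h + r equals t under the identity projection, so the triple is scored 0.0.
print(transr_loss(identity, [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]))
# The swap matrix maps (1, 2) exactly onto the target (2, 1): loss 0.0.
print(seed_loss(swap, [1.0, 2.0], [2.0, 1.0]))
# A target of (2, 4) is missed by (0, -3): loss 3.0.
print(seed_loss(swap, [1.0, 2.0], [2.0, 4.0]))
```
        </preformat>
        <p>In practice both terms would be summed over all triples and all seed pairs and minimized jointly by gradient descent.</p>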
      </sec>
      <sec id="sec-1-4">
        <title>2.4. Inconsistency negatives learning</title>
        <p>
          OWL ontologies are based on Description Logic and facilitate the use of automated reasoners,
which in turn facilitate computing entailments of statements from the asserted ontology axioms.
In addition, these inferences can be investigated to determine if a class in an ontology is
satisfiable or unsatisfiable. A class is unsatisfiable if it cannot have any instances (i.e., the
axioms constrain the class in a contradictory way); an ontology is inconsistent if it has at least
one instance of a logical contradiction [
          <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
          ]. We utilize the ELK reasoner [
          <xref ref-type="bibr" rid="ref6">6</xref>
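          ] to find alignments
that lead to unsatisfiable classes. In order to find unsatisfiable classes when aligning <inline-formula><tex-math>O_s</tex-math></inline-formula> and <inline-formula><tex-math>O_t</tex-math></inline-formula>,
we first merge both ontologies (i.e., we combine their axioms into a new ontology) and add all
alignments predicted by our model as equivalence class axioms to the merged ontology <inline-formula><tex-math>O_m</tex-math></inline-formula>.
We define this ontology as <inline-formula><tex-math>O_m := (C_s \cup C_t, R_s \cup R_t, I_s \cup I_t, Ax_s \cup Ax_t, A)</tex-math></inline-formula>, where <inline-formula><tex-math>C_O</tex-math></inline-formula> is a set of
concepts from ontology <inline-formula><tex-math>O</tex-math></inline-formula>, <inline-formula><tex-math>R_O</tex-math></inline-formula> is a set of relations from ontology <inline-formula><tex-math>O</tex-math></inline-formula>, <inline-formula><tex-math>I_O</tex-math></inline-formula> is a set of individuals from
ontology <inline-formula><tex-math>O</tex-math></inline-formula>, <inline-formula><tex-math>Ax_O</tex-math></inline-formula> is a set of axioms from ontology <inline-formula><tex-math>O</tex-math></inline-formula>, and <inline-formula><tex-math>A</tex-math></inline-formula> is the predicted alignment.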
        </p>
        <p>
          Then we use the ELK reasoner [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] to identify unsatisfiable classes in the merged ontology. If
we identify an unsatisfiable class, we generate explanations for the entailment derived by
ELK. An explanation consists of a small set of axioms from which the unsatisfiability follows
directly. Within the generated explanations, we specifically identify the equivalence class axioms we have added,
as these are likely causing the class to become unsatisfiable. We remove
the equivalence class axioms causing unsatisfiable classes from the merged ontology and iterate.
Finally, we return to the transformation learning step with an updated loss to optimize for
alignment learning as follows:
          <disp-formula><tex-math>\mathcal{L} = \lVert (M \cdot c_s - c_t) - (M \cdot c_{ns} - c_{nt}) \rVert_2</tex-math></disp-formula>
where <inline-formula><tex-math>(c_s, c_t)</tex-math></inline-formula> are positive class pairs from the source ontology and the target ontology, respectively, and <inline-formula><tex-math>(c_{ns}, c_{nt})</tex-math></inline-formula>
are pairs of classes which gave rise to unsatisfiable classes and which we removed in the repair
step. The new iteration of our method now uses these pairs as negatives during training in the
alignment of both ontologies. We repeat this step until no more unsatisfiable classes remain.
        </p>
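        <p>The repair loop can be sketched on a toy example. The sketch below makes two strong simplifications that should be read as assumptions. First, it replaces ELK with a hand-written check that only understands subclass, equivalence and disjointness axioms. Second, it replaces explanation generation with a greedy choice: it removes the alignment whose removal leaves the fewest unsatisfiable classes. All names and classes are hypothetical.</p>
        <preformat>
```python
def ancestors(cls, edges):
    """Return cls together with every class reachable over subclass edges."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for child, parent in edges:
            if child == current and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unsatisfiable(classes, subclass, disjoint, alignment):
    """Classes that fall under both members of some disjoint pair."""
    edges = set(subclass)
    for s, t in alignment:  # an equivalence is a subclass edge in both directions
        edges |= {(s, t), (t, s)}
    bad = set()
    for c in classes:
        anc = ancestors(c, edges)
        if any(a in anc and b in anc for a, b in disjoint):
            bad.add(c)
    return bad


def repair(classes, subclass, disjoint, alignment):
    """Greedily remove alignments until no unsatisfiable class remains."""
    kept, negatives = list(alignment), []
    while kept and unsatisfiable(classes, subclass, disjoint, kept):
        worst = min(
            kept,
            key=lambda a: len(
                unsatisfiable(classes, subclass, disjoint, [x for x in kept if x != a])
            ),
        )
        kept.remove(worst)
        negatives.append(worst)  # would be fed back to training as a negative pair
    return kept, negatives


classes = ["s:Heart", "s:Organ", "s:Tissue", "t:Heart", "t:Muscle", "t:Organ", "t:Tissue"]
subclass = [("s:Heart", "s:Organ"), ("t:Heart", "t:Organ"), ("t:Muscle", "t:Tissue")]
disjoint = [("s:Organ", "s:Tissue"), ("t:Organ", "t:Tissue")]
predicted = [
    ("s:Organ", "t:Organ"),
    ("s:Heart", "t:Heart"),
    ("s:Heart", "t:Muscle"),  # faulty: places Heart under both Organ and Tissue
    ("s:Tissue", "t:Tissue"),
]
kept, negatives = repair(classes, subclass, disjoint, predicted)
print(kept)
print(negatives)
```
        </preformat>
        <p>In this example the faulty pair is the only alignment removed, and it is returned as a negative. A real implementation would obtain the candidates for removal from the reasoner's explanations rather than from repeated re-checking.</p>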
      </sec>
    </sec>
    <sec id="sec-2">
      <title>3. Results</title>
      <p>For this year’s evaluation, we tested A-LIOn in three tracks: Anatomy, Conference and Material
Sciences and Engineering (MSE). We have also tested our system on the phenotypes track using
last year’s evaluation tests.</p>
      <sec id="sec-2-1">
        <title>3.1. Participation in OAEI</title>
        <p>We selected tracks that align ontologies that contain disjoint class assertion axioms. Disjoint
class assertion axioms are a common cause of inconsistencies and therefore allow us to
observe the performance of our method in correcting and training to avoid inconsistencies. In
the Anatomy track, the ontology file human.owl contains 17 disjoint class assertion axioms. In
the Conference track, the numbers of disjoint class assertion axioms are as follows: 81 in cmt.owl,
42 in Conference.owl, 15 in confious.owl, 129 in confOf.owl, 36 in crs_dr.owl, 1,221 in edas.owl,
222 in ekaw.owl, 3 in iasted.owl, 12 in MICRO.owl, 384 in MyReview.owl, 237 in OpenConf.owl, and
396 in paperdyne.owl. Finally, in the Material Sciences and Engineering track, 158 disjoint
class axioms were found in Matonto. All results can be found on the OAEI 2022 campaign page
http://oaei.ontologymatching.org/2022/results/.</p>
        <sec>
          <title>3.1.1. Anatomy</title>
          <p>In terms of precision, recall, and F-measure, the matching performance of A-LIOn in the
Anatomy track was below the string equivalence baseline. Performance in this track was affected
by the small number of predicted alignments and the small
number of inconsistencies discovered using the OWL EL reasoner. The main issue is the limited
number of initial seeds discovered
with the parameter settings we used, which substantially affected recall. To overcome this
limitation, an adaptive method that uses the specific pair of ontologies to determine parameters
(such as those for seed matching) could be developed.</p>
        </sec>
        <sec id="sec-2-1-1">
          <title>3.1.2. Conference</title>
          <p>
            The Conference track contains information about conference organization. This track comes in
two versions: standard and uncertain. The standard version of the Conference track contains
a reference alignment which was the result of a “Consensus Workshop” in 2008. However,
some of these alignments may not be possible to detect either by a computational algorithm or
manually by humans [
            <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
            ]. For that reason, the uncertain version of the Conference track was
generated by consulting a group of experts and computing the ratio of agreement on each match.
As a consequence, the uncertain version is more realistic because it removes the controversial
alignments (i.e., the ones for which the experts could not reach a consensus), and systems are
expected to perform better when they are evaluated on it. A-LIOn has the highest increase with respect to the standard
version among all the systems. This suggests that A-LIOn is capable of detecting non-controversial
alignments more easily than the controversial ones.
          </p>
          <p>The current version of A-LIOn uses OWL EL reasoning to detect and exclude alignments that
cause inconsistencies. However, the results show that A-LIOn does not detect all inconsistent
alignments. The main reason why not all incoherent alignments are removed is that the ontologies use
description logics that are more expressive than OWL EL. A-LIOn only uses OWL EL reasoning because
computing entailments in expressive description logics has a high computational complexity
and may not always be successful for larger ontologies, such as those used in the biomedical
domain. However, the ontologies used in the Conference track are small compared to ontologies
in other tracks such as Anatomy. In a future version of A-LIOn, we may include additional
reasoners, including reasoners for more expressive logics.</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>3.1.3. Material Sciences and Engineering</title>
          <p>There are three test cases for the Material Sciences and Engineering track. The first and second
test cases align the MatOnto ontology to the Material Information ontology, and the third test case aligns the
EMMO ontology to the Material Information ontology. MatOnto contains 158 disjoint class axioms
and could thus introduce useful inconsistencies that can be exploited by our method. The results
indicate that A-LIOn had the highest recall and an F-measure comparable to the other tested
methods. However, in the test case involving the EMMO ontology, A-LIOn failed to parse the labels in the ontology
and consequently failed to produce any alignments.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>3.2. Phenotype matching use case</title>
        <p>We tested the OWL projection method on the problem of aligning phenotype ontologies. To test
this approach, we used the datasets provided last year [9] for aligning the Human Phenotype
Ontology (HP) [10] and the Mammalian Phenotype Ontology (MP) [11]. The seed alignments we
used are exactly matching IRIs of classes, as well as lexical alignments for HP and MP classes
only. We tested two different approaches for generating the graphs from the source and target
ontologies (Section 2.2). Results are shown in Table 3.2, where we include the results of some
of the participating systems from last year for comparison [12, 13, 14, 15]. Comparing the
results of the two graph generation techniques, we found that the OWL projection
allows for the discovery of more phenotype mappings, whereas the
subsumption hierarchy produces alignments with high precision but finds fewer alignments,
thereby decreasing the recall.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Conclusion</title>
      <p>A-LIOn is a system that incorporates both entity-level and structure-level information in learning
alignments between two ontologies. A-LIOn also uses logical reasoning to correct alignments
that are likely faulty because they lead to unsatisfiable classes, and incorporates the results of
this symbolic step in the learning process to generate new negatives. In the future, we plan
to make our system able to learn better parameters based on the features of the input ontologies
and to self-evaluate the predicted alignment. For example, using a different set of parameters for
Anatomy and for the first task of the Material Sciences and Engineering track allowed us to increase
the F-score by 10% and 3.3%, respectively. A further improvement will be the use of language
models in seed selection.</p>
      <p>of the 6th International Conference on Ontology Matching - Volume 814, OM’11,
CEUR-WS.org, Aachen, DEU, 2011, pp. 179–183.</p>
      <p>[9] M. Pour, A. Algergawy, F. Amardeilh, R. Amini, O. Fallatah, D. Faria, I. Fundulaki, I. Harrow,
S. Hertling, P. Hitzler, et al., Results of the ontology alignment evaluation initiative 2021,
in: CEUR Workshop Proceedings 2021, volume 3063, CEUR, 2021, pp. 62–108.</p>
      <p>[10] S. Köhler, M. Gargano, N. Matentzoglu, L. C. Carmody, D. Lewis-Smith, N. A. Vasilevsky,
D. Danis, G. Balagura, G. Baynam, A. M. Brower, et al., The human phenotype ontology in
2021, Nucleic acids research 49 (2021) D1207–D1217.</p>
      <p>[11] C. L. Smith, J. T. Eppig, The mammalian phenotype ontology as a unifying standard
for experimental and high-throughput phenotyping data, Mammalian genome 23 (2012)
653–668.</p>
      <p>[12] E. Jiménez-Ruiz, Logmap family participation in the oaei 2021, in: CEUR Workshop
Proceedings, volume 3063, 2021, pp. 175–177.</p>
      <p>[13] D. Faria, B. Lima, F. Couto, M. Silva, C. Pesquita, Aml and amlc results for oaei 2021, in:
The 23rd International Conference on Information Integration and Web Intelligence, 2021,
pp. 131–136.</p>
      <p>[14] S. Hertling, H. Paulheim, Atbox results for oaei 2021, in: CEUR Workshop Proceedings,
volume 3063, RWTH Aachen, 2021, pp. 137–143.</p>
      <p>[15] D. Kossack, N. Borg, L. Knorr, J. Portisch, Tom matcher results for oaei 2021, in: CEUR
Workshop Proceedings, volume 3063, RWTH, 2022, pp. 193–198.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Jimenez-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. M.</given-names>
            <surname>Holter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Antonyrajah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>Owl2vec*: Embedding of owl ontologies</article-title>
          ,
          <source>Machine Learning</source>
          <volume>110</volume>
          (
          <year>2021</year>
          )
          <fpage>1813</fpage>
          -
          <lpage>1845</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zaniolo</surname>
          </string-name>
          ,
          <article-title>Multilingual knowledge graph embeddings for cross-lingual knowledge alignment</article-title>
          ,
          <source>arXiv preprint arXiv:1611.03954</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Learning entity and relation embeddings for knowledge graph completion</article-title>
          ,
          <source>in: Twenty-ninth AAAI conference on artificial intelligence</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L. T.</given-names>
            <surname>Slater</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. V.</given-names>
            <surname>Gkoutos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hoehndorf</surname>
          </string-name>
          ,
          <article-title>Towards semantic interoperability: finding and repairing hidden contradictions in biomedical ontologies</article-title>
          ,
          <source>BMC Medical Informatics and Decision Making</source>
          <volume>20</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Martinez-Gil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Küng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Morvan</surname>
          </string-name>
          ,
          <article-title>Matching large biomedical ontologies using symbolic regression</article-title>
          ,
          <source>in: The 23rd International Conference on Information Integration and Web Intelligence</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>162</fpage>
          -
          <lpage>167</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kazakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Simančík</surname>
          </string-name>
          ,
          <article-title>Elk: a reasoner for owl el ontologies</article-title>
          , System Description (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Cheatham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hitzler</surname>
          </string-name>
          ,
          <article-title>Conference v2.0: An uncertain version of the oaei conference benchmark</article-title>
          , in: International Semantic Web Conference, Springer,
          <year>2014</year>
          , pp.
          <fpage>33</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dänschel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stumpp</surname>
          </string-name>
          ,
          <article-title>Mappso and mapevo results for oaei 2011</article-title>
          , in: Proceedings
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>