=Paper=
{{Paper
|id=Vol-2288/oaei18_paper15
|storemode=property
|title=XMap: results for OAEI 2018
|pdfUrl=https://ceur-ws.org/Vol-2288/oaei18_paper15.pdf
|volume=Vol-2288
|authors=Warith Eddine Djeddi,Sadok Ben Yahia,Mohamed Tarek Khadir
|dblpUrl=https://dblp.org/rec/conf/semweb/DjeddiYK18
}}
==XMap: results for OAEI 2018==
<pdf width="1500px">https://ceur-ws.org/Vol-2288/oaei18_paper15.pdf</pdf>
<pre>
                           XMap: Results for OAEI 2018

        Warith Eddine DJEDDI (a,b), Sadok BEN YAHIA (b) and Mohamed Tarek KHADIR (a)

    (a) LabGED, Computer Science Department, University Badji Mokhtar, Annaba, Algeria
    (b) Faculty of Sciences of Tunis, University of Tunis El-Manar, LIPAH-LR 11ES14, 2092, Tunisia
                               {djeddi,khadir}@labged.net
                                sadok.benyahia@fst.rnu.tn


             Abstract. We describe in this paper the XMap system and the results it achieved
             during the 2018 edition of the Ontology Alignment Evaluation Initiative. XMap
             aims to tackle the issue of matching large-scale ontologies through parallel
             matching on multiple cores or machines. Our strategies aim to provide
             a set of requirements that foster the use of a domain-specific thesaurus for the
             alignment of specialized ontologies.


1        Presentation of the system

The eXtended Mapping (XMap) algorithm relies on the context notion to deal with lex-
ical ambiguity as well as a parallel comparison between concepts to efficiently handle
the matching of large ontologies. Our approach to matching ontologies employs dif-
ferent components and steps in the ontology alignment process such as preprocessing,
matching, filtering and combining matching results, and oracle validation of mapping
suggestions. The contributions are the following:

    – Defining a semantic similarity measure using UMLS (1) [1] and WordNet [2] to provide
      a synonymy degree between two entities from different ontologies, by exploring both
      their lexical and structural contexts. In XMap, lexical similarity is measured using
      synsets, as defined in WordNet and UMLS. In our approach, the similarity between two
      entities of different ontologies is evaluated not only by investigating the semantics
      of the entity names, but also by taking into account the context through which the
      effective meaning is described. It is worth mentioning that the context is the set of
      information (partly) characterizing the situation of some entities [3]. The context
      notion is not universal but is relative to some situations, tasks or applications [4, 5];
    – Limiting the number of mapping suggestions to be validated by an oracle. Our approach
      employs a double threshold to produce matching candidates and uses a small set of
      constraints [6, 7] (e.g., consistency, locality, and conservativity or quality checks)
      as a filter to select the final alignments. The first threshold is used in the
      interactive selection algorithm, which asks the oracle for feedback about mappings
      whose similarity is below a given threshold, until a given number of negative answers
      is reached. The second threshold is used at the final stage to filter out the
      correspondences having a similarity value below a given threshold. This strategy
      avoids the growing size and complexity of user participation in the alignment
      of large ontologies.

    (1) http://www.nlm.nih.gov/research/umls/

               Fig. 1. The different steps for scoring a multiple network alignment.

    – Applying repair techniques from Applying Logical Constraints on Matching Ontologies
      (ALCOMO) [8] to make reference alignments coherent, by reducing the number of
      unsatisfiable classes (discovering disjointness relationships) without having an
      impact on the F-measure score. Our strategy in the repair mode takes into account
      the confidence values during the selection of mappings to be removed, in order to
      improve the quality of the repaired alignments in terms of computation time and
      mapping coherence.
    – Finally, enabling XMap to deal with large-scale ontology matching, by producing
      good experimental results in terms of alignment quality, time performance
      and scalability.
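
The confidence-aware repair idea from the third contribution can be illustrated with a
minimal sketch. This is not ALCOMO's algorithm nor XMap's actual code: it assumes the
conflict sets (groups of mappings that jointly cause an unsatisfiable class) have already
been computed by a reasoner, and it simply drops the lowest-confidence mapping of each
conflict that is still fully present. The names greedy_repair and Mapping, and the entity
identifiers in the usage example, are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# (source entity, target entity, confidence); a hypothetical representation.
Mapping = Tuple[str, str, float]


def greedy_repair(
    mappings: Iterable[Mapping],
    conflict_sets: Iterable[FrozenSet[Mapping]],
) -> Tuple[Set[Mapping], List[Mapping]]:
    """Remove the weakest mapping of every conflict set that is still intact.

    Assumption: each conflict set lists mappings that together make some
    class unsatisfiable, so removing any one member resolves that conflict.
    """
    kept: Set[Mapping] = set(mappings)
    removed: List[Mapping] = []
    for conflict in conflict_sets:
        # The conflict only matters if all of its mappings are still kept.
        if conflict <= kept:
            # Lowest confidence first; names break ties deterministically.
            weakest = min(conflict, key=lambda m: (m[2], m[0], m[1]))
            kept.discard(weakest)
            removed.append(weakest)
    return kept, removed


# Usage with made-up entities and confidences.
a = ("mouse:Heart", "nci:Heart", 0.95)
b = ("mouse:Limb", "nci:Organ", 0.40)
c = ("mouse:Tail", "nci:Tail", 0.90)
kept, removed = greedy_repair([a, b, c], [frozenset({a, b})])
print(removed)  # the low-confidence mapping b is the one dropped
```

Removing the least confident member keeps the mappings the matcher trusts most, which is
one plausible way to repair an alignment with little effect on the F-measure.
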


2     State, purpose, general statement

Our prototype relies on a sequential/parallel composition architecture. XMap
uses various similarity measures of different categories, such as string-based, linguistic
and structural similarity measures, each contributing to some extent to the alignment
results. At a glance, the mapping process of XMap is depicted in Figure 1. XMap receives
two source ontologies as input. The mappings discovered by the terminological-level
matcher are transferred to the structural-level matcher in order to find new
correspondences by analyzing the context of the entities in the taxonomy of the ontologies.
Afterwards, the results of the two basic matchers are aggregated by a weighted-sum
aggregation operator. For the final alignment, the system uses the threshold
method, with the filter threshold value defined manually to produce the final
mappings. A fast repair method is then applied to detect and remove the inconsistent
ones.

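
The aggregation and threshold steps described above can be sketched as follows. The
weights (0.6 and 0.4), the threshold (0.7) and the function names are assumptions chosen
for illustration; the paper does not report the values XMap actually uses.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (entity of ontology 1, entity of ontology 2)


def aggregate(
    terminological: Dict[Pair, float],
    structural: Dict[Pair, float],
    w_term: float = 0.6,    # assumed weight, not taken from the paper
    w_struct: float = 0.4,  # assumed weight, not taken from the paper
) -> Dict[Pair, float]:
    """Weighted sum of two matcher scores; a missing score counts as 0."""
    pairs = set(terminological) | set(structural)
    return {
        p: w_term * terminological.get(p, 0.0) + w_struct * structural.get(p, 0.0)
        for p in pairs
    }


def select(scores: Dict[Pair, float], threshold: float = 0.7) -> Dict[Pair, float]:
    """Threshold method: keep only pairs scoring at least the threshold."""
    return {p: s for p, s in scores.items() if s >= threshold}


# Usage with made-up scores.
term = {("a", "x"): 0.9, ("b", "y"): 0.5}
struct = {("a", "x"): 0.8, ("b", "y"): 0.2}
final = select(aggregate(term, struct))
print(sorted(final))  # only ("a", "x") passes the threshold
```
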
3   Results

In this section, we present the evaluation results obtained by running XMap under the
SEALS client on the Anatomy, Conference, Multifarm, Interactive matching evaluation,
Large Biomedical Ontologies, Disease and Phenotype, and Biodiversity and Ecology
tracks.

Anatomy The Anatomy track consists of finding an alignment between the Adult
Mouse Anatomy (2744 classes) and a part of the NCI Thesaurus (3304 classes)
describing the human anatomy. XMap achieves a good F-Measure of 0.896 in a
reasonable amount of time (37 sec.) (cf. Table 1).

                            Table 1. Results for Anatomy track.

             System        Precision     F-Measure Recall          Time(s)
             XMap          0.929         0.896        0.865        37
             StringEquiv   0.997         0.766        0.622        946


Conference The Conference track uses a collection of 16 ontologies from the domain
of academic conferences. Most ontologies are equipped with OWL-DL axioms of
various types, which offers a useful way to test our semantic matchers. For each reference
alignment, three evaluation modalities are applied: a) crisp reference alignments, b) the
uncertain version of the reference alignment, c) logical reasoning.

                  Table 2. Results based on the crisp reference alignments.

                            Precision        F1-Measure       Recall
                             Original reference alignment (ra1)
              ra1-M1        0.81             0.70             0.61
              ra1-M2        0.69             0.31             0.20
              ra1-M3        0.81             0.65             0.54
                             Entailed reference alignment (ra2)
              ra2-M1        0.79             0.65             0.55
              ra2-M2        0.77             0.34             0.22
              ra2-M3        0.77             0.61             0.5
                            Violation reference alignment (rar2)
              rar2-M1       0.78             0.66             0.57
              rar2-M2       0.77             0.34             0.22
              rar2-M3       0.76             0.62             0.52


   As shown in Tables 2 and 3, XMap produces fairly consistent alignments when
matching the conference ontologies. Finally, XMap generated only one incoherent
alignment in the evaluation based on logical reasoning.

         Table 3. Results based on the uncertain version of the reference alignment.

                     Precision       F1-Measure      Recall
                          Uncertain reference alignments (Sharp)
                     0.81            0.65            0.54
                         Uncertain reference alignments (Discrete)
                     0.66            0.74            0.83
                        Uncertain reference alignments (Continuous)
                     0.74            0.70            0.66


Multifarm This track is based on the translation of the OntoFarm collection of
ontologies into 9 different languages. XMap has low performance due to many internal
exceptions. The results are shown in Table 4.


                           Table 4. Results for Multifarm track.

               System      Different ontologies        Same ontologies
                           P       F       R           P       F       R
               XMap        0.2     0.3     0.07        0.13    0.14    0.19


Interactive matching evaluation The goal of this evaluation is to imitate interactive
alignment [9, 10], where an oracle user validates the correspondences found
by the alignment approach by checking them against the reference alignment, and where
the oracle's error rate is varied in order to assess its influence on the performance of
alignment systems. For the 2018 edition, participating systems are evaluated on the
Conference and Anatomy datasets using an oracle based on the reference alignment.
    XMap uses various similarity measures to generate candidate mappings. It applies
two thresholds to filter the candidate mappings: one for the mappings that are directly
added to the final alignment and another for those that are presented to the user for
validation. The latter threshold is set high in order to minimize the number of requests
and the number of candidate mappings rejected by the oracle; the requests are
mainly about incorrect mappings. The mappings accepted by the user are moved to
the final alignment. Across the 2016, 2017 and 2018 editions, XMap preserved roughly
the same F-Measure value, and it benefits the least from the interaction with the
oracle. On the Conference track, however, XMap shows increases in precision, recall and
F-measure. XMap's measures differ by less than 0.2% from the non-interactive runs,
and its performance does not change at all with increasing error rates.
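
The two-threshold interactive selection described above can be sketched as follows. The
threshold values, the limit on negative answers and all names are assumptions made for
illustration; they are not XMap's actual parameters.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (entity of ontology 1, entity of ontology 2)


def interactive_select(
    candidates: Dict[Pair, float],
    oracle: Callable[[Pair], bool],
    accept_threshold: float = 0.9,  # assumed: at or above this, accept directly
    ask_threshold: float = 0.7,     # assumed: at or above this, ask the oracle
    max_negative: int = 2,          # assumed limit on negative answers
) -> List[Pair]:
    """Accept confident mappings and ask the oracle about uncertain ones."""
    final: List[Pair] = []
    negatives = 0
    # Visit candidates from the highest to the lowest similarity.
    for pair, score in sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0])):
        if score >= accept_threshold:
            final.append(pair)
        elif score >= ask_threshold and negatives < max_negative:
            if oracle(pair):
                final.append(pair)
            else:
                negatives += 1
        # Everything else is discarded without a request.
    return final


# Usage: the oracle answers from a made-up reference alignment.
reference = {("a", "x"), ("b", "y")}
candidates = {("a", "x"): 0.95, ("b", "y"): 0.80, ("c", "z"): 0.75, ("d", "w"): 0.30}
print(interactive_select(candidates, lambda p: p in reference))
```

Stopping after a fixed number of negative answers bounds the number of requests, at the
cost of possibly skipping correct mappings that rank below the rejected ones.
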


Large biomedical ontologies This track consists of finding alignments between the
Foundational Model of Anatomy (FMA), SNOMED CT, and the National Cancer
Institute Thesaurus (NCI). The results obtained by XMap (evaluated without UMLS) are
shown in Table 5.

                         Table 5. Results for the Large BioMedical track.

      Test set                          Precision    Recall       F-Measure Time(s)
      Small FMA-NCI                     0.977        0.783        0.869       7356
      Whole FMA-NCI                     0.877        0.741        0.803       66499
      Small FMA-SNOMED                  0.962        0.647        0.774       25544
      Whole FMA-Large SNOMED            0.723        0.608        0.661       299027
      Small SNOMED-NCI                  0.835        0.588        0.69        123597
      Whole SNOMED-NCI                  0.64         0.582        0.61        426584


    In general, we can conclude that XMap achieved good precision and recall values. The
high recall value can be explained by the fact that the UMLS thesaurus contains definitions
of highly technical medical terms.


Disease and Phenotype This track is based on a real use case where it is required to find
alignments between disease and phenotype ontologies. Specifically, the selected
ontologies are the Human Phenotype Ontology (HPO), the Mammalian Phenotype Ontology
(MP), the Human Disease Ontology (DOID), and the Orphanet and Rare Diseases
Ontology (ORDO).
    XMap achieved fair results according to the three evaluations (Silver standard,
Manually generated mappings and Manual assessment of unique mappings).


Biodiversity and Ecology This track aims at finding alignments between the
Environment Ontology (ENVO) and the Semantic Web for Earth and Environment
Technology Ontology (SWEET), and between the Flora Phenotype Ontology (FLOPO) and the
Plant Trait Ontology (PTO). The results are shown in Table 6.


                   Table 6. Results for the Biodiversity and Ecology track.

      Test set                          Precision    F-Measure    Recall      Time(s)
      Small flopo-pto                   0.987        0.761        0.619       153
      Whole envo-sweet                  0.868        0.785        0.716       547


4     General comments

4.1    Comments on the results

This is the sixth time that we have participated in the OAEI campaign. The official results of
OAEI 2018 show that XMap is competitive with other well-known ontology matching
systems in all OAEI tracks.

4.2   Comments on the OAEI 2018 procedure

In our sixth participation, we found the OAEI procedure very convenient and the
organizers very supportive. The OAEI test cases are varied, which allows a comparison at
different levels of difficulty; we find this very interesting. We found that the SEALS
platform is a valuable tool for comparing the performance of our system with the others.


5     Conclusion

Generally, according to the results obtained during the OAEI 2018 campaign, our
system delivered good results compared to other well-known ontology matching
systems. The benchmark used greatly helped to identify the strengths and weaknesses of
the algorithm. In addition, XMap showed the feasibility of our approach, especially on
large-scale biomedical ontologies, which have been a major challenge in the ontology
matching domain.


References
 1. Olivier Bodenreider. The unified medical language system (UMLS): integrating biomedical
    terminology. Nucleic Acids Research, 32(Database-Issue):267–270, 2004.
 2. Christiane D. Fellbaum. WordNet – An Electronic Lexical Database. MIT Press, 1998.
 3. Anind K. Dey, Gregory D. Abowd, and Daniel Salber. A conceptual framework and a toolkit
    for supporting the rapid prototyping of context-aware applications. Hum.-Comput. Interact.,
    16(2):97–166, December 2001.
 4. Paul Dourish. Seeking a foundation for context-aware computing. Human-Computer Inter-
    action, 16(2-4):229–241, 2001.
 5. Matthew Chalmers. A historical view of context. Computer Supported Cooperative Work,
    13(3):223–247, 2004.
 6. Ernesto Jiménez-Ruiz, Bernardo Cuenca Grau, Ian Horrocks, and Rafael Berlanga Llavori.
    Logic-based assessment of the compatibility of UMLS ontology sources. J. Biomedical
    Semantics, 2(S-1):S2, 2011.
 7. Elena Beisswanger and Udo Hahn. Towards valid and reusable reference alignments - ten
    basic quality checks for ontology alignments and their application to three different reference
    data sets. J. Biomedical Semantics, 3(S-1):S4, 2012.
 8. Christian Meilicke. Alignment incoherence in ontology matching. PhD thesis, University of
    Mannheim, 2011.
 9. Heiko Paulheim, Sven Hertling, and Dominique Ritze. Towards evaluating interactive
    ontology matching tools. In The Semantic Web: Semantics and Big Data, 10th International
    Conference, ESWC 2013, Montpellier, France, May 26-30, 2013. Proceedings, pages 31–45,
    2013.
10. Zlatan Dragisic, Valentina Ivanova, Patrick Lambrix, Daniel Faria, Ernesto Jiménez-Ruiz,
    and Catia Pesquita. User validation in ontology alignment. In Proceedings of the
    International Semantic Web Conference, volume 9981 of LNCS, October 2016.

</pre>