<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>XMap++ : Results for OAEI 2014</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Warith Eddine Djeddi</string-name>
          <email>djeddi@labged.net</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohamed Tarek Khadir</string-name>
          <email>khadir@labged.net</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LabGED, Computer Science Department, University Badji Mokhtar</institution>
          ,
          <addr-line>Annaba</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>State</institution>
          ,
          <addr-line>purpose, general statement</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we present the results obtained by our ontology matching system XMap++ within the OAEI 2014 campaign. XMap++ is a scalable ontology alignment tool capable of matching large-scale ontologies. This is our second participation in the OAEI, and we can see an overall improvement on nearly every task. XMap (eXtensible Mapping) is an ontology alignment tool for the alignment of OWL entities (i.e., classes, object properties and data properties). The XMap++ approach uses similarity measures of different categories, such as string-based, linguistic, and structural similarity measures, to understand the semantics of ontologies. A weight vector must therefore be assigned to these similarity measures if a more accurate and meaningful alignment result is desired. The problem of combining multiple measures into a single similarity metric is solved using weights determined by intelligent strategies [3]. Our two previous versions, XMapGen and XMapSig [2], achieved fair results, and the aim of their development was to deliver a stable version. Their major drawbacks were a very low time performance, especially on the Large Biomedical Ontologies tracks, the inability to recognize multiple labels of a single entity as synonyms, and the inability to recognize labels translated into different languages (e.g., Chinese, Czech, Dutch, French, German). After carefully studying this issue, we realized that the performance of our algorithm needed further assessment.</p>
        <p>This inspired us to consider new strategies in the new version, XMap++ 2014, such as: 1) using cosine similarity as a string similarity method to compare the textual descriptions of the concepts associated with the nodes (labels, names, identity, etc.) of each ontology; 2) running the matching in parallel on multiple cores or machines to deal with the scalability issue in ontology matching; 3) translating labels in different languages using Bing Translator (without using any services that require payment); 4) interfacing with the WordNet electronic dictionary through the Java WordNet Interface (JWI) library. In addition, XMap++ loads the WordNet dictionary fully into memory to save time when it aligns large-scale ontologies. Consequently, the new version XMap++ 2014 has improved both the matching quality and the time performance in large-scale ontology matching tasks.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
    </sec>
    <sec id="sec-2">
      <title>Specific techniques used</title>
      <p>The workflow and the main components of the system can be seen in Fig. 1. XMap++ consists of the following components:</p>
      <p>1. The matching inputs are two ontologies, source O and target O′, parsed by an Ontology Parser component;</p>
      <p>2. The String Matcher, based on linguistic matching, compares the textual descriptions of the concepts associated with the nodes (labels, names) of each ontology;</p>
      <p>3. The Linguistic Matcher aims at identifying words in the input strings, relying on WordNet [<xref ref-type="bibr" rid="ref7">7</xref>]. Because these matching techniques may produce incorrect match candidates, structural matching is used to correct such candidates based on their structural context. In order to deal with lexical ambiguity, we introduce the notion of the scope of a concept, which represents the context in which the concept is placed [<xref ref-type="bibr" rid="ref1">1</xref>]. The value of the linguistic methods is added to the linguistic matcher or the structural matcher in order to reduce semantic ambiguity during the comparison of entity names;</p>
      <p>4. The Structural Matcher aligns nodes based on their adjacency relationships. The relationships (e.g., subClassOf and is-a) that are frequently used in the ontology serve as the foundation of the structural matching;</p>
      <p>5. The three matchers perform similarity computation in which each entity of the source ontology is compared with all the entities of the target ontology, thus producing three similarity matrices, each of which contains a value for every pair of entities. After that, an aggregation operator is used to combine the similarity matrices computed by the different matchers into a single aggregated similarity matrix. We refer to [<xref ref-type="bibr" rid="ref3">3</xref>] for more details about the pruning and splitting techniques applied to the data matrices for pairs of entities;</p>
      <p>6. XMap++ uses three types of aggregation operators; these strategies are aggregation, selection and combination [<xref ref-type="bibr" rid="ref3">3</xref>];</p>
      <p>7. Finally, these values are filtered using a selection according to a defined threshold and the desired cardinality. In our algorithm, we adopt the 1-1 cardinality to find the optimal solution in polynomial time.
      </p>
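      <p>Steps 5 to 7 can be illustrated with a minimal sketch. It is not the XMap++ implementation: the weights, the threshold of 0.5 and the toy matrices are assumptions chosen for illustration (in XMap++ the weights are determined by the strategies described in [3]), and the greedy selection used here is a simplification that does not guarantee an optimal 1-1 assignment (an optimal one could be computed in polynomial time with, for instance, the Hungarian algorithm).</p>
      <preformatted><![CDATA[
```python
from typing import List, Tuple

Matrix = List[List[float]]


def aggregate(matrices: List[Matrix], weights: List[float]) -> Matrix:
    """Weighted average of similarity matrices that all have the same shape."""
    total = sum(weights)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [
            sum(w * m[i][j] for m, w in zip(matrices, weights)) / total
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def select_one_to_one(sim: Matrix, threshold: float) -> List[Tuple[int, int, float]]:
    """Greedy 1-1 selection: keep the best-scoring pairs, using each entity once."""
    candidates = sorted(
        (
            (score, i, j)
            for i, row in enumerate(sim)
            for j, score in enumerate(row)
            if score >= threshold
        ),
        key=lambda c: (-c[0], c[1], c[2]),
    )
    used_src, used_tgt, alignment = set(), set(), []
    for score, i, j in candidates:
        if i not in used_src and j not in used_tgt:
            used_src.add(i)
            used_tgt.add(j)
            alignment.append((i, j, score))
    return alignment


# Toy similarity matrices for 2 source and 3 target entities (invented values).
string_sim = [[0.9, 0.1, 0.2], [0.3, 0.8, 0.7]]
linguistic_sim = [[0.8, 0.2, 0.1], [0.2, 0.6, 0.9]]
structural_sim = [[0.7, 0.3, 0.3], [0.1, 0.7, 0.5]]

# Hypothetical weights; they are not the weights used by XMap++.
combined = aggregate([string_sim, linguistic_sim, structural_sim], [0.4, 0.3, 0.3])
alignment = select_one_to_one(combined, threshold=0.5)

for src, tgt, score in alignment:
    print(f"source[{src}] = target[{tgt}] (similarity {score:.2f})")
```
]]></preformatted>
      <p>In this toy example the source entity 1 scores almost equally against the target entities 1 and 2; the 1-1 cardinality keeps only the higher-scoring pair.</p>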
      <sec id="sec-2-1">
        <title>Results</title>
        <p>In this section, we present the evaluation results obtained by running XMap++ with the SEALS client on the Benchmark, Anatomy, Conference, Multifarm, Library and Large Biomedical Ontologies tracks. In addition, we present the results of the Ontology Alignment for Query Answering test, which does not follow the classical ontology alignment evaluation on the SEALS platform.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Benchmark</title>
      <p>XMap++ performs very well in terms of precision (1.0) but has a low recall (0.4) in the Benchmark track. The low recall is explained by the fact that, for ontological entities with scrambled labels, lexical similarity becomes ineffective. For the other two test suites, our algorithm performed worse in terms of F-measure because our system does not handle ontology instances. Table 1 summarises the average results obtained by XMap++.</p>
      <p>The Anatomy track consists of finding an alignment between the Adult Mouse Anatomy (2744 classes) and a part of the NCI Thesaurus (3304 classes) describing the human anatomy. XMap++ achieves a good F-measure value of ≈89% in an adequate amount of time (22 sec.) (see Table 2). In terms of F-measure/runtime, XMap++ ranked 3rd among the 10 tools that participated in this track.</p>
      <p>The Conference track uses a collection of 16 ontologies from the domain of academic conferences. Most ontologies were equipped with OWL DL axioms of various kinds; this opens a useful way to test our semantic matchers. The match quality was evaluated against the original reference alignment (ra1) as well as the entailed reference alignment (ra2). As Table 3 shows, for both evaluations we achieved F-measure values better than those of the two baselines (edna, StringEquiv).</p>
      <p>The Multifarm track is based on the translation of the OntoFarm collection of ontologies into 9 different languages. The results of XMap++ are shown in Table 4.</p>
      <p>The Library track involves the matching of the STW thesaurus (6,575 classes) and the Soz thesaurus (8,376 classes). Both of these thesauri provide vocabulary for economic and social sciences. The results are depicted in Table 5; our tool achieved a good recall of ≈88%, but the precision was low (50%). XMap++ requires ≈3 h 30 min, mainly because of the long time required to look up concepts in Bing Translator when it attempts to translate all the German labels into English.</p>
      <p>XMap++ achieved a good precision and a fair recall value. The fair recall value can be explained by the fact that WordNet does not contain definitions of highly technical medical terms, so the system is unable to match entities that are not found in the WordNet database. Using a different linguistic ontology should alleviate this problem; ideally, the system should automatically select the most appropriate linguistic ontology for this task.</p>
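      <p>As a reminder of how precision and recall combine, the short sketch below computes the balanced F-measure (the harmonic mean of precision and recall). We assume here that the standard F1 definition applies; the value obtained from the rounded Library figures quoted above is only indicative and need not coincide exactly with the officially reported one.</p>
      <preformatted><![CDATA[
```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded Library track figures quoted in the text: precision 50%, recall 88%.
library_f = f_measure(0.50, 0.88)
print(round(library_f, 3))
```
]]></preformatted>
      <p>Because the harmonic mean is dominated by the smaller of the two values, a high recall cannot compensate for a low precision, and vice versa.</p>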
    </sec>
    <sec id="sec-4">
      <title>Ontology Alignment for Query Answering</title>
      <p>The objective of this test is to verify the ability of the generated alignments to answer a set of queries in an ontology-based data access scenario where several ontologies exist. Table 7 shows the F-measure results for the whole set of queries. XMap++ was one of the four matchers whose alignments allowed all the queries of the evaluation to be answered.</p>
      <sec id="sec-4-1">
        <title>General comments</title>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Comments on the results</title>
      <p>This is the second time that we participate in the OAEI campaign. While we participated in the 2013 edition of the campaign with two configurations of our system, XMapGen and XMapSig, this year a single version has been submitted. Several changes have been introduced. The official results of OAEI 2014 show that XMap++ is competitive with other well-known ontology matching systems in all OAEI tracks; in particular, in the Library track it obtained the highest recall of all participating systems. The current version of XMap++ has shown a significant improvement both in terms of matching quality and runtime. Additionally, to tackle the large ontology matching problem, we improved the runtime of the algorithm using a divide-and-conquer approach that partitions the execution of the matchers into small threads and joins their results after each similarity calculation.</p>
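      <p>The divide-and-conquer idea can be sketched as follows. This is an illustrative sketch only, not the XMap++ code (which is written in Java): the function names, the chunk size, the number of workers and the placeholder similarity measure are all assumptions. The source entities are partitioned into chunks, each chunk is matched against all target entities in its own thread, and the partial results are joined in source order.</p>
      <preformatted><![CDATA[
```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher
from typing import List


def similarity(a: str, b: str) -> float:
    """Placeholder string similarity (not the measure used by XMap++)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_chunk(chunk: List[str], targets: List[str]) -> List[List[float]]:
    """Compute the similarity rows for one partition of the source entities."""
    return [[similarity(s, t) for t in targets] for s in chunk]


def parallel_matrix(
    sources: List[str], targets: List[str], chunk_size: int = 2, workers: int = 4
) -> List[List[float]]:
    """Split the sources into chunks, match the chunks in threads, join the rows."""
    chunks = [sources[i:i + chunk_size] for i in range(0, len(sources), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so rows stay in source order.
        parts = pool.map(lambda chunk: match_chunk(chunk, targets), chunks)
        return [row for part in parts for row in part]


# Invented labels, used only to exercise the sketch.
sources = ["Heart", "Lung", "Kidney", "Liver", "Brain"]
targets = ["heart", "lungs", "brain"]
matrix = parallel_matrix(sources, targets)
print(len(matrix), "rows x", len(matrix[0]), "columns")
```
]]></preformatted>
      <p>Note that in CPython, threads do not run CPU-bound code on several cores at once, so a process pool would be needed to obtain a real speed-up there; the sketch only illustrates the partition-and-join structure.</p>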
    </sec>
    <sec id="sec-6">
      <title>Discussions on the way to improve the proposed system</title>
      <p>Some possible approaches to improving our tool are listed as follows:</p>
      <p>1. Taking the comments and instance information of the ontologies into account, especially when the name of a concept is meaningless;</p>
      <p>2. Using the UMLS Metathesaurus to achieve a high recall when aligning ontologies from the biomedical domain;</p>
      <p>3. Pre-compiling a local dictionary in order to avoid multiple accesses to the Microsoft Translator during the matching process.</p>
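      <p>The third point can be sketched as a small translation cache that is stored on disk. The class name, the JSON file format and the stand-in translation function below are hypothetical; no real translation service is called. The remote service is queried only the first time a label is seen, and the cache can be reloaded in later runs.</p>
      <preformatted><![CDATA[
```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


class CachedTranslator:
    """Wraps a translation function with a dictionary persisted as JSON."""

    def __init__(self, translate: Callable[[str], str], cache_file: Path):
        self._translate = translate
        self._cache_file = cache_file
        self._cache: Dict[str, str] = {}
        if cache_file.exists():
            self._cache = json.loads(cache_file.read_text(encoding="utf-8"))

    def __call__(self, label: str) -> str:
        # Query the (slow) remote service only for labels not seen before.
        if label not in self._cache:
            self._cache[label] = self._translate(label)
        return self._cache[label]

    def save(self) -> None:
        text = json.dumps(self._cache, ensure_ascii=False, indent=2)
        self._cache_file.write_text(text, encoding="utf-8")


calls: List[str] = []


def fake_remote_translate(label: str) -> str:
    """Stand-in for a remote translation service; records every call."""
    calls.append(label)
    return {"Herz": "heart", "Lunge": "lung"}.get(label, label)


with tempfile.TemporaryDirectory() as tmp:
    cache_path = Path(tmp) / "de_en.json"
    translator = CachedTranslator(fake_remote_translate, cache_path)
    translated = [translator(x) for x in ["Herz", "Lunge", "Herz", "Herz"]]
    translator.save()

    # A later run reloads the dictionary and needs no remote call for known labels.
    reloaded = CachedTranslator(fake_remote_translate, cache_path)
    reloaded_result = reloaded("Lunge")

print(translated, "remote calls:", len(calls))
```
]]></preformatted>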
    </sec>
    <sec id="sec-7">
      <title>Comments on the OAEI 2013 procedure</title>
      <p>As second-time participants, we found the OAEI procedure very convenient and the organizers very supportive. The use of SEALS allows objective assessments. The OAEI test cases are varied, which leads to comparisons on different levels of difficulty, and this is very interesting. We found that the SEALS platform is a very valuable tool to compare the performance of our system with that of the others.</p>
      <sec id="sec-7-1">
        <title>Conclusion</title>
        <p>We have briefly described our fully automated ontology matching system XMap++ and presented the results achieved during the 2014 edition of the OAEI campaign. The obtained results show that XMap++ is able to efficiently and effectively match ontologies of different sizes. In the future, we want to participate in more tracks. Our ontology matching system still has some limitations. We intend to use the UMLS resource to better discard incorrect mappings for life sciences related ontologies.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Djeddi</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khadir</surname>
          </string-name>
          , M. T.:
          <article-title>A Novel Approach Using Context-Based Measure for Matching Large Scale Ontologies</article-title>
          .
          <source>In Proceedings of 16th International Conference on Data Warehousing and Knowledge Discovery (DAWAK</source>
          <year>2014</year>
          ),
          September 2-4
          , pp.
          <fpage>320</fpage>
          -
          <lpage>331</lpage>
          . Springer, Munich, Germany (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Djeddi</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khadir</surname>
          </string-name>
          , M. T.:
          <article-title>XMapGen and XMapSiG results for OAEI 2013</article-title>
          .
          <source>In Proceedings of the 8th International Workshop on Ontology Matching co-located with the 12th International Semantic Web Conference (ISWC</source>
          <year>2013</year>
          ), October 21, pp.
          <fpage>203</fpage>
          -
          <lpage>210</lpage>
          . CEUR-WS.org, Sydney, Australia (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Djeddi</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khadir</surname>
          </string-name>
          , M.T.:
          <article-title>Ontology alignment using artificial neural network for large-scale ontologies</article-title>
          .
          <source>In the International Journal of Metadata, Semantics and Ontologies (IJMSO)</source>
          , Vol.
          <volume>8</volume>
          , No.
          <issue>1</issue>
          , pp.
          <fpage>75</fpage>
          -
          <lpage>92</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Djeddi</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khadir</surname>
          </string-name>
          , M.T.:
          <article-title>Introducing artificial neural network in ontologies alignment process</article-title>
          .
          <source>In the Journal Control and Cybernetics</source>
          , Vol.
          <volume>41</volume>
          , No.
          <issue>4</issue>
          , pp.
          <fpage>743</fpage>
          -
          <lpage>759</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Djeddi</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khadir</surname>
          </string-name>
          , M.T. :
          <article-title>A dynamic multistrategy ontology alignment framework based on semantic relationships using WordNet</article-title>
          .
          <source>In Proc of the 3rd International Conference on Computer Science and its Applications (CIIA´11)</source>
          , 13-15 December, Saida, Algeria, pp.
          <fpage>149</fpage>
          -
          <lpage>154</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Djeddi</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khadir</surname>
          </string-name>
          , M.T.:
          <article-title>XMAP: a novel structural approach for alignment of OWL-full ontologies</article-title>
          .
          <source>In Proc. of the International Conference on Machine and Web Intelligence (ICMWI)</source>
          , pp.
          <fpage>347</fpage>
          -
          <lpage>352</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Fellbaum</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>WordNet: An Electronic Lexical Database</article-title>
          , MIT Press, Cambridge, MA (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Gross</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartung</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirsten</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Rahm</surname>
          </string-name>
          , E. :
          <article-title>On matching large life science ontologies in parallel</article-title>
          . In: Lambrix, P. and
          <string-name>
            <surname>Kemp</surname>
            ,
            <given-names>G.J.L</given-names>
          </string-name>
          . (Eds), DILS, Springer, pp.
          <fpage>35</fpage>
          -
          <lpage>49</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>