<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SimCat Results for OAEI 2016</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Abderrahmane Khiat</string-name>
          <email>khiat@yahoo.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elhabib Abdelillah Ouhiba</string-name>
          <email>ouhiba.ab@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohammed Amine Belfedhal</string-name>
          <email>Mohammed.belfedhal@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chihab Eddine Zoua</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>EEEDIS Lab, University Djillali Liabes</institution>
          ,
          <addr-line>Sidi Bel-Abbes</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LAMOSI Laboratory, Oran University of Science and Technology - Mohamed Boudiaf</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>LITIO Laboratory, University of Oran1 Ahmed Ben Bella</institution>
          ,
          <addr-line>Oran</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Ooredoo Algiers</institution>
          ,
          <country country="DZ">Algeria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Recently, multilingualism has attracted considerable attention in the ontology matching field. Designed to address this issue, the SimCat system uses the Yandex translator and a similarity computation based on the categories of the words. OAEI 2016 is the first evaluation campaign in which SimCat has participated, and the results obtained are quite promising.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Presentation of the System</title>
      <p>The Semantic Web relies on ontologies to describe the content of different information sources in order to overcome heterogeneity and achieve semantic interoperability [12, 14]. However, these ontologies are themselves heterogeneous, distributed, and may even be described in different natural languages. A solution to this heterogeneity is ontology alignment, which bridges the semantic gap between ontologies [11]. An ontology alignment system receives two or more ontologies as input and outputs a set of semantic correspondences between their entities [3, 2]. These semantic correspondences are the bridges that hold heterogeneous ontologies together and ensure their semantic interoperability.</p>
      <p>Moreover, given the enormous volume of ontologies already available on the web and their constant evolution, manual identification of semantic correspondences is not feasible [14]. Ontology alignment tools must therefore be able to identify semantic correspondences between entities of different ontologies automatically. However, automatic identification is not a trivial task because of the conceptual diversity between ontologies [4].</p>
      <p>Automatically aligning monolingual ontologies (for example, ontologies all written in English) is already difficult, and the task is even more challenging for multilingual ontologies. Most existing approaches implement a direct strategy [15], i.e., they rely on machine translation. However, matching remains challenging for these approaches because of misinterpretations introduced during translation. Research on the direct strategy leaves many questions open, such as: (1) does the choice of translator affect the translation output? (2) does translation into a pivot language (English) yield better output than
translation directly from one language to another? and (3) how should one proceed when translators give
poor results?</p>
      <p>The MultiFarm track [10] was integrated into the Ontology Alignment Evaluation
Initiative (OAEI) in 2012 with the goal of evaluating and comparing different
techniques and systems for multilingual ontology alignment. From 2012 to 2014,
the MultiFarm track contained the conference ontologies [9] described in eight different
languages (Chinese, Czech, Dutch, French, German, Portuguese, Russian, and Spanish).
In 2015, the Arabic language was added to MultiFarm [13, 14].</p>
      <p>
        Looking back at the results of the systems that took part in previous editions of the
MultiFarm track (2012 to 2015) [
        <xref ref-type="bibr" rid="ref5 ref6 ref7 ref8">5–8</xref>
        ], we observe that the best system across all those OAEI
editions achieved an F-measure of 0.51 [15]. This is surprisingly low given the many
research efforts devoted to multilingual ontology matching.
      </p>
      <p>The proposed system also implements a direct strategy; its aim is to highlight
the effect of the translator used and of a similarity measure computed from the categories of the words.</p>
    </sec>
    <sec id="sec-2">
      <title>State, Purpose, General Statement</title>
      <p>In this paper, we describe our SimCat software, yet another cross-lingual ontology
matching system. Unlike existing approaches, which use well-known translators,
SimCat employs the Yandex translator<sup>1</sup>. In addition, SimCat computes the similarities
between translated entities based on the categories of the words.</p>
    </sec>
    <sec id="sec-3">
      <title>Specific Techniques Used</title>
      <p>Our system proceeds through the following successive steps.</p>
      <p>Step 1: Extraction and Normalization. In this step, our system extracts the entities
of the two ontologies to be aligned. Then it uses a segmentation technique to split labels into
words. Finally, it converts all words to lower case.</p>
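      <p>As an illustration of Step 1, the label segmentation and lower-casing could be sketched as follows. This is a minimal sketch, not SimCat's actual code: the function name normalize_label and the splitting rules (camelCase boundaries, underscores, and other non-word characters) are assumptions, since the paper does not specify the segmentation technique.</p>
      <preformat>
```python
import re


def normalize_label(label):
    """Split an ontology entity label into lower-case words (illustrative only)."""
    pieces = []
    previous = ""
    for char in label:
        # Assumed rule: a lower-case letter followed by an upper-case one marks
        # a camelCase boundary, e.g. "hasAuthor" becomes "has Author".
        if previous.islower() and char.isupper():
            pieces.append(" ")
        pieces.append(char)
        previous = char
    spaced = "".join(pieces)
    # Assumed rule: underscores and runs of non-word characters separate words.
    # \W is Unicode-aware in Python 3, so non-Latin labels keep their letters.
    words = re.split(r"[\W_]+", spaced)
    return [word.lower() for word in words if word]


print(normalize_label("hasAuthor"))         # ['has', 'author']
print(normalize_label("Conference_Paper"))  # ['conference', 'paper']
```
      </preformat>
      <p>Keeping the splitting Unicode-aware matters here because the input labels may be written in any of the track's languages, not only in English.</p>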
      <p>Step 2: Translation and Cleaning. In this step, SimCat translates the normalized
entities into English, used as a pivot language, with the Yandex translator. To the best of our
knowledge, the Yandex translator has not previously been used by a multilingual ontology
matching system. We chose it because Yandex is one of the largest search engines
in the world and because the results obtained with it are quite promising.
We use English as the pivot language because the word categories
used for similarity computation are in English.</p>
      <p>Once the translation has been carried out, SimCat applies NLP techniques. First, it
eliminates the stop-words from the translated entities; then it applies lemmatization and
stemming. This step is necessary because the categories of the words are stored in lemma
form.</p>
      <p><sup>1</sup> https://translate.yandex.com/?lang=es-en&amp;text=administrar&amp;ncrnd=5317</p>
      <p>Step 3: Similarity Computation. In this step, our system computes the similarity
between entities using the categories of words. This matcher is based on an open project
named “Calculate Semantic Similarity”.</p>
      <p>The project<sup>2</sup> calculates the similarities between sentences, and its results are stable.
It works as follows. First, a list of words was obtained from
EOWL; then the categories of each word were computed using DISCO's
semantics<sup>3</sup>. The semantic categories are obtained from the following DISCO word spaces: (1)
en-BNC-20080721, with 119 million tokens; (2) en-PubMedOA-20070501, with 181
million tokens; and (3) en-wikipedia-20080101, with 267 million tokens. The matcher
extends the vector-space analysis found in Classifier4j, which by itself does
not take the semantic meaning of words into account.</p>
      <p>We have adapted this matcher to our case: we reprogrammed it so
that it returns a similarity value between words. We ran some tests on the adapted
matcher, and the results were quite good.</p>
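      <p>The paper does not give the formula used by the adapted matcher. As a rough illustration of a vector-space comparison over word categories, the sketch below represents each word as a bag of category labels and compares two words by cosine similarity. The category table, the names CATEGORIES and category_similarity, and the choice of cosine similarity are all assumptions made for illustration; in SimCat the categories are derived from DISCO.</p>
      <preformat>
```python
import math
from collections import Counter

# Hypothetical word-to-category table; in SimCat the categories come from DISCO.
CATEGORIES = {
    "paper": ["document", "publication", "writing"],
    "article": ["document", "publication", "press"],
    "chair": ["person", "role"],
}


def category_similarity(word_a, word_b, categories=CATEGORIES):
    """Cosine similarity between the category bags of two words (0.0 if unknown)."""
    bag_a = Counter(categories.get(word_a, []))
    bag_b = Counter(categories.get(word_b, []))
    if not bag_a or not bag_b:
        return 0.0
    # Dot product; a Counter returns 0 for categories the other word lacks.
    dot = sum(count * bag_b[category] for category, count in bag_a.items())
    norm_a = math.sqrt(sum(count * count for count in bag_a.values()))
    norm_b = math.sqrt(sum(count * count for count in bag_b.values()))
    return dot / (norm_a * norm_b)


print(round(category_similarity("paper", "article"), 3))  # 0.667
print(category_similarity("paper", "chair"))              # 0.0
```
      </preformat>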
      <p>Step 4: Identification of Alignment. In this step, SimCat applies a first filter that
selects, as candidate correspondences, the pairs with the maximum similarity value in each
row of the Cartesian product between entities. It then applies a second filter that keeps only
the candidates whose similarity value is above a given threshold.</p>
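      <p>The two filters of Step 4 can be sketched as follows, assuming the similarity values are stored per source entity as a row of scores against the target entities. The function name select_alignment, the toy scores, and the threshold value 0.5 are hypothetical; the paper does not report the threshold SimCat uses.</p>
      <preformat>
```python
def select_alignment(similarity, threshold):
    """Keep, for each source entity, its best target if the score exceeds the threshold."""
    alignment = []
    for source, row in similarity.items():
        if not row:
            continue
        # First filter: the candidate with the maximum similarity in this row.
        best_target, best_score = max(row.items(), key=lambda item: item[1])
        # Second filter: discard candidates that do not exceed the threshold.
        if best_score > threshold:
            alignment.append((source, best_target, best_score))
    return alignment


# Toy similarity values between entities of two ontologies (made up for illustration).
similarity = {
    "paper": {"article": 0.82, "author": 0.10},
    "chair": {"article": 0.05, "author": 0.30},
}
print(select_alignment(similarity, threshold=0.5))  # [('paper', 'article', 0.82)]
```
      </preformat>
      <p>In this toy run, "chair" produces no correspondence because even its best candidate stays below the threshold.</p>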
    </sec>
    <sec id="sec-4">
      <title>Adaptations Made for the Evaluation</title>
      <p>We have not made any specific adaptation of the SimCat system for the OAEI 2016
evaluation campaign. All parameters are the same for aligning the different
ontologies of the MultiFarm track.</p>
    </sec>
    <sec id="sec-5">
      <title>Link to the set of provided alignments (in align format)</title>
      <p>The results of the SimCat system can be downloaded from the OAEI 2016 website:
http://oaei.ontologymatching.org/2016/results/multifarm/index.html</p>
      <sec id="sec-5-1">
        <title>Results</title>
        <p>SimCat is yet another multilingual ontology alignment system. The results obtained
by running SimCat on the MultiFarm track of the OAEI 2016 evaluation campaign are available
on the website
http://oaei.ontologymatching.org/2016/results/multifarm/index.html.</p>
        <p>The MultiFarm track consists of seven ontologies. These ontologies describe
the conference domain and are based on the ontologies of the OAEI conference track.
They have been translated into nine different languages (Arabic, included since 2015,
Chinese, Czech, Dutch, French, German, Portuguese, Russian,
and Spanish), together with the corresponding alignments between these ontologies. The purpose
of MultiFarm is to evaluate and compare the performance of matching approaches with
a special focus on multilingualism.</p>
        <p><sup>2</sup> http://wordnet.princeton.edu/</p>
        <p><sup>3</sup> www.linguatools.de/disco/disco_en.html</p>
      </sec>
      <sec id="sec-5-2">
        <title>General Comments</title>
        <p>The evaluation conducted on the SimCat system confirmed the following points:</p>
        <list list-type="bullet">
          <list-item>
            <p>The results obtained with the Yandex translator API are quite promising.</p>
          </list-item>
          <list-item>
            <p>The similarity based on the categories of the words can provide good results.</p>
          </list-item>
          <list-item>
            <p>Overall, the SimCat system provides promising results, achieving a good
F-measure; however, it takes 24 minutes of computation time for each task. This is
a drawback of the proposed system, since MultiFarm contains 55
tasks (roughly 22 hours in total at that rate).</p>
          </list-item>
        </list>
      </sec>
      <sec id="sec-5-3">
        <title>Conclusion</title>
        <p>In this paper, we have presented SimCat, an automatic matching system developed
specifically for aligning multilingual ontologies. SimCat combines a matcher
based on the categories of words with translation by the Yandex engine to find
semantic correspondences between the concepts of two ontologies described
in different natural languages. For this first participation of SimCat, in
OAEI 2016, the results are acceptable; however, much work remains to
improve our system.</p>
        <p>9. O. Svab, V. Svatek, P. Berka, D. Rak and P. Tomasek, “OntoFarm: Towards an Experimental
Collection of Parallel Ontologies”, In: Poster Track of ISWC 2005, Galway, 2005.</p>
        <p>10. C. Meilicke, R. García-Castro, F. Freitas, W.R. Van Hage, E. Montiel-Ponsoda, R.R. De
Azevedo, H. Stuckenschmidt, O. Šváb-Zamazal, V. Svátek and A. Tamilin, “MultiFarm: A
benchmark for multilingual ontology matching”. Web Semant. Sci. Serv. Agents World Wide
Web. Vol. 15, pp. 62–68, 2012.</p>
        <p>11. A. Khiat and M. Benaissa, A New Instance-Based Approach for Ontology Alignment.
International Journal on Semantic Web and Information Systems (IJSWIS), Vol. 11, No. 3, ISSN
1683-3198, 2015.</p>
        <p>12. A. Khiat and M. Benaissa, Boosting Reasoning-Based Approach by Structural Metrics for
Ontology Alignment. The Journal of Information Processing Systems (JIPS), 2015.</p>
        <p>13. A. Khiat, M. Benaissa and E. Jiménez-Ruiz, ADOM: Arabic Dataset for Evaluating
Arabic and Cross-Lingual Ontology Alignment Systems. In Proceedings of the 10th
International Workshop on Ontology Matching co-located with the 14th International Semantic Web
Conference (ISWC 2015), USA, 2015.</p>
        <p>14. A. Khiat, G. Diallo, B. Yaman, E. Jiménez-Ruiz and M. Benaissa, ABOM and ADOM:
Arabic Datasets for the Ontology Alignment Evaluation Campaign. In Proceedings of the 14th
International Conference (ODBASE 2015), Greece, 2015.</p>
        <p>15. A. Khiat, CroLOM: Cross-Lingual Ontology Matching System Results for OAEI 2016. In
Proceedings of the 12th International Workshop on Ontology Matching co-located with the
15th International Semantic Web Conference (ISWC 2016), Japan, 2016.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>M.</given-names>
            <surname>Ehrig</surname>
          </string-name>
          , “Ontology Alignment: Bridging the Semantic Gap”, Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>P.</given-names>
            <surname>Shvaiko</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          , “Ontology Matching:
          <article-title>State of the Art and Future Challenges”</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data</source>
          Engineering vol.
          <volume>25</volume>
          no.
          <issue>1</issue>
          , pp.
          <fpage>158</fpage>
          -
          <lpage>176</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Shvaiko</surname>
          </string-name>
          , “Ontology Matching”, Springer-Verlag, Heidelberg,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>P.</given-names>
            <surname>Bouquet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Franconi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Serafini</surname>
          </string-name>
          ,
          <string-name>
            <surname>G.</surname>
          </string-name>
          <article-title>Stamou and S. Tessaris “Specification of a Common Framework for Characterizing Alignment”</article-title>
          ,
          <source>Deliverable 2.2</source>
          .1,
          <string-name>
            <given-names>Knowledge</given-names>
            <surname>Web</surname>
          </string-name>
          <string-name>
            <surname>NoE</surname>
          </string-name>
          ,
          <source>Technical Report, Italy</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>M.</given-names>
            <surname>Cheatham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dragisic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferrara</surname>
          </string-name>
          , G. Flouris,
          <string-name>
            <surname>I. Fundulaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Granada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ivanova</surname>
          </string-name>
          ,
          <string-name>
            <surname>E.</surname>
          </string-name>
          <article-title>Jiménez-</article-title>
          <string-name>
            <surname>Ruiz</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Lambrix</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Montanelli</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Pesquita</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Saveta</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Shvaiko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Solimando</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Trojahn</surname>
            and
            <given-names>O.</given-names>
          </string-name>
          <string-name>
            <surname>Zamazal</surname>
          </string-name>
          , “
          <source>Results of the Ontology Alignment Evaluation Initiative</source>
          <year>2015</year>
          ”, 10th Workshop on Ontology Matching,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dragisic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Eckert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferrara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Granada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ivanova</surname>
          </string-name>
          , E. Jiménez-Ruiz,
          <string-name>
            <given-names>A. O.</given-names>
            <surname>Kempf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lambrix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Montanelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Paulheim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ritze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shvaiko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Solimando</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          <article-title>Trojahn-dos-</article-title>
          <string-name>
            <surname>Santos</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <string-name>
            <surname>Zamazal</surname>
            and
            <given-names>B. Cuenca</given-names>
          </string-name>
          <string-name>
            <surname>Grau</surname>
          </string-name>
          , “
          <source>Results of the Ontology Alignment Evaluation Initiative</source>
          <year>2014</year>
          ”, 9th Workshop on Ontology Matching,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>B.</given-names>
            <surname>Cuenca Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dragisic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Eckert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferrara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Granada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ivanova</surname>
          </string-name>
          ,
          <string-name>
            <surname>E.</surname>
          </string-name>
          <article-title>Jiménez-</article-title>
          <string-name>
            <surname>Ruiz</surname>
            ,
            <given-names>A. Oskar</given-names>
          </string-name>
          <string-name>
            <surname>Kempf</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Lambrix</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Nikolov</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Ritze</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Scharffe</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Shvaiko</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <article-title>Trojahn dos Santos, O</article-title>
          . Zamazal, “
          <source>Results of the Ontology Alignment Evaluation Initiative</source>
          <year>2013</year>
          ”. 8th Workshop on Ontology Matching,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>J.</given-names>
            <surname>Aguirre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Eckert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferrara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.R.v.</given-names>
            <surname>Hage</surname>
          </string-name>
          , L. Hollink, Ch. Meilicke,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ritze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Scharffe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shvaiko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Šváb-Zamazal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Trojahn</surname>
          </string-name>
          ,
          <string-name>
            <surname>E.</surname>
          </string-name>
          <article-title>Jiménez-</article-title>
          <string-name>
            <surname>Ruiz</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Cuenca-Grau</surname>
            and
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Zapilko</surname>
          </string-name>
          , “
          <source>Results of the Ontology Alignment Evaluation Initiative</source>
          <year>2012</year>
          ”, 7th Workshop on Ontology Matching,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>