<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>InsMT / InsMTL Results for OAEI 2014 Instance Matching</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Abderrahmane Khiat</string-name>
          <email>abderrahmane_khiat@yahoo.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Moussa Benaissa</string-name>
          <email>moussabenaissa@yahoo.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LITIO Lab, University of Oran</institution>
          ,
          <addr-line>BP 1524 El-Mnaouar Oran</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>InsMT and InsMTL are automatic instance-based ontology alignment systems. As a first step, both systems annotate the instances. In the second step, InsMT applies several terminological matchers, with a local filter, to the annotated instances. InsMTL, in contrast, matches the annotated instances at the linguistic level as well as the terminological level. For this first version of our systems and our first participation in the OAEI 2014 evaluation campaign, the results are good in terms of recall but poor in terms of F-measure.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>State, purpose, general statement</title>
      <p>Instance matching aims to identify similar instances across different ontologies.
The systems InsMT (Instance Matching at Terminological level) and InsMTL
(Instance Matching at Terminological and Linguistic level) were developed for this
purpose. Both are automatic instance-based ontology alignment systems; each produces
as output an alignment that contains all the semantic correspondences found between
the instances of different concepts of the two ontologies to be aligned.</p>
      <p>As a first step, both systems annotate the instances with concept and property
names.</p>
      <p>As a second step, InsMT applies several string-based matching algorithms, i.e. it
works at the terminological level. The similarities calculated by each algorithm are
stored in a matrix. InsMT applies a local filter to each matrix and then combines the
filtered similarities using the average aggregation method.</p>
      <p>In contrast to InsMT, InsMTL calculates similarities between annotated instances
at the linguistic level as well as the terminological level. InsMTL combines the
similarities calculated by the string-based matching algorithms (terminological level)
with similarities calculated using an external resource, WordNet (linguistic level).
When the similarities are combined, priority is given to the linguistic matcher;
otherwise the average aggregation method is used.</p>
      <p>Finally, both systems apply a filter in order to select the semantic
correspondences between instances of different ontologies.</p>
      <p>The details of each step of the InsMT and InsMTL systems are described in the
following section.</p>
    </sec>
    <sec id="sec-3">
      <title>Specific techniques used</title>
      <p>The process followed by InsMT and InsMTL consists of two successive steps:
1) Annotation and Calculation of Similarities, and 2) Combination and Extraction of
Alignment.</p>
      <sec id="sec-3-1">
        <title>A. InsMT system</title>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Step 1: Annotation and Calculation of Similarities</title>
      <sec id="sec-4-1">
        <title>1.2.1.1 Phase 1: Extraction of Entities of the Ontologies</title>
        <p>In this phase, our system takes as input the two ontologies to be aligned and
extracts their instances.</p>
      </sec>
      <sec id="sec-4-2">
        <title>1.2.1.2 Phase 2: Annotation of Instances</title>
        <p>In this phase, our system annotates the instances with the name and label of
their concept and with their property names. The purpose of this annotation is to
enrich the instances with terminological information. This step is very important,
especially when the instances contain terminological information.</p>
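        <p>The annotation phase can be illustrated with a minimal sketch. The Instance class, its field names and the lower-casing step are assumptions made for illustration only; the paper does not specify the data structures used by InsMT and InsMTL.</p>
        <preformat>
```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical container for an extracted instance (not from the paper)."""
    identifier: str
    concept_name: str
    concept_label: str
    properties: dict = field(default_factory=dict)  # property name -> value


def annotate(instance):
    """Build an annotation string from concept name, concept label and property names."""
    parts = [instance.concept_name, instance.concept_label]
    parts.extend(sorted(instance.properties))
    # Assumption: lower-case everything so that string matchers compare like with like.
    return " ".join(part.lower() for part in parts if part)


book = Instance("ex:item42", "Book", "Printed book", {"title": "Ulysses", "author": "Joyce"})
annotation = annotate(book)
print(annotation)  # book printed book author title
```
</preformat>
        <p>The resulting string is what a terminological matcher would then compare across the two ontologies.</p>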
      </sec>
      <sec id="sec-4-3">
        <title>1.2.1.3 Phase 3: The Applied Matchers</title>
        <p>In this phase, our system calculates the similarities between the instances
annotated in the previous phase using several string-based matching algorithms:
Levenshtein distance, Jaro, and SLIM-Winkler. The similarities calculated by each
string-matching algorithm are stored in a matrix.</p>
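        <p>As a rough illustration of this phase, the sketch below builds one similarity matrix per matcher over annotated instance strings. It uses textbook formulations of a normalised Levenshtein similarity and of the Jaro similarity; the normalisation, the function names and the sample strings are assumptions, and SLIM-Winkler is left out because the paper does not define it.</p>
        <preformat>
```python
def levenshtein_similarity(a, b):
    """1 - edit_distance / max_length, so identical strings score 1.0."""
    if not a and not b:
        return 1.0
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return 1.0 - previous[-1] / max(len(a), len(b))


def jaro_similarity(a, b):
    """Textbook Jaro similarity in [0, 1]."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == ca:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def similarity_matrix(source, target, matcher):
    """Rows follow the source instances, columns follow the target instances."""
    return [[matcher(s, t) for t in target] for s in source]


source = ["book title", "person name"]  # hypothetical annotated instances
target = ["book title", "author name"]
matrices = {
    "levenshtein": similarity_matrix(source, target, levenshtein_similarity),
    "jaro": similarity_matrix(source, target, jaro_similarity),
}
```
</preformat>
        <p>Keeping one matrix per matcher, rather than merging scores immediately, lets each matcher be filtered with its own threshold before the matrices are combined.</p>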
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Step 2: Combination and Extraction of Alignment</title>
      <sec id="sec-5-1">
        <title>1.2.2.1 Phase 1: Local Filter</title>
        <p>In this first phase of the second step, our system applies a local filter to each
matrix: for each string-based matching algorithm we choose a threshold, and the
similarities below that threshold are set to 0. The intuition behind this local filter is
that similarities below the threshold could otherwise distort the average
aggregation.</p>
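        <p>A minimal sketch of such a local filter follows. The matrix values and the threshold of 0.5 are hypothetical; the paper does not report the thresholds actually used.</p>
        <preformat>
```python
def local_filter(matrix, threshold):
    """Set every similarity below `threshold` to 0 and leave the rest untouched."""
    return [[value if value >= threshold else 0.0 for value in row] for row in matrix]


levenshtein_matrix = [[0.90, 0.35], [0.20, 0.75]]  # hypothetical similarities
filtered = local_filter(levenshtein_matrix, threshold=0.5)  # threshold is an assumption
print(filtered)  # [[0.9, 0.0], [0.0, 0.75]]
```
</preformat>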
      </sec>
      <sec id="sec-5-2">
        <title>1.2.2.2 Phase 2: Aggregation of Similarities</title>
        <p>In this phase, our system combines the similarities of the matrices (after the
local filter has been applied) using the average aggregation method; the result of the
aggregation is stored in a matrix.</p>
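        <p>The average aggregation can be sketched as a cell-wise mean over the filtered matrices. The sketch assumes that all matrices have the same shape and that values zeroed by the local filter still count in the mean; the paper does not state how such zeros are treated.</p>
        <preformat>
```python
def average_aggregation(matrices):
    """Cell-wise mean of equally shaped similarity matrices."""
    count = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) / count for j in range(cols)]
            for i in range(rows)]


# Hypothetical filtered matrices, one per string-based matcher.
levenshtein = [[1.0, 0.0], [0.0, 0.5]]
jaro = [[0.5, 0.0], [0.5, 1.0]]
combined = average_aggregation([levenshtein, jaro])
print(combined)  # [[0.75, 0.0], [0.25, 0.75]]
```
</preformat>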
      </sec>
      <sec id="sec-5-3">
        <title>1.2.2.3 Phase 3: Global Filter and Identification of Alignment</title>
        <p>In this final phase, our system applies a second filter to the combined matrix
(the result of the previous phase) in order to select the correspondences, using the
maximum strategy with a threshold.</p>
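        <p>One possible reading of the maximum strategy with a threshold is sketched below: for each source instance, keep the target instance with the highest combined similarity, provided that similarity reaches the threshold. The row-wise interpretation, the identifiers and the threshold of 0.6 are assumptions; the paper does not give these details.</p>
        <preformat>
```python
def select_correspondences(matrix, source_ids, target_ids, threshold):
    """Keep, per source instance, the best-scoring target if it reaches the threshold."""
    correspondences = []
    for i, row in enumerate(matrix):
        if not row:
            continue
        best_j = max(range(len(row)), key=row.__getitem__)
        if row[best_j] >= threshold:
            correspondences.append((source_ids[i], target_ids[best_j], row[best_j]))
    return correspondences


combined = [[0.75, 0.10], [0.25, 0.30]]  # hypothetical combined matrix
alignment = select_correspondences(combined, ["a1", "a2"], ["b1", "b2"], threshold=0.6)
print(alignment)  # [('a1', 'b1', 0.75)]
```
</preformat>
        <p>Here the second source instance produces no correspondence because its best score, 0.30, stays below the threshold.</p>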
        <sec id="sec-5-3-1">
          <title>B. InsMTL system</title>
          <p>This section describes the differences between the InsMT and InsMTL systems.</p>
          <p>First, InsMTL adds a matcher at the linguistic level in the phase “The Applied
Matchers” (1.2.1.3); this matcher uses an external dictionary, WordNet.</p>
          <p>In the second step, InsMTL does not apply a local filter (phase 1.2.2.1): the
similarities calculated by each matcher are stored in a matrix without local filtering.</p>
          <p>In the phase “Aggregation of Similarities”, InsMTL gives priority to WordNet:
if the similarity value calculated using WordNet is greater than the similarity value
calculated using the string-matching algorithms, the combined matrix takes the
WordNet similarity; otherwise the average aggregation method is used. The result of
the aggregation is stored in a matrix.</p>
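          <p>The priority rule can be sketched as follows. Two points are assumptions, since the text leaves them open: the WordNet score is compared against the highest string-based score, and the fallback average is taken over all matchers, WordNet included. The matrix values are hypothetical.</p>
          <preformat>
```python
def priority_aggregation(wordnet, string_matrices):
    """Use the WordNet score when it beats every string-based score, else average."""
    combined = []
    for i, wordnet_row in enumerate(wordnet):
        row = []
        for j, wordnet_score in enumerate(wordnet_row):
            string_scores = [m[i][j] for m in string_matrices]
            if wordnet_score > max(string_scores):
                row.append(wordnet_score)  # linguistic matcher takes priority
            else:
                # Assumption: the average includes the WordNet score itself.
                row.append((wordnet_score + sum(string_scores)) / (len(string_scores) + 1))
        combined.append(row)
    return combined


wordnet = [[1.0, 0.25]]
levenshtein = [[0.5, 0.5]]
jaro = [[0.75, 0.75]]
combined = priority_aggregation(wordnet, [levenshtein, jaro])
print(combined)  # [[1.0, 0.5]]
```
</preformat>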
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Adaptations made for the evaluation</title>
      <p>We did not make any specific adaptation of the first version of InsMT and
InsMTL for the OAEI 2014 evaluation campaign.</p>
    </sec>
    <sec id="sec-7">
      <title>Link to the system and parameters file</title>
      <p>The first version of the InsMT and InsMTL systems submitted to OAEI 2014 can be
downloaded from the SEALS project at http://www.seals-project.eu/.</p>
    </sec>
    <sec id="sec-8">
      <title>Link to the set of provided alignments (in align format)</title>
      <p>The results of the InsMT and InsMTL systems can be downloaded from the SEALS
project at http://www.seals-project.eu/.</p>
      <p>Results</p>
      <p>In this section, we present the results obtained by running InsMT and InsMTL on
the instance matching track of the OAEI 2014 evaluation campaign.</p>
    </sec>
    <sec id="sec-9">
      <title>Instance Matching</title>
      <p>The instance matching track aims to evaluate tools that identify similar instances
across different RDF and OWL ontologies. Both of our systems annotate the instances
with concept and property names as a first step. As a second step, InsMT applies
several string-based matching algorithms to the annotated instances in order to find
correspondences between them, while InsMTL additionally uses a matcher at the
linguistic level in order to select semantic correspondences between instances of
different concepts.</p>
      <p>Table 1 and Table 2 below present the results obtained by running InsMT and
InsMTL on the instance matching track of the OAEI 2014 campaign.</p>
      <p>Identity Recognition Task</p>
      <p>The goal of the id-rec task is to determine when two OWL instances describe the
same real-world entity.</p>
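      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr>
              <th>Identity Recognition Task</th>
              <th>InsMT</th>
              <th>InsMTL</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Precision</td>
              <td>0.0008</td>
              <td>0.0008</td>
            </tr>
            <tr>
              <td>Recall</td>
              <td>0.7785</td>
              <td>0.7785</td>
            </tr>
            <tr>
              <td>F-measure</td>
              <td>0.0015</td>
              <td>0.0015</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>The goal of the sim-rec task is to evaluate the degree of similarity between two OWL
instances, even when the two instances describe different real-world entities.</p>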
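      <table-wrap>
        <label>Table 2</label>
        <table>
          <thead>
            <tr>
              <th>Identity Recognition Task</th>
              <th>F-measure</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>InsMT</td>
              <td>d(InsMT) = 37.03</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>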
      <p>General comments</p>
    </sec>
    <sec id="sec-10">
      <title>Comments on the results</title>
      <p>This is the first time that our systems have participated in the instance matching
track of the OAEI evaluation campaign, and InsMT and InsMTL are new to the SEALS
Platform. On the identity recognition task they achieve good recall (0.7785) but a poor
F-measure (0.0015).</p>
    </sec>
    <sec id="sec-11">
      <title>Discussions on the way to improve the proposed system</title>
      <p>InsMT and InsMTL are automatic instance-based ontology matching systems
designed to find correspondences between instances of different concepts.</p>
      <p>The objective behind the implementation of InsMT and InsMTL is first to find the
best annotation strategy. InsMT applies a different aggregation and filtering strategy,
as proposed in section 1.2.2.1 (a local filter). In contrast to InsMT, the objective
behind the implementation of InsMTL is to discover more semantic correspondences
by adding other matchers. For now, we have used matchers at the terminological and
linguistic levels.</p>
      <p>As mentioned before, InsMT and InsMTL use terminological information for
annotation and matching; when the ontologies do not contain this information, both
systems fail. Neither system handles instances of ontologies written in different
languages; in the future we hope to add a module that translates them into the same
language.</p>
      <p>Another point to be discussed is how to make our systems flexible, i.e. how to
choose the thresholds for the various matchers (terminological and linguistic). Clearly,
we cannot fix a single threshold for all instances in order to find the correspondences
automatically, because each ontology contains its own instances and has its own
specific characteristics.</p>
      <p>Conclusion</p>
      <p>This is the first time that InsMT and InsMTL have participated in the SEALS
platform and OAEI 2014. InsMT and InsMTL are instance-based ontology alignment
systems; this year, both systems participated in the instance matching track of the
OAEI 2014 evaluation campaign.</p>
      <p>Initially, InsMT and InsMTL annotate instances with concept and property names.
The purpose of this annotation is to enrich the instances with terminological
information.</p>
      <p>InsMT calculates similarities between these annotated instances using several
string-based matching algorithms. The similarities calculated by the different matchers
are combined using average aggregation, after a local filter has been applied to each
matrix.</p>
      <p>InsMTL calculates similarities between these annotated instances using the
terminological and linguistic matchers. The similarities calculated by the different
matchers are combined using average aggregation, with priority given to the
linguistic matcher.</p>
      <p>As a final step, both systems apply a filter to the combined matrix to select the
semantic correspondences between instances of different concepts of the
ontologies.</p>
      <p>Finally, the results show that our systems achieve good recall but a poor
F-measure. In future work we plan to select the best aggregation and filtering strategy
and to add other matchers, such as structure-based and reasoning-based matchers.</p>
    </sec>
  </body>
  <back>
  </back>
</article>