<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ALIN Results for OAEI 2017</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jomar da Silva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fernanda Araujo Baião</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kate Revoredo</string-name>
          <email>katerevoredog@uniriotec.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Graduate Program in Informatics, Department of Applied Informatics, Federal University of the State of Rio de Janeiro (UNIRIO)</institution>
          ,
          <country country="BR">Brazil</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>ALIN is an ontology alignment system specialized in the interactive alignment of ontologies. Its main characteristic is that it selects the correspondences to be shown to the expert depending on the feedback previously given by the expert. This selection is based on semantic and structural characteristics. ALIN obtained the highest-quality alignment in the interactive track for the Conference data set. This paper describes its configuration for the OAEI 2017 competition and discusses its results.</p>
      </abstract>
      <kwd-group>
        <kwd>ontology matching</kwd>
        <kwd>Wordnet</kwd>
        <kwd>interactive ontology matching</kwd>
        <kwd>ontology alignment</kwd>
        <kwd>interactive ontology alignment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>State, purpose, general statement</title>
      <p>ALIN is an ontology alignment system specialized in interactive ontology
alignment. It is based primarily on linguistic matching techniques and uses Wordnet
as an external resource. After an initial set of correspondences is generated (the set
of candidate correspondences, that is, the correspondences selected to receive
feedback from the expert), interactions are made with the expert, and at each
interaction the set of candidate correspondences is modified. The set is modified
through structural analysis of the ontologies and the use of correspondence
anti-patterns. The interactions continue until no candidate correspondences are
left. ALIN was built with a special focus on the interactive matching track of
OAEI 2017.</p>
    </sec>
    <sec id="sec-3">
      <title>Specific techniques used</title>
      <sec id="sec-3-1">
        <title>The ALIN algorithm is shown in Algorithm 1.</title>
      </sec>
      <sec id="sec-3-2">
        <title>Algorithm 1 ALIN algorithm</title>
        <p>Input: Two ontologies to be aligned
Output: Alignment between the two ontologies
1: Load the ontologies
2: Generate the initial set of candidate correspondences
3: Automatically classify correspondences
4: Remove correspondences with a low semantic similarity value
5: while the set of candidate correspondences is not empty do
6: Choose correspondences to show to the expert
7: Receive expert feedback on the chosen correspondences and remove them from the set of candidate correspondences
8: Remove correspondences that are in a correspondence anti-pattern from the set of candidate correspondences
9: Insert some data property and object property correspondences into the set of candidate correspondences
10: Insert some correspondences from the backup set into the set of candidate correspondences
11: end while</p>
        <p>The steps of the ALIN algorithm are the following:
1. The ontologies are loaded, including their classes, object properties and data
properties, through the Align API<sup>1</sup>. For each entity, some data are stored, such
as its name and label. For classes, their superclasses and disjoint classes are
saved. For object properties, the properties that are their hypernyms
and their associated classes are saved. The classes of data properties are saved,
too. ALIN does not use instances. ALIN can only work with ontologies
whose entity names are in English.</p>
        <p>
          2. The initial set of candidate correspondences is generated by a stable marriage
algorithm with incomplete preference lists, with the maximum list size equal to 1,
using linguistic metrics to sort the preference lists [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Each list is sorted in
decreasing order of similarity. With this algorithm, a correspondence is selected only
when its first entity is in the list of its second entity and vice versa. The linguistic
metrics used are Jaccard, Jaro-Winkler and n-Gram [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], provided by the Simmetrics API<sup>2</sup>, and
Resnik, Jiang-Conrath and Lin [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], which use Wordnet and are provided by the HESML API<sup>3</sup>.
To use Wordnet, the canonical form of the entity names is needed; therefore the
Stanford CoreNLP API<sup>4</sup> was used. The most frequent synsets of the words are used
to calculate semantic similarities. The WS4J API<sup>5</sup> is used to find these synsets.
The algorithm is run six times, once for each metric, and the result set is the
union of the results obtained for each metric.
        </p>
        <p>1 Alignment API. Available at http://alignapi.gforge.inria.fr/ (last accessed on Oct 10, 2017).</p>
        <p>2 String Similarity Metrics for Information Integration. Available at http://www.coli.uni-saarland.de/courses/LT1/2011/slides/stringmetrics.pdf (last accessed on Oct 10, 2017).</p>
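        <p>With preference lists of size 1, the selection rule in step 2 amounts to keeping a pair only when each entity is the single best match of the other under a given metric, and then taking the union over the metrics. The following is a minimal Python sketch of that rule, not ALIN's implementation: the two toy metrics and all function names are assumptions made for illustration, standing in for the Simmetrics and HESML metrics named above.</p>
        <preformat>
```python
from typing import Callable, List, Set, Tuple

Metric = Callable[[str, str], float]


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lower-cased word tokens (toy stand-in)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 1.0 if ta == tb else 0.0
    return len(ta.intersection(tb)) / len(ta.union(tb))


def bigram(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (toy n-gram stand-in)."""
    def grams(s: str) -> Set[str]:
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    ga, gb = grams(a), grams(b)
    if not ga or not gb:
        return 1.0 if a.lower() == b.lower() else 0.0
    return 2 * len(ga.intersection(gb)) / (len(ga) + len(gb))


def best_match(name: str, others: List[str], metric: Metric) -> str:
    """Preference list of size 1: the single most similar name."""
    return max(others, key=lambda other: metric(name, other))


def mutual_best(src: List[str], tgt: List[str],
                metric: Metric) -> Set[Tuple[str, str]]:
    """Keep (s, t) only when s prefers t and t prefers s."""
    src_pref = {s: best_match(s, tgt, metric) for s in src}
    tgt_pref = {t: best_match(t, src, metric) for t in tgt}
    return {(s, t) for s, t in src_pref.items() if tgt_pref[t] == s}


def initial_candidates(src: List[str], tgt: List[str],
                       metrics: List[Metric]) -> Set[Tuple[str, str]]:
    """Run the selection once per metric and return the union."""
    result: Set[Tuple[str, str]] = set()
    for metric in metrics:
        result |= mutual_best(src, tgt, metric)
    return result
```
        </preformat>
        <p>In this sketch a pair missed by one metric can still enter the candidate set through another, which is the effect of taking the union; pairs with low similarity are left to the threshold filter of step 4.</p>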
        <p>3. The values of the similarity metrics (Resnik, Jiang-Conrath, Lin, Jaccard,
Jaro-Winkler and n-Gram) vary from 0 to 1 (1 is the maximum value). When
a correspondence in the set of candidate correspondences has the maximum value
for all six metrics, it is added to the final alignment and removed from
the set of candidate correspondences. There are exceptions to this rule: some
correspondences that fall into certain structural patterns are not put into the final
alignment and are not removed from the set of candidate correspondences.</p>
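        <p>The rule of step 3 can be sketched as follows. This is an illustrative sketch only: the Candidate type and the is_exception predicate are hypothetical names, and the structural patterns that trigger an exception are not specified here, so the predicate is left as a parameter.</p>
        <preformat>
```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    source: str
    target: str
    scores: Tuple[float, ...]  # one value in [0, 1] per metric


def auto_classify(
    candidates: List[Candidate],
    is_exception: Callable[[Candidate], bool] = lambda c: False,
) -> Tuple[List[Candidate], List[Candidate]]:
    """Split candidates into (auto-accepted, still to be shown to the expert)."""
    accepted: List[Candidate] = []
    remaining: List[Candidate] = []
    for c in candidates:
        # Accept only when every metric is at its maximum and no
        # structural exception applies (hypothetical predicate).
        if all(s == 1.0 for s in c.scores) and not is_exception(c):
            accepted.append(c)
        else:
            remaining.append(c)
    return accepted, remaining
```
        </preformat>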
        <p>
          4. Correspondences for which at least one linguistic metric is below
a given threshold are removed from the set of candidate correspondences.
These correspondences are put into a backup set and can return to the set of
candidate correspondences through structural analysis. This technique is described
in more detail in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], with the difference that in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], instead of applying a
threshold, the correspondences whose classes were not in the
same Wordnet synset were removed.
        </p>
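        <p>A minimal sketch of the filter in step 4 is given below. The function name, the data layout and the threshold value used in any call are assumptions; the text does not state the threshold ALIN uses, nor exactly which metrics are compared against it.</p>
        <preformat>
```python
from typing import Dict, Tuple

Pair = Tuple[str, str]
Scores = Tuple[float, ...]


def split_by_threshold(
    candidates: Dict[Pair, Scores], threshold: float
) -> Tuple[Dict[Pair, Scores], Dict[Pair, Scores]]:
    """Return (kept, backup).

    A pair stays a candidate only if all of its scores reach the
    threshold; otherwise it moves to the backup set, from which it
    may be restored later by structural analysis.
    """
    kept: Dict[Pair, Scores] = {}
    backup: Dict[Pair, Scores] = {}
    for pair, scores in candidates.items():
        target = kept if min(scores) >= threshold else backup
        target[pair] = scores
    return kept, backup
```
        </preformat>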
        <p>5-11. At this point the interactions with the expert begin. The
correspondences in the set of candidate correspondences are sorted by the sum of their
similarity metric values, with the greatest sum first, and are shown to the
expert in that order. At first, the set of candidate correspondences contains only
class correspondences. When the expert answers a question, the set of candidate
correspondences is modified. Depending on the expert's answer, correspondences
(besides the one answered by the expert) can be removed from, and new
correspondences can be inserted into, the set of candidate correspondences. If the expert
does not accept the correspondence, it is removed from the set of candidate
correspondences. If the expert accepts the correspondence, it is removed from
the set of candidate correspondences and put into the final alignment.</p>
        <p>
          At each interaction with the expert:
- We remove from the set of candidate correspondences, and disregard, all the
correspondences that are in a correspondence anti-pattern [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] with the
correspondences accepted by the expert;
        </p>
        <p>- We insert into the set of candidate correspondences the data property and
object property correspondences related to the class correspondences accepted
by the expert;</p>
        <p>3 HESML. Available at https://www.researchgate.net/publication/313881253 HESML A
scalable ontologybased semantic similarity measures library with a set of reproducible
experiments and a replication dataset (last accessed on Oct 10, 2017).</p>
        <p>4 Stanford CoreNLP. Available at http://stanfordnlp.github.io/CoreNLP/ (last accessed on Oct 10, 2017).</p>
        <p>5 WS4J. Available at https://github.com/Sciss/ws4j (last accessed on Nov 08, 2017).</p>
        <p>- We insert into the set of candidate correspondences the correspondences of
the backup set (step 4) in which both entities are subclasses of the classes of a
correspondence accepted by the expert.</p>
        <p>This step continues until the set of candidate correspondences is empty.</p>
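        <p>Steps 5 to 11 can be summarized by the sketch below. It is a simplified illustration under stated assumptions, not ALIN's code: the function and parameter names are hypothetical, the expert is modeled as a callback, the anti-pattern test and the lookup of related property correspondences are passed in as functions, and "subclass" is approximated by a direct-parent dictionary.</p>
        <preformat>
```python
from typing import Callable, Dict, Set, Tuple

Pair = Tuple[str, str]


def interactive_loop(
    candidates: Dict[Pair, float],   # pair -> sum of metric values
    backup: Dict[Pair, float],       # pairs set aside in step 4
    ask_expert: Callable[[Pair], bool],
    in_anti_pattern: Callable[[Pair, Pair], bool],
    related_properties: Callable[[Pair], Dict[Pair, float]],
    parent_of: Dict[str, str],       # entity -> direct superclass
) -> Set[Pair]:
    pending = dict(candidates)
    backup = dict(backup)
    seen: Set[Pair] = set(pending)   # never queue the same pair twice
    alignment: Set[Pair] = set()
    while pending:
        # Show the candidate with the greatest sum of metric values.
        pair = max(pending, key=lambda p: pending[p])
        del pending[pair]
        if not ask_expert(pair):
            continue                 # rejected: simply dropped
        alignment.add(pair)
        # Drop candidates in an anti-pattern with the accepted pair.
        for other in [p for p in pending if in_anti_pattern(pair, p)]:
            del pending[other]
        # Queue property correspondences related to the accepted pair.
        for prop, score in related_properties(pair).items():
            if prop not in seen:
                seen.add(prop)
                pending[prop] = score
        # Restore backup pairs whose entities are both subclasses
        # of the classes just accepted.
        restored = [b for b in backup
                    if parent_of.get(b[0]) == pair[0]
                    and parent_of.get(b[1]) == pair[1]]
        for b in restored:
            seen.add(b)
            pending[b] = backup.pop(b)
    return alignment
```
        </preformat>
        <p>Because every pair is queued at most once and each iteration removes one pair, the loop in this sketch ends when the set of candidate correspondences is empty, as described above.</p>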
        <p>Detailed information about the ALIN system can be found in the master's thesis
of Jomar da Silva<sup>6</sup>.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Link to the system and parameters file</title>
      <p>ALIN is available through Google Drive
(https://drive.google.com/open?id=1myVtcRoKKdUDHQTKNKsomna8AFbukanf)
as a package for running through the SEALS client.</p>
      <p><bold>Results</bold></p>
      <p>ALIN has been developed with a focus on interactive ontology
alignment. The approach performs better when the number of data and object
properties is proportionately large. ALIN considers the properties associated with
corresponding classes when selecting entities for user feedback, thus allowing for
increased recall. When the number of properties in the ontologies is small, the
system still generates a very precise alignment, but its recall tends to decrease.</p>
      <p>Another characteristic of ALIN is its reliance on an interactive phase. The
non-interactive phase of the system is quite simple. It is based mainly on maximum
string similarity and aims at high precision without regard for recall, so it
initially produces a low f-measure. The recall increases in the
interactive phase. Finally, ALIN is not robust to user errors. The system uses
a number of techniques that take advantage of the expert feedback to reach other
conclusions. When the expert gives a wrong answer, the error is propagated and
generates other errors, thereby decreasing the f-measure.</p>
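      <p>For reference, the quality measures discussed here follow the standard definitions: precision is the fraction of generated correspondences that are correct, recall is the fraction of reference correspondences that were found, and the f-measure is their harmonic mean. The sketch below is illustrative only; the function name and the example data in any call are hypothetical and are not OAEI results.</p>
      <preformat>
```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def evaluate(alignment: Set[Pair],
             reference: Set[Pair]) -> Tuple[float, float, float]:
    """Return (precision, recall, f_measure) against a reference alignment."""
    correct = len(alignment.intersection(reference))
    precision = correct / len(alignment) if alignment else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```
      </preformat>
      <p>An alignment that contains only two correspondences, both correct, out of four in the reference has precision 1.0 but recall 0.5, which illustrates how a precise but incomplete non-interactive phase yields a modest f-measure.</p>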
    </sec>
    <sec id="sec-5">
      <title>Comments on the participation of ALIN in non-interactive tracks</title>
      <p>As expected, the participation of ALIN in non-interactive alignment processes
showed high precision and not so high recall, as can be
seen for the Anatomy track<sup>7</sup> in Table 1, where the recall+ field refers to the
non-trivial correspondences found and a Coherent field filled with + indicates that the
generated alignment is consistent.</p>
      <p>6 Interactive Ontology Alignment: An Approach Based on the Interactive
Modification of the Set of Candidate Correspondences. Available at
http://www2.uniriotec.br/ppgi/bancode-dissertacoes-ppgi-unirio/ano-2017/interactive-ontology-alignment-an-approachbased-on-the-interactive-modification-of-the-set-of-candidate-correspondences/view
(last accessed on Nov 12, 2017).</p>
      <p>7 Results for OAEI 2017 - Anatomy track. Available at
http://oaei.ontologymatching.org/2017/results/anatomy/index.html (last accessed
on Nov 12, 2017).</p>
      <p>Regarding the Conference track<sup>8</sup>, ALIN evaluates only the properties
associated with classes already evaluated as belonging to the alignment. As a result,
the alignments of type M2 (which take into account only the properties of the ontologies)
had an f-measure of 0, as can be seen in Table 2. Because properties are
evaluated only in the interactive phase of ALIN, alignments of type M1 (only
classes) had a higher recall than those of type M3 (classes and properties), as can be
seen in Table 2: the reference alignments of type M3 contain properties
besides classes.</p>
    </sec>
    <sec id="sec-6">
      <title>Comments on the participation of ALIN in interactive tracks</title>
      <p>8 Results of Evaluation for the Conference track within OAEI 2017. Available at
http://oaei.ontologymatching.org/2017/conference/eval.html (last accessed on Nov
12, 2017).</p>
      <p>Tools compared in the result tables: ALIN, AML, LogMap and XMap.</p>
      <sec id="sec-6-2">
        <title>ALIN</title>
        <p><bold>Anatomy track.</bold> In this track ALIN showed the highest precision
among the four evaluated tools when the error rate is zero, as can be seen in Table
3. When the error rate increases, both precision and recall fall, reducing
the f-measure, as can be seen in Table 4. This is expected, as explained earlier.</p>
        <p>As the ontologies of the Anatomy track contain almost no properties, some
interactive techniques used in ALIN cannot be applied, such as the selection of
properties associated with classes that received positive feedback. This limited the
increase in recall, which in turn influenced the f-measure.</p>
        <p><bold>Conference track.</bold> In this track ALIN stood out, showing the greatest
f-measure among the four tools when the error rate is zero, as can be seen in
Table 5, but with a loss of f-measure when the error rate increases, as can be seen in
Table 6.</p>
        <p>Other results, including results with other error rates, can be seen on the
OAEI 2017<sup>9</sup> page.</p>
        <p>9 Results for OAEI 2017 - Interactive Track. Available at
http://oaei.ontologymatching.org/2017/results/interactive/index.html (last
accessed on Nov 11, 2017).</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Comparison of the participation of ALIN in OAEI 2017 with its participation in OAEI 2016</title>
      <p>
        The difference between the participation of ALIN in OAEI 2016 and its
participation in OAEI 2017 was the use of the HESML API in 2017 instead of the WS4J
API for calculating semantic similarities, which greatly increased the efficiency
of these calculations. In ALIN's participation in OAEI 2016 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], three
semantic similarity metrics were used: Wu-Palmer, Jiang-Conrath and Lin. In ALIN's
participation in OAEI 2017 the metrics Resnik, Jiang-Conrath and Lin were
used. Wu-Palmer was replaced by Resnik because the Wu-Palmer
metric in the HESML API took longer to execute than the same metric in the
WS4J API. The Resnik metric proved to be much faster than the Wu-Palmer
metric in the HESML API and, according to [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], is as good, so the Resnik metric
was chosen to take Wu-Palmer's place in the implementation of ALIN for OAEI
2017. More information about the HESML API can be found in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Table 7
shows that the ALIN run time decreased considerably with the use
of the HESML API instead of the WS4J API. In the Anatomy interactive track
of OAEI 2016, ALIN did not use the semantic metrics, only the string metrics,
because the semantic metrics took so long that the execution became infeasible.
In OAEI 2017, using the HESML API, it was possible to use the semantic
metrics, which led to an increase in the quality of the generated alignment, but
with an increase in the expert's participation. The execution time also increased
with the inclusion of the semantic metrics, as can be seen in Table 8.
      </p>
      <p>Tables 7 and 8 have the columns Year, Run Time (sec), Precision, Recall, F-measure, Total Requests and Distinct Mappings.</p>
      <p>Evaluating the results it can be seen that the system can be improved towards:</p>
      <p>Under certain conditions, the ALIN system stands out in the ontology
alignment process in interactive application scenarios, especially when the number
of data and object properties is relatively large and when the expert does not
make mistakes. Under these conditions the generated alignment has
relatively high precision and recall.
      </p>
      <p>The third author was partially funded by project PQ-UNIRIO N01/2017
("Aprendendo, adaptando e alinhando ontologias: metodologias e algoritmos.") and
CAPES/PROAP.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>H.</given-names>
            <surname>Paulheim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hertling</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Ritze</surname>
          </string-name>
          ,
          <source>Towards Evaluating Interactive Ontology Matching Tools, Lect. Notes Comput. Sci.</source>
          , vol.
          <volume>7882</volume>
          , pp.
          <fpage>31</fpage>
          -
          <lpage>45</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>R. W.</given-names>
            <surname>Irving</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Manlove</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>O'Malley</surname>
          </string-name>
          ,
          <article-title>Stable marriage with ties and bounded length preference lists</article-title>
          ,
          <source>J. Discret. Algorithms</source>
          , vol.
          <volume>7</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>213</fpage>
          -
          <lpage>219</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Shvaiko</surname>
          </string-name>
          ,
          <source>Ontology Matching</source>
          , 2nd ed. Springer-Verlag,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baião</surname>
            ,
            <given-names>F. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Revoredo</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Euzenat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (n.d.).
          <article-title>Semantic Interactive Ontology Matching : Synergistic Combination of Techniques to Improve the Set of Candidate Correspondences</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>A.</given-names>
            <surname>Guedes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Baião</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Revoredo</surname>
          </string-name>
          ,
          <article-title>Digging Ontology Correspondence Antipatterns</article-title>
          ,
          <source>Proc. 5th Int. Conf. Ontol. Semant. Web Patterns (WOP14)</source>
          , vol.
          <volume>1302</volume>
          , pp.
          <fpage>38</fpage>
          -
          <lpage>48</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>J.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Baião</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Revoredo</surname>
          </string-name>
          ,
          <article-title>ALIN Results for OAEI 2016</article-title>
          , CEUR Workshop Proc., vol.
          <volume>1766</volume>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>E. G. M.</given-names>
            <surname>Petrakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varelas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hliaoutakis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Raftopoulou</surname>
          </string-name>
          ,
          <article-title>Design and Evaluation of Semantic Similarity Measures for Concepts Stemming from the Same or Different Ontologies</article-title>
          ,
          <source>Proc. 4th Work. Multimed. Semant.</source>
          , vol.
          <volume>4</volume>
          , pp.
          <fpage>233</fpage>
          -
          <lpage>237</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lastra-Díaz</surname>
            ,
            <given-names>J. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García-Serrano</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Batet</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chirigati</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>HESML: A scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset</article-title>
          .
          <source>Information Systems</source>
          ,
          <volume>66</volume>
          ,
          <fpage>97</fpage>
          -
          <lpage>118</lpage>
          . http://doi.org/10.1016/j.is.2017.02.002
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>