<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AgreementMakerLight 2.0: Towards Efficient Large-Scale Ontology Matching</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel Faria</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Catia Pesquita</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emanuel Santos</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Isabel F. Cruz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francisco M. Couto</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ADVIS Lab, Dept. of Computer Science, University of Illinois at Chicago</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. Informática</institution>
          ,
          <addr-line>Faculdade de Ciências, Universidade de Lisboa</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>LASIGE</institution>
          ,
          <addr-line>Faculdade de Ciências, Universidade de Lisboa</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Ontology matching is a critical task to realize the Semantic Web vision, by enabling interoperability between ontologies. However, handling large ontologies efficiently is a challenge, given that ontology matching is a problem of quadratic complexity. AgreementMakerLight (AML) is a scalable automated ontology matching system developed to tackle large ontology matching problems, particularly for the life sciences domain. Its new 2.0 release introduces several novel features, including an innovative algorithm for automatic selection of background knowledge sources, and an updated repair algorithm that is both more complete and more efficient. AML is an open source system, and is available through GitHub both for developers (as an Eclipse project) and end-users (as a runnable Jar with a graphical user interface).</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Ontology matching is the task of finding correspondences (or mappings) between
semantically related concepts of two ontologies, so as to generate an alignment
that enables integration and interoperability between those ontologies [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. It is
a critical task to realize the vision of the Semantic Web, and is particularly
relevant in the life sciences, given the abundance of biomedical ontologies with
partially overlapping domains.
      </p>
      <p>At its base, ontology matching is a problem of quadratic complexity as it entails
comparing all concepts of one ontology with all concepts of the other. Early
ontology matching systems were not overly concerned with scalability, as the
matching problems they tackled were relatively small. But with the increasing
interest in matching large (biomedical) ontologies, scalability became a critical
aspect, and as a result, traditional all-versus-all ontology matching strategies
are giving way to more efficient anchor-based strategies (which have linear time
complexity).</p>
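      <p>To make the complexity argument concrete, the following minimal Python sketch contrasts an all-versus-all comparison with a hash-index lookup on exact names. It is an illustration only: the function names and the toy ontologies (dictionaries from hypothetical class identifiers to labels) are assumptions, and exact-name lookup merely stands in for the anchor-based strategies mentioned above rather than reproducing any particular system.</p>
      <preformat>
```python
from collections import defaultdict


def match_all_pairs(source, target):
    """Compare every source label with every target label: |source| * |target| comparisons."""
    comparisons = 0
    mappings = set()
    for s_id, s_label in source.items():
        for t_id, t_label in target.items():
            comparisons += 1
            if s_label.lower() == t_label.lower():
                mappings.add((s_id, t_id))
    return mappings, comparisons


def match_by_index(source, target):
    """Index target labels once, then look each source label up: |source| + |target| steps."""
    steps = 0
    index = defaultdict(set)
    for t_id, t_label in target.items():
        steps += 1
        index[t_label.lower()].add(t_id)
    mappings = set()
    for s_id, s_label in source.items():
        steps += 1
        for t_id in index.get(s_label.lower(), ()):
            mappings.add((s_id, t_id))
    return mappings, steps


# Toy ontologies with hypothetical identifiers.
source = {"s1": "Heart", "s2": "Lung", "s3": "Liver"}
target = {"t1": "heart", "t2": "Kidney", "t3": "lung", "t4": "Spleen"}

pairs_quadratic, n_quadratic = match_all_pairs(source, target)
pairs_linear, n_linear = match_by_index(source, target)
print(pairs_quadratic == pairs_linear, n_quadratic, n_linear)
```
      </preformat>
      <p>On these toy inputs both functions return the same two mappings, using 12 comparisons and 7 steps respectively; the gap widens with the product of the ontology sizes.</p>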
    </sec>
    <sec id="sec-16">
      <p>[Figure 1: the AML ontology matching framework. Diagram labels: Input Ontologies (OWL), Ontology Loading, Ontology Objects, Primary Matching, Core Alignment, Secondary Matching, Selection &amp; Repair, Refined Alignment, Output Alignment (RDF).]</p>
      <sec id="sec-16-1">
        <title>The AgreementMakerLight System</title>
        <p>
          AgreementMakerLight (AML) is a scalable automated ontology matching system
developed to tackle large ontology matching problems, and focused in
particular on the biomedical domain. It is derived from AgreementMaker, one of the
leading first-generation ontology matching systems [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], and adds scalability and
efficiency to the design principles of flexibility and extensibility which
characterized its namesake.
        </p>
        <sec id="sec-16-1-1">
          <title>Ontology Matching Framework</title>
          <p>The AML ontology matching framework is represented in Figure 1. It is divided
into four main modules: ontology loading, primary matching, secondary
matching, and alignment selection and repair.</p>
          <p>
            The ontology loading module is responsible for reading ontologies and parsing
their information into the AML ontology data structures, which were conceived
to enable anchor-based matching [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ]. AML 2.0 marks the switch from the Jena
ontology API to the more efficient and flexible OWL API, and includes several
upgrades to the ontology data structures. The most important data structure
AML uses for matching is the Lexicon, a table of class names and synonyms in
an ontology, which uses a ranking system to weight them and score their matches
[
            <xref ref-type="bibr" rid="ref7">7</xref>
            ].
          </p>
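          <p>As a rough illustration of the idea behind the Lexicon, the sketch below stores names and synonyms in a table that maps each normalized name to the classes that use it, together with a weight. The weights per name type (<monospace>TYPE_WEIGHTS</monospace>), the class <monospace>Lexicon</monospace>, its methods, the class identifiers, and the scoring rule (the product of the two weights) are all assumptions made for illustration; the actual ranking system is described in [<xref ref-type="bibr" rid="ref7">7</xref>].</p>
          <preformat>
```python
from collections import defaultdict

# Hypothetical weights per name type; the real ranking system may differ.
TYPE_WEIGHTS = {"label": 1.0, "exact_synonym": 0.9, "related_synonym": 0.8}


class Lexicon:
    """Table of names and synonyms for one ontology (illustrative only)."""

    def __init__(self):
        # normalized name -> {class identifier: best weight seen for that name}
        self.names = defaultdict(dict)

    def add(self, class_id, name, name_type):
        """Register a name for a class, keeping the highest weight per (name, class)."""
        key = " ".join(name.lower().split())
        weight = TYPE_WEIGHTS[name_type]
        current = self.names[key].get(class_id, 0.0)
        self.names[key][class_id] = max(current, weight)

    def match(self, other):
        """Score classes that share a name; one dictionary lookup per name."""
        scores = {}
        for name, classes in self.names.items():
            for other_id, other_weight in other.names.get(name, {}).items():
                for class_id, weight in classes.items():
                    pair = (class_id, other_id)
                    scores[pair] = max(scores.get(pair, 0.0), weight * other_weight)
        return scores


# Toy lexicons with hypothetical class identifiers.
mouse = Lexicon()
mouse.add("MA:1", "heart", "label")
mouse.add("MA:2", "kidney", "label")
mouse.add("MA:2", "Renal organ", "related_synonym")

human = Lexicon()
human.add("NCI:9", "Heart", "label")
human.add("NCI:7", "renal  organ", "exact_synonym")

alignment = mouse.match(human)
print(alignment)
```
          </preformat>
          <p>Because matching reduces to looking up each name of one table in the other, the cost grows with the number of names rather than with the number of class pairs.</p>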
          <p>
            The primary and secondary matching modules contain AML's ontology
matching algorithms, or matchers, with the difference between them being their time
complexity. Primary matchers have O(n) time complexity and therefore can be
employed globally in all matching problems, whereas secondary matchers have
O(n<sup>2</sup>) time complexity and thus can only be applied locally in large problems.
The use of background knowledge in primary matchers is a key feature in AML,
and it includes an innovative automated background knowledge selection
algorithm [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ].
          </p>
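          <p>The division of labour between the two kinds of matcher can be sketched as follows. A quadratic string-similarity matcher is applied only to the neighbors of classes already mapped by a linear-time primary matcher, so the number of pairwise comparisons stays small. The function names, the neighbor dictionaries, the threshold, and the use of <monospace>difflib.SequenceMatcher</monospace> as the similarity measure are assumptions for illustration and are not taken from AML.</p>
          <preformat>
```python
from difflib import SequenceMatcher


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for any costly pairwise measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def extend_locally(anchors, src_neighbors, tgt_neighbors, src_labels, tgt_labels, threshold=0.8):
    """Compare only classes adjacent to an anchor pair instead of all pairs."""
    mapped_src = {s for s, _ in anchors}
    mapped_tgt = {t for _, t in anchors}
    new_mappings = {}
    for s_anchor, t_anchor in anchors:
        for s in src_neighbors.get(s_anchor, ()):
            if s in mapped_src:
                continue
            for t in tgt_neighbors.get(t_anchor, ()):
                if t in mapped_tgt:
                    continue
                score = similarity(src_labels[s], tgt_labels[t])
                if score >= threshold:
                    new_mappings[(s, t)] = max(score, new_mappings.get((s, t), 0.0))
    return new_mappings


# Toy data with hypothetical identifiers; ("s1", "t1") was found by a primary matcher.
src_labels = {"s1": "heart", "s2": "heart valve", "s3": "aorta"}
tgt_labels = {"t1": "heart", "t2": "heart valves", "t3": "pulmonary vein"}
src_neighbors = {"s1": ["s2", "s3"]}
tgt_neighbors = {"t1": ["t2", "t3"]}

extra = extend_locally({("s1", "t1")}, src_neighbors, tgt_neighbors, src_labels, tgt_labels)
print(extra)
```
          </preformat>
          <p>Here only the four neighbor pairs around the anchor are compared, and only the pair of heart-valve classes passes the threshold.</p>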
          <p>
            The alignment selection and repair module ensures that the final alignment has
the desired cardinality and that it is coherent (i.e., does not lead to the
violation of restrictions of the ontologies), which is important for several applications.
AML's approximate alignment repair algorithm features a modularization step
which identifies the minimal set of classes that need to be analyzed for coherence,
thus greatly reducing the scale of the repair problem [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ].
          </p>
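          <p>A simplified view of this last stage is sketched below. The first function performs a greedy selection that enforces one-to-one cardinality; the second removes, from each set of mutually incompatible mappings, the mapping with the lowest similarity. Both are assumptions made for illustration: the conflict sets are taken as given input (finding them efficiently is what the modularization step is for), and neither function reproduces the repair algorithm of [<xref ref-type="bibr" rid="ref8">8</xref>].</p>
          <preformat>
```python
def select_one_to_one(alignment):
    """Greedy selection: keep the best-scoring mappings so no class is mapped twice."""
    used_src, used_tgt, selected = set(), set(), {}
    ranked = sorted(alignment.items(), key=lambda item: (-item[1], item[0]))
    for (src, tgt), score in ranked:
        if src not in used_src and tgt not in used_tgt:
            selected[(src, tgt)] = score
            used_src.add(src)
            used_tgt.add(tgt)
    return selected


def repair(alignment, conflict_sets):
    """Approximate repair: break every conflict set by dropping its weakest mapping."""
    repaired = dict(alignment)
    for conflict in conflict_sets:
        present = [m for m in conflict if m in repaired]
        if len(present) == len(conflict):  # the conflict is still fully present
            weakest = min(present, key=lambda m: (repaired[m], m))
            del repaired[weakest]
    return repaired


# Toy alignment with hypothetical identifiers and similarity scores.
candidates = {
    ("s1", "t1"): 0.95,
    ("s1", "t2"): 0.80,  # competes with ("s1", "t1") for class s1
    ("s2", "t2"): 0.90,
    ("s3", "t3"): 0.70,
}
# Hypothetical conflict: these two mappings together would violate a restriction.
conflicts = [{("s2", "t2"), ("s3", "t3")}]

selected = select_one_to_one(candidates)
refined = repair(selected, conflicts)
print(sorted(refined))
```
          </preformat>
          <p>Selection discards the weaker of the two mappings for class s1, and repair then drops the lower-scoring member of the conflict set, leaving two mappings in the refined alignment.</p>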
        </sec>
        <sec id="sec-16-1-2">
          <title>User Interface</title>
          <p>
            The GUI was a recent addition to AML, as we sought to make our system
available to a wider range of users. The main challenge in designing the GUI was
finding a way to visualize an alignment between ontologies that was both scalable
and useful for the user. Our solution was to visualize only the neighborhood of
one mapping at a time, while providing several options for navigating through
the alignment [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ]. The result is a simple and easy-to-use GUI, which is shown in
Figure 2.
          </p>
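          <p>The idea of displaying only the neighborhood of a mapping can be sketched as a bounded breadth-first traversal from each of the two mapped classes. The adjacency dictionaries, the <monospace>max_distance</monospace> parameter, and the function names are hypothetical; the visualization approach itself is described in [<xref ref-type="bibr" rid="ref6">6</xref>].</p>
          <preformat>
```python
from collections import deque


def neighborhood(start, adjacency, max_distance=1):
    """Return the classes within max_distance edges of start (start included)."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_distance:
            continue  # do not expand beyond the requested radius
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return set(distance)


def mapping_view(mapping, src_adjacency, tgt_adjacency, max_distance=1):
    """Collect what would be drawn for one mapping: both local subgraphs."""
    src, tgt = mapping
    return {
        "mapping": mapping,
        "source_classes": neighborhood(src, src_adjacency, max_distance),
        "target_classes": neighborhood(tgt, tgt_adjacency, max_distance),
    }


# Toy ontologies as undirected adjacency lists (parent and child links merged).
src_adjacency = {
    "organ": ["heart", "lung"],
    "heart": ["organ", "heart valve"],
    "lung": ["organ"],
    "heart valve": ["heart"],
}
tgt_adjacency = {"Organ": ["Heart"], "Heart": ["Organ", "Atrium"], "Atrium": ["Heart"]}

view = mapping_view(("heart", "Heart"), src_adjacency, tgt_adjacency)
print(view["source_classes"], view["target_classes"])
```
          </preformat>
          <p>The amount of data to draw depends on the chosen radius and the local branching of the ontologies, not on the total size of the alignment.</p>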
        </sec>
      </sec>
      <sec id="sec-16-2">
        <title>Performance</title>
        <p>
          AML 1.0 achieved top results in the 2013 edition of the Ontology Alignment
Evaluation Initiative (OAEI) [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Namely, it ranked first in F-measure in the
anatomy track, and second in the large biomedical ontologies, conference and
interactive matching tracks. In addition to its effectiveness in matching life
sciences ontologies, AML was amongst the fastest systems in all tracks and, more
importantly, consistently had a high F-measure/run time ratio.
        </p>
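        <p>For reference, the usual F-measure is the harmonic mean of precision and recall computed against a reference alignment. The sketch below computes it for a toy alignment, together with an F-measure/run time ratio; the function names, the example mappings, and the run time value are hypothetical and are not OAEI results.</p>
        <preformat>
```python
def evaluate(alignment, reference):
    """Precision, recall and F-measure of a set of mappings against a reference set."""
    correct = len(alignment.intersection(reference))
    precision = correct / len(alignment) if alignment else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def f_measure_per_second(f_measure, run_time_seconds):
    """Relate alignment quality to its cost; higher is better."""
    return f_measure / run_time_seconds


# Hypothetical mappings, not taken from any OAEI track.
found = {("s1", "t1"), ("s2", "t2"), ("s3", "t9")}
reference = {("s1", "t1"), ("s2", "t2"), ("s4", "t4"), ("s5", "t5")}

precision, recall, f1 = evaluate(found, reference)
print(round(precision, 3), round(recall, 3), round(f1, 3))
print(round(f_measure_per_second(f1, 20.0), 4))
```
        </preformat>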
        <p>AML 2.0 is more effective than its predecessor (thanks to the improved handling
of background knowledge, the richer data structures and the addition of new
matching algorithms) without sacrificing efficiency, so we expect it to perform
even better in this year's edition of the OAEI.</p>
      </sec>
      <sec id="sec-16-3">
        <title>Acknowledgments</title>
        <p>DF, CP, ES and FMC were funded by the Portuguese FCT through the SOMER
project (PTDC/EIA-EIA/119119/2010) and the LASIGE Strategic Project
(PEst-OE/EEI/UI0408/2014). The research of IFC was partially supported by
NSF Awards CCF-1331800, IIS-1213013, IIS-1143926, and IIS-0812258 and by
a UIC-IPCE Civic Engagement Research Fund Award.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>I. F.</given-names>
            <surname>Cruz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Palandri Antonelli</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Stroe</surname>
          </string-name>
          .
          <article-title>AgreementMaker: Efficient Matching for Large Real-World Schemas and Ontologies</article-title>
          . PVLDB,
          <volume>2</volume>
          (
          <issue>2</issue>
          ):
          <fpage>1586</fpage>
          –
          <lpage>1589</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Shvaiko</surname>
          </string-name>
          . Ontology Matching. Springer-Verlag New York Inc,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. F.</given-names>
            <surname>Cruz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Couto</surname>
          </string-name>
          .
          <article-title>AgreementMakerLight Results for OAEI 2013</article-title>
          .
          <source>In ISWC International Workshop on Ontology Matching (OM)</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. F.</given-names>
            <surname>Cruz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Couto</surname>
          </string-name>
          .
          <article-title>Automatic Background Knowledge Selection for Matching Biomedical Ontologies</article-title>
          .
          <source>PLoS One</source>
          , In Press,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Palmonari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. F.</given-names>
            <surname>Cruz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Couto</surname>
          </string-name>
          .
          <article-title>The AgreementMakerLight Ontology Matching System</article-title>
          .
          <source>In OTM Conferences</source>
          , volume
          <volume>8185</volume>
          of LNCS, pages
          <fpage>527</fpage>
          –
          <lpage>541</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Santos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Couto</surname>
          </string-name>
          .
          <article-title>Towards visualizing the alignment of large biomedical ontologies</article-title>
          .
          <source>In 10th International Conference on Data Integration in the Life Sciences</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Stroe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. F.</given-names>
            <surname>Cruz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Couto</surname>
          </string-name>
          .
          <article-title>What's in a 'nym'? Synonyms in Biomedical Ontology Matching</article-title>
          .
          <source>In The Semantic Web - ISWC 2013</source>
          , volume
          <volume>8218</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>526</fpage>
          –
          <lpage>541</lpage>
          . Springer Berlin Heidelberg,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>E.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Couto</surname>
          </string-name>
          .
          <article-title>Ontology alignment repair through modularization and confidence-based heuristics</article-title>
          .
          <source>CoRR</source>
          , arXiv:1307.5322,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>