<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AgreementMakerDeep Results for OAEI 2021</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zhu Wang</string-name>
          <email>zwang260@uic.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Isabel F. Cruz</string-name>
          <email>isabelcfcruz@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ADVIS Lab, Department of Computer Science, University of Illinois at Chicago</institution>
          ,
          <addr-line>Chicago IL 60607</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>AgreementMakerDeep (AMD) is a new flexible and extensible ontology matching system based on knowledge graph embedding techniques. AMD learns from classes and the relations between them by constructing vector representations in a low-dimensional embedding space with knowledge graph embedding methods. The results demonstrate that AMD achieves competitive performance in a few OAEI tracks, but has limitations for property and instance matching.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Presentation of the system</title>
      <p>AgreementMakerDeep (AMD) is a new ontology matching system inspired by AgreementMaker [<xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>], AgreementMakerLight (AML) [<xref ref-type="bibr" rid="ref7">7</xref>] and BootEA [<xref ref-type="bibr" rid="ref19">19</xref>]. This year is the first time that AMD participates in OAEI. It is designed with the main goal of higher efficiency for ontology matching problems by applying knowledge graph embedding methods.</p>
    </sec>
    <sec id="sec-2">
      <title>State, purpose, general statement</title>
      <p>
        Ontology matching aims to establish semantic correspondences or relationships
between concepts or properties of different ontologies [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. There is a wide range
of algorithms developed for ontology matching, such as those that use lexical
similarity with linguistic techniques [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], partition large ontology sets based on
structural proximity [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], or detect graph similarity [
        <xref ref-type="bibr" rid="ref14 ref5">5, 14</xref>
        ]. However, such
strategies may be time consuming [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], may use a sparse and high-dimensional training
space [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], and may vary with the domains [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        AMD mainly utilizes string-based techniques [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and lexical matching
algorithms [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], but also adopts representation learning models [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] to capture the
relations as structural information with a translation vector between two classes.
      </p>
      <p>Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>This paper is dedicated to the memory of Isabel F. Cruz.</p>
      <p>The architecture of AMD, shown in Figure 1, comprises ontology parsing, string
and lexical matching, knowledge graph embedding, model learning and candidate
selection.</p>
      <p>
        Ontology parsing. We use owlready2 [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] to extract metadata of
classes from the source and target ontologies, such as super- and subclasses, labels,
annotations, partOf and disjointWith. BeautifulSoup [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] is used to extract
synonyms.
      </p>
      <p>
        String and lexical matching. We apply several text pre-processing
techniques, such as stop-word removal and tokenization, to class labels and annotations.
AMD uses the Base Similarity Matcher (BSM) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and lexical matching
algorithms to obtain a baseline class alignment.
      </p>
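      <p>As an illustration of the pre-processing step, the following minimal sketch tokenizes class labels, removes stop words, and compares the resulting token sets. The stop-word list and the token-overlap score are assumptions made for this example only; they are a simple stand-in and not the Base Similarity Matcher or the lexical matching algorithms used in AMD.</p>
      <preformat>
```python
import re

# Hypothetical stop-word list; the text does not say which list AMD uses.
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "to"}


def preprocess(label):
    """Lower-case a class label, split it into tokens, and drop stop words."""
    # Insert a space at camelCase boundaries, then keep alphanumeric runs only.
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", label)
    tokens = re.findall(r"[a-z0-9]+", spaced.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def token_overlap(label_a, label_b):
    """Jaccard overlap of the two token sets (illustrative stand-in matcher)."""
    a, b = set(preprocess(label_a)), set(preprocess(label_b))
    if not a or not b:
        return 0.0
    return len(a.intersection(b)) / len(a.union(b))


print(preprocess("Wall of the Heart"))
print(token_overlap("heartWall", "Wall of the Heart"))
```
      </preformat>
      <p>Pairs whose score exceeds a chosen cut-off could then serve as a baseline class alignment in the sense described above.</p>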
    </sec>
    <sec id="sec-3">
      <title>Knowledge graph embedding and model learning</title>
      <p>
        We characterize the structural information of ontologies by relations translated from one class
to another class, using a modified TransR [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] model that maps them into relational embedding
spaces.
      </p>
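      <p>Problem formulation. Given two ontologies O and O', we construct knowledge
graphs X and Y, and define the correspondence between two concepts as the
triplet <inline-formula><tex-math>T_{c,c'} = \langle c, r, c' \rangle</tex-math></inline-formula>, where r is the relation between c and c'. The problem
is to find the mapping set <inline-formula><tex-math>M = \{(c_x, c_y) \in X \times Y \mid c_x \equiv c_y\}</tex-math></inline-formula>. In this study, we focus on
one-to-one alignment, and the relation between concepts is equality.</p>
      <p>Let <inline-formula><tex-math>\vec{v}(c_x) = \{v_1, v_2, \ldots, v_m\}</tex-math></inline-formula> and <inline-formula><tex-math>\vec{v}(c_y) = \{v'_1, v'_2, \ldots, v'_n\}</tex-math></inline-formula> be two sets of d-dimensional
vectors of size m and n. We compute their distance with simple cosine similarity
by <inline-formula><tex-math>d(\vec{v}(c_x), \vec{v}(c_y)) = 1 - \mathrm{sim}(\vec{v}(c_x), \vec{v}(c_y))</tex-math></inline-formula>, where
        <disp-formula>
          <label>(1)</label>
          <tex-math>\mathrm{sim}(\vec{v}(c_x), \vec{v}(c_y)) = \sum_{i=1} \arg\max_{j} \cos(\vec{v}(c_x), \vec{v}(c_y))</tex-math>
        </disp-formula>
      </p>
      <p>We define the probability of the aligned labels between concepts <inline-formula><tex-math>c_x</tex-math></inline-formula> and <inline-formula><tex-math>c_y</tex-math></inline-formula> by
<inline-formula><tex-math>p(c_y \mid c_x)</tex-math></inline-formula> as follows:
        <disp-formula>
          <label>(2)</label>
          <tex-math>p(c_y \mid c_x) = \sigma\bigl(\mathrm{sim}(\vec{v}(c_x), \vec{v}(c_y))\bigr)</tex-math>
        </disp-formula>
where <inline-formula><tex-math>\sigma</tex-math></inline-formula> is the sigmoid function.</p>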
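      <p>The similarity and probability defined above can be sketched in a few lines of Python. This is one possible reading, not the authors' implementation: it assumes that each vector of the first set is matched to its best cosine partner in the second set and that these best scores are summed; the function names are hypothetical.</p>
      <preformat>
```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def set_similarity(vx, vy):
    """For each vector in vx, take its best cosine match in vy, then sum.

    Assumption: this per-vector best-match reading of the similarity.
    """
    return sum(max(cosine(u, v) for v in vy) for u in vx)


def alignment_probability(vx, vy):
    """Sigmoid of the set similarity, read as p(c_y | c_x)."""
    return 1.0 / (1.0 + math.exp(-set_similarity(vx, vy)))


vx = [[1.0, 0.0], [0.0, 2.0]]
vy = [[3.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(set_similarity(vx, vy))
print(alignment_probability(vx, vy))
```
      </preformat>
      <p>Because the sum is not normalized in this reading, the score grows with the number of vectors in the first set; dividing by that number would be a natural variant, but the text does not state it.</p>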
      <p>
        Knowledge graph embedding. In AMD, we apply a modified TransR method,
which translates concepts and relations into a concept space and relation-specific
concept spaces, since there are multiple relations in the ontologies, e.g., subClassOf
and disjointWith. In the original TransR, the projected vectors are defined as <inline-formula><tex-math>c_r = c M_r</tex-math></inline-formula> and
<inline-formula><tex-math>c'_r = c' M_r</tex-math></inline-formula>, and the score function as <inline-formula><tex-math>f_r(c, c') = \| c_r + r - c'_r \|_2^2</tex-math></inline-formula> [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Inspired
by Sun et al. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], who require the absolute scores of positive triples to be lower than those of
negative ones, we modify the loss function by using two hyper-parameters
as follows:
      </p>
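      <p>
        <disp-formula>
          <label>(3)</label>
          <tex-math>L = \sum_{(c_x, r, c'_x) \in S_1} \sum_{(c_y, r, c'_y) \in S_2} \max\bigl(0, (f_r(c_x, c'_x) - \gamma_1) - \mu (f(c_y, c'_y) + \gamma_2)\bigr)</tex-math>
        </disp-formula>
where <inline-formula><tex-math>\gamma_1, \gamma_2, \mu &gt; 0</tex-math></inline-formula> and <inline-formula><tex-math>\gamma_2 &gt; \gamma_1</tex-math></inline-formula>, <inline-formula><tex-math>S_1</tex-math></inline-formula> is the set of positive triples, and <inline-formula><tex-math>S_2</tex-math></inline-formula> is the set of
negative triples. We set different values to ensure absolutely low margin
loss scores for the positive triples, which reduces the drift of the embedding while
keeping the function of the margin-based ranking loss.</p>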
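      <p>To make the scoring concrete, the sketch below computes the original TransR score for a single triple and evaluates a limit-based loss in the style of Sun et al. [<xref ref-type="bibr" rid="ref19">19</xref>]. The loss shown is the generic limit-based form that pushes positive scores below one limit and negative scores above another; it is an illustrative assumption and is not claimed to be the exact loss used in AMD. The names gamma1, gamma2 and mu, and the default value of mu, are hypothetical.</p>
      <preformat>
```python
def project(c, m_r):
    """Row vector c (length d) times matrix m_r (d x k): the projection c M_r."""
    return [sum(c[i] * m_r[i][j] for i in range(len(c))) for j in range(len(m_r[0]))]


def score(c, r, c_prime, m_r):
    """TransR score: squared L2 norm of (c M_r + r - c' M_r)."""
    head = project(c, m_r)
    tail = project(c_prime, m_r)
    return sum((h + rel - t) ** 2 for h, rel, t in zip(head, r, tail))


def limit_loss(pos_scores, neg_scores, gamma1=0.01, gamma2=0.5, mu=1.0):
    """Illustrative limit-based loss: penalize positive scores above gamma1
    and negative scores below gamma2 (with gamma2 greater than gamma1)."""
    pos_term = sum(max(0.0, f - gamma1) for f in pos_scores)
    neg_term = sum(max(0.0, gamma2 - f) for f in neg_scores)
    return pos_term + mu * neg_term


identity = [[1.0, 0.0], [0.0, 1.0]]
relation = [0.0, 1.0]
good = score([1.0, 0.0], relation, [1.0, 1.0], identity)  # translation fits exactly
bad = score([1.0, 0.0], relation, [0.0, 0.0], identity)   # translation misses
print(good, bad)
print(limit_loss([good], [bad]))
```
      </preformat>
      <p>In this toy case the positive triple already scores below the lower limit and the negative triple above the upper limit, so the loss is zero and no update would be needed.</p>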
      <p>
        During the process that computes vectors, we need to generate negative
triples. Following the work of Sun et al. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] and Li et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], we refine
uniform negative sampling by choosing candidates from the k-nearest neighbors in the
embedding space, and constrain the selection to exclude concepts that are related by
subClassOf or disjointWith. In this way, we can avoid vector
sparsity and obtain higher-quality vector representations for the concepts.
      </p>
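      <p>The constrained nearest-neighbor sampling can be sketched as follows. Several details are assumptions made for the example: Euclidean distance is used to find neighbors, only the tail of a positive triple is corrupted, and the set named related holds the concept pairs linked by an excluded relation. None of these choices is specified in the text.</p>
      <preformat>
```python
import math
import random


def nearest_neighbors(target, embeddings, k):
    """Return the k concepts closest to target by Euclidean distance."""
    others = [name for name in embeddings if name != target]
    others.sort(key=lambda name: math.dist(embeddings[target], embeddings[name]))
    return others[:k]


def sample_negative(triple, embeddings, related, k, rng):
    """Replace the tail of a positive triple with one of its near neighbors,
    skipping the head itself and any concept related to the head."""
    head, relation, tail = triple
    candidates = [
        name
        for name in nearest_neighbors(tail, embeddings, k)
        if name != head and frozenset((head, name)) not in related
    ]
    if not candidates:
        return None
    return (head, relation, rng.choice(candidates))


# Toy two-dimensional embeddings (hypothetical values).
embeddings = {
    "heart": [0.0, 0.0],
    "cardiac_muscle": [0.1, 0.0],
    "organ": [0.2, 0.1],
    "kidney": [0.3, 0.2],
    "lung": [1.0, 1.0],
}
related = {frozenset(("cardiac_muscle", "organ"))}
positive = ("cardiac_muscle", "partOf", "heart")
negative = sample_negative(positive, embeddings, related, k=3, rng=random.Random(0))
print(negative)
```
      </preformat>
      <p>Sampling from close neighbors yields harder negatives than uniform sampling, while the exclusion list keeps semantically linked concepts from being treated as negatives.</p>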
      <p>Candidate selection. We select candidates by applying a threshold to the
similarity of the class knowledge graph embedding vectors, and then compare this
similarity with the baseline similarity for pairs that are also in the baseline result set.</p>
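      <p>A minimal sketch of threshold-based candidate selection is given below. How the embedding similarity is compared with the baseline is not detailed in the text, so the sketch assumes that the larger of the two scores is kept for pairs found by both; the greedy one-to-one filter is likewise an assumption, added because the study focuses on one-to-one alignment. All names and values are hypothetical.</p>
      <preformat>
```python
def select_candidates(embedding_sim, baseline_sim, threshold):
    """Keep pairs whose embedding similarity reaches the threshold; for pairs
    also present in the baseline, keep the larger of the two scores."""
    selected = {}
    for pair, sim in embedding_sim.items():
        if sim >= threshold:
            selected[pair] = max(sim, baseline_sim.get(pair, sim))
    return selected


def one_to_one(scored_pairs):
    """Greedily keep the best-scoring pair for each source and target class."""
    used_src, used_tgt, alignment = set(), set(), []
    ranked = sorted(scored_pairs.items(), key=lambda item: item[1], reverse=True)
    for (src, tgt), sim in ranked:
        if src not in used_src and tgt not in used_tgt:
            alignment.append((src, tgt, sim))
            used_src.add(src)
            used_tgt.add(tgt)
    return alignment


embedding_sim = {("a1", "b1"): 0.9, ("a1", "b2"): 0.8, ("a2", "b2"): 0.7, ("a3", "b3"): 0.4}
baseline_sim = {("a2", "b2"): 0.95}
selected = select_candidates(embedding_sim, baseline_sim, threshold=0.6)
print(one_to_one(selected))
```
      </preformat>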
    </sec>
    <sec id="sec-4">
      <title>Parameter settings</title>
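      <p>In AMD, we use stochastic gradient descent as the optimizer and configure the
hyperparameters as follows. The vector dimension is set to 200. The learning rate
is chosen from {0.01, 0.02, 0.001} and the mini-batch size from {5, 10}, with <inline-formula><tex-math>\gamma_1 \in \{0.01, 0.05, 0.1\}</tex-math></inline-formula> and <inline-formula><tex-math>\gamma_2 \in \{0.5, 1.0\}</tex-math></inline-formula>.
The number of nearest neighbors for negative sampling is chosen from {5, 10, 20}.</p>
      <p>From the local evaluation results on the Anatomy track, the best parameter
set is as follows: the learning rate is 0.01, the mini-batch size is 10, <inline-formula><tex-math>\gamma_1</tex-math></inline-formula> is 0.01, <inline-formula><tex-math>\gamma_2</tex-math></inline-formula> is 0.5,
and 10 nearest neighbors are used for negative sampling.</p>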
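      <p>The search space described above can be written down as a small grid, as in the following sketch. The key names are illustrative, and gamma1 and gamma2 are assumed names for the two loss hyper-parameters; the values are those listed in the text.</p>
      <preformat>
```python
from itertools import product

# Search space as listed in the text; key names are hypothetical.
GRID = {
    "learning_rate": [0.01, 0.02, 0.001],
    "batch_size": [5, 10],
    "gamma1": [0.01, 0.05, 0.1],
    "gamma2": [0.5, 1.0],
    "num_neighbors": [5, 10, 20],
}
FIXED = {"dimension": 200, "optimizer": "sgd"}

# Best setting reported for the local Anatomy evaluation.
BEST = {
    "dimension": 200,
    "optimizer": "sgd",
    "learning_rate": 0.01,
    "batch_size": 10,
    "gamma1": 0.01,
    "gamma2": 0.5,
    "num_neighbors": 10,
}


def configurations(grid, fixed):
    """Yield one dictionary per combination of the grid values."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        config = dict(fixed)
        config.update(zip(keys, values))
        yield config


configs = list(configurations(GRID, FIXED))
print(len(configs))
print(BEST in configs)
```
      </preformat>
      <p>The grid contains 3 x 2 x 3 x 2 x 3 = 108 combinations, of which the reported best setting is one.</p>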
    </sec>
    <sec id="sec-5">
      <title>Adaptations made for the evaluation</title>
      <p>Our framework uses Python with TensorFlow (https://www.tensorflow.org/) and RDFLib (https://github.com/RDFLib), and is packaged for
SEALS using MELT. For the OAEI submission, we use the best parameter set found in our local
evaluation (see the Parameter settings section).</p>
      <sec id="sec-5-1">
        <title>Results</title>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Anatomy</title>
      <p>The Anatomy track results of AMD are shown in Table 1. AMD returns 1167
correspondences in 3 seconds. The results show that AMD is
competitive with the top matching systems, especially in terms of runtime
and precision. AMD is the second fastest system in this track and has a slightly
higher precision (by 0.004) than AML.</p>
      <p>The Conference track results of AMD are shown in Table 2. As expected, the
performance of AMD in the Conference track is not strong, with an F-measure
only slightly higher than that of the baseline method (StringEquiv). AMD
lacks the ability to extract and match properties in the M2 and M3 evaluation
variants. However, AMD has higher precision values in most tasks.
AMD is able to complete two of the five tasks with a runtime of 37 minutes, and
AMD only returns class correspondences with a precision of 1.0.</p>
      <sec id="sec-6-1">
        <title>General comments</title>
        <p>Comments on the results. 2021 is the first time that AMD participates in OAEI, and it
shows promising results. Overall, the results show that AMD is able to complete several tasks in
different domains on class-level matching in a timely manner. It is a fast system in
most tracks. Hence, we have shown that knowledge graph embedding helps
to decrease computation time and that it leads to competitive performance
in terms of F-measure in the Anatomy and LargeBio tracks. AMD consistently
has higher precision than AML in a few tracks. However, AMD is still under
development and is currently only able to return class correspondences. Moreover, AMD
has memory issues with large-scale datasets and is not able to match properties
and instances at the current stage.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Improvements</title>
      <p>
        The ongoing development of AMD touches on several aspects. Besides considering
property and instance matching, we will use joint embedding to combine
contextualized knowledge graph embeddings such as CoKE and BERT with additional
knowledge resources such as WebIsA [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] as a lexicon database. Moreover, we
will adapt AMD to parse different data types and to select parameters for
different tracks.
      </p>
      <sec id="sec-7-1">
        <title>Conclusions</title>
        <p>In this paper, we have introduced a new ontology matching system called AMD.
We adapted a modified TransR model to fit the ontology matching problem: we
learn low-dimensional embeddings for each class and relation to capture the
hidden semantics of ontologies, rather than measuring the similarities between
classes directly, as in other traditional systems. AMD makes full use of the
textual and structural knowledge of ontologies. The results demonstrate the high
efficiency and the promising performance of our proposed matching method
compared to the results of other systems in several tracks.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Michelle</given-names>
            <surname>Cheatham</surname>
          </string-name>
          and
          <string-name>
            <given-names>Pascal</given-names>
            <surname>Hitzler</surname>
          </string-name>
          .
          <article-title>String similarity metrics for ontology alignment</article-title>
          .
          <source>In International semantic web conference</source>
          , pages
          <volume>294</volume>
          –
          <fpage>309</fpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Isabel</surname>
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Cruz</surname>
          </string-name>
          , Flavio Palandri Antonelli, and Cosmin Stroe.
          <article-title>AgreementMaker: Efficient Matching for Large Real-World Schemas and Ontologies</article-title>
          . PVLDB,
          <volume>2</volume>
          (
          <issue>2</issue>
          ):
          <volume>1586</volume>
          –
          <fpage>1589</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Isabel</surname>
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Cruz</surname>
            , Flavio Palandri Antonelli, and
            <given-names>Cosmin</given-names>
          </string-name>
          <string-name>
            <surname>Stroe</surname>
          </string-name>
          .
          <article-title>Efficient Selection of Mappings and Automatic Quality-driven Combination of Matching Methods</article-title>
          . In International Workshop on Ontology Matching, volume
          <volume>551</volume>
          <source>of CEUR Workshop Proceedings</source>
          , pages
          <volume>49</volume>
          –
          <fpage>60</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Isabel</surname>
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Cruz</surname>
            , Flavio Palandri Antonelli, Cosmin Stroe, Ulas Keles, and
            <given-names>Angela</given-names>
          </string-name>
          <string-name>
            <surname>Maduko</surname>
          </string-name>
          .
          <article-title>Using AgreementMaker to Align Ontologies for OAEI 2009: Overview, Results, and Outlook</article-title>
          . In International Workshop on Ontology Matching, volume
          <volume>551</volume>
          <source>of CEUR Workshop Proceedings</source>
          , pages
          <volume>135</volume>
          –
          <fpage>146</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Isabel</surname>
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Cruz</surname>
            and
            <given-names>William</given-names>
          </string-name>
          <string-name>
            <surname>Sunna</surname>
          </string-name>
          .
          <article-title>Structural Alignment Methods with Applications to Geospatial Ontologies</article-title>
          . Transactions in GIS,
          <source>Special Issue on Semantic Similarity Measurement and Geospatial Applications</source>
          ,
          <volume>12</volume>
          (
          <issue>6</issue>
          ):
          <volume>683</volume>
          –
          <fpage>711</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Jérôme</given-names>
            <surname>Euzenat</surname>
          </string-name>
          and
          <string-name>
            <given-names>Pavel</given-names>
            <surname>Shvaiko</surname>
          </string-name>
          .
          <source>Ontology Matching</source>
          . Springer-Verlag, Heidelberg (DE),
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Faria</surname>
          </string-name>
          , Catia Pesquita, Emanuel Santos, Matteo Palmonari,
          <string-name>
            <given-names>Isabel F.</given-names>
            <surname>Cruz</surname>
          </string-name>
          , and Francisco M. Couto.
          <article-title>The AgreementMakerLight Ontology Matching System</article-title>
          . In International Conference on Ontologies,
          <source>DataBases, and Applications of Semantics (ODBASE)</source>
          , pages
          <fpage>527</fpage>
          –
          <fpage>541</fpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Junheng</given-names>
            <surname>Hao</surname>
          </string-name>
          , Muhao Chen, Wenchao Yu,
          <string-name>
            <given-names>Yizhou</given-names>
            <surname>Sun</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Wei</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <article-title>Universal Representation Learning of Knowledge Bases by Jointly Embedding Instances and Ontological Concepts</article-title>
          .
          <source>In ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</source>
          , pages
          <volume>1709</volume>
          –
          <fpage>1719</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Prodromos</given-names>
            <surname>Kolyvakis</surname>
          </string-name>
          , Alexandros Kalousis, and
          <string-name>
            <given-names>Dimitris</given-names>
            <surname>Kiritsis</surname>
          </string-name>
          .
          <article-title>DeepAlignment: Unsupervised ontology matching with refined word vectors</article-title>
          .
          <source>In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long Papers)
          , pages
          <fpage>787</fpage>
          –
          <fpage>798</fpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Amir</surname>
            <given-names>Laadhar</given-names>
          </string-name>
          , Faiza Ghozzi, Imen Megdiche, Franck Ravat, Olivier Teste, and
          <string-name>
            <given-names>Faiez</given-names>
            <surname>Gargouri</surname>
          </string-name>
          .
          <article-title>Partitioning and Local Matching Learning of Large Biomedical Ontologies</article-title>
          .
          <source>In ACM SIGAPP Symposium on Applied Computing</source>
          , pages
          <volume>2285</volume>
          –
          <fpage>2292</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Jean-Baptiste Lamy</surname>
          </string-name>
          .
          <article-title>Owlready: Ontology-oriented programming in Python with automatic classification and high level constructs for biomedical ontologies</article-title>
          .
          <source>Artificial Intelligence in Medicine</source>
          ,
          <volume>80</volume>
          :
          <fpage>11</fpage>
          –
          <fpage>28</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Weizhuo</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xuxiang</given-names>
            <surname>Duan</surname>
          </string-name>
          , Meng Wang, XiaoPing Zhang, and Guilin Qi.
          <article-title>Multi-view embedding for biomedical ontology matching</article-title>
          .
          <source>OM@ ISWC</source>
          ,
          <volume>2536</volume>
          :
          <fpage>13</fpage>
          –
          <fpage>24</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Yankai</surname>
            <given-names>Lin</given-names>
          </string-name>
          , Zhiyuan Liu, Maosong Sun, Yang Liu, and
          <string-name>
            <given-names>Xuan</given-names>
            <surname>Zhu</surname>
          </string-name>
          .
          <article-title>Learning entity and relation embeddings for knowledge graph completion</article-title>
          .
          <source>In AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Sergey</surname>
            <given-names>Melnik</given-names>
          </string-name>
          , Hector Garcia-Molina, and
          <string-name>
            <given-names>Erhard</given-names>
            <surname>Rahm</surname>
          </string-name>
          .
          <article-title>Similarity Flooding: A Versatile Graph Matching Algorithm and Its Application to Schema Matching</article-title>
          .
          <source>In IEEE International Conference on Data Engineering (ICDE)</source>
          , pages
          <fpage>117</fpage>
          –
          <fpage>128</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Catia</surname>
            <given-names>Pesquita</given-names>
          </string-name>
          , Daniel Faria, Cosmin Stroe, Emanuel Santos, Isabel F Cruz, and Francisco M Couto.
          <article-title>What's in a 'nym'? Synonyms in biomedical ontology matching</article-title>
          .
          <source>In International Semantic Web Conference</source>
          , pages
          <volume>526</volume>
          –
          <fpage>541</fpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Leonard</given-names>
            <surname>Richardson</surname>
          </string-name>
          .
          <article-title>Beautiful soup documentation</article-title>
          .
          <source>April</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Petar</surname>
            <given-names>Ristoski</given-names>
          </string-name>
          , Jessica Rosati, Tommaso Di Noia, Renato De Leone, and Heiko Paulheim.
          <article-title>RDF2Vec: RDF Graph Embeddings and Their Applications</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>10</volume>
          (
          <issue>4</issue>
          ):
          <volume>721</volume>
          –
          <fpage>752</fpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Julian</surname>
            <given-names>Seitner</given-names>
          </string-name>
          , Christian Bizer, Kai Eckert, Stefano Faralli, Robert Meusel, Heiko Paulheim, and Simone Paolo Ponzetto.
          <article-title>A large database of hypernymy relations extracted from the web</article-title>
          .
          <source>In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)</source>
          , pages
          <fpage>360</fpage>
          –
          <fpage>367</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Zequn</surname>
            <given-names>Sun</given-names>
          </string-name>
          , Wei Hu, Qingheng Zhang, and
          <string-name>
            <given-names>Yuzhong</given-names>
            <surname>Qu</surname>
          </string-name>
          .
          <article-title>Bootstrapping Entity Alignment with Knowledge Graph Embedding</article-title>
          .
          <source>In IJCAI</source>
          , volume
          <volume>18</volume>
          , pages
          <fpage>4396</fpage>
          –
          <fpage>4402</fpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>