<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel Faria</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marta C. Silva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pedro Cotovio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lucas Ferraz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Balbi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Catia Pesquita</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>1.1. State</institution>
          ,
          <addr-line>Purpose, General Statement</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>INESC-ID, Instituto Superior Técnico, Universidade de Lisboa</institution>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>LASIGE, Faculdade de Ciências, Universidade de Lisboa</institution>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>Matcha is an ontology matching system designed to tackle long-standing challenges such as complex and holistic ontology matching. It incorporates all of the key algorithms from AgreementMakerLight within a novel, broader core architecture that includes several new algorithms. In this year's edition, some strategies were revised to address gaps identified last year, and a few new strategies debuted, most notably the inclusion of Language Models in two of our algorithms. Matcha performed well overall, achieving the highest F-measure in 15 out of 43 distinct OAEI tasks and ranking in the top three in ten others.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1.2. Specific Techniques Used</title>
      <p>
        Matcha includes all of AML’s lexical and structural matching algorithms [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], as well as some of its background knowledge strategies [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. For this year’s OAEI, some matching techniques were
revised, and some were newly developed.
      </p>
      <p>
        One of the new matching algorithms uses a Language Model (LM) in order to go beyond
the information that is explicitly stated in the ontology and exploit the context that labels and
synonyms can provide when represented through a language model. The matching algorithm
uses the LM to represent the entities’ labels and synonyms as embeddings, which are
subsequently compared through cosine similarity. Similarly to last year, we used the pre-trained
sentence-BERT [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] all-MiniLM-L6-v2 model without fine-tuning.
      </p>
      <p>Class and ontology matching (Table 1): matches classes based on overlapping individuals that instantiate them, computed through conservative instance matching algorithms; matches ontologies by finding literal full-name matches between their lexicons, weighing matches according to the provenance of the names; matches ontologies by computing the cosine similarity between the language model embeddings of their lexicons; matches ontologies by using cross-references and/or exact lexical matches between them and a third mediating ontology; matches ontologies by measuring the maximum string similarity, using one of the four available string similarity measures; and matches ontologies by measuring word similarity, using a weighted Jaccard index.</p>
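      <p>As a sketch of the embedding comparison step above: the embeddings themselves would come from the all-MiniLM-L6-v2 sentence-BERT model (384 dimensions); the toy 4-dimensional vectors below are purely illustrative.</p>

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real sentence embeddings.
emb_heart = [0.9, 0.1, 0.3, 0.0]
emb_cardiac = [0.8, 0.2, 0.4, 0.1]
emb_femur = [0.0, 0.9, 0.1, 0.8]

# Labels with related meanings should score higher than unrelated ones.
print(cosine_similarity(emb_heart, emb_cardiac) >
      cosine_similarity(emb_heart, emb_femur))  # → True
```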
      <p>Instance matching (Table 1): matches individuals by finding literal matches between the values of their annotation and data properties; maps individuals by comparing their values through the ISub string similarity metric; maps individuals by comparing the lexicon entries of one with the values of the other, using a combination of string and word matching algorithms; and maps individuals by comparing sentence representations of the source and target labels, obtained with a LM trained in a multilingual setting.</p>
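      <p>The weighted Jaccard word similarity named above can be sketched as follows; the per-word weights here are illustrative assumptions, not Matcha's actual weighting scheme.</p>

```python
def weighted_jaccard(words_a, words_b, weights):
    """Weighted Jaccard index over two sets of words.

    `weights` maps a word to an evidence weight (illustrative here);
    plain Jaccard is the special case where every weight is 1.
    """
    union = words_a | words_b
    inter = words_a & words_b
    w_union = sum(weights.get(w, 1.0) for w in union)
    w_inter = sum(weights.get(w, 1.0) for w in inter)
    return w_inter / w_union if w_union else 0.0

a = {"heart", "valve", "disease"}
b = {"cardiac", "valve", "disease"}
# Down-weight a very common word so it carries less evidence.
print(round(weighted_jaccard(a, b, {"disease": 0.5}), 3))  # → 0.429
```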
      <p>
        Additionally, for any task that requires translation, we constructed a new translation module
that uses a pre-trained multilingual translation LM, the "M2M100" [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], with 1.2B parameters and
trained on 100 languages. The model uses a Transformer-based encoder-decoder architecture,
consisting of an encoder and a decoder network. The matching algorithm uses the encoder to
map each of the source and target ontologies’ labels to an embedding representation, followed
by a computation of the cosine similarity between the embeddings to generate a mapping score.
      </p>
      <p>Matcha’s matching algorithms are described in Table 1.</p>
    </sec>
    <sec id="sec-2">
      <title>1.3. Adaptations Made for the Evaluation</title>
      <p>
        The MELT [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] web-based evaluation package required by the OAEI was integrated into Matcha. Given two ontologies
and a set of parameters, Matcha generates a complete alignment
between them according to the type of entities to be matched. For local alignment tasks, where
each entity in the test set has a predetermined list of candidate matches, Matcha calculates
scores for each candidate. These candidates are then ranked based on the highest score obtained
from the various matching algorithms.
      </p>
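      <p>The candidate scoring and ranking step for local alignment tasks can be sketched as below; the two toy matcher functions are hypothetical stand-ins for Matcha's actual matching algorithms.</p>

```python
def rank_candidates(entity, candidates, matchers):
    """Rank candidate matches for one entity.

    Each matcher maps (entity, candidate) to a similarity in [0, 1];
    a candidate's final score is the maximum over all matchers.
    """
    scored = [(c, max(m(entity, c) for m in matchers)) for c in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored

# Hypothetical toy matchers: exact-label match and token overlap.
def exact(a, b):
    return 1.0 if a == b else 0.0

def token_overlap(a, b):
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

ranked = rank_candidates("heart valve",
                         ["mitral valve", "heart valve", "femur"],
                         [exact, token_overlap])
print(ranked[0])  # → ('heart valve', 1.0)
```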
      <p>Matcha was packaged in a docker container for ease of sharing and running the evaluation,
which included, for example, the files necessary for some of the algorithms, such as background
knowledge ontologies used in some tracks.</p>
      <sec id="sec-2-1">
        <title>2. Results</title>
        <p>Matcha’s results for OAEI are summarized in Table 2, with the exception of the results for the
Bio-ML track, which are presented in Table 3. Matcha performed well overall, achieving the
highest F-measure out of all systems in 15 out of the 43 distinct OAEI tasks, while ranking in
the top 3 in ten others.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2.1. Anatomy track</title>
      <p>Matcha continues to excel in this track, placing first among all systems, with all evaluation
metrics above 0.9 (0.951 precision, 0.931 recall, 0.941 F-measure). While not ranking
first in precision, both its precision and recall are very high, resulting in a high F-measure. Notably,
the second-best system achieves only 0.903 in F-measure.</p>
    </sec>
    <sec id="sec-4">
      <title>2.2. Archaeology Multilingual track</title>
      <p>This task had two participants: Matcha and LogMap with its three variants. Matcha achieved
first place in F-measure in three out of ten tasks, but in some tasks both systems achieved
low performance scores, including one task where all systems failed to return results. The results
are heterogeneous, with some tasks achieving high precision (including perfect precision for
the de-de task), while others achieve values close to zero (six out of ten). In terms of recall, the
results are also fairly heterogeneous, with values varying between close to zero and 0.75.</p>
      <p>As this is the first year that Matcha debuts the integrated MLLM-based translation
module, we plan to explore other multilingual pretrained models and possibly perform a
statistical analysis of the differences in language coverage and depth to sustain our
future choice of a multilingual model for Matcha’s translation module.</p>
    </sec>
    <sec id="sec-5">
      <title>2.3. Biodiversity and Ecology track</title>
      <p>This task also had two participants: Matcha and LogMap with its three variants. Matcha
achieved first place in the F-measure in three of the nine tasks, while in most of the other
tasks, it achieved scores that were very close to those obtained by LogMap. It is interesting to
note that, in the NCBITAXON-TAXREFLD group, Matcha achieves perfect recall in two tasks,
while in four others the value is very close to 1.0 (the lowest being 0.984). Precision is mostly
consistent, oscillating between 0.57 and 0.74, with poorer results in the
MACROALGAE-MACROZOOBENTHOS and FISH-ZOOPLANKTON tasks, which are around 0.2.</p>
    </sec>
    <sec id="sec-6">
      <title>2.4. Circular Economy track</title>
      <p>In this new track, Matcha placed first out of all systems by F-measure. The results are moderate
and close to other competing systems, with 0.393 precision and 0.611 recall. According to the
organizers and an additional assessment performed, Matcha’s optimal threshold could be set to
0.9, which would capture most true positives. On a less positive note, a manual evaluation by the
organizers shows that Matcha finds a fair number of false-positive mappings, probably due to
entities sharing the same name or the same words, an interesting insight
that could be used to further improve our strategies.</p>
    </sec>
    <sec id="sec-7">
      <title>2.5. Conference track</title>
      <p>Matcha tied in first place with another competing system, improving over last year’s placement
in all measures (precision, recall, and F-measure).</p>
      <p>An additional evaluation was run to assess differences in results across the sharp, discrete, and
continuous settings. From this assessment, it is noted that Matcha performs well in the sharp
evaluation in terms of recall (0.67), but in the discrete uncertain setting, while its precision
drops, recall improves to 0.77, indicating that it is successful at identifying uncertain matches.
Matcha also appears to adapt well to the uncertain framework in the continuous setting, as its
recall and F-measure remain relatively high at 0.75 and 0.71.</p>
      <p>Regarding the evaluation performed based on logical reasoning, Matcha has 86 conservativity
principle violations and 72 consistency principle violations in an alignment of 21 mappings.
However, as the organizers note, conservativity principle violations can simply be false positives.</p>
    </sec>
    <sec id="sec-8">
      <title>2.6. Digital Humanities track</title>
      <p>Matcha achieves overall good results in this track, even if somewhat heterogeneous. Matcha
ranks first in four out of the eight tasks, and in the top 3 in one other. Precision is high in some
tasks (reaching 1.0 in one of them); however, in some others the value is close to zero,
with Matcha yielding no results in one of these tasks. Recall suffers less from this variability,
with only the failed task having a value close to zero, and with good values for all others.</p>
      <p>Similarly to the Archaeology Multilingual track, this track uses the MLLM-based translation
module, which will be further explored and reviewed.</p>
    </sec>
    <sec id="sec-9">
      <title>2.7. Food Nutritional Composition track</title>
      <p>Matcha only competed in the “equal” relation test case, placing first against competing systems,
albeit with an F-measure of 0.1016, lower than in other tracks where it also places
first. While Matcha is less precise than other systems (0.0611 against 0.1333), it compensates
with its ten times higher recall (0.3013 against 0.0274). This track poses challenges that current
systems are clearly not well equipped to handle.</p>
    </sec>
    <sec id="sec-10">
      <title>2.8. Knowledge Graph track</title>
      <p>Matcha places last in this track when assessing the aggregated results. Looking at class mappings,
Matcha performs well overall, with 0.97 precision, 0.8 recall, and 0.87 F-measure,
outperforming both competing systems and the baselines. All systems fail at finding property
mappings. As for instance mappings, Matcha has lower performance, with 0.55 precision,
0.86 recall, and 0.63 F-measure, finding far more mappings than the other systems (249,510
mappings versus 6,653.8 by the next system), which decreases precision significantly.</p>
      <p>Looking at each of the test cases, a pattern emerges in which Matcha has lower precision
and higher recall compared to all other systems. However, the precision values are too low
to be compensated by the high recall, leading Matcha to place last by F-measure
in four of the five test cases and second in the remaining one.</p>
      <p>In this track, two main problems arise which need to be assessed and corrected: the lack of
property mappings and the excessive amount of instance mappings produced, which directly
influence the system’s precision.</p>
    </sec>
    <sec id="sec-11">
      <title>2.9. Multifarm track</title>
      <p>Matcha’s performance in this track is fairly balanced, considering that a new strategy of using LLMs
for multilingual machine translation debuted this year.</p>
      <p>This year Matcha ranked second out of four systems, improving on last year’s results, when
Matcha competed without the LLM module and ranked fourth out of four, with a clear improvement
in recall and F-measure. Although Matcha’s running time is within the same order of magnitude
as other competing systems, we recognize that it is very time-consuming in its current iteration
and could be optimized in future versions of the system.</p>
    </sec>
    <sec id="sec-12">
      <title>2.10. Bio-ML track</title>
      <p>This year marks Matcha’s first time competing in the local alignment challenges of this track.
While Matcha’s Bio-ML rankings based on F-score were moderate compared to the other
participating models, Matcha demonstrated a stronger relative performance when considering
MRR, especially in the unsupervised setting. Matcha’s middle-ranking F-scores stemmed from
generally high precision paired with relatively low recall, a trend also evident among most other
participating systems, highlighting that improving recall without compromising
precision remains an open issue. Notable Matcha results in the Bio-ML track include:
a top-3 MRR ranking in 3 of the 5 tasks in the unsupervised setting; first and second
place in the MRR ranking in the unsupervised and supervised settings of the SNOMED-FMA (body)
task, respectively; and second place in the F-score ranking in the unsupervised SNOMED-NCIT
(pharm) task.</p>
      <sec id="sec-12-1">
        <title>3. Conclusions</title>
        <p>Matcha achieved the highest F-measure in 15 out of the 43 distinct OAEI tasks and ranked in
the top 3 in ten others, making it overall the second-best system that competed this year.</p>
        <p>This year a new approach for the translation module debuted, allowing Matcha to
improve its rank in the Multifarm track and place fairly well in the new Archaeology
Multilingual and Digital Humanities tracks. Moreover, across all tasks, Matcha tends to outperform
other systems in recall, while tending to underperform in precision, sometimes due to an
exaggerated number of mappings that turn out to be false positives. Some tracks require
further review, such as the Knowledge Graph track, where Matcha fails to find any property
mappings.</p>
      </sec>
      <sec id="sec-12-2">
        <title>Acknowledgements</title>
        <p>This work was supported by FCT through fellowships 2022.11895.BD (Marta
Silva), 2022.10557.BD (Pedro Cotovio), the KATY project fellowship R881.7 (Laura Balbi),
and the LASIGE Research Unit, ref.
UIDB/00408/2020 (https://doi.org/10.54499/UIDB/00408/2020) and ref. UIDP/00408/2020
(https://doi.org/10.54499/UIDP/00408/2020). It was partially supported by the KATY project
which has received funding from the European Union’s Horizon 2020 research and innovation
program under grant agreement No 101017453, and it was also partially supported by project
41, HfPT: Health from Portugal, funded by the Portuguese Plano de Recuperação e Resiliência.
</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>É.</given-names>
            <surname>Thiéblin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Haemmerlé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Hernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Trojahn</surname>
          </string-name>
          , Survey on complex ontology matching,
          <source>Semantic Web</source>
          <volume>11</volume>
          (
          <year>2020</year>
          )
          <fpage>689</fpage>
          -
          <lpage>727</lpage>
          . URL: https://doi.org/10.3233/SW-190366. doi:10.3233/SW-190366.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>I.</given-names>
            <surname>Megdiche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Teste</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Trojahn</surname>
          </string-name>
          ,
          <article-title>An extensible linear approach for holistic ontology matching</article-title>
          , in: International Semantic Web Conference, Springer,
          <year>2016</year>
          , pp.
          <fpage>393</fpage>
          -
          <lpage>410</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. S.</given-names>
            <surname>Balasubramani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Couto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          ,
          <article-title>AgreementMakerLight</article-title>
          ,
          <source>Semantic Web</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Palmonari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. F.</given-names>
            <surname>Cruz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Couto</surname>
          </string-name>
          ,
          <article-title>The AgreementMakerLight Ontology Matching System</article-title>
          ,
          <source>in: OTM Conferences - ODBASE</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>527</fpage>
          -
          <lpage>541</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Faria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. F.</given-names>
            <surname>Cruz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Couto</surname>
          </string-name>
          ,
          <article-title>Automatic Background Knowledge Selection for Matching Biomedical Ontologies</article-title>
          ,
          <source>PLoS One</source>
          <volume>9</volume>
          (
          <year>2014</year>
          )
          <article-title>e111226</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Gurevych</surname>
          </string-name>
          ,
          <article-title>Sentence-bert: Sentence embeddings using siamese bert-networks</article-title>
          , arXiv preprint arXiv:1908.10084 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhosale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Schwenk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>El-Kishky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Baines</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Celebi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wenzek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Birch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Liptchinsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Edunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Auli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          ,
          <article-title>Beyond english-centric multilingual machine translation</article-title>
          ,
          <year>2020</year>
          . arXiv:2010.11125.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hertling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Portisch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Paulheim</surname>
          </string-name>
          ,
          <article-title>MELT - matching evaluation toolkit</article-title>
          ,
          <source>in: Semantic Systems. The Power of AI and Knowledge Graphs - 15th International Conference, SEMANTiCS 2019, Karlsruhe, Germany, September 9-12, 2019, Proceedings</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>231</fpage>
          -
          <lpage>245</lpage>
          . URL: https://doi.org/10.1007/978-3-030-33220-4_17. doi:10.1007/978-3-030-33220-4_17.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>