<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Supervised Ontology and Instance Matching with MELT</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sven Hertling</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Portisch</string-name>
          <email>jan.portisch@sap.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heiko Paulheim</string-name>
          <email>heiko@informatik.uni-mannheim.de</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Data and Web Science Group, University of Mannheim</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>SAP SE Product Engineering Financial Services</institution>
          ,
          <addr-line>Walldorf</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we present MELT-ML, a machine learning extension to the Matching and EvaLuation Toolkit (MELT) which facilitates the application of supervised learning for ontology and instance matching. Our contributions are twofold: We present an open source machine learning extension to the matching toolkit as well as two supervised learning use cases demonstrating the capabilities of the new extension.</p>
      </abstract>
      <kwd-group>
        <kwd>ontology matching</kwd>
        <kwd>supervised learning</kwd>
        <kwd>machine learning</kwd>
        <kwd>knowledge graph embeddings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Many similarity metrics and matching approaches have been proposed and developed to date. They are typically implemented as engineered systems which apply a process-oriented matching pipeline. Manually combining metrics, also called features in machine learning jargon, is typically very cumbersome. Supervised learning allows researchers and developers to focus on adding and defining features and to leave the weighting of those features and the decision making to a machine. This approach may also be suitable for developing generic matching systems that self-adapt depending on specific datasets or domains. Here, it makes sense to test and evaluate multiple classifiers at once in a fair, i.e. reproducible, way. Furthermore, recent advances in machine learning, such as in the area of knowledge graph embeddings, may also be applicable for the ontology and instance matching community. The existing evaluation and development platforms, such as the Alignment API [3], SEALS [7,33], or the HOBBIT [25] framework, make the application of such advances not as simple as it could be.</p>
      <p>The authors contributed equally to this paper.</p>
      <p>Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>In this paper, we present MELT-ML, an extension to the Matching and EvaLuation Toolkit (MELT). Our contribution is twofold: Firstly, we present a machine learning extension to the MELT framework (available in MELT 2.6) which simplifies the application of advanced machine learning algorithms in matching systems and which helps researchers to evaluate systems that exploit such techniques. Secondly, we present and evaluate two novel approaches in an exemplary manner, implemented and evaluated with the extension in order to demonstrate its functionality. We show that RDF2Vec [30] embeddings derived directly from the ontologies to be matched are capable of representing the internal structure of an ontology but do not provide any value for matching tasks with differently structured ontologies when evaluated as the only feature. We further show that multiple feature generators and a machine learning component help to obtain a high-precision alignment in the Ontology Alignment Evaluation Initiative (OAEI) knowledge graph track [11,8].</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>Classification is a flavor of supervised learning and denotes a machine learning approach where the learning system is presented with a set of records carrying a class or label. Given those records, the system is trained by trying to predict the correct class [18]. Transferred to the ontology alignment domain, the set of records can be regarded as a collection of correspondences where some of the correspondences are correct (class true) and some are incorrect (class false). Hence, the classification system at hand is binary.</p>
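      <p>Conceptually, such training data can be pictured as a set of correspondences enriched with feature values and a binary label; the feature names and values below are purely illustrative, not MELT's actual output:</p>

```python
# Each record is a correspondence carrying feature values and a binary label.
# Feature names and values are purely illustrative.
correspondences = [
    # (source entity, target entity, feature values, label)
    ("o1#Paper", "o2#Article", {"label_sim": 0.9, "neighbour_overlap": 0.7}, True),
    ("o1#Paper", "o2#Review",  {"label_sim": 0.4, "neighbour_overlap": 0.1}, False),
]

# Shape the records into a feature matrix X and a label vector y,
# ready for any binary classifier.
X = [[c[2]["label_sim"], c[2]["neighbour_overlap"]] for c in correspondences]
y = [c[3] for c in correspondences]
```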
      <p>The application of supervised learning is not new to ontology matching. In fact, even in the very first edition of the OAEI3 in 2004, the OLA matching system [5] performed a simple optimization of weights using the provided reference alignments. In the past, multiple publications [14,4,31,24,16] addressed supervised learning in ontology matching, occasionally also referred to as matching learning. Unsupervised machine learning approaches are less often used, but have been proposed for the task of combining matchers as well [23].</p>
      <p>More recently, Nkisi-Orji et al. [26] present a matching system that uses a multitude of features and a random forest classifier. The system is evaluated on the OAEI conference track [2] and the EuroVoc dataset, but did not participate in the actual evaluation campaign. Similarly, Wang et al. [32] present a system called OntoEmma which exploits a neural classifier together with 32 features. The system is evaluated on the large biomed track. However, the system did not participate in an OAEI campaign either. It should be mentioned here that a comparison between systems that have been trained with parts of the reference alignment and systems that have not is not really fair (despite being the typical approach).</p>
      <p>A recent OAEI-participating matching system also applies supervised learning: The POMap++ matching system [16] uses a local classifier which is not based on the reference alignment but on a locally created gold standard. The system also participated in the two most recent OAEI campaigns [17,15].
3 Back then, the competition was actually referred to as the EON Ontology Alignment Contest.</p>
      <p>The implementations of the approaches are typically not easily reusable or
available in a central framework.</p>
    </sec>
    <sec id="sec-3">
      <title>The MELT Framework</title>
      <p>Overview MELT [10] is a framework written in Java for ontology and instance matcher development, tuning, evaluation, and packaging. It supports both HOBBIT and SEALS, two heavily used evaluation platforms in the ontology matching community. The core parts of the framework are implemented in Java, but evaluation and packaging of matchers implemented in other languages is also supported. Since 2020, MELT is the official framework recommendation of the OAEI, and the MELT track repository is used to provide all track data required by SEALS. MELT is also capable of rendering Web dashboards for ontology matching results so that interested parties can analyze and compare matching results on the level of correspondences without any coding efforts [27]. This has been pioneered at the OAEI 2019 for the knowledge graph track.4 MELT is open-source5, under a permissive license, and is available on the Maven central repository6.</p>
      <p>Different Gold Standard Types Matching systems are typically evaluated against a reference alignment. A reference alignment may be complete or only partially complete. The latter means that not all entities in the matching task are aligned and that any entity not appearing in the gold standard cannot be judged. Therefore, the following five levels of completeness can be distinguished: (i) complete, (ii) partial with complete target and complete source, (iii) partial with complete target and incomplete source, (iv) partial with complete source and incomplete target, (v) partial with incomplete source and incomplete target. If the reference is complete, all correspondences not available in the reference alignment can be regarded as wrong. If only one part of the gold standard is complete (ii, iii, and iv), every correspondence involving an element of the complete side that is not available in the reference can be regarded as wrong. If the gold standard is incomplete (v), the correctness of correspondences not in the gold standard cannot be judged. For example, given that the gold standard is partial with complete target and complete source (case ii), and given the correspondence &lt;a, b, =, 1.0&gt;, the correspondence &lt;a, c, =, 1.0&gt; could be judged as wrong because it involves a, which is from the complete side of the alignment. On the other hand, the correspondence &lt;d, e, =, 1.0&gt; cannot be judged because it does not involve any element from the gold standard. This evaluation setting is used, for example, for the OAEI knowledge graph track. OAEI reference datasets are typically complete with the exception of the knowledge graph track. The completeness of references influences how matching systems have to be evaluated. MELT can handle all stated levels of completeness. The completeness can be set for every TestCase separately using the enum GoldStandardCompleteness. The completeness also influences the generation of negative correspondences for a gold standard in supervised learning. MELT supports matching system developers in this use case as well.
4 For a demo of the MELT dashboard, see https://dwslab.github.io/melt/anatomy_conference_dashboard.html
5 https://github.com/dwslab/melt/
6 https://mvnrepository.com/artifact/de.uni-mannheim.informatik.dws.melt</p>
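      <p>The judging rules for the different completeness levels can be sketched as a small decision procedure; the following is a simplified sketch of the logic, not MELT's implementation (correspondences are reduced to source-target pairs, relation and confidence are omitted):</p>

```python
def judge(corr, reference, gold_complete=False,
          source_complete=False, target_complete=False):
    """Judge a correspondence (source, target) against a possibly partial
    reference alignment. Returns True (correct), False (wrong), or
    None (cannot be judged)."""
    if corr in reference:
        return True
    if gold_complete:
        # A complete reference: everything not contained in it is wrong.
        return False
    source, target = corr
    # If one side is complete, a correspondence re-using a gold-standard
    # entity on that side must be wrong.
    if source_complete and source in {s for s, _ in reference}:
        return False
    if target_complete and target in {t for _, t in reference}:
        return False
    return None  # incomplete on both sides: no judgement possible
```

      <p>With a reference containing only the pair (a, b) and both sides complete, the pair (a, c) is judged as wrong while (d, e) cannot be judged, matching the example above.</p>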
    </sec>
    <sec id="sec-4">
      <title>Supervised Learning Extensions in MELT</title>
      <sec id="sec-4-1">
        <title>Python Wrapper</title>
        <p>As researchers apply advances in machine learning and natural language processing to other domains, they often turn to Python because leading machine learning libraries such as scikit-learn7, TensorFlow8, PyTorch9, Keras10, or gensim11 are not easily available for the Java language. In order to exploit functionalities provided by Python libraries in a consistent manner without a tool break, a wrapper is implemented in MELT which communicates with a Python backend via HTTP, as depicted in Figure 1. The server works out-of-the-box, requiring only that Python and the libraries listed in the requirements.txt file are available on the target system. The MELT-ML user can call methods in Java which are mapped to a Python call in the background. As of MELT 2.6, functionality from gensim and scikit-learn is wrapped.
7 https://scikit-learn.org/
7 https://scikit-learn.org/
8 https://www.tensorflow.org/
9 https://pytorch.org/
10 https://keras.io/
11 https://radimrehurek.com/gensim/</p>
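        <p>The wrapper pattern can be illustrated with a minimal JSON-over-HTTP backend. This is a hypothetical sketch using only the Python standard library; the actual MELT server exposes different endpoints and wraps gensim and scikit-learn functionality:</p>

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def cosine_similarity(payload):
    # Hypothetical service function standing in for wrapped library calls.
    a, b = payload["a"], payload["b"]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return {"similarity": dot / (norm_a * norm_b)}

class BackendHandler(BaseHTTPRequestHandler):
    """The Java side posts a JSON request; the Python side answers."""
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(cosine_similarity(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet

# An ephemeral port; the real backend process would call serve_forever().
server = HTTPServer(("127.0.0.1", 0), BackendHandler)
```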
      </sec>
      <sec id="sec-4-2">
        <title>Generation of Training Data</title>
        <p>Every classification approach needs features and class labels. In the case of matching, each example represents a correspondence, and the overall goal is to have an ML model which is capable of deciding whether a correspondence is correct or not. Thus, the matching component can only work as a filter, i.e., it can only remove correspondences from an already generated alignment.</p>
        <p>For training such a classifier, positive and negative examples are required. The positive ones can be generated by a high-precision matcher or by an externally provided alignment such as a sample of the reference alignment or manually created correspondences. As mentioned earlier, no OAEI track provides a dedicated alignment for training. Therefore, MELT provides a new sample(int n) method in the Alignment class for sampling n correct correspondences as well as sampleByFraction(double fraction) for sampling a fraction in the range (0, 1) of correct correspondences.</p>
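        <p>The sampling behaviour can be sketched in a few lines; this is a hypothetical re-implementation for illustration, as MELT's Alignment class operates on correspondence objects rather than plain pairs:</p>

```python
import random

def sample_by_fraction(alignment, fraction, seed=42):
    """Sample a fraction of correct correspondences for training,
    mirroring the idea of Alignment.sampleByFraction. The fraction is
    assumed to lie strictly between 0 and 1."""
    correspondences = sorted(alignment)          # deterministic order
    k = round(len(correspondences) * fraction)   # sample size
    return set(random.Random(seed).sample(correspondences, k))
```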
        <p>Negative examples can be easily generated in settings where the gold standard is complete or partially complete (with complete source and/or target, see Section 3). The reason is that any correspondence with an entity appearing in the positive examples can be regarded as incorrect. Thus, a recall-oriented matcher can generate an alignment, and all such correspondences represent the negative class. In cases where the gold standard is partial and the source and/or target is incomplete, each negative correspondence has to be manually created.</p>
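        <p>Under a (partially) complete gold standard, negatives can therefore be derived mechanically; the following is a sketch of that rule, not MELT's code:</p>

```python
def derive_negatives(candidates, positives):
    """Return candidate correspondences that re-use an entity from the
    positive examples without being positive themselves. This is valid
    whenever the gold standard is complete on the side of the
    re-used entity."""
    positive_sources = {s for s, _ in positives}
    positive_targets = {t for _, t in positives}
    return [c for c in candidates
            if c not in positives
            and (c[0] in positive_sources or c[1] in positive_targets)]
```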
      </sec>
      <sec id="sec-4-3">
        <title>Generation of Features</title>
        <p>The features for the correspondences are generated by one or more matchers which can be concatenated in a pipeline or any other control flow. MELT provides an explicit framework for storing the feature values in correspondence extensions (which are by default also serialized in the alignment format). The correspondence method addAdditionalConfidence(String key, double confidence) is used to add such feature values (more convenience methods exist).</p>
        <p>MELT already provides some out-of-the-box feature generators in the form of so-called filters and matchers. A matcher detects new correspondences. As of MELT 2.6, 17 matchers are directly available (e.g., different string similarity metrics). A filter requires an input alignment and adds additional confidences to the correspondences, or removes correspondences below a threshold. In MELT, machine learning is also included via a filter (MachineLearningScikitFilter). As of MELT 2.6, 21 filters are available. A selection is presented in the following:
SimilarNeighboursFilter Given an initial alignment of instances, the SimilarNeighboursFilter analyzes, for each of the instance correspondences, how many already matched neighbours the source and target instances share. It can be further customized to also include similar literals (defined by string processing methods). The share of neighbours can be added to the correspondence as an absolute value or relative to the total numbers of neighbours of source and target. For the latter, the user can choose from min (size of the intersection divided by the minimum number of neighbours of source or target), max, jaccard (size of the intersection divided by the size of the union), and dice (twice the size of the intersection divided by the sum of source and target neighbours).</p>
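        <p>These relative strategies correspond to standard set-overlap measures. A sketch of the scoring, operating on plain sets of neighbour identifiers rather than on the filter's actual data structures:</p>

```python
def neighbour_overlap(source_neighbours, target_neighbours, strategy="jaccard"):
    """Score the shared, already matched neighbours of a candidate
    instance correspondence using one of the strategies named above."""
    intersection = len(source_neighbours & target_neighbours)
    if strategy == "absolute":
        return intersection
    if strategy == "min":
        return intersection / min(len(source_neighbours), len(target_neighbours))
    if strategy == "max":
        return intersection / max(len(source_neighbours), len(target_neighbours))
    if strategy == "jaccard":
        return intersection / len(source_neighbours | target_neighbours)
    if strategy == "dice":
        return 2 * intersection / (len(source_neighbours) + len(target_neighbours))
    raise ValueError(strategy)
```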
        <p>CommonPropertiesFilter This filter selects instance matches based on the overlap of properties. The idea is that equal instances also share similar properties. Especially in the case of homonyms, this filter might help. For instance, given two instances with label 'bat', the string may refer to the mammal or to the racket, where the first sense has properties like 'taxon', 'age', or 'habitat' and the latter one has properties like 'material', 'quality', or 'producer'. This filter of course requires already matched properties. The added confidence can be further customized similarly to the previous filter. Furthermore, property URIs are by default filtered to exclude properties like rdfs:label.</p>
        <p>SimilarHierarchyFilter This component analyzes any hierarchy for given instance matches, such as a type hierarchy or a category taxonomy as given in the knowledge graph track. Thus, two properties are needed: 1) the instance-to-hierarchy property which connects the instance to the hierarchy (in case of a type hierarchy, this is rdf:type), and 2) the hierarchy property which connects the elements of the hierarchy (in case of a type hierarchy, this is rdfs:subClassOf). This filter needs matches in the hierarchy, which are counted similarly to the previous filters. Additionally, the confidence can be computed by a hierarchy-level-dependent value (the higher the match in the hierarchy, the lower the confidence). SimilarTypeFilter is a reduced version of it which just looks at the direct parent.</p>
        <p>BagOfWordsSetSimilarityFilter This filter analyzes the token overlap of the literals given by a specific property. The tokenizer can be freely chosen, as well as the overlap similarity.</p>
        <p>
          MachineLearningScikitFilter The actual classification part is implemented in the class MachineLearningScikitFilter. In the standard setting, a five-fold cross validation is executed to search for the model with the best f-measure. The following models and hyperparameters are tested:
- Decision Trees optimized by minimum leaf size and maximum depth of tree (1-20)
- Gradient Boosted Trees optimized by maximum depth (1, 6, 11, 16, 21) and number of trees (1, 21, 41, 61, 81, 101)
- Random Forest optimized by number of trees (1-100 in steps of 10) and minimum leaf size (1-10)
- Naïve Bayes (without specific parameter tuning)
- Support Vector Machines (SVM) with radial basis function kernel; C and gamma are tuned according to [13]
- Neural Network with one hidden layer in two different sizes (F/2 + 2 and sqrt(F)), and two hidden layers of sizes F/2 and sqrt(F), where F denotes the number of features
        </p>
        <p>All of these combinations are evaluated automatically with and without feature normalization (MinMaxScaler, which scales each feature to a range between zero and one). The best model is then trained on the whole training set and applied to the given alignment.</p>
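        <p>On the Python side, this corresponds to a scikit-learn grid search; the following condensed sketch covers one of the model families above (random forest) with a reduced grid and synthetic data for brevity:</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Toy data: rows are correspondences, columns are filter confidences.
rng = np.random.default_rng(0)
X = rng.random((60, 3))
y = np.rint(X[:, 0]).astype(int)  # synthetic binary labels

# Five-fold cross-validation optimizing f-measure, evaluated with and
# without min-max normalization. The full search would also cover the
# other model families and the larger grids listed above.
pipeline = Pipeline([("scale", MinMaxScaler()),
                     ("clf", RandomForestClassifier(random_state=0))])
grid = {
    "scale": [MinMaxScaler(), "passthrough"],
    "clf__n_estimators": [1, 21, 41],
    "clf__min_samples_leaf": [1, 5],
}
search = GridSearchCV(pipeline, grid, scoring="f1", cv=5)
search.fit(X, y)
best_model = search.best_estimator_  # refit on the whole training set
```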
      </sec>
      <sec id="sec-4-4">
        <title>Analysis of Matches</title>
        <p>A correspondence which was found by a matching system and which appears in the reference alignment is referred to as a true positive. A residual true positive correspondence is a true positive correspondence that is not trivial as defined by a trivial alignment. The trivial alignment can be given or calculated by a simple baseline matcher. String matches, for instance, are often referred to as trivial. Given a reference alignment, a system alignment, and a trivial alignment, the residual recall can be calculated as the share of non-trivial correspondences found by the matching system [1,6].</p>
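        <p>Expressed over sets of correspondences, the measure is a direct sketch of this definition:</p>

```python
def residual_recall(system, trivial, reference):
    """Share of the non-trivial reference correspondences that the
    matching system found (cf. [1,6])."""
    residual_reference = reference - trivial
    return len(system & residual_reference) / len(residual_reference)
```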
        <p>If a matcher was trained using a sample of the reference alignment and is also evaluated on the reference alignment, a true positive match can only be counted as meaningful if it was not available in the training set before. In MELT, the baseline matcher can be set dynamically for an evaluation. Therefore, for supervised matching tasks where a sample from the reference is used, the sample can be set as the baseline solution (using the ForwardMatcher) so that only additionally found matches are counted as residual true positives. Using the alignment cube file12, residual true positives can be analyzed at the level of individual correspondences.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Exemplary Analysis</title>
      <sec id="sec-5-1">
        <title>RDF2Vec Vector Projections</title>
        <p>Experiment In this experiment, the ontologies to be matched are embedded and a projection is used to determine matches. RDF2Vec is a knowledge graph embedding approach which generates random walks for each node in the graph to be embedded and afterwards runs the word2vec [21,22] algorithm on the generated walks. Thereby, a vector for each node in the graph is obtained. The RDF graph is used in RDF2Vec without any pre-processing, unlike other approaches such as OWL2Vec [12]. The embedding approach chosen here has been used on external background knowledge for ontology alignment before [29].</p>
        <p>In this setting, we train embeddings for the ontologies to be matched. In order to do so, we integrate the jRDF2Vec13 [28] framework into MELT to train the embedding spaces. Using the functionalities provided in the MELT-ML package, we train a linear projection from the source vector space into the target vector space. In order to generate a training dataset for the projection, the sampleByFraction(double fraction) method is used. For each source node, the closest target node in the embedding space is determined. If the confidence for a match is above a threshold t, the correspondence is added to the system alignment.
12 The alignment cube file is a CSV file listing all correspondences found and not found (together with filtering properties) that is generated by the EvaluatorCSV.
13 https://github.com/dwslab/jRDF2Vec</p>
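        <p>The projection step can be sketched with a least-squares linear map between the two embedding spaces. This is an illustrative NumPy sketch assuming a plain linear projection and cosine nearest-neighbour search; the MELT-ML implementation details may differ:</p>

```python
import numpy as np

def train_projection(source_vectors, target_vectors):
    """Least-squares linear map W such that source_vectors @ W
    approximates target_vectors; trained on the sampled anchor pairs."""
    W, _, _, _ = np.linalg.lstsq(source_vectors, target_vectors, rcond=None)
    return W

def closest_target(projected, target_matrix):
    """Index and cosine similarity of the nearest target vector."""
    norms = np.linalg.norm(target_matrix, axis=1) * np.linalg.norm(projected)
    similarities = target_matrix @ projected / norms
    best = int(np.argmax(similarities))
    return best, float(similarities[best])
```

        <p>A correspondence would then only be kept if the returned similarity exceeds the threshold t.</p>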
        <p>Here, we do not apply any additional matching techniques such as string matching. The approach is fully independent of any stated label information. The exemplary matching system is available online as an example.14
Results For the vector training, we generate 100 random walks with a depth of 4 per node and train skip-gram (SG) embeddings with 50 dimensions, a minimum count of 1, and a window size of 5. We use a sampling rate of 50% and a threshold of 0.85. While the implemented matcher fails to generate a meaningful residual recall when the two ontologies to be matched are differently structured, it performs very well when the ontologies are of the same structure, as in the multifarm track. Here, the approach generates many residual true positives, with a residual recall of up to 61% on iasted-iasted, as seen in Table 1. Thus, it could be shown that RDF2Vec embeddings do contain structural information of the knowledge graph that is embedded.</p>
        <p>Table 1. Results on the multifarm test cases.
Multifarm Test Case    P      R      R+     F      # of TP  # of FP  # of FN
iasted-iasted          0.8232 0.7459 0.6111 0.7836 135      29       46
conference-conference  0.7065 0.5285 0.1967 0.6047 65       27       58
confOf-confOf          0.9111 0.5541 0.1081 0.6891 41       4        33</p>
        <p>Experiment In this experiment, the instances of the OAEI knowledge graph track are matched. First, a basic matcher (BaseMatcher) is used to generate a recall-oriented alignment by applying simple string matching on the property values of rdfs:label and skos:altLabel. The text is compared once using string equality and once in a normalized fashion (non-ASCII characters are removed and the whole string is lowercased).</p>
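        <p>The normalization used for the second comparison can be sketched as follows; this is assumed behaviour based on the description above, and the BaseMatcher may differ in detail:</p>

```python
def normalize(label):
    """Drop non-ASCII characters and lowercase the remainder."""
    return "".join(ch for ch in label if ch.isascii()).lower()
```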
        <p>Given this alignment, the feature generators / filters described above are applied in isolation to re-rank the correspondences, and afterwards the NaiveDescendingExtractor [20] is used to create a one-to-one alignment based on the best confidence.</p>
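        <p>The extraction idea can be sketched as a greedy procedure; the following is an illustrative re-implementation of the naive descending idea [20], not MELT's class:</p>

```python
def naive_descending_extraction(alignment):
    """Visit correspondences by descending confidence and keep one only
    if neither its source nor its target has been used yet, yielding a
    one-to-one alignment."""
    kept, used_sources, used_targets = [], set(), set()
    for source, target, confidence in sorted(alignment, key=lambda c: -c[2]):
        if source not in used_sources and target not in used_targets:
            kept.append((source, target, confidence))
            used_sources.add(source)
            used_targets.add(target)
    return kept
```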
        <p>In contrast to this, another, supervised approach is tried out. After executing the BaseMatcher, all feature generators are applied one after another, where each filter adds one feature value. The feature values are calculated independently of each other. This results in an alignment where each correspondence has the additional confidences in its extensions. As a last step, the MachineLearningScikitFilter is executed. The training alignment is generated by sampling all correspondences from the BaseMatcher in which the source or the target is involved. A correspondence is a positive training example if the source and the target appear in the input alignment (which is, in our case, the sampled reference alignment) and a negative example in all other cases.
14 https://github.com/dwslab/melt/tree/master/examples/RDF2VecMatcher</p>
        <p>The search for the machine learning model is executed as a five-fold cross validation, and the best model is used to classify all correspondences given by the BaseMatcher. The whole setup is available on GitHub15.</p>
        <p>Results In all filters, the absolute numbers of overlapping entities are used (they are normalized during a grid search for the best model). In the SimilarNeighboursFilter, the literals are compared with text equality, and the hierarchy filter compares the categories of the Wiki pages. The SimilarTypeFilter analyzes the direct classes, which are extracted from templates (indicated by the text 'infobox'). The results for this experiment are depicted in Table 2, which shows that no single feature can be used for all test cases because different Wiki combinations (test cases) require different filters. The BaseMatcher already achieves a good f-measure, which is also in line with previous analyses [9]. When executing the MachineLearningScikitFilter, the precision can be increased for three test cases, and the associated drop in recall is relatively small. It can further be seen that there is not one single optimal classifier among the classifiers tested.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion and Outlook</title>
      <p>With MELT-ML, we have presented a machine learning extension for the MELT framework which facilitates feature generation and feature combination. The latter are included as filters to refine existing matches. MELT also allows for the evaluation of ML-based matching systems.</p>
      <p>In the future, we plan to extend the functionality provided by the Python wrapper to further facilitate machine learning in matching applications. We further plan to extend the number of feature generators. With our contribution, we hope to encourage OAEI participants to apply and evaluate supervised matching techniques. In addition, we intend to further study different strategies and ratios for the generation of negative examples.</p>
      <p>We further would like to emphasize that a special machine learning track with dedicated training and testing alignments might benefit the community, would increase the transparency in terms of matching system performance, and might further increase the number of participants, since researchers use OAEI datasets for supervised learning but there is no official channel to participate if parts of the reference alignment are required.
15 https://github.com/dwslab/melt/tree/master/examples/supervisedKGTrackMatcher</p>
      <p>[Table 2: precision, recall, and f-measure of the individual filters and of the MachineLearningScikitFilter per knowledge graph track test case.]</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. Jose Luis Aguirre, Bernardo Cuenca Grau, Kai Eckert, Jérôme Euzenat, Alfio Ferrara, Willem Robert van Hage, Laura Hollink, Ernesto Jimenez-Ruiz, Christian Meilicke, Andriy Nikolov, et al. Results of the ontology alignment evaluation initiative 2012. 2012.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Michelle Cheatham and Pascal Hitzler. Conference v2.0: An uncertain version of the OAEI conference benchmark. In ISWC 2014, Proceedings, Part II, pages 33-48, 2014.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. Jérôme David, Jérôme Euzenat, François Scharffe, and Cássia Trojahn dos Santos. The Alignment API 4.0. Semantic Web, 2(1):3-10, 2011.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. Kai Eckert, Christian Meilicke, and Heiner Stuckenschmidt. Improving ontology matching using meta-level learning. In ESWC 2009, Proceedings, pages 158-172, 2009.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>5. Jérôme Euzenat, David Loup, Mohamed Touzani, and Petko Valtchev. Ontology alignment with OLA. In EON 2004, Evaluation of Ontology-based Tools, Proceedings of the 3rd International Workshop on Evaluation of Ontology-based Tools held at ISWC 2004, 2004.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. Jérôme Euzenat and Pavel Shvaiko. Ontology Matching, chapter 9, pages 285-317. Springer, New York, 2nd edition, 2013.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>7. Raúl García-Castro, Miguel Esteban-Gutiérrez, and Asunción Gómez-Pérez. Towards an infrastructure for the evaluation of semantic technologies. In eChallenges e-2010 Conference, pages 1-7. IEEE, 2010.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Sven</given-names>
            <surname>Hertling</surname>
          </string-name>
          and
          <string-name>
            <given-names>Heiko</given-names>
            <surname>Paulheim</surname>
          </string-name>
          .
          <article-title>DBkWik: A consolidated knowledge graph from thousands of wikis</article-title>
          . In
          <source>2018 IEEE International Conference on Big Knowledge, ICBK 2018, Singapore, November 17-18, 2018</source>
          , pages
          <fpage>17</fpage>
          –
          <lpage>24</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Sven</given-names>
            <surname>Hertling</surname>
          </string-name>
          and
          <string-name>
            <given-names>Heiko</given-names>
            <surname>Paulheim</surname>
          </string-name>
          .
          <article-title>The knowledge graph track at OAEI - gold standards, baselines, and the golden hammer bias</article-title>
          .
          In
          <source>ESWC 2020, Proceedings</source>
          , pages
          <fpage>343</fpage>
          –
          <lpage>359</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Sven</given-names>
            <surname>Hertling</surname>
          </string-name>
          , Jan Portisch, and
          <string-name>
            <given-names>Heiko</given-names>
            <surname>Paulheim</surname>
          </string-name>
          .
          <article-title>MELT - matching evaluation toolkit</article-title>
          . In
          <source>Semantic Systems. The Power of AI and Knowledge Graphs - 15th International Conference, SEMANTiCS 2019, Proceedings</source>
          , pages
          <fpage>231</fpage>
          –
          <lpage>245</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Alexandra</given-names>
            <surname>Hofmann</surname>
          </string-name>
          , Samresh Perchani, Jan Portisch, Sven Hertling, and
          <string-name>
            <given-names>Heiko</given-names>
            <surname>Paulheim</surname>
          </string-name>
          .
          <article-title>DBkWik: Towards knowledge graph creation from thousands of wikis</article-title>
          . In
          <source>Proceedings of the ISWC 2017 Posters &amp; Demonstrations and Industry Tracks, Vienna, Austria, October 23rd to 25th, 2017</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. Ole Magnus Holter, Erik Bryhn Myklebust, Jiaoyan Chen, and
          <string-name>
            <given-names>Ernesto</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          .
          <article-title>Embedding OWL ontologies with OWL2Vec</article-title>
          . In
          <source>ISWC 2019 Satellite Tracks (Posters &amp; Demonstrations, Industry, and Outrageous Ideas)</source>
          , pages
          <fpage>33</fpage>
          –
          <lpage>36</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Chih-Wei</given-names>
            <surname>Hsu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Chih-Chung</given-names>
            <surname>Chang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Chih-Jen</given-names>
            <surname>Lin</surname>
          </string-name>
          .
          <article-title>A practical guide to support vector classification</article-title>
          .
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>Ryutaro</given-names>
            <surname>Ichise</surname>
          </string-name>
          .
          <article-title>Machine learning approach for ontology mapping using multiple concept similarity measures</article-title>
          .
          In
          <source>Seventh IEEE/ACIS International Conference on Computer and Information Science (ICIS 2008)</source>
          , pages
          <fpage>340</fpage>
          –
          <lpage>346</lpage>
          . IEEE,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>Amir</given-names>
            <surname>Laadhar</surname>
          </string-name>
          , Faiza Ghozzi, Imen Megdiche, Franck Ravat, Olivier Teste, and Faez Gargouri.
          <article-title>OAEI 2018 results of POMap++</article-title>
          . In
          <source>OM@ISWC 2018</source>
          , pages
          <fpage>192</fpage>
          –
          <lpage>196</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>Amir</given-names>
            <surname>Laadhar</surname>
          </string-name>
          , Faiza Ghozzi, Imen Megdiche, Franck Ravat, Olivier Teste, and Faez Gargouri.
          <article-title>The impact of imbalanced training data on local matching learning of ontologies</article-title>
          . In
          <source>Business Information Systems - 22nd International Conference, BIS 2019, Proceedings, Part I</source>
          , pages
          <fpage>162</fpage>
          –
          <lpage>175</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>Amir</given-names>
            <surname>Laadhar</surname>
          </string-name>
          , Faiza Ghozzi, Imen Megdiche, Franck Ravat, Olivier Teste, and Faez Gargouri.
          <article-title>POMap++ results for OAEI 2019: Fully automated machine learning approach for ontology matching</article-title>
          . In
          <source>OM@ISWC 2019</source>
          , pages
          <fpage>169</fpage>
          –
          <lpage>174</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18. Bing Liu.
          <source>Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data</source>
          . Data-Centric Systems and Applications. Springer, Heidelberg; New York, 2nd edition,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>Christian</given-names>
            <surname>Meilicke</surname>
          </string-name>
          , Raúl García-Castro, Fred Freitas, Willem Robert van Hage, Elena Montiel-Ponsoda, Ryan Ribeiro de Azevedo, Heiner Stuckenschmidt, Ondřej Šváb-Zamazal, Vojtěch Svátek, Andrei Tamilin, Cássia Trojahn dos Santos, and
          <string-name>
            <given-names>Shenghui</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <article-title>MultiFarm: A benchmark for multilingual ontology matching</article-title>
          .
          <source>J. Web Semant.</source>
          ,
          <volume>15</volume>
          :
          <fpage>62</fpage>
          –
          <lpage>68</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>Christian</given-names>
            <surname>Meilicke</surname>
          </string-name>
          and
          <string-name>
            <given-names>Heiner</given-names>
            <surname>Stuckenschmidt</surname>
          </string-name>
          .
          <article-title>Analyzing mapping extraction approaches</article-title>
          . In
          <source>Proceedings of the 2nd International Conference on Ontology Matching - Volume 304</source>
          , pages
          <fpage>25</fpage>
          –
          <lpage>36</lpage>
          . CEUR-WS.org,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Kai Chen, Greg Corrado, and Jeffrey Dean.
          <article-title>Efficient estimation of word representations in vector space</article-title>
          . In
          <source>1st International Conference on Learning Representations, ICLR 2013, Workshop Track Proceedings</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Ilya Sutskever, Kai Chen, Gregory S. Corrado, and Jeffrey Dean.
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          . In
          <source>Advances in Neural Information Processing Systems</source>
          , pages
          <fpage>3111</fpage>
          –
          <lpage>3119</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>Alexander C.</given-names>
            <surname>Müller</surname>
          </string-name>
          and
          <string-name>
            <given-names>Heiko</given-names>
            <surname>Paulheim</surname>
          </string-name>
          .
          <article-title>Towards combining ontology matchers via anomaly detection</article-title>
          . In
          <source>OM</source>
          , pages
          <fpage>40</fpage>
          –
          <lpage>44</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24. DuyHoa Ngo, Zohra Bellahsene, and
          <string-name>
            <given-names>Rémi</given-names>
            <surname>Coletta</surname>
          </string-name>
          .
          <article-title>A generic approach for combining linguistic and context profile metrics in ontology matching</article-title>
          . In
          <source>OTM Confederated International Conferences</source>
          , pages
          <fpage>800</fpage>
          –
          <lpage>807</lpage>
          . Springer,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <given-names>Axel-Cyrille</given-names>
            <surname>Ngonga Ngomo</surname>
          </string-name>
          and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Röder</surname>
          </string-name>
          .
          <article-title>HOBBIT: Holistic benchmarking for big linked data</article-title>
          .
          <source>ERCIM News</source>
          , (
          <issue>105</issue>
          ),
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>Ikechukwu</given-names>
            <surname>Nkisi-Orji</surname>
          </string-name>
          , Nirmalie Wiratunga,
          <string-name>
            <given-names>Stewart</given-names>
            <surname>Massie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kit-Ying</given-names>
            <surname>Hui</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Rachel</given-names>
            <surname>Heaven</surname>
          </string-name>
          .
          <article-title>Ontology alignment based on word embedding and random forest classification</article-title>
          . In
          <source>ECML PKDD 2018, Proceedings, Part I</source>
          , pages
          <fpage>557</fpage>
          –
          <lpage>572</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <given-names>Jan</given-names>
            <surname>Portisch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sven</given-names>
            <surname>Hertling</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Heiko</given-names>
            <surname>Paulheim</surname>
          </string-name>
          .
          <article-title>Visual analysis of ontology matching results with the MELT dashboard</article-title>
          . In
          <source>The Semantic Web: ESWC 2020 Satellite Events</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Jan</surname>
            <given-names>Portisch</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Hladik</surname>
          </string-name>
          , and Heiko Paulheim.
          <article-title>RDF2Vec Light - A Lightweight Approach for Knowledge Graph Embeddings</article-title>
          .
          In
          <source>Proceedings of the ISWC 2020 Posters &amp; Demonstrations</source>
          ,
          <year>2020</year>
          . In print.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <given-names>Jan</given-names>
            <surname>Portisch</surname>
          </string-name>
          and
          <string-name>
            <given-names>Heiko</given-names>
            <surname>Paulheim</surname>
          </string-name>
          .
          <article-title>ALOD2Vec matcher</article-title>
          . In
          <source>OM@ISWC 2018</source>
          , pages
          <fpage>132</fpage>
          –
          <lpage>137</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Petar</surname>
            <given-names>Ristoski</given-names>
          </string-name>
          , Jessica Rosati, Tommaso Di Noia, Renato De Leone, and Heiko Paulheim.
          <article-title>RDF2Vec: RDF graph embeddings and their applications</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>10</volume>
          (
          <issue>4</issue>
          ):
          <fpage>721</fpage>
          –
          <lpage>752</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <given-names>Bita</given-names>
            <surname>Shadgar</surname>
          </string-name>
          , Azadeh Haratian Nejhadi, and
          <string-name>
            <given-names>Alireza</given-names>
            <surname>Osareh</surname>
          </string-name>
          .
          <article-title>Ontology alignment using machine learning techniques</article-title>
          .
          <source>International Journal of Computer Science and Information Technology</source>
          ,
          <volume>3</volume>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <given-names>Lucy Lu</given-names>
            <surname>Wang</surname>
          </string-name>
          , Chandra Bhagavatula, Mark Neumann, Kyle Lo, Chris Wilhelm, and
          <string-name>
            <given-names>Waleed</given-names>
            <surname>Ammar</surname>
          </string-name>
          .
          <article-title>Ontology alignment in the biomedical domain using entity definitions and context</article-title>
          .
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <given-names>Stuart N.</given-names>
            <surname>Wrigley</surname>
          </string-name>
          , Raúl García-Castro, and
          <string-name>
            <given-names>Lyndon J. B.</given-names>
            <surname>Nixon</surname>
          </string-name>
          .
          <article-title>Semantic evaluation at large scale (SEALS)</article-title>
          . In
          <source>Proceedings of the 21st World Wide Web Conference, WWW 2012</source>
          , pages
          <fpage>299</fpage>
          –
          <lpage>302</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>