<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>FOIS</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Using BERT Models to Automatically Classify Domain Concepts into DOLCE Top-Level Concepts: A Study of the OAEI Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Guilherme Sousa</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rinaldo Lima</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Renata Vieira</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cassia Trojahn</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CIDEHUS, Universidade de Évora</institution>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universidade Rural de Pernambuco</institution>
          ,
          <addr-line>Recife</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>19</volume>
      <fpage>19</fpage>
      <lpage>20</lpage>
      <abstract>
<p>Top-level ontologies provide a set of foundational concepts that have a well-founded philosophical meaning, being a useful tool in ontology engineering. However, in practice, few domain ontologies integrate top-level concepts. One of the difficulties refers to the selection of appropriate top-level concepts. This paper presents an analysis of top-level categories of a set of well-known domain ontologies from ontology matching benchmarks. Our main hypothesis is that training classification models using only concept comments (i.e., rdfs:comment) from top-level concepts can improve reported results in the literature. We then consider the best classifiers to estimate the distribution of concepts from Ontology Alignment Evaluation Initiative (OAEI) ontologies aligned to DOLCE top-level concepts.</p>
      </abstract>
      <kwd-group>
<kwd>Foundational Ontologies</kwd>
        <kwd>Top Level Prediction</kwd>
        <kwd>Ontology Matching</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Top-level ontologies provide a set of foundational concepts that have a well-founded
philosophical meaning, being a useful tool in ontology engineering [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. They play an essential role in
different tasks, such as ontology matching, providing a bridge for different ontologies [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ].
However, not all ontologies were built using top-level ontologies as a foundation and some
existing ones are too large to be manually annotated. In this sense, the use of automatic top-level
classifiers can help to establish a link between domain and foundational ontologies. In order to
train these classifiers, a large amount of labeled data aligned with top-level concepts is required.
One relevant source of such data is OntoWordNet [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] which aligns WordNet [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] synsets to
DOLCE [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] concepts.
      </p>
      <p>
A recent effort in this direction was made in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], where a training dataset was constructed
using labels and comments associated with the entities in OntoWordNet, which are aligned with
top-level DOLCE concepts. Using this data, several classifiers were evaluated for predicting
top-level concepts of entities. Based on that previous work, in this paper, we evaluate the
performance of a set of classification models and the impact of using comments as features in
the classification task. We address the cases of multi-inheritance, which may lead to different
top-level concepts in DOLCE, in a different manner from this previous work: by disambiguating
cases that lead to a unique top-level concept and filtering out those that lead to multiple concepts.
We then select the best classifier to study the distribution of top-level concepts in well-known
domain ontologies from benchmarks used for evaluating matching systems. Our study analyses
the distribution of the concepts of the ontologies from each track from the Ontology Alignment
Evaluation Initiative (OAEI) [
        <xref ref-type="bibr" rid="ref8">8</xref>
]. This is the first effort in this direction, and our intuition is that
concepts in ontology correspondences (a correspondence is a triple involving a source concept
in the source ontology, a target concept from the target ontology, and a relation
between them) tend to share the same top-level concept.
      </p>
      <p>The remainder of this paper is structured as follows: in Section 2 related work is presented.
In Section 3 the multi-inheritance problem is discussed along with the approaches adopted. Section 4
presents an evaluation of the performance of the models. Section 5 presents the analysis of
OAEI ontologies and, finally, in Section 6 the conclusion and future work are discussed.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
Previous work has shown the importance of associating top-level concepts with domain
ontologies. In [
        <xref ref-type="bibr" rid="ref9">9</xref>
], correspondences between the DBpedia ontology and DOLCE-Zero [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] have been
used to identify inconsistent statements in DBpedia. In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] an alignment between a foundational
ontology (BFO) and a biomedical ontology (GO) is used for filtering out correspondences at the
domain level that relate two different kinds of ontology entities.
      </p>
      <p>
The impact of using top-level ontologies as semantic bridges in ontology matching
has been analyzed in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], where a set of algorithms exploiting such bridges are applied and the
circumstances under which foundational ontologies improve matching approaches are studied.
They propose algorithms that use structural and mixed information and show that different
combinations have different impacts on precision and recall. In [13], OAEI ontologies
were manually aligned to UFO, adopting a set of patterns grounded in the UFO ontology. In [14],
a domain ontology describing web services (OWL-S) has been manually aligned to
DOLCE-Lite-Plus, in order to overcome conceptual ambiguity, poor axiomatization, loose design, and
narrow scope of the domain ontology. The difficulties of such a manual alignment have also
been addressed in [15], where the authors evaluate the performance of manual classification of
entities into top-level concepts. The experiment was conducted by asking experts to manually
classify a set of entities into top-level concepts. They showed a high level of disagreement
between experts and concluded that a methodological framework for this integration is needed.
      </p>
      <p>
        In order to automate this process, in [16], word sense disambiguation and word embedding
models have been used to automatically align top and domain concepts. The evaluation has been
conducted with the task of associating DOLCE and SUMO top-level concepts to ontologies from
three different domains. Automation has also been addressed in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The authors organize
two datasets based on OntoWordNet with the goal of training top-level concept classifiers of
ontology entities. The first dataset contains OntoWordNet entities with their respective DOLCE
concepts. The second dataset contains the same entities but is classified into 5 top-level concepts
(Endurant, Perdurant, Quality, Situation, and Abstract). Along with the datasets, the authors
evaluate several models that predict the top-level concept based on entity labels and comments.
In their following work [17], they propose a method to extract two datasets from OntoWordNet
that have target concepts from DOLCE Lite and DOLCE Lite Plus. Different language models
are evaluated on the task of predicting the top-level concept from the textual comments;
BERT base achieves the best results in predicting the concept from comments. Their models
were not available at the time we conducted our study.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Materials and Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Training Datasets</title>
        <p>
          This section describes the characteristics of the datasets from [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and how they have been rebuilt
and extended in order to deal with multi-inheritance cases1.
        </p>
        <p>
          Dataset Lopes22-5c This is the original dataset from [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], which is used to train models for
top-level concept prediction. The dataset is built from OntoWordNet containing 116838 entities.
It provides links for each concept of OntoWordNet to one of the 5 top concepts of DOLCE
(Endurant, Perdurant, Quality, Situation, and Abstract). This dataset is composed of 3 columns
(Table 1): Concept (DOLCE top-level concept), Label (OntoWordNet entity label, rdfs:label),
and Comment (OntoWordNet entity comment, rdfs:comment).
        </p>
        <p>Dataset Sousa23-5c The Sousa23-5c rebuilds the Lopes22-5c dataset while taking into
account the problem of multi-inheritance, which is further detailed in Section 3.1.1. Hence,
the resulting dataset Sousa23-5c differs from Lopes22-5c since the strategy of dealing with
multi-inheritance filters out ambiguous entities, while in Lopes22-5c, the entity is inserted multiple
times, once with each possible top-level concept.</p>
        <p>1The source code used in dataset generation and experiments can be found at https://gitlab.irit.fr/melodi/
ontology-matching/top-level. For the rebuild of the dataset, the version of OntoWordNet used was downloaded
from http://www.loa.istc.cnr.it/ontologies/OWN/OWN.owl (on 01/04/23), containing 66065 entities.</p>
<p>Dataset Sousa23-6c Another characteristic of the Lopes22-5c dataset is its highly imbalanced
concept distribution (Endurant 76%, Perdurant 10%, Quality 4%, Situation 6%, Abstract 3%). Our
proposal to deal with this problem is to break Endurant into two groups of concepts, generating
a more balanced dataset with 6 concepts. This dataset was built from Sousa23-5c by following
the hierarchy of entities until reaching one of the 5 top-level concepts (Endurant, Perdurant,
Quality, Situation, Abstract), and in the case that the type Endurant is found, it is replaced by
the immediate child in the path (Physical-endurant or Non-physical-endurant).</p>
        <p>The concept distribution of the 3 datasets is presented in Table 2.</p>
        <sec id="sec-3-1-1">
          <title>3.1.1. Dealing with Multi-Inheritance in DOLCE and OntoWordNet</title>
          <p>In order to deal with the case of multiple paths to the top-level concepts, we consider two distinct
scenarios: one when the multi-inheritance occurs in the WordNet part of OntoWordNet, and the
other when it is present in the DOLCE hierarchy. If an entity in WordNet has multi-inheritance,
many paths are traversed, and if they all lead to the same type, the entity is added to the dataset.
If the paths diverge, the entity is ignored.</p>
<p>In DOLCE, one example of a concept without a direct path to the proposed top-level concepts
is Physical-realization, a sub-concept of Spatio-temporal-particular, which in turn is the
super-concept of Endurant, Perdurant, and Quality, causing ambiguity. To deal with these cases,
when multi-inheritance occurs in the DOLCE hierarchy, a breadth-first search is performed until
one of the defined top-level concepts is found. If the concept found is Spatio-temporal-particular,
then the WordNet entity is not added to the dataset.</p>
          <p>From the total of 66065 entities present in OntoWordNet, 889 entities in the WordNet hierarchy
have multi-inheritance. However, 5023 entities in the DOLCE hierarchy remain ambiguous even
after applying the strategy mentioned above. To solve this problem, for DOLCE concepts that
do not have a direct superclass, a breadth-first search is performed by traversing the predicates
RDFS.subClassOf, OWL.equivalentClass, OWL.intersectionOf, OWL.unionOf, RDF.first, RDF.rest in
decreasing order of priority, adding the resulting objects to the priority queue used for the search.
Using this approach, the distribution of concepts remained deterministic over several runs.</p>
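<p>The traversal described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a toy dict-based hierarchy instead of the actual OWL graph, approximates the priority queue with a FIFO queue that enqueues objects in the fixed predicate order, and all function and variable names are our own.</p>

```python
from collections import deque

# The five target top-level concepts of the 5-class datasets
TOP_CONCEPTS = {"Endurant", "Perdurant", "Quality", "Situation", "Abstract"}
# Predicates traversed in decreasing order of priority (as in the text)
PREDICATES = ["subClassOf", "equivalentClass", "intersectionOf",
              "unionOf", "first", "rest"]

def find_top_concept(graph, start):
    """Breadth-first search upward until a defined top-level concept is found.

    `graph` maps a concept to {predicate: [objects]}, a toy stand-in for the
    real RDF graph. Reaching Spatio-temporal-particular means the case is
    ambiguous and the entity is not added to the dataset (returns None).
    """
    queue = deque([start])
    visited = {start}
    while queue:
        node = queue.popleft()
        if node == "Spatio-temporal-particular":
            return None  # ambiguous: skip this entity
        if node in TOP_CONCEPTS:
            return node
        for pred in PREDICATES:  # fixed predicate priority order
            for obj in graph.get(node, {}).get(pred, []):
                if obj not in visited:
                    visited.add(obj)
                    queue.append(obj)
    return None
```

For example, a concept whose superclass chain reaches Endurant is labeled Endurant, while one reaching only Spatio-temporal-particular is discarded.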
<p>After this disambiguation process, entities are post-processed: labels are taken
from the entity name in the WordNet part, and comments from rdfs:comment. Comments are
then converted to lowercase, newline characters and quotes are removed, and semicolons
are replaced with periods. Labels having synonyms separated by two underscores (‘__‘) are
split, generating a new row in the dataset for each synonym. For example, SOFTHEARTEDNESS__TENDERNESS
is split and generates two entries in the dataset, SOFTHEARTEDNESS and TENDERNESS.</p>
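<p>These post-processing steps can be sketched as a small function; this is a hypothetical helper whose name and signature are ours, not the authors' code:</p>

```python
import re

def preprocess(label: str, comment: str) -> list:
    """Clean a comment and split a multi-synonym label into separate rows.

    Implements the steps described above: lowercase the comment, strip
    newlines and quotes, replace semicolons with periods, and emit one
    (label, comment) row per '__'-separated synonym.
    """
    comment = comment.lower()
    comment = comment.replace("\n", " ").replace('"', "").replace("'", "")
    comment = comment.replace(";", ".")
    comment = re.sub(r"\s+", " ", comment).strip()  # collapse whitespace
    # One dataset row per synonym in the label
    return [(synonym, comment) for synonym in label.split("__")]
```

Applied to the example above, `preprocess("SOFTHEARTEDNESS__TENDERNESS", ...)` yields two rows sharing the same cleaned comment.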
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Test Dataset</title>
          <p>For the purpose of evaluating the performance of the classification models, 2 testing datasets
based on OAEI Conference track ontologies2 were created. The top-level concepts are assigned
using an existing reference alignment provided in [18] that aligns the highest concepts in
Conference to DOLCE. The Conference dataset contains 70 correspondences between the
concepts in Conference with concepts in DOLCE-Lite-Plus (DLP). The sub-concepts of top
concepts in Conference are aligned, by transitivity, to the top-level concepts in DOLCE. Of
the 70 alignments present in the initial reference alignment, 1 has multiple paths leading to
the same top-level concept, whereas 34 were ambiguous and were manually assigned
the concept Endurant. The resulting datasets have 5 concepts (Conference-5c) and 6 concepts
(Conference-6c) and their respective distributions are shown in Table 3.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Learning Models</title>
        <p>
          In [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], the prediction model relies on the use of labels and comments. The system is composed of
two parts as can be seen in Figure 1 a). The first Feed-Forward Neural Network (FNN) part has
as input the average of the embeddings of the words contained in the labels, whereas the second
part consists of a BiLSTM [19] neural architecture that contextualizes learned embeddings for
each word in the dataset comment. After passing through the BiLSTM, an average pooling is
applied to generate the embedding representation of the whole comment. The BiLSTM part of
this architecture has the same settings as that of ELMO [20].
        </p>
<p>However, using more robust architectures like BERT [21] may achieve improved results in
this task, as also reported in [17]. One of the reasons is that BERT can generate better natural
language text representations due to its capacity for managing context. Another point is that
some entities have the same label while being assigned to different top concepts in the dataset
Lopes22-5c. This can hamper the model’s ability to distinguish among the different concepts
while giving less importance to the label part of the input. Another issue is that, while comments
can impact the training step, ontologies often contain a low number of comments. Since the
model makes a distinction between labels and comments, its capacity for generalization in the
test phase can be reduced.
2https://oaei.ontologymatching.org/2022/conference/index.html (on 01/07/23)</p>
        <p>In this way, unifying the model’s input between labels and comments can improve the model’s
performance since it will be able to take advantage of the information from comments during
training while being able to work only with labels when comments are not present. Based on
those assumptions, for better generalization, we used BERT with a classification head to predict
the top-level concept, accepting a single text input that can be either a label or a comment. The
architecture using BERT is presented in Figure 1 b).</p>
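<p>A classification head of this kind is essentially a linear layer with softmax on top of the encoder's pooled sentence representation. The sketch below illustrates only that head in pure Python; the BERT encoder itself is stubbed out, the weights are random rather than learned, and all names are ours:</p>

```python
import math
import random

CLASSES = ["Endurant", "Perdurant", "Quality", "Situation", "Abstract"]

def softmax(logits):
    m = max(logits)                              # subtract max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

class ClassificationHead:
    """Linear layer + softmax over a pooled sentence embedding.

    Stand-in for the head placed on top of BERT's pooled output; in the
    real model W and b are trained jointly with the encoder.
    """
    def __init__(self, hidden_size, num_classes, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.gauss(0, 0.02) for _ in range(hidden_size)]
                  for _ in range(num_classes)]
        self.b = [0.0] * num_classes

    def predict_proba(self, pooled):
        logits = [sum(w * x for w, x in zip(row, pooled)) + bias
                  for row, bias in zip(self.W, self.b)]
        return softmax(logits)

head = ClassificationHead(hidden_size=768, num_classes=len(CLASSES))
# Because labels and comments share one text input, the same head serves
# both; here the encoder output is stubbed with a zero vector.
probs = head.predict_proba([0.0] * 768)
```

The single-input design means the head never needs to know whether the encoded text was a label or a comment.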
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Evaluation</title>
      <sec id="sec-4-1">
        <title>4.1. Performance of the Classification Models</title>
        <p>
In order to verify the hypothesis that the use of comments improves the results of the baseline
models, of BERT, and of the model proposed in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], the 3 datasets (Lopes22-5c, Sousa23-5c, and Sousa23-6c)
were split using 10-fold cross-validation. Before training, the majority concept instances are
reduced to match the number of instances in the minority concept using downsampling [22].
The surplus entities are added to the test folds.
        </p>
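<p>The downsampling step can be sketched as follows; this is a hypothetical helper with our own names and signature, illustrating the idea of trimming every class to the minority-class size and routing the surplus to the test folds:</p>

```python
import random
from collections import defaultdict

def downsample(rows, seed=0):
    """Reduce every class to the size of the minority class.

    `rows` is a list of (concept, text) pairs. Returns the balanced
    training rows and the surplus rows, which are appended to the
    test folds as described above.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for concept, text in rows:
        by_class[concept].append((concept, text))
    n_min = min(len(items) for items in by_class.values())
    train, surplus = [], []
    for items in by_class.values():
        rng.shuffle(items)
        train.extend(items[:n_min])    # keep n_min instances per class
        surplus.extend(items[n_min:])  # excess goes to the test fold
    return train, surplus
```

With 5 Endurant and 2 Quality rows, for instance, the balanced training set keeps 2 of each and 3 Endurant rows move to the surplus.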
        <p>
          In [
          <xref ref-type="bibr" rid="ref7">7</xref>
], different word embeddings are tested; here we selected GloVe 6B [23] because
it provides a good balance between performance and model size. It also provides a more
straightforward implementation compared to fastText [24], which is trained using character
n-grams and needs a further tokenization procedure. Other baseline models were tested, including
Bernoulli Naive Bayes (BNB), Gaussian Naive Bayes
(GNB), Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), Feed-forward Neural
Network (FNN), and Support Vector Machine (SVM). The model proposed in [
          <xref ref-type="bibr" rid="ref7">7</xref>
] is referred to as Model-Lopes.
The FNN was trained using the Adam [25] optimizer with a learning rate of 0.001 for 10
epochs with a batch size of 64. The BERT model was also trained with the Adam optimizer, with a
learning rate of 0.00003, for 1 epoch with a batch size of 64. The models are evaluated
using the micro-F1 metric. For each model, we tested the following alternatives of input: using
the comment only, the label only, and the label+comment. The results of the evaluation on the
datasets Lopes22-5c, Sousa23-5c, and Sousa23-6c are presented in Table 4.
        </p>
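<p>As a side note, for single-label multi-class classification of this kind, micro-F1 reduces to plain accuracy: every false positive for one class is simultaneously a false negative for another, so global precision and recall coincide. A minimal sketch:</p>

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label multi-class predictions.

    In this setting micro precision == micro recall == accuracy,
    so micro-F1 is simply the fraction of correct predictions.
    """
    assert len(y_true) == len(y_pred)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

For example, two correct predictions out of three give a micro-F1 of 2/3.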
<p>One can notice that all classifiers achieved higher results using only comments as input, except
the Gaussian Naive Bayes on the datasets Lopes22-5c and Sousa23-5c. The BERT model achieved
the highest performance in all categories and, in some cases, obtained the same
results even when using the label+comment input. One possible reason for that result is that the model
can make better use of label information when appropriate due to the attention mechanism.</p>
        <p>
          The confusion matrix for the BERT model in the 3 datasets can be seen in Tables 5, 6, and
7. One can see that, in the dataset Lopes22-5c, the model tends to misclassify a considerable
amount of Perdurants into Situation (16%), and Situations into Perdurant (17%). In the dataset
Sousa23-5c, this misclassification occurs between Perdurant and Quality (21%). In the Sousa23-6c
dataset, the model also misclassifies 13% of Perdurants into Situation and 23% of Situations into
Perdurant. A similar misclassification of the Situation concept is found in [
          <xref ref-type="bibr" rid="ref7">7</xref>
], and these results may
be related to the fact that Situation is not disjoint from the other classes. In that sense, it is
not an appropriate class for a top-level classification model if a single class is required for the
task.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Evaluation on the OAEI Conference Datasets</title>
<p>The models trained on Sousa23-5c and Sousa23-6c were tested on Conference-5c and
Conference-6c, respectively. Since the downsampling technique used for balancing the training datasets
generates diferent dataset partitions, the results are evaluated in 150 steps to account for
possible variations. In this evaluation phase, all the models receive as input only the labels of
the entities, while the BERT model was trained only with comments. Model-Lopes is trained
and tested in diferent cases. Case 1: the model was trained only with labels and the test input
is fed into the label input of the model. Case 2: the model was trained only with comments and
the test input is fed into the comment input of the model. Case 3: the model was trained with
both labels and comments and the test input is fed into the label input of the model. And Case
4: the model was trained with both labels and comments and the test input is fed into the
comment input of the model. The results are presented in Table 8.</p>
        <p>In the results, the BERT model trained with the adopted hyperparameter settings is unstable
and has the highest standard deviation. This model collapses in some cases, giving the same
output for every input, causing it to have values ranging between 0 and 0.78 F-measure. The
model achieves 0.00 F-1 when it outputs Quality for every input, as the test dataset does not
have any element labeled as Quality. On the other hand, it achieves the highest F-1 (0.78) when
the model predicts Endurant for every input as, in the test dataset, 78% of the entities are
Endurant. The BERT model had the highest scores on Conference-5c when considering the 75th
percentile of the results, excluding instances of collapse. In Conference-6c, the BERT model and the
model trained and tested with comments had the best and nearly equal results. Since the BERT
model achieves the highest scores on Conference test datasets, it was selected to analyze the
OAEI datasets described in Section 5.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Applying the Best Classifier: An Evaluation on the OAEI Tracks</title>
<p>This section presents an analysis of the top-level concepts in different OAEI tracks (Conference,
Anatomy, Complex, Food, BioML, BioDiv, MSE, and KG3) along with the characteristics of
the comments present in ontology entities. In the first subsection, an analysis of the distribution of
top concepts in the ontologies is provided. The second subsection evaluates the consistency of
the reference alignments in terms of their top-level concepts. The third subsection compares
the distribution of label lengths with that of the comments present in the training datasets.
3These tracks are described at https://oaei.ontologymatching.org/2022/ (on 01/07/23)</p>
      <sec id="sec-6-1">
        <title>5.1. Distribution of Top Concepts</title>
<p>Using the best model (BERT), the distribution of the top-level types of the concepts in the
ontologies from the schema alignment tracks is estimated. The distribution concerns the entities
of each ontology present in the tracks, excluding blank nodes and properties. For each entity,
one label is collected by searching for the label predicates rdfs.label, skos.prefLabel, and skos.altLabel; if
no label is found, the label is retrieved from the resource identifier. As can be seen in Figure
2, the distribution of concepts for each track is distinct. The estimation by the model trained
on Sousa23-5c shows that Complex, Food, and BioDiv have a high concentration of Endurants
while the others are more distributed.</p>
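<p>The label-selection rule above can be sketched as a small function. This is a toy stand-in (annotations as a plain dict rather than RDF graph queries; names are ours) that applies the predicate priority and the identifier fallback:</p>

```python
# Label predicates searched in priority order, as described above
LABEL_PREDICATES = ["rdfs:label", "skos:prefLabel", "skos:altLabel"]

def entity_label(uri, annotations):
    """Pick a label for an entity, falling back to the URI fragment.

    `annotations` maps predicate names to string values for the entity,
    a toy stand-in for querying the ontology graph.
    """
    for pred in LABEL_PREDICATES:
        if annotations.get(pred):
            return annotations[pred]
    # No label predicate found: derive a label from the resource identifier
    frag = uri.rsplit("#", 1)[-1].rsplit("/", 1)[-1]
    return frag.replace("_", " ")
```

For an entity with no label annotations, the last path or fragment segment of its IRI is used, with underscores turned into spaces.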
<p>The distribution estimated by the model trained on Sousa23-6c is presented in Figure 3.
The Anatomy, Food, BioML, and KG ontologies have a high concentration of
entities in one concept. For BioDiv, the majority of entities concentrate on Physical-endurant
and Non-physical-endurant. The two models disagree on the distribution of Quality entities
between tracks: the model trained on Sousa23-5c tends to classify some Endurants as Quality
compared to the model trained on Sousa23-6c. The two models achieve similar distributions of
the Perdurant and Abstract types.</p>
      </sec>
      <sec id="sec-6-2">
        <title>5.2. Alignment Consistency across Correspondence Entities</title>
<p>We analyzed the number of correspondences having entities associated with the same top-level
types (a correspondence is composed of a source and a target ontology entity). The
proportion can be seen in Table 9 for the estimations given by the models trained on Sousa23-5c
and Sousa23-6c. The distribution of correspondences that have the same type is similar for the
two models in Conference, MSE, CommonKG, BioML, and KG. The difference in scores for the
Anatomy track is related to the distribution given by each model. For the model Sousa23-5c, the
majority of entities are distributed as Endurant and Quality. In contrast, as the model Sousa23-6c
yielded a high concentration of entities in the same concept, the reference alignments also
follow the same tendency. In BioDiv, the model trained on Sousa23-6c achieves only 9.65% of
the alignments with the same type for the alignment between the Agrovoc and Nat ontologies, as
in the majority of cases the difference between them is that the model assigns Physical-endurant to one
entity and Non-physical-endurant to the other. This problem does not happen with the model
trained on Sousa23-5c, since both will be classified as Endurant and so the alignments will have
the same type.
Both models, trained with 5 and 6 classes, yielded only a few correspondences with the same types
in the MSE track. The main reason resides in the lack of information in the entities, such as
acronyms for chemical elements, for which the model does not have enough information to give the correct
classification. A similar problem appears in the Conference datasets, for instance, between the
alignments of PaperAbstract and Abstract or Country and State. Without further information,
the label State is ambiguous, and the model can yield an incorrect classification based on it. As
can be seen in these examples, a dataset containing entities along with their surrounding subgraphs
could lead to better results for this task, as the model would have more information to deal
with ambiguous labels.</p>
      </sec>
      <sec id="sec-6-3">
        <title>5.3. Discussion of Terminological Distribution</title>
<p>Since the models are trained on entity comments, they are expected to achieve their
best results when evaluated on comments. However, comments are rare in the ontologies present in
all tracks, and in this case, the labels need to be used to predict the top level. The distribution of
labels and comments in all ontologies for schema matching is analyzed to verify its relation to
the distribution of the proposed datasets. The numbers of labels and comments in all tracks are
presented in Table 11. Among all tracks, Anatomy, Food, and BioDiv have no comments. BioML
(MONDO), BioML (UMLS) and BioDiv have comments for less than 1% of entities. Conference and Complex
have less than 5%, while KG and MSE have respectively 13.49% and 36.17%. Since the number
of comments is low, the labels need to be used to predict the top-level types. However, as most
machine learning models suffer from the well-known Out-of-Distribution Generalization (OOD)
[26] problem, they are hampered both by labels that do not share the syntactic structure of the
comments and by the differing length distributions.</p>
<p>The frequency of the lengths of labels and comments for each entity in all ontologies of
all tracks is compared to the distribution of the comment lengths in the Sousa23-5c dataset.
As can be seen in Figure 4, the average length of the comments in the training datasets is 50
characters, whereas the majority of the labels are shorter. Furthermore,
comments, which are relatively rare, have a high standard deviation in length. Also, the differences in the
distribution of text length between labels and comments hinder the models' capacity for
generalization.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion and Future Work</title>
      <p>In this work, we investigated the task of top-level concept prediction. We generated one dataset
with 5 top-level concepts (Sousa23-5c) and one with 6 concepts (Sousa23-6c) based on
OntoWordNet. In this generation, we discussed the multi-inheritance problem and proposed procedures to
obtain top-level concepts for entities in unambiguous cases. In addition, classifier models were
then trained and tested with three input variants: labels only, comments only, and labels+comments.
The yielded results show that the use of rdfs:comment improves the prediction performance
of classification models. These results show the importance of rdfs:comment for automated
system understanding of concepts. We selected the best-generated model to estimate the
distribution of concepts in ontologies from well-known ontology matching benchmarks (OAEI).</p>
      <p>The results show that tracks have diferent distributions among top-level types. The performed
analysis of the reference alignments showed that a high number of correspondences, as expected,
are of the same type. We consider that this gives us an estimation of the accuracy of the trained
classifiers.</p>
      <p>In future work, we intend to conduct experiments with new deep-learning architectures that
should improve the results reported in this paper. Dynamically predicting the top-level types
for each concept of an ontology should help in downstream tasks such as ontology matching.
Therefore, as different ontologies have distinct top-level distributions, we expect that our present
analysis could be used for generating better classification models in the near future. Since the
labels of entities are ambiguous in some cases, including the ontology structure as contextual
information for the classification models may improve the prediction of top-level concept types.
Also, the high number of correspondences that have the same type in some OAEI tracks shows
that using these tools could help increase matching systems performance by increasing the
similarity of entities with the same top-level type.</p>
      <p>A systematic evaluation, IEEE Trans. Knowl. Data Eng. 22 (2010) 609–623. URL: https://doi.org/10.1109/TKDE.2009.154. doi:10.1109/TKDE.2009.154.
[13] N. F. Padilha, F. Baião, K. Revoredo, Alignment Patterns based on Unified Foundational</p>
      <p>Ontology, in: Proc. of the Brazilian Ontology Research Seminar, 2012, pp. 48–59.
[14] P. Mika, D. Oberle, A. Gangemi, M. Sabou, Foundations for Service Ontologies: Aligning</p>
      <p>OWL-S to Dolce, in: Proc. of the 13th Conf. on World Wide Web, 2004, pp. 563–572.
[15] R. Stevens, P. Lord, J. Malone, N. Matentzoglu, Measuring expert performance at manually
classifying domain entities under upper ontology classes, J. Web Semant. 57 (2019). URL:
https://doi.org/10.1016/j.websem.2018.08.004. doi:10.1016/j.websem.2018.08.004.
[16] D. Schmidt, R. Basso, C. Trojahn, R. Vieira, Matching domain and top-level ontologies
exploring word sense disambiguation and word embedding, in: E. Demidova, A.
Zaveri, E. Simperl (Eds.), Emerging Topics in Semantic Technologies - ISWC 2018
Satellite Events [best papers from 13 of the workshops co-located with the ISWC 2018
conference], volume 36 of Studies on the Semantic Web, IOS Press, 2018, pp. 27–38. URL:
https://doi.org/10.3233/978-1-61499-894-5-27. doi:10.3233/978-1-61499-894-5-27.
[17] A. Lopes, J. L. Carbonera, D. Schmidt, L. F. Garcia, F. H. Rodrigues, M. Abel, Using
terms and informal definitions to classify domain entities into top-level ontology concepts:
An approach based on language models, Knowl. Based Syst. 265 (2023) 110385. URL:
https://doi.org/10.1016/j.knosys.2023.110385. doi:10.1016/j.knosys.2023.110385.
[18] D. Schmidt, C. Trojahn, R. Vieira, M. Kamel, Validating top-level and domain ontology
alignments using wordnet, in: Proceedings of the IX ONTOBRAS Brazilian Ontology
Research Seminar, Curitiba, Brazil, October 3rd, 2016, volume 1862 of CEUR Workshop
Proceedings, CEUR-WS.org, 2016, pp. 119–130.
[19] A. Graves, N. Jaitly, A. Mohamed, Hybrid speech recognition with deep bidirectional
LSTM, in: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding,
Olomouc, Czech Republic, December 8-12, 2013, IEEE, 2013, pp. 273–278. URL: https:
//doi.org/10.1109/ASRU.2013.6707742. doi:10.1109/ASRU.2013.6707742.
[20] M. E. Peters, W. Ammar, C. Bhagavatula, R. Power, Semi-supervised sequence tagging
with bidirectional language models, in: Proceedings of the 55th Annual Meeting of the
Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August
4, Volume 1: Long Papers, Association for Computational Linguistics, 2017, pp. 1756–1765.</p>
      <p>URL: https://doi.org/10.18653/v1/P17-1161. doi:10.18653/v1/P17-1161.
[21] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: pre-training of deep bidirectional
transformers for language understanding, in: Proceedings of the 2019 Conference of
the North American Chapter of the Association for Computational Linguistics: Human
Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume
1 (Long and Short Papers), Association for Computational Linguistics, 2019, pp. 4171–4186.</p>
      <p>URL: https://doi.org/10.18653/v1/n19-1423. doi:10.18653/v1/n19-1423.
[22] Y. Sun, A. K. Wong, M. S. Kamel, Classification of imbalanced data: A review, International
journal of pattern recognition and artificial intelligence 23 (2009) 687–719.
[23] J. Pennington, R. Socher, C. D. Manning, Glove: Global vectors for word representation, in:
Proceedings of the 2014 conference on empirical methods in natural language processing
(EMNLP), 2014, pp. 1532–1543.
[24] P. Bojanowski, E. Grave, A. Joulin, T. Mikolov, Enriching word vectors with subword
information, Trans. Assoc. Comput. Linguistics 5 (2017) 135–146. URL: https://doi.org/10.
1162/tacl_a_00051. doi:10.1162/tacl\_a\_00051.
[25] Z. Zhang, Improved adam optimizer for deep neural networks, in: 26th IEEE/ACM
International Symposium on Quality of Service, IWQoS 2018, Banf, AB, Canada, June
4-6, 2018, IEEE, 2018, pp. 1–2. URL: https://doi.org/10.1109/IWQoS.2018.8624183. doi:10.
1109/IWQoS.2018.8624183.
[26] Z. Shen, J. Liu, Y. He, X. Zhang, R. Xu, H. Yu, P. Cui, Towards out-of-distribution
generalization: A survey, CoRR abs/2108.13624 (2021). URL: https://arxiv.org/abs/2108.13624.
arXiv:2108.13624.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] M. McDaniel, V. C. Storey, Evaluating domain ontologies: Clarification,
classification, and challenges, ACM Comput. Surv. 52 (2019) 70:1–70:44. URL:
https://doi.org/10.1145/3329124. doi:10.1145/3329124.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] I. G. Husein, S. Akbar, B. Sitohang, F. N. Azizah, Review of ontology
matching with background knowledge, in: 2016 International Conference on Data and Software
Engineering (ICoDSE), IEEE, 2016, pp. 1–6.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] C. Trojahn, R. Vieira, D. Schmidt, A. Pease, G. Guizzardi, Foundational
ontologies meet ontology matching: A survey, Semantic Web 13 (2022) 685–704. URL:
https://doi.org/10.3233/SW-210447. doi:10.3233/SW-210447.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] A. Gangemi, R. Navigli, P. Velardi, The OntoWordNet project: Extension
and axiomatization of conceptual relations in WordNet, in: On The Move to Meaningful Internet
Systems 2003: CoopIS, DOA, and ODBASE - OTM Confederated International Conferences, Catania,
Sicily, Italy, November 3-7, 2003, volume 2888 of Lecture Notes in Computer Science, Springer,
2003, pp. 820–838. URL: https://doi.org/10.1007/978-3-540-39964-3_52.
doi:10.1007/978-3-540-39964-3_52.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] G. A. Miller, WordNet: A lexical database for English, Commun. ACM 38
(1995) 39–41. URL: http://doi.acm.org/10.1145/219717.219748.
doi:10.1145/219717.219748.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] S. Borgo, R. Ferrario, A. Gangemi, N. Guarino, C. Masolo, D. Porello,
E. M. Sanfilippo, L. Vieu, DOLCE: A descriptive ontology for linguistic and cognitive
engineering, Appl. Ontology 17 (2022) 45–69. URL: https://doi.org/10.3233/AO-210259.
doi:10.3233/AO-210259.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] A. G. L. Junior, J. L. Carbonera, D. Schimdt, M. Abel, Predicting the
top-level ontological concepts of domain entities using word embeddings, informal definitions,
and deep learning, Expert Syst. Appl. 203 (2022) 117291. URL:
https://doi.org/10.1016/j.eswa.2022.117291. doi:10.1016/j.eswa.2022.117291.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] M. A. N. Pour, A. Algergawy, P. Buche, L. J. Castro, J. Chen, H. Dong,
O. Fallatah, D. Faria, I. Fundulaki, S. Hertling, Y. He, I. Horrocks, M. Huschka, L. Ibanescu,
E. Jiménez-Ruiz, N. Karam, A. Laadhar, P. Lambrix, H. Li, Y. Li, F. Michel, E. Nasr,
H. Paulheim, C. Pesquita, T. Saveta, P. Shvaiko, C. Trojahn, C. Verhey, M. Wu, B. Yaman,
O. Zamazal, L. Zhou, Results of the ontology alignment evaluation initiative 2022, in:
P. Shvaiko, J. Euzenat, E. Jiménez-Ruiz, O. Hassanzadeh, C. Trojahn (Eds.), Proceedings of the
17th International Workshop on Ontology Matching (OM 2022) co-located with the 21st
International Semantic Web Conference (ISWC 2022), Hangzhou, China, held as a virtual
conference, October 23, 2022, volume 3324 of CEUR Workshop Proceedings, CEUR-WS.org, 2022,
pp. 84–128.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] H. Paulheim, A. Gangemi, Serving DBpedia with DOLCE - More than Just
Adding a Cherry on Top, in: The Semantic Web, 2015, pp. 180–196.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] A. Gangemi, N. Guarino, C. Masolo, A. Oltramari, Sweetening WORDNET
with DOLCE, AI Magazine 24 (2003) 13–24.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] V. Silva, M. Campos, J. Silva, M. Cavalcanti, An Approach for the
Alignment of Biomedical Ontologies based on Foundational Ontologies, Information and Data
Management 2 (2011) 557–572.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] V. Mascardi, A. Locoro, P. Rosso, Automatic ontology matching via upper
ontologies: A systematic evaluation, IEEE Trans. Knowl. Data Eng. 22 (2010) 609–623. URL:
https://doi.org/10.1109/TKDE.2009.154. doi:10.1109/TKDE.2009.154.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>