<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Using BERT Models to Automatically Classify Domain Concepts into DOLCE Top-Level Concepts: A Study of the OAEI Ontologies</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Guilherme</forename><surname>Sousa</surname></persName>
							<email>sousa@irit.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">IRIT</orgName>
								<orgName type="institution" key="instit2">Institut de Recherche en Informatique de Toulouse</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rinaldo</forename><surname>Lima</surname></persName>
							<email>rinaldo.jose@ufrpe.br</email>
							<affiliation key="aff1">
								<orgName type="institution">Universidade Rural de Pernambuco</orgName>
								<address>
									<settlement>Recife</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Renata</forename><surname>Vieira</surname></persName>
							<email>renatav@uevora.pt</email>
							<affiliation key="aff2">
								<orgName type="department">CIDEHUS</orgName>
								<orgName type="institution">Universidade de Évora</orgName>
								<address>
									<country key="PT">Portugal</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Cassia</forename><surname>Trojahn</surname></persName>
							<email>cassia.trojahn@irit.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">IRIT</orgName>
								<orgName type="institution" key="instit2">Institut de Recherche en Informatique de Toulouse</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff3">
								<address>
									<postCode>2023</postCode>
									<settlement>Sherbrooke</settlement>
									<region>Québec</region>
									<country key="CA">Canada</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Using BERT Models to Automatically Classify Domain Concepts into DOLCE Top-Level Concepts: A Study of the OAEI Ontologies</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">9BA577484F11ED5CC41946534ED3FFAD</idno>
					<idno type="arXiv">arXiv:2108.13624</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:25+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Foundational Ontologies, Top Level Prediction, Ontology Matching</term>
					<term>0000-0002-2896-2362 (G. Sousa)</term>
					<term>0000-0002-1388-4824 (R. Lima)</term>
					<term>0000-0003-2449-5477 (R. Vieira)</term>
					<term>0000-0003-2840-005X (C. Trojahn)</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Top-level ontologies provide a set of foundational concepts that have a well-founded philosophical meaning, being a useful tool in ontology engineering. However, in practice, few domain ontologies integrate top-level concepts. One of the difficulties is the selection of appropriate top-level concepts. This paper presents an analysis of the top-level categories of a set of well-known domain ontologies from ontology matching benchmarks. Our main hypothesis is that training classification models using only concept comments (i.e., rdfs:comment) from top-level concepts can improve results reported in the literature. We then consider the best classifiers to estimate the distribution of concepts from Ontology Alignment Evaluation Initiative (OAEI) ontologies aligned to DOLCE top-level concepts.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Top-level ontologies provide a set of foundational concepts that have a well-founded philosophical meaning, being a useful tool in ontology engineering <ref type="bibr" target="#b0">[1]</ref>. They play an essential role in different tasks, such as ontology matching, providing a bridge for different ontologies <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>. However, not all ontologies were built using top-level ontologies as a foundation and some existing ones are too large to be manually annotated. In this sense, the use of automatic top-level classifiers can help to establish a link between domain and foundational ontologies. In order to train these classifiers, a large amount of labeled data aligned with top-level concepts is required. One relevant source of such data is OntoWordNet <ref type="bibr" target="#b3">[4]</ref> which aligns WordNet <ref type="bibr" target="#b4">[5]</ref> synsets to DOLCE <ref type="bibr" target="#b5">[6]</ref> concepts.</p><p>A recent effort in such direction has been done in <ref type="bibr" target="#b6">[7]</ref>, where a training dataset was constructed using labels and comments associated with the entities in OntoWordNet, which are aligned with top-level DOLCE concepts. Using this data, several classifiers were evaluated for predicting top-level concepts of entities. Based on that previous work, in this paper, we evaluate the performance of a set of classification models and the impact of using comments as features in the classification task. We address the cases of multi-inheritance, which may lead to different top-level concepts in DOLCE, in a different manner from this previous work by disambiguating cases that lead to a unique top-level concept and filtering those that lead to multiple concepts. 
We then select the best classifier to study the distribution of top-level concepts in well-known domain ontologies from benchmarks used for evaluating matching systems. Our study analyses the distribution of the concepts of the ontologies from each track of the Ontology Alignment Evaluation Initiative (OAEI) <ref type="bibr" target="#b7">[8]</ref>. This is the first effort in this direction, and our intuition is that the concepts related by an ontology correspondence (a correspondence is a triple involving a source concept from the source ontology, a target concept from the target ontology, and a relation between them) tend to share the same top-level concept.</p><p>The remainder of this paper is structured as follows: Section 2 presents related work. Section 3 discusses multi-inheritance along with the approaches adopted to handle it. Section 4 presents an evaluation of the performance of the models. Section 5 presents the analysis of the OAEI ontologies and, finally, Section 6 discusses the conclusion and future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Previous work has shown the importance of associating concepts from top-level to domain ontologies. In <ref type="bibr" target="#b8">[9]</ref>, correspondences between the DBPedia ontology and DOLCE-Zero <ref type="bibr" target="#b9">[10]</ref> have been used to identify inconsistent statements in DBPedia. In <ref type="bibr" target="#b10">[11]</ref>, an alignment between a foundational ontology (BFO) and a biomedical ontology (GO) is used for filtering out correspondences at the domain level that relate two different kinds of ontology entities.</p><p>The impact of using top-level ontologies as semantic bridges in ontology matching has been analyzed in <ref type="bibr" target="#b11">[12]</ref>, where a set of algorithms exploiting such bridges is applied and the circumstances under which foundational ontologies improve matching approaches are studied. The authors propose algorithms that use structural and mixed information and show that different combinations have different impacts on precision and recall. In <ref type="bibr" target="#b12">[13]</ref>, OAEI ontologies were manually aligned to UFO, adopting a set of patterns grounded in the UFO ontology. In <ref type="bibr" target="#b13">[14]</ref>, a domain ontology describing web services (OWL-S) has been manually aligned to DOLCE-Lite-Plus in order to overcome the conceptual ambiguity, poor axiomatization, loose design, and narrow scope of the domain ontology. The difficulties of such a manual alignment have also been addressed in <ref type="bibr" target="#b14">[15]</ref>, where the authors evaluate the performance of manual classification of entities into top-level concepts. The experiment was conducted by asking experts to manually classify a set of entities into top-level concepts. 
They showed a high level of disagreement between experts and concluded that a methodological framework for this integration is needed.</p><p>In order to automate this process, in <ref type="bibr" target="#b15">[16]</ref>, word sense disambiguation and word embedding models have been used to automatically align top-level and domain concepts. The evaluation has been conducted on the task of associating DOLCE and SUMO top-level concepts to ontologies from three different domains. Automation has also been addressed in <ref type="bibr" target="#b6">[7]</ref>. The authors organize two datasets based on OntoWordNet with the goal of training top-level concept classifiers of ontology entities. The first dataset contains OntoWordNet entities with their respective DOLCE concepts. The second dataset contains the same entities but classified into 5 top-level concepts (Endurant, Perdurant, Quality, Situation, and Abstract). Along with the datasets, the authors evaluate several models that predict the top-level concept based on entity labels and comments. In their follow-up work <ref type="bibr" target="#b16">[17]</ref>, they propose a method to extract two datasets from OntoWordNet that have target concepts from DOLCE Lite and DOLCE Lite Plus. Different language models are evaluated on the task of predicting the top-level concept from textual comments, with BERT base achieving the best results. Their models were not available at the time we conducted our study.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Materials and Methods</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Training Datasets</head><p>This section describes the characteristics of the datasets from <ref type="bibr" target="#b6">[7]</ref> and how they have been rebuilt and extended in order to deal with multi-inheritance cases <ref type="foot" target="#foot_0">1</ref> .</p><p>Dataset Lopes22-5c This is the original dataset from <ref type="bibr" target="#b6">[7]</ref>, which is used to train models for top-level concept prediction. The dataset is built from OntoWordNet and contains 116838 entities. It links each concept of OntoWordNet to one of the 5 top concepts of DOLCE (Endurant, Perdurant, Quality, Situation, and Abstract). This dataset is composed of 3 columns (Table <ref type="table" target="#tab_0">1</ref>): Concept (the DOLCE top-level concept), Label (the OntoWordNet entity label, rdfs:label), and Comment (the OntoWordNet entity comment, rdfs:comment). </p></div>
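For illustration, the three-column layout can be loaded and summarized in a few lines of Python; the sample rows are copied from Table 1, and the helper name is ours, not from [7].

```python
from collections import Counter

# Rows follow the Lopes22-5c layout: (Concept, Label, Comment).
# The two sample rows are copied from Table 1.
rows = [
    ("endurant", "order",
     "established customary state esp. of society. 'law and order'"),
    ("endurant", "ritual",
     "the prescribed procedure for conducting religious ceremonies"),
]

def concept_distribution(rows):
    """Relative frequency of each top-level concept (cf. Table 2)."""
    counts = Counter(concept for concept, _, _ in rows)
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()}
```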
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Dataset Sousa23-5c</head><p>The Sousa23-5c rebuilds the Lopes22-5c dataset while taking into account the problem of multi-inheritance, which is further detailed in Section 3.1.1. Hence, the resulting dataset Sousa23-5c differs from Lopes22-5c: the strategy for dealing with multi-inheritance filters out ambiguous entities, while in Lopes22-5c, the entity is inserted multiple times, once with each possible top-level concept.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Dataset Sousa23-6c</head><p>Another characteristic of the Lopes22-5c dataset is its highly imbalanced concept distribution (Endurant 76%, Perdurant 10%, Quality 4%, Situation 6%, Abstract 3%). Our proposal to deal with this problem is to break Endurant into two groups of concepts, generating a more balanced dataset with 6 concepts. This dataset was built from Sousa23-5c by following the hierarchy of entities until reaching one of the 5 top-level concepts (Endurant, Perdurant, Quality, Situation, Abstract); in the case that the type Endurant is found, it is replaced by the immediate child in the path (Physical-endurant or Non-physical-endurant).</p><p>The concept distribution of the 3 datasets is presented in Table <ref type="table" target="#tab_1">2</ref>. </p></div>
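The Endurant-splitting step above can be sketched as follows; the toy hierarchy and the function name are illustrative stand-ins, not the actual DOLCE data or the authors' code.

```python
# Toy child -> parent map standing in for the DOLCE hierarchy.
HIERARCHY = {
    "amount-of-matter": "physical-endurant",
    "physical-endurant": "endurant",
    "mental-object": "non-physical-endurant",
    "non-physical-endurant": "endurant",
    "endurant": "particular",
}
TOPS = {"endurant", "perdurant", "quality", "situation", "abstract"}

def refined_top_concept(concept):
    """Walk up the hierarchy; if the path reaches Endurant, return the
    immediate child on the path (Physical-/Non-physical-endurant),
    otherwise return the first of the 5 top-level concepts reached."""
    path = [concept]
    while path[-1] in HIERARCHY and path[-1] not in TOPS:
        path.append(HIERARCHY[path[-1]])
    if path[-1] == "endurant" and len(path) >= 2:
        return path[-2]  # immediate child on the path
    return path[-1]
```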
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.1.">Dealing with Multi-Inheritance in DOLCE and OntoWordNet</head><p>In order to deal with the case of multiple paths to the top-level concepts, we consider two distinct scenarios: one when the multi-inheritance occurs in the WordNet part of OntoWordNet, and the other when it is present in the DOLCE hierarchy. If an entity in WordNet has multi-inheritance, all paths are traversed, and if they all lead to the same type, the entity is added to the dataset.</p><p>If the paths diverge, the entity is ignored. In DOLCE, one example of a concept without a direct path to the proposed top-level concepts is Physical-realization, which is a sub-concept of Spatio-temporal-particular, in turn the super concept of Endurant, Perdurant, and Quality, causing ambiguity. To deal with these cases, when multi-inheritance occurs in the DOLCE hierarchy, a breadth-first search is performed until one of the defined top-level concepts is found. If the concept found is Spatio-temporal-particular, then the WordNet entity is not added to the dataset.</p><p>From the total of 66065 entities present in OntoWordNet, 889 entities in the WordNet hierarchy have multi-inheritance. However, 5023 entities in the DOLCE hierarchy remain ambiguous even after applying the strategy mentioned above. To solve this problem, for DOLCE concepts that do not have a direct superclass, a breadth-first search is performed by traversing the predicates RDFS.subClassOf, OWL.equivalentClass, OWL.intersectionOf, OWL.unionOf, RDF.first, RDF.rest in decreasing order of priority, adding the resulting objects to the priority queue used for the search. Using this approach, the distribution of concepts remained deterministic over several runs.</p><p>After this disambiguation process, the entities are post-processed: labels are taken from the entity name in the WordNet part, and comments from rdfs:comment. Comments are then converted to lowercase. 
Both newline characters and quotes are removed, and semicolons are replaced with periods. Labels containing synonyms separated by two underscores ('__') are split, generating new rows in the dataset. For example, SOFTHEARTEDNESS__TENDERNESS is split into two entries, SOFTHEARTEDNESS and TENDERNESS.</p></div>
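The prioritized search and the post-processing steps described above can be sketched roughly as follows; the toy triple store, the priority values, and the function names are our own illustrative assumptions, not the authors' implementation.

```python
import heapq

# Predicate priorities, decreasing as in the text (lower value = higher priority).
PRIORITY = {"subClassOf": 0, "equivalentClass": 1, "intersectionOf": 2,
            "unionOf": 3, "first": 4, "rest": 5}
TOPS = {"endurant", "perdurant", "quality", "situation", "abstract"}

def find_top_concept(graph, start):
    """graph: dict node -> list of (predicate, object) edges.
    Breadth-first search guided by predicate priority; returns the first
    top-level concept reached, or None (the entity is then dropped)."""
    queue = [(0, 0, start)]  # (priority, insertion order, node)
    seen = {start}
    counter = 1
    while queue:
        _, _, node = heapq.heappop(queue)
        if node in TOPS:
            return node
        for pred, obj in graph.get(node, []):
            if obj not in seen:
                seen.add(obj)
                heapq.heappush(queue, (PRIORITY[pred], counter, obj))
                counter += 1
    return None

def normalize(label, comment):
    """Post-processing from the text: lowercase the comment, drop newlines
    and quotes, replace semicolons with periods, split '__' synonyms."""
    comment = comment.lower().replace("\n", " ").replace('"', "").replace("'", "")
    comment = comment.replace(";", ".")
    return [(part, comment) for part in label.split("__")]
```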
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.2.">Test Dataset</head><p>For the purpose of evaluating the performance of the classification models, two test datasets based on the OAEI Conference track ontologies <ref type="foot" target="#foot_1">2</ref> were created. The top-level concepts are assigned using an existing reference alignment provided in <ref type="bibr" target="#b17">[18]</ref> that aligns the highest concepts in Conference to DOLCE. The Conference dataset contains 70 correspondences between concepts in Conference and concepts in DOLCE-Lite-Plus (DLP). The sub-concepts of top concepts in Conference are aligned, by transitivity, to the top-level concepts in DOLCE. Of the 70 correspondences present in the initial reference alignment, 1 has multiple paths leading to the same top-level concept, whereas 34 were ambiguous and were manually assigned the concept Endurant. The resulting datasets have 5 concepts (Conference-5c) and 6 concepts (Conference-6c), and their respective distributions are shown in Table <ref type="table" target="#tab_2">3</ref>. </p></div>
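The transitive assignment of top-level concepts can be sketched as follows; the example concept names are hypothetical, not taken from the actual Conference reference alignment.

```python
def propagate(alignment, parent_of, concept):
    """alignment: top Conference concept -> DOLCE top-level concept.
    A sub-concept inherits, by transitivity, the DOLCE concept aligned
    to its nearest aligned ancestor; returns None if no ancestor is aligned."""
    node = concept
    while node is not None:
        if node in alignment:
            return alignment[node]
        node = parent_of.get(node)  # walk up the subsumption hierarchy
    return None
```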
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Learning Models</head><p>In <ref type="bibr" target="#b6">[7]</ref>, the prediction model relies on the use of labels and comments. The system is composed of two parts, as can be seen in Figure <ref type="figure" target="#fig_0">1 a</ref>). The first, a Feed-Forward Neural Network (FNN), takes as input the average of the embeddings of the words contained in the labels, whereas the second consists of a BiLSTM <ref type="bibr" target="#b18">[19]</ref> neural architecture that contextualizes learned embeddings for each word in the dataset comment. After passing through the BiLSTM, average pooling is applied to generate the embedding representation of the whole comment. The BiLSTM part of this architecture has the same settings as the ELMO <ref type="bibr" target="#b19">[20]</ref> architecture. However, using more robust architectures like BERT <ref type="bibr" target="#b20">[21]</ref> may achieve improved results in this task, as also reported in <ref type="bibr" target="#b16">[17]</ref>. One of the reasons is that BERT can generate better natural language text representations due to its capacity to model context. Another point is that some entities have the same label while being assigned to different top concepts in the dataset Lopes22-5c. This can hamper the model's ability to distinguish among the different concepts while giving less importance to the label part of the input. Another issue is that, while comments can impact the training step, ontologies often contain few comments. Since the model makes a distinction between labels and comments, its capacity for generalization in the test phase can be reduced.</p><p>In this way, unifying the model's input over labels and comments can improve the model's performance, since it can take advantage of the information from comments during training while being able to work only with labels when comments are not present. 
Based on those assumptions, for better generalization, we used BERT with a classification head that accepts a single text input, which can be either a label or a comment, to predict the top-level concept. The architecture using BERT is presented in Figure 1 b).</p></div>
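A minimal sketch of such a classification head follows, assuming a pooled sentence vector has already been produced by the BERT encoder; the weight matrix here is a random stand-in for learned parameters, so predictions are meaningless until fine-tuning.

```python
import numpy as np

# The 5 DOLCE top-level concepts used as output classes (Section 3.1).
LABELS = ["Endurant", "Perdurant", "Quality", "Situation", "Abstract"]

HIDDEN = 768  # BERT base hidden size
rng = np.random.default_rng(0)
W = rng.normal(size=(HIDDEN, len(LABELS)))  # stand-in for learned weights
b = np.zeros(len(LABELS))

def predict_top_concept(pooled):
    """pooled: (HIDDEN,) vector for the single text input (a label or a
    comment); a linear layer maps it onto the top-level concepts."""
    logits = pooled @ W + b
    return LABELS[int(np.argmax(logits))]
```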
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experimental Evaluation</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Performance of the Classification Models</head><p>In order to verify the hypothesis that the use of comments improves the results of the baseline models, of BERT, and of the model proposed in <ref type="bibr" target="#b6">[7]</ref>, the 3 datasets (Lopes22-5c, Sousa23-5c, and Sousa23-6c) were split using 10-fold cross-validation. Before training, the majority concept instances are reduced to match the number of instances in the minority concept using downsampling <ref type="bibr" target="#b21">[22]</ref>. The excess entities are added to the test folds.</p><p>In <ref type="bibr" target="#b6">[7]</ref>, different word embeddings are tested; here, however, we selected GloVe 6B <ref type="bibr" target="#b22">[23]</ref> because it provides a good balance between performance and model size. It also provides a more straightforward implementation compared to fastText <ref type="bibr" target="#b23">[24]</ref>, which is trained using character n-grams and needs a further tokenization procedure. Other baseline models were tested, including Bernoulli Naive Bayes (BNB), Gaussian Naive Bayes (GNB), Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), Feed-Forward Neural Network (FNN), and Support Vector Machine (SVM). The model proposed in <ref type="bibr" target="#b6">[7]</ref> is referred to as Model-Lopes, and the FNN was trained using the Adam <ref type="bibr" target="#b24">[25]</ref> optimizer with a learning rate of 0.001 for 10 epochs with a batch size of 64. The BERT model was also trained with the Adam optimizer, with a learning rate of 0.00003, employing 1 epoch with a batch size of 64. The models are evaluated using the micro-F1 metric. For each model, we tested the following input alternatives: the comment only, the label only, and label+comment. 
The results of the evaluation on the datasets Lopes22-5c, Sousa23-5c, and Sousa23-6c are presented in Table <ref type="table" target="#tab_3">4</ref>. One can notice that all classifiers achieved higher results using only the comment as input, except Gaussian Naive Bayes on the datasets Lopes22-5c and Sousa23-5c. The BERT model achieved the highest performance in all categories, and, in some cases, it obtained the same results even using the label+comment input. One possible reason for that result is that the model can make better use of label information when appropriate, due to the attention mechanism.</p><p>The confusion matrices for the BERT model on the 3 datasets can be seen in Tables <ref type="table" target="#tab_5">5, 6</ref>, and 7. One can see that, on the dataset Lopes22-5c, the model tends to misclassify a considerable amount of Perdurants as Situation (16%), and Situations as Perdurant (17%). On the dataset Sousa23-5c, this misclassification is between Perdurant and Quality (21%). On the Sousa23-6c dataset, the model also misclassifies 13% of Perdurants as Situation and 23% of Situations as Perdurant. A similar misclassification of the Situation concept is found in <ref type="bibr" target="#b6">[7]</ref>, and these results may be related to the fact that Situation is non-disjoint with the other classes. In that sense, it is not an appropriate class for a top-level classification model if a single class is required for the task. </p></div>
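The balancing and scoring steps can be sketched as follows; function names are ours, and for single-label multiclass data micro-F1 reduces to accuracy, which is what the second helper computes.

```python
import random
from collections import defaultdict

def downsample(rows, seed=0):
    """rows: list of (concept, text). Reduce every concept to the size of
    the minority concept; the excess instances go to the test folds."""
    rng = random.Random(seed)
    by_concept = defaultdict(list)
    for row in rows:
        by_concept[row[0]].append(row)
    n_min = min(len(group) for group in by_concept.values())
    train, extra_test = [], []
    for group in by_concept.values():
        rng.shuffle(group)
        train.extend(group[:n_min])
        extra_test.extend(group[n_min:])
    return train, extra_test

def micro_f1(y_true, y_pred):
    """Micro-F1 for single-label multiclass predictions (equals accuracy)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```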
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Evaluation on the OAEI Conference Datasets</head><p>The models trained on Sousa23-5c and Sousa23-6c were tested on Conference-5c and Conference-6c, respectively. Since the downsampling technique used for balancing the training datasets generates different dataset partitions, the evaluation is repeated over 150 runs to account for possible variations. In this evaluation phase, all the models receive as input only the labels of the entities, while the BERT model was trained only with comments. Model-Lopes is trained with both labels and comments. The results are presented in Table <ref type="table" target="#tab_7">8</ref>.</p><p>In the results, the BERT model trained with the adopted hyperparameter settings is unstable and has the highest standard deviation. This model collapses in some cases, giving the same output for every input, causing its F-measure to range between 0 and 0.78. The model achieves 0.00 F1 when it outputs Quality for every input, as the test dataset does not have any element labeled as Quality. On the other hand, it achieves the highest F1 (0.78) when it predicts Endurant for every input since, in the test dataset, 78% of the entities are Endurant. The BERT model had the highest scores on Conference-5c when considering the 75th percentile results, excluding instances of collapse. On Conference-6c, the BERT model and the model trained and tested with comments had the best and nearly equal results. Since the BERT model achieves the highest scores on the Conference test datasets, it was selected to analyze the OAEI datasets described in Section 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Applying the Best Classifier: An Evaluation on the OAEI Tracks</head><p>This section presents an analysis of the top-level concepts in different OAEI tracks (Conference, Anatomy, Complex, Food, BioML, BioDiv, MSE, and KG)<ref type="foot" target="#foot_2">3</ref>, along with the characteristics of the comments present in ontology entities. The first subsection provides an analysis of the distribution of top concepts in the ontologies. The second subsection evaluates the consistency of the reference alignments in terms of their top-level concepts. The third subsection compares the distribution of label lengths to that of the comments present in the training datasets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Distribution of Top Concepts</head><p>Using the best model (BERT), the distribution of the top-level types of the concepts in the ontologies from the schema alignment tracks is estimated. The distribution concerns the entities of each ontology present in the tracks, excluding blank nodes and properties. For each entity, one label is collected by searching the label predicates rdfs:label, skos:prefLabel, and skos:altLabel; if no label is found, the label is retrieved from the resource identifier. As can be seen in Figure <ref type="figure" target="#fig_1">2</ref>, the distribution of concepts for each track is distinct. The estimation by the model trained on Sousa23-5c shows that Complex, Food, and BioDiv have a high concentration of Endurants, while the others are more distributed.</p><p>The distribution estimated by the model trained on Sousa23-6c is presented in Figure <ref type="figure" target="#fig_2">3</ref>. The Anatomy, Food, BioML, and KG ontologies have a high concentration of entities in one concept. For BioDiv, the majority of entities concentrate on Physical-endurant and Non-physical-endurant. The two models disagree on the distribution of Quality entities across tracks: the model trained on Sousa23-5c tends to classify some Endurants as Quality compared to the model trained on Sousa23-6c. The two models yield similar distributions of the Perdurant and Abstract types.</p></div>
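The label-collection rule can be sketched over a toy triple list (not an actual RDF library); the predicate order and the identifier fallback follow the description above, while the function name is our own.

```python
# Label predicates tried in order, as described in the text.
LABEL_PREDICATES = ["rdfs:label", "skos:prefLabel", "skos:altLabel"]

def entity_label(triples, entity):
    """triples: list of (subject, predicate, object) tuples.
    Return the first label found; otherwise fall back to the fragment
    or last path segment of the resource identifier."""
    for pred in LABEL_PREDICATES:
        for s, p, o in triples:
            if s == entity and p == pred:
                return o
    return entity.rsplit("#", 1)[-1].rsplit("/", 1)[-1]
```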
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Alignment Consistency across Correspondence Entities</head><p>We analyzed the number of correspondences whose entities are associated with the same top-level types (a correspondence is composed of a source and a target ontology entity). The proportions can be seen in Table <ref type="table" target="#tab_9">9</ref> for the estimations given by the models trained on Sousa23-5c and Sousa23-6c. The proportion of correspondences that have the same type is similar for the two models in Conference, MSE, CommonKG, BioML, and KG. The difference in scores for the Anatomy track is related to the distribution given by each model. For the model trained on Sousa23-5c, the majority of entities are distributed between Endurant and Quality. In contrast, as the model trained on Sousa23-6c yielded a high concentration of entities in the same concept, the reference alignments also follow the same tendency. In BioDiv, the model trained on Sousa23-6c yields only 9.65% of correspondences with the same type for the alignment between the Agrovoc and Nat ontologies, mostly because it assigns Physical-endurant to one entity and Non-physical-endurant to the other. This problem does not happen with the model trained on Sousa23-5c, since both entities are classified as Endurant and the correspondences thus have the same type.</p></div>
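The consistency measure behind Table 9 can be sketched as follows, with `predict` standing in for a trained classifier and the example labels being hypothetical:

```python
def same_type_ratio(correspondences, predict):
    """correspondences: list of (source_label, target_label) pairs.
    Share of correspondences whose two entities receive the same
    predicted top-level concept."""
    same = sum(predict(src) == predict(tgt) for src, tgt in correspondences)
    return same / len(correspondences)
```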
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.">Discussion of Terminological Distribution</head><p>As the comments used for training are of the same nature as the comments found in ontologies, the models are expected to obtain their best results when evaluated on comments. However, comments are rare in the ontologies present in all tracks, and in that case the labels need to be used to predict the top level. The distribution of labels and comments in all ontologies for schema matching is analyzed to verify its relation to the distribution of the proposed datasets. The numbers of labels and comments in all tracks are presented in Table <ref type="table" target="#tab_11">11</ref>. Among all tracks, Anatomy, Food, and BioDiv have no comments. BioML (MONDO), BioML (UMLS) and BioDiv have less than 1% of comments. Conference and Complex have less than 5%, while KG and MSE have respectively 13.49% and 36.17%. Since the number of comments is low, the labels need to be used to predict the top-level types. However, as most machine learning models suffer from the well-known Out-of-Distribution (OOD) generalization <ref type="bibr" target="#b25">[26]</ref> problem, they are hampered by labels that neither share the syntactic structure of the comments nor follow a similar length distribution. The frequency of the lengths of labels and comments for each entity in all ontologies of all tracks is compared to the distribution of the comment lengths in the Sousa23-5c dataset. As can be seen in Figure <ref type="figure" target="#fig_3">4</ref>, the average length of the comments in the training datasets is 50 characters, whereas the majority of the labels are shorter. Furthermore, comments, which are relatively rare, have a high standard deviation in length. These differences in text length distribution between labels and comments also hinder the generalization capacity of the models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion and Future Work</head><p>In this work, we investigated the task of top-level concept prediction. We generated one dataset with 5 top-level concepts (Sousa23-5c) and one with 6 concepts (Sousa23-6c) based on OntoWordNet. In building them, we discussed the multi-inheritance problem and proposed procedures to obtain top-level concepts for entities in unambiguous cases. In addition, classification models were then trained and tested using labels only, comments only, and labels+comments as input. The results show that the use of rdfs:comment improves the prediction performance of the classification models, which highlights the importance of rdfs:comment for the automated understanding of concepts. We selected the best model to estimate the distribution of concepts in ontologies from well-known ontology matching benchmarks (OAEI). The results show that the tracks have different distributions of top-level types. The performed analysis of the reference alignments showed that a high number of correspondences, as expected, are of the same type. We consider that this gives us an estimation of the accuracy of the trained classifiers.</p><p>In future work, we intend to conduct experiments with new deep-learning architectures that should improve the results reported in this paper. Dynamically predicting the top-level types for each concept of an ontology should help in downstream tasks such as ontology matching. Moreover, as different ontologies have distinct top-level distributions, we expect that our present analysis could be used for generating better classification models in the near future. Since the labels of entities are ambiguous in some cases, including the ontology structure as contextual information for the classification models may improve the prediction of top-level concept types. 
Also, the high number of correspondences that have the same type in some OAEI tracks suggests that such classifiers could help improve matching system performance by increasing the similarity of entities sharing the same top-level type.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: a) The proposed model in [7]. b) The BERT model.</figDesc><graphic coords="6,97.61,84.20,400.05,359.73" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: OAEI tracks distribution with 5 concepts.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: OAEI tracks distribution with 6 concepts.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Distribution between labels and comments in the analyzed tracks compared with the Sousa23-5c comments. The x-axis represents the length of the text and the y-axis is the number of labels or comments in each length. The visualization is limited to texts ranging from 0 to 400 characters long.</figDesc><graphic coords="14,95.36,84.19,404.54,310.13" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Example of the Lopes22-5c dataset.</figDesc><table><row><cell>Concepts</cell><cell>Label</cell><cell>Comment</cell></row><row><cell>endurant</cell><cell>order</cell><cell>established customary state esp. of society. 'order ruled in the streets'. 'law and order'</cell></row><row><cell>endurant</cell><cell>ritual</cell><cell>the prescribed procedure for conducting religious ceremonies</cell></row><row><cell>endurant</cell><cell>celebration</cell><cell>the public performance of a sacrament or solemn ceremony with all appropriate ritual. 'the celebration of marriage'</cell></row><row><cell>endurant</cell><cell>ritual</cell><cell>stereotyped behavior</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Concept distribution of the datasets. Sousa23-5c deals with multi-inheritance and Sousa23-6c does not have the Endurant concept. The number of perdurants can vary since Spatio-temporal-particular is the super concept of both Endurant and Perdurant and, due to the path selection method used, different types can be reached, varying the number of perdurants in the two datasets.</figDesc><table><row><cell>concept</cell><cell>Lopes22-5c</cell><cell>Sousa23-5c</cell><cell>Sousa23-6c</cell></row><row><cell>Endurant</cell><cell>88410 (75.7%)</cell><cell>27900 (53.0%)</cell><cell>-</cell></row><row><cell>Physical-endurant</cell><cell>-</cell><cell>-</cell><cell>44553 (41.7%)</cell></row><row><cell>Non-physical-endurant</cell><cell>-</cell><cell>-</cell><cell>35853 (33.5%)</cell></row><row><cell>Perdurant</cell><cell>11683 (10.0%)</cell><cell>9045 (17.2%)</cell><cell>10847 (10.1%)</cell></row><row><cell>Quality</cell><cell>4948 (4.2%)</cell><cell>4245 (8.1%)</cell><cell>4245 (4.0%)</cell></row><row><cell>Situation</cell><cell>7763 (6.6%)</cell><cell>7157 (13.6%)</cell><cell>7157 (6.7%)</cell></row><row><cell>Abstract</cell><cell>4035 (3.5%)</cell><cell>4268 (8.1%)</cell><cell>4268 (4.0%)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3 :</head><label>3</label><figDesc>Distribution of concepts in the alignment between Conference and DOLCE with 5 concepts.</figDesc><table><row><cell>Endurant</cell><cell></cell><cell>Perdurant Situation Abstract Quality</cell></row><row><cell cols="2">473 (77.8%)</cell></row><row><cell cols="2">Physical-endurant Non-physical-endurant</cell><cell>95 (15.6%) 34 (5.6%) 6 (1.0%) 0 (0.0%)</cell></row><row><cell>0</cell><cell>473 (77.8%)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4 :</head><label>4</label><figDesc>Results of all models in terms of micro F1 score evaluated in the datasets Lopes22-5c, Sousa23-5c, and Sousa23-6c.</figDesc><table><row><cell></cell><cell cols="3">Lopes22-5c</cell><cell cols="3">Sousa23-5c</cell><cell cols="3">Sousa23-6c</cell></row><row><cell>model</cell><cell>c</cell><cell>l</cell><cell>l+c</cell><cell>c</cell><cell>l</cell><cell>l+c</cell><cell>c</cell><cell>l</cell><cell>l+c</cell></row><row><cell>BERT</cell><cell>0.87</cell><cell>0.71</cell><cell>0.88</cell><cell>0.88</cell><cell>0.71</cell><cell>0.88</cell><cell>0.87</cell><cell>0.69</cell><cell>0.87</cell></row><row><cell>Model-Lopes</cell><cell>0.75</cell><cell>0.36</cell><cell>0.73</cell><cell>0.76</cell><cell>0.37</cell><cell>0.75</cell><cell>0.74</cell><cell>0.35</cell><cell>0.74</cell></row><row><cell>SVM</cell><cell>0.68</cell><cell>0.36</cell><cell>0.43</cell><cell>0.68</cell><cell>0.38</cell><cell>0.45</cell><cell>0.64</cell><cell>0.37</cell><cell>0.43</cell></row><row><cell>RF</cell><cell>0.64</cell><cell>0.38</cell><cell>0.37</cell><cell>0.65</cell><cell>0.38</cell><cell>0.38</cell><cell>0.58</cell><cell>0.36</cell><cell>0.36</cell></row><row><cell>FNN</cell><cell>0.59</cell><cell>0.33</cell><cell>0.39</cell><cell>0.58</cell><cell>0.34</cell><cell>0.39</cell><cell>0.57</cell><cell>0.32</cell><cell>0.38</cell></row><row><cell>LR</cell><cell>0.52</cell><cell>0.26</cell><cell>0.37</cell><cell>0.53</cell><cell>0.25</cell><cell>0.37</cell><cell>0.49</cell><cell>0.24</cell><cell>0.35</cell></row><row><cell>BNB</cell><cell>0.49</cell><cell>0.26</cell><cell>0.35</cell><cell>0.5</cell><cell>0.25</cell><cell>0.34</cell><cell>0.40</cell><cell>0.23</cell><cell>0.31</cell></row><row><cell>DT</cell><cell>0.40</cell><cell>0.27</cell><cell>0.25</cell><cell>0.41</cell><cell>0.27</cell><cell>0.25</cell><cell>0.40</cell><cell>0.25</cell><cell>0.23</cell></row><row><cell>GNB</cell><cell>0.31</cell><cell>0.43</cell><cell>0.43</cell><cell>0.30</cell><cell>0.44</cell><cell>0.44</cell><cell>0.46</cell><cell>0.33</cell><cell>0.37</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5 :</head><label>5</label><figDesc>BERT confusion matrix evaluated on Lopes22-5c.</figDesc><table><row><cell>concept</cell><cell>Endurant</cell><cell>Perdurant</cell><cell>Quality</cell><cell>Abstract</cell><cell>Situation</cell></row><row><cell>Endurant</cell><cell>753515 (88.9%)</cell><cell>22208 (2.6%)</cell><cell>19023 (2.2%)</cell><cell>21977 (2.6%)</cell><cell>31062 (3.7%)</cell></row><row><cell>Perdurant</cell><cell>1584 (2.0%)</cell><cell>62592 (77.7%)</cell><cell>2220 (2.8%)</cell><cell>895 (1.1%)</cell><cell>13224 (16.4%)</cell></row><row><cell>Quality</cell><cell>116 (0.9%)</cell><cell>381 (2.9%)</cell><cell>11666 (88.7%)</cell><cell>397 (3.0%)</cell><cell>595 (4.5%)</cell></row><row><cell>Abstract</cell><cell>142 (3.5%)</cell><cell>31 (0.8%)</cell><cell>44 (1.1%)</cell><cell>3795 (94.1%)</cell><cell>23 (0.6%)</cell></row><row><cell>Situation</cell><cell>718 (1.7%)</cell><cell>7017 (17.0%)</cell><cell>1449 (3.5%)</cell><cell>230 (0.6%)</cell><cell>31901 (77.2%)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 6 :</head><label>6</label><figDesc>BERT confusion matrix evaluated on Sousa23-5c.</figDesc><table><row><cell>concept</cell><cell>Endurant</cell><cell>Perdurant</cell><cell>Quality</cell><cell>Abstract</cell><cell>Situation</cell></row><row><cell>Endurant</cell><cell>682031 (89.1%)</cell><cell>22033 (2.9%)</cell><cell>20942 (2.7%)</cell><cell>24965 (3.3%)</cell><cell>15884 (2.1%)</cell></row><row><cell>Perdurant</cell><cell>1410 (2.0%)</cell><cell>54861 (78.1%)</cell><cell>10339 (14.7%)</cell><cell>995 (1.4%)</cell><cell>2660 (3.8%)</cell></row><row><cell>Quality</cell><cell>571 (1.7%)</cell><cell>7052 (21.1%)</cell><cell>24064 (72.1%)</cell><cell>326 (1.0%)</cell><cell>1352 (4.1%)</cell></row><row><cell>Abstract</cell><cell>94 (2.1%)</cell><cell>59 (1.3%)</cell><cell>20 (0.4%)</cell><cell>4047 (90.4%)</cell><cell>255 (5.7%)</cell></row><row><cell>Situation</cell><cell>49 (1.2%)</cell><cell>130 (3.1%)</cell><cell>112 (2.6%)</cell><cell>99 (2.3%)</cell><cell>3855 (90.8%)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 7 :</head><label>7</label><figDesc>BERT confusion matrix evaluated on Sousa23-6c and tested in different cases. Case 1: the model was trained only with labels and the test input is fed into the label input of the model. Case 2: the model was trained only with comments and the test input is fed into the comment input of the model. Case 3: the model was trained with both labels and comments and the test input is fed into the label input of the model. Case 4: the model was trained with both labels and comments and the test input is fed into the comment input of the model. The results are presented in Table</figDesc><table><row><cell>concept</cell><cell>Physical-endurant</cell><cell>Non-physical-endurant</cell><cell>Perdurant</cell><cell>Quality</cell><cell>Abstract</cell><cell>Situation</cell></row><row><cell>Physical-endurant</cell><cell>377547 (92.7%)</cell><cell>12629 (3.1%)</cell><cell>2422 (0.6%)</cell><cell>2062 (0.5%)</cell><cell>10657 (2.6%)</cell><cell>2008 (0.5%)</cell></row><row><cell>Non-physical-endurant</cell><cell>9329 (2.9%)</cell><cell>264356 (82.5%)</cell><cell>13729 (4.3%)</cell><cell>8926 (2.8%)</cell><cell>11090 (3.5%)</cell><cell>12895 (4.0%)</cell></row><row><cell>Perdurant</cell><cell>437 (0.6%)</cell><cell>1145 (1.6%)</cell><cell>55421 (78.9%)</cell><cell>2430 (3.5%)</cell><cell>1412 (2.0%)</cell><cell>9420 (13.4%)</cell></row><row><cell>Quality</cell><cell>21 (0.5%)</cell><cell>27 (0.6%)</cell><cell>122 (2.9%)</cell><cell>3688 (86.9%)</cell><cell>218 (5.1%)</cell><cell>169 (4.0%)</cell></row><row><cell>Abstract</cell><cell>90 (2.0%)</cell><cell>26 (0.6%)</cell><cell>29 (0.6%)</cell><cell>144 (3.2%)</cell><cell>4168 (93.1%)</cell><cell>18 (0.4%)</cell></row><row><cell>Situation</cell><cell>203 (0.6%)</cell><cell>1001 (3.0%)</cell><cell>7693 (23.1%)</cell><cell>1308 (3.9%)</cell><cell>290 (0.9%)</cell><cell>22870 (68.5%)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_7"><head>Table 8 :</head><label>8</label><figDesc>Results of the models in the Conference test dataset.</figDesc><table><row><cell>model</cell><cell>mean</cell><cell>std</cell><cell>min</cell><cell>25%</cell><cell>50%</cell><cell>75%</cell><cell>max</cell></row><row><cell>BERT (Sousa23-5c)</cell><cell>0.45</cell><cell>0.25</cell><cell>0.0</cell><cell>0.17</cell><cell>0.54</cell><cell>0.66</cell><cell>0.78</cell></row><row><cell>BERT (Sousa23-6c)</cell><cell>0.41</cell><cell>0.23</cell><cell>0.0</cell><cell>0.29</cell><cell>0.48</cell><cell>0.6</cell><cell>0.78</cell></row><row><cell>Model-Lopes (l) (Sousa23-5c)</cell><cell>0.37</cell><cell>0.03</cell><cell>0.29</cell><cell>0.35</cell><cell>0.37</cell><cell>0.4</cell><cell>0.46</cell></row><row><cell>Model-Lopes (l) (Sousa23-6c)</cell><cell>0.19</cell><cell>0.03</cell><cell>0.12</cell><cell>0.17</cell><cell>0.19</cell><cell>0.21</cell><cell>0.25</cell></row><row><cell>Model-Lopes (c) (Sousa23-5c)</cell><cell>0.5</cell><cell>0.08</cell><cell>0.27</cell><cell>0.45</cell><cell>0.49</cell><cell>0.54</cell><cell>0.72</cell></row><row><cell>Model-Lopes (c) (Sousa23-6c)</cell><cell>0.55</cell><cell>0.07</cell><cell>0.38</cell><cell>0.51</cell><cell>0.56</cell><cell>0.61</cell><cell>0.71</cell></row><row><cell>Model-Lopes (l+c test l) (Sousa23-5c)</cell><cell>0.13</cell><cell>0.04</cell><cell>0.05</cell><cell>0.11</cell><cell>0.13</cell><cell>0.15</cell><cell>0.27</cell></row><row><cell>Model-Lopes (l+c test l) (Sousa23-6c)</cell><cell>0.2</cell><cell>0.08</cell><cell>0.07</cell><cell>0.15</cell><cell>0.2</cell><cell>0.25</cell><cell>0.52</cell></row><row><cell>Model-Lopes (l+c test c) (Sousa23-5c)</cell><cell>0.56</cell><cell>0.07</cell><cell>0.38</cell><cell>0.52</cell><cell>0.57</cell><cell>0.61</cell><cell>0.72</cell></row><row><cell>Model-Lopes (l+c test c) (Sousa23-6c)</cell><cell>0.38</cell><cell>0.07</cell><cell>0.22</cell><cell>0.34</cell><cell>0.38</cell><cell>0.43</cell><cell>0.58</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_9"><head>Table 9 :</head><label>9</label><figDesc>Number of reference alignments with the same type in source and target ontologies using both models.</figDesc><table><row><cell></cell><cell cols="3">BERT Sousa23-5c</cell><cell cols="3">BERT Sousa23-6c</cell></row><row><cell>Track</cell><cell>Same</cell><cell>Total</cell><cell>%</cell><cell>Same</cell><cell>Total</cell><cell>%</cell></row><row><cell>Conference</cell><cell>213</cell><cell>258</cell><cell>82.56</cell><cell>197</cell><cell>258</cell><cell>76.36</cell></row><row><cell>Anatomy</cell><cell>908</cell><cell>1516</cell><cell>59.89</cell><cell>1248</cell><cell>1516</cell><cell>82.32</cell></row><row><cell>MSE</cell><cell>159</cell><cell>388</cell><cell>40.98</cell><cell>162</cell><cell>388</cell><cell>41.75</cell></row><row><cell>CommonKG</cell><cell>268</cell><cell>304</cell><cell>88.16</cell><cell>264</cell><cell>304</cell><cell>86.84</cell></row><row><cell>BioDiv</cell><cell>83427</cell><cell>96483</cell><cell>86.47</cell><cell>42807</cell><cell>96483</cell><cell>44.37</cell></row><row><cell>BioML</cell><cell>18007</cell><cell>25270</cell><cell>71.26</cell><cell>18158</cell><cell>25270</cell><cell>71.86</cell></row><row><cell>KG</cell><cell>13627</cell><cell>15359</cell><cell>88.72</cell><cell>13452</cell><cell>15359</cell><cell>87.58</cell></row><row><cell cols="7">Both models trained in 5 and 6 classes yielded a few correspondences with the same types</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_10"><head>Table 10 :</head><label>10</label><figDesc>Some examples where the model predicts different types for the reference alignments in Conference and MSE test data.</figDesc><table><row><cell>Track</cell><cell>Source</cell><cell>Target</cell></row><row><cell cols="3">Conference PaperAbstract (Endurant) Abstract (Situation)</cell></row><row><cell></cell><cell>Country (Situation)</cell><cell>State (Quality)</cell></row><row><cell></cell><cell>Attendee (Quality)</cell><cell>Listener (Situation)</cell></row><row><cell>MSE</cell><cell>Mt (Endurant)</cell><cell>MeitneriumAtom (Perdurant)</cell></row><row><cell></cell><cell>Dy (Situation)</cell><cell>DysprosiumAtom (Endurant)</cell></row><row><cell></cell><cell>Ag (Endurant)</cell><cell>Silver (Situation)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_11"><head>Table 11 :</head><label>11</label><figDesc>Count of labels and comments in the analyzed tracks.</figDesc><table><row><cell>track</cell><cell>entities</cell><cell>rdfs:label</cell><cell>skos:prefLabel</cell><cell>skos:altLabel</cell><cell>rdfs:comment</cell></row><row><cell>Conference</cell><cell>1020</cell><cell>0 (0.00%)</cell><cell>0 (0.00%)</cell><cell>0 (0.00%)</cell><cell>32 (3.14%)</cell></row><row><cell>Anatomy</cell><cell>12498</cell><cell>12017 (96.15%)</cell><cell>0 (0.00%)</cell><cell>0 (0.00%)</cell><cell>0 (0.00%)</cell></row><row><cell>Complex</cell><cell>32620</cell><cell>9565 (29.32%)</cell><cell>0 (0.00%)</cell><cell>0 (0.00%)</cell><cell>434 (1.33%)</cell></row><row><cell>Food NC</cell><cell>11737</cell><cell>0 (0.00%)</cell><cell>9396 (80.05%)</cell><cell>0 (0.00%)</cell><cell>0 (0.00%)</cell></row><row><cell>BioML (MONDO)</cell><cell>34872</cell><cell>33755 (96.80%)</cell><cell>0 (0.00%)</cell><cell>0 (0.00%)</cell><cell>60 (0.17%)</cell></row><row><cell>BioML (UMLS)</cell><cell>145486</cell><cell>145093 (99.73%)</cell><cell>55312 (38.02%)</cell><cell>23948 (16.46%)</cell><cell>43 (0.03%)</cell></row><row><cell>BioDiv (ncbitaxon)</cell><cell>233776</cell><cell>233746 (99.99%)</cell><cell>0 (0.00%)</cell><cell>5861 (2.51%)</cell><cell>0 (0.00%)</cell></row><row><cell>BioDiv</cell><cell>3599317</cell><cell>2582980 (71.76%)</cell><cell>359314 (9.98%)</cell><cell>155987 (4.33%)</cell><cell>882 (0.02%)</cell></row><row><cell>MSE</cell><cell>2411</cell><cell>953 (39.53%)</cell><cell>451 (18.71%)</cell><cell>47 (1.95%)</cell><cell>872 (36.17%)</cell></row><row><cell>KG</cell><cell>2050682</cell><cell>808385 (39.42%)</cell><cell>321695 (15.69%)</cell><cell>708059 (34.53%)</cell><cell>276631 (13.49%)</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">The source code used in dataset generation and the experiments can be found at https://gitlab.irit.fr/melodi/ontology-matching/top-level. To rebuild the dataset, the version of OntoWordNet used was downloaded from http://www.loa.istc.cnr.it/ontologies/OWN/OWN.owl (on 01/04/23), having 66065 entities.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://oaei.ontologymatching.org/2022/conference/index.html (on 01/07/23)</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">These tracks are described on https://oaei.ontologymatching.org/2022/ (on 01/07/23)</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Evaluating domain ontologies: Clarification, classification, and challenges</title>
		<author>
			<persName><forename type="first">M</forename><surname>Mcdaniel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">C</forename><surname>Storey</surname></persName>
		</author>
		<idno type="DOI">10.1145/3329124</idno>
		<ptr target="https://doi.org/10.1145/3329124" />
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv</title>
		<imprint>
			<biblScope unit="volume">52</biblScope>
			<biblScope unit="page">44</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Review of ontology matching with background knowledge</title>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">G</forename><surname>Husein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Akbar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Sitohang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">N</forename><surname>Azizah</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2016 International Conference on Data and Software Engineering (ICoDSE), IEEE</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Foundational ontologies meet ontology matching: A survey</title>
		<author>
			<persName><forename type="first">C</forename><surname>Trojahn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vieira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pease</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Guizzardi</surname></persName>
		</author>
		<idno type="DOI">10.3233/SW-210447</idno>
		<ptr target="https://doi.org/10.3233/SW-210447" />
	</analytic>
	<monogr>
		<title level="j">Semantic Web</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page" from="685" to="704" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">The ontowordnet project: Extension and axiomatization of conceptual relations in wordnet</title>
		<author>
			<persName><forename type="first">A</forename><surname>Gangemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Navigli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Velardi</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-540-39964-3_52</idno>
		<ptr target="https://doi.org/10.1007/978-3-540-39964-3_52" />
	</analytic>
	<monogr>
		<title level="m">On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE -OTM Confederated International Conferences, CoopIS, DOA, and ODBASE 2003</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<meeting><address><addrLine>Catania, Sicily, Italy</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2003">November 3-7, 2003</date>
			<biblScope unit="volume">2888</biblScope>
			<biblScope unit="page" from="820" to="838" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Wordnet: A lexical database for english</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">A</forename><surname>Miller</surname></persName>
		</author>
		<idno type="DOI">10.1145/219717.219748</idno>
		<ptr target="https://doi.org/10.1145/219717.219748" />
	</analytic>
	<monogr>
		<title level="j">Commun. ACM</title>
		<imprint>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="page" from="39" to="41" />
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">DOLCE: A descriptive ontology for linguistic and cognitive engineering</title>
		<author>
			<persName><forename type="first">S</forename><surname>Borgo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ferrario</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gangemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Guarino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Masolo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Porello</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M</forename><surname>Sanfilippo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Vieu</surname></persName>
		</author>
		<idno type="DOI">10.3233/AO-210259</idno>
		<ptr target="https://doi.org/10.3233/AO-210259" />
	</analytic>
	<monogr>
		<title level="j">Appl. Ontology</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="page" from="45" to="69" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Predicting the top-level ontological concepts of domain entities using word embeddings, informal definitions, and deep learning</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">G L</forename><surname>Junior</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Carbonera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Abel</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.eswa.2022.117291</idno>
		<ptr target="https://doi.org/10.1016/j.eswa.2022.117291" />
	</analytic>
	<monogr>
		<title level="j">Expert Syst. Appl</title>
		<imprint>
			<biblScope unit="volume">203</biblScope>
			<biblScope unit="page">117291</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Results of the ontology alignment evaluation initiative</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A N</forename><surname>Pour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Algergawy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Buche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">J</forename><surname>Castro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Fallatah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Faria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Fundulaki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hertling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Horrocks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Huschka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ibanescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Jiménez-Ruiz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Karam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Laadhar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lambrix</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Michel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Nasr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Paulheim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Pesquita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Saveta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shvaiko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Trojahn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Verhey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Yaman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Zamazal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th International Workshop on Ontology Matching (OM 2022) co-located with the 21st International Semantic Web Conference (ISWC 2022)</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<editor>
			<persName><forename type="first">P</forename><surname>Shvaiko</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Euzenat</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Jiménez-Ruiz</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">O</forename><surname>Hassanzadeh</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Trojahn</surname></persName>
		</editor>
		<meeting>the 17th International Workshop on Ontology Matching (OM 2022) co-located with the 21st International Semantic Web Conference (ISWC 2022)<address><addrLine>Hangzhou, China</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022-10-23">October 23, 2022</date>
			<biblScope unit="volume">3324</biblScope>
			<biblScope unit="page" from="84" to="128" />
		</imprint>
	</monogr>
	<note>Held as a virtual conference</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Serving DBpedia with DOLCE -More than Just Adding a Cherry on Top</title>
		<author>
			<persName><forename type="first">H</forename><surname>Paulheim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gangemi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Semantic Web</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="180" to="196" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Sweetening WORDNET with DOLCE</title>
		<author>
			<persName><forename type="first">A</forename><surname>Gangemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Guarino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Masolo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Oltramari</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AI Magazine</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="page" from="13" to="24" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">An Approach for the Alignment of Biomedical Ontologies based on Foundational Ontologies</title>
		<author>
			<persName><forename type="first">V</forename><surname>Silva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Campos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Silva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cavalcanti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information and Data Management</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="557" to="572" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Automatic ontology matching via upper ontologies: A systematic evaluation</title>
		<author>
			<persName><forename type="first">V</forename><surname>Mascardi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Locoro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<idno type="DOI">10.1109/TKDE.2009.154</idno>
		<ptr target="https://doi.org/10.1109/TKDE.2009.154" />
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Knowl. Data Eng</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="page" from="609" to="623" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Alignment Patterns based on Unified Foundational Ontology</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">F</forename><surname>Padilha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Baião</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Revoredo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the Brazilian Ontology Research Seminar</title>
				<meeting>of the Brazilian Ontology Research Seminar</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="48" to="59" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Foundations for Service Ontologies: Aligning OWL-S to DOLCE</title>
		<author>
			<persName><forename type="first">P</forename><surname>Mika</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Oberle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gangemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sabou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the 13th Conf. on World Wide Web</title>
				<meeting>of the 13th Conf. on World Wide Web</meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="563" to="572" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Measuring expert performance at manually classifying domain entities under upper ontology classes</title>
		<author>
			<persName><forename type="first">R</forename><surname>Stevens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lord</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Malone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Matentzoglu</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.websem.2018.08.004</idno>
		<ptr target="https://doi.org/10.1016/j.websem.2018.08.004" />
	</analytic>
	<monogr>
		<title level="j">J. Web Semant</title>
		<imprint>
			<biblScope unit="volume">57</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Matching domain and top-level ontologies exploring word sense disambiguation and word embedding</title>
		<author>
			<persName><forename type="first">D</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Basso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Trojahn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vieira</surname></persName>
		</author>
		<idno type="DOI">10.3233/978-1-61499-894-5-27</idno>
		<ptr target="https://doi.org/10.3233/978-1-61499-894-5-27" />
	</analytic>
	<monogr>
		<title level="m">Emerging Topics in Semantic Technologies - ISWC 2018 Satellite Events [best papers from 13 of the workshops co-located with the ISWC 2018 conference]</title>
				<editor>
			<persName><forename type="first">E</forename><surname>Demidova</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Zaveri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Simperl</surname></persName>
		</editor>
		<imprint>
			<publisher>IOS Press</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="page" from="27" to="38" />
		</imprint>
	</monogr>
	<note>Studies on the Semantic Web</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Using terms and informal definitions to classify domain entities into top-level ontology concepts: An approach based on language models</title>
		<author>
			<persName><forename type="first">A</forename><surname>Lopes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Carbonera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">F</forename><surname>Garcia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">H</forename><surname>Rodrigues</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Abel</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.knosys.2023.110385</idno>
		<ptr target="https://doi.org/10.1016/j.knosys.2023.110385" />
	</analytic>
	<monogr>
		<title level="j">Knowl. Based Syst</title>
		<imprint>
			<biblScope unit="volume">265</biblScope>
			<biblScope unit="page">110385</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Validating top-level and domain ontology alignments using wordnet</title>
		<author>
			<persName><forename type="first">D</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Trojahn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vieira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kamel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IX ONTOBRAS Brazilian Ontology Research Seminar</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>the IX ONTOBRAS Brazilian Ontology Research Seminar<address><addrLine>Curitiba, Brazil</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016-10-03">October 3, 2016</date>
			<biblScope unit="volume">1862</biblScope>
			<biblScope unit="page" from="119" to="130" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Hybrid speech recognition with deep bidirectional LSTM</title>
		<author>
			<persName><forename type="first">A</forename><surname>Graves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Jaitly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mohamed</surname></persName>
		</author>
		<idno type="DOI">10.1109/ASRU.2013.6707742</idno>
		<ptr target="https://doi.org/10.1109/ASRU.2013.6707742" />
	</analytic>
	<monogr>
		<title level="m">IEEE Workshop on Automatic Speech Recognition and Understanding</title>
				<meeting><address><addrLine>Olomouc, Czech Republic</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2013-12-08">December 8-12, 2013</date>
			<biblScope unit="page" from="273" to="278" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Semi-supervised sequence tagging with bidirectional language models</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Peters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Ammar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bhagavatula</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Power</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/P17-1161</idno>
		<ptr target="https://doi.org/10.18653/v1/P17-1161" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017</title>
				<meeting>the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017<address><addrLine>Vancouver, Canada</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017-08-04">July 30 - August 4, 2017</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1756" to="1765" />
		</imprint>
	</monogr>
	<note>Long Papers, Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">BERT: pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/n19-1423</idno>
		<ptr target="https://doi.org/10.18653/v1/n19-1423" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019</title>
				<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019<address><addrLine>Minneapolis, MN, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">June 2-7, 2019</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Classification of imbalanced data: A review</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Wong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Kamel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International journal of pattern recognition and artificial intelligence</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="page" from="687" to="719" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Glove: Global vectors for word representation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pennington</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Socher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)</title>
				<meeting>the 2014 conference on empirical methods in natural language processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1532" to="1543" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Enriching word vectors with subword information</title>
		<author>
			<persName><forename type="first">P</forename><surname>Bojanowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Grave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joulin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<idno type="DOI">10.1162/tacl_a_00051</idno>
		<ptr target="https://doi.org/10.1162/tacl_a_00051" />
	</analytic>
	<monogr>
		<title level="j">Trans. Assoc. Comput. Linguistics</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="135" to="146" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Improved adam optimizer for deep neural networks</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="DOI">10.1109/IWQoS.2018.8624183</idno>
		<ptr target="https://doi.org/10.1109/IWQoS.2018.8624183" />
	</analytic>
	<monogr>
		<title level="m">26th IEEE/ACM International Symposium on Quality of Service, IWQoS 2018</title>
				<meeting><address><addrLine>Banff, AB, Canada</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2018">June 4-6, 2018</date>
			<biblScope unit="page" from="1" to="2" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cui</surname></persName>
		</author>
		<idno>CoRR abs/2108.13624</idno>
		<ptr target="https://arxiv.org/abs/2108.13624" />
		<title level="m">Towards out-of-distribution generalization: A survey</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
