=Paper=
{{Paper
|id=Vol-3905/master5
|storemode=property
|title=Ontology extraction and evaluation for the Blue Amazon
|pdfUrl=https://ceur-ws.org/Vol-3905/master5.pdf
|volume=Vol-3905
|authors=Vivian Magri Alcaldi Soares,Renata Wassermann
|dblpUrl=https://dblp.org/rec/conf/ontobras/SoaresW24
}}
==Ontology extraction and evaluation for the Blue Amazon==
Vivian Magri A. Soares¹,², Renata Wassermann¹,²
¹ Universidade de São Paulo, Instituto de Matemática e Estatística, Rua do Matão 1010, Cidade Universitária, São Paulo, SP, Brazil
² Center for Artificial Intelligence (C4AI), Av. Prof. Lúcio Martins Rodrigues, 370, Cidade Universitária, São Paulo, SP, Brazil
Abstract
The Brazilian maritime territory, often referred to as the Blue Amazon, has invaluable significance for its resources, biodiversity, and commercial importance, among other aspects. Yet, information about it is dispersed. This project is geared towards the organization of knowledge about it in the form of an ontology. In the search for efficient methods with satisfying results for this task, a recent approach involving Large Language Models (LLMs) in the role of experts for building a conceptual hierarchy has shown promising results. This work presents a proposal for experimenting with the construction of an ontology of Blue Amazon related concepts using LLMs, followed by human and application-based evaluations.
Keywords
ontology extraction, ontology evaluation, Large Language Models, Blue Amazon
1. Introduction
The Brazilian maritime territory, a vast area of approximately the same size as the Amazon rainforest, often referred to as the Blue Amazon, is a region of invaluable importance for Brazil and for the world, because of its economic and commercial resources, its multiple different ecosystems, and even its key role in climate regulation. Yet, information about it is dispersed [1]. As such, the organization of knowledge about this domain in the form of an ontology has become the focus of this project.
The manual construction of ontologies is a strenuous endeavor which requires access to domain experts. Many (semi-)automatic ontology extraction methods have been proposed over the years. Nevertheless, the problem of obtaining a well-structured, relevant ontology without a fair amount of human labor persists. In that scenario, a recent approach [2] has shown promising results. It involves using a Large Language Model (LLM) in the role of an expert that iteratively provides subconcepts of a seed concept. This work is still preliminary and yields only a simple ontology, with the "is-a" type of relation alone. We believe, however, that this methodology has great potential as an aid to ontology learning, with plenty of room for expansion in future work.
To assess how successful the LLM-powered algorithm is at the task, we envisioned two kinds of evaluation for the outputs: the first aided by human domain experts, and the second testing the efficiency of the validated ontologies in enhancing the accuracy of a conversational agent.
This work discusses the concepts grounding our project, as well as the context of its development, and presents the current results and the proposal for the next steps of this research.
2. Related Work
Various techniques from different fields have contributed to the improvement of ontology development and extraction, and in the first part of this section we highlight some of them. In the second part, we discuss some measures for ontology evaluation.
Proceedings of the 17th Seminar on Ontology Research in Brazil (ONTOBRAS 2024) and 8th Doctoral and Masters Consortium on
Ontologies (WTDO 2024), Vitória, Brazil, October 07-10, 2024.
vivian.magri@ime.usp.br (V. M. A. Soares); renata@ime.usp.br (R. Wassermann)
https://www.ime.usp.br/~renata (R. Wassermann)
ORCID: 0009-0009-9767-4127 (V. M. A. Soares); 0000-0001-8065-1433 (R. Wassermann)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2.1. Ontology Extraction
Asim et al. [3] state that ontologies can be created by extracting information from unstructured text in a step-by-step process known as the ontology learning layer cake. In this approach, after preprocessing text corpora using linguistic techniques, relevant terms and concepts of a domain are extracted using natural language processing (NLP) techniques. These methods may also be applied to obtain taxonomic and non-taxonomic relations among these concepts. The authors also mention semantic lexicons, which can be used at both the term/concept extraction and the relationship extraction stages, and Inductive Logic Programming (ILP), which may be used to form axioms in the later stages [3].
More recently, though, we have observed a great rise in the use of deep neural network-based methods. Reshadat et al. [4] state that deep neural networks are powerful approaches for ontology population, since the feature engineering procedure is done automatically. Furthermore, these systems are not constrained to a predefined set of relations and can extract any type of relation from a massive and unstructured corpus automatically. The disadvantages, however, are that these approaches usually either require a corpus annotated with concepts and the relations between them [4], or deviate from the more rigid framework that constitutes an ontology.
Analyzing the literature, we observe that purely automatic information extraction systems using the aforementioned approaches usually have trouble keeping the information simultaneously complete, coherently hierarchical, within the domain, and at a level of granularity and relevance useful for specific applications. This is probably the reason why rule-based methods have been common for the task of ontology population [4]. One great inconvenience, though, is that these methodologies basically require the schema of the ontology to be chosen in advance and then provided as an input, and designing a schema for an entire domain is itself a non-trivial task that requires a domain expert and involves many design decisions [2]. Another way of dealing with this issue lies in the various approaches to semi-automatic ontology construction that have been proposed. But, while such approaches look good on paper, there is little evidence that they have been applied in practice [2].
In this context, Funk et al. [2] presented an approach that uses OpenAI's GPT 3.5 API to aid in the construction of a concept hierarchy for a context provided by the user. GPT stands for Generative Pre-trained Transformer, a type of Large Language Model (LLM). Over the past decade, advancements in natural language processing and machine learning have led to the development of increasingly sophisticated language models [5]. The rapid development of (what became known as) LLMs in recent years has also been fueled by the growth in computational resources, the availability of large datasets, and evolving software stacks [6]. Rozière et al. [7] claim that LLMs have reached a proficiency in natural language that allows them to be commanded and prompted to perform a variety of tasks, including ones that require advanced natural language understanding.
The general strategy in [2] is to take as input from the user a seed concept (for example, Animals, Music, or even Things) that will determine the domain. The algorithm then provides a textual description for it, identifies its subconcepts, and places them in the hierarchy being constructed. The loop continues with the exploration of every subconcept discovered, until the model either cannot find any more new subconcepts, or some stop condition defined by the hyperparameters, like the maximum exploration depth, is met. The outputs include an OWL file, with all concepts and their definitions, and a graph that allows for the visualization of the produced hierarchy. A small example of the latter can be seen in Figure 1. Although the authors recognize that further investigation is necessary to draw general conclusions, they believe their experiments indicate that LLMs can be of considerable help for constructing concept hierarchies and that the research has great potential for expansion.
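To make the loop concrete, the sketch below reproduces its general shape in Python. It is only an illustration of the strategy described in [2], not the authors' implementation: ask_llm_for_subconcepts is a hypothetical stand-in for their prompting step, and the verification of candidates against the frequency threshold (discussed in Section 5) is omitted.

```python
from collections import deque

def ask_llm_for_subconcepts(concept: str) -> list[str]:
    """Hypothetical stand-in for the prompting step of [2]: given a
    concept, return the names of verified subconcepts."""
    raise NotImplementedError("replace with calls to an LLM API")

def build_hierarchy(seed: str, max_depth: int = 3) -> dict[str, list[str]]:
    """Breadth-first exploration of subconcepts up to max_depth,
    where the depth of a concept is its shortest path to the seed."""
    hierarchy: dict[str, list[str]] = {}
    queue = deque([(seed, 0)])
    seen = {seed}
    while queue:
        concept, depth = queue.popleft()
        if depth >= max_depth:   # stop condition: maximum exploration depth
            continue
        subconcepts = ask_llm_for_subconcepts(concept)
        hierarchy[concept] = subconcepts
        for sub in subconcepts:
            if sub not in seen:  # the loop also ends when no new subconcepts appear
                seen.add(sub)
                queue.append((sub, depth + 1))
    return hierarchy
```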
2.2. Evaluation of Ontologies
The literature shows that there is no consensus regarding the classification of ontology evaluation proposals [8]. The diversity of initiatives poses difficulties for creating a unified classification [9]. One possible grouping of the techniques is the four categories proposed by Asim et al. [3]: a) Gold standard-based evaluation, which evaluates the resultant ontology against a predefined benchmark that depicts an ideal ontology of a particular domain; b) Application-based evaluation, also referred to as 'Task-Based Evaluation', which evaluates a given ontology by exploiting it in a specific application to perform some task; the outcome of the particular task determines the goodness of the specified ontology regardless of its structural properties; c) Data-driven evaluation (also called Corpus-based evaluation), which utilizes existing domain-specific knowledge sources (usually textual corpora) to assess the extent of coverage by one or more target ontologies in a particular domain; and d) Human evaluation, also called 'Criteria-Based Evaluation', which is generally based on defining various decision criteria for the selection of the best ontology from a specified set of candidate ontologies; a numerical score is assigned as experts rate each relevant aspect of an ontology and a weighted sum is calculated.
Figure 1: Example of output of the algorithm from [2] using "coastal ecosystems" as input
As a technique that fits precisely the above definition of human evaluation, we can highlight ONTOMETRIC [10]. In order to solve what its authors call the election problem, which arises when many candidate ontologies are available, they present a taxonomy of 160 characteristics that provides the outline to compare them. It is the skeleton used to build the multilevel tree of characteristics, which should be adapted by adding or removing characteristics according to their relevance for a given application of this method. The top level of the taxonomy originally has five dimensions, i.e., the main aspects that the user should consider to examine an ontology. These are: content; language of implementation; methodology used for the development; software tools used for building it; and costs of the ontology in a certain project. Each dimension is defined through a set of factors, that is, the fundamental elements that should be analyzed to obtain the value of the dimensions. The factors, in turn, are defined through a group of characteristics that allow calculating the value of their suitability. These characteristics can be defined, recursively, by means of even more specific subcharacteristics. Although the specialization of the characteristics and the assessment of the criteria of a particular ontology require considerable effort, feedback from project managers reveals that once the framework has been defined, and if it is applied to one particular type of ontology, ONTOMETRIC helps to justify decisions taken and to weigh up the choice of one ontology over other options [10].
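The weighted aggregation underlying this kind of evaluation can be pictured as a small recursive computation over the tree of characteristics. The fragment below is a minimal sketch, with invented node names and weights rather than ONTOMETRIC's actual 160-characteristic taxonomy:

```python
from dataclasses import dataclass, field

@dataclass
class Criterion:
    """A node in the characteristics tree: a leaf holds an expert's
    rating; an inner node aggregates its children by weighted sum."""
    name: str
    weight: float = 1.0            # relative importance among siblings
    score: float | None = None     # assigned by experts on leaves only
    children: list["Criterion"] = field(default_factory=list)

    def value(self) -> float:
        if not self.children:      # leaf: return the expert's rating
            return self.score or 0.0
        total = sum(c.weight for c in self.children)
        return sum(c.weight * c.value() for c in self.children) / total

# Illustrative fragment of the "content" dimension (names and weights
# are invented for this example, not taken from ONTOMETRIC):
content = Criterion("content", children=[
    Criterion("concepts", weight=2.0, children=[
        Criterion("essential concepts present", score=4.0),
        Criterion("concepts correctly defined", score=3.0),
    ]),
    Criterion("taxonomy", weight=1.0, children=[
        Criterion("no taxonomic cycles", score=5.0),
    ]),
])
print(f"content dimension score: {content.value():.2f}")  # 4.00
```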
Not every human evaluation method, however, is aimed at comparing ontologies. Almeida [9] proposes a multidisciplinary approach to ontology content evaluation using experts. Their research was conducted with the use of a simple search engine, developed to allow the visualization of an ontology by the user, and a group of three questionnaires to be answered next. Those were prepared using concepts from distinct research fields that are related to content assessment. The questionnaire related to Information Quality was based on criteria such as coverage, accuracy and content. The one based on Competency Questions is meant to assess the capacity of an ontology-based system to answer the questions it was designed to address. Finally, the questions based on educational objectives are intended to assess whether specific content was learned by a person during the exploration of the ontology. The study concluded that it is a good idea to aid the domain experts in verifying the quality of the conceptualization present in the ontology according to scientific criteria, in order to validate the adequacy of the specification of a model [9].
Human evaluation has the possibility of covering all the high levels of evaluation for ontologies distinguished in the review [3], that is: lexical, vocabulary, concept and data; hierarchy and taxonomy; other semantic relations; context and application; syntactic; and structure, architecture and design. Its major shortcoming is the high manual cost it requires in terms of time and effort. Araújo and Lima [8] also point out the difficulties in establishing who the right users are and what the best criteria are for evaluating the ontology. Furthermore, the fact that ultimately the evaluation still depends on the expert's intuition makes it uncertain whether it is accurate. However, they state that there are criticisms in the literature regarding each of the types of proposals for ontology evaluation, and that it is necessary to take into account the objectives of the evaluation and what exactly is intended to be evaluated in the ontology.
3. Context
Approximately as big as the Amazon rainforest, the Brazilian maritime territory is often referred to as the Blue Amazon [1]. In total, it amounts to around 4,500,000 km² of sea. The Blue Amazon carries 95% of Brazil's international trade, 90% of its oil reserves and 77% of the country's gas reserves [1]. Another noteworthy aspect is its rich environment and biodiversity: the territory encompasses multiple different ecosystems [11]. The region is also a vital source of food supply and a key player in climate regulation [1]. Despite its importance, the Blue Amazon is still poorly documented.
Such was the context that originated the Knowledge Enhanced Machine Learning (KEML)¹ research group, integrated into the Center for Artificial Intelligence (C4AI)², a large research center headquartered at the Universidade de São Paulo that congregates researchers and students from a wide variety of fields. KEML's objective has been to merge data-driven learning with knowledge-based reasoning [12]. [1] describes the group's first project, the BLue Amazon Brain (BLAB). BLAB was born with the goal of building an architecture aimed at disseminating information about the Brazilian coast domain and its importance [13]. It would function as a tool for education and environmental awareness.
Currently, the KEML team is mainly dedicated to producing conversational agents or related systems (such as QA systems) based on LLMs, especially those enriched by Knowledge Representation (KR) mechanisms [13]. Among them is Blabinha 1.0 [14], a conversational agent specifically designed as an evaluation environment for LLMs and prompt engineering when placed in the role of conducting task-oriented and domain-oriented dialogues. Blabinha 1.0 is implemented using GPT-family models and prompt engineering of the chain-of-thought (step-by-step) type, aiming at promoting a child's engagement in a conversation about the Blue Amazon domain through a gamification strategy. During the conversation, the language model is subjected to a series of tasks, ranging from introducing itself to the child, to performing topic analysis and suggesting subjects within the context [14].
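The step-by-step prompting style mentioned above can be pictured as a system prompt that decomposes the dialogue into explicit tasks. The snippet below is purely illustrative of the idea and is not Blabinha 1.0's actual prompt:

```python
# Illustrative only: the general shape of a chain-of-thought style
# system prompt for a domain- and task-oriented dialogue; NOT the
# actual prompt used by Blabinha 1.0.
SYSTEM_PROMPT = """You are a friendly assistant talking with a child
about the Blue Amazon. For each user message, reason step by step:
1. If this is the first turn, introduce yourself.
2. Identify the topic of the child's message (topic analysis).
3. Check whether the topic belongs to the Blue Amazon domain.
4. Answer in simple language and suggest a related subject to keep
   the child engaged (gamification).
Show the child only the final answer, not the reasoning steps."""

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "What animals live in coral reefs?"},
]
# `messages` would then be sent to a GPT-family chat completion API.
```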
4. Research Proposal
As shown in the previous sections, the Blue Amazon is a region of great importance for Brazil, about which structured information is scarce. Having such data organized and made available as an ontology could improve information distribution on the theme, as well as improve the maintenance of the data and its integration into systems and many AI applications. But the manual crafting of ontologies is a difficult engineering task that is both time-consuming and costly [2]. Besides, it demands a reasonable command of the subject. With that in mind, the previously discussed work [2], using an LLM to construct ontologies made of subconcept/is-a relations, presented itself as a promising path for (semi-)automatically extracting structured information about the Blue Amazon domain.
¹ https://sites.usp.br/keml/en/keml-en/
² https://c4ai.inova.usp.br/
The framework presented in their paper, however, is still recent, as is the use of LLMs in the construction of knowledge-based structures, and it has not yet been formally assessed. Thus, we consider it a relevant research goal to construct and conduct an evaluation of the results produced by their algorithm when applied to the domain of the Brazilian coast. The evaluation will also be an opportunity to improve the outputs, as well as to fine-tune the workflow. It shall happen in two steps, using two different techniques: first employing the aid of experts on the domain to supervise the results, assessing the correctness and the design of the resultant ontologies; then, we intend to proceed with an application-based evaluation, using Blabinha 1.0 as a test environment.
We believe the contributions of this research will be not only the ontologies produced and evaluated, but also findings to guide the use of the algorithm for the construction of concept hierarchies and methods for their validation, as well as an expanded investigation of the potential of LLMs for the construction of ontologies.
5. Preliminary Results
Currently, we have concluded the intended ontology generation part, and we are conducting the first step of the tests, working with the domain experts on the human validation of the outputs.
After installing the requirements and testing the code³ described in [2], connected to the GPT 3.5 API, a few experiments with examples also used by the authors were performed, just to confirm the installation was functional. The concept "Blue Amazon" was then tested directly. The execution, however, finished without obtaining any verified subconcepts for it. As a workaround, "Brazilian Water Resources", a more general concept, but still related to the original theme, was chosen for the experimentation with the main set of hyperparameters of the algorithm: exploration depth (d), up to which new concepts will be explored, where the depth of a concept is its shortest path to the seed concept; and frequency threshold (f), for which lower values like 5 favor completeness, while higher values like 20 tend to benefit correctness. In accordance with the authors' direct recommendation, two values suitable for test runs were used for the first, 2 and 3, and the range from 5 to 20 recommended for the second was covered using a step of 5. Thus, eight executions, covering all the possible combinations of the selected values, were run using "Brazilian Water Resources" as the initial concept. From the observation of the results, we concluded that the combination of d = 3 and f = 10 produced the most interesting outputs. Therefore, these were the values chosen for the subsequent experiments.
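The eight runs are simply the Cartesian product of the selected values. A sketch of the grid, where run_extraction is a hypothetical wrapper around the algorithm of [2]:

```python
from itertools import product

DEPTHS = [2, 3]               # exploration depth d (authors' test values)
THRESHOLDS = [5, 10, 15, 20]  # frequency threshold f, range 5-20, step 5

def run_extraction(seed: str, d: int, f: int) -> None:
    """Hypothetical wrapper around the ontology extraction algorithm."""
    raise NotImplementedError

# Eight executions: all combinations of d and f for one seed concept.
for d, f in product(DEPTHS, THRESHOLDS):
    run_extraction("Brazilian Water Resources", d=d, f=f)
```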
For the second round of tests, the chosen concept was "Coastal ecosystems". Since the initial tests showed the results were quite sensitive to variations in the form of the input, a few different forms were experimented with: first letter of the central concept capitalized or not (keeping the rest in lowercase in both cases, as in the concepts output by the algorithm); referencing Brazil or not; concept in English or in Brazilian Portuguese (PT-BR). Also, the variations in Portuguese were used as input in prompts modified to request that the answers be in PT-BR as well. Finally, one concept was executed twice under the same conditions, to serve as a parameter of the model's normal variation. In total, 13 executions, yielding ontologies containing between 1 and 36 concepts each, are being considered for analysis. The model struggled particularly to expand concepts in English referencing Brazil, such as "Brazilian Coastal ecosystems". Besides the edits to the prompts to force the PT-BR outputs, they were also slightly modified to reproduce the tests using GPT-4. These executions produced hierarchies containing between 2 and 99 concepts. This model had notably worse results when the version requesting answers in PT-BR (with instructions kept in English) was run.
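The concept counts reported above can be obtained directly from the OWL files produced by the algorithm. A sketch using rdflib, assuming the outputs model concepts as named owl:Class entities linked by rdfs:subClassOf (the file name is a placeholder):

```python
from rdflib import Graph, RDF, RDFS, OWL

def hierarchy_stats(path: str) -> tuple[int, int]:
    """Return (number of named classes, number of is-a relations)
    found in an OWL file."""
    g = Graph()
    g.parse(path)  # rdflib infers the serialization from the suffix
    classes = set(g.subjects(RDF.type, OWL.Class))
    isa_axioms = list(g.subject_objects(RDFS.subClassOf))
    return len(classes), len(isa_axioms)

# Placeholder file name for one of the produced ontologies:
n_concepts, n_isa = hierarchy_stats("coastal_ecosystems.owl")
print(f"{n_concepts} concepts, {n_isa} is-a relations")
```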
For a deep evaluation of the outputs, it was considered essential to seek the aid of domain experts to oversee the results. We are employing five academics to respond to forms aimed at evaluating each of the ontologies with at least 5 concepts. These experts have backgrounds related to the study of the ocean, from fields such as Geosciences, Environmental Resource Management and Oceanography, most with experience in projects or studies related to sustainability. Although five is not exactly a high number of people, the diversification of backgrounds should help diminish the bias of individual evaluation.
³ https://git.informatik.uni-leipzig.de/hosemann/onto-llm
There is also a diverse range of age, gender, and educational levels, from undergraduate to postdoctoral (some of the experts are also instructors).
The questionnaires were formulated mostly based on the works of [9] and [10]. From [10], the analysis of some factors, namely concepts, relations and taxonomy, from the content dimension, served as inspiration. Their work also motivated the idea of grouping the topics under evaluation hierarchically and the preparation for the calculation of a numeric score for each ontology. From [9], the main inspirations were the criteria related to information quality and the insights on how to use the questions to assess how well knowledge on the domain was being transmitted and to what degree an ontology was succeeding in the goal of modeling the real-world concept. The higher-level dimensions of our questionnaires are: accuracy, relevance, coverage, precision and information design. All of them, except for the last one, need to be analyzed at the concept level, considering each of a concept's relationships. In this first round of evaluation it was considered important to go into this level of detail in order to have a parameter over GPT's suggestions for each element of a hierarchy. The form also gives the respondents the opportunity to express impressions that will not be translated into a quantitative metric, but will be qualitatively analyzed to compose the results.
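As a minimal sketch of how the quantitative part could be computed, the snippet below pools expert ratings into one mean score per dimension. The dimension names follow the questionnaire, but the ratings are invented placeholders and the aggregation itself (a plain mean over experts and concepts) is an assumption for illustration:

```python
from statistics import mean

DIMENSIONS = ["accuracy", "relevance", "coverage", "precision",
              "information design"]

# One dict per expert: {dimension: ratings per concept/relation}.
# These numbers are invented placeholders, not real ratings.
expert_answers = [
    {"accuracy": [4, 5, 3], "relevance": [5, 4, 4]},
    {"accuracy": [3, 4, 4], "relevance": [4, 4, 5]},
]

def dimension_scores(answers: list[dict]) -> dict[str, float]:
    """Mean rating per dimension, pooled over experts and concepts."""
    return {
        dim: mean(r for a in answers for r in a.get(dim, []))
        for dim in DIMENSIONS
        if any(a.get(dim) for a in answers)
    }

print(dimension_scores(expert_answers))
```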
A first evaluation round was conducted, with forms prepared for two smaller outputs, which were answered by the evaluators after a basic explanation, followed by a feedback meeting both to clarify their doubts and to collect their insights to improve the evaluation. One important reminder that came from this experience is that it is incoherent to attempt an ontology evaluation without stating a clear purpose for it. Thus, the assessment is now being performed considering the goal of the ontologies as the construction of a consistent concept hierarchy for the root concept of the given ontology, considering what an elementary school student should be taught about such a topic (in accordance with the evaluation planned for the next steps, using Blabinha 1.0). The revised version of the model form, with the layout of the questions for each kind of concept (considering the type of relations it holds in a given ontology), can be found at: https://forms.gle/mxHiX61zUvu5BiqK6. Given the size of the outputs produced by GPT-4 (many exceeding 50 concepts), the model of the forms was simplified for the evaluation of these outputs. Great effort was put into minimizing the impact of this modification, so as to guarantee it would only make the assessment more practical, while the quality of the assessment for each dimension would be maintained. The latest model of the form is available at: https://forms.gle/oAbSjy6UnfKXgiFR6.
6. Next Steps and Future Works
The following activities are planned for this project, to be concluded, ideally, by the first quarter of 2025:
1. Process the human evaluation results
After the conclusion of the specialists' activities with the prepared forms, we must collect the answers, standardize the scales and calculate the metrics, as well as analyze the qualitative results. The metrics and the experts' impressions should provide means for us to compare the results across the different forms of input and the different models experimented with, and hopefully derive enlightening conclusions.
2. Perform the application-based evaluation
Following the assessment of the constructed ontologies, we intend to choose some of the outputs that achieve the best scores and merge them into the Blabinha 1.0 prompts to test their impact on the model's performance (a sketch of this merging is given after this list). The aspect of topic analysis had previously been rated by human evaluation of the tool⁴. So, to measure the effect of inserting the concept hierarchies into the algorithm, a new round of evaluation would be conducted in a format similar to the previous one, focused on Blabinha's capacity to explore the domain of Coastal Ecosystems, in order to compare the results regarding the model's discernment about context.
⁴ The results of this work are in the process of publication.
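How the merging of a hierarchy into the prompts could look is sketched below: the is-a pairs of a validated ontology are flattened into plain-text facts and prepended to the agent's context. This is one possible design for illustration, not the project's settled approach:

```python
def hierarchy_to_prompt(edges: list[tuple[str, str]]) -> str:
    """Flatten (subconcept, superconcept) pairs into plain-text
    facts to prepend to a conversational agent's prompt."""
    facts = [f"- {sub} is a kind of {sup}." for sub, sup in edges]
    return "Known facts about the domain:\n" + "\n".join(facts)

# Illustrative pairs, as they might come from a validated ontology:
edges = [("mangroves", "coastal ecosystems"),
         ("coral reefs", "coastal ecosystems")]
context = hierarchy_to_prompt(edges)
# `context` would be prepended to Blabinha 1.0's system prompt before
# the new evaluation round described above.
```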
We believe this research also has great potential for expansion beyond the scope of this project, for instance through the construction of more ontologies, either using other concepts related to the Blue Amazon domain with the refinements to the pipeline that our findings will provide, or applying the workflow to different contexts. As suggested in [2] itself, other relevant directions for future work are experimenting with the construction of more expressive ontologies, adding other kinds of relations and possibly even rules, such as disjointness; and testing the effect of fine-tuning for domain-specific ontology construction, training the model with curated information about the intended subjects.
Acknowledgments
This work was carried out at the Center for Artificial Intelligence (C4AI-USP), with the support of the University of São Paulo, the São Paulo Research Foundation (FAPESP, grant #2019/07665-4) and the IBM Corporation. Vivian Magri A. Soares was also supported by CAPES.
References
[1] P. Pirozelli, A. B. R. Castro, A. L. C. de Oliveira, A. S. Oliveira, F. N. Cação, I. C. Silveira, J. G. M.
Campos, L. C. Motheo, L. F. Figueiredo, L. F. A. O. Pellicer, M. A. José, M. M. José, P. de M. Ligabue,
R. S. Grava, R. M. Tavares, V. B. Matos, Y. V. Sym, A. H. R. Costa, A. A. F. Brandão, D. D. Mauá,
F. G. Cozman, S. M. Peres, The blue amazon brain (BLAB): A modular architecture of services
about the brazilian maritime territory, CoRR abs/2209.07928 (2022). URL: https://doi.org/10.48550/
arXiv.2209.07928. doi:10.48550/ARXIV.2209.07928. arXiv:2209.07928.
[2] M. Funk, S. Hosemann, J. C. Jung, C. Lutz, Towards ontology construction with language mod-
els, CoRR abs/2309.09898 (2023). URL: https://doi.org/10.48550/arXiv.2309.09898. doi:10.48550/
ARXIV.2309.09898. arXiv:2309.09898.
[3] M. N. Asim, M. Wasim, M. U. G. Khan, W. Mahmood, H. M. Abbasi, A survey of ontology learning
techniques and applications, Database J. Biol. Databases Curation 2018 (2018) bay101. URL:
https://doi.org/10.1093/database/bay101. doi:10.1093/DATABASE/BAY101.
[4] V. Reshadat, A. Akcay, K. Zervanou, Y. Zhang, E. de Jong, SCRE: special cargo relation extraction
using representation learning, Neural Comput. Appl. 35 (2023) 18783–18801. URL: https://doi.org/
10.1007/s00521-023-08704-9. doi:10.1007/S00521-023-08704-9.
[5] D. Nunes, R. Primi, R. Pires, R. de Alencar Lotufo, R. F. Nogueira, Evaluating GPT-3.5 and GPT-
4 models on brazilian university admission exams, CoRR abs/2303.17003 (2023). URL: https:
//doi.org/10.48550/arXiv.2303.17003. doi:10.48550/ARXIV.2303.17003. arXiv:2303.17003.
[6] S. Smith, M. Patwary, B. Norick, P. LeGresley, S. Rajbhandari, J. Casper, Z. Liu, S. Prabhumoye, G. Zerveas, V. Korthikanti, E. Zheng, R. Child, R. Y. Aminabadi, J. Bernauer, X. Song, M. Shoeybi, Y. He, M. Houston, S. Tiwary, B. Catanzaro, Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model, CoRR abs/2201.11990 (2022). URL: https://arxiv.org/abs/2201.11990. arXiv:2201.11990.
[7] B. Rozière, J. Gehring, F. Gloeckle, S. Sootla, I. Gat, X. E. Tan, Y. Adi, J. Liu, T. Remez, J. Rapin,
A. Kozhevnikov, I. Evtimov, J. Bitton, M. Bhatt, C. C. Ferrer, A. Grattafiori, W. Xiong, A. Défossez,
J. Copet, F. Azhar, H. Touvron, L. Martin, N. Usunier, T. Scialom, G. Synnaeve, Code Llama: Open
foundation models for code, 2023. arXiv:2308.12950.
[8] W. Araújo, G. Lima, O cenário da avaliação de ontologias: revisão de literatura, Tendencias da
Pesquisa Brasileira em Ciência da Informação 9 (2016).
[9] M. B. Almeida, A proposal to evaluate ontology content, Appl. Ontology 4 (2009) 245–265. URL:
https://doi.org/10.3233/AO-2009-0070. doi:10.3233/AO-2009-0070.
[10] A. L. Tello, A. Gómez-Pérez, ONTOMETRIC: A method to choose the appropriate ontology, J.
Database Manag. 15 (2004) 1–18. URL: https://doi.org/10.4018/jdm.2004040101. doi:10.4018/JDM.
2004040101.
[11] P. de M. Ligabue, A. A. F. Brandão, S. M. Peres, F. G. Cozman, P. Pirozelli, Blabkg: a knowledge
graph for the blue amazon, in: P. Li, K. Yu, N. V. Chawla, R. Feldman, Q. Li, X. Wu (Eds.),
IEEE International Conference on Knowledge Graph, ICKG 2022, Orlando, FL, USA, November
30 - Dec. 1, 2022, IEEE, 2022, pp. 164–171. URL: https://doi.org/10.1109/ICKG55886.2022.00028.
doi:10.1109/ICKG55886.2022.00028.
[12] KEML, About keml, https://sites.usp.br/keml/en/keml-en/, 2023. Accessed: 2024-08-01.
[13] KEML, Conversational agents, https://sites.usp.br/keml/en/conversational-agents/, 2023. Accessed:
2024-08-01.
[14] KEML, Frameworks, https://sites.usp.br/keml/en/frameworks-2/, 2023. Accessed: 2024-08-01.