<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
<article-title>Comparing Knowledge Graph Extraction Methods in a Practical Context</article-title>
      </title-group>
      <contrib-group>
<contrib contrib-type="author">
          <string-name>Roos M. Bakker</string-name>
          <email>roos.bakker@tno.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daan L. Di Scala</string-name>
          <email>daan.discala@tno.nl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Leiden University Centre for Linguistics, Leiden University</institution>
          ,
          <addr-line>2311 BE Leiden, Reuvensplaats 3-4</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>TNO Netherlands Organisation for Applied Scientific Research, Department Data Science</institution>
          ,
<addr-line>Kampweg 55, 3769 ZG Soesterberg</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Knowledge graphs provide structure and semantic context to unstructured data. Creating them is labour intensive: it requires a close collaboration of graph developers and domain experts. Therefore, previous work has made attempts to automate (parts of) this process, utilising information extraction methods. This paper presents a comparative analysis of methods for extracting relations, with the goal of automated knowledge graph extraction. The contributions of this paper are two-fold: 1) the creation of a small dataset containing different versions of a news message annotated with triples, and 2) a comprehensive comparison of relation extraction methods within the context of this dataset. The primary objective of this paper is to assess these methods within a real-life use case scenario, where the resulting graph should aspire to the quality standards achievable through manual development. Prior methodologies often relied on automatically extracted datasets and a limited range of relation types, consequently constraining the expressivity and richness of resulting graphs. Furthermore, these datasets typically feature short or simplified sentences, failing to reflect the complexity inherent in real-world texts like news messages or research papers. The results show that GPT models demonstrate superior performance compared to the other relation extraction methods we tested. However, in the qualitative analysis performed in addition to the evaluation metrics, we noted that alternative approaches like REBEL and KnowGL exhibit strengths in leveraging external world knowledge to enrich the graph beyond the textual content alone. This finding underscores the importance of considering a variety of methods that not only excel in extracting relations directly from text but also incorporate supplementary knowledge sources to enhance the overall richness and depth of the resulting knowledge graph.</p>
      </abstract>
<kwd-group>
        <kwd>Knowledge Graph Extraction</kwd>
        <kwd>Relation Extraction</kwd>
        <kwd>Ontology Learning</kwd>
        <kwd>Knowledge Graphs</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
<p>In the digital age, the abundance of textual data presents both an opportunity and a challenge. Knowledge
graphs can aid in structuring and utilising this knowledge, leading to an increasing demand
for methodologies to extract knowledge from textual sources. Knowledge graph extraction
has become a popular technique for answering this demand because of the ability of knowledge
graphs to organise and connect information. By representing knowledge and data in a structured
graph format, relationships between entities become explicit, allowing for reasoning, interoperability,
efficient retrieval, and other downstream applications.</p>
      <p>
Creating knowledge graphs from scratch is a labour-intensive task. Domain-specialised
models require time from domain experts and graph developers to ensure their quality. Information
extraction techniques can support this process by extracting entities and relations from text.
Existing fields of study that are often used in the process of knowledge graph extraction include,
but are not limited to, relation extraction, Named Entity Recognition (NER), keyword extraction,
and link prediction [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Additionally, end-to-end approaches have been suggested, which
classify relations in texts utilising language models [
        <xref ref-type="bibr" rid="ref2">2</xref>
]. The multi-step approach suffers from
its dependence on individual parts. As Jaradeh et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
] discuss, their text triple extractor is a
weakness in their architecture due to the low quality of its output. End-to-end relation extraction
models are often task-specific, with fixed relation types, and lack the flexibility to work on
a range of texts [
        <xref ref-type="bibr" rid="ref2">2</xref>
]. Additionally, many datasets for this task are created in a distantly supervised manner,
which impacts their quality.
      </p>
<p>In this paper, we compare different knowledge graph extraction techniques with the goal of
creating a knowledge graph that represents the text from which it is extracted in the way a graph
developer would manually create it for an applied use case. For this purpose, a news message
and a simple baseline text are annotated, identifying subjects, objects, and their relations. We run
a collection of relation extraction methods on this dataset and evaluate their performance. The
contributions of this paper are as follows: 1) we introduce a small annotated dataset containing
different versions of a news message, and 2) we carry out a comprehensive comparison of
relation extraction techniques within the context of this dataset. With the dataset, containing
simple and complex versions of a news message, we demonstrate how complexity affects
performance. In addition to standard evaluation metrics (precision, recall, F1), we
include a clustering coefficient to show the density of the knowledge graph. To illustrate the
strengths and weaknesses of the methods that are not reflected in the metrics, we provide a
qualitative analysis. Additional results and materials are included in our open-source repository
(https://gitlab.com/genesysubmission/text2kg).</p>
      <p>In the next section, we will give an overview of knowledge graph extraction, including
information extraction techniques. In Section 3, we introduce our data and annotation method
and discuss the various methods. In Section 4, we will present our findings and discuss their
implications in Section 5. Finally, we will conclude with a summary of our work and suggestions
for future research directions in Section 6.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
<p>Several fields are involved in or related to the extraction of knowledge graphs, such as
information extraction and ontology learning. In this section, we first elaborate on the definition of
knowledge graphs, followed by a short overview of knowledge graph extraction and related
fields. Afterwards, we discuss the field of information extraction and techniques related to
knowledge graph extraction.</p>
      <p>
Knowledge graphs and similar concepts such as ontologies have been around since the
origins of philosophy. The computer science interpretation of knowledge modelling, with
terms such as knowledge graphs, appears in literature as early as the 1970s
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], with early research on extraction from text and other sources starting a little later [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
Knowledge graphs as a term and technique have become more popular since Google announced
their implementation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        The knowledge graph as it is known today is a powerful representation framework that
organises information in the form of interconnected nodes and edges [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In this graph-based
structure, nodes typically represent entities (such as people, places, or concepts), while edges
denote relationships between these entities. The combination of two nodes and their relation is
called a triple. This structure allows for the creation of a rich network that captures the context
and connections within a dataset. Knowledge graphs enable a more nuanced and context-aware
understanding of information, facilitating effective data exploration and retrieval [
        <xref ref-type="bibr" rid="ref7">7</xref>
]. They are
therefore increasingly used in applications where understanding and transparency of the
data are essential, such as the safety and medical domains.
      </p>
      <sec id="sec-3-1">
        <title>2.1. Knowledge Graph Extraction</title>
        <p>
Knowledge Graph Extraction is a task that aims to extract knowledge graphs from
different sources, using a variety of techniques. It is closely related to ontology learning and
information extraction. Ontology learning, an established field, focuses on automatically or
semi-automatically learning ontologies, including complex elements like rules and hierarchies.
Information extraction techniques are often used for extracting knowledge graphs; we
discuss frequently used techniques in Section 2.2. Throughout this work, we adopt the term
“knowledge graph extraction”. Our objective is to extract structured information from
unstructured texts, with the goal of creating a knowledge graph. This term underscores our
focus on this specific task, distinct from related tasks like relation extraction or specialised
areas such as ontology learning.
        </p>
        <sec id="sec-3-1-1">
          <title>2.1.1. Ontology Learning</title>
          <p>
            Ontologies have been used as knowledge bases, reasoning tools, and schematic tools in the
context of information science. The consensus is that they are stricter than knowledge graphs; in
ontology terms they could be considered a subclass of knowledge graph [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. An ontology is a
formal specification of concepts in the world [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. In other words, an ontology can represent
knowledge about part of our world. For instance, an ontology about coffee can include different
types of coffee, coffee machines, types of preparation, etc.
        </p>
        <p>
          Creating ontologies is an extensive task, which involves the time of a domain expert and a
modeller and requires maintenance. Therefore, automatically creating ontologies or parts of
them has been a fruitful field of research. Multiple overviews have been published, for instance
earlier overviews based on rule-based approaches from Buitelaar et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] to recent surveys
including machine learning approaches from Khadir et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Buitelaar et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] give an
extensive overview of the field as it was until 2005. They divide the task into complexity levels:
starting with the learning of just terms and ending at the top with hierarchies, relations, and
finally rules. State-of-the-art techniques were rule-based and focused on lexico-syntactical
patterns. Such patterns could not consistently be identified, and the recall was low [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. For all
approaches on all levels, manual work was necessary to produce a coherent ontology.
        </p>
        <p>
          As statistical approaches gained in popularity due to the increase of computing power and
successful machine learning applications, the field changed. In their survey, Asim et al. [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]
include more recent statistical approaches such as co-occurrences, hierarchical clustering, and
briefly touch upon transforming ontological concepts and relations into vectors. They make the
distinction between linguistic and statistical approaches, and recognise the difference between
term extraction and relation extraction, with the latter being the more complex task [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
        <p>
          Recent advancements in ontology learning include the use of natural language processing
techniques for more efficient and scalable ontology learning [
          <xref ref-type="bibr" rid="ref11 ref13">13, 11</xref>
          ]. The term ontology learning
is used less, and the focus seems to have shifted to knowledge graphs. Information extraction
techniques are often combined, for example in the multi-tool Plumber [
          <xref ref-type="bibr" rid="ref1">1</xref>
], which tries to optimise
the combination of different approaches. Jaradeh et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] state that current approaches are not
viable for complete knowledge graph construction from unstructured text, because tasks such
as keyword extraction are not enough by themselves to produce a knowledge graph. However,
within the field of information extraction, the relation extraction task has been approached as
an end-to-end task, which might produce a knowledge graph with its combined relations. In the
next section, we discuss the field of information extraction and relevant tasks and techniques
for ontology learning and knowledge graph extraction.
        </p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Information Extraction Techniques</title>
        <p>
          The field of Information Extraction is closely associated with the extraction of knowledge
graphs. A specific area within information extraction dedicated to knowledge graphs is Open
Information Extraction (OpenIE) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. This technique involves generating triples, comprising
a subject, relation, and object, from textual data. Multiple techniques are often combined, in
a similar approach to Jaradeh et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] as described above. Information Extraction as a field
underwent a surge with the implementation of word embeddings [
          <xref ref-type="bibr" rid="ref15">15</xref>
]. These first models led
to more complex architectures such as long short-term memory models (LSTMs) and the current
state-of-the-art, transformers [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], of which BERT [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] is widely used for a variety of tasks.
        </p>
        <p>
          Currently, decoder-only models such as GPT [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] have gained prominence. They are known
as generative Large Language Models (LLMs), due to the vast amounts of texts and parameters
with which they are trained. They excel at absorbing and producing factual information in
natural text [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], with their main purpose being language generation. However, they also show
emergent behaviour on other tasks such as semantic entailment [
          <xref ref-type="bibr" rid="ref19">19</xref>
]. Although powerful, these models
show limitations such as factual inconsistency and hallucinations [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ].
        </p>
        <p>
          Node Extraction For Knowledge Graph Extraction, multiple techniques can be combined in
steps to produce a graph. A first step is the extraction of nodes. Named Entity Recognition,
where entities are identified in texts, can serve as an initial step in knowledge graph extraction,
although its application has been limited thus far [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. Similarly to NER, keyword extraction
can provide subjects and objects to the graph. A popular technique, TF-IDF, is based on
frequency analysis together with the uncommonness of a word [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. Recent approaches are
often based on language models which are trained on this task, such as the BERT-based method
KeyBERT [
          <xref ref-type="bibr" rid="ref17 ref23">17, 23</xref>
]. This approach demonstrates high performance in producing accurate
keywords for the respective texts.
        </p>
        <p>
          Relation Extraction Beyond extracting terms or concepts from text, determining the
relationships between these concepts is crucial for creating graphs. This task is often called
Relation Extraction. Similarly to early approaches of keyword extraction, early approaches of
relation extraction were based on frequencies. For knowledge graph extraction, such approaches
have been demonstrated by [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], with the side note that manual filtering and other steps are
necessary to produce a high quality graph. Nowadays, statistical approaches are prevalent, with
architectures such as graph LSTMs [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] and transformers.
        </p>
        <p>
          Relation extraction can be done on sentences, or on paragraphs or documents. On a document
level, this is still a challenging task, but the results might be more representative of the domain
and comparable to manual development because relations outside sentence boundaries are also
included [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]. Wang et al. [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] pre-train language models to recognise triples using relation
datasets, demonstrating state-of-the-art performance. However, with state-of-the-art F1 scores
reaching only 0.67 for models such as [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ], extracting relations on a document level is still a
challenging task with room for improvement.
        </p>
        <p>
          Extracting relations on a sentence level has been approached traditionally as a multi-step
problem, similarly to ontology learning approaches described above. Recently, more end-to-end
solutions have been proposed [
          <xref ref-type="bibr" rid="ref2 ref28">2, 28</xref>
          ]. Huguet Cabot and Navigli [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] introduce the REBEL model
and dataset, where an encoder-decoder transformer is trained on their dataset for relation
extraction. The distantly supervised dataset is created by expanding on T-REx [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ]. 220 types of
relations are extracted from Wikipedia abstracts, which are combined with extracted relations
from Wikipedia texts using a Natural Language Inference model. The REBEL model has shown
state-of-the-art performance on relation classification benchmarks. With the KnowGL model,
Rossiello et al. [
          <xref ref-type="bibr" rid="ref30">30</xref>
] extend the REBEL dataset by adding entity labels and types, with the goal
of generating a set of facts relevant for generating a knowledge graph.
        </p>
        <p>
          Early work on generative large language models for relation extraction shows potential.
Bakker et al. [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] perform a qualitative comparison of methods among which GPT-3.5 Turbo
and propose a multi-step approach. However, this work lacks a quantitative analysis. Wan et al.
[
          <xref ref-type="bibr" rid="ref32">32</xref>
          ] use a prompt engineering approach in a multi-step architecture for relation extraction and
demonstrate that such an approach shows promise for relation classification. Allen et al. [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ]
outline the diverse roles of LLMs in knowledge engineering, and propose research questions,
among which are questions regarding how LLMs can support the engineering of knowledge
systems. Our work implements and tests one possible answer to this question: the automatic
extraction of a knowledge graph from text.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Method</title>
<p>In this paper, we compare different extraction techniques with the goal of creating a knowledge
graph that represents the text in the way a knowledge graph developer would manually create it.
For this purpose, a small annotated dataset was created, which is described in Section 3.1. We
run a collection of relation extraction methods on the dataset; these are described in Section 3.2.
Finally, we evaluate their performance with the metrics described in Section 3.3.</p>
      <sec id="sec-4-1">
        <title>3.1. Dataset</title>
<p>We study a real-life use case and compare the performance of different relation extraction methods.
An example of a domain where knowledge graph extraction is valuable is the safety domain.
Keeping track of multiple sources of information is critical for effective decision-making and
response to emerging threats. News messages are an important source of information. By
extracting structured knowledge from news messages, safety organisations can swiftly identify
and analyse relevant entities and relations. Therefore, we chose to annotate a news message
that is relevant for this domain: the first news message about the Nord Stream pipeline incident
by Reuters (https://www.reuters.com/business/energy/pressure-defunct-nord-stream-2-pipeline-plunged-overnight-operator-2022-09-26).
On September 26, 2022, a series of underwater explosions and subsequent gas
leaks struck the two Nord Stream natural gas pipelines. Both pipelines were inactive due to the
Russian invasion of Ukraine. The leaks occurred in international waters, prompting separate
investigations by Denmark, Germany, and Sweden. As of January 2024, investigations continue,
with the explosions characterised as sabotage and the perpetrators yet to be officially identified.
The news message does not include details on the cause or involved parties.</p>
<p>This news message consists of two parts: 1) a report of the incident and involved parties, and
2) a collection of statements on the incident that the writer has gathered from involved parties.
Knowledge graphs are better suited to factual information; we therefore decided to include only
the first part of the news message as our baseline complex text. In addition to this complex
news message text, we created a simplified version of the text, containing only the key points
of the news message in simple sentences. We use this simplified text under the hypothesis that
relation extraction will be easier on it than on the original news message.</p>
<p>We annotated both texts with full triples, consisting of two nodes and the relation between
them, as shown in Figure 1. We used the following conditions: 1) each triple must be fully
stated in the text, 2) each triple must indicate factual information, and 3) each triple must adhere
to the (subject, relation, object) format. The first condition ensures that no common-sense
world knowledge is included, nor other information that is known to the annotator but cannot
be found in the text. The second condition excludes opinions or non-factual statements such as
‘Denmark’s energy agency gave a statement’. The final condition excludes statements about
statements, so no additional time or location information, which means the triple (operator,
disclosed, pressure drop) is extracted instead of (operator, (disclosed, pressure drop, on
Monday)). The constructed baseline knowledge graphs can be seen in Figures 2 and 3. The full
news message, the complex and simple texts, the complex and simple triples, and the graph
visualisations can all be found in our repository.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Methods</title>
<p>In this section, we describe the different methods we apply to the news message and its
simplified version described above. We test all methods on sentence level and on document
level. These methods each have their strengths and weaknesses, which can vary depending
on the applied use case. An overview of the methods and their properties is given in Table 1.
Each method is compared based on the type of information it can extract (Extracts), whether
it can generate additional type information (Type info), whether weights are included in its
output (Weights), whether its output follows a formal standard (Standard output), whether
external knowledge outside of the document input is provided in the output or just information
strictly from the given text (External info), and the model it is based on (Base model).
Each method and its parameters are described below.</p>
<p>
          KeyBERT In creating a knowledge graph, selecting informative and representative nodes
is crucial. Therefore, we implemented KeyBERT [<xref ref-type="bibr" rid="ref23">23</xref>], which is based on BERT [<xref ref-type="bibr" rid="ref17">17</xref>]. We set
the parameters to a top-n of 15 and a diversity of 0.5, along with Maximal Marginal Relevance
(MMR) to ensure a balance between similarity and diversity in the extracted keywords. MMR
minimises redundancy and maximises diversity by iteratively selecting keywords similar to the
document, enhancing the quality of keywords for graph construction.
        </p>
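        <p>
          As an illustration, a minimal sketch of this configuration with the keybert package is given below; the underlying embedding model and any preprocessing are assumptions, not necessarily our exact setup.
        </p>
        <preformat>
# Sketch of the KeyBERT configuration described above (top-n of 15,
# diversity of 0.5, MMR re-ranking). The default embedding model is
# an assumption; any BERT-based sentence encoder can be plugged in.
from keybert import KeyBERT

kw_model = KeyBERT()

def extract_nodes(text):
    # Returns (keyword, score) pairs that serve as candidate nodes.
    return kw_model.extract_keywords(
        text,
        top_n=15,       # number of keywords to return
        use_mmr=True,   # Maximal Marginal Relevance re-ranking
        diversity=0.5,  # balance similarity against diversity
    )
        </preformat>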
        <p>
          Co-occurrences (COOC) While useful in a pipeline because of its high-quality concepts,
keyword extraction does not provide us with triples; there are no relations between the nodes.
For extracting triples, we implemented several approaches. Firstly, we implemented a
co-occurrence algorithm as a baseline, similarly to previous work by De Boer and Verhoosel [<xref ref-type="bibr" rid="ref24">24</xref>]
and Bakker et al. [<xref ref-type="bibr" rid="ref34">34</xref>]. The algorithm works by analysing the frequency with which words
appear together within a sentence or document. It is suitable as a baseline method because of its
simplicity and interpretability. We set the maximum distance for words that occur together to 5.
Further, a threshold of 0.9 is used to define the minimum number of times a word pair must occur.
        </p>
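        <p>
          The sketch below illustrates such a baseline; the whitespace tokenisation and the reading of the 0.9 threshold as a cut-off relative to the most frequent pair are simplifying assumptions for illustration.
        </p>
        <preformat>
# Co-occurrence baseline: link words that appear within a window of
# 5 tokens of each other, and keep only pairs whose frequency
# (relative to the most frequent pair) meets the threshold.
from collections import Counter

def cooc_links(text, window=5, threshold=0.9):
    tokens = text.lower().split()
    counts = Counter()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if word != other:
                counts[tuple(sorted((word, other)))] += 1
    if not counts:
        return []
    top = max(counts.values())
    # The resulting links are unlabelled: the 'cooc' relation carries
    # no semantic value, as noted in the qualitative analysis.
    return [(a, "cooc", b) for (a, b), c in counts.items()
            if c / top >= threshold]
        </preformat>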
        <p>
          REBEL and KnowGL We also implemented the REBEL model [<xref ref-type="bibr" rid="ref2">2</xref>]. The REBEL model is
specifically designed for extracting relation triplets from raw text and is built upon the BART
Transformer model [35]. We ran the REBEL model with both the number of beams and
the number of return sequences set to 5 on sentence level, and both set to 30 on document
level. KnowGL [<xref ref-type="bibr" rid="ref30">30</xref>] is an extension of REBEL with entity types. We set the KnowGL parameters
to the same values as for REBEL.
        </p>
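        <p>
          The sketch below shows how such a model can be run with these generation settings via the Hugging Face transformers pipeline; the model identifier follows the public REBEL release, and decoding the generated strings into triples (which follows the model card) is omitted for brevity.
        </p>
        <preformat>
# Sketch of running REBEL with the sentence-level settings above
# (beams and return sequences both 5; both 30 for document level).
# Parsing the generated strings into (subject, relation, object)
# triples follows the REBEL model card and is omitted here.
from transformers import pipeline

extractor = pipeline(
    "text2text-generation",
    model="Babelscape/rebel-large",
    tokenizer="Babelscape/rebel-large",
)

def rebel_generate(text, n=5):
    outputs = extractor(text, num_beams=n, num_return_sequences=n)
    return [o["generated_text"] for o in outputs]
        </preformat>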
        <p>
          GPT-3.5 Turbo and GPT-4 Alternatively, we implemented relation extraction using two
generative LLMs. Previous approaches using these models for relation extraction showed
promise [
          <xref ref-type="bibr" rid="ref31 ref32">31, 32</xref>
          ], but no extensive analysis has been performed yet. Two models from the GPT
family were tested: GPT-3.5 Turbo [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] and GPT-4 [36]. We selected these models due to their
superior performance on most tasks and their availability. We queried both models with the
following prompt: “Take the following document: [text], Extract all relations in this text to a
graph. The graph format must be in JSON, with nodes and edges. Make sure to include the three
parts of the ‘subject’, ‘object’ and ‘relation’ triple for each relation you find. Think carefully before
you answer.” The temperature was kept at the low setting of 0.3 to prevent hallucinations of
relations that cannot be found in the text.
        </p>
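        <p>
          A sketch of this setup with the OpenAI Python client is given below; the client version and model identifier strings are assumptions, while the prompt and temperature follow the description above.
        </p>
        <preformat>
# Sketch of querying a GPT model for relation extraction with the
# prompt described above and a low temperature of 0.3.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Take the following document: {text}, Extract all relations in "
    "this text to a graph. The graph format must be in JSON, with "
    "nodes and edges. Make sure to include the three parts of the "
    "'subject', 'object' and 'relation' triple for each relation you "
    "find. Think carefully before you answer."
)

def extract_graph(text, model="gpt-3.5-turbo"):
    response = client.chat.completions.create(
        model=model,
        temperature=0.3,  # limit hallucinated relations
        messages=[{"role": "user", "content": PROMPT.format(text=text)}],
    )
    return response.choices[0].message.content
        </preformat>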
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Evaluation</title>
<p>To evaluate the performance of the above methods on extracting nodes and triples from our dataset,
we use precision, recall, and the F1 score, commonly used metrics for information retrieval
tasks. Correctness of nodes and triples is assessed manually against the dataset, and nodes/triples
that are semantically identical to the baseline are counted as correct (e.g., ‘the pipeline’ instead
of ‘pipeline’ is approved, yet ‘Danish’ instead of ‘Danish Area’ is disapproved). A triple is
only evaluated as correct when both the nodes and the relation are correct.</p>
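        <p>
          For clarity, the sketch below shows the bookkeeping behind these scores once a human judge has counted the matches; the function name is illustrative, not from our codebase.
        </p>
        <preformat>
# Standard precision/recall/F1 from manually judged counts:
# num_correct extracted items match the annotation, num_extracted
# were produced by the method, num_gold are in the annotated baseline.
def prf1(num_correct, num_extracted, num_gold):
    precision = num_correct / num_extracted if num_extracted else 0.0
    recall = num_correct / num_gold if num_gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
        </preformat>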
<p>Additionally, we measure the average Clustering Coefficient (CC) [37] of each of the graphs.
The CC is based on the density of the neighbourhood surrounding each node of the graph,
and is calculated by counting for each node the number of triples it is either a subject or object
of, divided by the total number of possible triples (n × (n − 1), with n the total number of nodes). With
this metric, we measure the completeness of the graph. As discussed by Guéret et al. [37], the
aim of knowledge graphs should not be to be fully complete (CC = 1), as most links would
be meaningless, but a high clustering coefficient indicates a well-cohesive graph. Finally, we
offer a brief qualitative analysis highlighting distinctive qualities of various methods from the
perspective of a graph developer.
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Results</title>
<p>In this section, we state the results from the methods described in Section 3.2 on the dataset
as discussed in Section 3.1, based on the metrics described in Section 3.3. First, we describe
the performance on node level, where only the extracted entities are evaluated (Section 4.1).
Second, we present the results evaluated on the triple level, considering the full extracted triples
(Section 4.2). Third, we compare the extracted graphs on their density (Section 4.3). Finally, we
provide a short qualitative evaluation of observations we made during evaluation (Section 4.4).</p>
      <sec id="sec-5-0">
        <title>4.1. Nodes</title>
        <p>The node extraction performance of the methods is shown in Tables 2 and 3. As shown in Table
2, performance on the simple text is overall high, with GPT-3.5 Turbo on sentence level scoring
an F1 of 1, matching the ground truth perfectly. While REBEL on document level scores
lowest, on sentence level REBEL scores a perfect recall and high precision. As seen in Table 3,
on the complex text, the highest precision is scored by KeyBERT on document level. The highest
recall and F1-measure are achieved by GPT-3.5 Turbo on sentence level. On the complex text, F1 and
recall scores are lower on document level in all cases except for co-occurrences. No such
pattern can be observed for precision.</p>
      </sec>
      <sec id="sec-5-1">
        <title>4.2. Triples</title>
        <p>The triple extraction performance of the methods is shown in Tables 4 and 5. Note that due
to KeyBERT only providing entities and COOC only providing linked entities (see Table 1),
both methods score 0 on all metrics and are left out. Table 4 shows that both GPT-3.5 Turbo on
sentence level and GPT-4 on document level score a perfect precision on the simple text, with
GPT-3.5 scoring a perfect recall and F1-measure as well. KnowGL scores the lowest F1-measure
on the simple text. On the complex text, as seen in Table 5, GPT-3.5 Turbo scores the highest
F1-measure on sentence level, yet GPT-4 scores the highest precision on document level and
recall on sentence level. KnowGL on document level notably extracted no correct triples. Recall
on document level was again lower for all methods than on sentence level.</p>
      </sec>
      <sec id="sec-5-2">
        <title>4.3. Graph Density</title>
<p>The density of the graphs extracted by the methods is shown in Figures 4 and 5, based on
the average clustering coefficient (CC) metric (see Section 3.3). KeyBERT does not produce
relations, and therefore its density cannot be calculated. The density of the baseline knowledge
graphs is shown as a line, which is CC = 0.054 for the simple text and CC = 0.00085 for the complex
text. As seen in Figure 4, on the simple text, KnowGL on document level and REBEL both
score highest, and COOC on sentence level has the lowest density score. The density of GPT-3.5
Turbo’s graph is closest to the baseline’s density. Results for the complex text can be seen in
Figure 5. Because on document level REBEL and KnowGL produce outlying high density scores
(CC = 0.061 for KnowGL and CC = 0.027 for REBEL), both are left out of the figure. On
the complex text, REBEL and KnowGL score higher than the baseline, while GPT-4’s density on
document level is closest to the baseline density.</p>
      </sec>
      <sec id="sec-5-3">
        <title>4.4. Qualitative Analysis</title>
<p>Co-occurrences (COOC) COOC generates odd results from a knowledge graph developer’s
standpoint. It only outputs unigrams, so many full entities are missed. For example, ‘Danish’,
‘maritime’ and ‘authorities’ are extracted instead of ‘Danish maritime authorities’. Furthermore,
many verbs are incorrectly found as entities (e.g., ‘declined’, ‘ran’). Also, because it only provides
unlabelled links, no triple is strictly incorrect: no semantic value is asserted when a connection
between subject and object does exist. For example, ‘st, cooc, petersburg’ holds because St. Petersburg
exists as a city, and ‘pressure, cooc, pipeline’ holds because pipeline and pressure are connected,
as it is the pipeline’s pressure.</p>
<p>KeyBERT Because KeyBERT provides no relations at all, it sometimes picks up as entities
what would ideally be considered the relation of a triple. For example, the simple text
baseline includes ‘Europe, tension, Moscow’ as a (subject, relation, object) triple, yet KeyBERT
provides ‘tension’ as an entity.</p>
        <p>REBEL and KnowGL Where the results of the COOC method entail too little semantic value,
REBEL and KnowGL often provide too much additional information. For example, KnowGL
includes world knowledge, providing triples such as ‘Germany, shares border with, Denmark’
or ‘St. Petersburg, located in or next to body of water, Baltic Sea’. While including this world
knowledge can be useful to create well-connected knowledge graphs, it is rather a knowledge
graph extension step with additional information outside of the text. This is less effective when
generated triples are slightly incorrect or non-informative, such as ‘Danish, located in or next
to body of water, Danish area’.</p>
<p>However, as opposed to other methods, KnowGL does include symmetric relations (e.g.,
providing both ‘Nord Stream AG, owner of, website’ and ‘website, owned by, Nord Stream AG’), which
from the perspective of knowledge graph development is an informative feature, although a
proper schema might be able to infer them. Because REBEL classifies the text into the relations it
was trained on, the relation part of the triples is often semantically slightly incorrect, such as
‘Nord Stream 2, product or material produced, gas’ instead of ‘Nord Stream 2 pipeline, is filled
with, gas’.</p>
        <p>GPT The GPT models both often provided nodes made out of multiple entities, sometimes
even providing entire subclauses as nodes, such as ‘currently not known what had caused the
pressure drop’ or ‘leak today occurred on the nord stream 2 pipeline in the danish area’. Keeping
the goal of knowledge graph development in mind, smaller connected entities are preferable.
However, this is also a strength of this approach, as the baseline knowledge graph includes
occasional long entities (e.g., ‘German network regulator president’), which are only picked up
by GPT.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Discussion</title>
      <p>In the process of annotating the news message and its simplified version, we tried to adhere
to the text as much as possible. However, this approach has its limitations, as it precludes the
inclusion of details that a graph developer might typically incorporate into the model. For
example, by adhering to our restrictions (see Section 3.1) we did not include world knowledge
facts and meta-triples which abstract over what is in the text. During annotation, we made
some abstractions but chose to leave them out of the dataset, due to the difficulty in objectively
determining which additional information should be incorporated. This is always a limitation
of manually creating knowledge graphs, as graph developers are influenced by their world and
domain knowledge when deciding what is relevant to include. For extracted knowledge graphs,
flattened results without abstractions are often acceptable, but for an ontology, abstractions
and external world knowledge are essential. Some methods do have such additional knowledge.
During evaluation, we noticed that KnowGL, and to lesser extent REBEL, include additional
triples such as ‘St. Petersburg is located in Russia’. According to our annotated data, this is
not counted as a correct triple. However, such additional triples containing external world
knowledge might be valuable and desirable, depending on the use case.</p>
<p>In the results, all methods on the complex text demonstrated higher recall scores on
sentence-level extraction. This makes sense, as more triples are extracted when methods are run on
sentence level. Triple counts are included in our repository. The complex text yielded much
lower performance scores than the simple text. While to be expected, this is an indication
that the effectiveness of relation extraction is influenced by the complexity of the text. For node
extraction, a similar pattern could be observed. Scores were also influenced by the fact that
most methods are suitable for relation extraction, not necessarily node extraction, whereas
KeyBERT scores high on the simple text, and has a high precision on the complex text. However,
we are not necessarily interested in separate entities; they have to be relevant for forming a
knowledge graph. This is reflected in the qualitative results, where KeyBERT identifies ‘tension’
as an entity, where ideally this is a relation. The GPT models output the nodes separately from
the relations, resulting in some nodes that are not part of triples. Similarly to KeyBERT, from a
graph developer’s perspective this is not desirable. An advantage of KeyBERT and of all relation
extraction methods except GPT is that, depending on the method, parameters can be set, such
as the number of results to be extracted, giving more control over the results and making
such methods more suitable for pipeline usage. Overall, the results from the GPT models were
of decent quality, as can be seen from their high scores. However, the output content and its
format differed per run and required manual post-processing to evaluate the results. This makes
this method less suitable for use in a pipeline. The output consistency can be influenced to a
certain extent by utilising few-shot learning and/or function calling. Further research into these
consistency methods would make a valuable addition to our work.</p>
<p>The graph density results indicate that REBEL and KnowGL are able to generate high-density
knowledge graphs, which might be desirable depending on the use case. However, GPT-3.5
Turbo performs closest to the baseline density on simpler texts, while GPT-4 performs closest
on complex texts. Note that with the complex text, the graphs tend to have very low CC
scores overall: the larger and more complicated the text, the more possible nodes, and
thus the denominator of the formula grows quadratically. While having denser graphs as
results might not always be better, it is an interesting metric, as there might be use cases where
one would want a highly connected knowledge graph, or have graphs meet a threshold of
connectivity, for which REBEL and KnowGL might be more useful than GPT.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion and Future Work</title>
<p>In this paper, we created a small annotated dataset containing different versions of a news
message annotated with triples, and provided a comprehensive comparison of relation extraction
methods within the context of this dataset. Our primary objective was to assess relation
extraction methods on a real-life use case scenario, where the resulting graph should reflect
a manually created graph. Our results indicate that the generative Large Language Models
(LLMs) GPT-3.5 Turbo and GPT-4 outperform the other relation extraction methods we tested.
However, in the qualitative analysis performed in addition to the evaluation metrics, we noted
that alternative approaches like REBEL and KnowGL exhibit strengths in leveraging external
world knowledge to enrich the graph beyond the textual content alone.</p>
      <p>For future work, a comparison of a broader range of generative LLMs on this task would be a
promising direction, since our results show their superior performance for knowledge graph
extraction. Another future avenue lies in exploring alternative methods for evaluating the quality
and performance of extracted knowledge graphs, since standard metrics do not convey all
qualitative aspects of an extracted graph. Additionally, our results show that on complex texts,
performance of all methods is lacking. We treated this as an end-to-end task, but improvements
might be made by including the methods in a pipeline with pre- and post-processing steps.
Finally, extending this work with a larger and more diverse dataset of knowledge graphs based
on real-world text would offer a valuable opportunity to evaluate and fine-tune language models
on this task. Looking forward, such avenues offer significant potential to further advance the
field of knowledge graph extraction.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
<p>We extend our gratitude to the NATO Science &amp; Technology Organization Centre for Maritime
Research &amp; Experimentation (CMRE) for providing us with a practical use case, and we extend
special thanks to Dr. Pawel Kowalski for his guidance in the maritime domain and previous work.
Additionally, we would like to thank the DEFRAUDify, TrustLLM, and RVO Causal Relations in
Behavioural Modelling projects for their support.</p>
      <p>[35] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, L. Zettlemoyer, BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension, arXiv preprint arXiv:1910.13461 (2019).</p>
      <p>[36] OpenAI, GPT-4 technical report, 2023. arXiv:2303.08774.</p>
      <p>[37] C. Guéret, P. Groth, C. Stadler, J. Lehmann, Assessing linked data mappings using network measures, in: The Semantic Web: Research and Applications: 9th Extended Semantic Web Conference, ESWC 2012, Heraklion, Crete, Greece, May 27-31, 2012. Proceedings 9, Springer, 2012, pp. 87–102.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Jaradeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stocker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Both</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <article-title>Information extraction pipelines for knowledge graphs</article-title>
          ,
          <source>Knowledge and Information Systems</source>
          <volume>65</volume>
          (
          <year>2023</year>
          )
          <fpage>1989</fpage>
          -
          <lpage>2016</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
<string-name>
            <given-names>P.-L.</given-names>
            <surname>Huguet Cabot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          , REBEL:
          <article-title>Relation extraction by end-to-end language generation, in: Findings of the Association for Computational Linguistics: EMNLP 2021, Association for Computational Linguistics</article-title>
          ,
          <year>2021</year>
          , pp.
          <fpage>2370</fpage>
          -
          <lpage>2381</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Feigenbaum</surname>
          </string-name>
          ,
          <article-title>The art of artificial intelligence: themes and case studies of knowledge engineering</article-title>
          ,
          <source>in: Proceedings of the 5th International Joint Conference on Artificial Intelligence -</source>
          Volume
          <volume>2</volume>
, IJCAI'77
          , Morgan Kaufmann Publishers Inc.,
          <year>1977</year>
          , p.
          <fpage>1014</fpage>
          -
          <lpage>1029</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Bakker</surname>
          </string-name>
          ,
          <article-title>Knowledge Graphs: representation and structuring of scientific knowledge</article-title>
          ,
          <source>Ph.D. thesis</source>
          , University of Twente,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Singhal</surname>
          </string-name>
          ,
          <article-title>Introducing the knowledge graph: things, not strings</article-title>
          ,
          <year>2012</year>
. URL: https://blog.google/products/search/introducing-knowledge-graph-things-not/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Khadir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Aliane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Guessoum</surname>
          </string-name>
          ,
          <article-title>Ontology learning: Grand tour and challenges</article-title>
          ,
          <source>Computer Science Review</source>
          <volume>39</volume>
          (
          <year>2021</year>
          )
          <fpage>100339</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <article-title>A survey on application of knowledge graph</article-title>
          ,
          <source>in: Journal of Physics: Conference Series</source>
          , volume
          <volume>1487</volume>
          ,
IOP Publishing
          ,
          <year>2020</year>
          , p.
          <fpage>012016</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F. N.</given-names>
            <surname>AL-Aswadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. Y.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. H.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <article-title>From ontology to knowledge graph trend: Ontology as foundation layer for knowledge graph</article-title>
          ,
          <source>in: Iberoamerican Knowledge Graphs and Semantic Web Conference</source>
          , Springer,
          <year>2022</year>
          , pp.
          <fpage>330</fpage>
          -
          <lpage>340</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Studer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. R.</given-names>
            <surname>Benjamins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fensel</surname>
          </string-name>
          ,
          <article-title>Knowledge engineering: Principles and methods</article-title>
          ,
          <source>Data &amp; knowledge engineering 25</source>
          (
          <year>1998</year>
          )
          <fpage>161</fpage>
          -
          <lpage>197</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Buitelaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Magnini</surname>
          </string-name>
          ,
          <article-title>Ontology learning from text: methods, evaluation and applications</article-title>
          , volume
          <volume>123</volume>
          , IOS press,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Khadir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Aliane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Guessoum</surname>
          </string-name>
          ,
          <article-title>Ontology learning: Grand tour and challenges</article-title>
          ,
          <source>Computer Science Review</source>
          <volume>39</volume>
          (
          <year>2021</year>
          )
          <fpage>100339</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. N.</given-names>
            <surname>Asim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wasim</surname>
          </string-name>
          , M. U. G. Khan,
          <string-name>
            <given-names>W.</given-names>
            <surname>Mahmood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Abbasi</surname>
          </string-name>
          ,
          <article-title>A survey of ontology learning techniques and applications</article-title>
          , Database,
          <source>The Journal of Biological Databases and Curation</source>
          <year>2018</year>
          (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>24</lpage>
. doi:10.1093/database/bay101.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hari</surname>
          </string-name>
          , P. Kumar,
          <article-title>WSD based ontology learning from unstructured text using transformer</article-title>
          ,
          <source>Procedia Computer Science</source>
          <volume>218</volume>
          (
          <year>2023</year>
          )
          <fpage>367</fpage>
          -
          <lpage>374</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Niklaus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cetto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Freitas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Handschuh</surname>
          </string-name>
          ,
          <article-title>A survey on open information extraction</article-title>
, arXiv preprint arXiv:1806.05599 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , I. Sutskever,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>26</volume>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <source>in: NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>5998</fpage>
          -
          <lpage>6008</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>T.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ryder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Subbiah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neelakantan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shyam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sastry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Askell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Herbert-Voss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Krueger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Henighan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ziegler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Winter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <article-title>Language models are few-shot learners</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>1877</fpage>
          -
          <lpage>1901</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bommasani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Raffel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zoph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Borgeaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yogatama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bosma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Metzler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. H.</given-names>
            <surname>Chi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hashimoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Vinyals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Fedus</surname>
          </string-name>
          ,
          <article-title>Emergent abilities of large language models</article-title>
          ,
          <year>2022</year>
          . arXiv:2206.07682.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Frieske</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ishii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. J.</given-names>
            <surname>Bang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Madotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fung</surname>
          </string-name>
          ,
          <article-title>Survey of hallucination in natural language generation</article-title>
          ,
          <source>ACM Computing Surveys</source>
          <volume>55</volume>
          (
          <year>2023</year>
          ). doi:10.1145/3571730.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>T.</given-names>
            <surname>Al-Moslmi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Ocaña</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Opdahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Veres</surname>
          </string-name>
          ,
          <article-title>Named entity extraction for knowledge graphs: A literature overview</article-title>
          ,
          <source>IEEE Access</source>
          <volume>8</volume>
          (
          <year>2020</year>
          )
          <fpage>32862</fpage>
          -
          <lpage>32881</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>G.</given-names>
            <surname>Salton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-S.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. T.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>Contribution to the theory of indexing</article-title>
          ,
          <source>Technical Report</source>
          , Cornell University,
          <year>1973</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Grootendorst</surname>
          </string-name>
          ,
          <article-title>Keyword extraction with BERT</article-title>
          ,
          <year>2020</year>
          . URL: https://towardsdatascience.com/keyword-extraction-with-bert-724efca412ea, accessed on: 07-03-2024.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M.</given-names>
            <surname>De Boer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Verhoosel</surname>
          </string-name>
          ,
          <article-title>Creating and evaluating data-driven ontologies</article-title>
          ,
          <source>International Journal on Advances in Software</source>
          <volume>12</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>N.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Poon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Quirk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-t.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <article-title>Cross-sentence n-ary relation extraction with graph LSTMs</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>5</volume>
          (
          <year>2017</year>
          )
          <fpage>101</fpage>
          -
          <lpage>115</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>DeepStruct: Pretraining of language models for structure prediction</article-title>
          ,
          <source>arXiv preprint arXiv:2205.10475</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Okazaki</surname>
          </string-name>
          ,
          <article-title>DREEAM: Guiding attention with evidence for improving document-level relation extraction</article-title>
          ,
          <source>arXiv preprint arXiv:2302.08675</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>Two are better than one: Joint entity and relation extraction with table-sequence encoders</article-title>
          ,
          <source>arXiv preprint arXiv:2010.03851</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>H.</given-names>
            <surname>Elsahar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vougiouklis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Remaci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gravier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hare</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Laforest</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Simperl</surname>
          </string-name>
          ,
          <article-title>T-REx: A large scale alignment of natural language with knowledge base triples</article-title>
          ,
          <source>in: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>G.</given-names>
            <surname>Rossiello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F. M.</given-names>
            <surname>Chowdhury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mihindukulasooriya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Cornec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Gliozzo</surname>
          </string-name>
          ,
          <article-title>KnowGL: Knowledge generation and linking from text</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>37</volume>
          ,
          <year>2023</year>
          , pp.
          <fpage>16476</fpage>
          -
          <lpage>16478</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Bakker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. J.</given-names>
            <surname>Kalkman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Tolios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Blok</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Veldhuis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Raaijmakers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>de Boer</surname>
          </string-name>
          ,
          <article-title>Exploring knowledge extraction techniques for system dynamics modelling: Comparative analysis and considerations</article-title>
          ,
          <source>in: Proceedings of the Benelux Conference on Artificial Intelligence (BNAIC)</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kurohashi</surname>
          </string-name>
          ,
          <article-title>GPT-RE: In-context learning for relation extraction using large language models</article-title>
          ,
          <source>arXiv preprint arXiv:2305.02105</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>B. P.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Stork</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Groth</surname>
          </string-name>
          ,
          <article-title>Knowledge engineering using large language models</article-title>
          (
          <year>2023</year>
          ). arXiv:2310.00637.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Bakker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>de Boer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Meyer-Vitali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Bakker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Raaijmakers</surname>
          </string-name>
          ,
          <article-title>A hybrid approach for creating knowledge graphs: Recognizing emerging technologies in Dutch companies</article-title>
          ,
          <source>HHAI2022: Augmenting Human Intellect</source>
          (
          <year>2022</year>
          )
          <fpage>307</fpage>
          -
          <lpage>309</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>