<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Models as Autonomous Agents for Semantic Triple Extraction from Unstructured Text</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ananya Ananya</string-name>
          <email>ananyah@iitbhilai.ac.in</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sanju Tiwari</string-name>
          <email>tiwarisanju18@ieee.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nandana Mihindukulasooriya</string-name>
          <email>nandana@ibm.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tommaso Soru</string-name>
          <email>tom@tommaso-soru.it</email>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ziwei Xu</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Diego Moussallem</string-name>
          <email>diegomoussallem@gmail.com</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>BVICAM</institution>
          ,
          <addr-line>New Delhi</addr-line>
          ,
          <country country="IN">India &amp;</country>
          <addr-line>UAT</addr-line>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>IBM Research</institution>
          ,
          <addr-line>New York</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Indian Institute of Technology</institution>
          ,
          <addr-line>Bhilai</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Jusbrasil</institution>
          ,
          <addr-line>Salvador</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>National Institute of Advanced Industrial Science and Technology</institution>
          ,
          <addr-line>Tokyo</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>Serendipity AI Ltd</institution>
          ,
          <addr-line>London</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The use of Large Language Models as autonomous agents interacting with tools has been shown to improve performance on several tasks, from code generation to API calling and sequencing. This paper proposes a framework for using Large Language Models as autonomous agents for the task of Knowledge Graph construction from unstructured text. Specifically, it focuses on triple extraction, which involves identifying entities and their relationships from text to construct a Knowledge Graph. Our novel framework, “Auto-KG Agent”, incorporates two relation extraction tools, REBEL and KnowGL, in conjunction with Large Language Models. Experimental results on the CONLL04 dataset demonstrate that while multi-tool approaches face challenges such as hallucination, LLM-based agents show promise in mitigating biases, identifying major events, and handling negations and modalities, thus enhancing extraction accuracy, particularly for complex linguistic structures. The impetus for this research is to overcome the current limitations of existing systems for Knowledge Graph construction and to propose a roadmap for developing a robust framework capable of handling the intricacies of natural language with minimal human interference. The paper also discusses future directions, such as emulating Large Language Model training using reinforcement learning with human feedback, incorporating query decomposition, and integrating a re-ranking module. Through this research, the authors aim to set a new direction for future endeavours in building advanced, reliable systems for knowledge extraction. Overall, this work highlights the potential of LLM-based agents for knowledge graph construction and proposes a framework for harnessing their capabilities.</p>
      </abstract>
      <kwd-group>
        <kwd>Triple extraction</kwd>
        <kwd>Knowledge Graph</kwd>
        <kwd>Knowledge Graph Construction</kwd>
        <kwd>LLM Agents</kwd>
        <kwd>Reasoning</kwd>
        <kwd>Handling modalities and negations</kwd>
        <kwd>Mitigating biases</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        The advent of Knowledge Graphs has revolutionized the way we represent and utilize
information in the digital age. By structuring data as triples—consisting of a head entity, a relationship,
and a tail entity (h, r, t) — Knowledge Graphs provide a semantic framework to describe the
varied and countless entities and their interrelations in the objective world. This structured
approach to data organization underpins intelligent applications and has garnered significant
attention in both academic and industrial spheres due to its potent semantic processing capabilities
and open organizational structure [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
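      <p>As a minimal illustration (the facts below are invented examples, not drawn from any dataset), the (h, r, t) structure maps directly onto code:</p>

```python
# A tiny Knowledge Graph as a set of (head, relation, tail) triples.
# The example facts are illustrative only.
triples = {
    ("Barack Obama", "born in", "Honolulu"),
    ("Honolulu", "located in", "Hawaii"),
    ("Barack Obama", "spouse", "Michelle Obama"),
}

def tails(graph, head, relation):
    """Return every tail entity reachable from `head` via `relation`."""
    return {t for (h, r, t) in graph if h == head and r == relation}
```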
      <p>
        In the field of Natural Language Processing, extracting relational facts from text is crucial.
Understanding the semantic relationships between entities in unstructured text helps convert
raw data into structured formats. This structured data is extremely valuable for several tasks,
such as building and enhancing Knowledge Bases. These bases are essential for powering
applications that rely on knowledge [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        In the realm of information extraction, frameworks like REBEL [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and KnowGL [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] have
emerged as powerful tools for converting unstructured text into structured relational data. These
frameworks leverage the advancements in machine learning and natural language processing to
perform tasks that traditionally required separate models for Named Entity Recognition (NER)
and Relation Classification (RC). REBEL, which stands for Relation Extraction By End-to-end
Language Generation, utilizes an autoregressive sequence-to-sequence (seq2seq) model,
specifically a BART-large model, to extract relationships between entities in a text. The architecture of
REBEL is designed to represent relations as a linearized sequence that includes entity mentions,
labels, types, and the relation label. Similarly, KnowGL is a comprehensive framework that
aims to transform natural language text into structured data that aligns with the schema of a
Knowledge Graph like Wikidata. KnowGL consists of three main components: “Knowledge
Generation”, “Fact Ranking”, and “Linking to Wikidata”. We focus on the Knowledge Generation
component of the KnowGL framework. The Knowledge Generation component uses fine-tuned,
pre-trained language models to identify entity mentions and generate facts, including entity
labels, types, and relationships.
      </p>
      <p>Despite their innovative approaches, REBEL and KnowGL have certain limitations. The
performance of these models is heavily influenced by the quality of their pre-training data.
Biases or inaccuracies present in the training datasets can propagate through the models,
affecting the accuracy of the extracted relations. Furthermore, the ability of these frameworks to
generalize across various domains and text types hinges on the extent to which the pre-trained
language models are fine-tuned or further pre-trained on domain-specific datasets. While Large
Language Models inherently possess a broad understanding of language, their performance
in specialized contexts improves significantly with targeted fine-tuning. Additionally, these
systems may struggle with complex sentence structures and fail to identify all relevant major
events, particularly in sentences laden with modalities or multiple clauses.</p>
      <p>To address these challenges, we introduce a novel framework that synergizes the capabilities
of REBEL and KnowGL with the nuanced understanding of Large Language Models. Large
Language Models demonstrate remarkable efficacy in processing sentences with modalities
and complex structures, leading to accurate event identification and triple extraction. This
integration not only enhances event detection but also aids in mitigating biases inherent in
training data, ensuring a more comprehensive extraction of triples. Apart from introducing a
novel framework, we also aim to answer the following research questions:
RQ1: How effective are Large Language Models in mitigating biases, present in the datasets
used to train information extraction tools, when extracting triples?
RQ2: To what extent do Large Language Models accurately handle modalities and negations in
natural language, and how does this capability affect the quality of triple extraction?
RQ3: Can Large Language Models enhance the identification of events within unstructured
text, thereby improving the accuracy and completeness of triple extraction?
RQ4: How well do Large Language Models generalize across different datasets without the
need for extensive training or fine-tuning, particularly in the context of triple extraction
for knowledge graph construction?
RQ5: What is the impact of using multiple tools versus a single tool on the performance of
triple extraction?</p>
      <p>The impetus for our research is to overcome the current limitations of existing systems and
chart a course for the development of a robust framework capable of handling the intricacies of
natural language. Our experiments are designed with this objective in mind and are conducted
with the resources available to us. Through this research, we aim to set a new direction for
future endeavors in building advanced, reliable systems for knowledge extraction and reasoning.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>
        Using Large Language Models as autonomous agents has become increasingly popular in recent
research. Large Language Models possess advanced reasoning abilities and skills in utilizing
tools, making them well-suited for autonomous operations. They excel in tasks like acquiring
knowledge, understanding instructions, generalizing information, planning, and reasoning,
showcasing their potential for autonomous tasks [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. However, Large Language Models do
have limitations, such as performing arithmetic operations and staying updated with the latest
information, which cannot be fully addressed through simple fine-tuning alone. This highlights
the need for designing autonomous agent frameworks that can complement LLMs by integrating
external data and supplementary tools [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        This section covers existing studies on LLM-based agents for Knowledge Graph
generation from text. Jiang et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] introduced KG-Agent, an autonomous framework based
on Large Language Models, designed to enable a small Large Language Model to independently
make decisions throughout the reasoning process over Knowledge Graphs until completion.
Within KG-Agent, a Large Language Model is combined with a versatile toolbox, a knowledge
memory system and a KG-based executor. Jiang et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] also introduced StructGPT, an
Iterative Reading-then-Reasoning (IRR) framework aimed at addressing question-answering
tasks over structured data. In this framework, specialized interfaces are designed to
acquire relevant information from structured data, allowing Large Language Models to
focus on the reasoning task based on the acquired information. This research also introduced
an invoking-linearization-generation procedure to support Large Language Models in
reasoning over structured data with the help of the provided interfaces. Zhu et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] explore
Large Language Models for Knowledge Graph construction with reasoning and introduced
an innovative approach called AutoKG, which utilizes multiple agents to efficiently handle
both Knowledge Graph construction and reasoning tasks. Several such LLM-based agents
are shown in Table 1.
      </p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>LLM-based agents, their methodologies, descriptions, tasks and evaluation datasets.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Methodology</th><th>Description</th><th>Tasks</th><th>Dataset</th></tr>
          </thead>
          <tbody>
            <tr><td>Tool Learning [<xref ref-type="bibr" rid="ref9">9</xref>]</td><td>In-context demonstration and Generation Regulation</td><td>Tool manipulation, multi-tool usage</td><td>ToolBench [<xref ref-type="bibr" rid="ref9">9</xref>]</td></tr>
            <tr><td>Instruction tuning [<xref ref-type="bibr" rid="ref10">10</xref>]</td><td>Learning on high-quality instruction datasets</td><td>Tool manipulation, multi-tool usage</td><td>ToolBench [<xref ref-type="bibr" rid="ref9">9</xref>]</td></tr>
            <tr><td>Instruction Tuning with Human Curriculum [<xref ref-type="bibr" rid="ref11">11</xref>]</td><td>Instruction data mimicking human learning progression</td><td>Reasoning and knowledge-based tasks</td><td>CORGI [<xref ref-type="bibr" rid="ref11">11</xref>]</td></tr>
            <tr><td>ReAct [<xref ref-type="bibr" rid="ref12">12</xref>]</td><td>Prompting LLMs for Decision Making</td><td>Reasoning algorithm</td><td>HotPotQA [<xref ref-type="bibr" rid="ref13">13</xref>], FEVER [<xref ref-type="bibr" rid="ref14">14</xref>]</td></tr>
            <tr><td>—</td><td>Prompting LLMs for Decision Making</td><td>Reasoning algorithm</td><td>ToolBench [<xref ref-type="bibr" rid="ref9">9</xref>]</td></tr>
            <tr><td>—</td><td>Prompting LLMs for Decision Making</td><td>Reasoning algorithm</td><td>GSM8K [<xref ref-type="bibr" rid="ref16">16</xref>], SVAMP [<xref ref-type="bibr" rid="ref17">17</xref>], ASDiv [<xref ref-type="bibr" rid="ref18">18</xref>], AQuA, MAWPS [<xref ref-type="bibr" rid="ref19">19</xref>]</td></tr>
            <tr><td>—</td><td>Online Decision Making</td><td>Any decision-making task</td><td>ALFRED [<xref ref-type="bibr" rid="ref21">21</xref>], DAgger [<xref ref-type="bibr" rid="ref22">22</xref>]</td></tr>
            <tr><td>—</td><td>Text-based web browsing environment</td><td>Long-form QA</td><td>ELI5 [<xref ref-type="bibr" rid="ref24">24</xref>]</td></tr>
            <tr><td>—</td><td>A system integrated with ChatGPT with a pool of vision experts</td><td>Multimodal reasoning and action</td><td>Self</td></tr>
            <tr><td>—</td><td>Prompt with program-like specifications of the available actions and objects in an environment</td><td>Generate situated Robot Task Plans</td><td>Self</td></tr>
            <tr><td>—</td><td>A knowledgeable self-learning strategy for path planning</td><td>Conversational Reasoning on Knowledge Graph, predictions on KG paths as a decision-making task</td><td>OpenDialKG [<xref ref-type="bibr" rid="ref28">28</xref>]</td></tr>
            <tr><td>KG reasoning agent</td><td>Enables a small LLM to actively make decisions until finishing the reasoning process over KGs</td><td>Improve the reasoning ability [<xref ref-type="bibr" rid="ref30">30</xref>] and complex QA [<xref ref-type="bibr" rid="ref31">31</xref>]</td><td>WebQSP, CWQ, GrailQA [<xref ref-type="bibr" rid="ref32">32</xref>]</td></tr>
            <tr><td>—</td><td>Enhances the planning capabilities of LLMs by incorporating explicit action knowledge from KGs</td><td>—</td><td>HotPotQA [<xref ref-type="bibr" rid="ref13">13</xref>]</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-4">
      <title>3. Background</title>
      <p>
        In the construction of LLM agents, an LLM acts as the primary controller or “brain,” orchestrating
the sequence of operations required to accomplish a task or respond to a user query. These LLM
agents may require additional modules like planning, memory, and tool utilization to enhance
their functionality [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]. To activate the LLM component, a prompt template containing essential
operational details and tool access specifications is utilized. While not mandatory, agents can
be characterized or given a persona to define their role. This profiling information is typically
embedded within the prompt and may include details such as role description, personality traits,
social characteristics, and other demographic attributes [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ].
      </p>
      <p>Our research focuses on extracting triples from unstructured text to facilitate Knowledge
Graph construction. Triple extraction entails identifying and extracting structured information
in the form of triples, which comprise a subject entity, a relation or predicate, and an object
entity. This process is crucial in converting text into a structured format suitable for tasks like
knowledge representation and information retrieval in natural language processing.</p>
      <p>Relation extraction is another vital task we explore, involving the identification and extraction
of semantic relationships between entities mentioned in unstructured text. These relationships,
such as “is married to” or “works for”, capture meaningful associations and are represented as
triples for downstream applications like Knowledge Graph construction.</p>
      <p>
        REBEL [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which stands for Relation Extraction By End-to-end Language generation,
is an extraction technique for pulling relationship details out of raw text.
      </p>
      <p>
        REBEL uses an autoregressive sequence-to-sequence (seq2seq) model. Such models excel
at both generating text and understanding natural language. The
core of REBEL is a seq2seq model based on BART [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. It represents relations between
entities in the input text as a linearized sequence following a specific schema involving entity
mentions, labels, types, and the relation label. REBEL uses BART-large as the base model
which is first pre-trained on a large distantly supervised dataset called REBEL which was
created by extracting over 800K training instances with 220 relation types from Wikipedia
abstracts aligned with Wikidata facts. The pre-trained BART is then fine-tuned on this REBEL
dataset to maximize the likelihood of generating the correct linearized triplet representation
given the input text. REBEL demonstrates several advantages: it frames relation extraction in
an end-to-end manner, can extract open-ended relation types, allows quick fine-tuning on
new datasets across domains, and achieves state-of-the-art performance on multiple relation
extraction benchmarks while being simpler than prior complex pipeline approaches.
      </p>
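      <p>A sketch of how REBEL’s linearized output can be decoded back into triples, assuming the special-token format (“&lt;triplet&gt; head &lt;subj&gt; tail &lt;obj&gt; relation”, repeated per triple) described in the public REBEL model card:</p>

```python
def extract_triplets(text):
    """Decode REBEL-style linearized output into (head, relation, tail) triples.

    Assumed format (per the REBEL model card):
      <triplet> head <subj> tail <obj> relation   [repeated per triple]
    """
    triples = []
    head, tail, relation = "", "", ""
    current = None
    cleaned = text.replace("<s>", "").replace("</s>", "").replace("<pad>", "")
    for token in cleaned.split():
        if token == "<triplet>":
            # A new triple starts; flush the previous one if complete.
            if head and relation and tail:
                triples.append((head.strip(), relation.strip(), tail.strip()))
            head, tail, relation = "", "", ""
            current = "head"
        elif token == "<subj>":
            current = "tail"
        elif token == "<obj>":
            current = "relation"
        elif current == "head":
            head += " " + token
        elif current == "tail":
            tail += " " + token
        elif current == "relation":
            relation += " " + token
    if head and relation and tail:
        triples.append((head.strip(), relation.strip(), tail.strip()))
    return triples
```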
      <p>
        KnowGL [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] KnowGL is a comprehensive framework designed to convert natural language
text into structured relational data that aligns with the schema of a Knowledge Graph like
Wikidata. This framework comprises three key components: “Knowledge Generation”, “Fact
Ranking”, and “Linking to Wikidata”. The Knowledge Generation aspect focuses on extracting
facts by fine-tuning pre-trained language models to identify entity mentions and generate
sets of facts including entity labels, types, and relationships. Fact Ranking involves parsing
generated sequences to create a ranked list of distinct facts based on scores assigned to each fact.
Lastly, Linking to Wikidata facilitates retrieving Wikidata IDs associated with the generated
semantic annotations. By enabling the conversion of text into Wikidata statements in JSON
format, KnowGL demonstrates the potential of pre-trained language models for generating
structured data from text, offering an alternative to traditional information extraction pipelines.
      </p>
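      <p>A sketch of parsing KnowGL’s generated sequences into structured facts. We assume the pipe-and-hash linearization reported for KnowGL, i.e. (subject_mention#label#type)|relation|(object_mention#label#type) with “$” separating multiple facts; the exact format may differ across releases:</p>

```python
def parse_knowgl(output):
    """Parse a KnowGL-style generation into structured facts.

    Assumed format: (mention#label#type)|relation|(mention#label#type),
    with multiple facts separated by '$'.
    """
    facts = []
    for chunk in output.split("$"):
        subj, relation, obj = chunk.split("|")
        s_mention, s_label, s_type = subj.strip("()").split("#")
        o_mention, o_label, o_type = obj.strip("()").split("#")
        facts.append({
            "subject": {"mention": s_mention, "label": s_label, "type": s_type},
            "relation": relation,
            "object": {"mention": o_mention, "label": o_label, "type": o_type},
        })
    return facts
```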
    </sec>
    <sec id="sec-5">
      <title>4. System Architecture</title>
      <p>The goal of our framework is to facilitate automatic triple extraction from text inputs. This
framework is designed as a multi-tool system utilizing Large Language Models to execute the
task of triple extraction.</p>
      <p>
        Figure 1 outlines a framework for training a Large Language Model (LLM) using a method
referred to as RLHF, which stands for Reinforcement Learning from Human Feedback. The
flowchart is divided into two main sections:
1. The Large Language Model training procedure using RLHF [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ]
2. RLHF LLM Based Autonomous Agents for Triple Extraction for Knowledge Graph
construction
LLM Training Procedure Using RLHF
      </p>
      <p>The training procedure begins with raw text to pretrain the Large Language Model. This
pre-training step is where the model learns from a large corpus of text to understand language
patterns and structures.</p>
      <p>After pretraining, the model becomes a “Pretrained LLM” which is then subjected to
“Supervised Fine-tuning” using Demonstration data. Demonstration data consists of prompts and
response pairs. This step involves training the model on specific examples to perform certain
tasks and understand particular domains better. This yields a low-quality
instruction-following chatbot.</p>
      <p>The fine-tuned model, referred to as “SFT LLM”, is then used in conjunction with “Human
Preference Data” to train a “Reward Model”. The Human Preference data consists of prompts,
winner tuples and loser tuples. This reward model evaluates the outputs of the LLM and provides
feedback on its performance.</p>
      <p>
        The feedback from the reward model is used in Reinforcement Learning, where the Large
Language Model is further trained using RLHF (Reinforcement Learning with Human Feedback)
training prompts to improve its outputs based on human preferences and feedback. The RLHF
training prompts also consist of a prompt, winner tuples and loser tuples. In the end we get a
high-quality instruction-following RLHF LLM [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ]. The first half of Figure 1 depicts this training procedure.
      </p>
      <p>RLHF LLM Based Autonomous Agents for Triple Extraction for KG Construction
The bottom section of the flowchart shows the application of the trained RLHF LLM for a
specific task: Autonomous agents for triple extraction to construct Knowledge Graph.</p>
      <p>A user inputs a complex query, which is then decomposed by the query-decomposition
LLM. The decomposed query is processed by the “RLHF LLM”, which interacts
with a Tool DB (Tool Database). Here the tools in the Tool DB are “REBEL” and “KnowGL”.</p>
      <p>The RLHF LLM then performs re-ranking of the triples extracted from the complex query,
selecting the most relevant and accurate triples based on their count of occurrences
in the Knowledge Graph. The re-ranked triples form the final output used in
the construction of a Knowledge Graph. The system prompt for the RLHF LLM is provided in
Appendix A.</p>
      <p>Figure 1 thus presents a comprehensive framework for training a Large Language Model
using human feedback and then applying this model to extract triples for building a Knowledge Graph.</p>
      <p>We implement the “Auto-KG Agent” framework to facilitate automatic triple extraction from
text inputs. This framework is designed as a multi-tool system utilizing Large Language Models
to extract triples. We utilize REBEL and KnowGL as tools for triple extraction (relation along
with entities). More such frameworks can be added as tools in the Tool DB. The Large Language
Model is asked to return the entities in JSON format. The diagram presented in the second half
of Figure 1 serves as the visual representation for this section.</p>
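      <p>A minimal sketch of this multi-tool loop. The two tool functions are stubs standing in for the real REBEL and KnowGL models, and the function names are ours, not taken from the released code:</p>

```python
import json

def rebel_extract(text):
    # Stub standing in for the REBEL seq2seq model.
    return [("Abraham Lincoln", "killed by", "John Wilkes Booth")]

def knowgl_extract(text):
    # Stub standing in for KnowGL's Knowledge Generation component.
    return [("Abraham Lincoln", "killed by", "John Wilkes Booth"),
            ("Abraham Lincoln", "employer", "United States government")]

TOOL_DB = {"REBEL": rebel_extract, "KnowGL": knowgl_extract}

def auto_kg_agent(text):
    """Run every tool in the Tool DB and return the union of triples as JSON."""
    triples = set()
    for tool in TOOL_DB.values():
        triples.update(tool(text))
    return json.dumps(
        [{"head": h, "relation": r, "tail": t} for h, r, t in sorted(triples)]
    )
```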
      <p>The current system comprises the second half of Figure 1, without the query-decomposition
Large Language Model and without Large Language Model training using RLHF. As of now we
only incorporate the Tool DB, an LLM without RLHF (only a system prompt), and re-ranking of triples
based on the length of the extracted relation. Section 7 describes our plans for incorporating the other
modules into the “Auto-KG Agent” framework.</p>
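      <p>The relation-length re-ranking step can be sketched as follows (we assume shorter, more canonical relation labels are ranked first; the ordering criterion is an implementation choice):</p>

```python
def rerank_by_relation_length(triples):
    """Order (head, relation, tail) triples by relation length, shortest first."""
    return sorted(triples, key=lambda triple: len(triple[1]))
```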
    </sec>
    <sec id="sec-6">
      <title>5. Preliminary Experimental Setup</title>
      <p>
        Dataset We evaluate our system’s performance on the CONLL04 dataset [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ], which comprises
sentences extracted from news articles. Each sentence is annotated with four entity types
(person, organization, location, and other) and five relation types (kill, work for, organization
based in, live in, and located in). Our evaluation focuses on the test split consisting of 288
instances [
        <xref ref-type="bibr" rid="ref40">40</xref>
        ], which serves as the ground truth, comparing the performance of our model against the REBEL
model. The dataset statistics are described in Table 2.
      </p>
      <table-wrap id="tab2">
        <label>Table 2</label>
        <caption>
          <p>CONLL04 dataset statistics.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Dataset</th><th>Entity Types</th><th>Relation Types</th><th>Train</th><th>Validation</th><th>Test</th></tr>
          </thead>
          <tbody>
            <tr><td>CONLL04</td><td>4</td><td>5</td><td>1,290 (922)</td><td>343 (231)</td><td>422 (288)</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-6-6">
        <title>Evaluation Metrics</title>
        <p>The evaluation process compares the predicted triples extracted from test
data with the ground truth triples. Each instance in both datasets is represented as a dictionary,
with a unique identifier and a set of triples. Each sentence has an object corresponding to it
which stores the triple.</p>
        <p>To calculate the true positives (correctly predicted triples), we iterate through each instance
in the ground truth data. For each instance, we check if the corresponding instance exists in
the predicted data. If it does, we find the intersection of the triples in the ground truth and
predicted data, which gives us the number of correct predictions (true positives).</p>
        <p>Additionally, we also calculate the number of extra predictions made by the model, that is,
the count of triples not present in the ground truth. However, we do not calculate scores for these,
as doing so would require human evaluation.</p>
        <p>After calculating the count of true positives, we compute both micro and macro scores for
precision, recall, and F1 score. Micro scores consider the total number of triples in the entire
dataset for calculating precision, recall, and F1 score, while macro scores average these metrics
across each instance in the dataset.</p>
        <p>Overall, this evaluation process enables us to assess the performance of the triple extraction
model by quantifying its precision, recall, and F1 score, considering both individual instances
and the entire dataset.</p>
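      <p>The true-positive counting and the micro/macro scores described above can be computed as in the following sketch (the dictionary layout, mapping instance ids to sets of triples, is our assumption):</p>

```python
def evaluate(ground_truth, predicted):
    """Compute micro and macro precision/recall/F1 for triple extraction.

    Both inputs map an instance id to a set of (head, relation, tail) triples.
    Returns (precision, recall, f1) tuples under "micro" and "macro".
    """
    def prf(tp, n_pred, n_gold):
        p = tp / n_pred if n_pred else 0.0
        r = tp / n_gold if n_gold else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        return p, r, f1

    total_tp = total_pred = total_gold = 0
    per_instance = []
    for inst_id, gold in ground_truth.items():
        pred = predicted.get(inst_id, set())
        tp = len(gold & pred)          # true positives via set intersection
        total_tp += tp
        total_pred += len(pred)
        total_gold += len(gold)
        per_instance.append(prf(tp, len(pred), len(gold)))

    micro = prf(total_tp, total_pred, total_gold)
    macro = tuple(sum(vals) / len(vals) for vals in zip(*per_instance))
    return {"micro": micro, "macro": macro}
```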
        <p>We performed a strict evaluation in which each extracted triple is compared as a whole
against the corresponding head entity, tail entity and relation in the ground truth. The following
are the counts of unique relations extracted by the different frameworks:
• REBEL - 68 relations mapped to 5 relations
• KnowGL - 58 relations mapped to 5 relations
• REBEL + KnowGL - 90 relations mapped to 5 relations
Triple Extraction Tools REBEL and KnowGL frameworks are used as triple extraction tools
in our experiment. In our evaluation of the REBEL model on the CONLL04 test dataset, we
encountered a diverse set of 68 unique relations extracted by the model. To align these with
the CONLL04 dataset’s five predefined relations, we undertook a manual mapping process.
This process was guided by semantic similarity and contextual relevance, ensuring that each
extracted relation was correctly associated with one of the canonical relations such as ‘killed
by’, ‘residence’, ‘location’, ‘headquarters location’, and ‘employer’, as originally formatted in the
REBEL paper. The necessity for this manual mapping arose from the fact that the REBEL model,
trained on multiple datasets, identified relations beyond the scope of the CONLL04 dataset,
requiring careful consideration to maintain semantic integrity. The manual mappings for these
relations, based on their semantic similarity to the corresponding five relations were carried out.
Similarly, KnowGL had 54 unique relations being extracted. We followed the same mapping
procedure for it. Figure 2 shows the distribution based on the number of occurrences for 5 types
of relations extracted for different experiment design settings.</p>
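        <p>Once the lookup table is defined, the manual mapping can be applied programmatically. The entries below are hypothetical examples of such a mapping, not the full 68-label table used in the paper:</p>

```python
# Hypothetical sample of the manual mapping from model-specific relation
# labels to the five canonical relations (REBEL-paper naming for CONLL04).
RELATION_MAP = {
    "murderer": "killed by",
    "place of residence": "residence",
    "country": "location",
    "headquarters": "headquarters location",
    "member of": "employer",
}

def normalize(triples, mapping=RELATION_MAP):
    """Map extracted relation labels onto canonical ones; drop unmapped triples."""
    return [(h, mapping[r], t) for (h, r, t) in triples if r in mapping]
```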
        <p>For details, readers can refer to Appendix A. The code for the experiments is available on
GitHub (https://github.com/Ananyaiitbhilai/Text2Triple-LLM-Agent).
RLHF LLM In our experimental setup, we use Large Language Models without any fine-tuning
or instruction tuning. However, in the future we plan to use RLHF LLMs to orchestrate
the tool execution for triple extraction. Specifically, we used two open-source Large Language
Models for benchmarking: “Gemma” and “Mistral”.</p>
        <p>
          Gemma [
          <xref ref-type="bibr" rid="ref41">41</xref>
          ] is a family of lightweight LLMs built from the same research and technology
Google used to create the Gemini models. Gemma models are available in two sizes, 2 billion
and 7 billion parameters. These models are trained on up to 6T tokens of primarily English web
documents, mathematics, and code, using a transformer architecture with enhancements like
Multi-Query Attention, RoPE Embeddings, GeGLU Activations, and advanced normalization
techniques. We use the 2B variant.
        </p>
        <p>
          Mistral-7B-Instruct-v0.2 [
          <xref ref-type="bibr" rid="ref42">42</xref>
          ] is an improved instruction-fine-tuned version of Mistral-7B-Instruct-v0.1. The
underlying Mistral-7B-v0.1 Large Language Model is a pretrained generative text model with
7 billion parameters that outperforms Llama 2 13B.
        </p>
        <p>For additional experimental set-up, refer to Appendix 3.</p>
        <p>Table 3 illustrates the number of relations and instances for different experiment design
settings.</p>
        <p>In Table 4, REBEL refers to all triples extracted by REBEL. REBEL (subset Mistral) refers to
the evaluation in which all triples that could not be extracted because of hallucination in the
Mistral LLM (responses returned as strings or in other formats rather than the expected JSON
format) are removed from the ground truth before evaluation is carried out; REBEL (subset
Gemma) is defined analogously. Here, tools such as KnowGL and REBEL are used either as a
single tool in conjunction with an LLM (Mistral or Gemma), or as a multi-tool setup combining
REBEL and KnowGL in conjunction with an LLM (Mistral or Gemma).</p>
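        <p>A minimal sketch of this subset construction (our own illustration, not the exact evaluation script): responses that fail to parse as the expected JSON are treated as hallucinations, the corresponding sentences are dropped from the ground truth, and micro precision and recall are computed on the remainder.</p>

```python
import json

def build_subset_and_score(responses, gold):
    """responses: {sentence_id: raw LLM output string}.
    gold: {sentence_id: set of (subj, rel, obj) ground-truth triples}.
    Sentences whose response is not valid JSON are removed from the
    ground truth before scoring, mirroring the "subset" evaluation."""
    predicted, subset_gold = {}, {}
    for sid, raw in responses.items():
        try:
            triples = {tuple(t) for t in json.loads(raw)}
        except (json.JSONDecodeError, TypeError):
            continue  # hallucinated format: drop this sentence entirely
        predicted[sid] = triples
        subset_gold[sid] = gold[sid]
    # micro-averaged precision/recall over the retained sentences
    tp = sum(len(predicted[s] & subset_gold[s]) for s in predicted)
    n_pred = sum(len(v) for v in predicted.values())
    n_gold = sum(len(v) for v in subset_gold.values())
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    return precision, recall
```

        <p>Under this scheme, a model that hallucinates more ends up being evaluated on a smaller (and potentially easier) ground truth, which matters when comparing scores across models.</p>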
        <sec id="sec-6-6-1">
          <title>A summary of statistics for different experiment design settings.</title>
        </sec>
      </sec>
      <sec id="sec-6-7">
        <title>Table 4: Model-wise evaluation of REBEL, REBEL (subset Mistral), REBEL (subset Gemma), Mistral (single-tool REBEL), Gemma (single-tool REBEL), KnowGL, and Mistral (KnowGL + REBEL multi-tool); for the last setting, P = 0.16</title>
        <p>From Table 5, it can be observed that for Gemma a significant number of hallucinations
occurred, accounting for 99 out of 288 total responses. This higher incidence of hallucination in
Gemma was primarily attributed to the incorrect JSON format returned by the model. Conversely,
Mistral exhibited a lower occurrence of hallucination, with only 26 out of 288 total responses
affected. Single-tool here refers to REBEL alone, while multi-tool includes both REBEL and
KnowGL. “Hallucination” refers to a phenomenon where the model generates text that is
incorrect, nonsensical, or not real.</p>
      </sec>
      <sec id="sec-6-9">
        <title>Table 5: Comparison of Hallucination (particularly not giving a response in the expected JSON format) Occurrences</title>
        <table-wrap>
          <table>
            <thead>
              <tr><th>Model</th><th>Total Responses</th><th>Number of Hallucinations</th></tr>
            </thead>
            <tbody>
              <tr><td>Gemma</td><td>288</td><td>99</td></tr>
              <tr><td>Mistral (single-tool)</td><td>288</td><td>26</td></tr>
              <tr><td>Mistral (multi-tool)</td><td>288</td><td>42</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6. Results</title>
      <p>We investigated the performance of multiple tools versus single tools for relation extraction
and observed a notable decline in scores with multi-tool usage, suggesting that single-tool
approaches may yield better results, as shown in Table 4. We attribute this drop to increased
hallucination, which was more prevalent when employing multiple tools. However, single-tool
usage also presented challenges: occasionally, the returned format did not align with the one
specified in the system prompt. Moreover, Table 4 also shows that REBEL combined with a
Large Language Model and REBEL alone have almost the same performance. This is because
REBEL and KnowGL are used only as tools to trigger the action of extracting relations, and
both share the same architecture and therefore similar biases.</p>
      <p>Our findings underscore the need to integrate Large Language Models with extraction
tools to harness their full potential. While the tools exhibit shortcomings in certain contexts,
Large Language Models offer a complementary approach, particularly in mitigating biases and
enhancing extraction accuracy. By empowering Large Language Models to engage in more
nuanced planning or to decompose the query, we anticipate significant improvements in relation
extraction performance.</p>
      <p>Gemma’s higher scores result from a larger number of responses being generated as
strings rather than in the expected JSON format, as shown in Table 5. These non-formatted
responses are removed, leaving fewer triples available for evaluation compared to Mistral. This
phenomenon contributes to Gemma’s higher scores in triple extraction tasks compared to
Mistral.</p>
      <p>To answer the aforementioned Research Questions, we took examples that contained
multiple events, complex clauses, negation, and modalities, and human evaluation was then
carried out manually to check the correctness of the triples extracted by the pre-existing tools
and by our Auto-KG agent.</p>
      <p><bold>Event identification and mitigating biases in triple extraction.</bold> Our investigation
uncovered instances of flawed relation extraction in the REBEL and KnowGL tools. In the sample
sentence, “While Marie Curie and Albert Einstein conducted groundbreaking experiments
in their laboratories at the University of Paris, Leonardo da Vinci’s sketches of Renaissance
architecture in the bustling streets of Florence sparked inspiration across Italy,” REBEL
identified only the triple “Florence, located in, Italy”. However, this sentence encompasses two
distinct events: the experimentation conducted by Marie Curie and Albert Einstein at the
University of Paris, and the inspiration sparked by Leonardo da Vinci’s sketches across Italy.
REBEL’s oversight stems from its inclination towards location-centric relations, influenced
by biases within the training data.</p>
      <p>In contrast, our Auto-KG agent showed promise in mitigating such limitations. It
accurately extracted triples from the sentence, capturing nuanced relations such as “Marie Curie,
experimented at, University of Paris”, “Albert Einstein, experimented at, University of Paris”,
“Leonardo da Vinci’s sketches, located in, Florence”, and “Leonardo da Vinci’s sketches, sparked
inspiration in, Italy”. This underscores the Auto-KG agent’s ability to comprehend complex
linguistic structures and to discern meaningful relations, showcasing its potential for enhancing
relation extraction accuracy.</p>
      <p><bold>Negation handling discrepancies in triple extraction.</bold> In our comparative analysis,
we also observed a notable discrepancy in the handling of negation between the REBEL and
KnowGL tools and our Auto-KG agent in triple extraction tasks. Both REBEL and KnowGL
demonstrated limitations in effectively managing negation cues within text, resulting in
erroneous extraction of triples. Conversely, our tool exhibited robust performance in negation
handling, yielding more accurate triple extractions even in the presence of negation cues.</p>
      <p>Consider the sentence “Fado does not work at IIT”. REBEL and KnowGL erroneously
extract a triple indicating that Fado works at IIT, failing to account for the negation. In contrast,
our tool discerned the negation cue “not” and appropriately adjusted the extracted triple to
reflect the absence of the stated relationship, thereby accurately capturing the intended
semantics of the sentence.</p>
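      <p>The contrast can be illustrated with a toy post-filter (our own sketch; the actual agent relies on the LLM’s language understanding rather than on rules): a candidate triple is suppressed when a negation cue appears just before the verb that supports its relation.</p>

```python
import re

NEGATION_CUES = {"not", "never", "no"}

def filter_negated(sentence, triples, relation_verbs):
    """Drop triples whose supporting verb is preceded by a negation cue.

    relation_verbs maps a relation name to the surface verb it is based
    on, e.g. {"works at": "work"}. A deliberately simple heuristic, for
    illustration only."""
    tokens = re.findall(r"\w+", sentence.lower())
    kept = []
    for subj, rel, obj in triples:
        verb = relation_verbs.get(rel, rel)
        # a triple is negated if a cue occurs in a short window before
        # any token that starts with the supporting verb
        negated = any(
            tok.startswith(verb) and NEGATION_CUES & set(tokens[max(0, i - 3):i])
            for i, tok in enumerate(tokens)
        )
        if not negated:
            kept.append((subj, rel, obj))
    return kept
```

      <p>On “Fado does not work at IIT”, the filter removes the erroneous “Fado, works at, IIT” triple that a negation-blind extractor would emit, while leaving the affirmative case untouched.</p>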
      <p>This discrepancy underscores the nuanced understanding of language exhibited by Large
Language Models, enabling them to effectively navigate linguistic complexities such as negation
cues in triple extraction tasks. It highlights the potential of leveraging LLM-based approaches to
enhance the accuracy and reliability of triple extraction processes in natural language processing
applications.</p>
      <p><bold>Generalising well on various datasets.</bold> We observed a significant disparity in the performance of seq2seq-based
approaches such as REBEL and KnowGL when trained or fine-tuned on specific datasets.
While these models exhibit impressive performance within the confines of their training data,
they demonstrate limited generalization beyond the dataset they were trained on.
Conversely, Large Language Models showcase remarkable generalization even without
explicit training on a particular dataset. This discrepancy underscores the inherent adaptability
and robustness of LLMs, enabling them to effectively handle diverse datasets and tasks without
the need for extensive training or fine-tuning.</p>
    </sec>
    <sec id="sec-8">
      <title>7. Future Directions</title>
      <p>In future work, our focus lies on emulating the Large Language Model training methodology
using Reinforcement Learning from Human Feedback (RLHF), as detailed in Section 4.</p>
      <p>Additionally, we intend to incorporate a query-decomposition LLM to partition complex user
queries into sub-queries, facilitating more precise event identification and subsequent triple
extraction.</p>
      <p>Furthermore, our proposed future work entails synergizing LLMs with multiple extraction
tools to enhance their generalization capabilities across diverse datasets without requiring
explicit training. This approach holds potential to surpass the performance of seq2seq models
such as REBEL and KnowGL.</p>
      <p>Moreover, we aim to integrate a re-ranking module into our framework. This module will
prioritize all extracted triples based on their confidence levels, ensuring a more refined and
accurate output.</p>
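      <p>A sketch of the intended behaviour (the interface is hypothetical; the paper does not fix one): re-ranking amounts to ordering extracted triples by a confidence score and optionally keeping only the most confident ones.</p>

```python
def rerank(scored_triples, top_k=None):
    """scored_triples: iterable of ((subj, rel, obj), confidence) pairs.

    Returns the triples ordered by descending confidence; if top_k is
    given, only the top_k most confident triples are kept."""
    ranked = sorted(scored_triples, key=lambda pair: pair[1], reverse=True)
    triples = [t for t, _ in ranked]
    return triples[:top_k] if top_k is not None else triples
```

      <p>In practice the confidence could come from the extraction tool itself (KnowGL emits scores) or from the LLM; the module only assumes each triple arrives with a comparable numeric score.</p>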
      <p>We also aim to develop a diverse dataset that encompasses a wide range of relations and
includes a variety of sentence structures. This dataset is intended to serve as a robust benchmark
for evaluating performance in triple extraction tasks.</p>
    </sec>
    <sec id="sec-9">
      <title>8. Limitations and Conclusion</title>
      <p>The paper presents a novel framework that integrates Large Language Models (LLMs) with
existing tools like REBEL and KnowGL for the task of triple extraction from unstructured text
to construct knowledge graphs. The proposed framework aims to leverage the strengths of
LLMs in understanding complex linguistic structures, handling modalities and negations, and
mitigating biases inherent in training data. The experimental results on the CONLL04 dataset
indicate that while multi-tool approaches face challenges such as hallucination, the integration
of LLMs shows promising results in enhancing extraction accuracy.</p>
      <p>There are certain limitations of our research work:
1. Limited LLM Models Evaluated: The experiments were confined to using the Gemma
(2B parameters) and Mistral-7B LLMs. The performance of other large language models like
LaMDA or models with higher parameter counts (e.g., GPT-4) remains unexplored. Future work
could extend these experiments to a broader range of LLM architectures and sizes to provide a
more comprehensive evaluation.</p>
      <p>2. Limited Task Coverage: The current study focused on a specific task: triple extraction
for knowledge graph construction. However, knowledge graph construction and reasoning
encompass a wide range of tasks, and the performance of LLMs on other tasks, such as entity
linking, relation classification, or multi-hop reasoning, remains unexplored. Future research
could extend the evaluation to a broader set of tasks to provide a more comprehensive
understanding of Large Language Model capabilities in the context of knowledge graph construction
and reasoning.</p>
      <p>3. Limited Evaluation Dataset: The paper evaluates the proposed framework on the
CONLL04 dataset, which comprises sentences extracted from news articles with a limited set of
entity types and relation types. This dataset may not fully represent the diversity and complexity
of real-world text, potentially limiting the generalizability of the findings to other domains and
contexts. In future work, evaluation can be carried out on other datasets.</p>
      <p>4. Reliance on Manual Mapping: The paper mentions that manual mapping was required
to align the relations extracted by REBEL and KnowGL with the canonical relations in the
CONLL04 dataset. This manual intervention introduces potential biases and inconsistencies, as
the mapping process may not be entirely objective or scalable across larger datasets or domains.</p>
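      <p>Such a mapping can be sketched as a simple lookup (the keys and the handful of values below are drawn from the “Relation Mappings” listing in Appendix A, but the dictionary itself is our illustration, not the authors’ exact code): each fine-grained relation emitted by REBEL or KnowGL is collapsed onto a canonical CONLL04-style relation, and triples with unmapped relations are discarded.</p>

```python
# A few illustrative entries from the manual relation mapping; the full
# mapping in Appendix A contains many more values per key.
RELATION_MAP = {
    "employer": "employer",
    "educated at": "employer",
    "headquarters location": "headquarters location",
    "cause of death": "killed by",
    "capital": "location",
    "place of birth": "residence",
}

def canonicalize(triple):
    """Map an extracted (subject, relation, object) triple onto its
    canonical relation, or return None when no mapping exists, so the
    triple is excluded from evaluation rather than mis-scored."""
    subj, rel, obj = triple
    canon = RELATION_MAP.get(rel)
    return (subj, canon, obj) if canon else None
```

      <p>The scalability concern above is visible even in this sketch: every new tool or dataset requires curating new dictionary entries by hand.</p>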
      <p>The authors acknowledge these limitations and express their anticipation for future research
opportunities that would allow them to further explore these areas and provide a more
comprehensive evaluation of LLM capabilities in the context of knowledge graph construction and
reasoning. The research sets a new direction for future work in building advanced, reliable
systems for knowledge extraction and reasoning. It highlights the potential of LLM-based agents
for knowledge graph construction and proposes a comprehensive framework for harnessing
their capabilities.</p>
    </sec>
    <sec id="sec-10">
      <title>A. Appendix: Additional Details</title>
      <p>Additional details about the system and parameters for the preliminary experimental set-up:
• The context size (n_ctx) is the maximum number of tokens that the model can account
for when processing a response; this includes the prompt and the response itself. In our
case, the context size was set to 2048.
• The maximum number of tokens to generate was 2000 in our case. If max_tokens ≤ 0 or
None, the number of tokens to generate is unlimited and depends on n_ctx.
• The average inference time per context/sentence on the CONLL04 test dataset for extracting
triples in conjunction with the LLMs was 25 seconds.
• The temperature was set to 0.
• The gguf files for Mistral and Gemma were run locally on a Mac M1.</p>
      <p>The system prompt is shown in Figure 3 (Figure 3: System Prompt).</p>
      <p>Relation Mappings:</p>
      <table-wrap>
        <table>
          <thead>
            <tr><th>Key</th><th>Values</th></tr>
          </thead>
          <tbody>
            <tr><td>employer</td><td>derivative work, inception, instance of, owned by, owner of, part of, participant, participant in, performer, twinned administrative body, occupation, field of this occupation, member of political party, work location, language used, participant in, participant, owner of, owned by, member of, notable work, instance of, interested in, office held by head of government, chief executive officer, educated at, subclass of, part of, office held by head of state, chairperson, executive body, industry, officeholder, position held, practiced by, language of work or name, director / manager, employer, field of work, language of work or name, notable work, occupation, member of, member of political party, officeholder, operator, position held, educated at, founded by, product or material produced, subsidiary, work location, author, office held by head of government, used by, uses, candidacy in election, candidate, chairperson, head of government</td></tr>
            <tr><td>headquarters location</td><td>headquarters location, twinned administrative body, applies to jurisdiction, legislative body, military branch, contains administrative territorial entity, parent organization, operating area, legislative body, contains administrative territorial entity, headquarters location, located in the administrative territorial entity, ethnic group, language used, military branch, parent organization, applies to jurisdiction</td></tr>
            <tr><td>killed by</td><td>cause of death, perpetrator, convicted of, killed by, place of death, facet of, date of death, main subject, place of death, facet of, significant event</td></tr>
            <tr><td>location</td><td>location, capital, continent, located in time zone, shares border with, mountain range, located in or next to body of water, candidate, significant place, spouse, place of publication, target, country, located in or next to body of water, location, mouth of the watercourse, point in time, capital, capital of, shares border with, tributary, diplomatic relation, place of publication, spouse</td></tr>
            <tr><td>residence</td><td>place of birth, based on, country of citizenship, date of birth, has part, number of participants, history of topic, place of birth, country of origin, has quality, significant event, occupant, relative, residence</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Triple trustworthiness measurement for knowledge graph</article-title>
          ,
          <source>in: The World Wide Web Conference, WWW '19</source>
          ,
          ACM
          ,
          <year>2019</year>
          . URL: http://dx.doi.org/10.1145/3308558.3313586. doi:10.1145/3308558.3313586.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.-L.</given-names>
            <surname>Huguet Cabot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          , REBEL:
          <article-title>Relation extraction by end-to-end language generation</article-title>
          , in: M.-F. Moens, X. Huang, L. Specia, S. W.-t. Yih (Eds.),
          <source>Findings of the Association for Computational Linguistics: EMNLP 2021</source>
          , Association for Computational Linguistics, Punta Cana, Dominican Republic,
          <year>2021</year>
          , pp.
          <fpage>2370</fpage>
          -
          <lpage>2381</lpage>
          . URL: https://aclanthology.org/2021.findings-emnlp.204. doi:10.18653/v1/2021.findings-emnlp.204.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Rossiello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F. M.</given-names>
            <surname>Chowdhury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mihindukulasooriya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Cornec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Gliozzo</surname>
          </string-name>
          ,
          <article-title>KnowGL: Knowledge generation and linking from text</article-title>
          ,
          <year>2022</year>
          . arXiv:2210.13952.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-R.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <article-title>A survey on large language model based autonomous agents</article-title>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H.</given-names>
            <surname>Naveed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. U.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Saqib</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Anwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Usman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Barnes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mian</surname>
          </string-name>
          ,
          <article-title>A comprehensive overview of large language models</article-title>
          ,
          <year>2024</year>
          . arXiv:2307.06435.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-R.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <article-title>Kg-agent: An efficient autonomous agent framework for complex reasoning over knowledge graph</article-title>
          ,
          <source>arXiv preprint arXiv:2402.11163</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-R.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <article-title>Structgpt: A general framework for large language model to reason over structured data</article-title>
          ,
          <source>arXiv preprint arXiv:2305.09645</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Qiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Llms for knowledge graph construction and reasoning: Recent capabilities and future opportunities</article-title>
          ,
          <source>arXiv preprint arXiv:2305.13168</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>On the tool manipulation capability of open-source large language models</article-title>
          ,
          <year>2023</year>
          . arXiv:2305.16504.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gerstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          , ToolLLM:
          <article-title>Facilitating large language models to master 16000+ real-world apis</article-title>
          ,
          <year>2023</year>
          . arXiv:2307.16789.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>B. W.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Yoo</surname>
          </string-name>
          ,
          <article-title>Instruction tuning with human curriculum</article-title>
          ,
          <year>2024</year>
          . arXiv:2310.09518.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Shafran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Narasimhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <article-title>ReAct: Synergizing reasoning and acting in language models</article-title>
          ,
          <source>arXiv preprint arXiv:2210.03629</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. W.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Salakhutdinov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>HotpotQA: A dataset for diverse, explainable multi-hop question answering</article-title>
          ,
          <year>2018</year>
          . arXiv:1809.09600.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Thorne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Christodoulopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <article-title>FEVER: a large-scale dataset for fact extraction and verification</article-title>
          ,
          <year>2018</year>
          . arXiv:1803.05355.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schuurmans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bosma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ichter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Chi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Chain-of-thought prompting elicits reasoning in large language models</article-title>
          ,
          <year>2023</year>
          . arXiv:2201.11903.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>K.</given-names>
            <surname>Cobbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kosaraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bavarian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Plappert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tworek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hilton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nakano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hesse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schulman</surname>
          </string-name>
          ,
          <article-title>Training verifiers to solve math word problems</article-title>
          ,
          <year>2021</year>
          . arXiv:2110.14168.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhattamishra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <article-title>Are NLP models really able to solve simple math word problems?</article-title>
          ,
          <year>2021</year>
          . arXiv:2103.07191.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.-y.</given-names>
            <surname>Miao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-C.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-Y.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <article-title>A diverse corpus for evaluating and developing English math word problem solvers</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Schluter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tetreault</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics
          , Online,
          <year>2020</year>
          , pp.
          <fpage>975</fpage>
          -
          <lpage>984</lpage>
          . URL: https://aclanthology.org/2020.acl-main.92. doi:10.18653/v1/2020.acl-main.92.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>R.</given-names>
            <surname>Koncel-Kedziorski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Amini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kushman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hajishirzi</surname>
          </string-name>
          ,
          <article-title>MAWPS: A math word problem repository</article-title>
          , in:
          <string-name>
            <given-names>K.</given-names>
            <surname>Knight</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nenkova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Rambow</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Association for Computational Linguistics
          , San Diego, California,
          <year>2016</year>
          , pp.
          <fpage>1152</fpage>
          -
          <lpage>1157</lpage>
          . URL: https://aclanthology.org/N16-1136. doi:10.18653/v1/N16-1136.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>H.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>Auto-GPT for online decision making: Benchmarks and additional opinions</article-title>
          ,
          <year>2023</year>
          . arXiv:2306.02224.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>M.</given-names>
            <surname>Shridhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Thomason</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gordon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bisk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mottaghi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <article-title>ALFRED: A benchmark for interpreting grounded instructions for everyday tasks</article-title>
          ,
          <year>2020</year>
          . arXiv:1912.01734.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. J.</given-names>
            <surname>Gordon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Bagnell</surname>
          </string-name>
          ,
          <article-title>A reduction of imitation learning and structured prediction to no-regret online learning</article-title>
          ,
          <year>2011</year>
          . arXiv:1011.0686.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>R.</given-names>
            <surname>Nakano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hilton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Balaji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hesse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kosaraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Saunders</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Cobbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Eloundou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Krueger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Button</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Knight</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chess</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schulman</surname>
          </string-name>
          ,
          <article-title>WebGPT: Browser-assisted question-answering with human feedback</article-title>
          ,
          <year>2022</year>
          . arXiv:2112.09332.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jernite</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Grangier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Weston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Auli</surname>
          </string-name>
          ,
          <article-title>ELI5: Long form question answering</article-title>
          ,
          <year>2019</year>
          . arXiv:1907.09190.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Azarnasab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>MM-ReAct: Prompting ChatGPT for multimodal reasoning and action</article-title>
          ,
          <year>2023</year>
          . arXiv:2303.11381.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>I.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Blukis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mousavian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tremblay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Thomason</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Garg</surname>
          </string-name>
          ,
          <article-title>ProgPrompt: Generating situated robot task plans using large language models</article-title>
          ,
          <year>2022</year>
          . arXiv:2209.11302.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>Evaluating and enhancing large language models for conversational reasoning on knowledge graphs</article-title>
          ,
          <year>2024</year>
          . arXiv:2312.11282.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>S.</given-names>
            <surname>Moon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Subba</surname>
          </string-name>
          ,
          <article-title>OpenDialKG: Explainable conversational reasoning with attention-based walks over knowledge graphs</article-title>
          , in:
          <string-name>
            <given-names>A.</given-names>
            <surname>Korhonen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Traum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Màrquez</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics
          , Florence, Italy,
          <year>2019</year>
          , pp.
          <fpage>845</fpage>
          -
          <lpage>854</lpage>
          . URL: https://aclanthology.org/P19-1081. doi:10.18653/v1/P19-1081.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-R.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <article-title>KG-Agent: An efficient autonomous agent framework for complex reasoning over knowledge graph</article-title>
          ,
          <year>2024</year>
          . arXiv:2402.11163.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>W.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Richardson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Meek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Suh</surname>
          </string-name>
          ,
          <article-title>The value of semantic parse labeling for knowledge base question answering</article-title>
          , in:
          <source>Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, August 7-12, 2016, Berlin, Germany, Volume 2: Short Papers</source>
          , The Association for Computer Linguistics,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>A.</given-names>
            <surname>Talmor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Berant</surname>
          </string-name>
          ,
          <article-title>The web as a knowledge-base for answering complex questions</article-title>
          , in:
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Walker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Stent</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 1 (Long Papers)</source>
          ,
          <source>Association for Computational Linguistics</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>641</fpage>
          -
          <lpage>651</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kase</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vanni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. M.</given-names>
            <surname>Sadler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <article-title>Beyond I.I.D.: three levels of generalization for question answering on knowledge bases</article-title>
          , in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Leskovec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Grobelnik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Najork</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zia</surname>
          </string-name>
          (Eds.),
          <source>WWW '21: The Web Conference 2021, Virtual Event / Ljubljana, Slovenia, April 19-23, 2021</source>
          , ACM / IW3C2,
          <year>2021</year>
          , pp.
          <fpage>3477</fpage>
          -
          <lpage>3488</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Qiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lyu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>KnowAgent: Knowledge-augmented planning for LLM-based agents</article-title>
          ,
          <source>arXiv preprint arXiv:2403.03101</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>E.</given-names>
            <surname>Karpas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Abend</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Belinkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lenz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lieber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ratner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shoham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Levine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Leyton-Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Muhlgay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rozen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Shachaf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shalev-Shwartz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shashua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tenenholtz</surname>
          </string-name>
          ,
          <article-title>MRKL systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning</article-title>
          ,
          <source>arXiv preprint arXiv:2205.00445</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>L.</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <article-title>LLM-powered autonomous agents</article-title>
          ,
          <source>lilianweng.github.io</source>
          (
          <year>2023</year>
          ). URL: https://lilianweng.github.io/posts/2023-06-23-agent/.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghazvininejad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <article-title>BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension</article-title>
          ,
          <source>arXiv preprint arXiv:1910.13461</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>H.</given-names>
            <surname>Touvron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Stone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Albert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Almahairi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Babaei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bashlykov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Batra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhargava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhosale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bikel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Blecher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. C.</given-names>
            <surname>Ferrer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Cucurull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Esiobu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Fuller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Goswami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hartshorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hosseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Inan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kardas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kerkez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Khabsa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kloumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Korenev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Koura</surname>
          </string-name>
          ,
          <string-name><given-names>M.-A.</given-names><surname>Lachaux</surname></string-name>,
          <string-name><given-names>T.</given-names><surname>Lavril</surname></string-name>,
          <string-name><given-names>J.</given-names><surname>Lee</surname></string-name>,
          <string-name><given-names>D.</given-names><surname>Liskovich</surname></string-name>,
          <string-name><given-names>Y.</given-names><surname>Lu</surname></string-name>,
          <string-name><given-names>Y.</given-names><surname>Mao</surname></string-name>,
          <string-name><given-names>X.</given-names><surname>Martinet</surname></string-name>,
          <string-name><given-names>T.</given-names><surname>Mihaylov</surname></string-name>,
          <string-name><given-names>P.</given-names><surname>Mishra</surname></string-name>,
          <string-name><given-names>I.</given-names><surname>Molybog</surname></string-name>,
          <string-name><given-names>Y.</given-names><surname>Nie</surname></string-name>,
          <string-name><given-names>A.</given-names><surname>Poulton</surname></string-name>,
          <string-name><given-names>J.</given-names><surname>Reizenstein</surname></string-name>,
          <string-name><given-names>R.</given-names><surname>Rungta</surname></string-name>,
          <string-name><given-names>K.</given-names><surname>Saladi</surname></string-name>,
          <string-name><given-names>A.</given-names><surname>Schelten</surname></string-name>,
          <string-name><given-names>R.</given-names><surname>Silva</surname></string-name>,
          <string-name><given-names>E. M.</given-names><surname>Smith</surname></string-name>,
          <string-name><given-names>R.</given-names><surname>Subramanian</surname></string-name>,
          <string-name><given-names>X. E.</given-names><surname>Tan</surname></string-name>,
          <string-name><given-names>B.</given-names><surname>Tang</surname></string-name>,
          <string-name><given-names>R.</given-names><surname>Taylor</surname></string-name>,
          <string-name><given-names>A.</given-names><surname>Williams</surname></string-name>,
          <string-name><given-names>J. X.</given-names><surname>Kuan</surname></string-name>,
          <string-name><given-names>P.</given-names><surname>Xu</surname></string-name>,
          <string-name><given-names>Z.</given-names><surname>Yan</surname></string-name>,
          <string-name><given-names>I.</given-names><surname>Zarov</surname></string-name>,
          <string-name><given-names>Y.</given-names><surname>Zhang</surname></string-name>,
          <string-name><given-names>A.</given-names><surname>Fan</surname></string-name>,
          <string-name><given-names>M.</given-names><surname>Kambadur</surname></string-name>,
          <string-name><given-names>S.</given-names><surname>Narang</surname></string-name>,
          <string-name><given-names>A.</given-names><surname>Rodriguez</surname></string-name>,
          <string-name><given-names>R.</given-names><surname>Stojnic</surname></string-name>,
          <string-name><given-names>S.</given-names><surname>Edunov</surname></string-name>,
          <string-name><given-names>T.</given-names><surname>Scialom</surname></string-name>,
          <article-title>Llama 2: Open foundation and fine-tuned chat models</article-title>
          ,
          <source>arXiv preprint arXiv:2307.09288</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>M.</given-names>
            <surname>Payne</surname>
          </string-name>
          ,
          <article-title>Fine-tuning open LLMs with reinforcement learning from human feedback</article-title>
          ,
          <source>width.ai</source>
          (
          <year>2023</year>
          ). URL: https://www.width.ai/post/reinforcement-learning-from-human-feedback.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-t.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <article-title>A linear programming formulation for global inference in natural language tasks</article-title>
          ,
          in:
          <source>Proceedings of the Eighth Conference on Computational Natural Language Learning (CoNLL-2004) at HLT-NAACL 2004</source>
          , Association for Computational Linguistics, Boston, Massachusetts, USA,
          <year>2004</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . URL: https://aclanthology.org/W04-2401.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>P.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Schütze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Andrassy</surname>
          </string-name>
          ,
          <article-title>Table filling multi-task recurrent neural network for joint entity and relation extraction</article-title>
          , in: Y. Matsumoto, R. Prasad (Eds.),
          <source>Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers</source>
          ,
          <source>The COLING 2016 Organizing Committee</source>
          , Osaka, Japan,
          <year>2016</year>
          , pp.
          <fpage>2537</fpage>
          -
          <lpage>2547</lpage>
          . URL: https://aclanthology.org/C16-1239.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <surname>Gemma Team</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mesnard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hardin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dadashi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhupatiraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pathak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sifre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rivière</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Kale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Love</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tafti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hussenot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chowdhery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Botev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Castro-Ros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Slone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Héliou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tacchetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bulanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Paterson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Tsai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Shahriari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L.</given-names>
            <surname>Lan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Choquette-Choo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Crepy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ippolito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Reid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Buchatskaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ni</surname>
          </string-name>
          ,
          <string-name><given-names>E.</given-names><surname>Noland</surname></string-name>,
          <string-name><given-names>G.</given-names><surname>Yan</surname></string-name>,
          <string-name><given-names>G.</given-names><surname>Tucker</surname></string-name>,
          <string-name><given-names>G.-C.</given-names><surname>Muraru</surname></string-name>,
          <string-name><given-names>G.</given-names><surname>Rozhdestvenskiy</surname></string-name>,
          <string-name>
            <given-names>H.</given-names>
            <surname>Michalewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Tenney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Grishchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Austin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Keeling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Labanowski</surname>
          </string-name>
          ,
          <string-name><given-names>J.-B.</given-names><surname>Lespiau</surname></string-name>,
          <string-name><given-names>J.</given-names><surname>Stanway</surname></string-name>,
          <string-name><given-names>J.</given-names><surname>Brennan</surname></string-name>,
          <string-name><given-names>J.</given-names><surname>Chen</surname></string-name>,
          <string-name><given-names>J.</given-names><surname>Ferret</surname></string-name>,
          <string-name><given-names>J.</given-names><surname>Chiu</surname></string-name>,
          <string-name><given-names>J.</given-names><surname>Mao-Jones</surname></string-name>,
          <string-name><given-names>K.</given-names><surname>Lee</surname></string-name>,
          <string-name><given-names>K.</given-names><surname>Yu</surname></string-name>,
          <string-name><given-names>K.</given-names><surname>Millican</surname></string-name>,
          <string-name><given-names>L. L.</given-names><surname>Sjoesund</surname></string-name>,
          <string-name><given-names>L.</given-names><surname>Lee</surname></string-name>,
          <string-name><given-names>L.</given-names><surname>Dixon</surname></string-name>,
          <string-name><given-names>M.</given-names><surname>Reid</surname></string-name>,
          <string-name><given-names>M.</given-names><surname>Mikuła</surname></string-name>,
          <string-name><given-names>M.</given-names><surname>Wirth</surname></string-name>,
          <string-name><given-names>M.</given-names><surname>Sharman</surname></string-name>,
          <string-name><given-names>N.</given-names><surname>Chinaev</surname></string-name>,
          <string-name><given-names>N.</given-names><surname>Thain</surname></string-name>,
          <string-name><given-names>O.</given-names><surname>Bachem</surname></string-name>,
          <string-name><given-names>O.</given-names><surname>Chang</surname></string-name>,
          <string-name><given-names>O.</given-names><surname>Wahltinez</surname></string-name>,
          <string-name><given-names>P.</given-names><surname>Bailey</surname></string-name>,
          <string-name><given-names>P.</given-names><surname>Michel</surname></string-name>,
          <string-name><given-names>P.</given-names><surname>Yotov</surname></string-name>,
          <string-name><given-names>P. G.</given-names><surname>Sessa</surname></string-name>,
          <string-name><given-names>R.</given-names><surname>Chaabouni</surname></string-name>,
          <string-name><given-names>R.</given-names><surname>Comanescu</surname></string-name>,
          <string-name><given-names>R.</given-names><surname>Jana</surname></string-name>,
          <string-name><given-names>R.</given-names><surname>Anil</surname></string-name>,
          <string-name><given-names>R.</given-names><surname>McIlroy</surname></string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mullins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. L.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Borgeaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Girgin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Douglas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pandya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shakeri</surname>
          </string-name>
          ,
          <string-name><given-names>S.</given-names><surname>De</surname></string-name>,
          <string-name><given-names>T.</given-names><surname>Klimenko</surname></string-name>,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hennigan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Feinberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Stokowiec</surname>
          </string-name>
          ,
          <string-name><given-names>Y.-h.</given-names><surname>Chen</surname></string-name>,
          <string-name><given-names>Z.</given-names><surname>Ahmed</surname></string-name>,
          <string-name><given-names>Z.</given-names><surname>Gong</surname></string-name>,
          <string-name><given-names>T.</given-names><surname>Warkentin</surname></string-name>,
          <string-name><given-names>L.</given-names><surname>Peran</surname></string-name>,
          <string-name><given-names>M.</given-names><surname>Giang</surname></string-name>,
          <string-name><given-names>C.</given-names><surname>Farabet</surname></string-name>,
          <string-name><given-names>O.</given-names><surname>Vinyals</surname></string-name>,
          <string-name><given-names>J.</given-names><surname>Dean</surname></string-name>,
          <string-name><given-names>K.</given-names><surname>Kavukcuoglu</surname></string-name>,
          <string-name><given-names>D.</given-names><surname>Hassabis</surname></string-name>,
          <string-name><given-names>Z.</given-names><surname>Ghahramani</surname></string-name>,
          <string-name><given-names>D.</given-names><surname>Eck</surname></string-name>,
          <string-name><given-names>J.</given-names><surname>Barral</surname></string-name>,
          <string-name><given-names>F.</given-names><surname>Pereira</surname></string-name>,
          <string-name><given-names>E.</given-names><surname>Collins</surname></string-name>,
          <string-name><given-names>A.</given-names><surname>Joulin</surname></string-name>,
          <string-name><given-names>N.</given-names><surname>Fiedel</surname></string-name>,
          <string-name><given-names>E.</given-names><surname>Senter</surname></string-name>,
          <string-name><given-names>A.</given-names><surname>Andreev</surname></string-name>,
          <string-name><given-names>K.</given-names><surname>Kenealy</surname></string-name>,
          <article-title>Gemma: Open models based on Gemini research and technology</article-title>
          ,
          <source>arXiv preprint arXiv:2403.08295</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>A. Q.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sablayrolles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mensch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bamford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Chaplot</surname>
          </string-name>
          ,
          <string-name><given-names>D.</given-names><surname>de las Casas</surname></string-name>,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bressand</surname>
          </string-name>
          ,
          <string-name><given-names>G.</given-names><surname>Lengyel</surname></string-name>,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Saulnier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Lavaud</surname>
          </string-name>
          ,
          <string-name><given-names>M.-A.</given-names><surname>Lachaux</surname></string-name>,
          <string-name><given-names>P.</given-names><surname>Stock</surname></string-name>,
          <string-name><given-names>T. L.</given-names><surname>Scao</surname></string-name>,
          <string-name><given-names>T.</given-names><surname>Lavril</surname></string-name>,
          <string-name><given-names>T.</given-names><surname>Wang</surname></string-name>,
          <string-name><given-names>T.</given-names><surname>Lacroix</surname></string-name>,
          <string-name><given-names>W. E.</given-names><surname>Sayed</surname></string-name>,
          <article-title>Mistral 7B</article-title>
          ,
          <source>arXiv preprint arXiv:2310.06825</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>