<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Enriching Ontologies with Disjointness Axioms using Large Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Elias Crum</string-name>
          <email>elias.crum@ugent.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonio De Santis</string-name>
          <email>antonio.desantis@polimi.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manon Ovide</string-name>
          <email>manon.ovide@univ-tours.fr</email>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jiaxin Pan</string-name>
          <email>jiaxin.pan@ki.uni-stuttgart.de</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessia Pisu</string-name>
          <email>alessia.pisu96@unica.it</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicolas Lazzari</string-name>
          <email>nicolas.lazzari3@unibo.it</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian Rudolph</string-name>
          <email>sebastian.rudolph@tu-dresden.de</email>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <aff id="aff6">
          <label>6</label>
          <institution>TU Dresden</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff0">
          <label>0</label>
          <institution>Ghent University</institution>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Politecnico di Milano</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Cagliari</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Pisa and University of Bologna</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Stuttgart</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>University of Tours</institution>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Ontologies often lack explicit disjointness declarations between classes, despite their usefulness for sophisticated reasoning and consistency checking in Knowledge Graphs. In this study, we explore the potential of Large Language Models (LLMs) to enrich ontologies by identifying and asserting class disjointness axioms. Our approach aims at leveraging the implicit knowledge embedded in LLMs, using prompt engineering to elicit this knowledge for classifying ontological disjointness. We validate our methodology on the DBpedia ontology, focusing on open-source LLMs. Our findings suggest that LLMs, when guided by effective prompt strategies, can reliably identify disjoint class relationships, thus streamlining the process of ontology completion without extensive manual input. For comprehensive disjointness enrichment, we propose a process that takes logical relationships between disjointness and subclass statements into account in order to maintain satisfiability and reduce the number of calls to the LLM. This work provides a foundation for future applications of LLMs in automated ontology enhancement and offers insights into optimizing LLM performance through strategic prompt design. Our code is publicly available on GitHub at https://github.com/n28div/llm-disjointness.</p>
      </abstract>
      <kwd-group>
        <kwd>Large Language Models</kwd>
        <kwd>Disjointness Learning</kwd>
        <kwd>Ontology Enrichment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>It is generally understood that complementing the factual (assertional) knowledge represented
in Knowledge Graphs with ontological (terminological) information greatly advances the
usefulness of the ensuing knowledge base in terms of querying and many other downstream tasks.
This is because combining assertional information with terminological background knowledge
allows for the derivation of a vast amount of implicit knowledge, which is not explicitly stated
in the knowledge base but follows logically from it and thus can be taken into account for all
kinds of knowledge management activities, including query answering.</p>
      <p>By far the most widespread type of ontological information added to knowledge graphs is
taxonomic in nature, that is, it is related to (i) putting the individual objects of interest into categories,
usually referred to as classes, based on shared characteristics and (ii) establishing set-theoretic
relationships between these classes. Among the diverse possible such taxonomic relationships,
the subclass/superclass relationships – tightly connected to the linguistic hyponymy/hypernymy
relationships of the corresponding class names – are the ones predominantly found across
numerous ontologies today, typically forming sizeable conceptual hierarchies. As an example,
the subclass/superclass relationship between the classes Mammal and Vertebrate implies that
any object that belongs to (or, in more technical terms: is an instance of) the class Mammal also
must belong to the class Vertebrate.</p>
      <p>Another well-known basic type of taxonomic relationship between two classes is that of
disjointness. Two classes are said to be disjoint if it is impossible that they have common
instances, which, intuitively, means that the two classes cannot overlap, and membership
in these two classes is mutually exclusive. For example, disjointness of the classes Mammal
and Fish implies that any instance of Mammal must not be an instance of Fish. Given the
symmetric nature of disjointness, this is logically equivalent to saying that any instance of
Fish must not be an instance of Mammal. As opposed to subclass statements, which allow for
inferring positive facts from other positive facts, disjointness statements enable the inference
of negated facts. For example, given the fact that Flipper is an instance of Mammal, the above
subclass relationship gives rise to the information that Flipper is an instance of Vertebrate,
whereas the disjointness statement allows us to infer the information that Flipper must not
be an instance of Fish. This fact makes disjointness information particularly valuable in the
context of machine-learning approaches that rely on the presence of negative examples, such
as Knowledge Graph Embedding.</p>
      <p>When specifying taxonomic relationships between classes in the course of the ontology design
process, it should be kept in mind that they are not meant to reflect spurious relationships in
the data currently available, but rather they are supposed to represent immutable background
knowledge that continues to hold in different situations or at different points in time. For
instance, although historically, no woman has served as US President, a woman may be elected
as the US President in the future. Therefore, the corresponding classes Woman and USPresident
are not (ontologically) disjoint, although we might call them materially disjoint due to the
absence of material evidence demonstrating their non-disjointness. To reflect this situation more
formally, one can employ the idea of possible or conceivable worlds (referred to as interpretations
in model-theoretic terms), which, e.g., include potential future or merely hypothetical circumstances.
Then, a certain taxonomic relationship (such as subclass or disjointness) between two classes holds
if the corresponding set relation (such as subset or intersection-emptiness) holds between the sets
of class instances in every conceivable world (under every conceivable interpretation). Based on
this, we will employ a very lightweight logical framework to give our arguments a formal
underpinning: Stipulating a set 𝕀 of conceivable worlds, we define taxonomic relationships for this
set. The goal of ontological knowledge modeling is to capture 𝕀 using a knowledge base 𝒦 whose
statements rule out the inconceivable worlds so that only the conceivable ones remain as models of 𝒦.</p>
      <p>Definition 1. Fixing a vocabulary consisting of a set 𝐂 of class names and a set 𝐈 of individual
names, an interpretation ℐ = (Δ, ·<sup>ℐ</sup>) consists of a set Δ called the domain and a function ·<sup>ℐ</sup>
mapping every class name C ∈ 𝐂 to a subset C<sup>ℐ</sup> ⊆ Δ and every individual name i ∈ 𝐈 to an
element i<sup>ℐ</sup> ∈ Δ.</p>
      <p>Let 𝕀 be a set of interpretations, representing the conceivable worlds. Then, for an individual
name i ∈ 𝐈 and for class names C, D ∈ 𝐂 we call
• i an instance of C (written i : C) if every interpretation ℐ ∈ 𝕀 satisfies i<sup>ℐ</sup> ∈ C<sup>ℐ</sup>,
• C a subclass of D (written C ⊑ D) if every interpretation ℐ ∈ 𝕀 satisfies C<sup>ℐ</sup> ⊆ D<sup>ℐ</sup>,
• C disjoint with D if every interpretation ℐ ∈ 𝕀 satisfies C<sup>ℐ</sup> ∩ D<sup>ℐ</sup> = ∅,
• C incoherent if every interpretation ℐ ∈ 𝕀 satisfies C<sup>ℐ</sup> = ∅.</p>
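      <p>The four notions defined above can be illustrated on a small, finite set of conceivable worlds. The following is only a sketch under stated assumptions: the two worlds, the element names, and the helper function names are hypothetical and chosen for illustration, and a real ontology has infinitely many interpretations that a reasoner handles symbolically rather than by enumeration.</p>

```python
# Each conceivable world interprets class names as sets of domain elements
# and individual names as single domain elements.
# The two worlds below are illustrative assumptions, not data from the paper.
WORLDS = [
    {
        "classes": {
            "Mammal": {"flipper"},
            "Vertebrate": {"flipper", "nemo"},
            "Fish": {"nemo"},
        },
        "individuals": {"Flipper": "flipper"},
    },
    {
        "classes": {
            "Mammal": {"a", "b"},
            "Vertebrate": {"a", "b"},
            "Fish": set(),
        },
        "individuals": {"Flipper": "a"},
    },
]


def is_instance(worlds, individual, cls):
    """i : C holds if the individual lies in the class in every world."""
    return all(w["individuals"][individual] in w["classes"][cls] for w in worlds)


def is_subclass(worlds, c, d):
    """C is a subclass of D if C's extension is a subset of D's in every world."""
    return all(w["classes"][c].issubset(w["classes"][d]) for w in worlds)


def is_disjoint(worlds, c, d):
    """C and D are disjoint if their extensions never overlap in any world."""
    return all(w["classes"][c].isdisjoint(w["classes"][d]) for w in worlds)


def is_incoherent(worlds, c):
    """C is incoherent if its extension is empty in every world."""
    return all(len(w["classes"][c]) == 0 for w in worlds)


print(is_subclass(WORLDS, "Mammal", "Vertebrate"))  # True
print(is_disjoint(WORLDS, "Mammal", "Fish"))        # True
print(is_incoherent(WORLDS, "Fish"))                # False: Fish is non-empty in the first world
```

      <p>Note that Fish being empty in the second world does not make it incoherent, because incoherence requires emptiness in every conceivable world.</p>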
      <p>
        As discussed above, ontologically dictated taxonomic relationships can be leveraged for
sophisticated reasoning and consistency-checking tasks when reasoning over a knowledge graph.
Yet, despite their usefulness, disjointness relationships are rarely explicitly recorded within
an ontology. Research on 1,275 ontologies showed that only 97 of them include disjointness
assertions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Arguably, this can be explained by the fact that disjointness information is so
self-evident from a human common-sense point of view, that human experts are often not
aware that it is not logically “built-in” but needs to be explicitly specified. For this reason,
semi-automated labeling of disjoint classes could be advantageous. Recent approaches [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]
propose supervised and unsupervised models using various features in disjointness axioms.
However, these methods generalize poorly beyond the specific datasets they were designed for and cannot
be applied on a large scale. Additionally, the sophisticated feature engineering required
hinders their practical application. Therefore, a method that functions independently of feature
design and dataset restrictions is highly desirable.
      </p>
      <p>Given that (i) ontological class descriptions are often recorded as (or associated with) terms
in natural language and (ii) LLMs have been found to possess wide linguistic and semantic
working knowledge, we aim to assess the potential of LLMs to decide which
classes ought to be disjoint, while also assessing the impact of prompt engineering on classification
validity. We hypothesize that, through the use of prompt engineering, LLMs are able to classify
ontologically disjoint classes with high validity in both positive (two classes are ontologically
disjoint) and negative (two classes are not ontologically disjoint) cases. We test our hypothesis
on the DBpedia ontology<sup>2</sup> using LLMs. We propose a method that intertwines the
LLM-based disjointness classification with basic logical inferencing to increase efficiency, maintain
consistency, and minimize the number of calls to the LLM.</p>
      <p>Thus, this paper is dedicated to answering the following main research questions:
RQ1: Can LLMs help enrich ontologies with class disjointness axioms?
RQ2: Which LLM prompts work better for disjointness discovery?</p>
      <p>RQ3: How can we exploit taxonomic relationships to reduce interaction with the LLM?</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        <bold>Disjointness Learning.</bold> Models for disjointness learning can be categorized into supervised
and unsupervised approaches. In the unsupervised category, Schlobach [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] follows the strong
disjointness assumption [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which posits that children of a common parent in the subsumption
hierarchy should be considered disjoint. They introduced a pinpointing algorithm to identify
minimal sets of axioms that need revision to make an ontology coherent, thereby enriching
appropriate disjointness statements. However, this approach neglects background knowledge,
which could be beneficial in identifying disjoint classes. Rizzo et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] propose an
unsupervised approach based on concept learning and inductive classification. This method employs a
hierarchical conceptual clustering technique capable of providing intensional cluster
descriptions and utilizes a novel form of semi-distances over individuals in an ontological knowledge
base, incorporating available background knowledge. In the supervised category, Völker et al.
[
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ] gather syntactic and semantic evidence, such as positive and negative association rules
as well as correlation coefficients, from various sources to establish a strong foundation for
learning disjointness. However, their work exploits background knowledge and reasoning
only to a limited extent. Subsequent work, the DL-Learner by Lehmann [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], uses Inductive
Logic Programming (ILP) for learning class descriptions, including disjointness. Despite these
advancements, disjointness learning with LLMs remains largely underexplored.
      </p>
      <p>
        <bold>Large Language Models.</bold> In recent years, Large Language Models (LLMs) have become
state-of-the-art for Natural Language Processing and have also significantly impacted other
fields such as knowledge engineering [
        <xref ref-type="bibr" rid="ref10 ref11 ref8 ref9">8, 9, 10, 11</xref>
        ]. LLMs rely on pre-training Transformer
models [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] over large-scale unlabeled corpora. Pre-trained context-aware word representations
achieve state-of-the-art performance on various downstream tasks and set the “pre-training
and fine-tuning” learning paradigm. Early LLMs, such as BERT [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], utilized relatively small
training corpora and required fine-tuning for specific downstream tasks. However, subsequent
research demonstrated that scaling up both model size and dataset volume significantly enhances
performance. GPT-3 [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], for instance, achieves competitive results through few-shot learning
and in-context learning without parameter updates. GPT-3.5 further improves capabilities by
incorporating reinforcement learning from human feedback (RLHF). The introduction of GPT-4
[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] marked a milestone by extending beyond text input to include multimodal signals. Meta
AI introduced the collection of LLaMA models [
        <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
        ] with four different sizes. Other notable
LLMs, such as Claude, Gemini [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], and Mixtral [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], have also garnered significant attention.
      </p>
      <p>
        <bold>Prompt Engineering.</bold> Designing effective prompts for LLMs is essential for maximizing their
potential. Key strategies in prompt engineering include zero-shot [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], few-shot [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and
chain-of-thought [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] prompting. Zero-shot [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] involves providing task descriptions to LLMs without
any input-output examples, relying on the models’ pre-existing knowledge to generate responses.
Few-shot [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] includes input-output examples, guiding the models’ generation process.
Chain-of-Thought (CoT) [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] promotes coherent and step-by-step reasoning by decomposing a complex
question into a series of simpler logical reasoning questions, mimicking human problem-solving
processes. This method has been shown to significantly improve performance on reasoning
tasks [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. However, the need for multiple prompts makes this approach difficult to use at large
scale. With this in mind, Kojima et al. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] proposed Zero-shot-CoT prompting. They found that,
by appending the phrase “Let’s think step by step.” to the end of a question, LLMs can generate
a chain of thought that leads to more accurate answers than the vanilla zero-shot approach.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Resources</title>
      <p>To effectively assess the ability of LLMs to support the assertion of disjointness axioms, we
ideally require a reference ontology that includes (i) a sizeable set of classes, to ensure diversity
during the experiments, and (ii) some disjoint classes in its description, preferably specified through
a dedicated disjointness property such as owl:disjointWith. These criteria maximize the
generalizability of the approach and encourage its use in future studies.</p>
      <p>Several ontologies can be identified for this task, from foundational ontologies, such as
DOLCE<sup>3</sup> or UFO<sup>4</sup>, to domain-specific ontologies, such as FoodOn<sup>5</sup>. Disjointness axioms from
these ontologies, however, are not intuitive and require extensive common-sense reasoning
and domain knowledge. For instance, DOLCE defines an Event to be disjoint from an Object
while UFO does not. Both modeling choices are correct, as they depend on each ontology's philosophical
commitment to these abstract concepts. Similarly, the FoodOn ontology asserts that the Arabica
coffee plant<sup>6</sup>, the plant used to produce black coffee, is disjoint with Camellia sinensis<sup>7</sup>, the plant
used to produce black tea. In this case, deciding whether the two plants should be considered
disjoint highly depends on the domain of the ontology. To avoid feeding the LLM with classes
whose disjointness highly depends on the context or domain, we choose to avoid foundational
and domain-specific ontologies for our initial experiments. Moreover, as our interaction with
the LLM is based on natural language, we only consider ontologies that provide natural language
labels for classes via labeling properties, such as skos:prefLabel or rdfs:label.</p>
      <p>
        We ultimately decided to use the DBpedia ontology<sup>8</sup> because of its general popularity and
its conformity with these minimal requirements. Since the DBpedia ontology is created through
a crowdsourcing approach [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], the availability and accuracy of disjointness axioms cannot be expected to
be uniform across all classes, as they depend on the annotators’ expertise and diligence.
This issue has been actively discussed within the DBpedia community<sup>9</sup>. The main drawback
is the lack of a systematic approach in the creation of the taxonomy, which greatly impacts
the consistency of the ontology when disjointness axioms are asserted. In particular, we found
23 explicit disjointness axioms in the DBpedia ontology. In Section 4 we show how exploiting
automated reasoning techniques allows the creation of a larger pool of disjoint classes. Table 1
shows a selection of disjointness axioms within the ontology. Indeed, most of the disjointness
axioms are universally known common-sense relations, such as the disjointness between dbo:Fish
and dbo:Mammal or dbo:Agent and dbo:Place.
      </p>
      <p>
3 https://github.com/appliedontolab/DOLCE/blob/main/OWL/DOLCEbasic.owl and http://www.ontologydesignpatterns.org/ont/dul/DUL.owl
4 https://nemo-ufes.github.io/gufo/
5 https://foodon.org/
6 https://en.wikipedia.org/wiki/Coffea_arabica
7 https://en.wikipedia.org/wiki/Camellia_sinensis
8 https://DBpedia.org/ontology/, often referred to with the dbo: namespace, which we omit hereafter
9 https://github.com/DBpedia/ontology-tracker/issues/2
      </p>
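      <table-wrap>
        <label>Table 1</label>
        <caption>
          <p>A selection of disjointness axioms within the DBpedia ontology.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Class A</th><th>Class B</th></tr>
          </thead>
          <tbody>
            <tr><td>http://DBpedia.org/ontology/Person</td><td>http://DBpedia.org/ontology/ProtohistoricalPeriod</td></tr>
            <tr><td>http://DBpedia.org/ontology/Person</td><td>http://DBpedia.org/ontology/UnitOfWork</td></tr>
            <tr><td>http://DBpedia.org/ontology/Agent</td><td>http://DBpedia.org/ontology/Place</td></tr>
            <tr><td>http://DBpedia.org/ontology/Fish</td><td>http://DBpedia.org/ontology/Mammal</td></tr>
            <tr><td>http://DBpedia.org/ontology/Event</td><td>http://DBpedia.org/ontology/Person</td></tr>
          </tbody>
        </table>
      </table-wrap>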
    </sec>
    <sec id="sec-4">
      <title>4. Proposed approach</title>
      <p>We now describe our approach which, given a knowledge base, determines for every pair of named
classes of that knowledge base whether disjointness should hold between the two classes or not. At the core of
the approach is prompting an LLM to exploit the semantic and linguistic “world knowledge” it
has obtained from training on vast amounts of textual data. The two major underlying objectives
of our approach are:
1. Ensuring that the resulting disjointness-enriched ontology is satisfiable (i.e.,
contradiction-free), since otherwise it would be unusable for any reasoning tasks,
including ontology-supported querying.
2. Minimizing the number of interactions with the LLM for efficiency reasons and
cost-awareness.</p>
      <p>We propose to address both objectives using automated reasoning. More specifically, we
continuously materialize all the (non-)disjointness information that follows logically from the
original knowledge base plus the already acquired disjointness information. Thus, the LLM is
only queried about the disjointness status of pairs of classes, when neither of the outcomes
would result in an inconsistency. In this way, the derived information remains contradiction-free
“by design” and, at the same time, the number of queries to the LLM is significantly reduced.
Our approach relies on several logical correspondences, discussed in the following.
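Proposition 1. Let 𝒦 be a knowledge base and let C<sub>1</sub>, C<sub>2</sub>, D<sub>1</sub>, D<sub>2</sub> be classes of 𝒦 such that the
following statements follow from 𝒦: (i) C<sub>1</sub> and C<sub>2</sub> are disjoint, (ii) D<sub>1</sub> is a subclass of C<sub>1</sub>, (iii) D<sub>2</sub> is
a subclass of C<sub>2</sub>. Then 𝒦 also entails that D<sub>1</sub> and D<sub>2</sub> are disjoint.</p>
      <p>Proof. Consider an arbitrary model ℐ of 𝒦. According to the assumptions and in view of
Definition 1, we know that (i) C<sub>1</sub><sup>ℐ</sup> ∩ C<sub>2</sub><sup>ℐ</sup> = ∅, (ii) D<sub>1</sub><sup>ℐ</sup> ⊆ C<sub>1</sub><sup>ℐ</sup>, and (iii) D<sub>2</sub><sup>ℐ</sup> ⊆ C<sub>2</sub><sup>ℐ</sup>. We equivalently
express (ii) and (iii) by (ii’) D<sub>1</sub><sup>ℐ</sup> = C<sub>1</sub><sup>ℐ</sup> ∩ D<sub>1</sub><sup>ℐ</sup>, and (iii’) D<sub>2</sub><sup>ℐ</sup> = C<sub>2</sub><sup>ℐ</sup> ∩ D<sub>2</sub><sup>ℐ</sup>. This allows us to infer
D<sub>1</sub><sup>ℐ</sup> ∩ D<sub>2</sub><sup>ℐ</sup> = (C<sub>1</sub><sup>ℐ</sup> ∩ D<sub>1</sub><sup>ℐ</sup>) ∩ (C<sub>2</sub><sup>ℐ</sup> ∩ D<sub>2</sub><sup>ℐ</sup>) = (C<sub>1</sub><sup>ℐ</sup> ∩ C<sub>2</sub><sup>ℐ</sup>) ∩ (D<sub>1</sub><sup>ℐ</sup> ∩ D<sub>2</sub><sup>ℐ</sup>) = ∅ ∩ (D<sub>1</sub><sup>ℐ</sup> ∩ D<sub>2</sub><sup>ℐ</sup>) = ∅.</p>
      <p>We exploit this property to use subclass relationships from 𝒦 to deduce class disjointness
statements from existing class disjointness statements. This way we avoid posing redundant
disjointness queries to the underlying LLM.</p>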
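      <p>Proposition 1 can be read as a propagation rule: every disjointness statement between two classes is inherited by all pairs of their subclasses. The following is a minimal sketch of that rule, assuming the subclass hierarchy is given as a dictionary from each class to its direct superclasses. The function names and the classes Dolphin and Shark are hypothetical, and a real implementation would delegate this step to an OWL reasoner.</p>

```python
from itertools import product


def descendants(subclass_of, cls):
    """Return cls together with all its direct and indirect subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for child, parents in subclass_of.items():
            if current in parents and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def propagate_disjointness(subclass_of, disjoint_pairs):
    """Apply Proposition 1: disjointness of (C1, C2) carries over to (D1, D2)."""
    inferred = set()
    for c1, c2 in disjoint_pairs:
        for d1, d2 in product(descendants(subclass_of, c1), descendants(subclass_of, c2)):
            # Disjointness is symmetric, so a pair is stored as an unordered set.
            inferred.add(frozenset((d1, d2)))
    return inferred


# Toy hierarchy (illustrative): maps each class to its direct superclasses.
SUBCLASS_OF = {
    "Dolphin": {"Mammal"},
    "Shark": {"Fish"},
    "Mammal": {"Animal"},
    "Fish": {"Animal"},
}

inferred = propagate_disjointness(SUBCLASS_OF, [("Mammal", "Fish")])
print(frozenset(("Dolphin", "Shark")) in inferred)  # True
print(len(inferred))                                # 4
```

      <p>In this toy example a single asserted axiom settles four class pairs, none of which would need to be sent to the LLM.</p>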
      <p>Proposition 2. Let 𝒦 be a knowledge base and let C<sub>1</sub>, C<sub>2</sub>, C be classes of 𝒦 such that the following
statements follow from 𝒦: (i) C<sub>1</sub> and C<sub>2</sub> are disjoint, (ii) C is a subclass of C<sub>1</sub>, (iii) C is a subclass of
C<sub>2</sub>. Then 𝒦 also entails that C is incoherent.</p>
      <p>Proof. Consider an arbitrary model ℐ of 𝒦. According to the assumptions and in view of
Definition 1, we know that (i) C<sub>1</sub><sup>ℐ</sup> ∩ C<sub>2</sub><sup>ℐ</sup> = ∅, (ii) C<sup>ℐ</sup> ⊆ C<sub>1</sub><sup>ℐ</sup>, and (iii) C<sup>ℐ</sup> ⊆ C<sub>2</sub><sup>ℐ</sup>. We equivalently
express (ii) and (iii) by (ii’) C<sup>ℐ</sup> = C<sub>1</sub><sup>ℐ</sup> ∩ C<sup>ℐ</sup>, and (iii’) C<sup>ℐ</sup> = C<sub>2</sub><sup>ℐ</sup> ∩ C<sup>ℐ</sup>. This allows us to infer
C<sup>ℐ</sup> = C<sup>ℐ</sup> ∩ C<sup>ℐ</sup> = (C<sub>1</sub><sup>ℐ</sup> ∩ C<sup>ℐ</sup>) ∩ (C<sub>2</sub><sup>ℐ</sup> ∩ C<sup>ℐ</sup>) = (C<sub>1</sub><sup>ℐ</sup> ∩ C<sub>2</sub><sup>ℐ</sup>) ∩ C<sup>ℐ</sup> = ∅ ∩ C<sup>ℐ</sup> = ∅.</p>
      <p>We exploit this property indirectly under the assumption that any named class C in the
considered ontology is supposed to have instances – which seems to be a reasonable assumption
since, otherwise, the definition of the class appears to be meaningless. In that case, any two
classes that have a common subclass cannot be disjoint.</p>
      <p>Proposition 3. Let 𝒦 be a knowledge base, let C<sub>1</sub>, C<sub>2</sub> be classes and let e be an individual of
𝒦 such that the following statements follow from 𝒦: (i) C<sub>1</sub> and C<sub>2</sub> are disjoint, (ii) e is an
instance of C<sub>1</sub>, (iii) e is an instance of C<sub>2</sub>. Then 𝒦 is unsatisfiable.</p>
      <p>Proof. Suppose ℐ is a model of 𝒦. According to the assumptions and in view of Definition 1,
we know that (i) C<sub>1</sub><sup>ℐ</sup> ∩ C<sub>2</sub><sup>ℐ</sup> = ∅, (ii) e<sup>ℐ</sup> ∈ C<sub>1</sub><sup>ℐ</sup>, and (iii) e<sup>ℐ</sup> ∈ C<sub>2</sub><sup>ℐ</sup>. Then, combining (ii) and (iii) we
obtain e<sup>ℐ</sup> ∈ C<sub>1</sub><sup>ℐ</sup> ∩ C<sub>2</sub><sup>ℐ</sup>, and applying (i) yields e<sup>ℐ</sup> ∈ ∅, which is a contradictory statement. Thus 𝒦
cannot have any models, which means it is unsatisfiable.</p>
      <p>Again, this property can be exploited by noting that any two classes having common instances
cannot be disjoint. These considerations lead to the proposed methodology, detailed in
Algorithm 1, which achieves the above-mentioned objective of producing an enriched knowledge
base that is guaranteed to be contradiction-free, provided that the original knowledge base is.</p>
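      <p>Read contrapositively, Propositions 2 and 3 identify class pairs that must be labeled as not disjoint before any LLM query is made: pairs with a common (non-empty) subclass and pairs with a common instance. The sketch below illustrates this under the stated assumption that every named class has instances. The data layout, the function names, and the example classes and individual are hypothetical and are not taken from Algorithm 1.</p>

```python
from itertools import combinations


def ancestors(subclass_of, cls):
    """Return cls together with all its direct and indirect superclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


def non_disjoint_pairs(subclass_of, instance_of, classes):
    """Collect class pairs that cannot be disjoint (Propositions 2 and 3)."""
    pairs = set()
    # Proposition 2: a named class (assumed non-empty) is a common subclass
    # of each pair among itself and its superclasses.
    for cls in classes:
        for a, b in combinations(sorted(ancestors(subclass_of, cls)), 2):
            pairs.add(frozenset((a, b)))
    # Proposition 3: an individual is a common instance of all its types,
    # including the superclasses of its asserted types.
    for asserted in instance_of.values():
        types = set()
        for cls in asserted:
            types |= ancestors(subclass_of, cls)
        for a, b in combinations(sorted(types), 2):
            pairs.add(frozenset((a, b)))
    return pairs


# Toy input (illustrative only).
SUBCLASS_OF = {"Dolphin": {"Mammal"}, "Mammal": {"Animal"}}
INSTANCE_OF = {"ada": {"Athlete", "Person"}}
CLASSES = ["Animal", "Mammal", "Dolphin", "Athlete", "Person"]

known = non_disjoint_pairs(SUBCLASS_OF, INSTANCE_OF, CLASSES)
print(frozenset(("Athlete", "Person")) in known)  # True: they share the instance "ada"
print(frozenset(("Mammal", "Person")) in known)   # False: status still unknown
```

      <p>Pairs that end up in neither the inferred-disjoint set nor this set are the only ones whose status remains open.</p>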
      <p>Algorithm 2 achieves the objective of reducing the number of interactions with the LLM and
maintaining satisfiability as new disjointness information is added through the LLM. The aim
of producing an output that accurately reflects taxonomic relationships crucially depends on
the quality and accuracy of the LLM’s responses. This, in turn, is influenced by both the LLM
itself and the chosen prompting strategy. We focus on these issues in Section 5.</p>
      <p>The last steps of Algorithm 2 (lines 14 and 15) are optional, but highly recommended, as they
remove logically redundant statements from the disjointness-enriched knowledge base 𝒦 ∪ 𝒟, that is, the
original knowledge base 𝒦 together with the acquired disjointness statements 𝒟.
This yields a knowledge base that is logically equivalent but typically much smaller in size and
hence both easier to process algorithmically and to scrutinize and maintain manually. Also, this
“pruning step” is not computationally expensive, as it only requires |𝒟*| calls to a reasoner.</p>
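      <p>The pruning step can be sketched as a single pass over the acquired disjointness statements, dropping each statement that still follows from the remaining ones. This is an illustrative sketch, not the authors' Algorithm 2: the function <monospace>entails</monospace> is a hypothetical stand-in for a reasoner call and only applies the inheritance rule of Proposition 1, and all names and example classes are assumptions.</p>

```python
def ancestors(subclass_of, cls):
    """Return cls together with all its direct and indirect superclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


def entails(subclass_of, disjoint_pairs, pair):
    """Stand-in for a reasoner call: does `pair` follow via Proposition 1?"""
    a, b = tuple(pair)
    anc_a, anc_b = ancestors(subclass_of, a), ancestors(subclass_of, b)
    return any(
        (c1 in anc_a and c2 in anc_b) or (c1 in anc_b and c2 in anc_a)
        for c1, c2 in (tuple(p) for p in disjoint_pairs)
    )


def prune(subclass_of, disjoint_pairs):
    """Remove redundant statements using one entailment check per statement."""
    kept = set(disjoint_pairs)
    # A fixed iteration order keeps the result reproducible.
    for pair in sorted(disjoint_pairs, key=sorted):
        rest = kept - {pair}
        if entails(subclass_of, rest, pair):
            kept = rest
    return kept


# Toy input (illustrative only): four statements, three of them redundant.
SUBCLASS_OF = {"Dolphin": {"Mammal"}, "Shark": {"Fish"}}
STATEMENTS = {
    frozenset(("Mammal", "Fish")),
    frozenset(("Dolphin", "Fish")),
    frozenset(("Mammal", "Shark")),
    frozenset(("Dolphin", "Shark")),
}

print(prune(SUBCLASS_OF, STATEMENTS) == {frozenset(("Mammal", "Fish"))})  # True
```

      <p>The loop performs exactly one entailment check per statement, which matches the linear number of reasoner calls mentioned above.</p>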
    </sec>
    <sec id="sec-5">
      <title>5. Experiments</title>
      <p>In this section, we experiment with the approach proposed in Section 4 on the classes extracted
from the DBpedia ontology. In particular, by relying on Algorithm 1, we obtain the list ℒ related
to the DBpedia ontology. We have that |ℒ| = 1148, with 370 pairs labeled as disjoint and 778
pairs labeled as not unknown. In Table 2, we provide some examples of classes in ℒ.</p>
      <p>Algorithm 2. Determine the set of disjointness statements 𝒟 consistent with 𝒦.</p>
      <p>Input: A list ℒ containing pairs of classes labeled as “unknown”, “disjoint” or “not disjoint”;
a prompt for disjointness classification, together with the function that maps a pair of classes to
{“disjoint”, “not disjoint”} by querying an LLM for the disjointness of the two classes using that prompt.</p>
      <p>Output: A set 𝒟 of class disjointness axioms, such that all valid disjointness statements
logically follow from 𝒦 ∪ 𝒟 and no invalid disjointness statements follow from it.</p>
      <p>Note that the list ℒ assumes that the ontology designers carefully produced a taxonomy
that is intended to also reflect disjointness between classes. As shown in Section 3, however,
this is not the case. The taxonomy of DBpedia is structured such that disjointness
axioms might result in unwanted inconsistencies. For this reason, we employ multiple metrics
to evaluate the LLMs’ performance, each measuring a different behavior of the model. For
all metrics, a higher score indicates better performance, with 1 being the maximum score. In
particular, disjoint recall (DR) measures how much the LLM aligns with humans by measuring
the share of true disjointness axioms that have been identified by the LLM. This measure
provides an evaluation of the reliability of the prompt. Non-disjoint F1 (NDF1) measures the F1
score between the non-disjoint pairs in ℒ and the ones identified by the LLM. This provides a
measure of how conservative the LLM is in its answers – i.e., how much the LLM acknowledges
the open-world assumption. The F1 metrics measure the end-to-end performance of the model.
The symmetric consistency metric (SC) measures how much the answers provided by the LLM
respect the symmetric property of the disjointness axiom – i.e., if A is disjoint from B then B is
disjoint from A. Finally, we measure the overall accuracy of each model.</p>
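      <p>The evaluation metrics described above can be sketched as follows. This is a minimal illustration, assuming that reference and predicted labels are stored in dictionaries keyed by class pairs and use the label strings "disjoint" and "not disjoint". The function names and the toy data are hypothetical, and the exact formulas used in the evaluation may differ in detail.</p>

```python
def disjoint_recall(gold, predicted):
    """Share of reference disjoint pairs that the model also labels disjoint."""
    gold_disjoint = [pair for pair, label in gold.items() if label == "disjoint"]
    hits = sum(1 for pair in gold_disjoint if predicted[pair] == "disjoint")
    return hits / len(gold_disjoint)


def non_disjoint_f1(gold, predicted):
    """F1 score with "not disjoint" treated as the positive class."""
    target = "not disjoint"
    tp = sum(1 for p in gold if gold[p] == target and predicted[p] == target)
    fp = sum(1 for p in gold if gold[p] != target and predicted[p] == target)
    fn = sum(1 for p in gold if gold[p] == target and predicted[p] != target)
    return 2 * tp / (2 * tp + fp + fn)


def accuracy(gold, predicted):
    """Share of pairs on which prediction and reference agree."""
    return sum(1 for p in gold if gold[p] == predicted[p]) / len(gold)


def symmetric_consistency(answers):
    """Share of ordered pairs (a, b) whose answer matches the one for (b, a)."""
    pairs = [(a, b) for (a, b) in answers if (b, a) in answers]
    agree = sum(1 for a, b in pairs if answers[(a, b)] == answers[(b, a)])
    return agree / len(pairs)


# Toy data (illustrative only).
GOLD = {
    ("Fish", "Mammal"): "disjoint",
    ("Agent", "Place"): "disjoint",
    ("Castle", "Prison"): "not disjoint",
    ("Garden", "HistoricPlace"): "not disjoint",
}
PREDICTED = {
    ("Fish", "Mammal"): "disjoint",
    ("Agent", "Place"): "not disjoint",
    ("Castle", "Prison"): "not disjoint",
    ("Garden", "HistoricPlace"): "disjoint",
}
ANSWERS = {
    ("Fish", "Mammal"): "disjoint",
    ("Mammal", "Fish"): "disjoint",
    ("Agent", "Place"): "disjoint",
    ("Place", "Agent"): "not disjoint",
}

print(disjoint_recall(GOLD, PREDICTED))    # 0.5
print(non_disjoint_f1(GOLD, PREDICTED))    # 0.5
print(accuracy(GOLD, PREDICTED))           # 0.5
print(symmetric_consistency(ANSWERS))      # 0.5
```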
      <p><bold>Prompting.</bold> We adopt different prompting strategies: a naive approach, where the LLM has
to autonomously understand the task; a task description approach, where the disjointness
task is described; and a few-shot approach that extends the task description by also providing
some positive and negative examples. For each prompt, we frame the problem as a
question-answering (QA) task, where the LLM has to answer positively or negatively to classify two
classes as disjoint. To identify the best QA approach, we define two prompts: (i) the LLM has
to answer positively to classify two classes as disjoint and (ii) the LLM has to answer negatively.
Table 3 describes the prompt templates we used. When possible, we rely on the instruction
format of each LLM and use the Prompting Strategy template to instruct the LLM, while we use
the QA Strategy as a query to the instructed LLM.</p>
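      <p>The difference between the two QA strategies lies in how a yes/no answer is mapped to a disjointness label. The sketch below illustrates that mapping. The task description string is quoted from the prompt text of this paper, whereas the two question wordings, the strategy names "positive" and "negative", and the function names are hypothetical placeholders rather than the actual templates of Table 3.</p>

```python
# Task description quoted from the paper's prompt text; the question templates
# below are hypothetical wordings, not the authors' exact prompts.
TASK_DESCRIPTION = (
    "This is a question about ontological disjointness, "
    "answer only with \"yes\" or \"no\"."
)


def build_question(class_a, class_b, qa_strategy):
    """Return the query for one class pair under the chosen QA strategy."""
    if qa_strategy == "positive":
        # A "yes" answer means the two classes are disjoint.
        return f"Is '{class_a}' disjoint from '{class_b}'?"
    if qa_strategy == "negative":
        # A "no" answer means the two classes are disjoint.
        return f"Can a '{class_a}' also be a '{class_b}'?"
    raise ValueError(f"unknown QA strategy: {qa_strategy}")


def answer_to_label(answer, qa_strategy):
    """Map a raw yes/no answer to a disjointness label.

    Any answer that does not start with "yes" is treated as "no".
    """
    if qa_strategy not in ("positive", "negative"):
        raise ValueError(f"unknown QA strategy: {qa_strategy}")
    said_yes = answer.strip().lower().startswith("yes")
    if qa_strategy == "positive":
        return "disjoint" if said_yes else "not disjoint"
    return "not disjoint" if said_yes else "disjoint"


print(answer_to_label("Yes.", "positive"))  # disjoint
print(answer_to_label("No", "negative"))    # disjoint
print(answer_to_label("yes", "negative"))   # not disjoint
```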
      <sec id="sec-5-1">
        <title>5.1. Experimental setup</title>
        <p>We perform our experiments on publicly available LLMs, to ensure full reproducibility of the
experiments. For each LLM, we set the sampling temperature to 0, to reduce the randomness
of the result. Moreover, we only rely on small LLMs – i.e., LLMs with approximately 8 billion
parameters. Through the use of proper optimization techniques, it is possible to run these
models on consumer-level devices without the need for specialized hardware. We perform
our experiments on a selection of the current state-of-the-art models, including Mistral 0.3 7B
[
          <xref ref-type="bibr" rid="ref25">25</xref>
          ], Gemma 2 9B [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ], Llama 3 8B<sup>10</sup>, and Qwen 2 7B [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]<sup>11</sup>. All experiments are run on 8-bit
quantized models on an RTX 3090 with 24 GB of RAM. We experiment with each combination of
the prompts of Table 3.
        </p>
        <p>Table 3 (fragment, Prompting Strategy column): This is a question about ontological disjointness, answer only with
“yes” or “no”.</p>
        <p>Examples of disjoint are: “person” and “file system”, “tower” and
“person”, “place” and “agent”, “continent” and “sea”, “baseball league”
and “bowling league”, “planet” and “star”.</p>
        <p>Examples of not disjoint are: “basketball player” and “baseball player”,
“means of transportation” and “reptile”, “garden” and “historic place”,
“president” and “beauty queen”, “castle” and “prison”.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Results</title>
        <p>The overall results are shown in Table 4. In general, LLMs achieve promising results in
disjointness detection. Notably, the best-performing prompting technique does not rely on few-shot
examples but instead gives the LLM little to no description of the task. Indeed, it has been observed
that few-shot prompting is more effective when in-context learning is required, while zero-shot
prompting is more effective when the implicit knowledge of the LLM should be exploited [28].
Nonetheless, further research on few-shot prompting for disjointness classification is needed,
as the lower performance may also be attributable to the number and nature of the
examples we provide in the prompt. We manually select examples that are likely to provide
meaningful disjointness instances. However, a more sophisticated approach could be employed,
such as exploiting Retrieval Augmented Generation (RAG) techniques to provide examples
that are more likely to be relevant for the classes used as input. Different heuristics can be
used to measure the relevance of other classes, such as word embeddings or knowledge graph
embeddings. Interestingly, framing the problem as a negative QA task (i.e. asking whether
an individual of a class can also be an instance of another class) consistently outperforms the
positive QA prompt. This could be because the negative formulation is
more consistent with natural language questions: LLMs can exploit their training,
which generally includes a fine-tuning phase on QA tasks akin to our negative
prompt. On average, Gemma 2 performs better than the other LLMs. However, depending on
the requirements, other LLMs might be better suited. For instance, Mistral 0.3 is better aligned
with human judgment, since it has a higher recall on disjointness axioms.<fn><label>10</label><p>https://llama.meta.com/</p></fn><fn><label>11</label><p>Due to their closed-source nature and high costs, we reserve the exploration of GPT-3.5 and GPT-4 for future work.</p></fn></p>
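        <p>To make the difference between the two QA framings concrete, the sketch below maps a “yes”/“no” answer to a disjointness decision and computes precision and recall for the disjoint label. It assumes that the positive question asks whether two classes are disjoint and that the negative question asks whether an individual of one class can also be an instance of the other, so that a “no” to the negative question means disjoint. The function names are hypothetical, and this is not necessarily the evaluation code behind Table 4.</p>
        <preformat>
```python
def is_disjoint(answer, qa_strategy):
    """Map a yes/no answer to a disjointness decision.

    Assumption: "positive" asks whether the classes are disjoint, while
    "negative" asks whether an individual of one class can also be an
    instance of the other, so "no" to the negative question means disjoint.
    """
    said_yes = answer.strip().lower().startswith("yes")
    return said_yes if qa_strategy == "positive" else not said_yes


def precision_recall(predicted, gold):
    """Precision and recall of the 'disjoint' label over two aligned lists."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if g and not p)
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    return precision, recall
```
        </preformat>
        <p>Under this reading, a model with high recall on the disjoint label misses few of the pairs that annotators consider disjoint, which is the sense in which recall is discussed above.</p>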
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Disjointness on DBpedia</title>
        <p>Given the results of Table 4, we consider Gemma 2 with the task description prompt and the negative
QA strategy to be the most effective way of producing disjointness axioms among the methods
tested. We execute Algorithm 2 on the whole DBpedia ontology, relying on a straightforward
random selection for the pair (D1, D2) (line 3). In total, the algorithm takes 21589.75 s ≈ 6 h to
execute. Note that, given the random selection, we cannot exploit parallelism and instead query
the LLM one prompt at a time. A selection strategy that enables parallel selection would
greatly reduce the running time of the algorithm. In total, we find 510,600 disjointness axioms,
which results in ≈ 98% of the classes participating in at least one disjointness axiom. The
number of axioms can be greatly reduced by relying on the “pruning” operation of Algorithm 2
(line 15). In the case of the DBpedia ontology, the number of resulting axioms is 170,122, a
reduction of ≈ 66%.</p>
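        <p>One way such a pruning step can work is sketched below. It relies on the fact that disjointness is inherited by subclasses: if two classes are disjoint, so are all of their subclasses, and an axiom between two subclasses is then redundant. Whether Algorithm 2 prunes in exactly this way is an assumption of the sketch. The function names, the single-parent acyclic hierarchy, and the class names in the usage note are likewise hypothetical.</p>
        <preformat>
```python
def ancestors_or_self(cls, parent_of):
    """Return cls followed by all of its superclasses.

    Assumes an acyclic hierarchy with at most one direct superclass per class.
    """
    chain = [cls]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return chain


def prune(axioms, parent_of):
    """Drop every disjointness axiom implied by one between more general classes.

    axioms: iterable of (class_a, class_b) pairs with class_a != class_b.
    parent_of: dict mapping a class to its direct superclass.
    """
    stated = {frozenset(pair) for pair in axioms}
    kept = set()
    for pair in stated:
        a, b = tuple(pair)
        implied = any(
            frozenset((sup_a, sup_b)) in stated
            for sup_a in ancestors_or_self(a, parent_of)
            for sup_b in ancestors_or_self(b, parent_of)
            if (sup_a, sup_b) != (a, b)
        )
        if not implied:
            kept.add(pair)
    return kept
```
        </preformat>
        <p>For example, with the hypothetical hierarchy Actor ⊑ Person ⊑ Agent and City ⊑ Place, an axiom stating that Agent and Place are disjoint makes the axioms for (Person, Place) and (Actor, City) redundant, so only the most general axiom is kept.</p>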
        <p>For illustration and discussion purposes, Table 5 shows a non-representative selection of
particularly discussion-worthy positive and negative disjointness statements retrieved via
Algorithm 2. We observe that for some class relationships, including both common-sense
and domain-specific classes, our approach resulted in a “conservative” misclassification,
meaning that the LLM classified the classes as non-disjoint although they are
actually disjoint. Examples include dbo:VideogamesLeague and dbo:Website,
dbo:GeneLocation and dbo:HumanGene, and dbo:Identifier and dbo:District. Conversely, we
also observed “aggressive” misclassification, where our approach classified classes as disjoint
although they are actually non-disjoint. Straightforward examples include dbo:PlayboyPlaymate
and dbo:Camera or dbo:WikimediaTemplate and dbo:WomensTennisAssociationTournament,
with a more complicated example being dbo:Mosque and dbo:Museum. The latter is
disproven by a counter-example, the famous Hagia Sophia mosque in Turkey<sup>12</sup>. To address these
misclassifications, we suspect that providing more contextual information in the
prompt may improve classification accuracy, especially for domain-specific scenarios.
Future work could also assess how prompt design can encourage
more “aggressive” or “conservative” disjointness classifications in scenarios where relationships
are more uncertain.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion and Future Work</title>
      <p>This work shows that LLMs can roughly identify and assert disjointness axioms in ontologies,
with a degree of reliability that depends on the model. By harnessing their inherent
background knowledge and employing strategic prompt engineering, we showed that these
models can classify ontological disjointness with minimal human intervention. This capability
simplifies ontology management and supports more robust reasoning in knowledge graphs.
Our findings underscore the potential of LLMs as valuable tools for the automated enrichment
of ontologies, which encourages future exploration and innovation in this domain.</p>
      <p>Future work includes testing the approach proposed in Section 4 on other ontologies, to
assess its effectiveness on different types of ontologies, including domain-specific ontologies.
Additionally, comprehensive validation by human domain experts would be required to obtain
conclusive insights into the degree of reliability of the axioms asserted by the LLM.</p>
      <p>Moreover, using different LLMs with different numbers of parameters, and improving and
expanding our strategies for testing disjointness, constitute interesting future work.</p>
      <p>
        It could be worthwhile to look into heuristics for – given a large list of disjointness candidate
pairs – picking those entries that are particularly “promising”. One option would be to follow
the strong disjointness assumption [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and pick “sibling classes”, that is, classes that
have a common direct superclass. Furthermore, it could be interesting to test class pairs with
just one or two examples of non-disjointness, as these instances may be errors to remove from
the KG. On another note, one could develop strategies for gauging the reliability of an LLM
response by rephrasing the question asked. This may involve adding a description of the classes to
the prompt to see whether it improves the answers, relying on proper ontology serialization techniques
[29]. Finally, advanced prompting techniques such as chain-of-thought, alongside RAG techniques
to pick the few-shot examples, may improve the results. Similarly, a richer prompt
including qualifying phrases such as “at the same time” (to check the temporality of the
disjointness) or “theoretically” (to force abstraction) might steer the model toward a more
effective framing of the problem.
      </p>
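      <p>The sibling-class heuristic mentioned above can be sketched as follows. The sketch assumes the class hierarchy is available as a mapping from each class to its direct superclass; the function name <monospace>sibling_pairs</monospace> and the class names in the usage note are hypothetical. It only enumerates candidate pairs, which would still have to be checked, for instance by querying the LLM.</p>
      <preformat>
```python
from collections import defaultdict
from itertools import combinations


def sibling_pairs(parent_of):
    """Yield unordered pairs of classes that share a direct superclass.

    parent_of: dict mapping a class name to the name of its direct superclass.
    """
    children = defaultdict(list)
    for child, parent in parent_of.items():
        children[parent].append(child)
    for siblings in children.values():
        # Sorting makes the pair order deterministic.
        for a, b in combinations(sorted(siblings), 2):
            yield a, b
```
      </preformat>
      <p>For a hypothetical hierarchy in which Planet, Star and Galaxy are direct subclasses of CelestialBody, the candidates are the three pairs among those classes, while a class with no sibling contributes no pair.</p>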
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>Elias Crum acknowledges funding provided by VITO NV (UG_PhD_2303_contract). Antonio
De Santis’s doctoral scholarship is funded by the Italian Ministry of University and Research
(MUR) under the National Recovery and Resilience Plan (NRRP), by Thales Alenia Space, and
by the European Union (EU) under the NextGenerationEU project. Alessia Pisu acknowledges
MUR and EU-FSE for financial support of the PON Research and Innovation 2014-2020 (D.M.
1061/2021). Nicolas Lazzari has received funding from the FAIR – Future Artificial Intelligence
Research Foundation as part of the grant agreement MUR n. 341. Sebastian Rudolph is funded
by the Bundesministerium für Bildung und Forschung (BMBF, Federal Ministry of Education
and Research) and DAAD (German Academic Exchange Service) in project 57616814 (SECAI,
School of Embedded and Composite AI).</p>
      <p>Qwen2 Technical Report, arXiv preprint arXiv:2407.10671 (2024).</p>
      <p>[28] L. Reynolds, K. McDonell, Prompt Programming for Large Language Models: Beyond the
Few-shot Paradigm, in: Y. Kitamura, A. Quigley, K. Isbister, T. Igarashi (Eds.), CHI ’21: CHI
Conference on Human Factors in Computing Systems, Virtual Event / Yokohama Japan,
May 8-13, 2021, Extended Abstracts, ACM, 2021, pp. 314:1–314:7. URL: https://doi.org/10.1145/3411763.3451760. doi:10.1145/3411763.3451760.</p>
      <p>[29] C. Ringwald, F. Gandon, C. Faron, F. Michel, H. Abi Akl, 12 shades of RDF: Impact of
Syntaxes on Data Extraction with Language Models, in: ESWC 2024 Extended Semantic
Web Conference, May 2024, Hersonissos, Greece, 2024.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Gauging Ontologies and Schemas by Numbers</article-title>
          , in: D. Vrandecic, M. C. Suárez-Figueroa, A. Gangemi, Y. Sure (Eds.),
          <source>Proceedings of 4th International EON Workshop 2006 Evaluation of Ontologies for the Web Co-located with the WWW2006</source>
          , Edinburgh, UK, May 22, 2006, volume
          <volume>179</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2006</year>
          . URL: https://ceur-ws.org/Vol-179/eon2006wang.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Völker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vrandecic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sure</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hotho</surname>
          </string-name>
          , Learning Disjointness, in: E. Franconi,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kifer</surname>
          </string-name>
          , W. May (Eds.),
          <source>The Semantic Web: Research and Applications, 4th European Semantic Web Conference, ESWC</source>
          <year>2007</year>
          , Innsbruck, Austria, June 3-7,
          <year>2007</year>
          , Proceedings, volume
          <volume>4519</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2007</year>
          , pp.
          <fpage>175</fpage>
          -
          <lpage>189</lpage>
          . URL: https://doi.org/10.1007/978-3-540-72667-8_14. doi:10.1007/978-3-540-72667-8_14.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Völker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fleischhacker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Stuckenschmidt</surname>
          </string-name>
          ,
          <article-title>Automatic acquisition of class disjointness</article-title>
          ,
          <source>J. Web Semant</source>
          .
          <volume>35</volume>
          (
          <year>2015</year>
          )
          <fpage>124</fpage>
          -
          <lpage>139</lpage>
          . URL: https://doi.org/10.1016/j.websem.2015.07.001. doi:10.1016/J.WEBSEM.2015.07.001.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Rizzo</surname>
          </string-name>
          , C. d'Amato,
          <string-name>
            <given-names>N.</given-names>
            <surname>Fanizzi</surname>
          </string-name>
          ,
          <article-title>An unsupervised approach to disjointness learning based on terminological cluster trees</article-title>
          ,
          <source>Semantic Web</source>
          <volume>12</volume>
          (
          <year>2021</year>
          )
          <fpage>423</fpage>
          -
          <lpage>447</lpage>
          . URL: https://doi.org/10.3233/SW-200391. doi:10.3233/SW-200391.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schlobach</surname>
          </string-name>
          ,
          <article-title>Debugging and Semantic Clarification by Pinpointing</article-title>
          , in: A. Gómez-Pérez, J. Euzenat (Eds.),
          <source>The Semantic Web: Research and Applications</source>
          , Second European Semantic Web Conference, ESWC
          <year>2005</year>
          , Heraklion, Crete, Greece, May 29 - June 1,
          <year>2005</year>
          , Proceedings, volume
          <volume>3532</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2005</year>
          , pp.
          <fpage>226</fpage>
          -
          <lpage>240</lpage>
          . URL: https://doi.org/10.1007/11431053_16. doi:10.1007/11431053_16.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Cornet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abu-Hanna</surname>
          </string-name>
          ,
          <article-title>Usability of expressive description logics-a case study in UMLS</article-title>
          ,
          in:
          <source>AMIA 2002, American Medical Informatics Association Annual Symposium</source>
          , San Antonio, TX, USA, November 9-13, 2002, AMIA,
          <year>2002</year>
          . URL: https://knowledge.amia.org/amia-55142-a2002a-1.610020/t-001-1.612667/f-001-1.612668/a-036-1.613143/a-037-1.613140.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <article-title>DL-Learner: Learning Concepts in Description Logics</article-title>
          ,
          <source>J. Mach. Learn. Res.</source>
          <volume>10</volume>
          (
          <year>2009</year>
          )
          <fpage>2639</fpage>
          -
          <lpage>2642</lpage>
          . URL: https://dl.acm.org/doi/10.5555/1577069.1755874. doi:10.5555/1577069.1755874.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B. P.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Stork</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Groth</surname>
          </string-name>
          ,
          <article-title>Knowledge Engineering Using Large Language Models</article-title>
          ,
          <source>Transactions on Graph Data and Knowledge</source>
          <volume>1</volume>
          (
          <year>2023</year>
          )
          <fpage>3:1</fpage>
          -
          <lpage>3:19</lpage>
          . URL: https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.3. doi:10.4230/TGDK.1.1.3.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>Unifying large language models and knowledge graphs: A roadmap</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>36</volume>
          (
          <year>2024</year>
          )
          <fpage>3580</fpage>
          -
          <lpage>3599</lpage>
          . doi:10.1109/TKDE.2024.3352100.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>A. D. Santis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Balduini</surname>
            ,
            <given-names>F. D.</given-names>
          </string-name>
          <string-name>
            <surname>Santis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Proia</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Leo</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Brambilla</surname>
            ,
            <given-names>E. D.</given-names>
          </string-name>
          <string-name>
            <surname>Valle</surname>
          </string-name>
          ,
          <article-title>Integrating large language models and knowledge graphs for extraction and validation of textual test data</article-title>
          ,
          <year>2024</year>
          . URL: https://arxiv.org/abs/2408.01700. arXiv:2408.01700.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rocktäschel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Riedel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bakhtin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <article-title>Language models as knowledge bases?</article-title>
          , in: K. Inui,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          X. Wan (Eds.),
          <source>Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>
          , Association for Computational Linguistics, Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>2463</fpage>
          -
          <lpage>2473</lpage>
          . URL: https://aclanthology.org/D19-1250. doi:10.18653/v1/D19-1250.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Polosukhin</surname>
          </string-name>
          , Attention is All you Need, in: I. Guyon, U. von Luxburg, S. Bengio,
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fergus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. V. N.</given-names>
            <surname>Vishwanathan</surname>
          </string-name>
          , R. Garnett (Eds.),
          <source>Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9</source>
          ,
          <year>2017</year>
          , Long Beach, CA, USA,
          <year>2017</year>
          , pp.
          <fpage>5998</fpage>
          -
          <lpage>6008</lpage>
          . URL: https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          J. Devlin, M.-W. Chang, K. Lee, K. Toutanova,
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          ,
          <source>in: Proceedings of NAACL-HLT</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>T. B. Brown</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Mann</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Ryder</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Subbiah</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Kaplan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Dhariwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Neelakantan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Shyam</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Sastry</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Askell</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Herbert-Voss</surname>
            , G. Krueger,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Henighan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Child</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Ramesh</surname>
            ,
            <given-names>D. M.</given-names>
          </string-name>
          <string-name>
            <surname>Ziegler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Winter</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Hesse</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Chen</surname>
            , E. Sigler,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Litwin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Chess</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Berner</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>McCandlish</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Radford</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Amodei</surname>
          </string-name>
          ,
          <article-title>Language Models are Few-shot Learners</article-title>
          , in: H.
          <string-name>
            <surname>Larochelle</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Ranzato</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Hadsell</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Balcan</surname>
          </string-name>
          , H. Lin (Eds.),
          <source>Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020</source>
          , December 6-12, 2020, virtual,
          <year>2020</year>
          . URL: https://proceedings.neurips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          OpenAI,
          <article-title>GPT-4 Technical Report</article-title>
          , arXiv preprint arXiv:2303.08774 (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Touvron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lavril</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Izacard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Martinet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lachaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lacroix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Rozière</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Hambro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Azhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rodriguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          ,
          <article-title>LLaMA: Open and Efficient Foundation Language Models</article-title>
          ,
          <source>CoRR abs/2302.13971</source>
          (
          <year>2023</year>
          ). URL: https://doi.org/10.48550/arXiv.2302.13971. doi:10.48550/ARXIV.2302.13971. arXiv:2302.13971.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>H.</given-names>
            <surname>Touvron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Stone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Albert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Almahairi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Babaei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bashlykov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Batra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhargava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhosale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bikel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Blecher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Canton-Ferrer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Cucurull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Esiobu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Fuller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Goswami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hartshorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hosseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Inan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kardas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kerkez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Khabsa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kloumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Korenev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Koura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lachaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lavril</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Liskovich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Martinet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mihaylov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mishra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Molybog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poulton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Reizenstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rungta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Saladi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schelten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Subramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. E.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Taylor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. X.</given-names>
            <surname>Kuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Zarov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kambadur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Narang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rodriguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Stojnic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Edunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Scialom</surname>
          </string-name>
          ,
          <article-title>Llama 2: Open Foundation and Fine-Tuned Chat Models</article-title>
          ,
          <source>CoRR abs/2307.09288</source>
          (
          <year>2023</year>
          ). URL: https://doi.org/10.48550/arXiv.2307.09288. doi:10.48550/ARXIV.2307.09288. arXiv:2307.09288.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Reid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Savinov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Teplyashin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lepikhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. P.</given-names>
            <surname>Lillicrap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Alayrac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Soricut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lazaridou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Firat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schrittwieser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Antonoglou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Anil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Borgeaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Millican</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Dyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Glaese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Sottiaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Viola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Reynolds</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Molloy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Isard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hennigan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>McIlroy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schalkwyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Collins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Rutherford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Moreira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ayoub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Goel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Meyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Thornton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Michalewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Abbas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Schucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ives</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Keeling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lenc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Haykal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shakeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shyam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chowdhery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ring</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Spencer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sezener</surname>
          </string-name>
          , et al.,
          <article-title>Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context</article-title>
          ,
          <source>CoRR abs/2403.05530</source>
          (
          <year>2024</year>
          ). URL: https://doi.org/10.48550/arXiv.2403.05530. doi:10.48550/ARXIV.2403.05530. arXiv:2403.05530.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A. Q.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sablayrolles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mensch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Savary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bamford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Chaplot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>de Las Casas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. B.</given-names>
            <surname>Hanna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bressand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lengyel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Bour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Lavaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Saulnier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lachaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Stock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Subramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Antoniak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. L.</given-names>
            <surname>Scao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gervet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lavril</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lacroix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. E.</given-names>
            <surname>Sayed</surname>
          </string-name>
          ,
          <article-title>Mixtral of Experts</article-title>
          ,
          <source>CoRR abs/2401.04088</source>
          (
          <year>2024</year>
          ). URL: https://doi.org/10.48550/arXiv.2401.04088. doi:10.48550/ARXIV.2401.04088. arXiv:2401.04088.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Luan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <article-title>Language Models are Unsupervised Multitask Learners</article-title>
          ,
          <source>Technical report, OpenAI</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>B.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Langrené</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Unleashing the potential of prompt engineering: a comprehensive review</article-title>
          ,
          <year>2024</year>
          . arXiv:2310.14735.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schuurmans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bosma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ichter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. H.</given-names>
            <surname>Chi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. V.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Chain-of-thought Prompting Elicits Reasoning in Large Language Models</article-title>
          , in: S. Koyejo,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Belgrave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oh</surname>
          </string-name>
          (Eds.),
          <source>Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022</source>
          ,
          <year>2022</year>
          . URL: http://papers.nips.cc/paper_files/paper/2022/hash/9d5609613524ecf4f15af0f7b31abca4-Abstract-Conference.html.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kojima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Reid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Matsuo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Iwasawa</surname>
          </string-name>
          ,
          <article-title>Large Language Models are Zero-Shot Reasoners</article-title>
          , in: S. Koyejo,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Belgrave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oh</surname>
          </string-name>
          (Eds.),
          <source>Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022</source>
          ,
          <year>2022</year>
          . URL: http://papers.nips.cc/paper_files/paper/2022/hash/8bb0d291acd4acf06ef112099c16f326-Abstract-Conference.html.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Isele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jakob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jentzsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kontokostas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. N.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hellmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Morsey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>van Kleef</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <article-title>DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia</article-title>
          ,
          <source>Semantic Web</source>
          <volume>6</volume>
          (
          <year>2015</year>
          )
          <fpage>167</fpage>
          -
          <lpage>195</lpage>
          . URL: https://doi.org/10.3233/SW-140134. doi:10.3233/SW-140134.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>A. Q.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sablayrolles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mensch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bamford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Chaplot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>de Las Casas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bressand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lengyel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Saulnier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Lavaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lachaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Stock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. L.</given-names>
            <surname>Scao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lavril</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lacroix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. E.</given-names>
            <surname>Sayed</surname>
          </string-name>
          ,
          <article-title>Mistral 7B</article-title>
          ,
          <source>CoRR</source>
          abs/2310.06825 (
          <year>2023</year>
          ). URL: https://doi.org/10.48550/arXiv.2310.06825. doi:10.48550/ARXIV.2310.06825. arXiv:2310.06825.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mesnard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hardin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dadashi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhupatiraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pathak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sifre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rivière</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Kale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Love</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tafti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hussenot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chowdhery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Botev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Castro-Ros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Slone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Héliou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tacchetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bulanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Paterson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Tsai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Shahriari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L.</given-names>
            <surname>Lan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Choquette-Choo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Crepy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ippolito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Reid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Buchatskaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ni</surname>
          </string-name>
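          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Noland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Tucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Muraru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Rozhdestvenskiy</surname>
          </string-name>
          ,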
          <string-name>
            <given-names>H.</given-names>
            <surname>Michalewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Tenney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Grishchenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Austin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Keeling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Labanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lespiau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stanway</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Brennan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ferret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chiu</surname>
          </string-name>
          , et al.,
          <article-title>Gemma: Open Models Based on Gemini Research and Technology</article-title>
          ,
          <source>CoRR</source>
          abs/2403.08295 (
          <year>2024</year>
          ). URL: https://doi.org/10.48550/arXiv.2403.08295. doi:10.48550/ARXIV.2403.08295. arXiv:2403.08295.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>A.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Huang</surname>
          </string-name>
          , et al.,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>