<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Retrieval-Augmented Generation for Query Target Type Identification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Darío Garigliotti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Bergen</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <abstract>
<p>The paradigm shift unleashed by Entity-Oriented Search still characterizes a vast space of the dynamics with which users engage in digital information access, from Web search to e-commerce and social networks. Progress in research on Entity Retrieval tasks has in particular shown the convenience of incorporating type-based information about entities into methods that provide relevant answers to queries. As types are typically accessible in an ontology of reference within the knowledge base where their assigned entities live, automatically identifying query target types is a relevant problem to tackle. In this work, we propose to address the task of Query Target Type Identification by assessing the capabilities of Large Language Models, which have recently shown widespread success. Our experimentation with methods based on Retrieval-Augmented Generation over a purposely built test collection from the literature challenges a well-established closed LLM by presenting it with entity type information from a core hub resource of Linked Open Data.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The emergence and consolidation of Entity-Oriented Search (EOS) represent a milestone in the evolution
of web search paradigms [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The space of services provided by search engines was subtly yet decisively
upgraded by offering more direct responses to the user beyond the traditional “blue links.” Indeed, the
cognitive workload of finding the actual desired information within relevant documents is alleviated in
several scenarios where Search Engine Response Pages (SERPs) provide focused replies, among others,
via widgets (like the ones about weather information), entity cards (or infoboxes summarizing factoids,
for example, about an artist or a city) and direct answers. In a virtuous cycle, users demand it more
often and in more scenarios, by issuing more entity-centric queries, hence the paradigm is reinforced
and is nowadays as expected as it is ubiquitous. Interwoven with these industrial developments, the focus
on entities in research was largely established around studying a paradigmatic type of entity, people, in
tasks such as expert finding [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. It then broadened to all kinds of entities alongside the increasing efforts
in Semantic Web to build knowledge bases (KBs) for more and more domains of knowledge, where
uniquely identified entities are first-class citizens described by properties and related to other entities.
Addressing Entity Retrieval (that is, the problem of ranking relevant entities for an entity-centric query)
has enabled a considerable body of research where methods often incorporate information relevant to
entities that is stored in KBs [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. Other than properties and relationships as mentioned, a characteristic
knowledge item for an entity is its semantic classes or types, typically assigned from an ontology or type
taxonomy existing in correspondence to the KB [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. As has been studied, Entity Retrieval (ER) benefits
from integrating type-based information about the expected entities for the query into the ER
approach [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Hence, a related significant problem in Entity-Oriented Search is the one of automatically
identifying query target types, i.e. the types of all the entities expected to be relevant for answering the
query [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. We refer to this problem as Target Type Identification (TTI), and it is the task that we address
in this paper.
      </p>
      <p>
        To the best of our knowledge, the state-of-the-art TTI approach over the latest developed benchmark
is a Learning-to-Rank (LtR) method, where a combination of several kinds of features is manually
selected and whose respective importance is then learnt in a supervised manner [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. These features complement
each other, as this particular LtR instance brings together query attributes, type attributes, and the
scoring from multiple rankings using both traditional lexical matching and Language Models to capture
similarity based on distributional semantics in latent space. An attempt to update its performance results
into a new state of the art for TTI could exploit modern Large Language Models (LLMs) to replace the
more primitive LMs previously used, or even further, possibly replacing many or all previous features
with a feature set only made of LLMs. However, although this may indeed improve and, in a way, absorb
several of the non-(L)LM features into few modeled by representation –or deep– learning (DL), the
feature engineering of the underlying LtR method remains. An alternative consists in re-approaching
the TTI task by replacing the feature design altogether with a mechanism that directly uses a single,
powerful LLM while integrating the external knowledge that is expected not to be well represented
in, or not captured at all by, the billions of an LLM’s parameters. The object of experimentation in our
work is this alternative, by means of a Retrieval-augmented Generation (RAG) framework [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] that
by design integrates explicit knowledge –here, types in the ontology of a widely known KB– with the
parametric knowledge that has shown successful performance in a variety of tasks [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13">10, 11, 12, 13</xref>
        ]. In a
series of experimental configurations for RAG-based methods, we assess the abilities exhibited by an
established, commercial, closed LLM in identifying target types. Throughout these analyses, we explore
how a task such as TTI can benefit from integrating structured entries from a key hub resource in
Linked Open Data into the vast space of information technology around LLMs.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Query Target Type Identification</title>
      <p>
        Given an entity-centric or entity-oriented query q —that is, a query to be answered with relevant
entities— and a type taxonomy T, Hierarchical Target Type Identification (TTI) (or Query TTI) [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ] is
the problem of returning a ranking that lists all the main target types of the query, that is, types such that (i)
they are the most specific entity classes that are relevant to q, and (ii) no two of them lie on the same
path from the root in the taxonomy tree T.
      </p>
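      <p>The two conditions in this definition can be sketched programmatically. The following minimal Python sketch keeps only the candidate types that are not ancestors of another candidate, so that no two returned types lie on the same root-to-leaf path; the toy parent map and type names are illustrative, not the actual DBpedia ontology:</p>

```python
def ancestors(t, parent):
    """All ancestors of type t under a child -> parent map."""
    out = set()
    while t in parent:
        t = parent[t]
        out.add(t)
    return out

def main_target_types(candidates, parent):
    """Drop any candidate type that is an ancestor of another candidate,
    keeping only the most specific types on each taxonomy branch."""
    cands = set(candidates)
    covered = set()
    for t in cands:
        covered |= ancestors(t, parent)
    return cands - covered

# Toy taxonomy: dbo:Agent -> dbo:Person -> dbo:Artist
parent = {"dbo:Person": "dbo:Agent", "dbo:Artist": "dbo:Person"}
print(main_target_types({"dbo:Agent", "dbo:Artist"}, parent))
# -> {'dbo:Artist'}
```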
      <p>Since this definition imposes a number of criteria on the way a TTI method is to be assessed, we
assume a slightly different set of criteria during evaluation. First, we evaluate a TTI output with
set-based metrics, i.e. not rank-based, as we do not request the LLM to provide a ranking of the correct
target types for a query. Also, although the LLM is given this definition during augmentation
—i.e. the requirement of highest specificity in non-overlapping taxonomy branches—, we
do not allow for leniency when evaluating this hierarchical aspect; the criterion is binary, and hence
we do not give any discounted score to wrongly predicted types that are taxonomically close
to the correct one(s).</p>
      <sec id="sec-2-1">
        <title>2.1. Retrieval-Augmented Generation</title>
        <p>
          We consider each of our parameter configurations as a TTI method based on Retrieval-augmented
Generation (RAG) [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. Following established literature practices [
          <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
          ], we instantiate the common
naive RAG framework with its three distinctive stages, in a similar fashion as done previously for other
tasks [
          <xref ref-type="bibr" rid="ref13 ref16">13, 16</xref>
          ].
        </p>
        <p>
          The first component performs retrieval to obtain a ranked list of target types for each query. This
crucial step selects the items of external information that are then provided explicitly
during the following stage, augmentation. As in this work we are interested in assessing the abilities
of LLMs to integrate this explicit knowledge with their parametric sources, we assume two possible
(near-)optimal retrievers. First, an oracle retrieval assumes the ground-truth ranked target types from
the test collection (cf. Section 2.3) as a perfectly informed type ranking. Second, an extension of this
oracle, that we refer to as pseudo-oracle, is obtained for each query by adding, after the oracle-given
ranked types, all new types coming from the union of the type sets in the ontology for all the top-1,000
entities retrieved with BM25, a solid first-pass retriever [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. As we describe below in Section 2.3, the
relevance score for each query-type pair in the collection allows us to use the ground-truth target types
for each query as an oracle ranking in our experimentation for the retrieval stage. This order is also the
by-ranking order of the types provided in the augmentation phase, and the order of the opening sub-sequence
(corresponding to this oracle ranking) of the pseudo-oracle retrieval.
        </p>
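        <p>The construction of the pseudo-oracle ranking can be sketched as follows; the function and variable names are our own illustrative assumptions:</p>

```python
def pseudo_oracle(oracle_types, bm25_entities, entity_types):
    """Extend the oracle ranking with every new type of the top BM25-retrieved
    entities (the union of their ontology type sets), preserving the oracle
    order first and the entity rank order afterwards, without duplicates."""
    seen = set(oracle_types)
    ranking = list(oracle_types)
    for entity in bm25_entities:          # top-1,000 entities, in BM25 rank order
        for t in entity_types.get(entity, ()):
            if t not in seen:
                seen.add(t)
                ranking.append(t)
    return ranking
```

For instance, a query with oracle types [dbo:Artist] whose top retrieved entity carries types dbo:Artist and dbo:Person yields the pseudo-oracle ranking [dbo:Artist, dbo:Person].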
        <p>In the second RAG stage, augmentation, we engineer a prompt to provide the retrieved target types
for LLM assessment. The prompt starts with a header that describes the task at hand, and the format of
the input data and its expected output:</p>
        <p>“ You are an assistant for a question-answering task. You are provided with entity types (or classes) from
DBpedia ontology version 2015-10, each type preceded with its corresponding identifier or type ID. Then, you
are provided with a user query that has been issued to a search engine. This query is best answered by ranking
entities that are relevant to the query. Use the types and the query that you are provided with, to ANSWER the
QUESTION to the best of your ability. If you don’t know the answer, just say that you don’t know. Keep the
answer concise. Always mention one or more corresponding type IDs (which must be among the given TYPES)
in their correct format (this is, &lt;dbo:X&gt; for a type with name X). Examples are given below, each example
between the ’&lt;example&gt;’ and ’&lt;/example&gt;’ tags. After that, you are given the query with types so that you
answer the question. ”</p>
        <p>In the zero-shot prompting scenarios, the part about providing examples (from “Examples are given
below” to “After that,”) is omitted from the prompt. Otherwise, this header is followed
by one or more examples in the same format as the main item of the task. The prompt ends with
this main item, i.e. the actual set of retrieved types –provided in a particular order according to an
experimental parameter– followed by the query, the main task question for the LLM and an open field
“Answer:” to be completed with the generated answer. Each type is given by its ID in a pseudo-URI
(Uniform Resource Identifier) &lt;dbo:X&gt; for camel-cased type X, and its full type name as a valid
expression in natural language. In an augmentation-phase parameter, we experiment with also providing
also a short description of the type, if available for its URI in DBpedia 2016-04 via the comment predicate,
&lt;http://www.w3.org/2000/01/rdf-schema#comment&gt;. The task question asks the generator:
“ Which one(s), if any, of the provided entity type(s) are the main target types of the query, this is, such that
(i) they are the most specific category of entities that are relevant to the query, and (ii) they are not on the
same path from the root in the tree induced by the DBpedia 2015-10 ontology? ”</p>
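        <p>Putting the pieces together, the augmentation stage amounts to assembling a prompt string. The sketch below follows the structure described above; the exact field layout, tag placement, and helper names are assumptions for illustration, not the paper's verbatim prompt:</p>

```python
def build_prompt(header, examples, types, query, with_description=False):
    """Assemble the augmentation prompt: task header, optional few-shot
    examples, then the retrieved types and the query, ending with an open
    'Answer:' field for the generator to complete."""
    parts = [header]
    for ex in examples:                       # omitted in zero-shot scenarios
        parts.append("<example>\n" + ex + "\n</example>")
    lines = []
    for t in types:                           # types in the order given by a parameter
        line = f"<dbo:{t['name']}> {t['label']}"
        if with_description and t.get("description"):
            line += f" - {t['description']}"  # optional rdfs:comment text
        lines.append(line)
    parts.append("TYPES:\n" + "\n".join(lines))
    parts.append("QUERY: " + query)
    parts.append("Answer:")
    return "\n\n".join(parts)
```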
        <p>
          The generation phase completes the RAG pipeline. In all our experiments, we input the prompt
built during augmentation into GPT-3.5 (gpt-3.5-turbo-0125) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
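        <p>Since the generator is instructed to always mention type IDs in the &lt;dbo:X&gt; format, the predicted types can be recovered from the generated answer with a simple pattern match, discarding any ID not among the provided types. This post-processing sketch is our own illustration of such a step:</p>

```python
import re

DBO_ID = re.compile(r"<dbo:([A-Za-z]+)>")

def extract_predicted_types(answer, allowed):
    """Pull the <dbo:X> type IDs out of a generated answer, keeping only
    IDs among the types that were given in the prompt, without duplicates."""
    found = []
    for m in DBO_ID.finditer(answer):
        t = "dbo:" + m.group(1)
        if t in allowed and t not in found:
            found.append(t)
    return found

print(extract_predicted_types(
    "The main target type is <dbo:Artist>, not <dbo:Agent>.",
    {"dbo:Artist", "dbo:Person"}))
# -> ['dbo:Artist']
```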
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Research Questions</title>
        <p>The research questions that guide our experimental analysis in Section 3 are the following:
• RQ1: How does the LLM perform at assessing the target types given by an optimal or near-optimal
retriever?
• RQ2: What is the impact of the order in which these retrieved types are in the prompt?
• RQ3: Does adding a textual description to each type name help identify target types?
• RQ4: How much do the few-shot examples provided in the prompt contribute?
• RQ5: What performance differences are observed across query groups?</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Experimental Setup</title>
        <p>
          Test collection. The dataset to evaluate TTI performances is based on DBpedia-Entity, an Entity
Retrieval (ER) collection [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. DBpedia-Entity contains 485 user queries aggregated from multiple ER
benchmarks, where the relevance of entities from DBpedia 2015-10 was judged by annotators. The TTI
collection [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] builds on top of it by assigning human judgements of target types from DBpedia 2015-10
ontology for 479 of the 485 DBpedia-Entity queries (the remaining 6 failed to get annotated types).
Specifically, 7 annotators judged types pooled by first-pass lexical retrievers. Each query then gets a
set of relevant target types, where the relevance score of each type is the number of annotators who
selected it, so that the scores across all types of a query sum to 7. These scores allow us to treat the TTI
ground truth as a ranking.
        </p>
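        <p>Deriving the oracle ranking from these per-type annotator counts can then be sketched as follows; breaking score ties by type name is our own assumption, for determinism:</p>

```python
def ground_truth_ranking(judgements):
    """Turn per-type annotator counts into the oracle ranking: types sorted
    by descending relevance score, ties broken alphabetically (assumed)."""
    return [t for t, _ in sorted(judgements.items(),
                                 key=lambda kv: (-kv[1], kv[0]))]

print(ground_truth_ranking({"dbo:Work": 1, "dbo:Artist": 4, "dbo:Person": 2}))
# -> ['dbo:Artist', 'dbo:Person', 'dbo:Work']
```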
        <p>Evaluation metrics. We measure the performance of a method on each query, in terms of traditional
set-based metrics used in Information Retrieval, specifically Precision, Recall, and their harmonic mean
F-Score. We report the average performance across a query set, for a few query sets of interest: the entire
set of 479 queries in the TTI collection, and each of the four query groups in its partition (SemSearch-ES,
INEX-LD, QALD2 and ListSearch).</p>
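        <p>For concreteness, the per-query set-based metrics can be computed as follows:</p>

```python
def precision_recall_f1(predicted, relevant):
    """Set-based Precision, Recall and F-Score for a single query."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    p = true_positives / len(predicted) if predicted else 0.0
    r = true_positives / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

For example, predicting {dbo:Artist, dbo:Agent} for a query whose only relevant type is dbo:Artist yields Precision 0.5, Recall 1.0, and F-Score 2/3.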
        <p>Method configurations. This is a summary of our experimental parameters:
• (Retrieval) method: oracle or pseudo-oracle retriever;
• (Augmentation) type information –only the name, or name and description–; type order –as
given by the ranking, or random–; and few-shot learning –zero-shot or one-shot.</p>
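        <p>These parameters combine into a full grid of method configurations, which can be enumerated as follows (the value labels are our shorthand for the options listed above):</p>

```python
from itertools import product

retrievers = ["oracle", "pseudo-oracle"]
type_info = ["name", "name+description"]
type_order = ["by-ranking", "random"]
shots = ["zero-shot", "one-shot"]

# Every RAG-based TTI method corresponds to one cell of this grid.
configs = list(product(retrievers, type_info, type_order, shots))
print(len(configs))  # 2 * 2 * 2 * 2 = 16 configurations
```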
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experimental Results</title>
      <p>RQ1: Assessing near-perfect type retrieval. We first observe a very high precision in most cases,
especially for the one-shot augmentation setting. Yet, these same results show that the LLM
still fails to match the perfect retrievers, with performance hurt even further under pseudo-oracle
retrieval. Recall measurements are clearly lower.</p>
      <sec id="sec-3-1">
        <title>RQ2: Impact of order of retrieved types in prompt</title>
        <p>In the majority of cases across methods and
evaluation metrics, the scenario of types ordered by ranking is the best performing, supporting
observations in related work regarding artefacts in the LLM favouring the memorization of the
typical by-ranking order in RAG. However, for some metrics the best performing method in an entire
query group, and in particular for every metric in the measurement over all queries in the collection,
performs best with a random order of types.</p>
      </sec>
      <sec id="sec-3-2">
        <title>RQ3: Importance of type description alongside type name in prompt</title>
        <p>In most cases, we observe
that for oracle-based methods, providing the type description is slightly detrimental, arguably
introducing confusion rather than positively complementing the type name. Instead, for methods using
the pseudo-oracle retriever, type descriptions help improve performance, possibly adding a
correction to the misleading types.</p>
      </sec>
      <sec id="sec-3-3">
        <title>RQ4: Contribution of examples in few-shot learning</title>
        <p>As expected, the one-shot learning in the
prompt benefits almost all scenarios across methods and metrics, often substantially, and in several
cases achieves a performance very close to perfect.</p>
        <p>RQ5: Breakdown by query groups. INEX-LD –general keyword queries– is, overall and as expected,
the most challenging group, while ListSearch –entity list queries– and SemSearch-ES –named entity
queries–, groups that lend themselves naturally to entity types, are the best performing ones.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>In this work we have addressed Target Type Identification via LLM-powered RAG methods.</p>
      <p>In future lines of work, we aim to experiment with (i) alternative evaluation criteria that take
into account the hierarchical nature of TTI, (ii) sub-optimal retrievers to assess the more realistic
performance, and (iii) incorporating more aspects of the graph-based nature of both the type system
and the knowledge base that hosts the entities.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was funded by the Norwegian Research Council grant 329745 Machine Teaching for
Explainable AI.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <source>Entity-Oriented Search</source>
          , volume
          <volume>39</volume>
          <source>of The Information Retrieval Series</source>
          , Springer,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Azzopardi</surname>
          </string-name>
          ,
          <string-name>
            <surname>M. de Rijke</surname>
          </string-name>
          ,
          <article-title>A language modeling framework for expert finding</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>45</volume>
          (
          <year>2009</year>
          )
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bron</surname>
          </string-name>
          , M. de Rijke,
          <article-title>Query modeling for entity search based on terms, categories, and examples</article-title>
          ,
          <source>ACM Trans. Inf. Syst</source>
          .
          <volume>29</volume>
          (
          <year>2011</year>
          )
          <fpage>1</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Neumayer</surname>
          </string-name>
          ,
          <article-title>A test collection for entity search in DBpedia</article-title>
          ,
          <source>in: Proceedings of the 36th international ACM SIGIR conference on Research and development in Information Retrieval, SIGIR '13</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>737</fpage>
          -
          <lpage>740</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Garigliotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <article-title>On type-aware entity retrieval</article-title>
          ,
          <source>in: Proceedings of the 2017 ACM International Conference on Theory of Information Retrieval, ICTIR '17</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>27</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Garigliotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hasibi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <article-title>Identifying and exploiting target entity type information for ad hoc entity retrieval</article-title>
          ,
          <source>Information Retrieval Journal</source>
          <volume>22</volume>
          (
          <year>2019</year>
          )
          <fpage>285</fpage>
          -
          <lpage>323</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Neumayer</surname>
          </string-name>
          ,
          <article-title>Hierarchical target type identification for entity-oriented queries</article-title>
          ,
          <source>in: Proceedings of the 21st ACM International Conference on Information and Knowledge Management</source>
          ,
          <source>CIKM '12</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>2391</fpage>
          -
          <lpage>2394</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Garigliotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hasibi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <article-title>Target type identification for entity-bearing queries</article-title>
          ,
          <source>in: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '17</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>845</fpage>
          -
          <lpage>848</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Piktus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Karpukhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Küttler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          , W.-t. Yih,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rocktäschel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Riedel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kiela</surname>
          </string-name>
          ,
          <article-title>Retrieval-augmented generation for knowledge-intensive nlp tasks</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>33</volume>
          ,
          Curran Associates, Inc.,
          <year>2020</year>
          , pp.
          <fpage>9459</fpage>
          -
          <lpage>9474</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Luan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Sutskever</surname>
          </string-name>
          ,
          <article-title>Language models are unsupervised multitask learners</article-title>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>H.</given-names>
            <surname>Touvron</surname>
          </string-name>
          , et al.,
          <article-title>Llama 2: Open Foundation and Fine-Tuned Chat Models</article-title>
          ,
          <source>ArXiv abs/2307.09288</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Si</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>An empirical study of instruction-tuning large language models in Chinese</article-title>
          ,
          <source>in: Findings of the Association for Computational Linguistics: EMNLP 2023</source>
          , Association for Computational Linguistics, Singapore,
          <year>2023</year>
          , pp.
          <fpage>4086</fpage>
          -
          <lpage>4107</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Garigliotti</surname>
          </string-name>
          ,
          <article-title>SDG target detection in environmental reports using retrieval-augmented generation with LLMs</article-title>
          , in: Proceedings of ClimateNLP, Association for Computational Linguistics, Bangkok, Thailand,
          <year>2024</year>
          , pp.
          <fpage>241</fpage>
          -
          <lpage>250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Enabling large language models to generate text with citations</article-title>
          , in: H.
          <string-name>
            <surname>Bouamor</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          Bali (Eds.),
          <source>Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing</source>
          , Association for Computational Linguistics, Singapore,
          <year>2023</year>
          , pp.
          <fpage>6465</fpage>
          -
          <lpage>6488</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Retrieval-augmented generation for large language models: A survey</article-title>
          ,
          <year>2024</year>
          . arXiv:2312.10997.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D.</given-names>
            <surname>Garigliotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Johansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Kallestad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-E.</given-names>
            <surname>Cho</surname>
          </string-name>
          , C. Ferri,
          <article-title>EquinorQA: Large Language Models for Question Answering over proprietary data</article-title>
          ,
          <source>in: ECAI 2024 - 27th European Conference on Artificial Intelligence - Including 13th Conference on Prestigious Applications of Intelligent Systems (PAIS</source>
          <year>2024</year>
          ), IOS Press,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Robertson</surname>
          </string-name>
          ,
          <article-title>The probability ranking principle in IR</article-title>
          ,
          <source>Journal of Documentation</source>
          <volume>33</volume>
          (
          <year>1977</year>
          )
          <fpage>294</fpage>
          -
          <lpage>304</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>