<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>SEMANTiCS</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Towards a Knowledge-Graph-Driven Retrieval-Augmented Generation for Exploring and Curating Active Archives</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Danae Pla Karidi</string-name>
          <email>danae@athenarc.g</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christos Chrysanthopoulos</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ioannis Triantafyllou</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Archimedes, Athena Research Center</institution>
          ,
          <addr-line>Athens</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of West Attica</institution>
          ,
          <addr-line>Athens</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>21</volume>
      <abstract>
        <p>Active archives are dynamic collections that grow and evolve through continuous ingestion and metadata enrichment. This vision paper outlines a modular architecture that fuses semantic metadata management with knowledge-graph-driven (KG-driven) exploration and retrieval-augmented generation (RAG) for data curation. In our design, an exploration pipeline leverages domain ontologies and knowledge graphs to help users refine their queries and discover relevant information. Complementing this, a RAG-enabled curation pipeline combines retrieved archival content with generative AI, specifically large language models (LLMs), to synthesize and summarize findings into coherent narratives. We pose three research questions on retrieval quality, annotation accuracy, and usability for future evaluation. The framework is domain-agnostic and can be applied to any digital archive or library collection.</p>
      </abstract>
      <kwd-group>
        <kwd>Retrieval-Augmented Generation</kwd>
        <kwd>Knowledge Graphs</kwd>
        <kwd>Active Archives</kwd>
        <kwd>Data Exploration</kwd>
        <kwd>Automated Curation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Generative AI systems, powered by large language models (LLMs), are transforming how we interact
with complex information. Digital libraries and large digital archives now act as active archives.
Archives hold unique records with provenance, whereas libraries hold published works, but both keep
growing and need ongoing curation. Yet in digital archives, non-experts still struggle to locate context, while
archivists labour to maintain consistent semantic metadata, a discovery-curation challenge that intensifies as
collections grow. Traditional keyword interfaces reveal only fragments of the data, limiting broader context.
Research on generous interfaces [<xref ref-type="bibr" rid="ref1">1</xref>] and faceted browsing [<xref ref-type="bibr" rid="ref2">2</xref>] underscores the need for
knowledge-driven exploration tools.</p>
      <p>We define an active archive as a born-digital or digitized collection that ingests
new content continuously, requiring ongoing curation. Even hybrid archives that combine physical and
digital records can exhibit this active, evolving behavior. Maintaining large active archives therefore
requires continuous curation: enriching documents with metadata and tags to keep the knowledge
well-structured.</p>
      <p>In this vision paper, we propose a modular knowledge-based architecture with two
connected components: (a) a user-facing knowledge-guided exploration system, and (b) an
expert-facing retrieval-augmented generation (RAG) curation system. A key strength of this architecture
is the dynamic feedback loop between exploration and curation. When a particular topic or query
is searched frequently by users, the system flags that interest for archivists, who can then add or
refine metadata. Updates appear in the user-facing exploration in real time, ensuring the archive evolves
alongside engagement. Our method thus fuses discovery and curation, creating a living archive.
To our knowledge, this is the first work to couple KG-guided exploration with RAG-based curation in
a single continuous loop. Both modules share a unified knowledge infrastructure (e.g., DBpedia [<xref ref-type="bibr" rid="ref3">3</xref>]
and internal archives) for a consistent factual base.</p>
      <p>We formulate three research questions: (1) RQ1
(Exploration): Does KG-guided keyword expansion improve top-k precision for non-expert queries? (2)
RQ2 (Curation): How accurate are RAG-generated metadata suggestions after curator review? (3) RQ3
(Usability): What usability barriers emerge in a semantic-metadata interface for active archives?</p>
      <p>CEUR Workshop Proceedings (ceur-ws.org)</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>In this section, we briefly summarize relevant prior research on active data, knowledge-based interfaces,
and retrieval-augmented generation in digital archives and libraries.</p>
      <p>Active Data denotes datasets
still being generated, processed, or iteratively refined. “Active curation” locates curatorial work at
the very start of the data lifecycle, well before analysis or publication. Early technical, organisational,
and human choices determine usability and long-term value, yet coherent, integrated guidance is
scarce. Collaboration between researchers and data professionals is therefore essential at the point of
creation [<xref ref-type="bibr" rid="ref4">4</xref>]. In Computational Archival Science (CAS), the “Vanishing Box” [<xref ref-type="bibr" rid="ref5">5</xref>] signals the collapse
of orderly structure in digital records: messy, fragmented objects require generative AI, knowledge
graphs, and other novel methods to restore meaning. Active curation is both technical and interpretive,
ensuring today’s complex records remain accessible, comprehensible, and usable [<xref ref-type="bibr" rid="ref5">5</xref>]. Lifecycle models
cast curation as a chain of stages, from planning and collection to preservation, reuse, and eventual
disposal, aimed at maximising value over time [<xref ref-type="bibr" rid="ref6">6</xref>]. Empirical work reinforces this view [<xref ref-type="bibr" rid="ref7">7</xref>]. Active
curation is equally central to open science: a pragmatic, incremental stance maintains that partial
reproducibility and accessibility beat inaction, urging curators to engage as data are produced, even with
limited expertise, to enhance later discoverability and reuse [<xref ref-type="bibr" rid="ref8">8</xref>]. Consequently, digital library collections
function as active archives, growing continuously and demanding constant metadata maintenance.</p>
      <p>Knowledge-Based Interfaces for Non-Expert Users in Digital Archives and Libraries. Traditional
keyword search often frustrates non-experts (members of the public or humanities scholars unfamiliar with
archival structure), because a lone search box presumes knowledge of both collection and terminology, an
“ungenerous” design that withholds context [<xref ref-type="bibr" rid="ref1">1</xref>]. To lower this barrier, digital-heritage research promotes
exploratory, knowledge-based interfaces. Faceted search, long established, exposes structured metadata
(dates, places, topics) so users filter results incrementally, cutting zero-hit queries and teaching archive
vocabulary [<xref ref-type="bibr" rid="ref2">2</xref>]. Rich-prospect or “generous” views go further: thumbnail grids, concept maps, and other
overviews foster browsing and serendipity without assuming domain expertise, instead turning metadata
and contextual links into navigation aids [<xref ref-type="bibr" rid="ref1">1</xref>]. Increasingly, such interfaces draw on knowledge graphs
[<xref ref-type="bibr" rid="ref9">9</xref>]. For example, Khoo et al. [<xref ref-type="bibr" rid="ref10">10</xref>] visualise entities (people, places, works) and their relationships, letting users
traverse semantic links rather than isolated records. Knowledge-based UIs thus allow non-experts to
assemble broader narratives, showing how ontologies, thesauri, and graphs underpin generous,
concept-centric exploration.</p>
      <p>Retrieval-Augmented Generation for Metadata Curation in Digital Archives.
Rich metadata powers sophisticated archive interfaces but is expensive to produce [<xref ref-type="bibr" rid="ref11">11</xref>]. Recent work
exploits large language models (LLMs) to draft or complete records with minimal supervision. Song et
al. [<xref ref-type="bibr" rid="ref12">12</xref>] cast description assignment as zero-shot classification, while Huang et al. [<xref ref-type="bibr" rid="ref13">13</xref>] prompt GPT-4 to
generate titles and abstracts for archived pages. ChatGPT cataloguing pilots also report time savings,
though curator review remains essential. Retrieval-Augmented Generation (RAG) limits hallucinations
by retrieving evidence and conditioning the LLM output on it [<xref ref-type="bibr" rid="ref14">14</xref>]. Nguyen et al. [<xref ref-type="bibr" rid="ref15">15</xref>] show that a hybrid
RAG pipeline boosts answer relevance and surfaces citations in an archival search prototype. Overall,
coupling semantic retrieval with LLMs enhances both user access and back-office metadata quality. On
another front, general-purpose library platforms, like Europeana [<xref ref-type="bibr" rid="ref16">16</xref>] and the Digital Public Library
of America (DPLA) [<xref ref-type="bibr" rid="ref17">17</xref>], enrich metadata offline and rely on manual provider updates; no mechanism
links live user queries to immediate, AI-assisted curation, highlighting the lack of automated curation
solutions that enable continuous metadata injection.</p>
    </sec>
    <sec id="sec-3">
      <title>3. System Overview</title>
      <p>We propose a modular architecture in which knowledge-based exploration and RAG-based curation
share a common knowledge layer (external graphs, an internal KB, and archive metadata), ensuring
factuality, adaptability, and consistency. Because both pipelines draw on the same grounding facts, each
LLM-powered module can evolve independently while their outputs reinforce one another: exploration
reveals gaps for curators, and curator-approved updates enrich exploration in real time. This closed
loop minimizes hallucinations, builds trust, and keeps the archive current (Fig. 1).</p>
      <sec id="sec-3-1">
        <title>3.1. Knowledge-Based Exploration for Non-Expert Users</title>
        <p>For exploration, the system lets users navigate the archive in natural language, augmented by
intelligent suggestions. Instead of formulating precise queries or browsing static categories, a
non-expert user can ask a question or type a broad topic, and the system guides them by leveraging
background knowledge. The pipeline consists of the following components.</p>
        <p>User Input: The user provides a natural-language query (or a short list of
keywords), using their own informal vocabulary.</p>
        <p>NLP Parser: detects key entities and broader topics in
the user query, then maps each to a node in the knowledge graph (e.g. DBpedia) with a confidence
score.</p>
        <p>KB / Graph Retriever: receives the parser frame and retrieves relevant nodes and relations
from three sources: (i) external public KGs, (ii) the internal domain KB, and (iii) the archive’s own
metadata and full-text index. It then merges the results into a focused heterogeneous subgraph centered
on the seed entities. This subgraph bridges the gap between the non-expert user’s natural-language
query and specialized domain concepts.</p>
        <p>Keyword Expansion: recommends additional keywords
or phrases that narrow, widen, or align the search scope with the existing internal data. To do this, it
mines the generated subgraph using graph mining techniques (co-occurrence, embedding similarity,
PageRank centrality, etc.) to select up to k of the most relevant alternative nodes to recommend to the user. The
resulting keyword graph highlights central nodes and presents them so users can adopt more precise
terminology or explore related topics.</p>
        <p>Semantic Subgraph Generator: after the user selects some
of the recommended keywords, this module renders an interactive subgraph that shows the user’s
query alongside related concepts, entities, and linked archive items. New metadata produced by the
RAG pipeline, such as new tags, is incorporated instantly, while heavily travelled paths or recurring searches
are flagged for curators to expose gaps or emerging themes. This closed-loop mechanism keeps the
subgraph and the archive evolving in sync with user interests.</p>
        <p>Result Viewer: renders (i) a ranked list
of documents with snippet highlights, (ii) an interactive graph pane showing the semantic subgraph,
and (iii) dynamically generated facets (entities, time, location, topic). User interactions (node clicks,
facet toggles, feedback thumbs) emit events that are logged for both analytics and online learning. Users see not
just a list of items but also how each relates to broader concepts. All relevance scores are exposed to
support explainability.</p>
        <p>By grounding recommendations in a knowledge graph, the system keeps them
understandable and context-aware, lowering barriers for exploring complex archives. It extends the
notion of faceted search by dynamically generating AI-driven facets for each query instead of relying on
static categories alone. This context-rich interface provides multiple pathways for discovery, revealing
how information is interlinked across the archive. The continuous feedback loop, which synchronizes
user exploration with metadata updates, ensures the archive remains current, comprehensive, and
responsive to emerging interests.</p>
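        <p>As an illustration only, the Keyword Expansion step described in this section could be sketched as follows. The sketch assumes the retrieved subgraph is available as an undirected edge list and uses plain PageRank as the sole ranking signal; co-occurrence and embedding similarity are left out. The function names, the damping factor, and the example entities are hypothetical and are not part of the proposed system.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over an undirected edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    nodes = sorted(neighbours)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour shares its rank equally among its own neighbours.
            incoming = sum(rank[m] / len(neighbours[m]) for m in neighbours[n])
            new_rank[n] = (1.0 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def expand_keywords(edges, seeds, k=3):
    """Return up to k non-seed nodes with the highest PageRank score."""
    rank = pagerank(edges)
    candidates = [n for n in rank if n not in seeds]
    # Sort by descending score; break ties alphabetically for determinism.
    candidates.sort(key=lambda n: (-rank[n], n))
    return candidates[:k]


# Hypothetical subgraph centred on the seed entity "Parthenon".
subgraph = [
    ("Parthenon", "Acropolis"),
    ("Parthenon", "Athena"),
    ("Parthenon", "Doric order"),
    ("Acropolis", "Athens"),
    ("Acropolis", "Erechtheion"),
    ("Acropolis", "Athena"),
]
suggestions = expand_keywords(subgraph, seeds={"Parthenon"}, k=2)
print(suggestions)
```
]]></preformat>
        <p>In this toy graph the best-connected non-seed nodes are suggested first. A fuller version would presumably combine several signals and weight them by the parser's confidence scores, which this sketch does not attempt.</p>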
      </sec>
      <sec id="sec-3-2">
        <title>3.2. RAG-Based Approach for Active Data Curation</title>
        <p>The RAG pipeline curates new documents under human oversight: as documents arrive, it
generates metadata, summaries, and knowledge base entries for review. The
objective is to enrich and maintain the archive’s structured knowledge efficiently using AI, while a
human expert ensures quality control.</p>
        <p>New Document Input: accepts single files or batches. The
curator initiates the pipeline rather than annotating the documents manually, and the system ingests the document
(or batch) as input, ready for automated processing.</p>
        <p>Retriever (Internal &amp; External): for each
incoming document, queries two retrieval tiers: (1) semantic retrieval over the existing archive to find
similar content, and (2) KG-aware retrievers on external sources (e.g. DBpedia) to fetch definitions and
reference facts for every mentioned entity.</p>
        <p>Context Fusion: builds a composite prompt that combines the
new document’s content with supporting facts. Relevant excerpts are concatenated and organized (e.g.,
by topic) so that the LLM has grounded information, for instance by pairing an external
knowledge-base definition of each entity with the sentence where it is mentioned in the new document.</p>
        <p>LLM Generation: processes the fused context and the document’s content to generate the desired outputs.
These can include a draft summary of the document, a set of suggested metadata tags, identified entities
(with potential links to the knowledge graph), and even relational triples (for updating the graph).
Crucially, because the LLM has been given relevant context, these outputs are grounded, allowing
the model to cite or incorporate actual facts from the retrieved documents instead of hallucinating.</p>
        <p>Archivist Review: the archivist inspects the LLM’s outputs for accuracy, relevance, and proper sourcing, correcting
or confirming as needed. Because suggestions are backed by references, each piece of metadata is
traceable to its source. After approval, new or revised metadata updates the knowledge graph and
indexes. This human-in-the-loop step ensures curation quality.</p>
        <p>KB &amp; Archive Update: integrates
approved metadata and new knowledge into the archive’s internal knowledge base, updating the index,
tags, and relationships. A scheduled offline job learns from archivist edits, refining prompts and retrieval
to improve accuracy.</p>
        <p>Knowledge-based exploration and active curation can operate separately or
integrate to share insights in real time. User exploration logs (such as frequent unanswered queries
or clicked concepts) inform curators about emerging interests or gaps in the archive, prompting
targeted curation, while newly curated metadata instantly appears in the exploration interface. This
continuous loop ensures an evolving, living archive: documents are constantly updated and made
discoverable through both AI automation and expert oversight.</p>
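        <p>To make the Context Fusion step more concrete, the following minimal sketch assembles a composite prompt under simplifying assumptions: entity definitions and similar archive excerpts are assumed to have been retrieved already, entity mentions are found by case-insensitive substring match, and sentences are split with a simple regular expression. The function name, the bracketed prompt layout, and the example data are hypothetical; no particular LLM API is implied.</p>
        <preformat><![CDATA[
```python
import re


def fuse_context(document, entity_definitions, similar_excerpts):
    """Build one prompt from a new document and its retrieved evidence."""
    text = document.strip()
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text)
    lines = ["[Entity context]"]
    for entity, definition in entity_definitions.items():
        # Pair each definition with the first sentence mentioning the entity.
        mention = next(
            (s for s in sentences if entity.lower() in s.lower()), None
        )
        if mention is None:
            continue  # Skip entities that the document never mentions.
        lines.append(f"- {entity}: {definition}")
        lines.append(f"  Mentioned in: {mention}")
    lines.append("[Similar archive excerpts]")
    lines.extend(f"- {excerpt}" for excerpt in similar_excerpts)
    lines.append("[New document]")
    lines.append(text)
    lines.append("[Task]")
    lines.append("Suggest a summary and metadata tags, citing the context above.")
    return "\n".join(lines)


# Hypothetical inputs: one new document plus already-retrieved evidence.
document = "The letter was sent from Piraeus in 1923. It discusses refugee housing."
definitions = {
    "Piraeus": "Port city in the Attica region of Greece.",
    "Smyrna": "Ancient city on the Aegean coast of Anatolia.",
}
excerpts = ["Report on refugee settlements near the port, 1924."]
prompt = fuse_context(document, definitions, excerpts)
print(prompt)
```
]]></preformat>
        <p>Keeping each definition next to the sentence that mentions the entity is one way to let the archivist trace a suggested tag back to its evidence during review; other layouts (for example, grouping by topic) would serve the same purpose.</p>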
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>Answering the posed research questions will guide our future evaluation of the system using both
expert judgment and quantitative metrics. We plan to evaluate whether KG-guided keyword expansion
improves top-k precision (RQ1) by comparing search results with and without the expansion on a
set of test queries (e.g. using precision@k and recall metrics). To assess RAG-generated metadata
accuracy (RQ2), we will conduct a study in which archivists review AI-suggested tags and summaries. We will
measure the acceptance rate of suggestions and the frequency of corrections needed, thereby evaluating
accuracy after human review. We will also compare factual consistency with the source (e.g.
using metrics such as FactScore [<xref ref-type="bibr" rid="ref18">18</xref>] and the hallucination rate, defined as incorrect factual statements / total statements)
against a zero-shot variant without retrieval. Usability (RQ3) will be evaluated through user studies:
we will gather feedback from non-experts and archivists using the system. Metrics such as task success
rate, user satisfaction, and qualitative feedback on the interface will help identify any usability barriers.</p>
      <p>Declaration on Generative AI: The authors have not employed any Generative AI tools.</p>
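      <p>The quantitative measures named in this section can be written down compactly. The sketch below is only one plausible reading: precision@k divides the number of relevant items among the top k by k, the hallucination rate follows the ratio given in the text, and the acceptance rate counts suggestions accepted without correction. The function names, decision labels, and all example values are hypothetical and are not results.</p>
      <preformat><![CDATA[
```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


def hallucination_rate(incorrect_statements, total_statements):
    """Incorrect factual statements divided by total statements."""
    if total_statements == 0:
        return 0.0
    return incorrect_statements / total_statements


def acceptance_rate(decisions):
    """Share of suggestions the archivist accepted without correction."""
    if not decisions:
        return 0.0
    return sum(1 for d in decisions if d == "accepted") / len(decisions)


# Hypothetical data for a single test query and one review session.
relevant = {"d1", "d2", "d3", "d4"}
baseline = ["d1", "d7", "d3", "d9", "d4"]   # ranking without expansion
expanded = ["d1", "d3", "d4", "d7", "d2"]   # ranking with KG expansion
p_base = precision_at_k(baseline, relevant, k=3)
p_exp = precision_at_k(expanded, relevant, k=3)
h_rate = hallucination_rate(incorrect_statements=3, total_statements=20)
a_rate = acceptance_rate(["accepted", "corrected", "accepted", "rejected"])
print(p_base, p_exp, h_rate, a_rate)
```
]]></preformat>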
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>The publication of this article was funded by the University of West Attica.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Whitelaw</surname>
          </string-name>
          ,
          <article-title>Generous interfaces for digital cultural collections</article-title>
          ,
          <source>Digital Humanities Quarterly</source>
          <volume>9</volume>
          (
          <year>2015</year>
          )
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.-P.</given-names>
            <surname>Yee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Swearingen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hearst</surname>
          </string-name>
          ,
          <article-title>Faceted metadata for image search and browsing</article-title>
          ,
          <source>in: Proceedings of the SIGCHI conference on Human factors in computing systems</source>
          ,
          <year>2003</year>
          , pp.
          <fpage>401</fpage>
          -
          <lpage>408</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Isele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jakob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jentzsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kontokostas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. N.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hellmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Morsey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Van Kleef</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          , et al.,
          <article-title>DBpedia - a large-scale, multilingual knowledge base extracted from Wikipedia</article-title>
          ,
          <source>Semantic Web</source>
          <volume>6</volume>
          (
          <year>2015</year>
          )
          <fpage>167</fpage>
          -
          <lpage>195</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>I.</given-names>
            <surname>Kouper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. L.</given-names>
            <surname>Tucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Tharp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E.</given-names>
            <surname>van Booven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <article-title>Active curation of large longitudinal surveys: A case study</article-title>
          ,
          <source>Journal of eScience Librarianship</source>
          <volume>10</volume>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Proctor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Marciano</surname>
          </string-name>
          ,
          <article-title>A computational review of the literature of computational archival science (cas): Advancing archival theory in the age of the digital tsunami and the vanishing box problem</article-title>
          ,
          <source>in: 2024 IEEE International Conference on Big Data (BigData)</source>
          , IEEE,
          <year>2024</year>
          , pp.
          <fpage>2514</fpage>
          -
          <lpage>2523</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Oliver</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Harvey</surname>
          </string-name>
          ,
          <source>Digital curation</source>
          , American Library Association,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H. L.</given-names>
            <surname>Rhee</surname>
          </string-name>
          ,
          <article-title>A new lifecycle model enabling optimal digital curation</article-title>
          ,
          <source>Journal of Librarianship and Information Science</source>
          <volume>56</volume>
          (
          <year>2024</year>
          )
          <fpage>241</fpage>
          -
          <lpage>266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S. L.</given-names>
            <surname>Sawchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khair</surname>
          </string-name>
          ,
          <article-title>Computational reproducibility: A practical framework for data curators</article-title>
          ,
          <source>Journal of eScience Librarianship</source>
          <volume>10</volume>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Pla Karidi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Stavrakas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Vassiliou</surname>
          </string-name>
          ,
          <article-title>Tweet and followee personalized recommendations based on knowledge graphs</article-title>
          ,
          <source>Journal of Ambient Intelligence and Humanized Computing</source>
          <volume>9</volume>
          (
          <year>2018</year>
          )
          <fpage>2035</fpage>
          -
          <lpage>2049</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Khoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-G.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-F.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stanley-Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-N.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <article-title>Knowledge graph visualization interface for digital heritage collections</article-title>
          ,
          <source>Information Technology and Libraries</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>I.</given-names>
            <surname>Triantafyllou</surname>
          </string-name>
          ,
          <article-title>Thematic categorization on university records</article-title>
          ,
          <source>in: 2023 IEEE 11th International Conference on Systems and Control (ICSC)</source>
          , IEEE,
          <year>2023</year>
          , pp.
          <fpage>384</fpage>
          -
          <lpage>389</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>H.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bethard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Thomer</surname>
          </string-name>
          ,
          <article-title>Metadata enhancement using large language models</article-title>
          ,
          <source>in: Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024)</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>145</fpage>
          -
          <lpage>154</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. R.</given-names>
            <surname>Goh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Web archives metadata generation with GPT-4o: Challenges and insights</article-title>
          ,
          <source>arXiv preprint arXiv:2411.05409</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Piktus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Karpukhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Küttler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-t.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rocktäschel</surname>
          </string-name>
          , et al.,
          <article-title>Retrieval-augmented generation for knowledge-intensive NLP tasks</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>9459</fpage>
          -
          <lpage>9474</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>H. D.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.-H. A.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <article-title>A proposed large language model-based smart search for archive system</article-title>
          ,
          <source>in: International Symposium on Information and Communication Technology</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>210</fpage>
          -
          <lpage>223</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>B.</given-names>
            <surname>Haslhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Isaac</surname>
          </string-name>
          ,
          <article-title>data.europeana.eu - the Europeana linked open data pilot</article-title>
          ,
          <source>in: Proc. DC-2011</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bragg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tumlin</surname>
          </string-name>
          ,
          <article-title>Metadata aggregation at the Digital Public Library of America</article-title>
          ,
          <source>in: Proc. JCDL 2015</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>Min</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-t.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <article-title>FactScore: Evaluating factual consistency in retrieval-augmented generation</article-title>
          ,
          <source>in: Proc. ACL 2023</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>