<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>with LLMs⋆</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kudzai Sauka</string-name>
          <email>k.sauka@hva.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gianluigi Bardelloni</string-name>
          <email>gianluigi.bardelloni@kpn.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jigsa Bulto</string-name>
          <email>jigsa.bulto@kpn.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frederik B. I. Situmeang</string-name>
          <email>f.b.i.situmeang@hva.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Amsterdam University of Applied Sciences</institution>
          ,
          <addr-line>Fraijlemaborg 133, 1102 CV, Amsterdam</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>KPN</institution>
          ,
          <addr-line>Teleportboulevard 121, 1043 EJ, Amsterdam</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Amsterdam</institution>
          ,
          <addr-line>Plantage Muidergracht 12, 1018TV, Amsterdam</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>This paper introduces a hierarchy-aware framework, Document-Preprocessing-Extract-Resolve-Merge-Canonicalize (DERMC), for constructing knowledge graphs using large language models (LLMs). It addresses the limitations of naive LLM prompting for knowledge graph creation, which often results in redundancy, LLM cognitive overload, and inconsistencies across languages. Building on the Extract-Define-Canonicalize (EDC) paradigm, our approach integrates multilingual coreference resolution, hierarchical document parsing, and a RAG-MCP-inspired schema retriever that dynamically narrows candidate relations for each document context, thereby reducing the size of the LLM prompt. The system supports both commercial and local LLM backends, allowing flexible deployment. Preliminary results show improved alignment between flat and hierarchical parsing modes. Full-scale experiments and evaluations, including assessments by industry and knowledge representation experts, are planned to validate performance and quality in enterprise contexts.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge Graph</kwd>
        <kwd>Coreference Resolution</kwd>
        <kwd>Schema Alignment</kwd>
        <kwd>Triple Extraction</kwd>
        <kwd>Ontology</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The demand for non-hallucinating, transparent Gen-AI chatbots continues to grow as businesses aim to
address user trust and consumer acceptance [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ]. Using retrieval-augmented generation (RAG) with
knowledge graphs (KGs) in Gen-AI chatbots helps reduce hallucinations and improves transparency
[
        <xref ref-type="bibr" rid="ref2 ref4 ref5">2, 4, 5</xref>
        ]. However, creating KGs from unstructured text remains challenging, particularly in customer
service, where chatbots ground answers in knowledge articles. KG construction is typically a manual and
cumbersome process that requires domain experts and knowledge representation specialists [
        <xref ref-type="bibr" rid="ref6 ref7 ref8 ref9">6, 7, 8, 9</xref>
        ].
This manual process is time-consuming, costly, and challenging to scale for large, rapidly changing
domains [
        <xref ref-type="bibr" rid="ref10 ref7">7, 10</xref>
        ]. Current methods to speed up KG building involve using the language understanding
capabilities of large language models (LLMs) [
        <xref ref-type="bibr" rid="ref6 ref8 ref11">6, 8, 11</xref>
        ]. However, applying LLMs directly to KG creation
has drawbacks, such as generating inconsistent or duplicate triples due to the absence of an explicit,
unified schema [
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7, 8, 9</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Industry Challenges</title>
      <p>In industry use cases, the drawbacks of directly applying LLMs to auto-create KGs are worsened by the
hierarchical structure of the documents and by large ontologies. Knowledge documents are organized with
headings to facilitate navigation and present information hierarchically. Without awareness
of this hierarchy, context can be lost, and triples may miss the correct parent-child relationships.
Another challenge is that the generated KGs may be incomplete or biased toward the LLM’s training
data, which may not fully cover the target domain, especially for proprietary documents not included
in the pre-training dataset. Additionally, when domain ontologies are available, prompts for LLMs become
very large when integrating knowledge articles with entire domain ontologies, increasing costs
and slowing inference. Furthermore, our industry’s use cases, knowledge documents, and
ontologies often involve mixed languages.</p>
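      <p>To make the hierarchy problem concrete, the following sketch (with illustrative names, not the parser used in DERMC) shows how markdown headings can be parsed so that every text chunk carries its full heading ancestry, preserving the parent-child context needed for downstream triple extraction:</p>

```python
import re
from dataclasses import dataclass

@dataclass
class Chunk:
    """A text block annotated with the heading path that scopes it."""
    path: tuple  # e.g. ("Billing", "Refunds")
    text: str

def parse_markdown(doc: str):
    """Split a markdown document into chunks, each carrying its full
    heading ancestry, so a triple extracted from a subsection keeps
    its parent sections as context."""
    chunks, stack, buf = [], [], []

    def flush():
        if buf:
            chunks.append(Chunk(tuple(h for _, h in stack), "\n".join(buf).strip()))
            buf.clear()

    for line in doc.splitlines():
        m = re.match(r"^(#+)\s+(.*)", line)
        if m:
            flush()
            level = len(m.group(1))
            # close headings at the same or deeper level, then open this one
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, m.group(2).strip()))
        elif line.strip():
            buf.append(line)
    flush()
    return [c for c in chunks if c.text]
```

      <p>A flat parser would hand the extractor only the local text; here each chunk’s path makes explicit, for example, that a statement under “Refunds” belongs to “Billing”, preventing relations from leaking across section scopes.</p>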
      <fig id="fig1">
        <label>Figure 1</label>
        <caption>
          <p>The DERMC pipeline. An input CSV (whose Document column stores raw markdown for unstructured documents) flows through (1) Document Preprocessing (CoreferenceResolver and HierarchicalParser: document to structured hierarchy), (2) Extract (schema-aware prompting and LLM generation: text to raw triples), (3) Resolve (schema retrieval and context integration, ontology-guided enhancement, quality refinement), (4) Merge (similarity-based triple merging), and (5) Canonicalize (entity and relation canonicalization, final quality assurance), producing a structured knowledge graph of triples.</p>
        </caption>
      </fig>
    </sec>
    <sec id="sec-3">
      <title>3. Proposal</title>
      <p>
        To address the challenges of naive LLM prompting in knowledge graph construction, we propose
DERMC, a modular framework extending Zhang and Soh’s [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] Extract–Define–Canonicalize plus
Refinement (EDC+R) paradigm. Our system incorporates hierarchy awareness, multilingual coreference
resolution, and similarity-based triple merging with redundancy reduction (see Figure 1). Without
section awareness, relations can leak across scopes, causing misinterpretation and duplication. Our
experiment with a simulated dataset confirms the benefits of this approach; example implementation
code and instructions for generating the synthetic data are provided in the README at
https://github.com/ksauka/DERMC-.git.
      </p>
      <p>
        Unlike Zhang and Soh’s [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] six-prompt approach, our pipeline reduces LLM reliance by replacing
the original “Define” phase with schema-context-based resolution. Consequently, DERMC requires
just two LLM prompts: one for extraction (incorporating ontology relations and document
hierarchy) and one for refinement (using retrieved ontology candidates to enhance triple quality). All
other components—coreference resolution, hierarchical parsing, similarity calculation, and
canonicalization—use specialized models and algorithms, which we anticipate will significantly reduce
computational overhead and LLM costs.
      </p>
      <p>
        Following [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], our schema-driven resolution precomputes dense embeddings of schema relations and
entity types. To improve beyond label-only matching, we encode each ontology predicate with its relation
label, synonyms, glosses, class labels, predicate phrases, and sentence windows, then compute cosine
similarity. The method falls back on relation and class labels when richer data is missing, to maintain
accuracy and compactness. When scores are close, the top candidate is selected with an ambiguity flag,
balancing failure visibility and recall. These embeddings enable dynamic retrieval of only the top-k
relevant candidates per document segment, reducing token requirements, operational costs, and hallucination
risks. This retrieval step draws inspiration from the RAG-MCP architecture of [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], which narrows
candidates for efficient context-window management. Our implementation differs from the
EDC+R approach in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which used a fine-tuned E5-Mistral-7b-Inst embedding model trained with
the InfoNCE loss on the TekGen dataset; instead, we employ the off-the-shelf all-MiniLM-L6-v2 model,
which provides strong semantic understanding for short texts such as relations and entities.
      </p>
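      <p>The retrieval step can be sketched as follows. For illustration, toy bag-of-words vectors stand in for the dense all-MiniLM-L6-v2 embeddings, and the schema, function names, and margin value are illustrative; the part being shown is the retrieval logic itself: cosine scoring, top-k selection, and an ambiguity flag when the two best scores are close.</p>

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words vector; in the pipeline this would be a dense
    # embedding of the relation label plus synonyms, glosses, and class labels.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, schema: dict, k: int = 3, margin: float = 0.05):
    """Return the top-k schema relations for a document segment, with an
    ambiguity flag raised when the two best scores fall within `margin`."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(desc)), rel) for rel, desc in schema.items()),
                    reverse=True)
    top = scored[:k]
    ambiguous = len(top) > 1 and (top[0][0] - top[1][0]) < margin
    return [rel for _, rel in top], ambiguous
```

      <p>Only the returned top-k candidates are placed in the LLM prompt, which is what keeps the prompt small relative to embedding the entire ontology.</p>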
    </sec>
    <sec id="sec-4">
      <title>4. Framework</title>
      <p>The DERMC framework (Figure 1) begins with document preprocessing, where we incorporate a
multilingual coreference resolver based on FastCoref’s LingMess architecture, combined with translation
pipelines like Facebook’s M2M100, enabling smooth handling of mixed-language documents, including
those with both Dutch and English. Additionally, we developed a custom DocumentParser to
process markdown-structured documents, preserving their hierarchical structure, which is essential for
maintaining contextual relationships during downstream triple extraction.</p>
      <p>During the extraction phase, initial subject–predicate–object triples are derived from the preprocessed
text, guided by schema-aware prompts and informed by the document’s hierarchical context to enhance
accuracy. Our pipeline replaces the traditional “Define” phase of EDC with a schema-driven resolution
process, thereby reducing our reliance on LLMs within the pipeline. While EDC+R included a dedicated
“Define” phase for generating relation definitions via LLM calls, we replaced this with the Resolve step,
leveraging schema context. Following the resolution phase, our pipeline performs similarity-based
triple merging and redundancy reduction, clustering near-duplicate triples using semantic similarity
measures to produce a more precise and coherent knowledge graph. The final canonicalization step
then aligns these triples to the domain ontology, ensuring consistent representation of entities and
relations across the graph.</p>
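      <p>The merging step above can be illustrated with a minimal sketch; a simple string-similarity measure stands in for the semantic similarity used in the pipeline, and the greedy reduction and threshold value are illustrative choices, not the production algorithm:</p>

```python
from difflib import SequenceMatcher

def sim(a: str, b: str) -> float:
    # Stand-in for semantic similarity between two strings.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def triple_sim(t1, t2) -> float:
    # Average per-slot similarity over subject, predicate, and object.
    return sum(sim(x, y) for x, y in zip(t1, t2)) / 3

def merge_triples(triples, threshold=0.85):
    """Greedy near-duplicate reduction: a triple is kept only if it is not
    within `threshold` similarity of an already-kept triple."""
    kept = []
    for t in triples:
        if all(triple_sim(t, k) < threshold for k in kept):
            kept.append(t)
    return kept
```

      <p>For instance, ("KPN", "offers", "fiber internet") and ("KPN", "offer", "Fiber Internet") collapse into one triple, while genuinely distinct facts survive the reduction.</p>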
    </sec>
    <sec id="sec-5">
      <title>5. Implementation challenges and Future work</title>
      <p>Currently, we are testing the proposed approach within our pipeline. However, further team
brainstorming has raised several complex questions that require deeper investigation. While our current
relation-level retrieval and refinement process provides a solid foundation, several challenges persist:
Firstly, cosine similarity between relation strings captures only surface-level semantics and may not
resolve deeper ambiguities, particularly in domains where relations differ slightly in meaning or usage. Our
current pipeline lacks comprehensive integration of entity-type alignment and attribute-level matching,
which are crucial for accurately mapping entities into the ontology’s class hierarchy, especially when
properties or contextual roles distinguish classes.</p>
      <p>Future work should aim to effectively extract and manage the full ontology during schema retrieval,
incorporating class descriptions, hierarchical paths, and attribute-level metadata into the retrieval and
canonicalization stages to improve knowledge graph alignment. Additionally, handling n-ary relations,
nested statements, and context-dependent relation semantics remains an open challenge, particularly
in long, hierarchical enterprise documents. Finally, rigorous human evaluation involving industry
experts and knowledge representation researchers is essential to validate the quality of the resultant
triples. Addressing these issues will be central to transforming our current prototype into a robust,
industry-grade knowledge graph construction pipeline.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This publication is part of the project LESSEN with project number NWA.1389.20.183 of the research
program NWA ORC 2020/21 which is (partly) financed by the Dutch Research Council (NWO).</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT-4 and Grammarly for grammar
and spelling checks and for paraphrasing and rewording. Claude Sonnet 4 (GitHub Copilot) was used during code
development to assist the authors. After using these tools/services, the authors reviewed
and edited the content as needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Piktus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Karpukhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Küttler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-t.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rocktäschel</surname>
          </string-name>
          , et al.,
          <article-title>Retrieval-augmented generation for knowledge-intensive nlp tasks</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>9459</fpage>
          -
          <lpage>9474</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Retrieval-augmented generation for large language models: A survey</article-title>
          ,
          <source>arXiv preprint arXiv:2312.10997</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ranjan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. N.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>A comprehensive survey of retrieval-augmented generation (rag): Evolution, current landscape and future directions</article-title>
          ,
          <source>arXiv preprint arXiv:2410.12837</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Cruz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Guevara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Deshpande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Retrieval-augmented generation with knowledge graphs for customer service question answering</article-title>
          ,
          <source>in: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>2905</fpage>
          -
          <lpage>2909</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kashmira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Dantanarayana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Brodsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mahendra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Flautner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mars</surname>
          </string-name>
          ,
          <article-title>A graph-based approach for conversational ai-driven personal memory capture and retrieval in a real-world application</article-title>
          ,
          <source>arXiv preprint arXiv:2412.05447</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Soh</surname>
          </string-name>
          ,
          <article-title>Extract, define, canonicalize: An llm-based framework for knowledge graph construction</article-title>
          ,
          <source>arXiv preprint arXiv:2404.03868</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N.</given-names>
            <surname>Mihindukulasooriya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tiwari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. F.</given-names>
            <surname>Enguix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lata</surname>
          </string-name>
          ,
          <article-title>Text2kgbench: A benchmark for ontology-driven knowledge graph generation from text</article-title>
          , in: International semantic web conference, Springer,
          <year>2023</year>
          , pp.
          <fpage>247</fpage>
          -
          <lpage>265</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <article-title>Knowledge graph efficient construction: Embedding chain-of-thought into llms</article-title>
          ,
          <source>Proceedings of the VLDB Endowment</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Regino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Dos Reis</surname>
          </string-name>
          ,
          <article-title>Can llms be knowledge graph curators for validating triple insertions?</article-title>
          ,
          <source>in: Proceedings of the workshop on generative AI and knowledge graphs (GenAIK)</source>
          ,
          <year>2025</year>
          , pp.
          <fpage>87</fpage>
          -
          <lpage>99</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>X.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <article-title>Ontology-grounded automatic knowledge graph construction by llm under wikidata schema</article-title>
          ,
          <source>arXiv preprint arXiv:2412.20942</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J. Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Razniewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-C.</given-names>
            <surname>Kalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singhania</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dietze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jabeen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Omeliyanenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lissandrini</surname>
          </string-name>
          , et al.,
          <article-title>Large language models and knowledge graphs: Opportunities and challenges</article-title>
          ,
          <source>arXiv preprint arXiv:2308.06374</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          TeX - LaTeX Stack Exchange,
          <article-title>Drawing flow diagram in LaTeX using TikZ</article-title>
          , https://tex.stackexchange.com/questions/149602/drawing-flow-diagram-in-latex-using-tikz,
          <year>2014</year>
          . Accessed: 2025-09-10.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>T.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Rag-mcp: Mitigating prompt bloat in llm tool selection via retrieval-augmented generation</article-title>
          ,
          <source>arXiv preprint arXiv:2505.03275</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>