<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>GRASP: Generic Reasoning And SPARQL Generation across Knowledge Graphs - Demo System</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sebastian Walter</string-name>
          <email>swalter@cs.uni-freiburg.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hannah Bast</string-name>
          <email>bast@cs.uni-freiburg.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
      </contrib-group>
      <kwd-group>
        <kwd>Question Answering</kwd>
        <kwd>SPARQL</kwd>
        <kwd>Knowledge Graphs</kwd>
      </kwd-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Freiburg</institution>
          ,
          <addr-line>Georges-Köhler-Allee 51, 79110 Freiburg im Breisgau</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>769</volume>
      <fpage>2</fpage>
      <lpage>6</lpage>
      <abstract>
        <p>GRASP is the first approach for SPARQL-based question answering that, in principle, works for arbitrary given RDF knowledge graphs zero-shot, that is, without prior training or information on the graph. In this work, we present and describe a prototypical demo system that implements the GRASP approach. The system also supports general question answering and follow-up questions. We extend the evaluation of the associated research paper by experiments on the IMDb knowledge graph and the TEXT2SPARQL challenge.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Contributions: we present the architecture and the core components of the demo system (see Section 3).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        The SPARQL QA problem fits within the broader domain of knowledge graph question answering,
which can be divided into three categories. The first category includes methods that are fine-tuned for
a specific benchmark and knowledge graph [
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7, 8, 9</xref>
        ]. These methods often achieve strong results by
being able to adapt to patterns in the benchmark and knowledge graph. The second category includes
methods that do not require fine-tuning but instead use in-context learning with information that is
specific for the benchmark or knowledge graph, such as exemplary question-SPARQL pairs [
        <xref ref-type="bibr" rid="ref10 ref11 ref12">10, 11, 12</xref>
        ].
The third category covers methods which function without any of the above, but rather explore the
knowledge graph on the go [
        <xref ref-type="bibr" rid="ref1 ref13 ref14 ref15">1, 13, 14, 15</xref>
        ]. For a more detailed account of related work, we refer to [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>https://ad.informatik.uni-freiburg.de/staff/walter (S. Walter); https://ad.informatik.uni-freiburg.de/staff/bast (H. Bast). Published via CEUR (ceur-ws.org).</p>
      <p>[Figure 1: Overview of the client-server setup. The web application sends a query message to the GRASP server (e.g. with fields "question": "Who published the most papers …", "knowledge_graphs": ["dblp"], "task": "sparql-qa"). The GRASP agent sends back update messages, both model messages (e.g. "type": "model", "content": "To answer the question, I need to follow these steps: …") and function-call messages (e.g. "type": "function", "content": "search_entity" with a query string and a knowledge graph such as "kg": "dblp", plus a "result" such as "Top 10 entity …"). The agent executes SPARQL queries against the knowledge graph SPARQL endpoints, searches in the knowledge graph indices, and queries subgraphs. Note: building knowledge graph indices is currently done offline and uses pre-defined SPARQL queries to extract the index data from the knowledge graphs via the SPARQL endpoints. However, these queries could in theory also be found by the GRASP agent itself, and the index building could happen online if the knowledge graph is reasonably small.]</p>
      <p>
        Only one of the latter methods, SPINACH [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], provides a publicly available demo of their approach
at spinach.genie.stanford.edu, which uses a chat-like interface similar to ours.1 But like its underlying
approach, this system is limited to the Wikidata knowledge graph.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. System</title>
      <p>In the following, we describe the main components of our system: an LLM-based agent that controls the
function calls (Section 3.1), the indices used by the agent to search the knowledge graphs (Section 3.2),
and the configuration allowing the user to adapt the system to their needs (Section 3.3).</p>
      <p>The GRASP CLI is the main tool for working with the GRASP system. It is used to prepare knowledge
graphs, build indices, start the GRASP server, run the GRASP agent in a headless fashion, and more.
See Fig. 2 for an overview. Most users will use GRASP in a client-server setup. For that, we also provide
a compatible web application. See Fig. 1 for an overview.</p>
      <sec id="sec-3-1">
        <title>3.1. GRASP Agent</title>
        <p>In a nutshell, the GRASP agent works as follows: starting from the user’s question augmented by a
generic instruction prompt, it enters a loop, querying the knowledge graph exploratively until it finds a
SPARQL query that produces the desired answer. It reasons about previous query results to determine
the next query or whether the final answer has been found. The queries are realized via function calling.</p>
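The loop described above can be sketched roughly as follows. This is an illustrative stand-in, not the actual GRASP implementation; the `llm.next_function_call` client interface and the message format are assumptions.

```python
# Rough sketch of the GRASP agent loop described above (not the actual
# implementation): the model repeatedly picks a function call, we execute
# it, feed the result back, and stop on ANS (answer) or CAN (cancel).
def run_agent(llm, question, functions, max_steps=30):
    # The user's question is augmented by a generic instruction prompt.
    messages = [{"role": "system", "content": "generic GRASP instruction"},
                {"role": "user", "content": question}]
    for _ in range(max_steps):
        call = llm.next_function_call(messages, functions)  # assumed client API
        if call.name in ("ANS", "CAN"):  # final answer found, or give up
            return call.arguments
        result = functions[call.name](**call.arguments)  # e.g. EXE, LST, SEN
        messages.append({"role": "assistant", "content": f"call {call.name}"})
        messages.append({"role": "tool", "content": result})
    return None  # no answer within the step budget
```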
        <p>[Figure 2: Overview of the GRASP CLI.]</p>
        <preformat># Run GRASP agent on a question
echo "Name 10 german race car drivers" |
grasp run &lt;config&gt;
# Run GRASP agent on multiple questions
cat questions.jsonl | grasp file &lt;config&gt;
# Start the GRASP server
grasp serve &lt;config&gt;
# Get search index data for a knowledge graph
grasp data &lt;kg&gt; [--endpoint &lt;endpoint&gt;]
# Build search indices for a knowledge graph
grasp index &lt;kg&gt;
# Build an example index from question-sparql-pairs
grasp examples examples.jsonl examples.index
# Evaluate GRASP's predictions on a benchmark
grasp evaluate benchmark.jsonl pred.jsonl &lt;endpoint&gt;</preformat>
        <p>1 See also www.wikidata.org/wiki/User:SpinachBot</p>
        <p>
          Specifically, the agent is provided with a fixed set of functions, each with a unique name and a
description in natural language of its purpose and its parameters.2 The functions are: EXE (execute an
arbitrary SPARQL query), LST (list triples with given constraints), SEN (search for entities matching a
given query string), SPR (search for properties matching a given query string), SPE (search for properties
of a given entity), SOP (search for objects of a given property), ANS (answer and stop), CAN (cancel and
stop). If few-shot examples are available, one of the functions FSE (find similar examples) or FEX (find
random examples) is provided as well. See [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] for a more detailed description of each of these functions.
        </p>
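A minimal sketch of how such a function set might be exposed to a function-calling LLM. The function names are from the paper; the one-line descriptions are paraphrased, and the OpenAI-style tool wrapper with empty parameter schemas is an illustrative assumption, not GRASP's actual schemas.

```python
# Names from the paper; descriptions paraphrased. Each function is exposed
# to the LLM with a name, a natural-language description, and parameters
# (parameter schemas omitted here for brevity).
GRASP_FUNCTIONS = {
    "EXE": "execute an arbitrary SPARQL query",
    "LST": "list triples with given constraints",
    "SEN": "search for entities matching a given query string",
    "SPR": "search for properties matching a given query string",
    "SPE": "search for properties of a given entity",
    "SOP": "search for objects of a given property",
    "ANS": "answer and stop",
    "CAN": "cancel and stop",
}

def to_tool_spec(name, description):
    # Minimal OpenAI-style tool entry; real parameter schemas omitted.
    return {"type": "function",
            "function": {"name": name, "description": description,
                         "parameters": {"type": "object", "properties": {}}}}

tools = [to_tool_spec(n, d) for n, d in GRASP_FUNCTIONS.items()]
```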
        <p>
          For this paradigm to work properly, we rely on the underlying model being trained to support
zero-shot function calling, which is true for nearly all recent closed-source models like GPT-4.1 by
OpenAI [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] or Gemini 2.5 by Google [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], as well as for many recent open-source models like Qwen2.5
[
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] or Qwen3 [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. The latter can be easily self-hosted via vLLM [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. Wherever supported by the
model provider, we use constrained decoding to force the model to output valid function calls that are
guaranteed to follow the available function signatures for increased reliability.
        </p>
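Where constrained decoding is not supported by the provider, the same guarantee must be approximated by validating generated calls after the fact. The following is a hand-rolled validation sketch (the signature table is hypothetical), shown only to illustrate what the constraint enforces; with constrained decoding, invalid calls cannot be emitted in the first place.

```python
# Post-hoc validation sketch: checks that a generated function call uses a
# known function name, supplies all required parameters, and uses no
# parameters outside the declared signature.
def valid_call(call, signatures):
    sig = signatures.get(call["name"])
    if sig is None:
        return False  # unknown function name
    required, allowed = sig
    args = set(call.get("args", {}))
    # all required parameters present, none outside the signature
    return set(required).issubset(args) and args.issubset(allowed)
```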
        <p>Table 1 provides F1-scores and statistics on six benchmarks. For CWQ, WQSP, and SPINACH, GRASP
achieves a comparatively low F1-score and, on average, uses more steps, function calls, and time. We
suspect that this is due to the harder questions, in particular on SPINACH. For CWQ and WQSP, we
observe more LST and fewer EXE calls than on the other benchmarks. We assume that this is due to the
older Freebase knowledge graph being less familiar to the underlying LLM, which thus requires more
exploration and verification steps (which GRASP typically realizes via LST). We also find that the SOP
function is almost never used; it could therefore probably be removed without loss of performance.</p>
        <sec id="sec-3-1-1">
          <title>Tasks</title>
          <p>Not all questions are answerable by or via a SPARQL query. We have therefore implemented an
extension that allows the user to dynamically switch to the task of general question answering over
knowledge graphs by adapting the GRASP instruction and the ANS and CAN functions, respectively. For this
task, the final output is not a SPARQL query, but arbitrary natural language text. For example, this is
useful for a question like “Write a Python script to download all Wikipedia articles about dog breeds”,
which can be answered by first finding a SPARQL query to retrieve the article URLs, and then using
this SPARQL query in a Python program.</p>
          <p>2 The implementation of the functions is fixed and part of the GRASP system. When calling a function, the model provides the function name and parameter values, and receives back the results in text format from our implementation.</p>
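A hypothetical end-to-end sketch of that example (not actual GRASP output): a SPARQL query retrieving English Wikipedia article URLs for dog breeds from Wikidata, used from a small Python program. The Wikidata class wd:Q39367 ("dog breed") and the public query endpoint are assumptions of this sketch.

```python
# Sketch: SPARQL query for Wikipedia article URLs about dog breeds, then a
# Python function that runs it against the public Wikidata endpoint. The
# actual downloading of the articles would follow from the returned URLs.
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "https://query.wikidata.org/sparql"  # public Wikidata endpoint

def dog_breed_article_query():
    # Articles about instances of dog breed (assumed class wd:Q39367);
    # the FILTER keeps only English Wikipedia article URLs.
    return (
        "SELECT ?breed ?article WHERE { "
        "?breed wdt:P31 wd:Q39367 . "
        "?article schema:about ?breed . "
        'FILTER(STRSTARTS(STR(?article), "https://en.wikipedia.org/")) }'
    )

def fetch_article_urls():
    params = urlencode({"query": dog_breed_article_query(), "format": "json"})
    req = Request(ENDPOINT + "?" + params,
                  headers={"User-Agent": "grasp-demo-sketch/0.1"})
    with urlopen(req) as resp:  # network call
        data = json.load(resp)
    return [b["article"]["value"] for b in data["results"]["bindings"]]
```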
        </sec>
        <sec id="sec-3-1-2">
          <title>Follow-up question answering</title>
          <p>
            In [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ], the GRASP agent is used to answer questions by finding a corresponding SPARQL query in a
single uninterrupted interactive process between model and knowledge graphs. In practice, one would
like to have multi-turn conversations and ask follow-up questions, potentially switching tasks and the
underlying knowledge graphs at each subsequent question. We implement this by first determining
the GRASP instruction for the current task and knowledge graph selection, then adding all previous
questions and reasoning or function call steps unchanged, and finally asking the follow-up question.
The web application supports this use case by sending an additional past field containing the full
interaction history on follow-up questions.
          </p>
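The construction described above can be sketched as follows; the message roles and field names are illustrative stand-ins for the actual GRASP message format.

```python
# Sketch of follow-up question answering: the instruction for the current
# task and knowledge graph selection comes first, then the previous turns
# unchanged (the "past" field sent by the web application), then the new
# follow-up question.
def build_followup_messages(instruction, past, followup_question):
    messages = [{"role": "system", "content": instruction}]
    messages.extend(past)  # previous questions, reasoning and function calls
    messages.append({"role": "user", "content": followup_question})
    return messages
```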
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Search indices</title>
        <p>
          Besides the agent, the search indices are the second integral part of the overall GRASP approach.
Currently, the GRASP system supports two types of search indices: prefix-keyword indices (PFX) for
prefix-sensitive keyword search, and similarity indices (SIM) for vector-based similarity search. We
refer to [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] for a more detailed explanation of both index types.
        </p>
        <p>
          The indices enable search queries that are either inefficient in SPARQL (in the case of prefix-keyword
search) or not supported by SPARQL (in the case of vector-based similarity search). We make the search
indices accessible to the GRASP agent via easy-to-use search functions (see Section 3.1), which are
purpose-built for the most common types of searches a human expert performs while writing SPARQL
queries. It is shown in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] that these search functions (implemented using the mentioned indices)
significantly boost the overall system performance (compared to when only EXE is provided).
        </p>
        <p>For entities, we build a PFX by default, because it requires less disk space and RAM, and is faster
to query than a SIM. Besides, SIM does not give significantly better results because keyword search
works well on entities. Given a PFX query, we calculate a score between each entity and the query as
the weighted sum of the number of exact and prefix keyword matches minus a weighted sum of the
number of unmatched query and entity keywords. If there is neither an exact nor a prefix keyword
match, the entity is excluded entirely. The entities with the k highest scores are then returned as search
results, where k is a GRASP configuration option (see search_top_k: in Fig. 3).</p>
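The scoring rule just described can be sketched as follows; the weights are made up for illustration and are not GRASP's actual values.

```python
# Sketch of the PFX scoring rule: weighted sum of exact and prefix keyword
# matches, minus a weighted sum of unmatched query and entity keywords.
# An entity with neither an exact nor a prefix match is excluded entirely.
def pfx_score(query_kw, entity_kw,
              w_exact=2.0, w_prefix=1.0, w_unmatched_q=0.5, w_unmatched_e=0.1):
    exact = sum(1 for q in query_kw if q in entity_kw)
    prefix = sum(1 for q in query_kw
                 if q not in entity_kw
                 and any(e.startswith(q) for e in entity_kw))
    if exact + prefix == 0:
        return None  # no match at all: exclude this entity
    matched_e = {e for e in entity_kw
                 if e in query_kw or any(e.startswith(q) for q in query_kw)}
    unmatched_q = len(query_kw) - exact - prefix
    unmatched_e = len(entity_kw) - len(matched_e)
    return (w_exact * exact + w_prefix * prefix
            - w_unmatched_q * unmatched_q - w_unmatched_e * unmatched_e)

def pfx_search(query_kw, entities, k=10):
    # entities: list of (name, keyword-list) pairs; returns top-k names.
    scored = [(pfx_score(query_kw, kw), name) for name, kw in entities]
    scored = [(s, name) for s, name in scored if s is not None]
    return [name for s, name in sorted(scored, key=lambda x: -x[0])[:k]]
```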
        <p>For properties, keyword queries often miss relevant search results. For example, searching for “born
in” should also match “place of birth”, but does not when using a PFX. We therefore build a SIM for
properties by default. Since knowledge graphs typically have only a few properties, the higher disk space
and RAM consumption of SIM compared to PFX does not matter. Given a search query, we compute its
vector embedding (using Qwen/Qwen3-Embedding-0.6B [26] by default), compute the cosine-similarity
to all pre-computed property embeddings, and return the top k properties with the highest similarity as
search results. Note that a GPU is required to run a SIM efficiently in practice.</p>
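The SIM search step can be sketched as follows; the embedding model itself (Qwen/Qwen3-Embedding-0.6B by default) is stubbed out, and only the cosine top-k retrieval is shown.

```python
# Sketch of SIM search: normalize the query and property embeddings,
# compute cosine similarity as a dot product, and return the top-k
# properties with the highest similarity.
import numpy as np

def sim_search(query_vec, property_vecs, property_names, k=10):
    q = query_vec / np.linalg.norm(query_vec)
    p = property_vecs / np.linalg.norm(property_vecs, axis=1, keepdims=True)
    sims = p @ q                 # cosine similarity to every property
    top = np.argsort(-sims)[:k]  # indices of the k highest similarities
    return [(property_names[i], float(sims[i])) for i in top]
```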
        <p>Both PFX and SIM support specifying a subset of items to restrict the search to. Together with a
SPARQL endpoint we use this functionality to implement the advanced search functions of GRASP.
For example, SPE allows searching for properties of a given entity. For that, we first send a SPARQL
query to the endpoint to retrieve all potential properties for the given entity, restrict the corresponding
property index to these properties, and execute the search query over that restricted index.</p>
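The SPE mechanism described above can be sketched as follows; the endpoint client, the index's `restrict`/`search` methods, and the SPARQL query are illustrative stand-ins, not GRASP's actual interfaces.

```python
# Sketch of SPE: first a SPARQL query retrieves the candidate properties of
# the given entity, then the property index is restricted to these
# candidates before executing the search query over it.
def spe(entity, query, endpoint, property_index, k=10):
    # Hypothetical SPARQL; the real GRASP query may differ.
    sparql = f"SELECT DISTINCT ?p WHERE {{ {entity} ?p ?o }}"
    candidates = set(endpoint.select(sparql))         # assumed client method
    restricted = property_index.restrict(candidates)  # assumed index method
    return restricted.search(query, k)
```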
        <p>Table 2 provides statistics for the search indices of eight knowledge graphs. Starting the GRASP
server with all these indices takes less than 20 s, and uses ≈ 20 GB of RAM and ≈ 3 GB of GPU memory,
measured on a machine with an AMD Ryzen 9 7950X CPU, an NVIDIA GeForce RTX 4090 GPU, and 4
× 4 TB NVMe SSD.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Configuration</title>
        <p>GRASP can be easily configured via a single configuration file in YAML format. See Fig. 3 for the
file structure and configuration options. We provide sensible defaults for all configuration options,
so the user often only needs to configure the particular knowledge graphs they want to use with
GRASP. For example, a YAML config to run GRASP with Wikidata and Freebase can be as simple as
knowledge_graphs: [{kg: "wikidata"}, {kg: "freebase"}]. With the client-server setup, all
configured knowledge graphs are automatically available in the web application, which itself requires no
configuration; only the address and port of the GRASP server need to be set.</p>
        <p>
          We briefly discuss the three most important configuration options for GRASP besides the model and
knowledge graphs themselves:
1. Setting feedback: true corresponds to GRASP-F from [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] and allows the GRASP agent to
reflect on and improve its own answers, which increases quality at the cost of longer runtimes. The
max_feedbacks: option sets the upper bound for the number of feedback loops per generation.
2. Setting force_examples: to a knowledge graph that specifies an example index via example_index:
triggers a call of either the FEX or FSE function (depending on whether random_examples: is set to
true or false) at the beginning of a generation. This enables few-shot learning in the style of the
few-shot evaluations from [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
3. Setting know_before_use: true tells GRASP to verify knowledge graph items before using them in
EXE function calls. This is enforced by returning an error message rather than the query result if an
EXE call uses items that were not present in any previous function call result. This mechanism avoids
hallucinations of knowledge graph items, which we found to be a frequent problem with GRASP, in
particular on knowledge graphs that are less familiar to the underlying LLM. For example, for the
DBLP-QuAD [31] benchmark, without this setting, GRASP often uses incorrect properties without
verification, like the seemingly more canonical dblp:author instead of the correct dblp:authoredBy.
Consequently, this setting improves the F1-score on this benchmark from 51.0 to 66.8.
        </p>
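A configuration sketch combining the options discussed above. The option names come from the text; the values and the model: key are illustrative assumptions, and the actual defaults and file structure are those of Fig. 3.

```yaml
# Illustrative GRASP config sketch; option names from the text, values made up.
model: "gpt-4.1"                 # any model trained for function calling
knowledge_graphs:
  - kg: "wikidata"
  - kg: "dblp"
    example_index: "dblp.index"  # enables the FEX/FSE example functions
force_examples: "dblp"           # call FEX or FSE at generation start
random_examples: false           # false selects FSE (similar examples)
feedback: true                   # GRASP-F: reflect on and improve answers
max_feedbacks: 2                 # upper bound on feedback loops
know_before_use: true            # verify KG items before use in EXE calls
search_top_k: 10                 # number of returned search results
```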
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Additional Evaluations</title>
      <p>
        To further validate GRASP’s zero-shot question answering capabilities, we extend the set of evaluated
knowledge graphs and benchmarks from [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. First, we build our own small benchmark for IMDb [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ],
the popular movie and series database, consisting of 15 questions. Second, we evaluate GRASP on
the small, domain-specific knowledge graph representing a corporate setting from the TEXT2SPARQL
challenge [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The corresponding challenge benchmark contains 50 questions and is “designed to test a
model’s ability to adapt to restricted and domain-focused data environments”. On both benchmarks,
GRASP achieves good results and even surpasses the best (INFAI) and second best (IIS-Q) entries from
the TEXT2SPARQL challenge by a large margin. See Table 3 for full results.
      </p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>
        We have built a complete system based on the GRASP approach from [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We have combined the core
SPARQL question-answering capability with general question answering and multi-turn follow-up
questions. We have extended the evaluation from [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] by new experiments on the IMDb knowledge
graph and the TEXT2SPARQL challenge, with strong results. This provides further support for GRASP’s
zero-shot capabilities across knowledge graphs.
      </p>
      <p>For future work, we consider supporting the automatic building of search indices from nothing but a
SPARQL endpoint, as well as integrating search indices and their functionality directly into SPARQL.
This would be a step towards both zero-shot and zero-configuration question answering on arbitrary
given knowledge graphs. To prevent GRASP from repeatedly making the same mistakes, which can
occur in the zero-shot setting, we also consider adding memory or other forms of online learning as
potential future work.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID
499552394 – SFB 1597.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Walter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bast</surname>
          </string-name>
          , GRASP:
          <article-title>Generic reasoning and SPARQL generation across knowledge graphs</article-title>
          ,
          <source>in: ISWC</source>
          ,
          <year>2025</year>
          .
          <article-title>Accepted for publication</article-title>
          . Preprint available at https://arxiv.org/abs/2507.08107.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Vrandecic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          ,
          <article-title>Wikidata: a free collaborative knowledgebase</article-title>
          ,
          <source>Commun. ACM</source>
          <volume>57</volume>
          (
          <year>2014</year>
          )
          <fpage>78</fpage>
          -
          <lpage>85</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Bollacker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Evans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Paritosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Sturge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Taylor</surname>
          </string-name>
          , Freebase:
          <article-title>a collaboratively created graph database for structuring human knowledge</article-title>
          ,
          <source>in: SIGMOD Conference</source>
          , ACM,
          <year>2008</year>
          , pp.
          <fpage>1247</fpage>
          -
          <lpage>1250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Ackermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. M.</given-names>
            <surname>Beckermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kalmbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Neises</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ollinger</surname>
          </string-name>
          ,
          <article-title>The dblp knowledge graph and SPARQL endpoint</article-title>
          ,
          <source>TGDK</source>
          <volume>2</volume>
          (
          <year>2024</year>
          ) 3:1-3:23. URL: https://doi.org/10.4230/TGDK.2.2.3. doi:10.4230/TGDK.2.2.3.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] IMDb, IMDb: Ratings, Reviews, and
          <article-title>Where to Watch the Best Movies</article-title>
          &amp; TV Shows,
          <year>2025</year>
          . URL: https://www.imdb.com/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>AKSW</surname>
          </string-name>
          ,
          <source>TEXT2SPARQL'25</source>
          ,
          <year>2025</year>
          . URL: https://text2sparql.aksw.org/.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , P. Ng,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <article-title>DecAF: Joint decoding of answers and logical forms for question answering over knowledge bases, in: ICLR, OpenReview</article-title>
          .net,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Haffari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <article-title>Reasoning on graphs: Faithful and interpretable large language model reasoning</article-title>
          , in: ICLR, OpenReview.net,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>H.</given-names>
            <surname>Luo</surname>
          </string-name>
          , H. E,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , C. Ma, G. Dong,
          <string-name>
            <given-names>M.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          , A. T. Luu,
          <article-title>ChatKBQA: A generate-then-retrieve framework for knowledge base question answering with fine-tuned large language models</article-title>
          , in: ACL (Findings),
          <source>Association for Computational Linguistics</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>2039</fpage>
          -
          <lpage>2056</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Patidar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sawhney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          , Mausam,
          <string-name>
            <surname>I. Bhattacharya</surname>
          </string-name>
          ,
          <article-title>Few-shot transfer learning for knowledge base question answering: Fusing supervised models with in-context learning</article-title>
          ,
          <source>in: ACL (1)</source>
          ,
          <source>Association for Computational Linguistics</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>9147</fpage>
          -
          <lpage>9165</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <article-title>Don't generate, discriminate: A proposal for grounding language models to real-world environments</article-title>
          ,
          <source>in: ACL (1)</source>
          ,
          <source>Association for Computational Linguistics</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>4928</fpage>
          -
          <lpage>4949</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Chai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <article-title>Debate on graph: A flexible and reliable reasoning framework for large language models</article-title>
          , in: AAAI, AAAI Press,
          <year>2025</year>
          , pp.
          <fpage>24768</fpage>
          -
          <lpage>24776</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Semnani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Triedman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. D.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Lam</surname>
          </string-name>
          ,
          <article-title>SPINACH: SPARQL-based information navigation for challenging real-world questions</article-title>
          , in: EMNLP (Findings),
          <source>Association for Computational Linguistics</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>15977</fpage>
          -
          <lpage>16001</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Ni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Shum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <article-title>Think-on-Graph: Deep and responsible reasoning of large language model on knowledge graph</article-title>
          , in: ICLR, OpenReview.net,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <article-title>StructGPT: A general framework for large language model to reason over structured data</article-title>
          , in: EMNLP, Association for Computational Linguistics,
          <year>2023</year>
          , pp.
          <fpage>9237</fpage>
          -
          <lpage>9251</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>OpenAI</surname>
          </string-name>
          ,
          <source>Introducing GPT-4.1 in the API</source>
          ,
          <year>2025</year>
          . URL: https://openai.com/index/gpt-4-1/, accessed: 2025-05-11.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Google DeepMind</surname>
          </string-name>
          ,
          <source>Introducing Gemini 2.0: our new AI model for the agentic era</source>
          ,
          <year>2024</year>
          . URL: https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/, accessed: 2025-05-11.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wei</surname>
          </string-name>
          , et al.,
          <article-title>Qwen2.5 technical report</article-title>
          , arXiv preprint arXiv:2412.15115 (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Qwen</surname>
            <given-names>Team</given-names>
          </string-name>
          ,
          <source>Qwen3 technical report</source>
          ,
          <year>2025</year>
          . URL: https://arxiv.org/abs/2505.09388. arXiv:2505.09388.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>W.</given-names>
            <surname>Kwon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhuang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Gonzalez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Stoica</surname>
          </string-name>
          ,
          <article-title>Efficient memory management for large language model serving with PagedAttention</article-title>
          , in: SOSP, ACM,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>