<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>I. Kamenko);</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>for Policy-Making Support: Initial Implementation of the Data Exploration LLM-RAG Agent</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ilija Kamenko</string-name>
          <email>ilija.kamenko@ivi.ac.rs</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dragan Kukolj</string-name>
          <email>dragan.kukolj@ivi.ac.rs</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dubravko Culibrk</string-name>
          <email>dculibrk@uns.ac.rs</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Workshop</string-name>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Multi-Agent Systems, Data Exploration, RAG, LLM, Policy-Making Support</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Technical Sciences, University of Novi Sad</institution>
          ,
          <addr-line>Trg Dositeja Obradovica 6, Novi Sad</addr-line>
          ,
          <country country="RS">Serbia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>The Institute for Artificial Intelligence Research and Development of Serbia</institution>
          ,
          <addr-line>Fruskogorska 1, Novi Sad</addr-line>
          ,
          <country country="RS">Serbia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>We present research in progress on the development of a multi-agent framework designed to enhance policy making through advanced data analysis techniques. The framework relies on specialized agents collaborating to transform complex datasets into actionable insights for decision-makers. Substantial progress has been achieved with the implementation of the first core component, the Data Exploration LLM-RAG Agent, which facilitates structured and intuitive interaction with complex tabular data by leveraging large language model retrieval-augmented generation techniques (LLM-RAG). A case study focusing on researcher productivity in Serbia illustrates the initial functionality and practical relevance of the approach. Further development is actively underway, with ongoing eforts directed toward the integration of domain knowledge and policy recommendation agents, ultimately aiming to establish a comprehensive, intelligent decision-support ecosystem.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Increasing availability of large-scale datasets in science, education, and governance presents new
opportunities for the development of new decision-making systems in public policy domain. Rapid
development of large language models (LLM), capable of processing and generating huge amounts of
text as well as simulating human behavior and reasoning, enables new approaches in the development
of decision-making systems. Recent studies show that the introduction of the concept of multi-agent
LLM-based systems [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] contributes to enhanced performance, i.e. reduced bottlenecks, enhanced fault
tolerance, improved accuracy, or more refined decisions. For instance, in the domain of societal
simulation a configurable multi-agent interaction framework is utilized to simulate classroom interactions
between teachers and students [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Simulation of human-like behavior using LLM-based agents, e.g.
changing attitudes and emotions in response to social events is presented in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Another multi-agent
framework named Cognitive Agents and Social Evolution Simulator, represents the simulation of social
interaction and communication based on complex networks [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Its capabilities are illustrated by an
election process simulation in which agents represent voters that reproduce voter behavior.
      </p>
      <p>A Multi-agent LLM-based system is a framework where multiple agents powered by LLMs work
together, communicate, collaborate and adapt, in order to solve a given problem. Multi-agent LLM-based
systems are a powerful approach to complex problem-solving. Their power lies in the following features:
parallel processing by distribution of tasks among multiple agents, specialization of agents in various
domains with diferent tools attached, and advanced reasoning by powerful LLM models with adapted
system prompts.</p>
      <p>This work presents an evolving multi-agent framework, where specialized agents collaboratively
analyze data, validate findings, enrich outputs with domain-specific knowledge, and ultimately generate</p>
      <p>CEUR</p>
      <p>ceur-ws.org
policy-relevant recommendations. At the current stage, substantial progress has been made with
the implementation of the Data Exploration LLM-RAG Agent, which enables flexible and dynamic
interaction with structured datasets to support preliminary data understanding, exploratory analysis,
and trend identification. Development of higher-level agents including the Domain Knowledge Agent
and the Policy Recommendation Agent is actively underway, with future research phases aimed at
achieving a fully integrated, decision-support ecosystem.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Multi-Agent Framework Concept</title>
      <p>The full system is envisioned as a multi-agent framework designed to coordinate specialized agents, each
focused on a distinct set of responsibilities within the data-driven research analysis and policy support
workflow. This modular architecture ensures scalability, flexibility, and eficiency by distributing tasks
across agents with complementary capabilities.</p>
      <p>The overall structure of the framework is illustrated in Figure 1, showing the flow from the user’s
input through orchestration, specialized agent collaboration, and final policy-oriented outputs.</p>
      <p>
        In this framework, the Agent Supervisor receives input statements from the user and devises a plan
to break down the problem into smaller tasks and then forwards the tasks to agents who are best suited
for the specific tasks. The supervisor checks the validity of the task executed by the assigned agent. The
reasoning process is inspired in part by [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The Data Exploration LLM-RAG Agent executes structured
data querying, statistical analysis and visualization based on user prompts and retrieved data. The
Domain Knowledge Agent enhances the analytical process by integrating domain-specific expertise,
models and policy frameworks to ensure contextually relevant outputs. The Policy Recommendation
Agent synthesizes analytical results and domain knowledge into coherent, evidence-based policy options
and strategic insights.
      </p>
      <p>Through seamless interaction between these agents, the system is capable of addressing complex
domain questions and translating analytical findings into actionable recommendations. This multi-agent
setup fosters distributed intelligence, allowing specialized components to collaborate dynamically and
deliver rich, multi-dimensional insights to users.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Realized Component: Data Exploration LLM-RAG Agent</title>
      <p>The Data Exploration LLM-RAG Agent is designed to transform natural language queries into structured,
executable analyses. It implements a full Retrieval-Augmented Generation (RAG) architecture, consisting
of three integrated modules: retrieval, augmentation, and generation as is illustrated in Figure 2. These
modules enable the system to guide large language models (LLMs) with contextual data, relevant
analytical methods, and execution logic, delivering both statistical summaries and visual outputs.</p>
      <sec id="sec-3-1">
        <title>3.1. Retrieval</title>
        <p>The retrieval module is responsible for identifying statistically relevant methods to guide the analysis.
When a user submits a query, the system semantically analyzes the text to find appropriate statistical
techniques. It uses a vector database (ChromaDB) to store embeddings of the statistical method
descriptions such as t-tests, ANOVA, Pearson correlation, regression models, and clustering algorithms.
The user’s query as well as embeddings in the vector database are encoded using an embedding model
(all-MiniLM-L6-v2) and then compared to the stored vectors.</p>
        <p>If a statistically relevant method is found with a similarity score below a defined threshold, the
corresponding method name and description are retrieved. This result is used to enhance the prompt
sent to the language model, ensuring the generated analysis aligns with appropriate statistical practices.
If no relevant method is confidently retrieved, the system defaults to generating general-purpose code,
preserving robustness and flexibility.</p>
        <p>This retrieval component forms the first stage of the RAG pipeline, enabling dynamic incorporation
of external, domain-relevant knowledge into the analysis process.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Augmentation</title>
        <p>The augmentation module prepares the language model context by constructing a complete, data-aware
prompt. First, the structured dataset is automatically parsed and modeled into a JSON schema, describing
each field’s name, type, and semantic role. This schema acts as a formalized summary of the dataset’s
structure and is critical for guiding how the language model interacts with the data.</p>
        <p>The system then assembles a system message, which defines constraints and operational guidelines
such as instructing the language model to work only with the preloaded Pandas data (a Python library for
data manipulation and analysis) and prohibiting external imports or data redefinition. Simultaneously,
the user message contains the natural language query, optionally augmented with the statistical method
retrieved during the previous phase.</p>
        <p>These two messages form the composite prompt sent to the LLM model. Together, they give the
model both the semantic intent of the query and the structural context of the dataset, enabling it to
generate appropriate and executable analytical code.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Generation</title>
        <p>In the generation module, the LLM model returns markdown-formatted output containing one or
more python code blocks. These blocks may include data filtering, aggregation, statistical testing, or
visualization instructions depending on the complexity of the query and any injected statistical method.</p>
        <p>The system parses these code blocks, checks for syntax validity, and then executes them in a secure,
isolated environment. Textual results, such as statistical test outputs or numerical summaries, are
captured, while any plots generated using Matplotlib (a python library for data visualization) are
rendered and encoded as Base64 images for downstream presentation.</p>
        <p>To support iterative exploration, the agent also maintains conversational history, allowing users to
refine their questions or ask follow-up queries based on previous results. This module completes the
RAG loop by producing tangible outputs from semantically guided code generation.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Case Study: Researcher Productivity in Serbia</title>
      <p>The efectiveness of the Data Exploration LLM-RAG Agent was evaluated through a real-world case
study focused on researcher productivity in Serbia. This domain was selected because it ofers rich,
structured data across multiple dimensions (demographics, academic progression, and bibliometric
indicators) making it an ideal testbed for demonstrating the agent’s capabilities in extracting actionable
insights to support policy development.</p>
      <sec id="sec-4-1">
        <title>4.1. Dataset Overview</title>
        <p>The system was evaluated using a national dataset of Serbian researchers spanning 15 years. Key fields
include:
• Researcher identifiers (national ID, ORCID)
• Academic fields and titles
• Gender, birth year, and institutional afiliation
• Academic promotion histories
• Citation counts from multiple bibliometric databases
• Research outputs classified by national standards
The dataset was ingested in CSV format and dynamically modeled into JSON schemas for internal use.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Example Analytical Tasks</title>
        <p>To demonstrate the flexibility and capabilities of the Data Exploration LLM-RAG Agent, diferent types
of analytical tasks are showcased in the following examples. The ability of the Agent to return purely
textual insights, generate visual outputs, and combine statistical analysis with textual and graphical
responses is highlighted. Through these varied formats, the versatility of the Data Exploration
LLMRAG Agent in supporting a wide range of user needs, from simple descriptive queries to advanced,
statistically driven explorations, as illustrated.</p>
        <sec id="sec-4-2-1">
          <title>4.2.1. Example 1: Textual Response Only</title>
          <p>In this example task, the question ”How many researchers are between 30 and 45 years old?” is posed.
The request is processed by the agent, and a textual response is generated: ”Number of researchers
between 30 and 45 years old: 8,157.” In this case, no statistical function was applied, as the agent did
not identify suficient semantic similarity between the user prompt and the available vector database
records, resulting in a direct retrieval and simple counting operation.</p>
        </sec>
        <sec id="sec-4-2-2">
          <title>4.2.2. Example 2: Graphical Response Only</title>
          <p>In this example task, the question ”Compare graphically the trend of total citations over the last 10 years
between male and female researchers holding the rank of associate professor” is posed. The request
is processed by the agent, and only a graphical response is generated, illustrating the comparison of
citation trends between the two groups (Figure 3). In this case, no statistical function was applied, as
the agent did not identify suficient semantic similarity between the user prompt and the statistical
operation templates stored in the vector database.</p>
        </sec>
        <sec id="sec-4-2-3">
          <title>4.2.3. Example 3: Textual and Graphical Response</title>
          <p>In this example task, the question ”Create an overview of the number of researchers by gender by year
of birth in 5-year intervals. For each value, calculate the statistical significance in relation to the others
and mark it with a shade from the color palette” is posed. The request is processed by the agent, and a
statistical function is applied. The Chi-squared test is performed, and the resulting p-value is returned:
”Chi-squared test p-value: 2.772350560555721e-51.” In addition to the textual output, a corresponding
graphical response is generated, illustrating the number of researchers by gender across 5-year birth
intervals with color shading used to indicate statistical significance (Figure 4).</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This work represents a foundational phase toward the development of a modular, multi-agent system for
supporting evidence-based policymaking from complex datasets. The realization of the Data Exploration
Agent demonstrates that structured, natural-language-driven data analysis is not only feasible but also
capable of producing outputs relevant to informing and guiding policy discussions.</p>
      <p>Active research eforts are now directed toward the implementation of the Supervisor Agent, the
Domain Knowledge Agent, and the Policy Recommendation Agent, with the goal of building a
comprehensive decision-support framework capable of delivering collaborative, verified, and policy-aligned
insights.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Acknowledgements</title>
      <p>This work is part of the TANGO project, which has received funding from the European Union’s Horizon
Europe research and innovation programme under grant agreement No. 101070052.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>In preparing the work, no generative AI tools were used.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>A survey on llm-based multi-agent systems: workflow, infrastructure, and challenges</article-title>
          ,
          <source>Vicinagearth</source>
          <volume>1</volume>
          (
          <year>2024</year>
          ).
          <source>doi:10.1007/s44336-024-00009-2.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>Cgmi: Configurable general multi-agent interaction framework, arXiv preprint (</article-title>
          <year>2023</year>
          ). arXiv:
          <volume>2308</volume>
          .
          <fpage>1250</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Ghafarzadegan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Hosseinichimeh</surname>
          </string-name>
          ,
          <article-title>Generative agent-based modeling: Unveiling social system dynamics through coupling mechanistic models with generative artificial intelligence, arXiv preprint (</article-title>
          <year>2023</year>
          ). arXiv:
          <volume>2309</volume>
          .
          <fpage>11456</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <surname>Y. Zhang,</surname>
          </string-name>
          <article-title>Casevo: A cognitive agents and social evolution simulator, arXiv preprint (</article-title>
          <year>2024</year>
          ). arXiv:
          <volume>2412</volume>
          .
          <fpage>19498</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <surname>Plan-</surname>
          </string-name>
          and
          <article-title>-solve prompting: Improving zero-shot chain-of-thought reasoning by large language models</article-title>
          , in: A.
          <string-name>
            <surname>Rogers</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Boyd-Graber</surname>
          </string-name>
          , N. Okazaki (Eds.),
          <source>Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL</source>
          <year>2023</year>
          ), volume
          <volume>1</volume>
          , Toronto, Canada,
          <year>2023</year>
          , pp.
          <fpage>2609</fpage>
          -
          <lpage>2634</lpage>
          . doi:
          <volume>10</volume>
          .18653/v1/
          <year>2023</year>
          .
          <article-title>acl-long</article-title>
          .
          <volume>147</volume>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>