<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>PeRAG: Multi-Modal Perspective-Oriented Verbalization with RAG for Inclusive Decision Making</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Muhammad Saad Amin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Horacio Jesús Jarquín Vásquez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Franco Sansonetti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simona Lo Giudice</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Basile</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viviana Patti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Turin</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Electrical and Computer Engineering, Aarhus University</institution>
          ,
          <country country="DK">Denmark</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dipartimento di Economia e Statistica "Cognetti de Martiis", University of Turin</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Urban policy makers require comprehensive insights into transportation issues and demographic distributions to design equitable and efficient infrastructure. However, analyzing multi-modal data (numeric and visual) while accounting for diverse perspectives remains challenging. To address this, we propose PeRAG, a novel pipeline combining multi-modal perspective-oriented verbalization with Retrieval-Augmented Generation (RAG). Our approach first converts numeric transportation/demographic data and population heatmaps into natural language descriptions using LLaMA, incorporating multiple policy-relevant perspectives. These verbalizations are then fed into the RAG system to generate context-aware, perspective-driven responses for urban planners. We demonstrate the effectiveness of PeRAG in generating actionable insights for transportation policy, bridging the gap between raw data and decision-making. Our experiments highlight the pipeline's ability to handle heterogeneous data modalities while adapting to diverse stakeholder viewpoints, offering a scalable solution for smart city analytics.</p>
      </abstract>
      <kwd-group>
        <kwd>Multi-modal Verbalization</kwd>
        <kwd>Retrieval-Augmented Generation (RAG)</kwd>
        <kwd>Perspective-Aware NLP</kwd>
        <kwd>Large Language Models (LLMs)</kwd>
        <kwd>Urban Transportation Analytics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <p>Urban policy makers face significant challenges in designing equitable transportation systems due to the complex interplay of demographic shifts, infrastructure constraints, and socio-economic disparities [1]. Raw data (e.g., transit logs, census metrics, heatmaps) is often siloed, requiring labor-intensive integration to derive insights [2, 3]. While NLP and computer vision techniques have been applied to urban analytics, they typically treat data modalities independently, ignoring the need for cross-modal reasoning (e.g., correlating heatmap patterns with numeric poverty indices) [4]. This limits their utility for policy decisions requiring holistic, interpretable inputs.</p>
        <p>In recent years, advances in machine learning and NLP have enabled new forms of automated data interpretation, particularly in multimodal settings where information spans both structured and unstructured modalities [5]. Urban environments provide a rich case for multimodal reasoning: data can include numerical variables (e.g., population size, number of transport lines), visual artifacts (e.g., heatmaps of population density), and geographical descriptors (e.g., district boundaries). Integrating and interpreting these different modalities coherently is essential for supporting informed decision-making.</p>
        <p>One of the emerging challenges in this context is perspective-aware verbalization, the task of transforming multimodal data into textual descriptions that reflect different analytical or stakeholder viewpoints [6]. For instance, the same urban dataset can be verbalized from a demographics perspective (“This area has a high population of elderly residents”) or a transportation accessibility perspective (“This zone has limited coverage of public transport lines despite high population density”). Generating such targeted descriptions from numeric and image data requires models that understand not only the input modalities but also the intended angle of interpretation [7]. This introduces both linguistic complexity (choosing appropriate vocabulary, structure, and focus) and reasoning complexity (determining what information is salient for a given perspective).</p>
        <p>These challenges compound when verbalization is integrated into retrieval-augmented generation (RAG) pipelines. Traditional RAG frameworks are typically designed for text-based retrieval from large knowledge bases; extending them to operate over generated textual representations of multimodal data introduces new issues: retrieval is only as effective as the fidelity and perspective alignment of the verbalized input, and generation must remain factual, grounded, and contextually relevant [8]. Moreover, multimodal verbalizations are often more compact and abstract than traditional long-form documents, which poses difficulties in relevance ranking and context-aware generation.</p>
        <p>In this work, we investigate the following core research questions:
1. How can multimodal data (numeric and visual) be verbalized in a perspective-aware manner to support policy-level interpretation?
2. What are the linguistic and functional trade-offs between zero-shot and few-shot verbalization approaches in this context?
3. Can a lightweight, locally-deployable RAG pipeline (PeRAG) effectively answer urban policy questions when built on top of such verbalizations?
4. How does the factuality and utility of such a system compare to general-purpose LLMs, especially in high-stakes policy scenarios?</p>
        <p>To address these questions, we present PeRAG, a novel framework that combines multimodal data verbalization with a perspective-aware Retrieval-Augmented Generation pipeline. Our work is based on a custom dataset for the city of Turin, comprising over 7,000 examples across multiple years (2012–2019), including 31 features covering demographics, transportation, and traffic. We verbalize both numeric and heatmap data into English summaries across several perspectives (e.g., demographics-focused, transport lines-focused, temporal shifts), using LLaMA-3.1-8B for the verbalization of numeric data and LLaMA-3.2-11B-Vision for the verbalization of heatmap data, in zero-shot and few-shot settings. These verbalizations serve as the retrievable memory in a Gemma-3-4B-IT-powered RAG system, which supports question answering on urban policy issues. All models are run locally to ensure data privacy and control.</p>
        <p>Our key contributions are as follows:</p>
        <p>• We conduct human evaluation and qualitative analysis to assess factuality and relevance, and compare PeRAG outputs against general-purpose LLMs.</p>
        <p>The rest of this paper is organized as follows: Section 2 reviews related work in multimodal NLP, verbalization, and RAG systems. Section 3 describes the methodology, including dataset details, verbalization techniques, and system architecture. Section 4 outlines our experimental setup. Section 5 presents results from verbalization and QA evaluations. Section 6 offers a detailed analysis and discussion. Section 7 concludes the paper and outlines directions for future work.</p>
        <p>CLiC-it 2025: Eleventh Italian Conference on Computational Linguistics, September 24–26, 2025, Cagliari, Italy. * Corresponding author. saad@ece.au.dk (M. S. Amin); horaciojesus.jarquinvasquez@unito.it (H. J. Jarquín Vásquez); franco.sansonetti@unito.it (F. Sansonetti); simona.logiudice@unito.it (S. Lo Giudice); valerio.basile@unito.it (V. Basile); viviana.patti@unito.it (V. Patti). ORCID: 0000-0002-7002-9373 (M. S. Amin); 0000-0001-8110-6832 (V. Basile); 0000-0001-5991-370X (V. Patti). © 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      </sec>
      <sec id="sec-1-2">
        <title>2. Related Work</title>
        <p>A major research avenue in knowledge-enhanced language modeling is Retrieval-Augmented Generation (RAG), in which a retriever module selects relevant textual passages from a knowledge base that are then fed into a generator to produce a grounded, informative response [8, 11]. This has been particularly effective in tasks like open-domain QA, summarization, and dialogue. Variants such as MuRAG [12] have explored incorporating multiple modalities into retrieval pipelines.</p>
        <p>In our work, we adapt and extend the RAG architecture for perspective-aware generation by populating the retrieval index with natural language verbalizations that encode distinct viewpoints over the same input data. Unlike knowledge injection methods that incorporate triplet-based structured knowledge [13], we work purely with free-text verbalizations generated from multimodal data. The retriever retrieves relevant perspective-conditioned passages, and the generator uses them to compose contextually rich, stakeholder-specific responses. This results in a system, PeRAG (Perspective-aware RAG), that enables context-sensitive generation based not just on topical relevance but on the interpretive stance encoded in the input text passages. To the best of our knowledge, PeRAG represents the first instantiation of RAG tailored for multi-perspective decision support in urban governance contexts.</p>
        <p>Although LLMs such as ChatGPT and GPT-3 [14] have shown great success in general-purpose generation tasks, their application in decision-making processes has been limited by a lack of specificity and contextual adaptation [15]. Generic outputs are often insufficient in high-stakes domains like urban planning, where conflicting group needs (e.g., between commuters, the elderly, and environmentally conscious citizens) must be mediated through nuanced communication strategies. Efforts like BLOOM [16] have underlined the importance of transparent, representative training data, particularly for multilingual settings. However, our implementation is currently focused on English language generation, which remains dominant in LLM infrastructure and evaluation. By operating entirely in English while incorporating multi-perspective reasoning, our approach can generalize to multilingual contexts in future iterations but already demonstrates strong utility in data-rich governance scenarios [17].</p>
      </sec>
      <sec id="sec-1-3">
        <title>3. Methodology</title>
        <p>Our methodology introduces a novel pipeline that bridges heterogeneous urban data and perspective-aware natural language generation using a tailored Retrieval-Augmented Generation (RAG) architecture. The following subsections detail our approach to homogenizing structured inputs, dataset preparation, verbalization strategies, system design, and evaluation.</p>
        <p>Figure 1: PeRAG: Perspective inclusive pipeline with RAG. Stages shown: Multi-Modal Data (initial data collection and preparation); Verbalization through LLM (conversion of numeric data to text using LLM); Verbalized Text Perspectives (generating text from various perspectives); Model Response Analysis (evaluating the model's output through questioning).</p>
        <p>3.1. Homogenizing Heterogeneous Urban Data for RAG</p>
        <p>Unlike conventional RAG systems that are designed to interface with a variety of knowledge representations (including tables, RDF triples, JSON schemas, and unstructured documents), our approach standardizes heterogeneous urban data into a unified format of unstructured textual narratives. This design choice fundamentally simplifies the retrieval mechanism and maximizes compatibility with LLM-based generation models. Rather than adapting the retriever to handle multiple data representations, we adopt a single retriever pipeline enabled by transforming structured data, including tables, geospatial indicators, and statistical measures, into natural language paragraphs. The resulting textual narratives are semantically enriched and explicitly crafted to reflect distinct analytical perspectives, ensuring that core domain-specific patterns are preserved while adapting the framing to match varied stakeholder viewpoints.</p>
        <p>The homogenization approach offers several key advantages for urban policy applications. First, retrieval simplification is achieved through a unified representation that allows for a single dense retriever without requiring modality-specific modules, reducing system complexity and computational overhead. Second, our approach enables cross-modal comparability by facilitating reasoning across different data types, such as comparing demographics with transportation patterns through uniform verbal representations. Third, LLM compatibility is naturally reinforced by using natural language as both input and output, aligning with the intrinsic design of generative models and enabling seamless integration into query-response pipelines. Figure 1 outlines how PeRAG’s components (multi-modal data, verbalization, perspective inclusion, RAG modules, and evaluation) integrate within the pipeline.</p>
        <p>3.2. Dataset Description</p>
      </sec>
      <sec id="sec-1-4">
        <p>The dataset comprises 7,019 urban data records covering Turin’s geography, demography, and transportation systems from 2012 to 2019, offering a comprehensive longitudinal view of urban dynamics.</p>
        <p>The data encompasses 3,850 census areas, which are portions of municipal territory organised in polygons, used by ISTAT<sup>1</sup> to divide the city into manageable, statistically meaningful areas. Demographic information about each census area is collected with respect to size and population distribution. Special attention is given to urban vulnerabilities, housing conditions, migration flows, and demographic changes in specific neighborhoods. Census areas can vary significantly in both size and demographic characteristics: they can be as small as a single street or encompass an entire residential block. For this reason, the census areas differ greatly from one another.</p>
        <p>The census area is the smallest territorial unit used for analysis, and census areas are organized into 93 statistical zones. Statistical zones are aggregations of multiple census tracts and represent one of the intra-municipal territorial units into which the territory of the City of Turin is divided. In turn, the statistical zones are grouped into 9 districts, territorial subdivisions over which the local civil authority exercises its functions. This hierarchy of spatial units provides multiple levels of geographical granularity for analysis, enabling both fine-grained local insights and broader district-level policy evaluation. Additionally, the data for each census area is available for two reference years, 2012 and 2019, allowing for temporal comparisons across various dimensions. The dataset includes 31 structured features for each census-year tuple, systematically categorized into four primary domains.</p>
        <p>Demographic information includes population density, gender distribution, age brackets, foreign residents, and the number of families, providing a comprehensive population profile. Additionally, the density of each demographic is calculated within a 500-meter buffer from the centroid of each census area. This approach accounts for the spatial distribution of density and makes the areas more comparable in terms of population concentration and access to services.</p>
        <p>Public transport metrics include stop and line density, as well as connectivity indicators that measure how well each census area is linked to others in terms of accessibility and network coverage. Geographical identifiers encompass census codes, dimensions, statistical zones, district names, and boundaries that enable spatial analysis and policy targeting. Traffic and safety data document the number of accidents, vehicle involvement patterns, and the number of public transport incidents, supporting risk assessment and safety planning initiatives. This collection represents a significant expansion, enabling richer temporal and spatial analyses that capture urban evolution patterns and long-term policy impacts. The longitudinal scope allows for trend identification, seasonal pattern analysis, and evaluation of policy interventions over time.</p>
        <p>The dataset was constructed by integrating multiple sources: all demographic data was obtained from the GeoPiemonte<sup>2</sup> portal, while public transport, traffic, and safety data were provided by Gruppo Torinese Trasporti (GTT)<sup>3</sup>, which manages public transport services including urban, suburban, and extra-urban routes, as well as tram and metro lines.</p>
        <p>3.3. Perspective-Aware Verbalization of Urban Data</p>
        <p>To enable retrieval over rich, interpretable textual data, we developed an Urban Data Verbalization System that translates structured urban records into fluent natural language narratives using large language models (LLMs). This system addresses the fundamental challenge of transforming quantitative urban data into qualitative insights that align with different stakeholder perspectives and analytical frameworks.</p>
        <p>3.3.1. Verbalization</p>
        <p>Our verbalization pipeline employs LLaMA-3.1-8B as the default model for processing numerical data and LLaMA-3.2-11B-Vision for processing heatmaps. The selection of these models allows us to maintain compatibility with other LLMs, ensuring both flexibility and reproducibility. We implement two primary verbalization strategies to balance generation quality with computational efficiency. Zero-shot verbalization allows the model to generate descriptions without specific examples, providing maximum creative freedom but potentially sacrificing consistency. Few-shot verbalization employs carefully curated single-shot examples that guide narrative style while preserving creative expression, resulting in more consistent and domain-appropriate outputs.</p>
        <p>The system utilizes handcrafted prompts specifically designed to elicit structured yet non-hallucinatory summaries for each data record, ensuring factual accuracy while maintaining linguistic diversity. Two distinct prompt templates are employed: one for processing numerical tabular data using LLaMA-3.1-8B (see Table 6), and another for processing heatmap visualizations using LLaMA-3.2-11B-Vision (see Table 5). Complete prompt examples for both verbalization modalities are provided in Appendix C to ensure reproducibility. In both LLaMA configurations, generation is controlled through the following parameters: a temperature of 0.6 to balance creativity, top-p sampling at 0.9 for response diversity, a repetition penalty of 1.2 to ensure coherence, and a maximum token length of 512 for the 8B model and 1024 for the 11B-Vision model to support concise yet informative descriptions.</p>
        <p>Each structured record is transformed into multiple narrative versions conditioned on distinct stakeholder perspectives. These include accessibility-oriented planning focusing on mobility and inclusion, safety and equity perspectives highlighting transportation risks and distribution fairness, and demographic inclusion addressing the needs of diverse populations. This multi-perspective approach ensures that verbalizations transcend generic summaries and address the specific analytical needs of different urban stakeholders. Table 3, presented in Appendix A, provides an example of this type of verbalization, illustrating both a general narrative and its corresponding multi-perspective version.</p>
      </sec>
      <sec id="sec-1-5">
        <title><sup>1</sup> National Institute of Statistics: https://www.istat.it/</title>
      </sec>
      <sec id="sec-1-6">
        <title><sup>2</sup> https://geoportale.igr.piemonte.it/cms/ <sup>3</sup> https://www.gtt.to.it/cms/</title>
        <sec id="sec-1-6-1">
          <title>3.3.2. Quality Assessment and Validation</title>
          <p>Unlike conventional LLM-generated general texts, which often suffer from loss of specificity, repetitiveness, or context ignorance, our perspective-aware narratives emphasize trends, deficiencies, and socio-geographic factors of particular interest to diverse urban stakeholders. The annotation protocol involved a systematic evaluation across four key dimensions: (1) contextual relevance: whether the verbalization appropriately captures the urban context and stakeholder perspective; (2) information accuracy: alignment between the verbalized content and source data; (3) coverage of information aspects: completeness of perspective-specific elements in the verbalization; and (4) data factuality: absence of hallucinations or fabricated information. Three expert annotators, including two postdoctoral researchers and one NLP researcher, independently evaluated a random sample of generated narratives for each dimension. Given the exploratory nature of this novel task and time constraints, a focused evaluation was conducted on a carefully selected subset of examples, with annotation disputes resolved through collaborative discussion among the research team. Their comprehensive assessment confirmed the validity, relevance, and framing alignment of perspective-aware verbalizations, providing empirical support for their use in downstream RAG generation tasks.</p>
          <p>To mitigate potential ambiguities introduced during the natural language verbalization process, our approach incorporates several safeguards. First, the verbalization prompts explicitly instruct models to use exact numerical values without modification or approximation, preventing quantitative distortions. Second, the prompts restrict models from drawing conclusions, making assumptions, or interpreting data significance, thereby reducing interpretive ambiguity. Third, during the annotation process, evaluators specifically assessed verbalizations for information accuracy and data factuality, identifying instances where ambiguous phrasing might misrepresent the underlying data. Additionally, the multi-perspective approach inherently reduces ambiguity by providing explicit analytical framing, rather than generating generic descriptions that could be interpreted in multiple ways.</p>
          <p>3.4. Perspective-Aware RAG (PeRAG)</p>
          <p>PeRAG extends the traditional RAG paradigm to handle structured urban data through its verbalized form, creating a novel architecture specifically designed for perspective-aware policy support. The system integrates retrieval and generation components that work synergistically to provide contextually relevant and factually grounded responses to complex urban planning queries.</p>
        </sec>
        <sec id="sec-1-6-2">
          <title>3.4.1. Retrieval Module</title>
          <p>The retrieval module employs the all-mpnet-base-v2 sentence transformer for dense vector encoding, chosen for its strong performance on semantic similarity tasks and its computational efficiency. Text chunking is implemented using a token-based approach with a chunk size of 500 tokens and an overlap of 50 tokens to ensure semantic continuity across chunk boundaries. This strategy ensures that semantically related content remains within the same retrievable segment, preserving coherence and relevance across retrieval operations.</p>
          <p>The retrieval mechanism operates through cosine similarity-based semantic ranking with configurable top-k retrieval, defaulting to 5 results to balance comprehensiveness with computational efficiency. The system maintains comprehensive provenance metadata for complete traceability, enabling users and analysts to verify the source of retrieved information and ensuring accountability in policy-relevant applications.</p>
          <p>3.4.2. Generation Module</p>
          <p>The generation module utilizes Gemma-3-4B-IT as the default model while supporting any causal decoder-based large language model to ensure adaptability across different computational environments. The module processes user queries alongside retrieved perspective-aligned narratives using carefully engineered prompts that structure the input format as query plus perspective narratives.</p>
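          <p>The retrieval and prompt-assembly steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PeRAG implementation: a toy bag-of-words vector stands in for the all-mpnet-base-v2 encoder, whitespace tokens stand in for model tokens, and the function names and the prompt wording are hypothetical. Only the chunk size (500), the overlap (50), the cosine ranking, and the default top-k (5) come from the text.</p>
          <preformat>
```python
import math
from collections import Counter


def chunk_tokens(tokens, size=500, overlap=50):
    """Split a token list into windows of `size` tokens sharing `overlap` tokens."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the current window already reaches the end of the text
    return chunks


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a sentence encoder."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query, passages, top_k=5):
    """Return the `top_k` passages ranked by cosine similarity to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, narratives):
    """Assemble the generator input as the query plus the retrieved narratives.

    The template wording is hypothetical; the paper only states the structure.
    """
    context = "\n".join("- " + n for n in narratives)
    return "Context narratives:\n" + context + "\n\nQuestion: " + query + "\nAnswer:"
```
          </preformat>
          <p>In a real deployment the toy encoder would be replaced by dense sentence embeddings, and the chunker would operate on the tokenizer's output rather than on whitespace-separated words.</p>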
          <p>Generation parameters are optimized for policy applications, with a temperature of 0.7 balancing creativity and factuality, and a 512-token limit ensuring brevity without sacrificing informational depth. The system demonstrates robust capability in responding to complex urban planning questions, supporting district-wise comparisons, demographic-transport correlations, safety and infrastructure assessments, and trend identification over temporal dimensions.</p>
          <p>3.5. Implementation and System Efficiency</p>
          <p>The full system is implemented in Python, leveraging PyTorch and Hugging Face Transformers for deep learning and natural language processing tasks, alongside SentenceTransformers for semantic retrieval capabilities. The implementation includes comprehensive batch processing capabilities with integrated performance monitoring to ensure scalable operation across large datasets. GPU acceleration with automatic device detection optimizes computational efficiency while maintaining compatibility across different hardware configurations.</p>
          <p>The system architecture incorporates detailed logging for each transformation step, enabling comprehensive debugging and performance analysis. Key operational features include support for batch verbalization, which processes multiple records simultaneously; real-time querying capabilities for interactive policy analysis; and modular model swapping, allowing for easy adaptation to different language models or domain-specific requirements. This implementation approach ensures both research reproducibility and practical deployment feasibility for real-world urban policy applications. The source code for our PeRAG system, along with the various verbalization configurations, is publicly available at the link given in footnote 4.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Experimentation</title>
      <p>Our experimental evaluation is designed to assess the effectiveness of perspective-aware verbalization and the overall performance of the PeRAG system in supporting urban policy decision-making. We conduct experiments across two primary dimensions: verbalization quality assessment and end-to-end system performance evaluation. All experiments are performed on locally deployed models to ensure data privacy and reproducibility, using NVIDIA GPUs for computational acceleration.</p>
      <p>The experimental framework evaluates our system against several key research questions established in the introduction: the effectiveness of perspective-aware verbalization compared to general approaches, the comparative performance of zero-shot versus few-shot verbalization strategies, the utility of PeRAG for urban policy question answering, and the factuality and relevance of system outputs compared to general-purpose large language models.</p>
      <sec id="sec-2-1">
        <title>4.1. Verbalization Evaluation Protocol</title>
        <p>We conduct a systematic comparison between general verbalization, i.e., a template-based approach, and perspective-aware verbalization approaches using our Turin dataset. General verbalization employs standard data-to-text generation without specific perspective conditioning, while perspective-aware verbalization generates targeted descriptions aligned with specific stakeholder viewpoints, including demographics-focused, transportation infrastructure-focused, temporal analysis, and deficiency assessment perspectives.</p>
        <p>A random sample of 200 data records is selected for detailed verbalization analysis, ensuring representation across different districts, time periods, and demographic profiles. Our multi-modal dataset is processed through both zero-shot and few-shot verbalization strategies for each perspective type, generating a comprehensive corpus of verbalized descriptions for comparative evaluation.</p>
        <p>For the verbalization quality assessment, two authors jointly annotated three representative examples in a structured meeting format, with any disagreements resolved through immediate discussion. While the limited sample size (n = 3) precluded formal inter-annotator agreement (IAA) calculation using Cohen's or Fleiss’ Kappa, the collaborative annotation process ensured consistency in the application of evaluation criteria. Future work will expand the annotation sample size to enable robust inter-annotator reliability metrics.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4.2. System Performance Evaluation</title>
      <p>We develop a comprehensive set of 25 urban policy-oriented questions that span different complexity levels and analytical requirements. The question set includes factual queries about specific demographic or transportation metrics, comparative questions requiring cross-district or temporal analysis, analytical questions demanding trend identification and causal reasoning, and policy-oriented questions seeking recommendations based on data insights.</p>
      <p>Questions are categorized by type (factual, comparative, analytical, policy-oriented), complexity level (simple, moderate, complex), and required perspective alignment (demographics, infrastructure, temporal, deficiency-focused). This categorization enables a systematic assessment of system performance across different query types and complexity levels.</p>
      <p>System performance is evaluated against multiple baseline approaches to assess the contribution of our perspective-aware framework. These baselines involve querying general-purpose LLMs without access to urban-specific data. For this purpose, we use the Gemini 2.0 Flash and GPT-4o Mini models. Additionally, we evaluate RAG systems using general (non-perspective-aware) verbalizations under both zero-shot and few-shot configurations. Each baseline is tested using the same set of questions and evaluation criteria to ensure a fair and consistent comparison.</p>
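      <p>One way to make the question categorization above operational is to attach the three labels to each question and average scores per label. The sketch below is illustrative only: the class and field names, the example questions, and the scores are hypothetical and are not taken from the PeRAG question set or results.</p>
      <preformat>
```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class PolicyQuestion:
    """One evaluation question with the three labels named in the text."""
    text: str
    qtype: str        # factual | comparative | analytical | policy-oriented
    complexity: str   # simple | moderate | complex
    perspective: str  # demographics | infrastructure | temporal | deficiency-focused


def mean_score_by(questions, scores, key):
    """Average one score per question, grouped by the label named in `key`."""
    buckets = defaultdict(list)
    for question, score in zip(questions, scores):
        buckets[getattr(question, key)].append(score)
    return {label: mean(values) for label, values in buckets.items()}


# Hypothetical questions and scores, invented only to exercise the helper.
questions = [
    PolicyQuestion("How many transit stops serve district 1?",
                   "factual", "simple", "infrastructure"),
    PolicyQuestion("How did population density change between 2012 and 2019?",
                   "comparative", "moderate", "temporal"),
    PolicyQuestion("Which zones combine high density with low line coverage?",
                   "analytical", "complex", "deficiency-focused"),
    PolicyQuestion("How many families live in zone 12?",
                   "factual", "simple", "demographics"),
]
scores = [0.9, 0.7, 0.5, 0.8]

by_type = mean_score_by(questions, scores, "qtype")
by_complexity = mean_score_by(questions, scores, "complexity")
```
      </preformat>
      <p>Grouping by the `perspective` field in the same way would give a per-perspective breakdown, provided every configuration is scored on the same question list.</p>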
      <sec id="sec-3-1">
        <title>4.3. Evaluation Metrics</title>
        <p><sup>4</sup>Code and dataset are available at https://github.com/MasterHoracio/CLiC-it-HARMONIA.git.</p>
        <p>To evaluate our proposed perspective-aware framework and all baseline approaches, we employ the Retrieval Augmented Generation Assessment (RAGAS) framework, which is specifically designed for reference-free evaluation of RAG pipelines [18]. This framework defines three main metrics. The first, faithfulness, measures whether the answer accurately reflects information that can be directly inferred from the given context. The second, answer relevance, evaluates whether the answer directly and appropriately responds to the given question, without being incomplete or redundant. Finally, the third metric, context relevance, assesses how well the context includes only the information necessary to answer the question, avoiding redundancy. For a detailed explanation, we refer the reader to [18].</p>
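        <p>As a rough illustration of how the three scores are formed, the sketch below follows the definitions in [18] as we read them: faithfulness as the share of answer statements supported by the context, answer relevance as the mean cosine similarity between the original question and questions regenerated from the answer, and context relevance as the share of context sentences needed to answer the question. The function names and inputs are assumptions made for illustration, and the LLM-based statement extraction and verification steps are not reproduced here.</p>
        <preformat><![CDATA[
```python
import math


def faithfulness(statement_supported):
    """Share of answer statements judged as supported by the context."""
    if not statement_supported:
        raise ValueError("at least one statement is required")
    return sum(1 for s in statement_supported if s) / len(statement_supported)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("zero-length vector")
    return dot / norm


def answer_relevance(question_vec, regenerated_question_vecs):
    """Mean similarity between the question and questions derived from the answer."""
    if not regenerated_question_vecs:
        raise ValueError("at least one regenerated question is required")
    sims = [cosine(question_vec, vec) for vec in regenerated_question_vecs]
    return sum(sims) / len(sims)


def context_relevance(n_needed_sentences, n_context_sentences):
    """Share of retrieved context sentences that are needed for the answer."""
    if n_context_sentences <= 0:
        raise ValueError("context must contain at least one sentence")
    return n_needed_sentences / n_context_sentences
```
]]></preformat>
        <p>Under this reading, a retriever that returns many chunks of which only a few sentences are needed would lower context relevance even when faithfulness stays high.</p>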
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Results</title>
      <p>Table 1 presents the evaluation results for the different configurations considered. The first section of the table (rows 2 and 3) shows the results obtained by directly querying the LLMs without providing any additional context. It is important to note that the faithfulness and context relevance metrics could not be computed in this case, as both require access to the retrieved context. Nevertheless, the answer relevance scores reveal low performance for both models. This can be attributed to the fact that most of the responses were of the type “I cannot answer the question due to lack of necessary data”. Specifically, GPT-4o responded this way in 21 out of 25 cases, while Gemini 2.0 did so in 18 out of 25. Overall, Gemini demonstrated marginally better performance in this setting.</p>
      <p>Table 1 also compares the performance of general verbalizations using zero-shot and few-shot configurations. These results are shown in the second section of the table (rows 4 and 5). As can be observed, the answer relevance scores are higher than those obtained by the previously evaluated LLMs, which can be attributed to the incorporation of relevant information retrieved by the retrieval module. When comparing the general verbalization settings, we observe that the few-shot configuration outperforms the zero-shot setting across all three evaluation metrics, with an average improvement of 6%. This gain is likely due to the higher quality and greater level of detail present in the verbalizations generated under the few-shot configuration.</p>
      <p>Finally, we present the evaluation results of our proposed PeRAG system. As shown, it achieves the highest scores across all three evaluation metrics, with an average improvement of 20% compared to the best-performing general verbalization configuration. Overall, the highest metric score was obtained in faithfulness, indicating that the responses generated by the PeRAG system effectively leverage information inferred from the provided context.</p>
      <p>On the other hand, the lowest score, both for PeRAG and for the previous configurations, was observed in the context relevance metric. This may be attributed to the diversity of information retrieved by the retriever module, which stems from the chunk partitioning strategy used. In particular, this strategy incorporated independent general and multi-perspective verbalizations for each district, zone, or census area.</p>
      <sec id="sec-4-1">
        <title>6. Analysis</title>
        <p>To gain deeper insight into the performance of our proposed PeRAG pipeline, this section presents a quantitative and qualitative analysis of the generated responses. In particular, we conduct a comparative evaluation of the answers produced by the RAG system using the different types of verbalizations. For this analysis, we randomly sample three questions from our set of 25, focusing on the demographic and transportation perspectives. The selection of three questions for detailed BERTScore analysis was determined by several practical constraints. First, generating reference factual answers for comparative evaluation requires extensive manual verification against the original Turin dataset, which is a time-intensive process involving careful cross-referencing of multiple data sources and temporal dimensions. Second, as this represents an initial exploration of a novel task combining multi-modal verbalization with perspective-aware RAG, we prioritized depth over breadth in the qualitative analysis to thoroughly examine the mechanisms underlying performance differences between general and perspective-aware verbalizations. Third, the computational overhead of generating responses across all verbalization configurations and computing detailed semantic similarity metrics scales considerably with the number of questions analyzed. The three selected questions were chosen to represent different complexity levels and analytical requirements.</p>
        <p>For each of these questions, we generate a reference factual answer by manually extracting and synthesizing the relevant information directly from the original Turin dataset. The reference answer generation process involves several systematic steps: (1) identifying the specific data fields and temporal dimensions required to answer each question, (2) querying the structured dataset to retrieve exact numerical values for the relevant census areas, statistical zones, or districts, (3) performing necessary aggregations or comparisons across the 2012-2019 timeframe where temporal analysis is required, and (4) formulating a concise factual response that accurately reflects the quantitative findings without interpretive bias. For instance, for questions involving demographic trends, reference answers include precise population counts, percentage changes, and specific demographic categories affected, all derived directly from the census data. This manual reference generation process, while labor-intensive, provides ground-truth answers that serve as reliable baselines for evaluating the factual accuracy and completeness of system-generated responses through semantic similarity metrics. We use the BERTScore metric [19], a widely adopted measure of semantic similarity between a generated text and a reference [20]. Finally, we present a discussion highlighting the strengths and weaknesses of the PeRAG pipeline compared to general verbalizations.</p>
        <p>Table 2 presents the BERTScore evaluation results for the three randomly selected questions. The first section of the table (rows 2 and 3) reports the results for the general verbalizations, where the few-shot configuration achieves the higher scores across all BERTScore metrics. These outcomes are consistent with the trends observed in the reference-free evaluation metrics. The second section of the table shows the results for our PeRAG pipeline, which consistently achieves the best performance across all three metrics, further reinforcing the findings obtained through the reference-free evaluation.</p>
        <p>We acknowledge that the BERTScore analysis based on three questions represents a preliminary assessment of semantic similarity performance, and the limited sample size constrains the statistical generalizability of these findings. The selection was necessitated by the substantial manual effort required for reference answer generation and verification against the multi-dimensional Turin dataset. Each reference answer requires careful extraction and synthesis of information across multiple data fields, temporal dimensions, and geographical units, followed by independent verification by domain experts. While these three questions provide initial evidence of PeRAG’s superior semantic alignment with ground truth data, we recognize that broader systematic analysis is essential for robust conclusions. Future work will implement automated reference generation procedures and expand the evaluation to cover the complete 25-question set, enabling more comprehensive statistical analysis of semantic similarity performance across different question types, complexity levels, and analytical perspectives. Additionally, we plan to incorporate multiple semantic similarity metrics beyond BERTScore to provide a more comprehensive assessment of response quality and factual alignment.</p>
        <table-wrap>
          <label>Table 2</label>
          <caption>
            <p>Evaluation results based on BERTScore. The columns report the macro-average recall, precision, and F1 score across the three randomly selected questions. The prefixes ZS and FS indicate the zero-shot and few-shot configurations of the general verbalization.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Approach</th><th>Recall</th><th>Precision</th><th>F1</th></tr>
            </thead>
            <tbody>
              <tr><td>ZS-RAG</td><td>0.818</td><td>0.831</td><td>0.821</td></tr>
              <tr><td>FS-RAG</td><td>0.837</td><td>0.852</td><td>0.846</td></tr>
              <tr><td>PeRAG</td><td>0.851</td><td>0.873</td><td>0.862</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>An important consideration in our verbalization approach is the management of potential linguistic ambiguities that could impact downstream RAG performance. Our analysis of generated verbalizations reveals that perspective-aware conditioning significantly reduces interpretive ambiguity compared to general verbalization approaches. For instance, when describing transportation infrastructure, general verbalizations might use ambiguous terms like ‘adequate coverage’ or ‘reasonable accessibility’, whereas perspective-aware verbalizations provide specific contextual framing, such as ‘limited accessibility for elderly residents due to sparse stop density in residential areas’. This specificity not only reduces ambiguity but also enhances retrieval precision, as queries can be matched more accurately to relevant perspective-conditioned content. However, we acknowledge that some residual ambiguity remains inherent to natural language representation, particularly in cases where numerical thresholds are verbalized using qualitative descriptors (e.g., ‘high density’ vs. specific population counts). Future work will explore hybrid approaches that preserve exact numerical values alongside natural language descriptions to further minimize interpretive ambiguity.</p>
        <p>To compare the outputs generated by our different configurations, Table 4 (included in Appendix B) presents a comparison between the response produced by our PeRAG pipeline and the one generated using the few-shot configuration of the general verbalization. This configuration was selected due to its strong performance in both the reference-free metrics and the BERTScore. Additionally, both responses are contrasted with a reference answer constructed from factual information. The question used in this analysis was selected from the set of three randomly chosen questions.</p>
        <p>As shown in Table 4, the selected question involves a temporal comparison of demographic characteristics from 2012 to 2019. According to the reference answer, a population decrease is observed across most demographic groups, including males, females, minors, foreigners, and working-age citizens. In contrast, the only group that experienced population growth during this period was senior citizens.</p>
        <p>When comparing these findings to the output generated by the PeRAG pipeline, we observe that it successfully identified the overall downward trend across multiple demographic groups, highlighting that the reduction was not evenly distributed. This aligns with the factual data presented in the reference answer. Moreover, PeRAG accurately captured the groups that experienced decline (such as the working-age population, minors, and foreigners) and correctly identified an increase in the senior population, consistent with the reference.</p>
        <p>However, the PeRAG response emphasized the working-age population as the most affected category, whereas the reference answer pointed to foreigners. This discrepancy may be attributed to the nature of the multi-perspective verbalizations, which were generated at the level of census areas, statistical zones, and districts. Consequently, when retrieving information using the retriever module (configured with <italic>k</italic> = 5), it may not have captured a fully comprehensive view across all nine districts. This limitation has been corroborated by analyzing the retrieved chunks, where recalculating the values based on the retrieved verbalizations indeed showed that the working-age group experienced the largest decline.</p>
        <p>Finally, Table 4 also includes the output of the general verbalization under the few-shot configuration. As shown, the response generated by the RAG system fails to clearly identify the downward trends across the different demographic groups as well as the upward trend for seniors. These results are consistent with those observed in the reference-free evaluation metrics. Moreover, although the response is factually correct, it does not address the perspective implied by the question, highlighting the importance of incorporating perspective-aware verbalizations. Similar to the PeRAG pipeline, the retrieved chunks in this configuration also exhibit limitations, indicating a potential area for improvement in future work.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>7. Conclusion</title>
      <p>This research demonstrates that multimodal urban data can be effectively verbalized through perspective-aware approaches to support policy-level interpretation, with our framework successfully processing over 7,000 examples across multiple analytical perspectives (RQ1). The comparative analysis reveals that few-shot verbalization strategies provide superior generation fidelity and perspective alignment compared to zero-shot approaches, despite increased computational overhead (RQ2). PeRAG, our lightweight locally-deployable RAG pipeline, effectively answers urban policy questions by leveraging these multimodal verbalizations as retrievable memory, ensuring data privacy while maintaining system responsiveness (RQ3). Human evaluation confirms that PeRAG exhibits superior factuality and utility compared to general-purpose LLMs in high-stakes policy scenarios, with domain-specific grounding providing enhanced accuracy and contextual relevance (RQ4). The framework establishes a reproducible methodology for transforming complex urban datasets into actionable policy insights, demonstrating that specialized, domain-grounded AI systems outperform general-purpose alternatives in critical decision-making contexts.</p>
      <p><bold>Limitations</bold> The various perspectives explored in this research, such as demographic, population, transportation, gender, and age, were derived from the dataset used in our evaluation. However, these perspectives do not incorporate public opinion. As ongoing work, we are expanding these perspectives through a research survey aimed at integrating viewpoints that reflect the public opinion of citizens and stakeholders of Turin. The annotation protocol, while systematic, was applied to a limited sample size due to the exploratory nature of this novel task. The collaborative annotation approach, though ensuring consistency, does not provide quantitative measures of IAA. Future iterations of this work will implement larger-scale annotation studies with multiple independent annotators and IAA metrics to strengthen the evaluation framework. Additionally, we are working on enriching the evaluation framework. We plan to complement the reference-free evaluation metrics applied [21] by incorporating task-based evaluation protocols and comprehensive human evaluation strategies to better assess the practical utility of perspective-aware verbalizations in real-world urban planning contexts.</p>
      <sec id="sec-5-1">
        <title>Acknowledgments</title>
        <p>The research is conducted at the Department of Computer Science, University of Turin, Italy, and is funded by the “HARMONIA” project - M4-C2, I1.3 Partenariati Estesi - Cascade Call - FAIR - CUP C63C22000770006 PE PE0000013 funded under the NextGenerationEU programme (PI: Viviana Patti).</p>
        <p>
J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei, Language models are few-shot learners, in: Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS ’20, Curran Associates Inc., Red Hook, NY, USA, 2020, pp. 1–25.</p>
        <p>[15] W. Liu, X. Wang, M. Wu, T. Li, C. Lv, Z. Ling, Z. JianHao, C. Zhang, X. Zheng, X. Huang, Aligning large language models with human preferences through representation engineering, in: L.-W. Ku, A. Martins, V. Srikumar (Eds.), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Bangkok, Thailand, 2024, pp. 10619–10638. URL: https://aclanthology.org/2024.acl-long.572/. doi:10.18653/v1/2024.acl-long.572.</p>
        <p>[16] T. L. Scao, A. Fan, C. Akiki, E. Pavlick, S. Ilic, D. Hesslow, R. Castagné, A. S. Luccioni, F. Yvon, M. Gallé, J. Tow, A. M. Rush, S. Biderman, A. Webson, P. S. Ammanamanchi, T. Wang, B. Sagot, N. Muennighoff, A. V. del Moral, O. Ruwase, R. Bawden, S. Bekman, A. McMillan-Major, I. Beltagy, H. Nguyen, L. Saulnier, S. Tan, P. O. Suarez, V. Sanh, H. Laurençon, Y. Jernite, J. Launay, M. Mitchell, C. Raffel, A. Gokaslan, A. Simhi, A. Soroa, A. F. Aji, A. Alfassy, A. Rogers, A. K. Nitzav, C. Xu, C. Mou, C. Emezue, C. Klamm, C. Leong, D. van Strien, D. I. Adelani, et al., BLOOM: A 176b-parameter open-access multilingual language model, CoRR abs/2211.05100 (2022). URL: https://doi.org/10.48550/arXiv.2211.05100. doi:10.48550/ARXIV.2211.05100.</p>
        <p>[17] R. Bommasani, D. A. Hudson, E. Adeli, R. B. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill, E. Brynjolfsson, S. Buch, D. Card, R. Castellon, N. S. Chatterji, A. S. Chen, K. Creel, J. Q. Davis, D. Demszky, C. Donahue, M. Doumbouya, E. Durmus, S. Ermon, J. Etchemendy, K. Ethayarajh, L. Fei-Fei, C. Finn, T. Gale, L. E. Gillespie, K. Goel, N. D. Goodman, S. Grossman, N. Guha, T. Hashimoto, P. Henderson, J. Hewitt, D. E. Ho, J. Hong, K. Hsu, J. Huang, T. Icard, S. Jain, D. Jurafsky, P. Kalluri, S. Karamcheti, G. Keeling, F. Khani, O. Khattab, P. W. Koh, M. S. Krass, R. Krishna, R. Kuditipudi, et al., On the opportunities and risks of foundation models, CoRR abs/2108.07258 (2021). URL: https://arxiv.org/abs/2108.07258. arXiv:2108.07258.</p>
        <p>[18] S. Es, J. James, L. Espinosa Anke, S. Schockaert, RAGAs: Automated evaluation of retrieval augmented generation, in: N. Aletras, O. De Clercq (Eds.), Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, Association for Computational Linguistics, St. Julians, Malta, 2024, pp. 150–158. URL: https://aclanthology.org/2024.eacl-demo.16/.</p>
        <p>[19] T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, Y. Artzi, Bertscore: Evaluating text generation with BERT, in: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020, OpenReview.net, 2020, pp. 1–41. URL: https://openreview.net/forum?id=SkeHuCVFDr.</p>
        <p>[20] M. Hanna, O. Bojar, A fine-grained analysis of BERTScore, in: L. Barrault, O. Bojar, F. Bougares, R. Chatterjee, M. R. Costa-jussa, C. Federmann, M. Fishel, A. Fraser, M. Freitag, Y. Graham, R. Grundkiewicz, P. Guzman, B. Haddow, M. Huck, A. J. Yepes, P. Koehn, T. Kocmi, A. Martins, M. Morishita, C. Monz (Eds.), Proceedings of the Sixth Conference on Machine Translation, Association for Computational Linguistics, Online, 2021, pp. 507–517. URL: https://aclanthology.org/2021.wmt-1.59/.</p>
        <p>[21] D. Deutsch, R. Dror, D. Roth, On the limitations of reference-free evaluations of generated text, in: Y. Goldberg, Z. Kozareva, Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022, pp. 10960–10977. URL: https://aclanthology.org/2022.emnlp-main.753/. doi:10.18653/v1/2022.emnlp-main.753.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>A. Verbalization Examples</title>
      <p>In 2018, census section 2535 in statistical zone 75 covered an area of 145,065.42 units, with a population of 192. The population percentage was approximately 0.0027%, of which 54.7% were female. Immigrants comprised 21.9% of the population, with 13.5% being female immigrants. Minors accounted for 13.0%, seniors 22.4%. There were 5 stops and 5 lines stopping, with stops representing 2.6% coverage, 100% stop-to-line ratio, and 0.017% length coverage.</p>
      <p>In 2018, census section 2535 in statistical zone 75 had a population of 192 individuals. Females accounted for 54.7% of the population. Immigrants represented 21.9% of residents, with female immigrants making up 13.5%. Children under 18 years old comprised 13.5% of the population, while seniors accounted for 22.4%.</p>
      <p>The total population of census section 2535 was 192, representing approximately 0.0027% of the overall population. The area covered by this section was 145,065.42 units, indicating a low population density given the large area size.</p>
      <p>Census section 2535 covers a substantial area of 145,065.42 units within statistical zone 75. Despite its size, the population remains small at 192 individuals.</p>
      <p>Within census section 2535, there are 5 stops served by 5 lines. Stops cover 2.6% of the area, with an equal number of stops per line. The total length covered by stops corresponds to 0.017% of the area.</p>
      <p>Immigrants make up 21.9% of the population in census section 2535, with female immigrants comprising 13.5%. This suggests a significant immigrant presence within the local population.</p>
      <p><bold>Age Group Perspective</bold> Minors account for 13.5% of the population in census section 2535, while seniors make up 22.4%, indicating a population with a notable proportion of older adults.</p>
      <p><bold>Gender Distribution</bold> Females represent 54.7% of the population in census section 2535, showing a slightly higher female population compared to males.</p>
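      <p>A template-based general verbalization of this kind can be sketched as plain string formatting over one census record. The record layout, field names, and template wording below are assumptions chosen to resemble the census section 2535 example; they are not the templates used in the paper.</p>
      <preformat><![CDATA[
```python
# Hypothetical record layout; values mirror the census section 2535 example.
RECORD = {
    "year": 2018,
    "section": 2535,
    "zone": 75,
    "area": 145065.42,
    "population": 192,
    "female_pct": 54.7,
    "immigrant_pct": 21.9,
    "minor_pct": 13.0,
    "senior_pct": 22.4,
    "stops": 5,
    "lines": 5,
}

# Illustrative template for a general (non-perspective-aware) verbalization.
GENERAL_TEMPLATE = (
    "In {year}, census section {section} in statistical zone {zone} "
    "covered an area of {area:,.2f} units, with a population of {population}. "
    "Females accounted for {female_pct:.1f}% of residents, "
    "immigrants for {immigrant_pct:.1f}%, minors for {minor_pct:.1f}%, "
    "and seniors for {senior_pct:.1f}%. "
    "There were {stops} stops served by {lines} lines."
)


def verbalize(record, template=GENERAL_TEMPLATE):
    """Fill the template with one record; a missing field raises KeyError."""
    return template.format(**record)
```
]]></preformat>
      <p>Because every field is emitted regardless of the question, a template like this yields the same sentence for every perspective, which is the contrast that perspective-conditioned verbalizations are meant to address.</p>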
    </sec>
    <sec id="sec-7">
      <title>B. Example of Output Response</title>
      <p><bold>Question</bold> From 2012 to 2019 the population in Turin has decreased. Considering all the census, does this decrease affect all the demographics? Which category is more affected and which one is less affected?</p>
      <p><bold>Reference answer</bold> Between 2012 and 2019, the population of Turin declined by 4.32%, and this trend affected most demographic groups. Both male and female populations decreased at nearly the same rate (4.34% and 4.31% respectively), indicating no significant gender disparity. The number of minors saw a notable decline of 5.96%, while the foreign (legally registered) population experienced the most significant drop, decreasing by 6.48%, together with the number of working-age citizens (6.28%). In contrast, the senior population was the only group to increase, growing by 2.05%.</p>
      <p><bold>PeRAG response</bold> Analyzing the demographic shifts across Turin’s statistical zones between 2012 and 2019, it is evident that the overall population declined during this period. However, the decrease was not evenly distributed across all demographic categories. The working-age population (19–64) experienced the most significant contraction, suggesting a notable loss of economic contributors. This trend was consistently observed across various districts. In contrast, the number of seniors slightly increased, indicating a growing aging population. The number of minors also declined, though to a lesser extent than the working-age group. Meanwhile, the foreign-born population registered a visible decline as well, potentially due to migration dynamics. In summary, while most demographic segments were negatively affected, the working-age group was the most impacted, whereas the senior population was the least affected, showing a modest increase.</p>
      <p><bold>FS-RAG response</bold> Analyzing the provided census data reveals a complex picture regarding demographic shifts between 2012 and 2019 in Turin’s districts. While a precise overall population decrease for the entire city isn’t evident from this dataset, several districts demonstrate notable changes. District 5 (specifically areas like Basse di Stura and Le Vallette) shows a consistent population size—around 41 residents in 2019, mirroring 2012—with minimal fluctuation within the 500-meter radius. Conversely, District 1 exhibits a clear increase in population size between the two years. In 2012, Census Area 799 (Politecnico) held 52 residents, expanding to 1,478 within a 500-meter radius. Similarly, Census Area 3209 (Politecnico) grew from 212 residents to 2,784. Looking at specific categories, the *minor* population appears significantly impacted by this growth, increasing dramatically in District 1 (from 32 to 423) compared to a smaller change in District 5. Seniors also show increased numbers in District 1 (47 to 652) while remaining largely stable in District 5. The foreign national population experienced a relative increase in District 1 (2 to 104) compared to a slight increase in District 5 (13 to 20). Therefore, while certain districts remained static, the overall trend indicates that the *minor* population is experiencing the most pronounced demographic shift, particularly within District 1, driven by expansion of the area analyzed.</p>
    </sec>
    <sec id="sec-8">
      <title>C. Verbalization Prompt Examples</title>
      <p>To ensure reproducibility and transparency of our verbalization approach, we provide the complete prompt templates used for both numerical data (see Table 6) and heatmap verbalization (see Table 5).</p>
      <sec id="sec-8-1">
        <title>C.1. Heatmap Verbalization Prompt (LLaMA-3.2-11B-Vision)</title>
        <p>The following prompt template is used for generating narratives from population density heatmaps and other visual data representations:</p>
      </sec>
      <sec id="sec-8-2">
        <title>C.2. Numerical Data Verbalization Prompt (LLaMA-3.1-8B)</title>
        <p>The following prompt template is used for generating narratives from structured numerical urban data:</p>
        <preformat>&lt;|begin_of_text|&gt;&lt;|start_header_id|&gt;system&lt;|end_header_id|&gt;
You are an expert urban data analyst. Your task is to generate clear and precise narratives based
on census and transport data for the city of Turin.
&lt;|start_header_id|&gt;user&lt;|end_header_id|&gt;</preformat>
      </sec>
      <sec id="sec-8-3">
        <title>Declaration on Generative AI</title>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>