<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Eliciting Image Schemas for Urban Digital Twins</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maria Rosaria Stufano Melone</string-name>
          <email>mariarosariastufanomelone@cnr.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano Borgo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Domenico Camarda</string-name>
          <email>domenico.camarda@poliba.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano De Giorgis</string-name>
          <email>s.degiorgis@vu.nl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Laboratory for Applied Ontology, ISTC-CNR</institution>
          ,
          <addr-line>Trento</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Politecnico di Bari</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Vrije Universiteit Amsterdam</institution>
          ,
          <country country="NL">Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>The construction of urban digital twins (UDTs) is a broad objective that, to be successful, must go beyond structural and functional information to include the human factor with its cognitive and experiential knowledge. In this paper, we investigate this issue by focusing on a typical, yet less complex, element of the urban scenario, namely the square. More specifically, our general aim is to study how human knowledge of the square may be elicited and structured for inclusion in UDTs. To do this from a fairly broad perspective, we study which knowledge characterises squares by considering two quite different real cases: Navona square in Rome and Djemaa el-Fna square in Marrakech. Starting from three images of each square, we use an LLM to obtain a descriptive text, which is then used to generate a knowledge graph. Both the text and the knowledge graph are then provided again to the LLM to elicit relevant image schemas. The results for each square are then compared to select the image schemas that are most characteristic of squares. The hope is that this kind of effort can enrich our knowledge of urban spaces and, subsequently, lead to UDTs that represent our cities more comprehensively by including human perspectives and values alongside structural and functional information.</p>
      </abstract>
      <kwd-group>
        <kwd>Urban planning</kwd>
        <kwd>architecture</kwd>
        <kwd>creativity</kwd>
        <kwd>applied ontology</kwd>
        <kwd>image schema</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The study and construction of Digital Twins (DTs) has been an active and central research goal for the last 20 years
or so [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], and it has also attracted attention in urban planning [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. While characterizing what cities are
is still an ongoing effort [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and there are doubts about the scope and use of DTs for urban areas, the
so-called Urban Digital Twins (UDTs) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], it remains important to study these representation systems and
to explore how to enrich them.
      </p>
      <p>This paper presents a step within a longer research project aiming to analyse how to understand
and exploit Urban Digital Twins. Our focus goes beyond the structural and functional aspects that are
typically addressed in DT studies, to concentrate on how to build DTs that comprise the cognitive and
experiential perspective typical of humans. We address this problem by focusing on a common, if not
typical, element of cities, namely the square. This focus is not a restriction in interest but a way to deal
with the complexity of the goal without constraining the generality of our project.</p>
      <p>In this contribution, we pose the question of how to elicit cognitively based information in a way
suitable for enriching DTs. Generally speaking, we believe that the elicitation of this kind of information
should be based on cognitive representation approaches such as image schemas. However, extracting
and analysing image schemas from a varied and complex structure, which includes both static and dynamic
information, is a particularly complicated process, too difficult and error-prone to be done
manually. For this reason, we investigate the use of AI software over a set of sources (text, graphs,
images). The results of this analysis highlight the variety and utility of cognitive-centred knowledge
and the kind of knowledge that today’s structural and functional UDTs clearly lack.</p>
      <p>
        This work should be read in the context of our previous efforts to apply different research paths
to provide a rich, unified and consistent framework for the modelling of complex environments like
the urban territory [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. More specifically, we focus on the study of two approaches, namely applied
ontology and image schema, to support decision-making in urban and territorial planning [
        <xref ref-type="bibr" rid="ref11 ref9">9, 11</xref>
        ]. A
UDT is a powerful tool for managing and monitoring some aspects of cities, but it presents
critical issues and limitations of which the research community is not yet well aware [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        In the literature there are only a few software solutions for managing image schemas that could
produce outputs useful for our goal, i.e., knowledge that comes into play during planning decision
processes or in scenario testing. Within this research line, there are some works on pipelines for
extracting image schemas from visual input to help robots operate more efficiently within spatial
contexts [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In some cases a connection to ontology is proposed but the aim is to provide a method
to better represent the semantics. De Giorgis [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] has developed a script to extract and identify image
schemas from a text or image, an approach that might be relevant also for architectural and city design
features.
      </p>
      <p>Overall, our approach seeks to incorporate nuanced and dynamic elements of spatial experience
and everyday urban life, extending beyond conventional sensor data and standard
environmental-geographical information. The focal point of our recent work has been the urban square, a complex
spatial node characterized by diverse patterns of movement and occupation, ranging from habitual
pedestrian routes that have evolved through repeated use to the qualities of light that characterize the
atmosphere of the square at particular hours of the day.</p>
      <p>By combining the methodologies of ontological analysis and applied ontology with the reading of
places given by image schemas, we would like to develop new tools that can manage heterogeneous
knowledge. We think that using image schemas to “reverse engineer” human ways of inhabiting
and understanding places can lead to better representation tools for the urban planner, making
her more aware of the intrinsic cognitive aspects and dynamics that make spaces and places unique to their
inhabitants.</p>
      <p>
        As anticipated, in this paper we focus on urban squares because they are important spaces within
cities whose meaning goes clearly beyond their structure, are hard to characterize, and yet reflect
the habits and social behaviours of city inhabitants. A city square is often the place where meeting
and crossing events occur—one of the semantically densest places in the urban fabric, where diverse
meanings, events, and objects create a richer substance [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. While spatial, structural and functional
information about squares can be extracted from a variety of existing traditional resources like city
maps and GIS repositories, the information about human perception and cognitive understanding of
squares is not readily available and even standard techniques (like questionnaires, behavioural studies
and the like) are hard to implement and analyse. For this reason we focus on image schemas, understood
as cognitive tools, to establish connections between cognitive representations and lived spatial or
functional interactions [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Image schemas represent semantic frameworks that develop from repeated
perceptual experiences, functioning at a more abstract level than concrete mental imagery. These
cognitive structures demonstrate the capacity to organize and pattern unlimited numbers of perceptions,
images, and experiences [
        <xref ref-type="bibr" rid="ref6 ref8">6, 8</xref>
        ]. To increase generality, we elicit image schemas from distinct scenarios
represented via photos taken at different times and during different activities. The aim is to acquire
descriptions of how the very same place changes over time and with use. The analysis of these
photos is done via an LLM (Perplexity) and the Text2AMR2FRED tool [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The use of large language
models (LLMs) to articulate the experiential dimensions of urban space constitutes an emerging frontier
for capturing and enriching knowledge about places and the events that occur within them. The
application of LLMs to the elicitation of image schemas is conceived as a preliminary experiment, aimed
at facilitating a broader and more rapid process of elicitation. Although not the sole method available
for eliciting image schemas, it represents a potentially significant and promising line of inquiry, and it
may complement other elicitation techniques such as computer vision and spatial analysis, which are
not addressed in this contribution. FRED (Frame-based Entity-Relation Description) is a semantic tool
that transforms natural language text into structured knowledge graphs in RDF/OWL formats. The
AMR-to-FRED translation facilitates knowledge graph enrichment, links to WordNet, DBpedia, and
DOLCE-Zero, and combines syntactic and semantic parsing with lexical and ontological resources (such as FrameNet)
to extract entities (e.g., people, objects, places), events (e.g., actions), and relations (e.g., who
did what, where, and when). FRED is used in fields like the semantic web, AI, digital humanities, and
cognitive robotics to enable machines to understand and reason about text in a human-like way.
      </p>
      <p>Following this introduction, §2 briefly describes the tools we use while the following, §3, uses them
to study two paradigmatic squares and elicit the image schemas that characterise them. The last section,
§4, draws some further conclusions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Setting the approach</title>
      <p>We propose to investigate a three-stage procedure that systematically transforms visual spatial
information into image schemas. This is the specific aspect we explore in this contribution,
one slice of a broader research scenario. Image schemas can be considered a format for formal
knowledge representation, i.e., a way to capture knowledge that is suitable for enriching UDTs (e.g., in
the form of knowledge graphs).</p>
      <p>The pipeline we use is depicted in Fig. 1. We decided to start from a set of photos of the urban space
of interest because these are rich sources of heterogeneous information and overcome, at least to some
extent, some constraints of verbal language (like imposing the choice of terminology and enforcing a
structural linearity of knowledge), so that these aspects intervene only later in the elicitation process
and are mitigated by the integration of knowledge from large language models, knowledge graphs, and
ontological structures (even though the latter is not really exploited in this paper). The process begins
with the selection of photographic perspectives of the spatial environment of interest. In the next step
we submit these to a large language model for the generation of comprehensive textual descriptions.
These descriptions aim to capture not only the visible architectural and environmental features
but also the implicit functional and experiential qualities of the space.</p>
      <p>The resulting textual output is processed through the AMR2FRED tool, which converts natural
language descriptions into structured knowledge graphs. This transformation represents a crucial
transition from the contextual, distributional representations characteristic of neural language models
to the explicit, relational and conceptual structures required for formal reasoning. The generated graph
provides a structured representation of the spatial relationships, entities, and properties identified in
the original descriptions.</p>
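      <p>To make the notion of a structured graph concrete, the following minimal sketch stores a handful of subject-predicate-object triples and queries them. The triples, node names and predicate labels are hypothetical illustrations chosen for this example; they are not actual AMR2FRED output, and the helper functions are assumptions rather than part of the tool.</p>
      <preformat preformat-type="code">
```python
from collections import defaultdict

# Hypothetical triples, loosely shaped like the output of a text-to-graph tool.
# They are illustrative only and are NOT actual AMR2FRED output.
TRIPLES = [
    ("pedestrian_1", "rdf:type", "Pedestrian"),
    ("walk_1", "rdf:type", "Walking"),
    ("walk_1", "agent", "pedestrian_1"),
    ("walk_1", "location", "square_1"),
    ("square_1", "rdf:type", "Square"),
    ("fountain_1", "rdf:type", "Fountain"),
    ("fountain_1", "locatedIn", "square_1"),
]


def index_by_subject(triples):
    """Group (predicate, object) pairs under their subject node."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return dict(index)


def instances_of(triples, class_name):
    """Return the subjects typed with the given class, in input order."""
    return [s for s, p, o in triples if p == "rdf:type" and o == class_name]


graph = index_by_subject(TRIPLES)
print(instances_of(TRIPLES, "Walking"))  # ['walk_1']
print(graph["walk_1"])
```
      </preformat>
      <p>Even in this toy form, the event node (walk_1) is explicitly related to its agent and its location, which is the kind of relational structure the later stages can inspect.</p>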
      <p>The next stage involves the analysis, again via the large language model, of both the textual description
and the knowledge graph to extract relevant image schemas, understood as fundamental spatial and
experiential patterns motivated by embodied cognition theory. For instance, the generated description
of an instance of “walking” is formally connected to the SOURCE-PATH-GOAL image schema, while the
spatial configuration of filled and empty areas is mapped to the CONTAINER and SUPPORT image schemas
as its cognitive counterpart.</p>
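      <p>The kind of association just described can be pictured as a lookup from concepts to image schemas. The sketch below is only an illustration of that association: in our procedure the elicitation is performed by the LLM, not by a fixed table. The two pairings (walking, filled and empty areas) come from the text, while the concept labels and the function name are hypothetical.</p>
      <preformat preformat-type="code">
```python
# Cue-to-schema table. Only the "walking" and "filled/empty area" pairings
# come from the text; the concept labels themselves are hypothetical.
SCHEMA_CUES = {
    "Walking": ["SOURCE-PATH-GOAL"],
    "FilledArea": ["CONTAINER", "SUPPORT"],
    "EmptyArea": ["CONTAINER", "SUPPORT"],
}


def elicit_schemas(concepts):
    """Return the image schemas cued by a list of concept labels, without duplicates."""
    found = []
    for concept in concepts:
        for schema in SCHEMA_CUES.get(concept, []):
            if schema not in found:
                found.append(schema)
    return found


# Hypothetical concept labels read off a description or a graph.
print(elicit_schemas(["Walking", "Fountain", "EmptyArea"]))
# ['SOURCE-PATH-GOAL', 'CONTAINER', 'SUPPORT']
```
      </preformat>
      <p>Concepts without an entry (here the hypothetical "Fountain") simply contribute no schema, which mirrors the fact that not every element of a description cues an image schema.</p>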
      <p>The integration of these ‘ontologized’ (thanks to FRED) image schemas serves a dual function: they
provide formal constraints that guide subsequent language model generation while simultaneously
grounding the abstract spatial concepts in established patterns of human spatial cognition. This creates a
hybrid computational system that combines the contextual understanding capabilities of large language
models with the inferential rigour of formal knowledge representation, resulting in a framework capable
of representing and reasoning about implicit spatial knowledge through explicit formal structures.</p>
    </sec>
    <sec id="sec-3">
      <title>3. A comparison between Navona and Djemaa el-Fna squares</title>
      <p>Below we describe the procedure we followed for the elicitation of image schemas related to the two
squares.</p>
      <p>The procedure, as illustrated in Fig. 1, consists of providing the LLM Perplexity with three photos
(a daytime image, a nighttime image and an aerial image) of each square considered for the comparison: Navona
square and Djemaa el-Fna square. Fig. 2 shows the pictures used for
Navona square and Fig. 3 those used for Djemaa el-Fna. [figure 2] [figure 3]</p>
      <p>The descriptive text generated by Perplexity was provided to AMR2FRED, which transformed the
text into a graph. Subsequently, both the graphs and the original texts were provided to Perplexity to
elicit the image schemas related to each square under examination. The procedure involved comparing
the resulting image schemas derived from the text and graph representations of both Navona and
Djemaa el-Fna, and ultimately determining which image schemas better characterize each square in order to
capture the variety that a general DT for urban squares should contain.</p>
      <sec id="sec-3-1">
        <title>3.1. Image schemas in Navona square</title>
        <p>The analysis of the graph produced by FRED regarding the flows and spatial relations within Navona
square allows us to identify different image schemas that support the understanding of space, movement
and relations between elements in the urban space. As a result of this step, the main emerging image
schemas are:</p>
        <list list-type="bullet">
          <list-item><p>PATH: the lines drawn in the graph suggest paths and directions, representing movement
through the square.</p></list-item>
          <list-item><p>CONTAINER: the square appears as a delimited space that collects events, people and movements.</p></list-item>
          <list-item><p>LINK: connections emerge between distinct points of the square, visible in the lines that join the
nodes.</p></list-item>
          <list-item><p>CENTER-PERIPHERY: elements are concentrated in central areas while other zones remain marginal,
so the distinction between center and periphery emerges clearly.</p></list-item>
          <list-item><p>FORCE: curves and directions of the graph suggest dynamic forces, such as flows of people or
spatial attractions.</p></list-item>
          <list-item><p>BOUNDARY: the limits of the square are marked, distinguishing the inside from the outside.</p></list-item>
        </list>
        <p>The comparison between the graph and a descriptive text of the square (of a historical, social
and architectural nature) reveals some substantial differences. The graph makes immediately
visible the patterns related to movement and spatial structure, such as PATH, AXIS, CONTAINER and
BOUNDARY. The text, on the other hand, emphasizes the lived experience, the social function and
the narration of the space, bringing out patterns such as CONTAINER, AXIS, CENTER-PERIPHERY,
LINK and FORCE, with PATH being less evident. When the two approaches, visual and textual,
are combined, some main image schemas emerge, here ordered by representativeness according to a
weighted estimate that gives equal weight (50/50) to their evidence in the graph (in terms of recurrence,
spatial centrality and visual structure) and to their relevance in the text (based on frequency, narrative
role and descriptive function):</p>
        <list list-type="order">
          <list-item><p>PATH (30%): the dominant pattern, since it reflects the longitudinal character of the square and the
strong presence of crossing flows.</p></list-item>
          <list-item><p>CONTAINER (25%): the square is perceived as a large enclosed space that hosts events and
interactions.</p></list-item>
          <list-item><p>AXIS (15%): the central axis, marked by the arrangement of the fountains, is a structuring
element.</p></list-item>
          <list-item><p>CENTER-PERIPHERY (12%): the square has a clear center (the Fountain of the Four Rivers) that
attracts activity, contrasted with less active edges.</p></list-item>
          <list-item><p>LINK (8%): the relationships between the architectural and social elements define a network of
connections.</p></list-item>
          <list-item><p>BOUNDARY (5%): the physical limits, given by the buildings that surround the square, delimit
the urban container.</p></list-item>
          <list-item><p>FORCE (5%): the dynamic forces are less constant but present, for example in the movement of
the crowd or in the water that invades the square in summer.</p></list-item>
        </list>
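        <p>The 50/50 weighting can be read as a simple blend of two evidence profiles followed by a normalisation to percentages. The sketch below illustrates that reading under stated assumptions: the function name, the 0-10 scoring scale and the per-source scores are hypothetical and were not used to obtain the figures above, which were produced by the LLM. Only the final dictionary reproduces shares reported in the text.</p>
        <preformat preformat-type="code">
```python
def combine_evidence(graph_scores, text_scores, graph_weight=0.5):
    """Blend two evidence profiles and normalise the result to percentages."""
    text_weight = 1.0 - graph_weight
    schemas = sorted(set(graph_scores) | set(text_scores))
    blended = {}
    for name in schemas:
        blended[name] = (graph_weight * graph_scores.get(name, 0.0)
                         + text_weight * text_scores.get(name, 0.0))
    total = sum(blended.values())
    if total == 0:
        raise ValueError("no evidence for any schema")
    shares = {name: 100.0 * value / total for name, value in blended.items()}
    # Highest share first.
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-source scores on an arbitrary 0-10 scale (NOT from the paper).
graph_scores = {"PATH": 8, "CONTAINER": 4, "BOUNDARY": 2}
text_scores = {"PATH": 4, "CONTAINER": 6, "AXIS": 4}

ranked = combine_evidence(graph_scores, text_scores)
for name, share in ranked:
    print(f"{name}: {share:.1f}%")

# Shares reported in the text for Navona square; they sum to 100.
REPORTED_NAVONA = {
    "PATH": 30, "CONTAINER": 25, "AXIS": 15, "CENTER-PERIPHERY": 12,
    "LINK": 8, "BOUNDARY": 5, "FORCE": 5,
}
print(sum(REPORTED_NAVONA.values()))  # 100
```
        </preformat>
        <p>A schema that is evidenced in only one source (AXIS or BOUNDARY in the hypothetical input) still receives a share, but at half strength, which is the practical effect of weighting graph and text equally.</p>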
        <p>Assuming a single image schema were to be used to best summarize Navona square, the most likely
choice would be PATH, which has an estimated probability of 40%, followed by CONTAINER (30%) and
AXIS (15%). This estimate is based on a criterion of relative representativeness, which proportionally
combines the visual centrality of the image schema in the graph (structural presence, directionality,
frequency) and its semantic relevance in the text (narrative function, recurrence, descriptive role). The
other schemas appear less central in comparative terms. A mutual hierarchical (pairwise) evaluation
of the image schemas confirms this preference: PATH is dominant compared to all the others.
The other schemas follow in decreasing order of importance: CONTAINER, AXIS,
CENTER-PERIPHERY, LINK, BOUNDARY and finally FORCE. As a final note, the VERTICALITY image schema
was not included among the main ones because, although vertical elements are mentioned in the text
(the obelisk, the dome of the church), the graph representation and the experience of the square appear
predominantly horizontal. There are no dominant vertical differences or developments in the perception
of the urban space, which remains strongly axial and traversable along a longitudinal direction.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Image schemas in Djemaa El-Fna square</title>
        <p>The graph of interactions and flows for Djemaa el-Fna square highlights multiple image
schemas that can shape the possible experience of the urban space. Among these, the dominant image
schema is undoubtedly PATH, highlighted by the multiple lines that represent movements, paths and
flows of people and activities. Alongside this, the CONTAINER image schema emerges, which describes
the square as an enclosed space with well-defined areas within it. The connections between elements
(stalls, shows, thematic areas) reflect the LINK image schema, which underlines the relationships between
actors and spaces. The presence of a particularly active central area, from which the activities branch
out towards the outside, recalls the CENTER-PERIPHERY image schema. Finally, the intertwining and
simultaneous coexistence of multiple activities and paths brings out the concept of MULTIPLICITY,
typical of chaotic and dynamic spaces like this one.</p>
        <p>Comparing this analysis of the graph with the text
analysis, significant differences emerge in the perception of the same image schemas. The text
transmits information in a narrative and sequential way, forcing the reader to mentally reconstruct the
spatial relationships. However, even in the text different image schemas are clearly recognizable. The
most evident is once again CONTAINER: the square is described as the heart of the medina, a space
that contains activities, people, noises, lights and transformations. CENTER-PERIPHERY follows, since
the square is the focal point from which the souks and streets of the city branch out, and around which
social and symbolic activities revolve. The PATH image schema is implicit in the movements of visitors
who cross the spaces, passing from one function to another, while LINK shows up in the relationships
between functions: trade, entertainment, food and rituality. More subtle patterns can then emerge,
such as SOURCE-PATH-GOAL, which is recognized in the daily transformation of the square (from a
daytime market to an evening theater), and UP-DOWN, suggested by the presence of the minaret that
dominates from above. Finally, BALANCE/FORCE can be perceived in the tension between order and
chaos, between structure and improvisation.</p>
        <p>Combining graph and text, it is possible to order the image
schemas by degree of representativeness. The most important seems to be CONTAINER, which collects
about a third of the occurrences and reflects the essence of the square as a space that welcomes, contains
and organizes. PATH and the related SOURCE-PATH-GOAL follow, which together represent a quarter
of the spatial dynamics, describing the flow of bodies, goods and meanings. The CENTER-PERIPHERY
pattern covers about 18% of the occurrences, testifying to the central position of the square in urban and
symbolic life. Less representative but still relevant are LINK, UP-DOWN and BALANCE/FORCE, which
complete the layered perception of space.</p>
        <p>Assuming that only one image schema is to be identified
to represent Djemaa el-Fna square, CONTAINER would have the highest probability of being chosen,
equal to 40%, because of its double evidence, visual and narrative. PATH/SOURCE-PATH-GOAL follows
with 25%, and CENTER-PERIPHERY with 15%. The others would be placed below 10%. A mutual
comparison between the image schemas (according to a “one against one” logic) confirms this hierarchy.
CONTAINER prevails over all the others due to its spatial and conceptual centrality. PATH is placed
second, being more evident than LINK, CENTER-PERIPHERY and UP-DOWN. The latter, linked to
the vertical perception given by the minaret, is present but secondary. BALANCE/FORCE, finally, is the
most transversal and least structuring schema.</p>
        <p>As for the theme of VERTICALITY, it is present but not
central. Although the Koutoubia minaret visually dominates the square, its function remains symbolic
rather than organizational. Neither the text nor the graph seems to suggest a vertically hierarchical
spatial or functional structure. Djemaa el-Fna is perceived and experienced as an essentially horizontal,
open, dynamic and reticular space. In short, Djemaa el-Fna appears to be configured as a container
space, crossed by multiple paths, the vital center of the city and a point of connection between activities,
functions and urban identities. The main image schemas help not only to describe it, but also to
understand its symbolic power and its role in the social space.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Analytical comparison between Navona and Djemaa el-Fna squares</title>
        <p>The CONTAINER image schema is strongly represented in both squares, but with different nuances.
In Navona square, the notion of container is shown through the clear architectural boundaries and
the role of the square as a closed and organized urban space. The text describes it as a “large urban
theater”, accentuating its aesthetic and scenographic function. In Djemaa el-Fna, however, the concept
of container takes on a more dynamic and social value: the square is described both in the graph and in
the text as a place that fluidly welcomes and gathers a multiplicity of activities (from commerce
to shows) and is perceived as the beating heart of the medina. For this reason, while in Navona the
container is more static and architectural, in Djemaa el-Fna it is alive and transformative.</p>
        <p>The concept of
AXIS is instead clearly present only in Navona square. Here the longitudinal axis is clearly perceptible
both in the visual arrangement (graph) and in the textual description, which highlights the ordered
sequence of the three main fountains along a central path. In Djemaa el-Fna, on the contrary, this image
schema is practically absent. The Moroccan square is not organized along an axis, but appears as a
reticular and polycentric space, in which the activities are distributed without a prevailing orientation.</p>
        <p>Concerning CENTER-PERIPHERY, both urban spaces present a significant centrality, but with different
functions. In Navona square the center is symbolically represented by the Fountain of the Four Rivers,
which acts as a visual and architectural node. In the graph, this centrality is underlined by the greater
density of connections that start from this point. In Djemaa el-Fna, instead, the centrality is essentially
functional and social: the square constitutes the heart of the medina, the nerve center of urban life,
from which the streets branch off towards the souks and other neighborhoods. This centrality is less
linked to a single element and more to the overall function of the square.</p>
        <p>The image schema LINK is present in both squares, but with different intensity. In Navona, the
connections are mainly spatial and architectural: they are relationships between fountains, churches
and other landmarks, often visually aligned. In Djemaa el-Fna, on the other hand, the links are dense and
varied: connections between different functions (market, entertainment, gastronomy), as well as social, cultural
and symbolic relationships. In this case, the square acts as a node of exchange and interaction between
different languages and practices, not only as a physical space.</p>
        <p>The notion of BOUNDARY is more evident in Navona square. Its limits are marked by Baroque
buildings and historic palaces, which frame the space in a clear way. In the graph, the edges are clearly
represented, while the text underlines the defined perimeter of the square. On the contrary, in Djemaa
el-Fna, the boundaries are more fluid, permeable and less emphasized. The square merges with the
souks and adjacent streets, and there is no clear separation between “inside” and “outside”.</p>
        <p>The image schema SOURCE–PATH–GOAL is particularly present in the textual description of Djemaa
el-Fna, and almost absent in that of Navona square. The Moroccan square is described as a space that
transforms during the day: a market in the morning, a theater in the evening. This transformation
could suggest a functional and temporal path that starts from an initial condition (source), goes through
a change (path), and reaches a final goal (goal), such as the experience of a show or a collective ritual.
Navona square, on the contrary, maintains a greater stability of function and structure.</p>
        <p>As regards the UP–DOWN image schema, linked to verticality, here too a contrast is highlighted.
In Djemaa el-Fna the minaret of the Koutoubia mosque visually “dominates” the square, introducing
an evident, albeit symbolic, vertical dimension. This aspect is recognized both in the text and in the
spatial perception of the place. Navona square, on the other hand, features vertical elements – such as
the obelisk of the central fountain or the domes of the churches – but these do not seem to structure
the perception or use of space, thus appearing accessory to the prevailing horizontal configuration.</p>
        <p>The image schema BALANCE/FORCE is more strongly perceptible in Djemaa el-Fna. The square
is a dynamic place, where order and chaos, stability and improvisation coexist. The text highlights
the simultaneous presence of organized structures (such as markets) and more spontaneous forms
of sociality (shows, crowds). This tension between opposing forces represents a precarious but vital
balance, which characterizes the identity of the place. In Navona square, however, this dimension is
less marked. The movement seems more regulated, the spatial composition more harmonious and
contained.</p>
        <p>Finally, the image schema MULTIPLICITY seems to be absent in Navona square, but central in Djemaa
el-Fna. The Moroccan square simultaneously hosts a plurality of events, people, languages and gestures.
This multiplicity is perceptible both in the graph, which shows a dense and polycentric network, and in
the textual description, which describes the square as a kaleidoscope of activities. Navona square, while
offering different functions (tourism, art, sociality), appears to be experienced in a more sequential and
orderly way, along a traversable axis. Therefore, multiplicity appears to be a structuring quality in
Djemaa el-Fna, but only an accessory quality in Navona.</p>
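        <p>As a coarse summary of this comparison, the main schemas listed for each square in Sections 3.1 and 3.2 can be contrasted by simple membership. The sketch below is a presence/absence view only and deliberately loses the nuances discussed above (for instance, BOUNDARY is fluid rather than absent in Djemaa el-Fna). Treating FORCE and BALANCE/FORCE as distinct labels is an assumption of this example, and the function name is hypothetical.</p>
        <preformat preformat-type="code">
```python
# Main image schemas as listed in Sections 3.1 and 3.2, in reported order.
NAVONA = ["PATH", "CONTAINER", "AXIS", "CENTER-PERIPHERY",
          "LINK", "BOUNDARY", "FORCE"]
DJEMAA = ["CONTAINER", "PATH", "SOURCE-PATH-GOAL", "CENTER-PERIPHERY",
          "LINK", "UP-DOWN", "BALANCE/FORCE", "MULTIPLICITY"]


def compare(first, second):
    """Split two schema lists into shared and exclusive labels, keeping order."""
    shared = [s for s in first if s in second]
    only_first = [s for s in first if s not in second]
    only_second = [s for s in second if s not in first]
    return shared, only_first, only_second


shared, only_navona, only_djemaa = compare(NAVONA, DJEMAA)
print("shared:", shared)
print("Navona only:", only_navona)
print("Djemaa el-Fna only:", only_djemaa)
```
        </preformat>
        <p>The shared labels (PATH, CONTAINER, CENTER-PERIPHERY, LINK) are candidates for a general characterisation of squares, while the exclusive ones point to the variety a DT for urban squares would need to accommodate.</p>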
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>This paper presents an approach that uses image schemas as a methodological contribution to
eliciting cognitive and human knowledge and making it accessible to UDTs for planning and urban
decision-making. The approach is preparatory to the integration of structural and functional information with
personal and cognitive information that together characterize urban spaces.</p>
      <p>Our experiment concentrated on two different squares and revealed that, while both share certain
fundamental image schemas (such as CONTAINER and PATH), their hierarchical organization differs
significantly in ways that reflect deep cultural and functional logic. Navona square emerges as a space
primarily structured by the PATH schema (30%), probably reflecting its longitudinal character and
the Baroque tradition of organized spatial sequences, with a prominence of the AXIS image schema
(15%) that underlines an emphasis on directional clarity and hierarchical organization. The function
of the square as a well-defined CONTAINER (25%) suggests architectural traditions that favour the
creation of borders and a scenic framework. By contrast, Djemaa el-Fna reveals a
spatial logic characterized by the CONTAINER image schema (30-40%), but in a dynamic and
transformative sense, probably reflecting urban traditions of adaptable public space, with
SOURCE-PATH-GOAL patterns (25%) that capture the temporal transformation of the square from
market to performative space.</p>
      <p>
        The comparative analysis suggests direct implications for planning and decision-making processes,
revealing the importance of an urban design sensitive to cultural differences in space cognition and
social organization. For example, the presence of the MULTIPLICITY image schema in Djemaa el-Fna,
which is absent in Navona square, may indicate how different urban traditions organize complexity
and social interaction. This suggests that planning approaches that traditionally emphasize order and
hierarchy may be inadequate for different contexts and communities. The time dimension captured
through the SOURCE-PATH-GOAL image schema also suggests that successful public spaces
could be designed for adaptive use rather than a fixed function, a particularly relevant principle for
today’s urban contexts, which are characterized by rapid change.
      </p>
      <p>
        The intuition we explored in this contribution
is that, in the context of Urban Digital Twins (UDTs), image schema information could serve as
a cognitive-semantic layer that enriches the representation of spatial and experiential aspects of the
city. For example, schemas such as CONTAINER or PATH may be used to structure how movement,
accessibility, or boundaries are modeled in digital replicas of urban environments. Integrating such
schemas could improve the interpretability of simulations, support citizen-centered analyses of urban
space, and complement data-driven techniques by grounding them in embodied conceptual structures,
together with knowledge analysed and represented via applied ontology. A follow-up could involve
eliciting image schemas directly from the users of a space or the inhabitants
of a place (a direct source from which it is often difficult to extract consistent knowledge,
as mentioned). This research opens perspectives for applying Urban Digital Twin
technologies beyond their current limitations in capturing cognitive and social dimensions, offering
a path towards more knowledge-aware and inclusive urban modelling systems. The methodological
potential for simulating future scenarios could allow decision makers to explore not only future physical
impacts but also cognitive and social impacts, thus meeting crucial emerging needs. As
a model of knowledge management, this hybrid approach looks particularly suitable for integration
into system architectures that support interactive, real-time urban decision-making, allowing systems to
adapt dynamically to upcoming events and occurrences while maintaining methodological consistency.
      </p>
      <p>
        Recent contributions demonstrate how to build a novel framework that bridges embodied cognition
theory and agent systems by leveraging a formal characterization of image schemas [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. By customizing
LLMs to translate natural language descriptions into formal representations based on these sensorimotor
patterns, one can create a neurosymbolic system that grounds an agent’s understanding in fundamental
conceptual structures [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In the future we aim to consolidate this pipeline, which seems to be stable,
as our simulation with purely textual descriptions suggests, in order to extract combinations of image
schemas for squares. We could then present these as a framework for a cognition-based representation of
squares to be included in today’s UDTs of these areas. Whether this research line can be
consolidated and extended to urban areas in general remains to be verified.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used Claude 4 for: Grammar and spelling check,
Paraphrase and reword; and Perplexity for: automatic image description generation. After using these
tools/services, the author(s) reviewed and edited the content as needed and take(s) full responsibility for
the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Batty</surname>
          </string-name>
          .
          <article-title>Digital twins</article-title>
          .
          <source>Environment and Planning B: Urban Analytics and City Science</source>
          ,
          <volume>45</volume>
          (
          <issue>5</issue>
          ):
          <fpage>817</fpage>
          -
          <lpage>820</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Borgo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Borri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Camarda</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. R. Stufano</given-names>
            <surname>Melone</surname>
          </string-name>
          .
          <source>An Ontological Analysis of Cities, Smart Cities and Their Components</source>
          , volume
          <volume>36</volume>
          of Philosophy of Engineering and Technology, chapter
          <volume>18</volume>
          . Springer, Cham,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>De Giorgis</surname>
          </string-name>
          .
          <article-title>Ethics in the flesh: formalizing moral values in embodied cognition</article-title>
          .
          <source>PhD thesis</source>
          , Università di Bologna,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gangemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Presutti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. Reforgiato</given-names>
            <surname>Recupero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Nuzzolese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Draicchio</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Mongiovì</surname>
          </string-name>
          .
          <article-title>Semantic web machine reading with FRED</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>8</volume>
          (
          <issue>6</issue>
          ):
          <fpage>873</fpage>
          -
          <lpage>893</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Hedblom</surname>
          </string-name>
          .
          <article-title>Image schemas: State of the art in spatiotemporal conceptualisation</article-title>
          .
          <source>Image Schemas and Concept Invention: Cognitive, Logical, and Linguistic Investigations</source>
          , pages
          <fpage>33</fpage>
          -
          <lpage>51</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Hedblom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Mossakowski</surname>
          </string-name>
          .
          <article-title>The diagrammatic image schema language (DISL)</article-title>
          .
          <source>Spatial Cognition &amp; Computation</source>
          ,
          <volume>25</volume>
          (
          <issue>2</issue>
          ):
          <fpage>138</fpage>
          -
          <lpage>175</lpage>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Hedblom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Kutz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          .
          <article-title>Image schemas in computational conceptual blending</article-title>
          .
          <source>Cognitive Systems Research</source>
          ,
          <volume>39</volume>
          :
          <fpage>42</fpage>
          -
          <lpage>57</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Lakoff</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Johnson</surname>
          </string-name>
          .
          <source>Metaphors We Live By</source>
          . University of Chicago Press,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M. R. Stufano</given-names>
            <surname>Melone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Borgo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Camarda</surname>
          </string-name>
          .
          <article-title>Digital twins of cities vs. digital twins for cities</article-title>
          . In
          <string-name>
            <given-names>A.</given-names>
            <surname>Marucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zullo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Fiorini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Saganeiti</surname>
          </string-name>
          , editors,
          <source>Innovation in Urban and Regional Planning</source>
          , pages
          <fpage>192</fpage>
          -
          <lpage>203</lpage>
          , Cham,
          <year>2024</year>
          . Springer Nature Switzerland.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M. R. Stufano</given-names>
            <surname>Melone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Borgo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Camarda</surname>
          </string-name>
          .
          <article-title>Image schema and ontology-based rules to support planning activities: a study of the urban square</article-title>
          .
          In
          <source>Proceedings of The Eighth Image Schema Day co-located with The 23rd International Conference of the Italian Association for Artificial Intelligence (AI*IA 2024)</source>
          , volume
          <volume>3888</volume>
          of CEUR Workshop Proceedings,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M. R. Stufano</given-names>
            <surname>Melone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Borgo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Camarda</surname>
          </string-name>
          .
          <article-title>City interactions in urban planning: The square example from an ontological analysis point of view</article-title>
          .
          In
          <source>International Conference on Computational Science and Its Applications</source>
          , pages
          <fpage>448</fpage>
          -
          <lpage>455</lpage>
          . Springer,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. R. Stufano</given-names>
            <surname>Melone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Borgo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Camarda</surname>
          </string-name>
          .
          <article-title>Digital twins facing the complexity of the city: Some critical remarks</article-title>
          .
          <source>Sustainability</source>
          ,
          <volume>17</volume>
          (
          <issue>7</issue>
          ):
          <fpage>3189</fpage>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Wagg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Worden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Barthorpe</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Gardner</surname>
          </string-name>
          .
          <article-title>Digital twins: state-of-the-art and future directions for modeling and simulation in engineering dynamics applications</article-title>
          .
          <source>ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part B: Mechanical Engineering</source>
          ,
          <volume>6</volume>
          (
          <issue>3</issue>
          ):
          <fpage>030901</fpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>F.</given-names>
            <surname>Olivier</surname>
          </string-name>
          and
          <string-name>
            <given-names>Z.</given-names>
            <surname>Bouraoui</surname>
          </string-name>
          .
          <article-title>Grounding Agent Reasoning in Image Schemas: A Neurosymbolic Approach to Embodied Cognition</article-title>
          .
          <source>arXiv preprint arXiv:2503.24110</source>
          ,
          <year>2025</year>
          . Available at: https://arxiv.org/abs/2503.24110.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <article-title>Aerial view of Piazza Navona</article-title>
          , Available at: https://www.istantidibellezza.it/piazza-navona.html
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] Open space of Piazza Navona, Available at: https://www.instagram.com/p/CIMJMdGhywW/</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <article-title>Nighttime view of Piazza Navona</article-title>
          , Available at: https://www.viator.com/it-IT/Rome/d511-ttd
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <article-title>Aerial view of Djemaa el-Fna</article-title>
          , Available at: https://www.marrakesch.com/sehenswurdigkeiten/djemaa-el-fna/
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <article-title>Night view of Djemaa el-Fna</article-title>
          , Available at: https://www.marrakesch.com/sehenswurdigkeiten/djemaa-el-fna/
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <article-title>Open space of Djemaa el-Fna</article-title>
          , Available at: viaggiaescopri.it, https://www.linkedin.com/pulse/impact-economique-du-coronavirus-quels-enseignements-pour-maktoum/?originalSubdomain=fr
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>