<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>GPT-4-Based LLMs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ahmed Mansour</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wu Chen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mahmoud Adham</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Huan Luo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University</institution>
          ,
          <country country="HK">Hong Kong</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Public Works Department, Faculty of Engineering, Cairo University</institution>
          ,
          <addr-line>Giza</addr-line>
          ,
          <country country="EG">Egypt</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Crowd-powered indoor positioning systems (IPS) offer scalable and cost-effective solutions for building and updating radio maps. However, they face challenges including limited calibration resources, unreliable signal labeling, and incomplete semantic annotations. Passive data collection reduces user burden but often lacks sufficient contextual cues to ensure localization accuracy. Auxiliary aids such as GNSS, QR codes, or BLE beacons can support calibration but are confined to specific deployment zones, and GNSS is ineffective in many indoor environments. Active user engagement, by contrast, can provide scalable annotation and calibration if prompts are delivered in a timely, context-aware manner and crafted sensitively to user preferences and cognitive load, so as to avoid fatigue and disengagement. This work is motivated by unobtrusive engagement strategies employed in platforms like Google Maps, where users are prompted to report transit conditions during navigation, and by similar feedback mechanisms on YouTube and Facebook that refine recommendation algorithms. We propose the first modular, AI-guided prompting framework for unobtrusive spatial feedback collection in crowdsourced IPS, enabling users to confirm location estimates, floor levels, and Points of Interest (POI) names without disrupting primary tasks. The framework comprises five interoperable layers: contextual situation assessment, intelligent prompt selection, hybrid prompt generation, user interaction handling, and feedback integration through continuous learning. Spatial knowledge graphs (SKGs) are introduced to embed semantic context into prompt logic, while large language models (LLMs) such as GPT-4 generate linguistically optimized, user-specific queries. By selectively prompting users at opportune moments, the system transforms sporadic passive data into semantically rich, trustworthy inputs with minimal disruption.</p>
      </abstract>
      <kwd-group>
        <kwd>Crowdsourced indoor positioning</kwd>
        <kwd>user engagement</kwd>
        <kwd>prompt generation</kwd>
        <kwd>radio map construction</kwd>
        <kwd>semantic annotation</kwd>
        <kwd>spatial knowledge graph</kwd>
        <kwd>large language models (LLM)</kwd>
        <kwd>GPT-4</kwd>
        <kwd>unobtrusive interaction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Indoor positioning systems (IPS) have emerged as a critical technology enabling numerous applications
ranging from indoor navigation and location-based services to emergency response and smart building
management. Traditionally, accurate IPS solutions—particularly fingerprinting-based approaches—have
relied heavily on meticulous site surveys, extensive infrastructure deployment, and manual calibration
processes [1, 2]. These methods, while precise, are costly, labor-intensive, and impractical for
large-scale or dynamically changing environments. Consequently, crowdsourced approaches leveraging
smartphones’ ubiquity and integrated sensors (Wi-Fi, magnetometers, barometers, etc.) have gained
traction in recent years. By capitalizing on crowd-powered data collection, IPS can substantially reduce
costs, scale more effectively, and adapt dynamically to environmental changes. However, despite these
advantages, crowdsourced data introduces new challenges related to data reliability, user compliance,
and annotation accuracy. One fundamental issue with crowdsourced IPS data is the lack of controlled
conditions for data collection [3]. Users typically contribute passively, unaware of or uninterested in
ensuring the data’s accuracy or completeness. This often leads to noisy, inconsistent, or incomplete
datasets that undermine the reliability of the resulting positioning models. Moreover, the absence of
semantic annotations—such as precise floor identification, building labels, or POI tags—complicates an
IPS’s ability to generate rich, contextually meaningful maps. As a result, IPS built solely on passive
crowdsourced data can exhibit substantial localization errors and uncertainty [4]. To mitigate these
challenges, auxiliary aids such as GNSS signals, QR codes, or BLE beacons have been suggested to
support calibration in passive data collection approaches. These aids, however, are confined to specific
deployment zones, and GNSS remains ineffective in many indoor or underground areas. Recent research
in other domains has highlighted the need for active user engagement to improve crowdsourced
data quality. Traditional methods of actively soliciting user input (e.g., periodic surveys or pop-up
queries) often face resistance due to inconvenience or fatigue, resulting in low response rates and
limited scalability. The critical problem, therefore, is balancing the need for accurate, reliable, and
contextually rich data against the requirement of minimal user disruption. Achieving this balance
necessitates innovative solutions inspired by successful digital platforms that seamlessly integrate
unobtrusive interactions into user workflows, fostering high engagement without negatively affecting
user experience.</p>
      <p>Platforms like YouTube, Facebook, and Google Maps have demonstrated that brief, contextually
relevant prompts can elicit significant user engagement with minimal disruption. YouTube’s
context-sensitive video recommendations and prompts effectively guide user interaction, enhancing both
satisfaction and platform metrics. Similarly, Facebook employs brief, unobtrusive questions directly
integrated into users’ browsing flows, enabling effortless user feedback without interrupting primary
activities. In navigation scenarios, Google Maps’ implementation of simple prompts (for instance,
asking users to assess transit crowdedness during a journey) has proven effective in collecting real-time
user-generated data while maintaining a smooth navigation experience. Indeed, unobtrusive user
engagement has driven valuable crowd inputs in domains like traffic and transit monitoring [5].</p>
      <p>Unlike these well-studied domains, the context of crowd-powered IPS has lacked dedicated studies
on unobtrusive user engagement strategies. To fill this gap, and motivated by the effectiveness of such
strategies elsewhere, this paper proposes a structured, AI-driven framework tailored for IPS data quality
enhancement through user feedback that minimizes disruption. The proposed framework is designed to
seamlessly engage users in contributing spatial feedback—such as confirming location estimates, floor
labels, or POI names—to support accurate indoor radio map construction and semantic annotations.</p>
      <p>
        Our framework comprises five interoperable layers: (1) contextual situation assessment, (2) intelligent
prompt strategy selection, (3) prompt generation, (4) user interaction and feedback capture, and (5)
feedback integration with continuous learning. It balances system requirements with user attention by
adapting to motion state, data uncertainty, and individual user preferences. Spatial Knowledge Graphs
(SKGs) are introduced at the core of the system’s intelligence to embed semantic awareness into prompt
selection and generation. Additionally, large language models (LLMs) such as GPT-4 are leveraged to
generate linguistically optimized and context-sensitive prompts tailored to each user. By selectively
engaging users at opportune moments and in context-appropriate ways, the system transforms sporadic
passive data into semantically rich, trustworthy inputs with minimal disruption.
      </p>
      <p>This paper presents the design of the AI-driven prompting framework and discusses each of its
components in detail. We also report on a preliminary evaluation through simulation, demonstrating
the framework’s potential to improve user engagement and data quality. Future work will extend this
research with full system implementation on real devices and empirical user studies. The rest of this
paper is organized as follows: Section 2 describes the proposed framework architecture and methodology,
detailing the functionality of each layer. Section 3 presents an evaluation of the framework’s performance
in simulated scenarios, including user engagement rates and map improvement metrics. Finally, Section
4 concludes the paper and outlines directions for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Proposed Framework and Methodology</title>
      <sec id="sec-2-1">
        <title>2.1. Contextual Situation Assessment (When to Ask)</title>
        <p>The first layer of the framework assesses whether a user should be prompted at a given moment. In
crowd-powered IPS, indiscriminate prompting is counterproductive—it leads to user fatigue, reduced
compliance, and potential disengagement. Thus, Layer 1 is pivotal in ensuring that prompts are issued
only when necessary for improving the system and when they are unlikely to disrupt the user’s primary
task. This situation assessment continuously monitors both the state of the system (e.g., localization
uncertainty, map coverage) and the state of the user (motion, activity, history of prompts) to make
intelligent decisions about prompting. The overall decision logic and key contextual inputs guiding this
layer’s evaluations are illustrated in Figure 2.</p>
        <p>
          As shown in Figure 2, the decision to trigger a prompt is informed by several key criteria. The first
concerns radio map coverage, which refers to the availability (or lack) of existing signal fingerprints at
the user’s estimated location. If the current location is under-represented in the radio map—such as
when few or no prior Wi-Fi scans exist for this area—the system may prompt the user to contribute data
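or verify their position. The sufficiency of radio data at a given location <inline-formula><tex-math>\ell</tex-math></inline-formula> is quantified by a normalized
signature count, defined as
          <disp-formula><label>(1)</label><tex-math>C_{\mathrm{sig}}(\ell) = \frac{|S(\ell)|}{|S_{\max}|},</tex-math></disp-formula>
where <inline-formula><tex-math>|S(\ell)|</tex-math></inline-formula> denotes the number of sensor readings (for example, Wi-Fi RSS fingerprints) collected at
<inline-formula><tex-math>\ell</tex-math></inline-formula> and <inline-formula><tex-math>|S_{\max}|</tex-math></inline-formula> represents a reference value corresponding to well-surveyed locations. If <inline-formula><tex-math>C_{\mathrm{sig}}(\ell)</tex-math></inline-formula> falls
below a threshold <inline-formula><tex-math>\theta_{\mathrm{sig}}</tex-math></inline-formula>, indicating sparse data, the likelihood of triggering a prompt at this location
increases. A second criterion is the presence of semantic label gaps, which evaluates whether semantic
annotations are needed for the current estimated place. Missing labels, such as the
floor number, building ID, or POI name, highlight opportunities where user input could meaningfully
enhance the system’s understanding.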
        </p>
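        <p>The coverage test above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors’ implementation: the function and field names, the clipping of the score to 1, and the threshold value of 0.3 are all assumptions made for the example.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class CoverageCheck:
    """Result of the radio-map coverage test for one location."""
    score: float     # normalized signature count, clipped to [0, 1]
    is_sparse: bool  # True when the score falls below the threshold


def coverage_check(n_readings: int, n_reference: int,
                   threshold: float = 0.3) -> CoverageCheck:
    """Readings collected at a location divided by a reference count.

    `threshold` (0.3) is an illustrative value, not taken from the paper.
    Clipping to 1.0 is also an assumption of this sketch.
    """
    if n_reference <= 0:
        raise ValueError("n_reference must be positive")
    score = min(max(n_readings, 0) / n_reference, 1.0)
    return CoverageCheck(score=score, is_sparse=score < threshold)


# A location with 4 fingerprints against a reference of 40 is sparse;
# one with 36 fingerprints is not.
sparse = coverage_check(n_readings=4, n_reference=40)
dense = coverage_check(n_readings=36, n_reference=40)
```
]]></preformat>
        <p>A sparse result would only raise the likelihood of a prompt; the remaining Layer 1 conditions still have to hold before anything is shown to the user.</p>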
        <p>The system also accounts for the user’s motion mode, classified using device sensors into states such
as stationary, walking, running, or undergoing an elevator or stair transition. Prompts are ideally
delivered when the user is stationary, moving slowly, or immediately after completing a significant
transition like a floor change. Active movements, especially rapid walking or navigation that demands
the user’s attention, lead to prompt suppression. A binary condition formally captures this:
          <disp-formula><label>(2)</label><tex-math><![CDATA[\mathrm{Prompt}_{\mathrm{motion}} = \begin{cases} \text{True}, & \text{if motion state} \in \{\text{stationary, just stopped, floor change}\}, \\ \text{False}, & \text{if motion state} \in \{\text{walking, running, driving}\}. \end{cases}]]></tex-math></disp-formula>
        </p>
        <p>
          Only when <inline-formula><tex-math>\mathrm{Prompt}_{\mathrm{motion}}</tex-math></inline-formula> evaluates to true does the system proceed to consider prompting the user.
Beyond physical movement, the framework incorporates the user activity state by monitoring current
device engagement through activity recognition APIs or analysis of the foreground application context.
If the user is actively engaged in tasks that demand attention (such as being on a phone call, typing a
message, or gaming), the prompt is deferred to avoid disruption. This decision uses predefined sets:
<inline-formula><tex-math>A_{\mathrm{promptable}}</tex-math></inline-formula>, which includes activities conducive to prompting (such as using a maps app or being idle
on the home screen), and <inline-formula><tex-math>A_{\mathrm{busy}}</tex-math></inline-formula>, which contains activities like video watching or extensive typing. A
binary flag is then defined as
          <disp-formula><label>(3)</label><tex-math><![CDATA[F_{\mathrm{act}} = \begin{cases} 1, & \text{if } a \in A_{\mathrm{promptable}}, \\ 0, & \text{if } a \in A_{\mathrm{busy}}, \end{cases}]]></tex-math></disp-formula>
where <inline-formula><tex-math>a</tex-math></inline-formula> denotes the user’s current activity; the system will only consider triggering a prompt if <inline-formula><tex-math>F_{\mathrm{act}} = 1</tex-math></inline-formula> at that particular moment. Moreover,
prompt timing and historical context are integral to prevent overwhelming the user. Letting <inline-formula><tex-math>t_{\mathrm{last}}</tex-math></inline-formula> denote
the timestamp of the most recent prompt shown or answered by the user, a new prompt is only generated
if the current time <inline-formula><tex-math>t_{\mathrm{now}}</tex-math></inline-formula> satisfies <inline-formula><tex-math>t_{\mathrm{now}} - t_{\mathrm{last}} &gt; T_{\min}</tex-math></inline-formula>, where <inline-formula><tex-math>T_{\min}</tex-math></inline-formula> is a configurable timeout parameter, typically
set to several minutes or hours depending on the application’s requirements. This spacing ensures
that prompts are not clustered too closely together. Additionally, users may have personalized settings
that limit the number of prompts per day. If a user specifies (or the system infers) a preference for at
most <inline-formula><tex-math>N_{\max}</tex-math></inline-formula> prompts daily, the framework respects this by enforcing an additional gating mechanism that
suppresses further prompts once this limit is reached.
        </p>
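        <p>One way to read the gating conditions of this layer is as a conjunction of four checks: the motion condition, the activity flag, the minimum spacing between prompts, and the daily limit. The sketch below assumes this reading. The state names, the 10-minute spacing, the limit of five prompts per day, and the choice to treat unlisted activities as busy are illustrative assumptions, not values from the paper.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass

# Hypothetical state names; the paper lists the categories but not identifiers.
PROMPTABLE_MOTION = {"stationary", "just_stopped", "floor_change"}
PROMPTABLE_ACTIVITY = {"maps_app", "home_screen_idle"}


@dataclass
class UserContext:
    motion_state: str
    activity: str
    now_s: float          # current time, seconds
    last_prompt_s: float  # time of the most recent prompt, seconds
    prompts_today: int


def may_prompt(ctx: UserContext, min_gap_s: float = 600.0,
               daily_limit: int = 5) -> bool:
    """Return True only when every Layer 1 gate is open."""
    motion_ok = ctx.motion_state in PROMPTABLE_MOTION
    # Activities outside the promptable set are treated as busy (assumption).
    activity_ok = ctx.activity in PROMPTABLE_ACTIVITY
    spacing_ok = (ctx.now_s - ctx.last_prompt_s) > min_gap_s
    under_limit = ctx.prompts_today < daily_limit
    return motion_ok and activity_ok and spacing_ok and under_limit


idle_user = UserContext("stationary", "maps_app", 5000.0, 1000.0, 2)
walking_user = UserContext("walking", "maps_app", 5000.0, 1000.0, 2)
recently_asked = UserContext("stationary", "maps_app", 1200.0, 1000.0, 2)
```
]]></preformat>
        <p>Here only the first context would be eligible: the second fails the motion condition and the third was prompted 200 seconds ago.</p>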
        <p>The system also adapts based on the user’s engagement history. For users who consistently ignore
or dismiss prompts of a certain type, the model deprioritizes those prompts or raises the required
significance threshold (such as uncertainty or semantic gaps) before issuing them again. Conversely, if
a user reliably responds to specific kinds of queries, the system may preferentially select these when
appropriate. This layered decision-making process ensures that prompt delivery aligns with moments
that are both critical for system information and appropriate for the user, laying a solid foundation for
unobtrusive engagement within Layer 1 of the framework.</p>
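        <p>The history-based adaptation could, for instance, raise the significance threshold for a prompt type in proportion to how often the user has ignored it. The following sketch is one hypothetical rule of that kind; the linear form and the <monospace>max_boost</monospace> parameter are assumptions, since the paper does not specify the adjustment.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def adjusted_threshold(base: float, shown: int, answered: int,
                       max_boost: float = 0.5) -> float:
    """Raise the required significance for prompt types the user ignores.

    base:      threshold used for a user with no history
    shown:     prompts of this type shown so far
    answered:  how many of those the user answered
    max_boost: largest increase applied when every prompt was ignored
               (illustrative value)
    """
    if shown == 0:
        return base
    ignore_rate = 1.0 - answered / shown
    return min(1.0, base + max_boost * ignore_rate)


responsive = adjusted_threshold(0.4, shown=10, answered=10)  # unchanged
dismissive = adjusted_threshold(0.4, shown=10, answered=0)   # raised
```
]]></preformat>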
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Prompt Strategy Engine (What to Ask)</title>
        <p>Once the framework determines it is an opportune moment to prompt, Layer 2 selects the most relevant
question to ask. In a crowd-powered IPS, some prompts (like confirming a floor or naming an unknown
POI) can substantially improve the system, while redundant or irrelevant queries risk user annoyance.
Thus, the engine combines rule-based logic, uncertainty estimation, and learning-based policies to
prioritize prompts.</p>
        <p>It first evaluates the current uncertainty vector across dimensions such as position, floor, or building,
selecting the highest uncertainty component to target. It then checks for missing semantic information,
for example prompting to label an “unknown” room. The Spatial Knowledge Graph (SKG) ensures
contextual relevance by anchoring prompts to nearby known entities, preventing questions about
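unrelated floors or wings. The engine also integrates historical trends and user profiles, prioritizing
queries in areas flagged by prior user corrections or tailoring prompt complexity to user expertise. Each
candidate prompt <inline-formula><tex-math>p</tex-math></inline-formula> is scored by
          <disp-formula><label>(4)</label><tex-math>U(p) = G(p) - \lambda\, E(p),</tex-math></disp-formula>
where <inline-formula><tex-math>G(p)</tex-math></inline-formula> measures expected uncertainty reduction and <inline-formula><tex-math>E(p)</tex-math></inline-formula> the user effort, balanced by <inline-formula><tex-math>\lambda</tex-math></inline-formula>. Prompts
are ranked by <inline-formula><tex-math>U(p)</tex-math></inline-formula> to maximize utility. Over time, multi-armed bandit and Q-learning approaches refine
these choices based on observed user interactions, balancing exploration and exploitation. Redundancy
filters further prevent asking for data that has already been confirmed. This ensures every prompt contributes
meaningfully to improving IPS coverage and accuracy.</p>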
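        <p>The scoring rule, expected uncertainty reduction minus weighted user effort, can be illustrated as follows. The candidate prompts and their gain and effort values are invented for the example, and the class and function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    gain: float    # expected uncertainty reduction, in [0, 1]
    effort: float  # estimated user effort, in [0, 1]


def utility(c: Candidate, lam: float) -> float:
    """Gain minus effort weighted by the trade-off parameter."""
    return c.gain - lam * c.effort


def rank(cands: list[Candidate], lam: float = 0.5) -> list[Candidate]:
    """Order candidates from highest to lowest utility."""
    return sorted(cands, key=lambda c: utility(c, lam), reverse=True)


# Made-up candidates: a cheap yes/no check, a costly free-text label,
# and a low-value confirmation.
candidates = [
    Candidate("Are you on Floor 2?", gain=0.6, effort=0.1),
    Candidate("What is this room called?", gain=0.8, effort=0.7),
    Candidate("Is this Building X?", gain=0.2, effort=0.1),
]
best_default = rank(candidates)[0]           # effort matters: yes/no wins
best_low_lam = rank(candidates, lam=0.1)[0]  # effort nearly free: label wins
```
]]></preformat>
        <p>The example shows why the trade-off parameter matters: with a weight of 0.5 the cheap floor check ranks first, while with a weight of 0.1 the more informative but more demanding labeling question does.</p>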
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Prompt Generation (How to Ask)</title>
        <p>Layer 3 formulates how to phrase the selected question, aiming for clarity, brevity, and contextual
appropriateness. Since prompt wording strongly affects user engagement, our framework combines
predefined templates with adaptive large language models (LLMs) to generate natural, tailored queries.
We employ three strategies (Figure 3):</p>
        <list list-type="bullet">
          <list-item><p>Template-Based: Direct patterns with placeholders, e.g., “Are you on Floor [X]?”, ensure consistency and low computational cost, ideal for straightforward yes/no or multiple-choice queries.</p></list-item>
          <list-item><p>LLM-Based: In complex or nuanced contexts, the system feeds situational data into an LLM to generate polite, context-aware questions like “This area isn’t labeled, do you know what it’s used for?”, enhancing engagement for open-ended prompts.</p></list-item>
          <list-item><p>Hybrid: Starts from a template but lets the LLM refine phrasing based on context, balancing reliability with conversational tone (e.g., after elevator use: “Just to confirm, are you now on Floor 5?”).</p></list-item>
        </list>
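        <p>This selection follows:
          <disp-formula><label>(5)</label><tex-math><![CDATA[\mathrm{Prompt}_{\mathrm{final}} = \begin{cases} \mathrm{Template}(q), & \text{if simple context}, \\ \mathrm{LLM}(c), & \text{if complex or novel context}, \\ \mathrm{Hybrid}(\mathrm{Template}, c), & \text{if needing nuance}, \end{cases}]]></tex-math></disp-formula>
where <inline-formula><tex-math>q</tex-math></inline-formula> is the selected question and <inline-formula><tex-math>c</tex-math></inline-formula> the situational context.</p>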
        <p>
          Localization ensures prompts use the user’s language and local terms (e.g., “ground floor” vs. “first
floor”), while personalization tailors tone by learning user preferences: some get concise checks, others
a friendly nudge. All LLM outputs undergo validation to avoid off-topic or verbose phrasing, falling
back to templates if necessary. In sum, Layer 3 blends deterministic and AI-generated approaches to
maximize user response quality, essential for high-fidelity data collection in the IPS.
        </p>
        <p>Figure 3: Prompt generation strategies (Template, Template + LLM, and LLM), each shown with example prompts. Other panel labels: Context Inputs, Missing Info, User State, and User Feedback (Yes/No, Label).</p>
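        <p>The three-way selection with validation and template fallback might look like the sketch below. The LLM is represented by a plain callable so the example runs without any external service; the stub functions, the template strings, the complexity labels, and the 20-word limit are assumptions for illustration and do not describe a specific GPT-4 API.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Callable, Optional

# Hypothetical templates keyed by the kind of information requested.
TEMPLATES = {"floor": "Are you on Floor {value}?", "poi": "Is this {value}?"}
MAX_WORDS = 20  # illustrative verbosity limit


def is_valid(text: str) -> bool:
    """Accept only short outputs that are phrased as a question."""
    words = text.split()
    return 0 < len(words) <= MAX_WORDS and text.strip().endswith("?")


def generate_prompt(kind: str, value: str, complexity: str,
                    llm: Optional[Callable[[str], str]] = None) -> str:
    """Pick template, hybrid, or LLM phrasing; fall back to the template."""
    template = TEMPLATES[kind].format(value=value)
    if complexity == "simple" or llm is None:
        return template
    if complexity == "nuanced":
        # Hybrid: the LLM only rephrases the deterministic template.
        request = f"Rephrase politely and briefly: {template}"
    else:
        # Complex or novel context: the LLM writes the question itself.
        request = f"Ask one short question to confirm the {kind}: {value}"
    candidate = llm(request)
    return candidate if is_valid(candidate) else template


def polite_llm(request: str) -> str:
    """Stub standing in for a well-behaved model."""
    return "Just to confirm, are you now on Floor 5?"


def rambling_llm(request: str) -> str:
    """Stub standing in for a model that returns verbose text."""
    return "word " * 50


hybrid = generate_prompt("floor", "5", "nuanced", llm=polite_llm)
fallback = generate_prompt("floor", "5", "nuanced", llm=rambling_llm)
```
]]></preformat>
        <p>In this run the hybrid call returns the rephrased question, while the verbose output fails validation and the plain template is used instead.</p>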
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Spatial Knowledge Graph Integration</title>
        <p>A core innovation of our framework is the Spatial Knowledge Graph (SKG), which provides semantic
context and consistency checks for prompting. The SKG is modeled as a directed multigraph
          <disp-formula><label>(6)</label><tex-math>G = (V, E),</tex-math></disp-formula>
where each node <inline-formula><tex-math>v \in V</tex-math></inline-formula> represents a spatial entity (room, floor, building), and each edge <inline-formula><tex-math>(v_i \rightarrow v_j) \in E</tex-math></inline-formula>
captures topological, containment, proximity, or functional relationships. This structure, initially
derived from floor plans or BIM data, is dynamically refined as users contribute new labels and relations
(see Layer 5).</p>
        <p>The SKG ensures prompts are contextually valid. For example, if a user is in Room B on Floor 1 of
Building X, the system knows nearby entities like Room C or the Library and might ask about these if
unlabeled, while avoiding irrelevant queries about distant locations such as Office G on Floor 2 unless
recent sensor data suggests a transition. This preserves user trust by preventing nonsensical prompts.</p>
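        <p>Formally, we compute a contextual relevance score
          <disp-formula><label>(7)</label><tex-math>\mathrm{Rel}(v_{\mathrm{candidate}}, v_{\mathrm{current}})</tex-math></disp-formula>
that is high for short graph paths indicating close spatial or semantic proximity. The prompt strategy
engine only considers candidates satisfying <inline-formula><tex-math>\mathrm{Rel}(v, v_{\mathrm{current}}) &gt; \theta_{\mathrm{rel}}</tex-math></inline-formula>. This enables the system to prioritize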
questions about nearby or related entities. Moreover, the SKG aids in validating user feedback. If a
user labels a location inconsistently with known floor numbering or spatial layout, the system detects
such anomalies and may trigger follow-ups or delay integration. As users provide new place names or
relations, the SKG expands, enhancing prompt selection and interpretation. Thus, the SKG acts as the
semantic backbone of the framework, connecting individual user inputs into a coherent spatial model
that guides intelligent prompting.</p>
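        <p>A relevance score that is high for short graph paths can be sketched with a breadth-first search. The paper does not give the functional form, so the inverse-distance score used here, the threshold of 0.3, and the small example graph are assumptions. For brevity the graph is stored as a symmetric adjacency list rather than a typed directed multigraph.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import deque

# Hypothetical two-floor layout; every edge is listed in both directions.
SKG = {
    "Room B": ["Room C", "Elevator 1"],
    "Room C": ["Room B", "Library"],
    "Library": ["Room C"],
    "Elevator 1": ["Room B", "Elevator 2"],
    "Elevator 2": ["Elevator 1", "Office F"],
    "Office F": ["Elevator 2", "Office G"],
    "Office G": ["Office F"],
}


def hops(graph, start, goal):
    """Breadth-first search; returns the hop count or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def relevance(graph, candidate, current):
    """Inverse-distance score: 1 at the current node, smaller farther away."""
    d = hops(graph, current, candidate)
    return 0.0 if d is None else 1.0 / (1.0 + d)


def relevant_candidates(graph, current, candidates, threshold=0.3):
    """Keep only candidates whose relevance exceeds the threshold."""
    return [c for c in candidates if relevance(graph, c, current) > threshold]


nearby = relevant_candidates(SKG, "Room B", ["Room C", "Library", "Office G"])
```
]]></preformat>
        <p>From Room B, Room C is one hop away and the Library two, so both pass the threshold; Office G is four hops away through the elevator and is filtered out.</p>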
      </sec>
      <sec id="sec-2-5">
        <title>2.5. User Interaction, Feedback Capture, and Continuous Learning</title>
        <p>Layer 4 ensures prompts are delivered unobtrusively through adaptive UI elements like overlays or
notifications that adjust to user context (larger text or speech output if walking, detailed widgets when
stationary) and favor simple one-tap inputs that achieved over 86% compliance in our evaluations.
Ignored or dismissed prompts are treated as signals, with deferred retries governed by back-off to
minimize fatigue. Multi-modal inputs are normalized and validated, with even partial or uncertain
responses used as weak evidence. Feedback immediately updates operational models, such as radio
maps and SKGs, while also populating engagement logs that inform Layer 5’s adaptive strategy. Here,
reinforcement learning and bandit algorithms dynamically adjust prompt types, timing, and phrasing,
balancing information gain against user burden. Validated inputs calibrate barometric floor detection,
refine semantic labels, and strengthen or adjust fingerprints, all weighted to mitigate erroneous or
malicious data. This integrated approach establishes a sustainable loop where user interactions
progressively enhance IPS accuracy and personalization, improving both functional reliability and user
experience over time.</p>
        <p>Figure 4: Example spatial knowledge graph spanning a First Level and a Second Level. Node labels: Elevator, Office S, Laboratory, Library, Office E, Office F, Office G, Room A, Room B, Room C, Cafeteria. Edge labels: near, pathway, up, below.</p>
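        <p>The bandit-style adaptation of prompt types can be illustrated with an epsilon-greedy learner. This is a generic textbook technique chosen for the sketch; the paper names bandit and reinforcement learning methods without fixing a particular algorithm, and the arm names, reward definition (1 for a response, 0 otherwise), and exploration rate are assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random


class PromptTypeBandit:
    """Epsilon-greedy choice among prompt types based on response rewards."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {a: 0 for a in self.arms}
        self.values = {a: 0.0 for a in self.arms}  # running mean reward
        self.rng = random.Random(seed)

    def select(self):
        """Explore with probability epsilon, otherwise exploit the best arm."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda a: self.values[a])

    def update(self, arm, reward):
        """Incrementally update the mean reward of the chosen arm."""
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n


demo = PromptTypeBandit(["yes_no", "multiple_choice", "text"], epsilon=0.0)
demo.update("yes_no", 1.0)  # user answered a yes/no prompt
demo.update("yes_no", 0.0)  # user ignored the next one
demo.update("text", 0.0)    # user ignored a free-text prompt
choice = demo.select()      # greedy pick, since epsilon is 0 in this demo
```
]]></preformat>
        <p>With exploration switched off for the demonstration, the learner picks the yes/no prompt, the only type with a non-zero observed response rate. A deployed system would keep a small positive exploration rate so that other prompt types continue to be tried.</p>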
      </sec>
      <sec id="sec-2-6">
        <title>2.6. Experimental Setup</title>
        <p>Because deploying a live system to a large user base was beyond the scope of this initial study, we built
a simulation environment that models user movement, sensor readings, and user behavior in response
to prompts. The simulated environment comprised a two-story building with 20 distinct locations of
interest (rooms, corridors, POIs), some fully labeled and some initially unknown. A set of 50 virtual
users was generated, each with a profile of responsiveness and movement patterns.</p>
        <p>Environment Model: Each floor of the building was represented by a grid with certain nodes designated as named locations
(e.g., Lab, Office, Elevator area). A Wi-Fi radio map was synthetically generated: certain grid points had
associated Wi-Fi fingerprints, and signal strengths were perturbed with noise. Initially, about 60% of
the grid had sufficient fingerprint data; the rest were “sparse” areas (to test how the system handles low
coverage). Semantic labels for about half of the points of interest were provided, while others started as
unknown to simulate missing information. We also defined a spatial knowledge graph for the building
(similar to Figure 4), encoding which rooms were connected or adjacent.</p>
        <p>Simulated User Trajectories: Each user was assigned random start points and movement patterns. Some followed regular routes
(e.g., repeatedly going from the entrance to a particular office), while others roamed more randomly.
We introduced events like elevator usage (to test floor transitions) and pauses (to simulate stopping in a
hallway or room for a while). The simulation ticked in discrete time-steps, and at each step, each user
could either move to a neighboring cell, stay still, or change activity (some users “checked their phone”
at certain intervals, etc.).</p>
        <p>User Prompt Response Model: We modeled user behavior in a probabilistic
manner. Each user had a base willingness to respond to prompts (ranging from 20% to 90% chance to
respond when prompted, reflecting different engagement levels). This probability was dynamically
adjusted by factors such as current activity (if the user was “busy” in the simulation, response probability
dropped near 0), prompt frequency (if they had been prompted recently, probability dropped due to
fatigue), and prompt type (we assumed users are more likely to respond to easier prompts: yes/no &gt;
multiple choice &gt; text input). If a user decided to respond, the response content was generated based
on ground truth with some chance of error (e.g., 5% chance they hit the wrong button or gave a wrong
label by mistake, adding noise).</p>
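        <p>A response model of this kind could be written as follows. Only the 20% to 90% base range, the ordering of prompt types, and the 5% error rate come from the description above; the multiplicative form, the per-type factors, the 30-minute fatigue window, and setting the busy probability to exactly zero are simplifying assumptions of this sketch.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random

# Illustrative factors preserving the ordering yes/no > multiple choice > text.
TYPE_FACTOR = {"yes_no": 1.0, "multiple_choice": 0.8, "text": 0.5}


def response_probability(base, busy, minutes_since_last, prompt_type,
                         fatigue_window=30.0):
    """Scale a user's base willingness by activity, fatigue, and prompt type."""
    if busy:
        return 0.0  # the paper says "near 0"; exactly 0 is a simplification
    fatigue = min(1.0, minutes_since_last / fatigue_window)
    return base * fatigue * TYPE_FACTOR[prompt_type]


def simulate_response(rng, truth, alternatives, p_respond, p_error=0.05):
    """Return None if the prompt is ignored, else a possibly wrong label."""
    if rng.random() >= p_respond:
        return None
    if alternatives and rng.random() < p_error:
        return rng.choice(alternatives)  # slip: wrong button or wrong label
    return truth


rng = random.Random(42)
p = response_probability(base=0.8, busy=False, minutes_since_last=60,
                         prompt_type="yes_no")
answer = simulate_response(rng, truth="Lab", alternatives=["Office"],
                           p_respond=p)
```
]]></preformat>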
      </sec>
      <sec id="sec-2-7">
        <title>2.7. Prompt Triggering Performance</title>
        <p>We evaluated the framework’s ability to trigger prompts precisely when needed, guided by the Layer 1
situation assessment that ensures prompts are issued at the right locations and times—specifically where
data gaps exist and the user context is appropriate. In our simulation, the adaptive system generated
only 420 prompts, compared to 1200 by the naive baseline, yet it effectively targeted approximately 75%
of critical missing information opportunities. By contrast, the baseline covered only about 50%, often
wasting prompts at irrelevant or poorly timed moments. These findings underscore the importance of
Layer 1’s context-sensitive logic in delivering prompts when they are most likely to yield informative,
non-intrusive responses. Future real-world trials will be essential to measure actual response rates and
verify that this context-driven approach sustains high engagement and data quality beyond simulation.</p>
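        <p>The reported figures can be put side by side with a short calculation. The prompt counts and coverage percentages are those stated above; the derived ratio is not reported in the paper and assumes that both systems faced the same set of missing-information opportunities.</p>
        <preformat preformat-type="code"><![CDATA[
```python
adaptive_prompts, baseline_prompts = 420, 1200
adaptive_coverage, baseline_coverage = 0.75, 0.50

# Fraction of prompts saved relative to the naive baseline.
prompt_reduction = 1 - adaptive_prompts / baseline_prompts

# Coverage obtained per prompt, adaptive relative to baseline.
relative_yield = (adaptive_coverage / adaptive_prompts) / (
    baseline_coverage / baseline_prompts
)

print(f"prompt reduction: {prompt_reduction:.0%}")
print(f"relative coverage per prompt: {relative_yield:.2f}x")
```
]]></preformat>
        <p>Under that assumption, the adaptive system issues 65% fewer prompts and covers roughly 4.3 times as many critical opportunities per prompt.</p>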
      </sec>
      <sec id="sec-2-8">
        <title>2.8. Effect of the Spatial Knowledge Graph on Prompt Generation</title>
        <p>The integration of the Spatial Knowledge Graph (SKG) proved critical in enhancing both the relevance
and clarity of generated prompts. By encoding topological, containment, and proximity relationships
among spatial entities—such as rooms, corridors, floors, and POIs—the SKG ensures that prompts are
tightly coupled to the user’s immediate context. As shown in Table 1, without SKG support, the system
might issue generic or even spatially inconsistent queries (for example, asking “Is this Room 305?”
when the user is actually on Floor 2 where only 200-series rooms exist). In contrast, SKG-informed
prompts leverage the graph to filter candidate questions to those that make semantic sense, such
as distinguishing between Office 201 or 202 when the user is known to be on Floor 2. Moreover,
the SKG enables more natural and persuasive phrasing by incorporating nearby landmarks into the
question. For instance, instead of a blunt “What is the name of this room?”, the system can ask “Near the
Library—could you tell us what this adjacent room is called?”, which not only improves clarity but also
signals to the user that the system understands the environment. This contextual anchoring was also
instrumental in validating user feedback; when a response conflicted with known SKG structures (such
as labeling a space “Room 101” on a floor without 100-series rooms), the system could gracefully initiate
follow-up checks. Overall, the SKG improved prompt targeting by reducing irrelevant or confusing
questions, increased linguistic adaptability by embedding local context into phrasing, and thereby
directly contributed to higher response quality and user trust.</p>
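        <p>The consistency check on user feedback can be sketched as a comparison between the room-number series in a submitted label and the series known for the current floor. The numbering convention (hundreds digit equals floor), the lookup table, and the function names are hypothetical; a real SKG would supply the floor associations.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re
from typing import Optional

# Hypothetical mapping: floor number -> room-number series seen on that floor.
FLOOR_SERIES = {1: {1}, 2: {2}, 3: {3}}


def room_series(label: str) -> Optional[int]:
    """Extract the hundreds digit of a three-digit room number, if present."""
    match = re.search(r"\b(\d{3})\b", label)
    return int(match.group(1)) // 100 if match else None


def check_label(label: str, floor: int) -> str:
    """Return 'accept' or 'follow_up' for a user-supplied label."""
    series = room_series(label)
    if series is None:
        return "accept"  # no room number to cross-check
    if series in FLOOR_SERIES.get(floor, set()):
        return "accept"
    return "follow_up"   # conflicts with known floor numbering


conflict = check_label("Room 101", floor=2)
consistent = check_label("Office 201", floor=2)
```
]]></preformat>
        <p>A follow-up result would trigger a clarifying question or delay integration of the label rather than reject it outright.</p>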
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Conclusion</title>
      <p>In this paper, we proposed the first framework for unobtrusive active user intervention in
crowd-powered indoor positioning systems through intelligent prompt generation. Our innovative approach
combines large language models (LLMs) and a Spatial Knowledge Graph (SKG) to dynamically determine
not only what to ask users, but also how and when to do so, ensuring prompts are contextually relevant,
timely, minimally disruptive, and ultimately user-friendly. The framework explicitly incorporates
multiple essential aspects, including spatial semantics for consistency, adaptive language for natural
engagement, fatigue-aware pacing, and personalized prompting strategies, all designed to maximize the
quality and quantity of user contributions without imposing undue burden. To preliminarily evaluate
our approach, we conducted a set of simulation experiments that demonstrated how the framework
effectively triggers prompts at appropriate moments and locations, while highlighting the added value
of the SKG in guiding both prompt selection and phrasing. These early results indicate the potential of
our design to achieve higher engagement rates and more targeted data collection compared to naive
prompting methods. Looking ahead, this work lays the foundation for an ongoing and more extensive
investigation. We plan to deploy the framework in real-world environments and perform rigorous
assessments of its impact on user response rates, positioning accuracy, and overall system improvement,
complemented by user surveys to gauge subjective experience. Through this line of research, we aim to
establish a robust pathway for leveraging intelligent, human-in-the-loop interactions to continuously
refine and enhance indoor positioning systems at scale.</p>
      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Scenario</th><th>Prompt without SKG</th><th>Prompt with SKG</th></tr>
          </thead>
          <tbody>
            <tr><td>User near Library entrance, unlabeled adjacent room</td><td>“What is the name of this room?”</td><td>“Near the Library, could you tell us what this adjacent room is called?”</td></tr>
            <tr><td>User on Floor 2 in Office cluster, but ambiguous about exact office</td><td>“Is this Room 305?” (possible mismatch; Room 305 may be on another floor)</td><td>“In this office area on Floor 2, is this Office 201 or 202?” (filtered by SKG floor associations)</td></tr>
            <tr><td>User standing in a corridor next to a Cafeteria</td><td>“What type of place is this?”</td><td>“Next to the Cafeteria, would you classify this space as a corridor or seating area?”</td></tr>
            <tr><td>User feedback says “Room 101” on a floor without 100-series rooms</td><td>Accepts the label without question, risking inconsistency</td><td>Follows up: “We usually see Room 101 on Floor 1. Could you check the floor number?”</td></tr>
            <tr><td>User stopped at an unlabeled POI on Floor 3</td><td>“Please provide a label for this place.”</td><td>“On Floor 3 near the elevators, do you know what this place is called or used for?”</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-4">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) employed ChatGPT-4o and Grammarly to improve
writing quality, grammar, and language, as well as for general proofreading. All ideas, methodologies,
and architectural frameworks have been independently developed and proposed by the author(s).
After utilizing these tools, the author(s) carefully reviewed and refined the content and assume full
responsibility for the integrity and accuracy of the publication.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Pérez-Navarro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Torres-Sospedra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Montoliu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Conesa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Berkvens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Caso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Costa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Dorigatti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Hernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Knauth</surname>
          </string-name>
          , et al.,
          <article-title>Challenges of fingerprinting in indoor positioning and navigation</article-title>
          , in:
          <source>Geographical and Fingerprinting Data to Create Systems for Indoor Positioning and Indoor/Outdoor Navigation</source>
          , Elsevier,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mansour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Suns: A user-friendly scheme for seamless and ubiquitous navigation based on an enhanced indoor-outdoor environmental awareness approach</article-title>
          ,
          <source>Remote Sensing</source>
          <volume>14</volume>
          (
          <year>2022</year>
          )
          <fpage>5263</fpage>
          . URL: https://doi.org/10.3390/rs14205263.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E. S.</given-names>
            <surname>Lohan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Torres-Sospedra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Leppäkoski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Richter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Huerta</surname>
          </string-name>
          ,
          <article-title>Wi-fi crowdsourced fingerprinting dataset for indoor positioning</article-title>
          ,
          <source>Data</source>
          <volume>2</volume>
          (
          <year>2017</year>
          )
          <fpage>32</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mansour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Everywhere: A framework for ubiquitous indoor localization</article-title>
          ,
          <source>IEEE Internet of Things Journal</source>
          <volume>10</volume>
          (
          <year>2023</year>
          )
          <fpage>5095</fpage>
          -
          <lpage>5113</lpage>
          . URL: https://doi.org/10.1109/JIOT.2022.3222003.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ayub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shiraz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ullah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Qureshi</surname>
          </string-name>
          ,
          <article-title>Traffic efficiency models for urban traffic management using mobile crowd sensing: A survey</article-title>
          ,
          <source>Sustainability</source>
          <volume>13</volume>
          (
          <year>2021</year>
          )
          <fpage>13068</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>