<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Just-in-Time Interfaces for eXtended</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Luca Cordioli</string-name>
          <email>luca.cordioli@polimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Supervisor: Maristella Matera</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Reality, Human-Computer Interaction, Proactive Agents</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Large Language Models</institution>
          ,
          <addr-line>Multimodal Interaction, Memory-Augmented Agents, Just-in-Time Interfaces, Extended</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Politecnico di Milano</institution>
          ,
          <addr-line>Piazza Leonardo da Vinci 32, 20133 Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <fpage>6</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>This doctoral research investigates multimodal, memory-augmented agents for Extended Reality (XR) to address current deficiencies in integrating generative AI, such as Large Language Models (LLMs), for improving the interaction through multimodality, personalization, and adaptivity. The research wants to investigate how to synthesize just-in-time, context-aware interfaces and integrate persistent memory systems. The primary objective is to identify models and technological architectures enabling the development of proactive, embodied XR assistants capable of mitigating cognitive load and enhancing user interaction through dynamic user interface synthesis, memory architectures, and anticipatory agent behaviors.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Reality⋆</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Over the past few years, Large Language Models (LLMs) have rapidly evolved till reaching multimodal
capabilities, with models that accept both image and text inputs and achieve human-level performance
on professional benchmarks [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In parallel, the Extended Reality (XR) ecosystem is expanding rapidly,
enabling spatial, embodied, and perceptual interaction. These technological advancements are enabling
new forms of interaction by enhancing the ability to understand user intents, interpret usage contexts,
and adapt interaction paradigms accordingly. Research in Human-Computer Interaction (HCI) has long
argued for interaction paradigms presenting only the controls required in situ. Thanks to the new AI
models’ capability, the concept of ephemeral user interfaces, composed of UI elements “intentionally
created to last for a limited time only” and tailored to their physical or digital milieu, now captures this
aspiration [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Nonetheless, most LLM systems are still limited to fixed chat window interactions.
      </p>
      <p>
        Recent proposals suggest evolving traditional language assistants into agentic systems capable of
perceiving environments and performing actions through dynamic interfaces, such as Rabbit’s
GUIfurther facilitates this shift by standardizing inter-agent memory and tool interoperability [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>The convergence of advanced AI, XR technologies, and action-oriented agents opens new
opportunities for adaptive, just-in-time interfaces that cab be centered on human needs while automating tasks.
My research explores this space by developing proactive, multimodal XR agents that interpret user
input, generate context-aware ephemeral interfaces, and personalize interactions through structured
long-term memory. This approach aims to anticipate user intent, adapt to context, and reduce cognitive
load for more natural interactions.</p>
      <p>This paper discusses the motivations and the main steps of this research, and is organized as follows:
Section 2 discusses the motivations and the objectives of my research. Section 3 illustrates the main
related work, and Section 4 proposes research directions to address current lacks. Section 5 finally
draws the conclusions.</p>
      <p>CEUR
Workshop</p>
      <p>ISSN1613-0073</p>
    </sec>
    <sec id="sec-3">
      <title>2. Motivations and Objectives</title>
      <p>
        This research positions XR as a hub for multimodal interaction, integrating gestures, voice, and spatial
context to enable personalized, adaptive interfaces beyond purely verbal interaction, as discussed below.
Limitations of Verbal-Only Interaction. Current interactions with LLMs predominantly rely on
text-based conversational interfaces. Although intuitive for basic information retrieval tasks, relying
solely on textual or verbal communication introduces significant cognitive load by forcing users
to serialize complex, multimodal intentions into linear text [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This inadequacy is emphasized in
XR scenarios, where traditional input modalities (e.g., text or speech) often fail to support efective
interaction due to spatial, cognitive, and contextual challenges [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        Decades of research in HCI have consistently demonstrated that multimodal interfaces, combining
speech, gesture, spatial visualization, and tactile feedback, substantially reduce cognitive load, lower
error rates, and accelerate task completion compared to unimodal interactions [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <sec id="sec-3-1">
        <title>Shallow Personalization and Lack of Adaptivity. Existing LLM deployments exhibit limited</title>
        <p>
          personalization, typically treating interactions as isolated episodes without meaningful longitudinal
adaptation. Current models lack stable and structured long-term memory, hindering persistent user
profiling and contextual awareness [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
        <p>
          In XR, deep adaptivity is essential. Multimodal inputs enable rich user-agent interaction, yet current
eforts—like cognitive load–aware interfaces [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] and adaptive training systems [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]—lack long-term
preference learning. Embedding structured, user-controlled memory remains a key research challenge.
Restricted Interaction Flexibility in XR Environments. Current XR systems struggle with
integrating and semantically enriching multimodal inputs. While modalities like gesture, speech, and gaze
are common, they remain limited to rigid vocabularies and shallow interpretations. Surveys highlight
that collaboration in AR/VR/MR still relies on inflexible paradigms [ 11]. Prototypes such as LLMR show
promise by using language models to generate XR scenes, but remain weak in real-world use due to
unpredictability, limited precision, and performance constraints [12]. Unlocking XR’s full potential
requires multimodal frameworks that fuse inputs in real-time and adapt to user context and intent.
        </p>
        <p>These gaps call for rethinking LLM interactions: adaptive multimodal interfaces, dynamically shaped
by agentic LLMs and grounded in persistent, user-controlled memory, are key to building truly
humancentered AI in XR.</p>
      </sec>
      <sec id="sec-3-2">
        <title>2.1. Research Objectives</title>
        <p>My project pursues three tightly coupled goals. First, I will formalize ephemeral interface synthesis as
a constrained generation problem in which an AI agent assembles just-in-time multimodal widgets,
building on early evidence that language models can already reason about user intent to shape
taskspecific UIs [ 13]. Second, I will endow the agent with a privacy-aware, persistent memory substrate that
combines vector retrieval with symbolic policies, extending recent long-term-memory architectures
such as MemoryBank and LongMem that demonstrate continual recall and update across sessions
[14, 15]. Third, I will integrate these capabilities into an embodied, proactive assistant that leverages
spatial, visual, and verbal cues native to XR to anticipate user needs and autonomously execute actions,
drawing on design principles from proactive context-aware chatbots [16].</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Related Work</title>
      <p>LLM-Generated Interfaces. Recent research has explored leveraging LLMs to dynamically generate
user interfaces tailored to user intentions and context. The Bespoke system utilizes LLM agents
to produce just-in-time interfaces by directly reasoning about user intent, enhancing task-specific
interactions [13]. Similarly, the Large Language User Interface (LAUI) paradigm leverages LLMs to
proactively suggest adaptive interface elements, anticipating user needs and dynamically adjusting
interactions [17]. OpenAI’s Operator illustrates practical applications, where an LLM autonomously
navigates existing web interfaces to accomplish user-defined tasks, demonstrating both feasibility and
robustness in real-world scenarios. ReactGenie complements these approaches by leveraging LLMs to
interpret multimodal (voice + touch) user commands and map them to developer-defined UI components
and state abstractions, enabling rich, context-aware interactions with minimal development efort [ 18].
Most recently, Cao et al. introduced a model-driven pipeline for generative and malleable user interfaces,
where task-specific data models generated via LLMs serve as the foundation for dynamically constructed
and user-customizable UIs [19].</p>
      <p>These systems provide important contributions to prove the feasibility of generating intent-based
interfaces felxibly. However, they remain limited to desktop and web-based interactions, lack
continuous real-time adaptations, and do not address immersive spatial afordances or long-term user
personalization, all of which my research aims to fill.</p>
      <p>Memory Systems for LLMs. Persistent memory systems are vital for long-term personalization and
adaptivity in LLM interactions. MemoryBank implements structured long-term memory, allowing LLMs
to recall and utilize relevant user-specific and contextual information [ 14]. Similarly, Jones et al. [16]
demonstrate the importance of contextual and persistent user information in a proactive, goal-oriented
chatbot. Their system continuously integrates environmental and personal context (e.g., location, time,
weather, user goals) to proactively recommend personalized actions, highlighting how persistent and
adaptive memory can improve user engagement and efectiveness in long-term goal pursuit.</p>
      <p>Despite their advances, these memory systems primarily handle textual data, lack multimodal
integration, and ofer limited user control over memory retention policies. My research extends these
approaches to include structured multimodal memory with explicit user control in XR environments.</p>
      <sec id="sec-4-1">
        <title>Interaction in XR Environments and Multimodal Agents. Integrating LLM-based agents into XR</title>
        <p>environments significantly enhances multimodal interaction capabilities. Systems like LLMR illustrate
how LLMs can interpret natural language commands to dynamically generate and manipulate complex
virtual scenes, improving intuitiveness and accessibility in 3D content creation [12]. XaiR employs
multimodal capabilities, combining vision and language processing to provide real-time contextual
guidance in augmented reality (AR) tasks, markedly improving task performance and eficiency [ 20].</p>
        <p>However, most systems rely on predefined tasks, lack persistent multimodal memory, and seldom
include proactive behaviors. My approach, instead, aims to combine ephemeral interfaces, structured
memory, and anticipation in immersive XR.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Towards Just-in-Time Multimodal Interfaces</title>
      <p>Our proposed approach aims to overcome the limitations of existing text-based and basic personalized
interactions by developing adaptive agentic systems that dynamically generate just-in-time multimodal
interfaces. The core concept integrates adaptive interfaces, personalized contextual memory, and
proactive anticipation to deliver a deeply personalized and context-aware user experience, particularly
tailored for XR environments. The envisioned AI agent will use multimodal inputs to synthesize
context-aware interfaces in real time, adapting to tasks and environments. Interaction devices like
smart glasses ofer an ideal platform for continuous sensing and interaction.</p>
      <p>Grounded in HCI, the research will leverage user-centered design methods, using iterative prototyping
with XR technologies. The focus will be on developing and validating new multimodal interaction
mechanisms, including real-time semantic fusion of voice, gesture, and visual cues. To this end, the
research will adopt a research-through-design approach [21], with extensive user research to identify
challenges and needs in multimodal interaction, followed by empirical studies to assess usability,
interaction efectiveness, and cognitive load of the devised solutions.</p>
      <p>The research will adopt in particular simulated environments to test fine-grained mechanisms in
anticipatory agent behaviors in controlled, context-rich scenarios, enabling systematic evaluation of
relevant interaction patterns. This methodology will support the development of interaction paradigms
that are responsive but still unobtrusive. It will also help identify meaningful scenarios in which the
new technology can help address relevant user needs.</p>
      <p>Currently, we are conducting a first user study to validate initial design choices developed in a
prototype that integrates multimodal LLM interactions within immersive XR environments [22]. The
prototype allows interactions via gestures, voice, and text, providing users with immediate, contextual
responses within industrial training and maintenance scenarios. The user study, in particular, aims to
evaluate the efectiveness of diferent XR interactions, as illustrated in Figure 1:.</p>
      <p>1. Menu-based interaction: users manually navigate a menu to search for 3D models and
documentation.
2. Infographic-based interaction: relevant 3D models are pre-loaded in the XR space, and users
can access detailed infographics displaying characteristics by interacting directly with the models.
3. AI-driven interaction: an intelligent agent retrieves and contextualizes information from
documentation, highlights relevant 3D models within the environment, and answers
contextspecific questions related to these models.</p>
      <p>Early results show that the AI assistant improves task eficiency and reduces cognitive load by
operating directly within the 3D space and ofering contextual support. Even if preliminary, these outcomes
suggest that AI integration into the XR environment — beyond traditional chatbot-style interaction —
ofers opportunities to enhance user experience in complex tasks, paving the way for more immersive
and adaptive interaction paradigms. These results also highlight interesting research directions for
advancing the definition of multimodal interaction paradigms along three main technological directions:
• Multimodal interfaces generated by LLMs. Using AI to dynamically synthesize multimodal
interfaces, based on both explicit user requests and inferred needs, can help generate ephemeral
graphical overlays. If integrated into the interactive environment via XR devices, like smart
glasses, this technique can prioritize visual interaction over text to reduce cognitive load.
• Memory and contextual customization. This direction explores persistent, structured memory
for deep personalization. By combining graph-based retrieval with symbolic reasoning, agents
can track and recall contextual data—visual inputs, conversations, behaviors—enabling tailored,
evolving interactions with increasing precision over time.
• Proactivity and anticipation in XR environments. This third direction focuses on proactive
and anticipatory behaviors in XR. By analyzing multimodal inputs, visual cues, gestures, spatial
data, and conversations, the agent can suggest interfaces or actions in real time. The design of
proactive, anticipatory agents ca be grounded in cognitive science, which identifies expectation,
goal modeling, and anticipatory representations as key drivers of adaptive and emotionally
resonant interactions [23].</p>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion</title>
      <p>The research illustrated in this paper aims to introduce an innovative paradigm for adaptive multimodal
interaction by integrating intelligent agents into XR environments. The project is currently in its
early stages, adopting a user-centered approach to systematically identify user needs and understand
interaction challenges in complex multimodal XR contexts.</p>
      <p>Given the limited body of research on LLM-based multimodal interaction in XR, the initial focus
is on characterizing the properties of the emerging interaction style described. Achieving this goal
requires extensive user studies and the formulation of new conceptual models and design principles.</p>
      <p>Future work will involve iterative empirical investigations to collect data and insights that will
drive the refinement of the design framework. The research aims to develop generalized models and
actionable design principles that can guide the creation of intuitive, adaptive user experiences across a
wide range of contexts and user needs.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author used tools such as OpenAI’s GPT and Grammarly to assist in refining the text’s grammar
and clarity. After using these tools, the author reviewed and edited the content as needed and takes full
responsibility for the publication’s content.
[11] A. Schäfer, G. Reis, D. Stricker, A Survey on Synchronous Augmented, Virtual, andMixed Reality
Remote Collaboration Systems, ACM Computing Surveys 55 (2023) 1–27. URL: https://dl.acm.org/
doi/10.1145/3533376. doi:10.1145/3533376.
[12] F. De La Torre, C. M. Fang, H. Huang, A. Banburski-Fahey, J. Amores Fernandez, J. Lanier, LLMR:
Real-time Prompting of Interactive Worlds using Large Language Models, in: Proceedings of the
CHI Conference on Human Factors in Computing Systems, ACM, Honolulu HI USA, 2024, pp.
1–22. URL: https://dl.acm.org/doi/10.1145/3613904.3642579. doi:10.1145/3613904.3642579.
[13] P. Nandy, S. O. Adalgeirsson, A. K. Sinha, T. Kraljic, M. Cleron, L. Shi, A. Singh, A. Chaudhary,
A. Ganti, C. A. Melancon, S. Zhang, D. Robishaw, H. Ciurdar, J. Secor, K. A. Robertsen, K. Climer,
M. Le, M. Venkatesan, P. Chi, P. Li, P. F. McDermott, R. Shim, S. Onsan, S. Vaishnav, S. Guamán,
Bespoke: Using LLM agents to generate just-in-time interfaces by reasoning about user intent,
in: Companion Proceedings of the 26th International Conference on Multimodal Interaction,
ACM, San Jose Costa Rica, 2024, pp. 78–81. URL: https://dl.acm.org/doi/10.1145/3686215.3688372.
doi:10.1145/3686215.3688372.
[14] W. Zhong, L. Guo, Q. Gao, H. Ye, Y. Wang, MemoryBank: Enhancing Large Language Models
with Long-Term Memory, 2023. URL: http://arxiv.org/abs/2305.10250. doi:10.48550/arXiv.2305.
10250, arXiv:2305.10250 [cs].
[15] W. Wang, L. Dong, H. Cheng, X. Liu, X. Yan, J. Gao, F. Wei, Augmenting Language Models with
LongTerm Memory, 2023. URL: http://arxiv.org/abs/2306.07174. doi:10.48550/arXiv.2306.07174,
arXiv:2306.07174 [cs].
[16] B. Jones, Y. Xu, Q. Li, S. Scherer, Designing a Proactive Context-Aware AI Chatbot for People’s
Long-Term Goals, in: Extended Abstracts of the CHI Conference on Human Factors in Computing
Systems, ACM, Honolulu HI USA, 2024, pp. 1–7. URL: https://dl.acm.org/doi/10.1145/3613905.
3650912. doi:10.1145/3613905.3650912.
[17] S. M. Wasti, K. Q. Pu, A. Neshati, Large Language User Interfaces: Voice Interactive User Interfaces
powered by LLMs, 2024. URL: http://arxiv.org/abs/2402.07938. doi:10.48550/arXiv.2402.07938,
arXiv:2402.07938 [cs].
[18] J. J. Yang, Y. Shi, Y. Zhang, K. Li, D. W. Rosli, A. Jain, S. Zhang, T. Li, J. A. Landay, M. S. Lam,
ReactGenie: A Development Framework for Complex Multimodal Interactions Using Large Language
Models, in: Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024,
pp. 1–23. URL: http://arxiv.org/abs/2306.09649. doi:10.1145/3613904.3642517, arXiv:2306.09649
[cs].
[19] Y. Cao, P. Jiang, H. Xia, Generative and Malleable User Interfaces with Generative and Evolving
Task-Driven Data Model, in: Proceedings of the 2025 CHI Conference on Human Factors in
Computing Systems, ACM, Yokohama Japan, 2025, pp. 1–20. URL: https://dl.acm.org/doi/10.1145/
3706598.3713285. doi:10.1145/3706598.3713285.
[20] S. Srinidhi, E. Lu, A. Rowe, XaiR: An XR Platform that Integrates Large Language Models with the
Physical World, in: 2024 IEEE International Symposium on Mixed and Augmented Reality (ISMAR),
IEEE, Bellevue, WA, USA, 2024, pp. 759–767. URL: https://ieeexplore.ieee.org/document/10765376/.
doi:10.1109/ISMAR62088.2024.00091.
[21] J. Zimmerman, J. Forlizzi, S. Evenson, Research through design as a method for interaction design
research in hci, in: Proceedings of the SIGCHI conference on Human factors in computing systems,
2007, pp. 493–502.
[22] L. Cordioli, M. Valoriani, M. Matera, Integrating large language models into extended reality
environments for enhanced user experiences, Submitted for publication (2025). Manuscript
submitted for publication.
[23] M. Miceli, C. Castelfranchi, A. Ortony, Expectancy and emotion, Series in afective science, Oxford
University Press, New York, NY, 2015.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>OpenAI</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Achiam</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Adler</surname>
          </string-name>
          , et al.,
          <source>GPT-4 Technical Report</source>
          ,
          <year>2024</year>
          . URL: http://arxiv.org/abs/2303. 08774. doi:
          <volume>10</volume>
          .48550/arXiv.2303.08774, arXiv:
          <fpage>2303</fpage>
          .08774 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Döring</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sylvester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <article-title>Ephemeral user interfaces: valuing the aesthetics of interface components that do not last</article-title>
          ,
          <source>Interactions</source>
          <volume>20</volume>
          (
          <year>2013</year>
          )
          <fpage>32</fpage>
          -
          <lpage>37</lpage>
          . URL: https://dl.acm.org/doi/10.1145/ 2486227.2486235. doi:
          <volume>10</volume>
          .1145/2486227.2486235.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Inc</surname>
          </string-name>
          .,
          <article-title>A peek into rabbit's progress with lam playground</article-title>
          ,
          <year>2024</year>
          . URL: https://www.rabbit.tech/ research/a
          <article-title>-peek-into-rabbit-s-progress-with-LAM-playground</article-title>
          , accessed:
          <fpage>2025</fpage>
          -05-29.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>X.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <surname>Model Context Protocol (MCP): Landscape</surname>
            ,
            <given-names>Security</given-names>
          </string-name>
          <string-name>
            <surname>Threats</surname>
          </string-name>
          , and Future Research Directions,
          <year>2025</year>
          . URL: http://arxiv.org/abs/2503.23278. doi:
          <volume>10</volume>
          .48550/arXiv. 2503.23278, arXiv:
          <fpage>2503</fpage>
          .23278 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Oviatt</surname>
          </string-name>
          ,
          <article-title>Ten myths of multimodal interaction</article-title>
          ,
          <source>Commun. ACM</source>
          <volume>42</volume>
          (
          <year>1999</year>
          )
          <fpage>74</fpage>
          -
          <lpage>81</lpage>
          . URL: https: //doi.org/10.1145/319382.319398. doi:
          <volume>10</volume>
          .1145/319382.319398.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Billinghurst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A survey of augmented reality, Foundations and Trends® in Human-Computer Interaction 8 (</article-title>
          <year>2015</year>
          )
          <fpage>73</fpage>
          -
          <lpage>272</lpage>
          . URL: http://dx.doi.org/10.1561/1100000049. doi:
          <volume>10</volume>
          .1561/1100000049.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Oviatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <article-title>Aims and Advantages of Multimodal Interfaces, in: The Paradigm Shift to Multimodality in Contemporary Computer Interfaces</article-title>
          , Springer International Publishing, Cham,
          <year>2015</year>
          , pp.
          <fpage>17</fpage>
          -
          <lpage>25</lpage>
          . URL: https://link.springer.com/10.1007/978-3-
          <fpage>031</fpage>
          -02213-
          <issue>5</issue>
          _3. doi:
          <volume>10</volume>
          .1007/ 978- 3-
          <fpage>031</fpage>
          - 02213-
          <issue>5</issue>
          _3,
          <string-name>
            <surname>series</surname>
            <given-names>Title</given-names>
          </string-name>
          :
          <article-title>Synthesis Lectures on Human-Centered Informatics</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Shan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <source>Cognitive Memory in Large Language Models</source>
          ,
          <year>2025</year>
          . URL: http://arxiv.org/abs/2504.02441. doi:
          <volume>10</volume>
          .48550/arXiv.2504.02441, arXiv:
          <fpage>2504</fpage>
          .02441 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Lindlbauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Feit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hilliges</surname>
          </string-name>
          ,
          <article-title>Context-Aware Online Adaptation of Mixed Reality Interfaces</article-title>
          ,
          <source>in: Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, ACM, New Orleans LA USA</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>147</fpage>
          -
          <lpage>160</lpage>
          . URL: https://dl.acm.org/doi/10.1145/3332165. 3347945. doi:
          <volume>10</volume>
          .1145/3332165.3347945.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bödding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Schriek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. W.</given-names>
            <surname>Maier</surname>
          </string-name>
          ,
          <article-title>A systematic review and meta-analysis of mixed reality in vocational education and training: examining behavioral, cognitive, and afective training outcomes and possible moderators</article-title>
          ,
          <source>Virtual Reality</source>
          <volume>29</volume>
          (
          <year>2025</year>
          )
          <article-title>44</article-title>
          . URL: https://link.springer.com/10. 1007/s10055-025-01118-z. doi:
          <volume>10</volume>
          .1007/s10055- 025- 01118- z.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>