<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Digital Twin Construction for Real-World Metaverse: A Case Study of a Collaborative Escape Game</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Renta Inoue</string-name>
          <email>inoue@rm2c.ise.ritsumei.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Keigo Hattori</string-name>
          <email>hattori@rm2c.ise.ritsumei.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hayato Iwasaki</string-name>
          <email>iwasaki@rm2c.ise.ritsumei.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fumihiko Nakamura</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Asako Kimura</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fumihisa Shibata</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>APMAR 24: The 16th Asia-Pacific Workshop on Mixed and Augmented Reality</institution>
          ,
          <addr-line>Nov. 29-30, 2024, Kyoto</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>College of Information Science and Engineering, Ritsumeikan University</institution>
          ,
          <addr-line>Osaka</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We are planning to construct a Mixed Reality (MR) campus that realizes a real-world metaverse by merging physical and digital spaces. The MR campus is built as a Digital Twin (DT) that mirrors the real-world campus in Virtual Reality (VR) space, aiming to enable collaborative work between users in both spaces. In this study, we conducted a basic investigation toward the construction of the DT through the production of a collaborative escape room game using an asymmetrical MR environment. In the game, rooms in real space are reproduced as a DT in VR space. We confirmed that the MR user in the real space and the VR user immersed in the VR space with a differently sized virtual body could see each other and work together interactively to achieve their goals.</p>
      </abstract>
      <kwd-group>
        <kwd>Multi-user Collaboration</kwd>
        <kwd>Asymmetric Mixed Reality</kwd>
        <kwd>Metaverse</kwd>
        <kwd>Digital Twin</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Various methods of utilizing a Digital Twin (DT) in a
Mixed Reality (MR) environment have been proposed.
Among them, the real-world metaverse proposed by
Niantic is a new type of fusion of the metaverse and MR [1].
It aims to duplicate the real world into a virtual
environment, which can be accessed not only from the
real world but also from remote locations.</p>
      <p>By combining information in the real space with
digital information in the MR space, it is possible to
create a sense of presence as if the remote user actually
exists in the real space. We expect to see applications in
a wide range of fields, including business as well as
entertainment.</p>
      <p>For example, Zaman et al. developed a system that
enables remote users to interact with local users by
accessing a collaboration space in the real world and
demonstrated an improvement in the sense of presence
and the ability to perform tasks [2].</p>
      <p>Lee et al. have shown that sharing a host space with
others using 360-degree video and allowing independent
viewing ability for each user can lead to improved user
presence in collaborative tasks [3].</p>
      <p>In addition, a system utilizing asymmetric VR
environments has been proposed by Ibayashi et al. The
proposed system provides both an internal view using
VR and a top-down view through a table-type device for
architectural design. This system enables
communication through ceiling transparency and user
gestures [4].</p>
      <p>Cho et al. proposed an asymmetric VR environment
where participants can join from various platforms,
including PCs, mobile devices, AR, and VR. They found
that mobile devices and AR lack an immersive
experience, highlighting the importance of interface
design tailored to each platform [5].</p>
      <p>We envision an MR campus that reproduces the
real-world campus of the university as a DT in a VR
space. Users in remote locations can move around the
VR space constructed as the DT as avatars, while local
users, whose real-space positions and postures are
mapped into the VR space, can interact with them.
Therefore, in our MR campus, users in local and remote
locations can collaborate as if they were in the same
space through the interaction of both spaces via the DT.</p>
      <p>In order to realize the concept of the MR campus, in
this study we created and exhibited cross-reality (XR)
content using a real stage set (Figure 1 (a)) and its 3D
model (Figure 1 (b) and Figure 1 (c)) as a basic study. By
constructing a DT of the real space and implementing
object and user interactions between the two spaces, we
examined the challenges toward realizing the concept.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Challenges to realization</title>
      <p>One of the challenges in realizing the MR campus is
sharing the users' positions and postures. Specifically,
how to present the position and posture of a user in real
space to a user in VR space, and how to present a user in
VR space to a user in real space. In the former case, in
particular, there are many issues to be considered, such
as how to track and represent not only the position and
posture of the head but also the entire body. Other major
issues include aligning the two spaces and
synchronizing the positions and postures of objects.</p>
      <p>In this research, we constructed an asymmetrical MR
space in which the user in the real world is represented
by a life-sized avatar, while the user in the VR space is
represented by a scaled-down avatar. We created a
collaborative escape room game using this asymmetrical
MR space and considered how to share position and
posture between the two spaces, how to facilitate
interaction between users, and how to align the two
spaces. By framing the work as a game, we invited
members of the public who are unfamiliar with VR and
MR technologies to experience it and discussed the
challenges involved.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Overview of the game</title>
      <p>There are two players in the collaborative escape room
game using an asymmetrical MR environment. One
player, the human player, takes the role of a phantom
thief in the MR space, while the other player, the mouse
player, takes the role of a mouse in the VR space. They
interact with each other to complete tasks and aim to
escape from the room. The human player and the mouse
player can see each other, and their movements are
reflected in the other's space. The human player can
move by walking around in the real space, while the
mouse player can move by alternately shaking the
controllers.</p>
      <p>Additionally, the human player can grab and move
the mouse or items using hand-gesture operations
(Figure 2). Due to the difference in body size between the
human and mouse, tasks in narrow spaces are handled
by the mouse player, while tasks such as assisting the
mouse's movements or operating levers and other
mechanisms are handled by the human player, allowing
them to collaborate effectively.</p>
      <sec id="sec-3-1">
        <title>3.1. Flow of the experience</title>
        <p>First, both players start in a dark room. The human
player can turn on the ceiling light by touching a virtual
switch, allowing them to begin performing various tasks.
Next, both players must search for three items required
to escape from the room. There are three major
challenges, and by clearing each one, they can obtain the
item needed for escape. Three items must be acquired
within the time limit of 5 minutes.</p>
        <p>The first challenge is a maze (located inside the
transparent case at the center bottom in Figure 1 (c)).
After the human player finds the mouse, they pick it up
and carry it to the entrance of the maze. Since the maze
is too small for the human player to enter, the mouse
player navigates the maze under the guidance of the
human player. By pushing out the item inside the maze,
the mouse player makes it possible for the human player
to grab the item and obtain it.</p>
        <p>The second challenge involves pipes and handles
(Figure 3). Similar to the maze, only the mouse player
can enter the pipes. Two differently colored handles
exist as physical objects, and the colored pipes are
linked to the handle of the same color. By manipulating
these handles, the human player rotates the pipes to
create a path to the item. The mouse player moves
forward inside the pipes, collaborating with the human
player who manipulates the rotation of the pipes.</p>
        <p>The third challenge involves a lift and a lever (Figure
4). The item is placed near the ceiling, out of reach of the
human player. By moving the lever up and down, the lift
moves correspondingly. The mouse player gets on the
lift and the human player manipulates the lever to raise
the lift. If the lift moves near the ceiling, they can obtain
the item. The books placed on the shelves act as
obstacles to the lift's movement. Therefore, players need
to remove these books while raising the lift.</p>
        <p>After finding the three items, the human player
places them inside a virtual attaché case and carries it by
grabbing its handle. The game is cleared when both
players move with the case to the door. However, if the
players exceed the 5-minute time limit, a locking
mechanism is triggered on the door, resulting in a failed
escape.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. System configuration</title>
      <p>The system configuration is shown in Figure 5. In this
setup, both the human player and the mouse player wear
the Meta Quest 3 [6] as their head-mounted display
(HMD). The Meta Quest 3 was selected due to its support
for pass-through, hand tracking, and spatial anchors,
and it offers extensive functionality through the Meta
XR SDK. The human player utilizes the pass-through
feature to overlay virtual objects onto the real-world
setup, while the mouse player only sees the VR space.
The Unity game engine was used for game development
to leverage the Meta XR SDK. The Meta Quest 3 headsets
are connected to desktop PCs via Quest Link to run the
game on the PCs.</p>
      <sec id="sec-4-1">
        <title>4.1. Synchronization between two players</title>
        <p>Photon Unity Networking 2 (PUN2) [7] is used to
synchronize actions, position data, and other
information between the two players. PUN2 was
selected for its easy integration with Unity projects and
its extensive API. Data such as coordinates and rotation
information are exchanged via UDP communication
through Photon's cloud server. The identification and
connection of synchronized information are managed
using name servers, regions, and room IDs.</p>
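        <p>As an illustration, a minimal PUN2 session setup along these lines might look as follows; "EscapeRoom" and "PlayerAvatar" are placeholder names, and the name server and region come from the project's PhotonServerSettings:</p>
        <preformat>
using Photon.Pun;
using Photon.Realtime;
using UnityEngine;

// Minimal sketch: connect to Photon's cloud and join a fixed two-player room.
public class GameConnection : MonoBehaviourPunCallbacks
{
    void Start()
    {
        PhotonNetwork.ConnectUsingSettings();
    }

    public override void OnConnectedToMaster()
    {
        // Both players must use the same room ID to be paired together.
        PhotonNetwork.JoinOrCreateRoom("EscapeRoom",
            new RoomOptions { MaxPlayers = 2 }, TypedLobby.Default);
    }

    public override void OnJoinedRoom()
    {
        // Spawn this player's networked avatar prefab (from Resources).
        PhotonNetwork.Instantiate("PlayerAvatar", Vector3.zero, Quaternion.identity);
    }
}
        </preformat>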
        <p>To reduce the bandwidth load, neither the
player-specific in-game camera nor the coordinate
information from the HMD is synchronized directly.
Instead, only the players' avatar information is
synchronized using PUN2. The avatar's coordinates are
obtained from the head position tracked by the HMD.
The movement of the avatar is achieved through inverse
kinematics, using four points: the coordinates of the left
and right hands, the head, and the floor. For the hand
coordinates, the human player uses the hand positions
recognized via the hand tracking feature, while the
mouse player's hand positions are obtained from the
controllers held in both hands.</p>
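        <p>A minimal sketch of this avatar synchronization, assuming the avatar prefab carries a PhotonView observing the component, could serialize just the four anchor points used by the inverse kinematics:</p>
        <preformat>
using Photon.Pun;
using UnityEngine;

// Sketch: send only the IK anchor points (head, hands, floor-projected root)
// instead of raw camera or HMD data. The local player writes to the stream;
// remote copies read from it and feed the values to the IK solver.
public class AvatarAnchorSync : MonoBehaviourPun, IPunObservable
{
    public Transform head, leftHand, rightHand, floorRoot;  // set in Inspector

    public void OnPhotonSerializeView(PhotonStream stream, PhotonMessageInfo info)
    {
        if (stream.IsWriting)
        {
            stream.SendNext(head.position);
            stream.SendNext(head.rotation);
            stream.SendNext(leftHand.position);
            stream.SendNext(rightHand.position);
            stream.SendNext(floorRoot.position);
        }
        else
        {
            head.position      = (Vector3)stream.ReceiveNext();
            head.rotation      = (Quaternion)stream.ReceiveNext();
            leftHand.position  = (Vector3)stream.ReceiveNext();
            rightHand.position = (Vector3)stream.ReceiveNext();
            floorRoot.position = (Vector3)stream.ReceiveNext();
        }
    }
}
        </preformat>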
        <p>Additionally, the status of item acquisition and
switch operations is also synchronized using PUN2. The
system is designed so that variables on the human
player's side are synchronized and can be used as shared
variables.</p>
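        <p>One way to realize such shared variables, consistent with this design but using hypothetical names, is a component on the human player's networked object that broadcasts state changes as buffered RPCs:</p>
        <preformat>
using Photon.Pun;
using UnityEngine;

// Sketch: game-state flags live on the human player's object; changes are
// sent as buffered RPCs so both clients stay consistent.
public class SharedGameState : MonoBehaviourPun
{
    public bool[] itemAcquired = new bool[3];

    public void AcquireItem(int index)
    {
        photonView.RPC(nameof(RpcAcquireItem), RpcTarget.AllBuffered, index);
    }

    [PunRPC]
    void RpcAcquireItem(int index)
    {
        itemAcquired[index] = true;
    }
}
        </preformat>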
        <p>Furthermore, any virtual object in the MR space can
be moved by either player. In PUN2, each object must
have an owner, and any movement of an object by
anyone other than the owner is not reflected on the
other side. Therefore, ownership of each object is
primarily held by the human player, and ownership is
transferred only when the mouse player touches the
object, enabling smooth interaction with the object.</p>
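        <p>The touch-based ownership transfer can be sketched as below; it assumes the object's PhotonView uses the Takeover ownership option and that the mouse avatar's collider carries a placeholder "MousePlayer" tag:</p>
        <preformat>
using Photon.Pun;
using UnityEngine;

// Sketch: the human player owns the object by default; when the mouse avatar
// touches it, the mouse player's client takes over ownership so its
// manipulations are replicated immediately.
public class TouchOwnership : MonoBehaviourPun
{
    void OnTriggerEnter(Collider other)
    {
        if (other.CompareTag("MousePlayer"))
        {
            if (photonView.IsMine == false)
                photonView.RequestOwnership();
        }
    }
}
        </preformat>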
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Position alignments</title>
        <p>In this game, the human player sees virtual objects like
a maze and pipes displayed in the physical room. These
objects are aligned using the Meta Quest 3's Spatial
Anchor feature, and all virtual objects are positioned
relative to a single Spatial Anchor.</p>
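        <p>In code, this anchor-relative placement can be sketched as follows, assuming an OVRSpatialAnchor from the Meta XR SDK has already been created or loaded at the room's reference point:</p>
        <preformat>
using UnityEngine;

// Sketch: once the single reference anchor is localized, the root of all
// virtual content is parented to it, so every object inherits the anchor's
// pose and stays registered to the physical room.
public class RoomAligner : MonoBehaviour
{
    public OVRSpatialAnchor anchor;  // the single Spatial Anchor
    public Transform roomRoot;       // root transform of all virtual objects

    void Update()
    {
        if (anchor == null || !anchor.Localized) return;
        if (roomRoot.parent != anchor.transform)
            roomRoot.SetParent(anchor.transform, false);  // snap to anchor pose
    }
}
        </preformat>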
        <p>The position and orientation of the entire room are
synchronized between the two spaces using the
previously mentioned PUN2. At the start of the game, the
entire room in the VR space is aligned with the physical
room in the MR space.</p>
        <p>Additionally, since the mouse player's starting point
is set at a specific position within the room object, the
mouse player's position is aligned accordingly. This
process ensures that position alignment is maintained
between the physical and virtual spaces, correctly
reflecting not only the positions of virtual objects but
also the directions and postures of both players.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Operation using physical handles</title>
        <p>This section provides a detailed explanation of the
challenge involving pipes and handles described in
Section 3.1. The virtual pipes that the mouse player
passes through have red and blue sections, which can be
rotated by turning the corresponding physical handles
of the same color.</p>
        <p>Initially, the pipes are oriented in different directions
and do not connect to each other. The human player can
use the physical handles to rotate and connect the pipes,
allowing the mouse player to pass through. Each handle
is built from a steering-controller attachment combined
with a Raspberry Pi Zero, chosen for its compact size
and Wi-Fi capability. An MPU-6050 gyroscope is
attached to the axis of each handle and rotates together
with the handle. The data obtained by the gyroscope is
sent to the Raspberry Pi Zero for angle calculation.</p>
        <p>The calculated angle data is then transmitted via
TCP to the human player's PC running Unity, where it
updates the angles of the virtual pipes in real time.</p>
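        <p>The Unity-side endpoint can be sketched as below. The wire format (one angle in degrees per text line) and the port number are our assumptions; the paper only specifies that the angle data arrives over TCP:</p>
        <preformat>
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using UnityEngine;

// Sketch: a background thread reads newline-delimited angle values from the
// Raspberry Pi; the main thread applies the latest angle to the virtual pipe.
public class HandleAngleReceiver : MonoBehaviour
{
    public Transform pipe;       // virtual pipe section driven by this handle
    volatile float latestAngle;
    TcpListener listener;

    void Start()
    {
        listener = new TcpListener(IPAddress.Any, 9000);  // placeholder port
        listener.Start();
        new Thread(Listen) { IsBackground = true }.Start();
    }

    void Listen()
    {
        using (var client = listener.AcceptTcpClient())
        using (var reader = new StreamReader(client.GetStream()))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                if (float.TryParse(line, out var angle))
                    latestAngle = angle;
        }
    }

    void Update()
    {
        // Rotate the pipe about its long axis to match the physical handle.
        pipe.localRotation = Quaternion.AngleAxis(latestAngle, Vector3.forward);
    }

    void OnDestroy()
    {
        if (listener != null) listener.Stop();
    }
}
        </preformat>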
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Interaction using hand tracking</title>
        <p>The human player can interact with virtual objects using
hand tracking. In this game, the player performs two
types of actions.</p>
        <p>The first action is a poking gesture. This is
implemented using the Poke Interaction feature from
the Meta XR Interaction SDK (Interaction SDK). This
action is applied to the switch-type object within the
space, allowing the player to toggle the switch on and
off by pressing it with their fingertip.</p>
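        <p>The handler invoked by the poke can be as simple as the following sketch, in which the Interaction SDK's poke event is wired to Toggle() in the Inspector and the resulting state is shared over PUN2:</p>
        <preformat>
using Photon.Pun;
using UnityEngine;

// Sketch: toggling the ceiling light from a poked switch, replicated to the
// other player with a buffered RPC.
public class LightSwitch : MonoBehaviourPun
{
    public Light ceilingLight;

    public void Toggle()  // wired to the poke event in the Inspector
    {
        photonView.RPC(nameof(RpcSetLight), RpcTarget.AllBuffered,
                       !ceilingLight.enabled);
    }

    [PunRPC]
    void RpcSetLight(bool on)
    {
        ceilingLight.enabled = on;
    }
}
        </preformat>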
        <p>The second action is a grabbing gesture. This is
implemented using the Grab Interaction feature from
the Interaction SDK. This action is used when the human
player picks up and moves the mouse and other virtual
objects within the space. By making a fist as if they are
actually grabbing something, the virtual object follows
the movement of the player's hand.</p>
        <p>With the grabbing gesture, the human player can
move virtual objects to any location within reach, which
may occasionally bypass collision detection among
objects. There was a concern that if the mouse player
were moved into virtual objects by the human player,
the maze or pipes could be unintentionally cleared. To
prevent this, the system is designed to forcibly release
the grab state if the mouse object collides with specific
objects, thereby preventing virtual objects from being
passed through.</p>
        <p>The mouse player is positioned next to the physical set
used by the human player. The player moves forward by
alternately shaking the controllers. The speed of forward
movement is determined by the absolute speed of the
controller's movement; the faster the controller is moved,
the greater the forward movement. To prevent
unintended forward movement due to slight motions of
the controller, the player moves forward only when the
controller's speed exceeds a certain threshold.</p>
        <p>In the development of VR and MR content, for
tracking the movement of a controller, it is common
practice to detect changes in the controller's position
and rotation within Unity rather than directly
referencing raw sensor values from the controller.
Therefore, in this game, the speed is calculated based on
the controller's position relative to the player's head.</p>
        <p>The player's direction of movement is determined
by the orientation of their head, allowing them to move
forward in the direction they are looking. The mouse
player encounters a section where they must climb
through a vertical pipe. While in this section, forward
movement is disabled, and the controls automatically
switch to an upward movement, allowing the mouse to
climb the pipe. Additionally, when the mouse player is
being held by the human player, the mouse player's
movement is disabled.</p>
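        <p>A minimal sketch combining the speed threshold, the head-relative measurement, and the head-oriented direction described above follows; the threshold and speed scale are placeholder tuning values:</p>
        <preformat>
using UnityEngine;

// Sketch: controller speed is measured in the head's local frame, and the
// mouse avatar advances along the head's horizontal gaze direction only
// while the shaking speed exceeds a threshold.
public class ShakeLocomotion : MonoBehaviour
{
    public Transform head, leftController, rightController;
    public float speedThreshold = 0.5f;  // m/s, placeholder value
    public float moveScale = 0.8f;       // placeholder value
    public bool movementEnabled = true;  // false while held or climbing

    Vector3 prevLeft, prevRight;

    void Update()
    {
        // Controller positions expressed relative to the player's head.
        Vector3 left  = head.InverseTransformPoint(leftController.position);
        Vector3 right = head.InverseTransformPoint(rightController.position);

        float speed = (Vector3.Distance(left, prevLeft) +
                       Vector3.Distance(right, prevRight)) / (2f * Time.deltaTime);
        prevLeft = left;
        prevRight = right;

        if (movementEnabled)
        {
            if (speed > speedThreshold)
            {
                // Move in the head's gaze direction, projected onto the floor.
                Vector3 forward =
                    Vector3.ProjectOnPlane(head.forward, Vector3.up).normalized;
                transform.position += forward * speed * moveScale * Time.deltaTime;
            }
        }
    }
}
        </preformat>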
        <p>Although the mouse player plays next to the set in
this game, the system is designed to allow remote
operation from a distant location.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Other considerations</title>
      <sec id="sec-5-1">
        <title>5.1. Lighting expression</title>
        <p>In this game, there are multiple lighting patterns, such
as the player turning on the lights at the start or the
lights turning red when the time limit is approaching.
These are expressed differently in the VR space and the
MR space.</p>
        <p>In the VR space, lighting is expressed simply with
Unity's Directional Light, for example by disabling the
light or changing its color.</p>
        <p>In the MR space, since the real lighting could not be
manipulated at the exhibition venue, virtual objects are
lit in the same way as in the VR space using a
Directional Light, while lighting changes on physical
objects are expressed by applying black or red filters to
the pass-through video (Figure 6).</p>
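        <p>Since the mechanism of the pass-through filter is not specified, the following is only one plausible sketch: a translucent quad rendered in the background queue tints the pass-through underlay, while virtual objects drawn later remain unaffected on top of it:</p>
        <preformat>
using UnityEngine;

// Sketch (assumed implementation): a large quad parented to the MR camera
// uses an unlit transparent material in the Background render queue, so it
// darkens or tints the pass-through video behind all virtual objects.
public class PassthroughTint : MonoBehaviour
{
    public Renderer filterQuad;  // unlit transparent quad in front of the camera

    void Start()
    {
        filterQuad.material.renderQueue = 1000;  // Background queue
        SetFilter(Color.black, 0.6f);            // dark room at game start
    }

    public void SetFilter(Color color, float alpha)
    {
        color.a = alpha;                  // e.g. red tint near the time limit
        filterQuad.material.color = color;
    }
}
        </preformat>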
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Sound expression</title>
        <p>In this game, voice communication between players is
possible. This allows collaboration and communication
even when the mouse player is in a remote location.
Additionally, some objects emit sound effects, which can
be heard with the appropriate direction and volume not
only in VR but also in MR.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Perspective expression by occlusion</title>
        <p>In the MR view of this game, an occlusion feature is
implemented using the Meta Quest 3's Depth API. All
virtual objects displayed in the MR view, including the
mouse, are occluded by the player's hands and other
physical objects. In Figure 7, the player's arm and the
physical handles occlude virtual objects such as the wall
and pipes.</p>
        <p>However, only the mouse is outlined so that the
player can locate the mouse even when it is occluded.
This allows both players to interact smoothly. In Figure
8, the mouse is under the physical table; the mouse
object itself is occluded, but its outline is still visible.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Exhibition and findings</title>
      <sec id="sec-6-1">
        <title>6.1. Exhibition results and identified issues</title>
        <p>This game was exhibited at the Ibaraki ✕ Ritsumeikan
DAY 2024 event (Figure 9) on May 19, 2024. A total of 87
groups, comprising 174 participants aged 12 and older,
experienced the game (some participants may have
participated multiple times).</p>
        <p>Most participants were able to clear the game within
the 5-minute time limit and successfully finished the
three challenges. However, several issues related to
convenience and operability were identified during the
exhibition. Two key issues are highlighted below.</p>
        <p>Connection Instability: There were various issues
with the connection between the Meta Quest 3 and the
PC. Problems included the pass-through function failing
to activate at the start of the game and position
misalignment. These issues are believed to stem from
the fact that the pass-through feature and the use of
spatial data over Quest Link are still in the preview
stage. We expect that future updates to the Meta Quest 3
will resolve these problems.</p>
        <p>Hand Tracking Limitations: Hand tracking only
functions within the HMD's field of view. When players
faced different directions while operating levers or
grabbing the mouse, their hands were not recognized,
making it difficult to perform the intended actions
comfortably. A fundamental solution cannot be achieved
with the Meta Quest 3 alone. To address this issue,
additional devices like 360-degree cameras could be used
to provide a more comprehensive tracking system.
These cameras would allow for hand tracking from all
angles, ensuring that the player's hands are always
detected, regardless of their position relative to the
headset.</p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Key findings and future challenges</title>
        <p>There are four main findings from this study and its
exhibition that will help realize the MR campus.</p>
        <p>Accurate Positioning and Avatar
Synchronization: Accurate positioning ensured that
the positional relationship between the players matched
what each player perceived. This allowed players to
complete challenges smoothly by indicating directions
and item locations to each other. This is effective for
collaborative tasks and important for user interaction.</p>
        <p>Avatar Visibility and Voice Communication:
Although participants were playing in different spaces,
some felt as if they were in the same space. This was
likely because they could see each other's avatars and
communicate through voice. This enhanced the sense of
presence and interaction.</p>
        <p>Aligning the Two Spaces: In this exhibition, one
person played on the MR side and one on the VR side.
From a technical standpoint, the VR side supports
multiple participants because its positioning is based on
the physical space. However, the MR side does not
support multiple participants, as each MR instance has
its own coordinate system, and integration between
these coordinate systems was not implemented. This
requires further study.</p>
        <p>Synchronizing Objects in Both Spaces: In this
game, synchronization from real to virtual objects was
achieved by sensing the angle of the handle in the
physical space and reflecting it in the VR space. This
allows virtual objects to be operated even when the
hands are outside the HMD's field of view, as mentioned
earlier. Compared to purely virtual objects like the lever
and the lift, this method offers better operability. Since
the MR campus envisions interaction in both spaces,
synchronization from virtual to real objects should also
be achieved. This would eliminate gaps in object
alignment between the two spaces, improving
communication based on positional relationships.
Furthermore, enabling remote users to manipulate
physical objects would allow them to take a more active
role in collaborative tasks. This could be accomplished
by utilizing and developing the system proposed in [8].</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>We developed and showcased content where two
players, existing in MR and VR spaces, collaborate to
achieve an escape goal. Each space was constructed
using a real-world set and a 3D model that replicates it
as a digital twin (DT), designed to enable mutual
interaction. Several issues were identified during the
exhibition, and addressing these challenges is essential
to advance the realization of the MR campus concept.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgements</title>
      <p>This work was partially supported by JSPS KAKENHI
Grant Number JP23K21690.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Yuji</given-names>
            <surname>Higaki</surname>
          </string-name>
          <article-title>: Building the real-world metaverse</article-title>
          ,
          <year>2022</year>
          . URL: https://nianticlabs.com/news/buildingthe-real
          <article-title>-world-metaverse Faisal Zaman, Craig Anslow, Andrew Chalmers, Taehyun Rhee: MRMAC: Mixed reality multiuser asymmetric collaboration</article-title>
          ,
          <source>Proc.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <source>International Symposium on Mixed and Augmented Reality (ISMAR</source>
          <year>2023</year>
          ):
          <fpage>591</fpage>
          -
          <lpage>600</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Gun A.</given-names>
            <surname>Lee</surname>
          </string-name>
          , Theophilus Teo, Seungwon Kim, Mark Billinghurst:
          <article-title>A user study on MR remote collaboration using live 360 video</article-title>
          , Proc.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <source>International Symposium on Mixed and Augmented Reality (ISMAR</source>
          <year>2018</year>
          ):
          <fpage>153</fpage>
          -
          <lpage>164</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>SIGGRAPH Asia</surname>
          </string-name>
          (
          <year>2015</year>
          )
          <article-title>Yunsik Cho</article-title>
          , Myeongseok Park, Jinmo Kim: XAVE:
          <article-title>Cross-platform based asymmetric virtual environment for immersive content</article-title>
          ,
          <source>IEEE Access</source>
          , Vol.
          <volume>11</volume>
          , pp.
          <fpage>71890</fpage>
          -
          <lpage>71904</lpage>
          ,
          <year>2023</year>
          <article-title>Meta Quest 3</article-title>
          . URL: https://www.meta.com/jp/quest/quest-3/ Photon Unity Networking 2. URL: https://docapi.photonengine.com/en/pun/current/index.html Yumi Fukuda, Ayumu Shikishima, Asako Kimura, Hideyuki Tamura, Fumihisa Shibata:
          <article-title>RVXoverKit: Mixed reality content creation toolkit to connect real and virtual spaces</article-title>
          ,
          <source>Proc. AsiaPacific Workshop on Mixed and Augmented Reality</source>
          (
          <year>2022</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>