<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Unreal Engine-based Data Augmentation to Improve Real-world Human Activity Recognition with Wearable Devices</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xingyu Zhou</string-name>
          <email>zhouxingyu4590@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Keisuke Mizutani</string-name>
          <email>mizutani.keisuke41@chugai-pharm.co.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kento Tokuyama</string-name>
          <email>tokuyama.kento26@chugai-pharm.co.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Digital Transformation Unit, Chugai Pharmaceutical Co., Ltd.</institution>
          ,
          <addr-line>Tokyo</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Nagoya University</institution>
          ,
          <addr-line>Nagoya</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>Human activity recognition (HAR) has garnered significant attention owing to its potential applications in fields such as healthcare, automated driving, and personalized user interfaces. The use of wearable devices for collecting real-time motion data has spurred ongoing research in activity detection and prediction. However, the development of robust HAR models faces a critical challenge: the difficulty of acquiring diverse and comprehensive datasets of human activities. Ethical considerations and practical limitations often impede the collection of real-world motion data, particularly for sensitive or uncommon activities. To address this challenge, we introduce a novel system, the Unreal Data Generator, that leverages the capabilities of Unreal Engine 5. This tool facilitates the synthesis of time series data for a wide range of activities and scenarios by simulating data collection from multiple wearable device locations using virtual human models. Experiments demonstrated that augmenting real-world datasets with synthetic data generated by the Unreal Data Generator significantly improves the performance of deep learning-based HAR models. Specifically, by training our deep convolutional neural network model with both real-world and synthetic data, the accuracy on the WISDM benchmark test improved from 0.784 to 0.823. This approach offers a promising solution for improving the robustness and generalizability of HAR models by providing access to a rich and diverse range of training data.</p>
      </abstract>
      <kwd-group>
        <kwd>convolutional neural network</kwd>
        <kwd>human activity recognition</kwd>
        <kwd>synthetic data</kwd>
        <kwd>Unreal Engine</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Human activity recognition (HAR) is a rapidly evolving field that enables machines to automatically
recognize and interpret human actions and movements. HAR systems have numerous applications in
healthcare [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ], manufacturing [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], autonomous vehicles [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and sports analysis [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. These systems
utilize various sensor technologies, including wearable devices, smartphones, and radar sensors, to
collect continuous signal data related to human activity. Sophisticated algorithms, primarily machine
learning models such as deep learning approaches, are employed to classify this data into meaningful
action categories.
      </p>
      <p>
        Recent advancements in wearable device technology have significantly accelerated progress in HAR
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. These sensor devices can be easily worn, enabling convenient and efficient data acquisition. To
capture motion and orientation in three-dimensional space, these devices are typically equipped with
3-axis accelerometers and gyroscopes. In addition to accelerometers and gyroscopes, wearable devices
can measure physiological signals such as heart rate and skin temperature. Their small size and long
battery life enable the collection of large datasets over extended monitoring periods. The availability of
readily accessible and diverse data has substantially enhanced the performance of deep learning models
in recognizing human activities [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>Despite advancements fueled by readily available data, significant challenges persist in developing
robust HAR systems. Acquiring sufficient data for accurate predictions, particularly for infrequent yet
critical activities such as falls and syncope, remains a major obstacle. Conducting large-scale studies
with diverse participants further complicates data collection. Furthermore, ensuring data quality and
mitigating sensor noise are crucial considerations. Future research must focus on addressing these
limitations while improving the generalizability of HAR systems to unseen scenarios and diverse
populations.</p>
      <p>To address these data acquisition challenges, we developed Unreal Data Generator, the architecture
of which is illustrated in Figure 1. Our methodology comprises two main stages. First, as detailed in
Figure 1a, we leverage Unreal Engine, a real-time 3D creation tool, to simulate human movements. The
world coordinates are recorded and then converted into raw sensor signals, which are subsequently
refined through a dedicated processing pipeline. Second, as depicted in Figure 1b, this processed
synthetic data is used to augment real-world datasets. By simulating a wide range of activities and
scenarios under controlled parameters, this approach effectively overcomes the limitations and ethical
concerns of collecting large-scale real-world data, especially for rare events. The primary objective is
to leverage synthetic data generated within Unreal Engine to enhance the accuracy and robustness of
deep learning-based HAR models by augmenting real-world datasets and improving performance on
human activity recognition tasks.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Numerous studies have been conducted on HAR using multi-channel time series data from sensors
such as accelerometers and gyroscopes, obtained from wearable devices and smartphones [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In recent
years, there has been extensive research on applying machine learning techniques to activity
classification models in HAR. Approaches to improving these models can be classified into two types:
model-centric improvement and data-centric improvement.
      </p>
      <p>
        In model-centric improvement, HAR using neural networks has been actively studied. For example,
convolutional neural networks (CNNs) have been employed to effectively capture local patterns in
sensor data [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], while long short-term memory (LSTM) networks have demonstrated their ability to
model the temporal dynamics of human activities [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Furthermore, U-Net type models, originally
developed for image segmentation, have also been explored for their potential in HAR tasks involving
sequential data [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        While these deep learning models have shown promising results and continue to be actively developed,
they often require large amounts of training data to achieve optimal performance. This requirement
for extensive datasets poses a significant challenge for the practical implementation of HAR systems.
Therefore, a data-centric approach, focusing on improving data quality and exploring effective data
augmentation techniques, becomes crucial for building robust and generalizable HAR models with
limited resources. Kim et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] demonstrated that data augmentation with oversampling solved the
data imbalance problem for medical applications. Some research has applied generative adversarial
networks (GANs) to generate synthetic data for HAR tasks [
        <xref ref-type="bibr" rid="ref13 ref14 ref15">13, 14, 15</xref>
        ]. For example, Lupión et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]
were the first to propose a conditional Wasserstein GAN (cWGAN) for generating accelerometer signals,
demonstrating its superiority over standard conditional GANs for data augmentation in HAR. While
some recent work has focused on improving GAN architectures directly, Zhang et al. (2025) took a
different approach by proposing a differentiable framework that automatically learns to select and
combine traditional, prior-driven handcrafted operations with a generative model, aiming to leverage
the strengths of both methodologies [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>
        The application of GANs has shown promise in generating synthetic data for HAR tasks. However, a
potential limitation of relying solely on GANs lies in ensuring the diversity and comprehensiveness
of the generated data. Simulation-based approaches offer a complementary strategy by allowing for
the creation of varied datasets through the manipulation of virtual environments and parameters. For
instance, Waqar et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] demonstrated that synthetic 3D animation data can effectively enhance
radar-based HAR predictions. Cauli and Recupero [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] showed that synthetic data augmentation
with Unity, a popular game engine, improved video action recognition. Their study demonstrated
that 3D animation data can be leveraged to construct diverse motion datasets, providing a promising
strategy to address the data scarcity challenges associated with human subject experiments. Various
studies have explored the use of game engines for generating synthetic data in HAR, including image
[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], video data [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], and skeletal poses [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. However, to the best of our knowledge, no studies
have explored the application of game engines to generate synthetic data from accelerometers and
gyroscopes for improving HAR systems with wearable devices. Here, we first demonstrate the novel
application of a game engine, specifically using Unreal Engine 5 (UE5), to generate synthetic time series
data from accelerometers and gyroscopes for HAR systems using wearable devices. Furthermore, our
study explores a methodology for integrating this game engine-generated data into real-world datasets,
addressing the data scarcity challenges associated with human subject experiments. This integration
method contributes to an improved approach for generating comprehensive and diverse data.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Development of Unreal Data Generator</title>
        <p>
          UE5.4 was selected as the standard engine for this study. The motion-matching template was downloaded,
and all necessary plugins and configurations were set up following the official documentation [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ].
To enhance character movement, the advanced locomotion system (ALS) was integrated into the
motion-matching template. We utilized the standard default and female mannequins provided by Unreal
Engine 5. Furthermore, we created three additional mannequin variations by applying animation layers,
resulting in a total of five switchable mannequins. Walking animations from Mixamo [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] were modified
by adjusting the leg weights (&lt;0.2) within each layer to prevent animation errors. Additionally, Foot IK
was implemented to improve realism by dynamically adjusting foot-ground interactions. Movement
speed control was incorporated, allowing users to adjust walking and running speeds using the numeric
keys 1–4.
        </p>
        <p>Because UE5’s default mannequin offers only basic actions—running, walking, standing, and
jumping—custom animations need to be added for specific actions, such as climbing stairs and sitting. To
implement the stair animations, a stair detection capsule and a Boolean variable were first added to
indicate when a character was "on stairs." Then, stairs were configured as a separate collision layer
that overlapped only with the character’s stair capsule. Logic was implemented so that the Boolean
variable switched to “true” when the stair capsule overlapped with the stair collision. Stair-specific
animations (e.g., "running upstairs" and "ascending stairs") from Mixamo were integrated into the
motion-matching system. A condition in the animation player triggered the appropriate animations
when the Boolean variable was true. The sitting action was implemented using a montage system.
Fifteen types of sitting animations were downloaded from Mixamo and used to create different montage
files, with data captured and updated iteratively.</p>
        <p>The data collection system was developed by placing sockets on designated bone locations and using
the Break Vector and Break Rotator math functions to obtain world coordinates and rotation angles,
respectively. In the Unreal Data Generator, the sampling rate, measurement time, and measurement
points can be freely configured. For this study, time series data were collected at a sampling rate of 50
Hz for approximately 300 s for each activity across different animations (default mannequins, preset
female mannequins, and three custom animation layers). Details of the dataset are presented in Table 1.</p>
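        <p>Outside the engine, the exported logs can be handled as ordinary tabular data. The following is a
minimal sketch of such a loader, assuming each activity is exported as a CSV file with one row per sample
containing a timestamp, world coordinates, and roll/pitch/yaw angles; the column names and the
centimetre-to-metre conversion are illustrative assumptions rather than the actual export schema.</p>
        <preformat>
import pandas as pd

# Illustrative loader for a log exported by the Unreal Data Generator.
# Assumed columns (not the actual export schema): t [s], x, y, z (world
# coordinates in Unreal's centimetre units) and roll, pitch, yaw [deg].
def load_unreal_log(path):
    df = pd.read_csv(path)
    for axis in ("x", "y", "z"):
        df[axis] = df[axis] / 100.0   # assumed centimetre-to-metre conversion
    dt = df["t"].diff().median()      # nominally 1/50 s at the 50 Hz setting
    print(f"sampling rate: {1.0 / dt:.1f} Hz, duration: {df['t'].iloc[-1]:.0f} s")
    return df
</preformat>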
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Data Processing</title>
        <p>The Unreal data generated by UE5.4 contain time series data of Unreal world coordinates (x, y, z) and
rotation information as roll, pitch, and yaw (φ, θ, ψ). The relationships between the coordinates p, rotation
angles Θ, velocity v, acceleration a, and angular velocity ω are defined by the following equations (1–4):
p = (x, y, z),  Θ = (φ, θ, ψ)   (1)
v = dp/dt ≈ (p(t + Δt) − p(t)) / Δt   (2)
a = dv/dt ≈ (v(t + Δt) − v(t)) / Δt   (3)
ω = dΘ/dt ≈ (Θ(t + Δt) − Θ(t)) / Δt   (4)
Since the acceleration data measured by wearable devices in the real world are affected by gravity and
changes in device posture, we analyzed the effects of gravity on real-world data and applied gravity
correction to the Unreal data as follows:
1. Calculation of the average acceleration per axis for each motion.
2. Determination of gravity compensation values: for each axis, the average acceleration was
compared to the gravitational acceleration (9.81 m/s²).
3. If the absolute difference between the average acceleration and 9.81 (or -9.81 for downward
acceleration) was within a threshold of 2.0 m/s², then 9.81 or -9.81 was used as the gravity
compensation value for that axis, under the assumption that the sensor was primarily aligned with
gravity.
4. Handling of outliers: if the average acceleration on any axis fell outside the threshold of ±2.0 m/s²
from 9.81 or -9.81, the median acceleration for that axis was used as the gravity compensation
value instead. This approach is more robust against outliers and noisy data.
5. Calculation of the rotation matrices: Euler-ZYX rotation matrices were computed from the
rotation angle data at each time step of the Unreal data.
6. Application of rotation and gravity compensation: the gravity compensation vector (obtained in
steps 2–4) was multiplied by the rotation matrix (calculated in step 5) for each time step. This process
transformed the gravity compensation vector from the sensor coordinate system into the world
coordinate system, accounting for the sensor’s orientation.
7. Addition of the compensated gravity vector to the raw acceleration readings on each axis to
account for the real-world effects of gravity.</p>
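        <p>To make the differentiation and gravity-correction steps concrete, the following is a minimal NumPy
sketch, assuming arrays of positions (in metres) and Euler angles (in radians, ZYX order) sampled at a fixed
interval; the function names and the handling of array lengths after differencing are illustrative
assumptions rather than the study's implementation.</p>
        <preformat>
import numpy as np

def differentiate(x, dt):
    """Forward finite difference along the time axis (Eqs. 2-4)."""
    return np.diff(x, axis=0) / dt

def euler_zyx_matrix(roll, pitch, yaw):
    """Rotation matrix for ZYX (yaw-pitch-roll) Euler angles, in radians."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    return rz @ ry @ rx

def gravity_compensation_value(mean_acc, median_acc, g=9.81, tol=2.0):
    """Per-axis compensation value following the thresholding rule above (steps 2-4)."""
    if abs(abs(mean_acc) - g) &lt;= tol:
        return np.sign(mean_acc) * g   # axis roughly aligned with gravity
    return median_acc                  # outlier handling: fall back to the median

def add_gravity(acc, angles, g_vec):
    """Rotate the compensation vector at each step and add it to the raw readings (steps 5-7).

    acc and angles are assumed to be truncated to the same number of samples.
    """
    out = np.empty_like(acc)
    for i, (roll, pitch, yaw) in enumerate(angles):
        out[i] = acc[i] + euler_zyx_matrix(roll, pitch, yaw) @ g_vec
    return out
</preformat>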
        <p>
          Finally, a Butterworth low-pass filter [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ] was applied to the processed data to remove noise.
        </p>
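        <p>As a concrete illustration of this filtering step, a zero-phase Butterworth low-pass filter can be
applied with SciPy; the cut-off frequency and filter order below are illustrative choices rather than the
values used in the study.</p>
        <preformat>
from scipy.signal import butter, filtfilt

def lowpass(signal, fs=50.0, cutoff_hz=5.0, order=4):
    """Zero-phase Butterworth low-pass filter along the time axis.

    fs is the sampling rate of the Unreal data (50 Hz in this study);
    cutoff_hz and order are illustrative parameters.
    """
    b, a = butter(order, cutoff_hz / (0.5 * fs), btype="low")
    return filtfilt(b, a, signal, axis=0)
</preformat>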
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Benchmark Real Datasets</title>
        <p>
          As real benchmark datasets, we selected two types: WISDM [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] and DSADS [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] (Table 1). The
WISDM dataset included six types of motions—walking, jogging, upstairs, downstairs, sitting, and
standing—recorded using a cell phone placed in the subject’s front pant leg pocket. The DSADS dataset
included 19 activities, in addition to those in the WISDM dataset, recorded using MTx 3-DOF orientation
trackers positioned on five body locations: the torso, right arm, left arm, right leg, and left leg.
        </p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Activity Recognition Model Development</title>
        <p>
          The learning task was defined as a six-activity classification problem using 10 s windowed data sampled
at 20 Hz (i.e., a window size of 200). For HAR model development, we implemented a one-dimensional
convolutional neural network (1D-CNN) based on a previous study [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. We selected this architecture
because our primary goal is to validate our novel synthetic data augmentation method using a
foundational baseline model. The 1D-CNN is a standard backbone for HAR using multi-channel time-series
data [
          <xref ref-type="bibr" rid="ref27 ref28">27, 28</xref>
          ], making it a suitable choice for this purpose. The model comprises two convolutional
blocks and one adaptive average pooling layer. Each convolutional block includes a one-dimensional
convolutional layer and a normalization layer. A single fully connected layer is used as the classifier. To
evaluate the prediction performance of the developed models, we divided the real data by subject and
conducted k-fold cross-person validation, with k set to 5 for the
WISDM dataset and 4 for the DSADS dataset. We investigated the benefits of incorporating synthetic
data into the training process under two scenarios: one using only real-world data and the other using
a combined dataset of real-world data and Unreal Engine-generated synthetic data.
        </p>
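        <p>The following is a minimal PyTorch sketch of a model with this structure: two one-dimensional
convolutional blocks with normalization, adaptive average pooling, and a single fully connected classifier.
The channel counts, kernel sizes, activation functions, and the choice of batch normalization are
illustrative assumptions rather than the exact hyperparameters of the study.</p>
        <preformat>
import torch
import torch.nn as nn

class HAR1DCNN(nn.Module):
    """Two Conv1d blocks, adaptive average pooling, and one linear classifier."""

    def __init__(self, in_channels=3, n_classes=6, hidden=64):
        super().__init__()
        self.features = nn.Sequential(
            # Block 1: one-dimensional convolution + normalization
            nn.Conv1d(in_channels, hidden, kernel_size=5, padding=2),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
            # Block 2: one-dimensional convolution + normalization
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
            # Collapse the 200-sample window (10 s at 20 Hz) into one feature vector
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, channels, 200)
        z = self.features(x).squeeze(-1)
        return self.classifier(z)
</preformat>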
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <sec id="sec-4-1">
        <title>4.1. Development of Unreal Data Generator</title>
        <p>We developed Unreal Data Generator for synthetic data generation to improve the HAR system with
accelerometers and gyroscopes. By utilizing UE5 as the standard engine, diverse motion data can be
generated across various scenarios through intuitive key operations, much like controlling a game
character. For instance, the data collected while walking on a paved road differs from that collected
while walking on a bumpy road. Movement speed is also adjustable, allowing the character to run straight and fast
or to keep circling around the area. Motion Matching in UE5 enables smooth animation transitions,
thus offering natural and realistic motion responses to complex user operation inputs. Additionally,
the sensor position for data collection and the sampling rate can be specified arbitrarily. This level of
customizability is particularly useful for optimizing sensor positions on wearable devices and for other
related applications.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Unreal Data Generation for HAR</title>
        <p>We developed an unreal data generation system using the real-time 3D creation tool UE5. To extract
unreal data, we utilized the real-world coordinate system in UE5 and recorded six activities—walking,
jogging, upstairs, downstairs, sitting, and standing—using five types of mannequins sampled at 50
Hz. Activity animations were obtained from the Fab website (Unreal Engine marketplace). The values
for accelerometers and gyroscopes were calculated based on the time differences in the coordinate
dynamics for each record. We recorded each motion for 5 minutes, resulting in a total of 150 minutes of
motion capture data (5 mannequin types × 6 motion types).</p>
        <p>Figure 2 compares the real (top subplot) and unreal (middle and bottom subplots) datasets. As
shown in Figure 2, the raw synthetic data generated by Unreal Engine exhibited significant noise and
deviated considerably from the real-world data. This discrepancy arises primarily because the synthetic
acceleration is derived by numerically differentiating position data twice, a method highly susceptible
to amplifying noise. In contrast, real-world accelerometers measure acceleration directly as a physical
force and often include hardware-level filters, resulting in inherently smoother signals.</p>
        <p>To address this, we applied a two-step processing method. First, a Butterworth low-pass filter was
used to remove the high-frequency noise inherent in the differentiation process. Second, to account for
the influence of gravity present in real-world measurements, we implemented the gravity correction
detailed in Section 3.2. The processed unreal data were significantly closer to the measured real-world
values (see bottom subplots in Figure 2).</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Improved Cross-person Human Activity Recognition with Unreal Data</title>
        <p>To validate the improvement in HAR accuracy for different datasets, we trained separate 1D-CNN-based
activity recognition models for WISDM and DSADS datasets, using Unreal Engine-generated synthetic
data to augment each. Model performance was evaluated using k-fold cross-person validation in terms
of accuracy, precision, recall, F1-score, and AUC. We compared the performance of models trained
solely on real-world data with those trained on augmented datasets incorporating synthetic data. The
results are summarized in Table 2.</p>
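        <p>Cross-person validation groups the windows by subject so that no subject appears in both the training
and test folds. The following is a minimal sketch of this protocol using scikit-learn's GroupKFold as an
illustrative stand-in for the actual evaluation code; the train_and_predict callback, which would fit the
1D-CNN on the (optionally augmented) training windows, is a hypothetical placeholder.</p>
        <preformat>
import numpy as np
from sklearn.model_selection import GroupKFold
from sklearn.metrics import accuracy_score

def cross_person_accuracy(X, y, subject_ids, train_and_predict, n_splits=5):
    """k-fold cross-person validation: folds are split by subject, not by window.

    train_and_predict(X_tr, y_tr, X_te) should fit a model on the training
    windows (optionally augmented with Unreal data) and return predictions
    for X_te. n_splits was 5 for WISDM and 4 for DSADS in this study.
    """
    scores = []
    for train_idx, test_idx in GroupKFold(n_splits=n_splits).split(X, y, groups=subject_ids):
        y_pred = train_and_predict(X[train_idx], y[train_idx], X[test_idx])
        scores.append(accuracy_score(y[test_idx], y_pred))
    return np.mean(scores), np.std(scores)
</preformat>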
        <p>On the WISDM dataset, the baseline model trained solely on real-world data achieved an accuracy
of 78.41%. Augmenting the dataset with Unreal data generated by UE5 improved HAR performance.
Specifically, applying a low-pass filter and gravity correction to the Unreal data further enhanced
performance, resulting in an accuracy of 82.55%. Similarly, on the DSADS dataset, the addition of
processed unreal data also improved HAR performance, whereas raw Unreal data negatively impacted
model accuracy. These findings highlight the importance of incorporating low-pass filtering and gravity
correction during data preprocessing when developing HAR models using Unreal Engine-generated
data.</p>
        <p>The confusion matrix for HAR classification in the WISDM benchmark is shown in Figure 3. The
accuracy of the HAR model improved significantly for motions involving movement when Unreal
data were included. Although classification accuracy for differentiating between the static postures of
sitting and standing did not show improvement, the accuracy in dynamic/static motion classification
demonstrated noticeable gains. Furthermore, the confusion matrix revealed improved classification
accuracy for similar motions, such as walking/jogging and climbing upstairs/downstairs.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>Unreal Data Generator facilitates the acquisition of acceleration and gyroscopic sensor data across a
wide range of motions and scenarios. Typically, 3D animations in game engines operate by repeatedly
playing back pre-recorded animation data, which results in identical sensor data being generated from
the same animation. However, the ALS in UE5 employs a dynamic procedural animation system that
adapts to the environment and player inputs in real time. This approach blends animations and adjusts
them based on factors such as terrain, speed, and direction, creating more fluid and realistic character
movements. Compared with traditional animation techniques in game engines, these advanced methods
provide a more immersive and responsive character movement system. In this study, we developed a
more flexible movement animation by integrating ALS, motion matching, and Foot IK techniques.</p>
      <p>As mentioned previously, generating natural and diverse movements using default mannequins can
be easily achieved. However, generating data for specific custom actions requires the addition of new
animations. For instance, in this study, a sitting animation montage was downloaded from FAB to
capture sitting motion data. The FAB database currently offers a wide variety of motions, providing a
diverse range of animations that can be utilized. When specific motion data are not available in the
FAB database, new animations must be created. This process can be streamlined by leveraging software
that generates motions from videos (DeepMotion [29]) or using generative AI tools specialized in video
generation (Runway Gen-3 Alpha [30]).</p>
      <p>The primary differences between Unreal Engine-generated data and real-world sensor data lie in the
nature of the measurements. The Unreal Data Generator provides world-space coordinates, dynamics, and
angular rotation, whereas real-world wearable devices measure acceleration, angular velocity, and other
digital biomarkers, such as heart rate, using their own algorithms. Notably, while real-world sensors
measure acceleration as the force applied to the device—affected by gravity—acceleration derived from
world coordinate dynamics in Unreal Engine does not account for gravity. This distinction is crucial
when comparing or integrating data from these two sources.</p>
      <p>Our results demonstrate the profound impact of correcting for this gravitational effect. In the real
world, the gravitational force registered by a 3-axis accelerometer changes based on the sensor’s
orientation. Our method of applying a gravity vector transformed by the sensor’s rotation matrix
successfully embeds this orientation-dependent component into the synthetic data. The effectiveness
of this correction is most apparent in static activities like ’Standing’ and ’Sitting’ (Figure 2e, 2f). For these poses,
the raw synthetic acceleration is nearly zero, as position coordinates are constant. In contrast, a real
sensor registers non-zero values corresponding to the constant pull of gravity. Our gravity correction
replicates this real-world phenomenon, transforming the unrealistic near-zero signals into realistic
static offsets, thus creating much more faithful training data.</p>
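      <p>As a worked illustration of this point, consider a static pose in which the sensor is pitched 30° away
from vertical: the raw Unreal acceleration is essentially zero because the world coordinates are constant,
while the gravity-corrected signal carries the projection of gravity onto the tilted axes. The short check
below reuses the Euler rotation sketched in Section 3.2; the specific angle and axis convention are
illustrative assumptions.</p>
      <preformat>
import numpy as np

# Static pose, pitched 30 degrees: rotating the gravity vector redistributes
# it across two axes, which is exactly the static offset the correction adds.
pitch = np.deg2rad(30.0)
ry = np.array([[np.cos(pitch), 0.0, np.sin(pitch)],
               [0.0, 1.0, 0.0],
               [-np.sin(pitch), 0.0, np.cos(pitch)]])
g_world = np.array([0.0, 0.0, -9.81])
print(ry @ g_world)   # approx. [-4.91, 0.00, -8.50] m/s^2
</preformat>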
      <p>
        These specific corrections for gravity and signal noise are part of a broader challenge in
simulation-based research known as the "Sim-to-Real Gap"—the discrepancy between simulated and real-world
data [31]. Our study demonstrates that it is essential to mitigate this gap through a dedicated data
processing pipeline to improve the data’s fidelity. We addressed this by implementing a low-pass filter
for noise and using Euler rotation matrices for gravity correction. While this direct signal processing
approach proved effective, future research could explore alternative methods to bridge this gap. For
instance, incorporating domain adaptation techniques or leveraging Generative Adversarial Networks
(GANs) could offer ways to learn the transformation from simulated to realistic data automatically
[
        <xref ref-type="bibr" rid="ref32 ref17">32, 17</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Limitations</title>
      <p>While this study demonstrates the potential of using Unreal Engine for data augmentation, it has
several limitations. First, the scope of our evaluation was limited. The experiments were conducted
on two public datasets, covering only six basic daily activities such as walking, running, and sitting,
and used a single backbone architecture (1D-CNN). Furthermore, our study focused on validating a
simulation-based approach and did not include a direct comparative analysis against other categories of
data augmentation, such as model-based techniques like GANs. Second, our methodology for bridging
the Sim-to-Real Gap was confined to a direct signal processing pipeline (i.e., low-pass filtering and
gravity correction). The exploration of other advanced methods, such as automated domain adaptation,
was not covered in this work.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We thank Epic Games Japan for the complimentary use of Unreal Engine for this research.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used Gemini in order to: draft content, check grammar
and spelling, paraphrase and reword, translate text, improve writing style, and assist with literature review
generation. After using this tool, the authors reviewed and edited the content as needed and took full
responsibility for the content of the publication.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Umbricht</surname>
          </string-name>
          , W. Cheng,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lipsmeier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bamdadian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lindemann</surname>
          </string-name>
          ,
          <article-title>Deep learning-based human activity recognition for continuous activity and gesture monitoring for schizophrenia patients with negative symptoms</article-title>
          ,
          <source>Frontiers in Psychiatry</source>
          <volume>11</volume>
          (
          <year>2020</year>
          )
          <article-title>574375</article-title>
          . doi:
          <volume>10</volume>
          .3389/fpsyt.
          <year>2020</year>
          .
          <volume>574375</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Momo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kosuke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Takaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kijun</surname>
          </string-name>
          , T. Kento,
          <article-title>Improved generalized performance of hemodynamics scenarios prediction with digital biomarkers by Conv1D approach</article-title>
          , in: IEEE SMC,
          <year>Proceedings</year>
          .
          <year>2023</year>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Yoshimura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Maekawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Namioka</surname>
          </string-name>
          ,
          <article-title>Acceleration-based activity recognition of repetitive works with lightweight ordered-work segmentation network</article-title>
          ,
          <source>Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies</source>
          <volume>6</volume>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Tammvee</surname>
          </string-name>
          , G. Anbarjafari,
          <article-title>Human activity recognition-based path planning for autonomous vehicles</article-title>
          ,
          <source>Signal Image Video Process</source>
          .
          <volume>15</volume>
          (
          <year>2021</year>
          )
          <fpage>809</fpage>
          -
          <lpage>816</lpage>
          . doi:
          <volume>10</volume>
          .1007/s11760-020-01800-6.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Tuncer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ertam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dogan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Subasi</surname>
          </string-name>
          ,
          <article-title>An automated daily sports activities and gender recognition method based on novel multikernel local diamond pattern using sensor signals</article-title>
          ,
          <source>IEEE Trans. Instrum. Meas</source>
          .
          <volume>69</volume>
          (
          <year>2020</year>
          )
          <fpage>9441</fpage>
          -
          <lpage>9448</lpage>
          . doi:
          <volume>10</volume>
          .1109/TIM.
          <year>2020</year>
          .
          <volume>3003395</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Shahabi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Alshurafa</surname>
          </string-name>
          ,
          <article-title>Deep learning in human activity recognition with wearable sensors: a review on advances</article-title>
          ,
          <source>Sensors (Basel) 22</source>
          (
          <year>2022</year>
          )
          <article-title>1476</article-title>
          . doi:
          <volume>10</volume>
          .3390/s22041476.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Yi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hwang</surname>
          </string-name>
          ,
          <article-title>A human activity recognition method based on lightweight feature extraction combined with pruned and quantized cnn for wearable device</article-title>
          ,
          <source>IEEE Trans. Con. Electron</source>
          .
          <volume>69</volume>
          (
          <year>2023</year>
          )
          <fpage>657</fpage>
          -
          <lpage>670</lpage>
          . doi:
          <volume>10</volume>
          .1109/TCE.
          <year>2023</year>
          .
          <volume>3266506</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Ramanujam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Perumal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Padmavathi</surname>
          </string-name>
          ,
          <article-title>Human activity recognition with smartphone and wearable sensors using deep learning techniques: a review</article-title>
          ,
          <source>IEEE Sens. J</source>
          .
          <volume>21</volume>
          (
          <year>2021</year>
          )
          <fpage>13029</fpage>
          -
          <lpage>13040</lpage>
          . doi:
          <volume>10</volume>
          .1109/JSEN.
          <year>2021</year>
          .
          <volume>3069927</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          , P. San,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Krishnaswamy</surname>
          </string-name>
          ,
          <article-title>Deep convolutional neural networks on multichannel time series for human activity recognition</article-title>
          ,
          <source>in: Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI' 15)</source>
          , AAAI Press,
          <year>2015</year>
          , pp.
          <fpage>3995</fpage>
          -
          <lpage>4001</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Lstm-cnn architecture for human activity recognition</article-title>
          ,
          <source>IEEE Access 8</source>
          (
          <year>2020</year>
          )
          <fpage>56855</fpage>
          -
          <lpage>56866</lpage>
          . doi:
          <volume>10</volume>
          .1109/ACCESS.
          <year>2020</year>
          .
          <volume>2982225</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , J. Bao,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , H. Deng,
          <article-title>Human activity recognition based on motion sensor using U-Net, IEEE Access 7 (</article-title>
          <year>2019</year>
          )
          <fpage>75213</fpage>
          -
          <lpage>75226</lpage>
          . doi:
          <volume>10</volume>
          .1109/ACCESS.
          <year>2019</year>
          .
          <volume>2920969</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.-W.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Joa</surname>
          </string-name>
          , H.-Y. Jeong,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Wearable imu-based human activity recognition algorithm for clinical balance assessment using 1d-cnn and gru ensemble model</article-title>
          ,
          <source>Sensors (Basel) 21</source>
          (
          <year>2021</year>
          )
          <article-title>7628</article-title>
          . doi:
          <volume>10</volume>
          .3390/s21227628.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Noor</surname>
          </string-name>
          ,
          <article-title>A unified generative model using generative adversarial network for activity recognition</article-title>
          ,
          <source>J. Ambient Intell. Hum. Comput</source>
          .
          <volume>12</volume>
          (
          <year>2021</year>
          )
          <fpage>8119</fpage>
          -
          <lpage>8128</lpage>
          . doi:
          <volume>10</volume>
          .1007/ s12652-020-02548-0.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Younes</surname>
          </string-name>
          ,
          <article-title>Activitygan: generative adversarial networks for data augmentation in sensor-based human activity recognition</article-title>
          ,
          <source>in: Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>249</fpage>
          -
          <lpage>254</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lupión</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cruciani</surname>
          </string-name>
          , I. Cleland,
          <string-name>
            <given-names>C.</given-names>
            <surname>Nugent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Ortigosa</surname>
          </string-name>
          ,
          <article-title>Data augmentation for human activity recognition with generative adversarial networks</article-title>
          ,
          <source>IEEE Journal of Biomedical and Health Informatics</source>
          <volume>28</volume>
          (
          <year>2024</year>
          )
          <fpage>2350</fpage>
          -
          <lpage>2361</lpage>
          . doi:
          <volume>10</volume>
          .1109/JBHI.
          <year>2024</year>
          .
          <volume>3364910</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
           <article-title>Differentiable prior-driven data augmentation for sensor-based human activity recognition</article-title>
          ,
          <source>IEEE Transactions on Computational Social Systems</source>
          (
          <year>2025</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          . doi:
          <volume>10</volume>
          .1109/TCSS.
          <year>2025</year>
          .
          <volume>3565414</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Waqar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pätzold</surname>
          </string-name>
          ,
          <article-title>A simulation-based framework for the design of human activity recognition systems using radar sensors</article-title>
          ,
          <source>IEEE Internet Things J</source>
          .
          <volume>11</volume>
          (
          <year>2024</year>
          )
          <fpage>14494</fpage>
          -
          <lpage>14507</lpage>
          . doi:
          <volume>10</volume>
          .1109/JIOT.
          <year>2023</year>
          .
          <volume>3344179</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>N.</given-names>
            <surname>Cauli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. Reforgiato</given-names>
            <surname>Recupero</surname>
          </string-name>
          ,
          <article-title>Synthetic data augmentation for video action classification using Unity, IEEE Access 12 (</article-title>
          <year>2024</year>
          )
          <fpage>156172</fpage>
          -
          <lpage>156183</lpage>
          . doi:
          <volume>10</volume>
          .1109/ACCESS.
          <year>2024</year>
          .
          <volume>3485199</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>N.</given-names>
            <surname>Pai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.-Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Home fitness and rehabilitation support system implemented by combining deep images and machine learning using unity game engine</article-title>
          ,
          <source>Sens. Mater</source>
          .
          <volume>34</volume>
          (
          <year>2022</year>
          )
          <fpage>1971</fpage>
          -
          <lpage>1990</lpage>
          . doi:
          <volume>10</volume>
          .18494/SAM3734.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Peven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Yuille</surname>
          </string-name>
          , G. Hager,
           <article-title>Synthesizing attributes with unreal engine for fine-grained activity analysis</article-title>
          ,
          <source>in: 2019 IEEE Winter Applications of Computer Vision</source>
          Workshops (WACVW), IEEE, New York,
          <year>2019</year>
          , pp.
          <fpage>35</fpage>
          -
          <lpage>37</lpage>
          . doi:
          <volume>10</volume>
          .1109/WACVW.
          <year>2019</year>
          .
          <volume>00013</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D.</given-names>
            <surname>Ludl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gulde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Curio</surname>
          </string-name>
          ,
          <article-title>Enhancing data-driven algorithms for human pose estimation and action recognition through simulation</article-title>
          ,
          <source>IEEE Trans. Intell. Transp. Syst</source>
          .
          <volume>21</volume>
          (
          <year>2020</year>
          )
          <fpage>3990</fpage>
          -
          <lpage>3999</lpage>
          . doi:
          <volume>10</volume>
          .1109/TITS.
          <year>2020</year>
          .
          <volume>2988504</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Epic</surname>
            <given-names>Games</given-names>
          </string-name>
          , Inc.,
          <source>Motion matching in unreal engine</source>
          ,
          <year>2025</year>
          . URL: https://dev.epicgames.com/documentation/en-us/unreal-engine/motion-matching-in-unreal-engine.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Mixamo</given-names>
            <surname>Inc</surname>
          </string-name>
          .,
          <source>Mixamo animation library website</source>
          ,
          <year>2025</year>
          . URL: https://www.mixamo.com.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>S.</given-names>
            <surname>Butterworth</surname>
          </string-name>
          ,
          <article-title>On the theory of filter amplifiers</article-title>
          ,
          <source>Wirel. Eng</source>
          .
          <volume>7</volume>
          (
          <year>1930</year>
          )
          <fpage>536</fpage>
          -
          <lpage>641</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kwapisz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Weiss</surname>
          </string-name>
          , S. Moore,
          <article-title>Activity recognition using cell phone accelerometers</article-title>
          ,
          <source>ACM SIGKDD Explor. Newsl</source>
          .
          <volume>12</volume>
          (
          <year>2011</year>
          )
          <fpage>74</fpage>
          -
          <lpage>82</lpage>
          . doi:
          <volume>10</volume>
          .1145/1964897.1964918.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>B.</given-names>
            <surname>Barshan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yüksek</surname>
          </string-name>
          ,
          <article-title>Recognizing daily and sports activities in two open source machine learning environments using body-worn sensor units</article-title>
          ,
          <source>Comput. J</source>
          .
          <volume>57</volume>
          (
          <year>2014</year>
          )
          <fpage>1649</fpage>
          -
          <lpage>1667</lpage>
          . doi:
          <volume>10</volume>
          . 1093/comjnl/bxt075.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>F. J.</given-names>
            <surname>Ordóñez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roggen</surname>
          </string-name>
          ,
          <article-title>Deep convolutional and lstm recurrent neural networks for multimodal wearable activity recognition</article-title>
          ,
          <source>Sensors</source>
          <volume>16</volume>
          (
          <year>2016</year>
          ). doi:
          <volume>10</volume>
          .3390/s16010115.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <article-title>Deep learning based human activity recognition (har) using wearable sensor data</article-title>
          ,
          <source>International Journal of Information Management Data Insights</source>
          <volume>1</volume>
          (
          <year>2021</year>
          )
          <article-title>100046</article-title>
          . doi:
          <volume>10</volume>
          .1016/j.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>[29] DeepMotion Inc., DeepMotion, 2025. URL: https://www.deepmotion.com/.</mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>[30] Runway AI Inc., Introducing Gen-3 Alpha: a new frontier for video generation, 2025. URL: https://runwayml.com/research/introducing-gen-3-alpha.</mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>[31] E. Salvato, G. Fenu, E. Medvet, F. A. Pellegrino, Crossing the reality gap: a survey on sim-to-real transferability of robot controllers in reinforcement learning, IEEE Access 9 (2021) 153171-153187. doi:10.1109/ACCESS.2021.3126658.</mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>[32] A. Akbari, R. Jafari, Transferring activity recognition models for new wearable sensors with deep generative domain adaptation, in: Proceedings of the 18th International Conference on Information Processing in Sensor Networks, ACM, New York, USA, 2019, pp. 85-96. doi:10.1145/3302506.3310391.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>