=Paper= {{Paper |id=Vol-3776/paper12 |storemode=property |title=Kuura: Leveraging eclipse kuksa in vehicular data collection and digital twin creation environment |pdfUrl=https://ceur-ws.org/Vol-3776/paper12.pdf |volume=Vol-3776 |authors=Olli Timonen,Toni Bornström,Nicklas Stafford,Samuli Määttä,Alireza Bakhshi Zadi Mahmoodi,Tero Päivärinta,Ella Peltonen |dblpUrl=https://dblp.org/rec/conf/tktp/TimonenBSMMPP24 }} ==Kuura: Leveraging eclipse kuksa in vehicular data collection and digital twin creation environment== https://ceur-ws.org/Vol-3776/paper12.pdf
                                Kuura: Leveraging Eclipse Kuksa in Vehicular Data
                                Collection and Digital Twin Creation Environment
                                Olli Timonen, Toni Bomström, Nicklas Stafford, Samuli Määttä,
                                Alireza Bakhshi Zadi Mahmoodi, Tero Päivärinta and Ella Peltonen
                                Empirical Software Engineering in Software, Systems, and Services, University of Oulu, Finland


                                                  Abstract
                                                  Increased sensing and computing capabilities in cars are crucial for advanced traffic and driving automation. However, novel
                                                  data delivery, testing, and machine learning pipelines are still needed to harness the full capabilities of automotive sensing
                                                  solutions. At the same time, vehicular digital twins are needed to enable versatile testing and simulation capabilities. This
                                                  paper depicts the Vehicle-In-The-Loop (VIL) cloud interface and verifies data consistency regardless of the source. The
                                                  study aims to determine how data collected from simulation corresponds to real test drive data. The data is collected from
                                                  both simulation and actual test drives. Utilising the MQTT protocol, data is stored on a cloud server and further fed into
                                                  Unreal Engine 5, where the test drive is replayed, and its correspondence to the real drive is ensured. This work offers a new
                                                  perspective on verifying data consistency between simulated and real test drives and complements the vehicle abstraction
                                                  opportunities provided by Eclipse KUKSA. Our results highlight digital twin creation as a part of automotive software
                                                  development and set premises for testing and validating complex use cases, such as traffic accidents and extreme weather,
                                                  that can rarely or only with severe expenses be tested in real-life situations.

                                                  Keywords
                                                  Vehicular Computing, Data Transfer, Digital Twins



TKTP 2024: Annual Doctoral Symposium of Computer Science, 10.-11.6.2024 Vaasa, Finland
Email: olli.timonen@oulu.fi (O. Timonen); toni.bomstrom@proton.me (T. Bomström);
Nicklas.Stafford@oulu.fi (N. Stafford); Alireza.BakhshiZadiMahmoodi@oulu.fi (A. B. Z. Mahmoodi);
tero.paivarinta@oulu.fi (T. Päivärinta); ella.peltonen@oulu.fi (E. Peltonen)
ORCID: 0009-0006-4132-9496 (O. Timonen); 0009-0004-3192-7711 (A. B. Z. Mahmoodi);
0000-0002-3374-671X (E. Peltonen)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).

1. Introduction

Today’s cars hold considerable computational and sensing capabilities that are crucial for advanced
traffic and driving automation, with applications spanning from safety features to fully automated
vehicles. However, data delivery and management protocols and interfaces that are required for
machine learning pipelines are still primarily closed in company-specific silos [1]. The first
efforts for creating open-source data transfer protocols and interfaces from the car to the cloud
environment include Eclipse Kuksa [2], on which this work is also based, but considerable work is
still required for data validation and for benchmarking the efficiency of the proposed frameworks.
Data sharing through open interfaces can boost innovation through more efficient and accurate
machine learning models that cover more expansive geographic areas and use cases [3].

At the same time, digital twins can enable versatile testing and simulation capabilities as seen
with applications in industry, energy, and transportation verticals [4]. Indeed, digital twins have
attracted much research interest in recent years [5, 6]. The concept of the digital twin is broad,
and definitions may vary. Still, the main idea is to model physical systems with digital means and
update these digital models dynamically based on measurement data. In essence, digital twin methods
provide digital spaces where reality can be modelled virtually [7] as it is or would be in unseen
but possible situations. Indeed, the creation of the digital twin as part of automotive software
development sets premises for testing and validating complex use cases, such as traffic accidents
and extreme weather, that can rarely, or only at severe expense, be tested in real-life situations.
The utilisation of digital twins for automotive software development opens avenues for testing
different sensors and components in actual use cases, such as studying the longevity of such
components and proposing novel learning strategies that combine multiple data sources.

This paper describes the Vehicle-In-The-Loop (VIL) cloud interface. It verifies data consistency
regardless of the source: a real car on the road or a virtual object in the digital twin
environment. The overview of the Kuura platform is provided in Figure 1. We use KUKSA.val [8], which
provides a vehicle abstraction layer to enable the management and use of vehicle signals. As a
digital twin modelling framework, we use Unreal Engine 5. Capabilities of utilising such a game
engine in digital twin creation have been successfully demonstrated in wind power plants [9] and
cultural tourism [10]. Using a game engine and a VIL cloud interface enables visual simulation that
complements the capabilities provided by KUKSA.val; KUKSA.val is used to collect data from a test
drive in a real car. For validation, we determine how




data collected from the simulation corresponds to real test drive data in a real-life driving
scenario. Utilising the MQTT protocol, data is stored on a cloud server and further fed into Unreal
Engine 5, where the test drive is replayed, and its correspondence to the real drive is ensured.
This work offers a new perspective on verifying data consistency between simulated and real test
drives and complements the vehicle abstraction opportunities provided by Eclipse KUKSA.

Figure 1: Overview of Kuura's overall architecture.

The main contributions of this work are the following:
1) We provide the Kuura platform for collecting vehicular sensor data in the same way from a real
car and from corresponding test runs in a digital twin environment. With this, we extend the
KUKSA.val environment to better fit digital testing and validation tasks.
2) We explore Unreal Engine 5 as a vehicular digital twin environment and provide a pipeline to
deploy such digital twins with simulated and real test drives.
3) We experiment with the data consistency between simulated and real test drives and further
demonstrate the power of game engine-based digital twins in vehicular computing and sensing
scenarios.

2. Background

2.1. Vehicle as a Sensing Device

Modern cars implement technologies for automatic braking, Cooperative Adaptive Cruise Control
(CACC), prevention of unwanted lane crossing, distance keeping, and so on, to supplement drivers’
own cognition and prevent accidents. For further technological advancement, vehicles will require
artificial intelligence (AI) and machine learning (ML) capabilities depending on effective data
transfer and management systems. With increased networking and computing capabilities, vehicles and
their supporting edge and cloud back-ends can perform challenging inference and learning tasks to
support drivers’ cognition and automate the driving scenario [11, 12]. Such intelligent systems
demand training data, which in-vehicle sensors and external databases can provide. How to make the
data available, processed, and utilised in a challenging real-time and mobile environment is a
timely research question.

Vehicular safety systems (and any other relevant applications) of any level of driving autonomy
require data from in-vehicle sensors [13], such as cameras, LiDARs, radars, and speed meters
[14, 15]. This information can be used to, for example, improve lane [16] and road pothole [17]
recognition. Solutions for detecting drivers’ behaviour while using smartphones during driving [18]
and drunk driving [19] have been explored. However, the results underline that human drivers’
perception and reasoning still maintain an advantage compared to fully automatic vehicles [20].

However, most of the in-vehicular and driver’s personal sensors and interfaces are brand-specific or
closed, limiting access to the data, computing, and networking capabilities and thus hindering
vehicular application development. To enable connected vehicles to utilise all the available data
sources, AI/ML computing resources, and networking capabilities, open-sourced general interfaces and
software platforms need to be defined [1]. The on-board diagnostics (OBD) protocol refers to a
vehicle’s self-diagnostic and reporting capability. The more advanced OBD-II is a protocol
homogenised into the vehicle itself, allowing software-defined onboard operations and, most
importantly, collecting a wide range of vehicular data for the software-defined vehicle use case.
This includes but is not limited to engine load, coolant temperature, fuel pressure, engine
revolutions per minute (RPM), vehicle speed, intake air temperature, airflow rate,
throttle position and many types of sensor data like oxygen sensors and fuel system status. OBD-II
has relatively easy access to the mentioned sensors, which is enough to prove the concept. In the
future, research conducted with vehicular sensors can utilise direct access to the vehicle’s
controller area network (i.e. CAN bus) and standardised architectures such as AUTOSAR for wider data
access.

2.2. Automotive Simulations

Driving and traffic simulators are used in the automotive industry as an alternative to costly and
potentially dangerous real-life testing [21]. The advantages of such practices highlight
effectiveness in analysing human driving behaviour and essential traffic situations often too
hazardous to test in real-life scenarios, such as extreme weather, congestion, and accidents [22].
This can be especially emphasised in the increasing reliance on simulation technologies for
assessing human driving factors. The more real-life, photo-realistic simulations enable simultaneous
testing of vehicle dynamics and stochastic pedestrian, driver, and vehicle interactions in various
scenarios [23].

However, traditional simulators often have limitations in emulating real-life behaviour and
perception. This has led to a growing interest in game engines as simulation platforms for
developing and testing autonomous vehicle control systems [24]. Several vehicular simulators, such
as CARLA, AirSim and CarSim, provide simulation capabilities and environments to support vehicle
research and development. These platforms have been used to study vehicle autonomy, safety and
performance. CARLA is a free open-source simulator to support autonomous vehicle systems’
development, training, and validation. AirSim is a simulator for drones and cars developed by
Microsoft. It can also provide the possibility to experiment with deep learning, computer vision and
reinforcement learning algorithms in autonomous vehicles and the creation of complex and changeable
environments and additional sensor modalities [25]. CarSim is a vehicle dynamics simulation platform
that allows the simulation of vehicle behaviour in different conditions and environments, including
motor dynamics, through Simulink models. It can be used to create accurate models of vehicles and
simulate their behaviour under different road surfaces, weather conditions, and traffic situations
[26], but is not open-sourced.

In this study, we use Unreal Engine, renowned for its versatility, high-quality graphics and
realistic physics simulation, which is useful for simulating vehicles [21]. Competing game engines
include Unity and CryEngine, of which CryEngine is the smaller project. The main arguments that
favour Unreal Engine are that it is free of charge for research and for commercial projects until
they make one million in revenue, and that its source code is open, even though it is, precisely
speaking, under a source-available license, making it suitable for various applications in
autonomous and driving-support test cases [22]. The key differences between Unity and Unreal Engine
are summarised in Table 1. Unreal Engine has typically been considered a better choice for 3D games,
while Unity has been considered a strong choice for 2D games.

The previous literature emphasises the importance of determinism in simulation environments to
ensure repeatability, allowing for trustworthy and easily debuggable results. Game engines still may
come with challenges of non-deterministic behaviours. For example, the investigation by Chance et
al. [24] reveals significant simulation variance in CARLA, particularly due to actor collisions and
system-level resource utilisation. As such, accuracy investigation is one of the key goals in our
preliminary work presented in this paper.

2.3. Automotive Digital Twins

Digital twins (DT) in the context of vehicles are an emerging field that has attracted significant
attention in both industry and academia [27]. Digital twins are virtual representations of physical
entities, such as vehicles, that aim to mirror the lives and behaviours of their real-world
counterparts [28]. These digital replicas use the best available physical models, sensor updates,
and other data sources to simulate and predict the behaviour of the corresponding physical twin
[29, 30]. One area where digital twins have shown great potential is in the automotive industry,
particularly for electric vehicles [31]. Digital twins can greatly benefit electric vehicles, which
have gained greater market share in recent years. By creating a digital twin of an electric vehicle,
manufacturers and researchers can simulate and optimise its performance, energy consumption and
other key parameters. Unlike traditional simulators, digital twins provide additional capabilities
for human-machine interaction and performing data-driven actions in real-world scenarios.

Digital twins also play a key role in the design and development of autonomous vehicles [32]. The
concept of digital twins is closely related to the transition to data-driven vehicles, as it enables
the analysis and validation of autonomous vehicle designs [33]. By exploiting digital twin
technologies, researchers can assess the safety and security of autonomous vehicles and identify
potential risks and vulnerabilities. Furthermore, combining digital twins with connected vehicle
technology and cloud computing has led to the development of the Mobility Digital Twin (MDT)
framework [34]. These frameworks consist of digital representations of people, vehicles, and
transport, which enable the analysis and optimisation of mobility and large-scale traffic systems.
By exploiting real-time data and simulations, MDT frameworks can support decision-making processes
and improve the
efficiency and safety of transportation systems.

Feature                  Unreal Engine                                 Unity
Developer                Epic Games                                    Unity Technologies
Programming Languages    C++, Blueprint                                C#
Source Code              Open source                                   Not open source
Pricing                  Free for research and for commercial use      Free version available
                         up to 1 million revenue, 5% commission after
                         that
Learning Curve           Steep                                         Easy to learn with intuitive user interface
Graphics                 Photorealistic graphics, used in AAA games    High-quality graphics, but not as refined
                                                                       as Unreal
Physics and Simulation   Ragdoll physics, physics-based destruction,   Easily integrated and well-rounded with
                         fluid simulation                              other engine features
2D vs. 3D                Excellent 3D-development, especially for      Strong 2D development capability, excellent
                         creating photorealistic environments and      choice for 2D game projects
                         visual effects
Table 1: Comparison of Unreal Engine and Unity

Digital twins enable the simulation, optimisation, and analysis of vehicle performance, energy
consumption, and safety and security. Combining digital twins with connected vehicle technology and
cloud computing will extend their capabilities to optimise mobility systems. As technology advances,
digital twins can be expected to play a key role in shaping the future of vehicles and transport
systems. As such, current technologies aim to create models for distributed multi-agent
cyber-physical systems using co-simulation [35]. Such large-scale digital twins should be able to
make predictions about the future condition and behaviour of the vehicle [36]. However, AI-based
digital twin capabilities require data cooperation and load-balancing, scheduling, and network
security schemes over the vehicle-to-cloud computing continuum [37].

2.4. Open-sourced Automotive Software

Open-source software refers to software that has publicly available and editable source code. This
allows collaborative development and innovation. One of the most remarkable benefits of open-source
software is its flexibility and customizability, as user communities can adapt the software to their
specific needs. Open source is also cost-effective as it is free and reduces dependencies on
specific software providers. The use of open source also offers opportunities for innovation in
automotive software development and promotes the use of new technologies and solutions [38, 39].

For the automotive industry, open-source software presents some unique challenges as vehicular
software, by default, has life-critical safety and reliability requirements [40]. Technically,
anyone can modify the source code, which may create unwanted surprises and vulnerabilities. The
maintenance and support of open-source software can be uncertain if its developers and community
are not active or committed. While open-source software is dynamic and constantly changing, vehicles
purchased today will remain in traffic for decades. In addition, there is a need for precise quality
control and software certification in the automotive industry, which can be challenging to implement
in an open-source environment because access to representative designs and industry-standard
methodologies is limited. This limitation challenges researchers as automotive companies do not
openly share their development life-cycles and verification methods, each maintaining proprietary
techniques. Given this scenario, there is a growing demand for open-source solutions to support the
development and research of automotive applications, emphasizing the need for open-source benchmarks
to facilitate research across various aspects of automotive application development [41].

3. Kuura Implementation

3.1. Design Principles

The cornerstone of our framework is grounded in the principle of open-source development, ensuring
transparency and collaborative potential. Simplicity is at the core, paving the way for effortless
future evolution. Our design philosophy revolves around creating a system that is not just
functional today but remains adaptable and maintainable for tomorrow’s innovations. The essence of
this framework is to avoid complexity, instead embracing a minimalist approach that prioritises ease
of understanding and operation. One must consider the life cycle of software components, as updates
and dependencies are inevitable. The framework architecture
is designed to handle these, avoiding obsolescence and incompatibility.

3.2. Kuura Architecture Design

The general architecture of Kuura is shown in Figure 2. We chose the Unreal Engine 5 game engine
because of its versatility in creating realistic simulations. This is an essential part of the
research objective to verify the consistency of the data between the simulation and the real test
runs. The MQTT protocol was chosen to collect and transfer the data to the cloud server and the game
engine, as it is reliable and efficient for real-time data transfer. Real-time transfer might be the
next step in the research and is thus a critical requirement in our study. The MQTT protocol
operates asynchronously and is considered an ideal choice for IoT applications that often operate on
resource-limited devices or low-bandwidth networks.

Figure 2: Deployment diagram of the Kuura vehicular data collection system.

The Kuura framework presents a cohesive suite of components, each selected for robustness and
simplicity. At its foundation lies the integration of Kuksa, SMAD, and Kuura, delineating a timeline
of iterative progress as detailed in Table 2. Each iteration is a response to the evolving needs and
challenges encountered. Kuksa, initially misaligned with its focus on automotive app stores and
firmware updates, has since been archived [8, 2]. SMAD was unsustainable due to its complexity and
poor documentation [42]. Our simplified stack emerges as a response, stripping away the superfluous
to focus on functionality. It leverages OpenShift (run on the CSC Rahti container cloud,
https://rahti.csc.fi/) for its cost-efficiency compared to Microsoft Azure. This pragmatic approach
is engineered to reduce complexity, cost, and maintenance overhead,
streamlining operations without compromising capability.

Purpose                                        Eclipse Kuksa Cloud  SMAD stack                                    Kuura (this paper)
Cloud Service Provider                         Microsoft Azure      Microsoft Azure                               OpenShift
Deployment Platform                            Kubernetes           Kubernetes                                    OKD
Client-Server Messaging Infrastructure Broker  Eclipse Hono         Eclipse Hono                                  Eclipse Mosquitto
Serverside Messaging Infrastructure            -                    Ambassador and Kafka with Zookeeper           Python script
Client Message Persistence                     InfluxDB             MongoDB                                       InfluxDB
Client Message Data Modelling                  Kuksa.VAL            Kuksa.VAL                                     Client implementation
Client Firmware Updates                        Eclipse hawkBit      -                                             -
Client Appstore                                Kuksa Appstore       -                                             -
Messaging Telemetry Storage                    -                    MongoDB                                       -
Data Visualization                             -                    Node-RED                                      Grafana
Deployment Monitoring                          -                    Prometheus Monitoring, InfluxDB, and Grafana  -
Message Tracing                                -                    Jaeger Trace                                  -
Table 2: Eclipse Kuksa Cloud, SMAD stack, and Kuura software components.

Each framework iteration (Kuksa, SMAD, and Kuura) brings new insights. Kuksa’s archival signals a
pivot away from its original automotive-centric focus. SMAD’s downfall was its complexity and
reliance on now-inaccessible Kubernetes Helm charts. Kuura emerges as the distilled essence of its
predecessors, embodying simplicity and sustainability. By eliminating non-essential components,
Kuura adapts existing functionalities with more straightforward tools, significantly reducing cost
and complexity and enabling an environment conducive to continuous development and operation.

3.3. Vehicle Data Reader

The OBD-II is a port designed for diagnostic purposes. It has multiple buses available. These buses
include the CAN bus, SAE J1850 and ISO 9141-2. The automotive manufacturers can also provide other
networks at their discretion [43]. The bus we are most interested in is the CAN bus. On some
vehicles, the CAN bus available at the OBD connector can be protected by a gateway device
restricting access to some data from the OBD port. Unlike the CAN bus inside the car, you must poll
the OBD port to receive any data. While we could get most of the data we wanted from the OBD port,
some data, like the steering wheel position, was unavailable. This makes the OBD port unsuitable as
a data source for our purposes, as it would make it quite difficult to drive the virtual car in
Unreal accurately.

In the evaluation phase, we collected data from an OBD-II Bluetooth adapter connected to a Toyota
RAV4 car. A laptop computer running Linux was connected to the adapter, and a script was run to
record data from the vehicle in a log file. The successful log file collection was also important
for developing the auto-client script for future larger tests and for ensuring the whole system’s
functionality. Practical testing in the first phase was carried out by driving the car and ensuring
the data was stored correctly and its format was manageable.

3.4. Data Transmission

MQTT makes it trivial to multi-cast the collected data if we want to enable multiple clients to
listen to the generated data simultaneously. One example of such a scenario is live visualisation of
the data while saving it to a database without additional latency. While we could also save the data
and then fetch it from the database, this would add latency to the visualisation. MQTT also has
built-in, easy-to-configure security mechanisms. Setting up MQTT with SSL and configuring the MQTT
broker to require client certificates for communication are both straightforward. The connection can
also be set up to require a username and password.

We could also use HTTP or raw TCP/UDP sockets as an alternative to MQTT. While HTTP offers security
measures similar to MQTT, it does not have multi-cast by default. While multi-cast is not hard to
implement, MQTT has it built in, and the built-in implementation is most likely already done
correctly. One advantage HTTP has over MQTT is the ability to communicate directly between two
applications, eliminating the need for a broker in cases where there is only one client.

Raw sockets are the most basic option, and they don’t come with any of the advanced features
included in
MQTT out of the box. However, they are very versatile and can be used for various purposes. One
advantage the sockets would offer is the ability to write raw CAN data as is to the socket. This
would enable saving raw CAN dumps in a database with minimal overhead if we ever needed or wanted to
support it. One problem with multi-cast solutions is that the provider has no idea if any clients
are listening for the sent data unless the clients have been programmed to provide feedback when
they are listening. This makes it harder to implement the provider in a way that it holds the
messages in memory or saves them locally in case the data is sent to nowhere.

Figure 3: Sequence diagram of the vehicular data transferred to Unreal Engine 5.

3.5. Cloud Environment

The cloud environment receives data from the MQTT broker. The environment also has a Python script
that connects to the broker to receive the data from the vehicle. The script stores all of the
received messages in InfluxDB. The point name and field are derived from the MQTT topic. The
timestamp is taken from the MQTT message payload. Since the timestamp is included in the message, we
could use any database solution to store the data. If the timestamp were missing, however, then a
time series database would be our only option since the message times are crucial for playback at a
later time.
   In the experiments, a laptop was used as the in-vehicle
                                                              By getting the message time from the provider, we can
client running Ubuntu 22.04 LTS, and the script collect-
                                                              ensure that network conditions do not affect the accuracy
ing the data was written in Python using an OBD library
                                                              of the recorded timestamps.
[44]. The script writes read values into a CSV file locally
                                                                 InfluxDB, a time series database used to store large
and publishes them using the MQTT protocol. The back-
                                                              amounts of time-stamped data due to its high perfor-
end was deployed on CSC’s Rahti as RedHat OpenShift
                                                              mance and scalability, was stored at the onset of the pro-
deployments. On the server side, Mosquitto MQTT bro-
                                                              cess. Storage is essential in handling large amounts of
ker forwards the published messages to subscribers. The
                                                              data that emanate from driving vehicles. A Python script
most important subscriber is a Python service that stores
                                                              was then used in the next stage of the data-processing
received messages in an InfluxDB instance. As an addi-
                                                              workflow. This script had two main functionalities: First,
tional demonstration, Grafana was deployed to provide
                                                              it reads GPS point data pre-recorded into a JSON file,
a real-time dashboard for the published and stored data.
                                                              which is vital in mapping out routes of vehicles. Sec-
The sequence diagram is provided in Figure 3.
                                                              ondly, this script establishes a connection with InfluxDB
                                                              to retrieve useful information within a particular range.
                                                              This recovery is critical for evaluating the vehicle’s per-
                                                              formance and environmental conditions during various
experiment stages.
   At this point, the processed data goes through an
MQTT broker using a Python script. Once more, this pro-
tocol provides lightweight messaging, providing fast and
reliable real-time information transmission that would
be needed for the simulation environment.
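
The mapping from an MQTT message to a stored point, as described in this section, can be sketched as follows. This is a minimal illustration rather than the script used in the study: the topic layout (`<source>/<vehicle>/<signal>`), the JSON payload keys (`value`, `timestamp`) and the function names are assumptions made for the example. The output follows the InfluxDB line protocol, which expects nanosecond timestamps by default.

```python
import json


def _escape(text: str, chars: str) -> str:
    """Backslash-escape characters that line protocol treats as separators."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def mqtt_to_line_protocol(topic: str, payload: bytes) -> str:
    """Turn one MQTT message into an InfluxDB line-protocol record.

    Assumed (hypothetical) conventions, not taken from the paper:
      * topic   "<source>/<vehicle>/<signal>", e.g. "smad/toyota/speed"
      * payload JSON object {"value": <number>, "timestamp": <unix seconds>}
    """
    # Everything before the last "/" becomes the point name, the rest the field.
    point_name, _, field = topic.rpartition("/")
    if not point_name or not field:
        raise ValueError(f"topic {topic!r} has no point name or field part")

    message = json.loads(payload)
    value = float(message["value"])
    # Use the provider's timestamp, not the arrival time, so that network
    # delay does not distort the recorded series.
    timestamp_ns = round(float(message["timestamp"]) * 1_000_000_000)

    return (f"{_escape(point_name, ', ')} "
            f"{_escape(field, ',= ')}={value!r} {timestamp_ns}")
```

Under these assumptions, a message on `smad/toyota/speed` with payload `{"value": 42.5, "timestamp": 1700000000.5}` becomes the record `smad/toyota speed=42.5 1700000000500000000`. Because the timestamp travels inside the payload, the same record could equally be written to a non-time-series store.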

3.6. Simulation Environment
Multiple reasons contributed to the choice of the Unreal Engine 5 game engine, including its
capacity to create realistic simulations of real car driving and the possibility of driving a
car in a simulation, thereby generating corresponding data. The research aimed to ensure
uniformity between the simulation and actual driving, thus requiring realistic simulations.
Unreal Engine 5 also makes its source code available to developers, which supports one of the
implementation principles of the study: making future development as easy as possible.
   The research utilised the MQTT protocol, one of the key IoT connection and data collection
components. Unreal Engine does not have native MQTT support. For this reason, we used the
NinevaStudios MQTT-utilities extension with some modifications. This extension enabled MQTT data
communication, which is essential for collecting data from the simulation, with minor
adjustments made to transfer the data to cloud storage securely. Through this connection, it
was possible to develop a dynamic and interactive simulation environment.
   Lastly, we simulate a car running along the received GPS points, as shown in Figure 4. In the
simulation, the vehicle's movement was driven by speed data acquired from the MQTT broker. As a
result, real-time synchronisation between the GPS points and speed data gave an accurate
representation of the journey made by the vehicle, allowing its performance in different
circumstances to be examined in detail. Such a holistic approach to data storage, processing,
transmission, and visualisation shows how diverse technologies can be integrated into
high-level vehicular data analysis and simulation.

Figure 4: Simulated route in the virtual environment, based on the real data points collected
during the experiments.

   The initial version of Kuura presented in this paper has a dynamic road generated as the car
moves around, thus simplifying testing by making it independent of environmental conditions.
This method enables better flexibility in the testing process because it does not require a
predefined route or special environmental circumstances. Generating dynamic roads is essential
to ensure the reliability of the data collection system. This phase, built on the multiple
approaches used in the study, emphasises adaptability and precision. By generating the road
during runs, the accuracy of the collected data could be evaluated instantly. Working within
this dynamic environment is particularly advantageous for identifying and solving prospective
issues in the data-processing workflow, making it robust and efficient. This technique also
makes the Unreal Engine simulation more elaborate. It allows different scenarios to be run on a
platform without sticking to a single static map, giving the evaluation process more
flexibility.

4. Experimentation

4.1. Real-life Experiment
The real-world tests were conducted in the OuluZone vehicle testing area using a 2019 Toyota
RAV4 Hybrid. A closed area, such as OuluZone, was chosen because it allows for assessing the
drives and their safety. The significance of this place is that it helped gather and analyse
information in real-life scenarios, thus allowing comparison and verification with data
collected from virtual and actual driving instances. Besides being a recreational driving and
sports centre, OuluZone is also a notable site for research and learning, especially on
autonomous cars and related technologies.
   Several laps were driven during the tests, some with the cruise control set at different
speeds (30 km/h, 40 km/h, and 50 km/h) to facilitate the validation of results in the
simulation with data collected at a constant speed. Laps were also driven without cruise
control at varying speeds. Driving data was collected during the test via an OBD-II Bluetooth
adapter connected to a laptop running Linux. This allowed the vehicle data to be logged and its
format managed. Towards the end of the tests, a USB adapter was used to improve data
collection.

4.2. Virtual Experiment
Our virtual experiment utilised Unreal Engine 5.3.2 to run test drive scenarios comparable to
our real-world data collection efforts. In this experiment, we used the same logger used during
actual test drives with a real car, ensuring a uniform approach to data acquisition and proving
that the logger could be used without changes in both
Figure 5: Screenshot of InfluxDB, which contains both real-world data (smad/toyota) and
virtually collected data (unreal/toyota).

Figure 6: A picture of the car driving in the virtual OuluZone 3D environment using the data
collected in the real OuluZone.

environments. We gathered data on speed and time from the virtual test drive, which can be
cross-verified with the real car's outputs. The current limitation of real-world data
collection stems from the OBD-II interface's inability to provide comprehensive vehicle
diagnostics. In the virtual setting, we collected additional data such as gear, throttle, brake
application, and steering angle. These were predominantly included for illustrative purposes,
aiming to demonstrate the extensive data collection possibilities within a simulated
environment. Verifying these additional parameters will become feasible with future access to
the CAN bus, allowing for a more detailed and accurate comparison between virtual and real
vehicle data.

4.3. Experimentation Results
In our validation process, we specifically focused on comparing the collected GPS data and
speed data between the actual driving tests and the virtual driving tests conducted in Unreal
Engine 5. As shown in Figure 5, the same InfluxDB database successfully contains both
real-world data (smad/toyota) and virtually collected data (unreal/toyota). This design will
further allow simultaneous analysis of both virtual and real-world data sets, letting us expand
the digital twin creation capabilities with virtual realities and actual real-life test runs,
independently of the data source.
   As illustrated in Figure 6, we successfully mapped the collected GPS data from the cloud
onto the 3D model of the racetrack at runtime and verified its accuracy. This demonstrates that
our virtual environment can accurately replicate real driving conditions. The speed data
collected in the database corresponded with the data obtained in the Unreal Engine 5
simulation, confirming the consistency of data in both real and virtual driving scenarios.
While the data transmitted from the game engine to the server was also accurate, at this stage
our primary focus was on verifying the accuracy of speed and time information. Expanding this
experimentation to cover a wider range of variables is possible in future research.

5. Discussion and Conclusions
In this study, we have aimed to bring new insights into vehicular data collection and the
creation of digital twins by using the Eclipse Kuksa platform and Unreal Engine 5 to simulate
driving scenarios. Our main focus was providing an overview of a simplified vehicular data
collection architecture that can be easily developed for further projects, and verifying the
consistency between real and simulated vehicular data through practical real-world
experimentation.
   Using the MQTT protocol for sending data and Unreal Engine 5 for simulation has allowed us
to compare real driving data with simulated data. This method makes digital twins more reliable
and allows later use for testing in many conditions that are hard or expensive to create in
real life, like very bad weather or different kinds of traffic situations.
   We encountered challenges in data collection via the OBD-II protocol because it is filtered
and does not allow the collection of all possible data. This limitation highlighted the need
for more comprehensive data acquisition methods like the CAN bus. The data collection
limitations prompted us to consider future enhancements in our methodology to achieve a more
accurate and encompassing digital representation of the vehicle.
   Our findings open up possibilities for future research directions, including optimising data
transmission methods for improved efficiency and exploring bi-directional data flow between the
digital twin and the vehicle. Such advancements could potentially enable real-time vehicle
control based on digital twin data.
   By integrating additional simulation models and considering more sophisticated data
collection interfaces, we anticipate that future iterations of this work will address the
current limitations and unlock new capabilities for digital twins in automotive research and
development. The potential for these technologies to improve vehicle safety, efficiency, and
innovation is immense, paving the
way for a more interconnected and intelligent transportation ecosystem.
   Future efforts should use the CAN bus instead of OBD-II to improve accuracy and completeness
and to gain access to all the data the vehicle provides. Reconsidering data transmission
methods, such as MQTT, for more efficient data multicasting is also a possible future
direction. We are also looking into sending data from the game engine to the car instead of
just storing it in the cloud, having the car drive in real life and in the game engine
simultaneously with as little latency as possible, and integrating Eclipse Arrowhead to extend
the possibilities with simulation models, such as using the architecture with MATLAB Simulink
or corresponding open-source physics modelling software.

Acknowledgments
The work has been supported by the EU HORIZON project CHIPS-JU CIA FEDERATE (grant number
101139749), Business Finland project 6G Visible (grant number 10743/31/2022), and the Finnish
Research Council project Northern Utility Vehicle Laboratory Consortium GO!-RI (grant number
352726).

References
  [1] E. Peltonen, A. Sojan, T. Päivärinta, Towards real-time learning for edge-cloud continuum with vehicular computing, in: 2021 IEEE 7th World Forum on Internet of Things (WF-IoT), IEEE, 2021, pp. 921–926.
  [2] A. Banijamali, P. Kuvaja, M. Oivo, P. Jamshidi, Kuksa∗: Self-adaptive microservices in automotive systems, in: International Conference on Product-Focused Software Process Improvement, Springer, 2020, pp. 367–384.
  [3] J. Nickerson, K. Lyttinen, J. L. King, Automated Vehicles: A Human/Machine Co-learning Perspective, Technical Report, SAE Technical Paper, 2022.
  [4] F. Tao, B. Xiao, Q. Qi, J. Cheng, P. Ji, Digital twin modeling, Journal of Manufacturing Systems 64 (2022) 372–389.
  [5] M. Liu, S. Fang, H. Dong, C. Xu, Review of digital twin about concepts, technologies, and industrial applications, Journal of Manufacturing Systems 58 (2021) 346–361. Digital Twin towards Smart Manufacturing and Industry 4.0.
  [6] F. Tao, H. Zhang, A. Liu, A. Y. Nee, Digital twin in industry: State-of-the-art, IEEE Transactions on Industrial Informatics 15 (2018) 2405–2415.
  [7] B. R. Barricelli, E. Casiraghi, D. Fogli, A survey on digital twin: Definitions, characteristics, applications, and design implications, IEEE Access 7 (2019) 167653–167671.
  [8] A. Banijamali, P. Jamshidi, P. Kuvaja, M. Oivo, Kuksa: A cloud-native architecture for enabling continuous delivery in the automotive domain, in: International Conference on Product-Focused Software Process Improvement, Springer, 2019, pp. 455–472.
  [9] J. V. Sørensen, Z. Ma, B. N. Jørgensen, Potentials of game engines for wind power digital twin development: an investigation of the unreal engine, Energy Informatics 5 (2022) 1–30.
 [10] F. Sang, H. Wu, Z. Liu, S. Fang, Digital twin platform design for zhejiang rural cultural tourism based on unreal engine, in: 2022 International Conference on Culture-Oriented Science and Technology (CoST), IEEE, 2022, pp. 274–278.
 [11] A. Alhilal, T. Braud, P. Hui, Distributed vehicular computing at the dawn of 5g: a survey, arXiv:2001.07077 (2020).
 [12] Y. Khaled, M. Tsukada, J. Santa, T. Ernst, On the design of efficient vehicular applications, in: VTC Spring 2009-IEEE 69th Vehicular Technology Conference, IEEE, 2009, pp. 1–5.
 [13] S. Baidya, Y. Ku, H. Zhao, J. Zhao, S. Dey, Vehicular and edge computing for emerging connected and autonomous vehicle applications, in: Proc. of the 57th Design Automation Conference (DAC), 2020.
 [14] M. Munz, M. Mahlisch, K. Dietmayer, Generic centralized multi sensor data fusion based on probabilistic sensor and environment models for driver assistance systems, IEEE Intelligent Transportation Systems 2 (2010).
 [15] F. Garcia, D. Martin, A. De La Escalera, J. M. Armingol, Sensor fusion methodology for vehicle detection, IEEE Int Transportation Systems 9 (2017).
 [16] Q. Li, L. Chen, M. Li, S. L. Shaw, A. Nüchter, A sensor-fusion drivable-region and lane-detection system for autonomous vehicle navigation in challenging road scenarios, IEEE Transactions on Vehicular Technology 63 (2014) 540–555.
 [17] A. Ghose, P. Biswas, C. Bhaumik, M. Sharma, A. Pal, A. Jha, Road condition monitoring and alert application, in: IEEE International Conference on Pervasive Computing and Communications Workshops, IEEE, Lugano, Switzerland, 2012, pp. 489–491.
 [18] Y. Wang, J. Yang, H. Liu, Y. Chen, M. Gruteser, R. P. Martin, Sensing vehicle dynamics for determining driver phone use, in: Int. conf. on mobile systems, applications, and services, 2013, pp. 41–54.
 [19] J. Ljungblad, B. Hök, A. Allalou, H. Pettersson, Passive in-vehicle driver breath alcohol detection using advanced sensor signal acquisition and fusion, Traffic injury prevention 18 (2017).
 [20] B. Schoettle, Sensor fusion: A comparison of sensing capabilities of human drivers and highly automated vehicles, University of Michigan (2017).
 [21] D. Michalík, M. Jirgl, J. Arm, P. Fiedler, Developing an unreal engine 4-based vehicle driving simulator applicable in driver behavior analysis: a technical perspective, Safety 7 (2021) 25.
 [22] S. Malik, M. A. Khan, H. El-Sayed, Carla: Car learning to act - an inside out, Procedia Computer Science 198 (2022) 742–749. 12th International Conference on Emerging Ubiquitous Systems and Pervasive Networks / 11th International Conference on Current and Future Trends of Information and Communication Technologies in Healthcare.
 [23] A. Dubs, V. C. Andrade, M. Ellis, S. Ganley, B. Karaman, O. Toker, A photo-realistic simulation and test platform for autonomous vehicles research (????).
 [24] G. Chance, A. Ghobrial, K. McAreavey, S. Lemaignan, T. Pipe, K. Eder, On determinism of game engines used for simulation-based autonomous vehicle verification, IEEE Transactions on Intelligent Transportation Systems (2022).
 [25] W. Jansen, E. Verreycken, A. Schenck, J.-E. Blanquart, C. Verhulst, N. Huebel, J. Steckel, Cosys-airsim: A real-time simulation framework expanded for complex industrial applications, in: 2023 Annual Modeling and Simulation Conference (ANNSIM), IEEE, 2023, pp. 37–48.
 [26] Q. Liu, D. Xie, S. Hu, J. Wu, Research on dynamic performance simulation of in-wheel motor electric vehicle based on carsim-simulink, in: Journal of Physics: Conference Series, volume 1820, IOP Publishing, 2021, p. 012109.
 [27] A. Fuller, Z. Fan, C. Day, C. Barlow, Digital twin: Enabling technologies, challenges and open research, IEEE access 8 (2020) 108952–108971.
 [28] J. A. Ross, K. Tam, D. J. Walker, K. D. Jones, Towards a digital twin of a complex maritime site for multi-objective optimization, in: 2022 14th International Conference on Cyber Conflict: Keep Moving! (CyCon), volume 700, IEEE, 2022, pp. 331–345.
 [29] S. Maulik, D. Riordan, J. Walsh, Dynamic reduction-based virtual models for digital twins: a comparative study, Applied Sciences 12 (2022) 7154.
 [30] A. M. Madni, C. C. Madni, S. D. Lucero, Leveraging digital twin technology in model-based systems engineering, Systems 7 (2019) 7.
 [31] D. Piromalis, A. Kantaros, Digital twins in the automotive industry: The road toward physical-digital convergence, Applied System Innovation 5 (2022) 65.
 [32] S. Almeaibed, S. Al-Rubaye, A. Tsourdos, N. P. Avdelidis, Digital twin analysis to promote safety and security in autonomous vehicles, IEEE Communications Standards Magazine 5 (2021) 40–46.
 [33] T. Fuchs, M. Zinser, K. Renatus, B. Bäker, Data model of automotive digital twins, ATZelectronics worldwide 16 (2021) 52–57.
 [34] Z. Wang, R. Gupta, K. Han, H. Wang, A. Ganlath, N. Ammar, P. Tiwari, Mobility digital twin: Concept, architecture, case study, and future challenges, IEEE Internet of Things Journal 9 (2022) 17452–17467.
 [35] M. Palmieri, C. Quadri, A. Fagiolini, C. Bernardeschi, Co-simulated digital twin on the network edge: A vehicle platoon, Computer Communications 212 (2023) 35–47.
 [36] G. Bhatti, H. Mohan, R. R. Singh, Towards the future of smart electric vehicles: Digital twin technology, Renewable and Sustainable Energy Reviews 141 (2021) 110801.
 [37] D. Chen, Z. Lv, Artificial intelligence enabled digital twins for training autonomous cars, Internet of Things and Cyber-Physical Systems 2 (2022) 31–41.
 [38] S. Kochanthara, Y. Dajsuren, L. Cleophas, M. van den Brand, Painting the landscape of automotive software in github, in: Proceedings of the 19th International Conference on Mining Software Repositories, 2022, pp. 215–226.
 [39] S. Nićetin, R. Šandor, G. Stupar, N. Teslić, Maximizing the efficiency of automotive software development environment using open source technologies, in: 2018 IEEE 8th International Conference on Consumer Electronics-Berlin (ICCE-Berlin), IEEE, 2018, pp. 1–3.
 [40] Y. Zhang, Y. Ning, C. Ma, L. Yu, Z. Guo, Empirical study for open source libraries in automotive software systems, IEEE Access (2023).
 [41] F. A. da Silva, A. C. Bagbaba, A. Ruospo, R. Mariani, G. Kanawati, E. Sanchez, M. S. Reorda, M. Jenihhin, S. Hamdioui, C. Sauer, Special session: Autosoc-a suite of open-source automotive soc benchmarks, in: 2020 IEEE 38th VLSI Test Symposium (VTS), IEEE, 2020, pp. 1–9.
 [42] H. Hirvonsalo, P. Seppänen, On deployment of eclipse kuksa as a framework for an intelligent moving test platform for research of autonomous vehicles, in: Proceedings of the 2nd Eclipse Research International Conference on Security, Artificial Intelligence, Architecture and Modelling for Next Generation Mobility, RWTH Aachen University, 2021.
 [43] K. McCord, Automotive Diagnostic Systems: Understanding OBD I and OBD II, CarTech Inc, 2011.
 [44] Obd library for python 3, https://github.com/brendan-w/python-OBD, 2023. Accessed: 2023-11-12.