=Paper=
{{Paper
|id=Vol-3776/paper12
|storemode=property
|title=Kuura: Leveraging eclipse kuksa in vehicular data collection and digital twin creation environment
|pdfUrl=https://ceur-ws.org/Vol-3776/paper12.pdf
|volume=Vol-3776
|authors=Olli Timonen,Toni Bornström,Nicklas Stafford,Samuli Määttä,Alireza Bakhshi Zadi Mahmoodi,Tero Päivärinta,Ella Peltonen
|dblpUrl=https://dblp.org/rec/conf/tktp/TimonenBSMMPP24
}}
==Kuura: Leveraging eclipse kuksa in vehicular data collection and digital twin creation environment==
Kuura: Leveraging Eclipse Kuksa in Vehicular Data
Collection and Digital Twin Creation Environment
Olli Timonen, Toni Bomström, Nicklas Stafford, Samuli Määttä,
Alireza Bakhshi Zadi Mahmoodi, Tero Päivärinta and Ella Peltonen
Empirical Software Engineering in Software, Systems, and Services, University of Oulu, Finland
Abstract
Increased sensing and computing capabilities in cars are crucial for advanced traffic and driving automation. However, novel
data delivery, testing, and machine learning pipelines are still needed to harness the full capabilities of automotive sensing
solutions. At the same time, vehicular digital twins are needed to enable versatile testing and simulation capabilities. This
paper depicts the Vehicle-In-The-Loop (VIL) cloud interface and verifies data consistency regardless of the source. The
study aims to determine how data collected from simulation corresponds to real test drive data. The data is collected from
both simulation and actual test drives. Utilising the MQTT protocol, data is stored on a cloud server and further fed into
Unreal Engine 5, where the test drive is replayed, and its correspondence to the real drive is ensured. This work offers a new
perspective on verifying data consistency between simulated and real test drives and complements the vehicle abstraction
opportunities provided by Eclipse KUKSA. Our results highlight digital twin creation as a part of automotive software
development and set premises for testing and validating complex use cases, such as traffic accidents and extreme weather,
that can rarely or only with severe expenses be tested in real-life situations.
Keywords
Vehicular Computing, Data Transfer, Digital Twins
TKTP 2024: Annual Doctoral Symposium of Computer Science, 10.-11.6.2024 Vaasa, Finland
Email: olli.timonen@oulu.fi (O. Timonen); toni.bomstrom@proton.me (T. Bomström); Nicklas.Stafford@oulu.fi (N. Stafford); Alireza.BakhshiZadiMahmoodi@oulu.fi (A. B. Z. Mahmoodi); tero.paivarinta@oulu.fi (T. Päivärinta); ella.peltonen@oulu.fi (E. Peltonen)
ORCID: 0009-0006-4132-9496 (O. Timonen); 0009-0004-3192-7711 (A. B. Z. Mahmoodi); 0000-0002-3374-671X (E. Peltonen)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
1. Introduction
Today’s cars hold considerable computational and sensing capabilities that are crucial for advanced traffic and driving automation, with applications spanning from safety features to fully automated vehicles. However, the data delivery and management protocols and interfaces that are required for machine learning pipelines are still primarily closed in company-specific silos [1]. The first efforts to create open-source data transfer protocols and interfaces from the car to the cloud environment include Eclipse Kuksa [2], on which this work is also based, but considerable work is still required to validate data and benchmark the efficiency of the proposed frameworks. Data sharing through open interfaces can boost innovation through more efficient and accurate machine learning models that cover more expansive geographic areas and use cases [3].
At the same time, digital twins can enable versatile testing and simulation capabilities, as seen with applications in industry, energy, and transportation verticals [4]. Indeed, digital twins have attracted much research interest in recent years [5, 6]. The concept of the digital twin is broad, and definitions may vary. Still, the main idea is to model physical systems with digital means and update these digital models dynamically based on measurement data. In essence, digital twin methods provide digital spaces where reality can be modelled virtually [7] as it is or would be in unseen but possible situations. Indeed, the creation of the digital twin as part of automotive software development sets premises for testing and validating complex use cases, such as traffic accidents and extreme weather, that can rarely, or only with severe expenses, be tested in real-life situations. The utilisation of digital twins for automotive software development opens avenues for testing different sensors and components in actual use cases, such as studying the longevity of such components and proposing novel learning strategies that combine multiple data sources.
This paper describes the Vehicle-In-The-Loop (VIL) cloud interface. It verifies data consistency regardless of the source: a real car on the road or a virtual object in the digital twin environment. The overview of the Kuura platform is provided in Figure 1. We use KUKSA.val [8], which provides a vehicle abstraction layer to enable the management and use of vehicle signals. As a digital twin modelling framework, we use Unreal Engine 5. The capabilities of such a game engine in digital twin creation have been successfully demonstrated in wind power plants [9] and cultural tourism [10]. Using a game engine and a VIL cloud interface enables visual simulation that complements the capabilities provided by KUKSA.val; KUKSA.val is used to collect data from a test drive in a real car. For validation, we determine how
Figure 1: Overview of Kuura’s overall architecture.
data collected from the simulation corresponds to real test drive data in a real-life driving scenario. Utilising the MQTT protocol, data is stored on a cloud server and further fed into Unreal Engine 5, where the test drive is replayed, and its correspondence to the real drive is ensured. This work offers a new perspective on verifying data consistency between simulated and real test drives and complements the vehicle abstraction opportunities provided by Eclipse KUKSA.
The main contributions of this work are the following:
1) We provide the Kuura platform for collecting vehicular sensor data in the same way from a real car and from similar test runs in a digital twin environment. With this, we extend the KUKSA.val environment to better fit digital testing and validation tasks.
2) We explore Unreal Engine 5 as a vehicular digital twin environment and provide a pipeline to deploy such digital twins with simulated and real test drives.
3) We experiment with the data consistency between simulated and real test drives and further demonstrate the power of game engine-based digital twins in vehicular computing and sensing scenarios.
2. Background
2.1. Vehicle as a Sensing Device
Modern cars implement technologies for automatic braking, Cooperative Adaptive Cruise Control (CACC), prevention of unwanted lane crossing, distance keeping, and so on, to supplement drivers’ own cognition and prevent accidents. For further technological advancement, vehicles will require artificial intelligence (AI) and machine learning (ML) capabilities that depend on effective data transfer and management systems. With increased networking and computing capabilities, vehicles and their supporting edge and cloud back-ends can perform challenging inference and learning tasks to support drivers’ cognition and automate the driving scenario [11, 12]. Such intelligent systems demand training data, which in-vehicle sensors and external databases can provide. How to make the data available, processed, and utilised in a challenging real-time and mobile environment is a timely research question.
Vehicular safety systems (and any other relevant applications) of any level of driving autonomy require data from in-vehicle sensors [13], such as cameras, LiDARs, radars, and speed meters [14, 15]. This information can be used to, for example, improve lane [16] and road pothole [17] recognition. Solutions for detecting drivers’ behaviour while using smartphones during driving [18] and drunk driving [19] have been explored. However, the results underline that human drivers’ perception and reasoning still maintain an advantage compared to fully automatic vehicles [20].
However, most of the in-vehicle and drivers’ personal sensors and interfaces are brand-specific or closed, limiting access to the data, computing, and networking capabilities and thus hindering vehicular application development. To enable connected vehicles to utilise all the available data sources, AI/ML computing resources, and networking capabilities, open-sourced general interfaces and software platforms need to be defined [1].
On-board diagnostics (OBD) refers to a vehicle’s self-diagnostic and reporting capability. The more advanced OBD-II is a protocol integrated into the vehicle itself, allowing software-defined onboard operations and, most importantly, the collection of a wide range of vehicular data for software-defined vehicle use cases. This includes but is not limited to engine load, coolant temperature, fuel pressure, engine revolutions per minute (RPM), vehicle speed, intake air temperature, airflow rate,
throttle position and many types of sensor data like oxygen sensors and fuel system status. OBD-II has relatively easy access to the mentioned sensors, which is enough to prove the concept. In the future, research conducted with vehicular sensors can utilise direct access to the vehicle’s controller area network (i.e. CAN bus) and standardised architectures such as AUTOSAR for wider data access.
2.2. Automotive Simulations
Driving and traffic simulators are used in the automotive industry as an alternative to costly and potentially dangerous real-life testing [21]. Such simulators are effective for analysing human driving behaviour and essential traffic situations that are often too hazardous to test in real-life scenarios, such as extreme weather, congestion, and accidents [22]. This is especially emphasised by the increasing reliance on simulation technologies for assessing human driving factors. More lifelike, photo-realistic simulations enable simultaneous testing of vehicle dynamics and stochastic pedestrian, driver, and vehicle interactions in various scenarios [23].
However, traditional simulators often have limitations in emulating real-life behaviour and perception. This has led to a growing interest in game engines as simulation platforms for developing and testing autonomous vehicle control systems [24]. Several vehicular simulators, such as CARLA, AirSim and CarSim, provide simulation capabilities and environments to support vehicle research and development. These platforms have been used to study vehicle autonomy, safety and performance. CARLA is a free open-source simulator to support the development, training, and validation of autonomous vehicle systems. AirSim is a simulator for drones and cars developed by Microsoft. It also makes it possible to experiment with deep learning, computer vision and reinforcement learning algorithms in autonomous vehicles and to create complex and changeable environments and additional sensor modalities [25]. CarSim is a vehicle dynamics simulation platform that allows the simulation of vehicle behaviour in different conditions and environments, including motor dynamics, through Simulink models. It can be used to create accurate models of vehicles and simulate their behaviour under different road surfaces, weather conditions, and traffic situations [26], but is not open-sourced.
In this study, we use Unreal Engine, renowned for its versatility, high-quality graphics and realistic physics simulation, which is useful for simulating vehicles [21]. Competing game engines include Unity and CryEngine, of which CryEngine is the smaller project. The main arguments that favour Unreal Engine are that it is free of cost for research and commercial projects until they make one million in revenue, and that its source code is openly accessible, even though, strictly speaking, it is published under a source-available license, making it suitable for various applications in autonomous and driving-support test cases [22]. The key differences between Unity and Unreal Engine are summarised in Table 1. Unreal Engine has typically been considered a better choice for 3D games, while Unity has been considered a strong choice for 2D games.
The previous literature emphasises the importance of determinism in simulation environments to ensure repeatability, allowing for trustworthy and easily debuggable results. Game engines may still come with challenges of non-deterministic behaviour. For example, the investigation by Chance et al. [24] reveals significant simulation variance in CARLA, particularly due to actor collisions and system-level resource utilisation. As such, accuracy investigation is one of the key goals in our preliminary work presented in this paper.
2.3. Automotive Digital Twins
Digital twins (DT) in the context of vehicles are an emerging field that has attracted significant attention in both industry and academia [27]. Digital twins are virtual representations of physical entities, such as vehicles, that aim to mirror the lives and behaviours of their real-world counterparts [28]. These digital replicas use the best available physical models, sensor updates, and other data sources to simulate and predict the behaviour of the corresponding physical twin [29, 30]. One area where digital twins have shown great potential is in the automotive industry, particularly for electric vehicles [31]. Digital twins can greatly benefit electric vehicles, which have gained greater market share in recent years. By creating a digital twin of an electric vehicle, manufacturers and researchers can simulate and optimise its performance, energy consumption and other key parameters. Unlike traditional simulators, digital twins provide further capabilities for human-machine interaction and for performing data-driven actions in real-world scenarios.
Digital twins also play a key role in the design and development of autonomous vehicles [32]. The concept of digital twins is closely related to the transition to data-driven vehicles, as it enables the analysis and validation of autonomous vehicle designs [33]. By exploiting digital twin technologies, researchers can assess the safety and security of autonomous vehicles and identify potential risks and vulnerabilities. Furthermore, combining digital twins with connected vehicle technology and cloud computing has led to the development of the Mobility Digital Twin (MDT) framework [34]. These frameworks consist of digital representations of people, vehicles, and transport, which enable the analysis and optimisation of mobility and large-scale traffic systems. By exploiting real-time data and simulations, MDT frameworks can support decision-making processes and improve the
{| class="wikitable"
|+ Table 1: Comparison of Unreal Engine and Unity
! Feature !! Unreal Engine !! Unity
|-
| Developer || Epic Games || Unity Technologies
|-
| Programming Languages || C++, Blueprint || C#
|-
| Source Code || Open source || Not open source
|-
| Pricing || Free for research and for commercial use up to 1 million revenue, 5% commission after that || Free version available
|-
| Learning Curve || Steep || Easy to learn with intuitive user interface
|-
| Graphics || Photorealistic graphics, used in AAA games || High-quality graphics, but not as refined as Unreal
|-
| Physics and Simulation || Ragdoll physics, physics-based destruction, fluid simulation || Easily integrated and well-rounded with other engine features
|-
| 2D vs. 3D || Excellent 3D development, especially for creating photorealistic environments and visual effects || Strong 2D development capability, excellent choice for 2D game projects
|}
efficiency and safety of transportation systems.
Digital twins enable the simulation, optimisation, and analysis of vehicle performance, energy consumption, and safety and security. Combining digital twins with connected vehicle technology and cloud computing will extend their capabilities to optimise mobility systems. As technology advances, digital twins can be expected to play a key role in shaping the future of vehicles and transport systems. As such, current technologies aim to create models for distributed multi-agent cyber-physical systems using co-simulation [35]. Such large-scale digital twins should be able to make predictions about the future condition and behaviour of the vehicle [36]. However, AI-based digital twin capabilities require data cooperation and load-balancing, scheduling, and network security schemes over the vehicle-to-cloud computing continuum [37].
2.4. Open-sourced Automotive Software
Open-source software refers to software that has publicly available and editable source code. This allows collaborative development and innovation. One of the most remarkable benefits of open-source software is its flexibility and customizability, as user communities can adapt the software to their specific needs. Open source is also cost-effective as it is free and reduces dependencies on specific software providers. The use of open source also offers opportunities for innovation in automotive software development and promotes the use of new technologies and solutions [38, 39].
For the automotive industry, open-source software presents some unique challenges as vehicular software, by default, has life-critical safety and reliability requirements [40]. Technically, anyone can modify the source code, which may create unwanted surprises and vulnerabilities. The maintenance and support of open-source software can be uncertain if its developers and community are not active or committed. While open-source software is dynamic and constantly changing, vehicles purchased today will remain in traffic for decades. In addition, there is a need for precise quality control and software certification in the automotive industry, which can be challenging to implement in an open-source environment because access to representative designs and industry-standard methodologies is limited. This limitation challenges researchers, as automotive companies do not openly share their development life-cycles and verification methods, each maintaining proprietary techniques. Given this scenario, there is a growing demand for open-source solutions to support the development and research of automotive applications, emphasizing the need for open-source benchmarks to facilitate research across various aspects of automotive application development [41].
3. Kuura Implementation
3.1. Design Principles
The cornerstone of our framework is the principle of open-source development, ensuring transparency and collaborative potential. Simplicity is at the core, paving the way for effortless future evolution. Our design philosophy revolves around creating a system that is not just functional today but remains adaptable and maintainable for tomorrow’s innovations. The essence of this framework is to avoid complexity, instead embracing a minimalist approach that prioritises ease of understanding and operation. One must consider the life cycle of software components, as updates and dependencies are inevitable. The framework architecture
Figure 2: Deployment diagram of the Kuura vehicular data collection system.
is designed to handle these, avoiding obsolescence and incompatibility.
3.2. Kuura Architecture Design
The general architecture of Kuura is shown in Figure 2. We chose the Unreal Engine 5 game engine because of its versatility in creating realistic simulations. This is an essential part of the research objective to verify the consistency of the data between the simulation and the real test runs. The MQTT protocol was chosen to collect and transfer the data to the cloud server and the game engine, as it is reliable and efficient for real-time data transfer. Real-time transfer might be the next step in the research and is thus a critical requirement in our study. The MQTT protocol operates asynchronously and is considered an ideal choice for IoT applications that often operate on limited devices or low-bandwidth networks.
The Kuura framework presents a cohesive suite of components, each selected for robustness and simplicity. At its foundation lies the integration of Kuksa, SMAD, and Kuura, delineating a timeline of iterative progress as detailed in Table 2. Each iteration is a response to the evolving needs and challenges encountered. Kuksa, initially misaligned with our needs through its focus on automotive app stores and firmware updates, has since been archived [8, 2]. SMAD was unsustainable due to its complexity and poor documentation [42]. Our simplified stack emerges as a response, stripping away the superfluous to focus on functionality. It leverages OpenShift (run on the CSC Rahti container cloud, https://rahti.csc.fi/) for its cost-efficiency compared to Microsoft Azure. This pragmatic approach is engineered to reduce complexity, cost, and maintenance overhead,
{| class="wikitable"
|+ Table 2: Eclipse Kuksa Cloud, SMAD stack, and Kuura software components.
! Purpose !! Eclipse Kuksa Cloud !! SMAD stack !! Kuura (this paper)
|-
| Cloud Service Provider || Microsoft Azure || Microsoft Azure || OpenShift
|-
| Deployment Platform || Kubernetes || Kubernetes || OKD
|-
| Client-Server Messaging Infrastructure || Eclipse Hono || Eclipse Hono || Eclipse Mosquitto Broker
|-
| Serverside Messaging Infrastructure || - || Ambassador and Kafka with Zookeeper || Python script
|-
| Client Message Persistence || InfluxDB || MongoDB || InfluxDB
|-
| Client Message Data Modelling || Kuksa.VAL || Kuksa.VAL || Client implementation
|-
| Client Firmware Updates || Eclipse hawkBit || - || -
|-
| Client Appstore || Kuksa Appstore || - || -
|-
| Messaging Telemetry Storage || - || MongoDB || -
|-
| Data Visualization || - || Node-RED || Grafana
|-
| Deployment Monitoring || - || Prometheus Monitoring, InfluxDB, and Grafana || -
|-
| Message Tracing || - || Jaeger Trace || -
|}
streamlining operations without compromising capability.
Each framework iteration (Kuksa, SMAD, and Kuura) brings new insights. Kuksa’s archival signals a pivot away from its original automotive-centric focus. SMAD’s downfall was its complexity and reliance on now-inaccessible Kubernetes Helm charts. Kuura emerges as the distilled essence of its predecessors, embodying simplicity and sustainability. By eliminating non-essential components, Kuura adapts existing functionalities with more straightforward tools, significantly reducing cost and complexity and enabling an environment conducive to continuous development and operation.
3.3. Vehicle Data Reader
The OBD-II is a port designed for diagnostic purposes. It has multiple buses available. These buses include the CAN bus, SAE J1850 and ISO 9141-2. The automotive manufacturers can also provide other networks at their discretion [43]. The bus we are most interested in is the CAN bus. On some vehicles, the CAN bus available at the OBD connector can be protected by a gateway device restricting access to some data from the OBD port. Unlike the CAN bus inside the car, you must poll the OBD port to receive any data. While we could get most of the data we wanted from the OBD port, some data, like the steering wheel position, was unavailable. This makes the OBD port unsuitable as a data source for our purposes, as it would make it quite difficult to drive the virtual car in Unreal accurately.
In the evaluation phase, we collected data from an OBD-II Bluetooth adapter connected to a Toyota RAV4 car. A laptop computer running Linux was connected to the adapter, and a script was run to record data from the vehicle in a log file. The successful log file collection was further important in developing the auto-client script for future larger tests and ensuring the whole system’s functionality. Practical testing in the first phase was carried out by driving the car and ensuring the data was stored correctly and its format was manageable.
3.4. Data Transmission
MQTT makes it trivial to multi-cast the collected data if we want to enable multiple clients to listen to the generated data simultaneously. One example of such a scenario is live visualisation of the data while saving it to a database without additional latency. While we could also save the data and then fetch it from the database, this would add latency to the visualisation. MQTT also has built-in, easy-to-configure security mechanisms. Setting up MQTT with SSL is straightforward, as is configuring the MQTT broker to require client certificates for communication. The connection can also be set up to require a username and password.
We could also use HTTP or raw TCP/UDP sockets as an alternative to MQTT. While HTTP offers security measures similar to MQTT, it does not have multi-cast by default. While multi-cast is not hard to implement, MQTT has it built in, and its implementation is most likely already correct. One advantage HTTP has over MQTT is the ability to communicate directly between two applications, eliminating the need for a broker in cases where there is only one client.
Raw sockets are the most basic option, and they don’t come with any of the advanced features included in
Figure 3: Sequence diagram of the vehicular data transferred to Unreal Engine 5.
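As a rough illustration of the vehicle end of this sequence, the sketch below packs one sensor reading into an MQTT topic and a JSON payload that carries the timestamp taken on the client, and formats the same reading as a CSV line for a local log. The topic layout (`<source>/<vehicle>/<signal>`), the payload keys `value` and `ts_ns`, and the function names are assumptions made for this example; the paper does not specify them. The actual publish call is left to whichever MQTT client library is in use.

```python
import csv
import io
import json
import time


def build_message(source, vehicle, signal, value, ts_ns=None):
    """Return (topic, payload) for one reading.

    Hypothetical layout: topic "<source>/<vehicle>/<signal>" and a JSON
    payload holding the value and the time the reading was taken.
    """
    if ts_ns is None:
        ts_ns = time.time_ns()  # timestamp taken at the provider, not the broker
    topic = f"{source}/{vehicle}/{signal}"
    payload = json.dumps({"value": value, "ts_ns": ts_ns})
    return topic, payload


def csv_row(signal, value, ts_ns):
    """Format the same reading as one CSV line for the local log file."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow([ts_ns, signal, value])
    return buf.getvalue()


topic, payload = build_message(
    "smad", "toyota", "speed", 42.0, ts_ns=1_700_000_000_000_000_000
)
print(topic)    # smad/toyota/speed
print(payload)  # {"value": 42.0, "ts_ns": 1700000000000000000}
```

Putting the timestamp into the payload on the client side keeps the recorded time independent of network delay between the vehicle and the broker.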
MQTT out of the box. However, they are very versatile and can be used for various purposes. One advantage the sockets would offer is the ability to write raw CAN data as is to the socket. This would enable saving raw CAN dumps in a database with minimal overhead if we ever needed or wanted to support it. One problem with multi-cast solutions is that the provider has no idea if any clients are listening for the sent data unless the clients have been programmed to provide feedback when they are listening. This makes it harder to implement the provider so that it holds the messages in memory or saves them locally when nobody is receiving the data.
In the experiments, a laptop running Ubuntu 22.04 LTS was used as the in-vehicle client, and the script collecting the data was written in Python using an OBD library [44]. The script writes the values it reads into a CSV file locally and publishes them using the MQTT protocol. The back-end was deployed on CSC’s Rahti as RedHat OpenShift deployments. On the server side, the Mosquitto MQTT broker forwards the published messages to subscribers. The most important subscriber is a Python service that stores received messages in an InfluxDB instance. As an additional demonstration, Grafana was deployed to provide a real-time dashboard for the published and stored data. The sequence diagram is provided in Figure 3.
3.5. Cloud Environment
The cloud environment receives data from the MQTT broker. The environment also has a Python script that connects to the broker to receive the data from the vehicle. The script stores all received messages in InfluxDB. The point name and field are derived from the MQTT topic. The timestamp is also taken from the MQTT message payload. Since the timestamp is included in the message, we could use any database solution to store the data. If the timestamp were missing, however, then a time series database would be our only option since the message times are crucial for playback at a later time. By getting the message time from the provider, we can ensure that network conditions do not affect the accuracy of the recorded timestamps.
At the onset of the process, the data was stored in InfluxDB, a time series database used to store large amounts of time-stamped data due to its high performance and scalability. Storage is essential in handling the large amounts of data that emanate from driving vehicles. A Python script was then used in the next stage of the data-processing workflow. This script had two main functionalities: First, it reads GPS point data pre-recorded into a JSON file, which is vital in mapping out routes of vehicles. Secondly, this script establishes a connection with InfluxDB to retrieve useful information within a particular range. This retrieval is critical for evaluating the vehicle’s performance and environmental conditions during various experiment stages.
At this point, the processed data goes through an MQTT broker using a Python script. Once more, this protocol provides lightweight messaging, providing fast and reliable real-time information transmission that would be needed for the simulation environment.
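The storage step described in this section, where the point name and field come from the MQTT topic and the timestamp comes from the payload, can be sketched as a conversion into InfluxDB line protocol. This is a minimal sketch under assumptions: the topic is taken to be `<point name>/<field>`, and the payload is taken to be JSON with hypothetical keys `value` and `ts_ns` (nanoseconds). The helper names are also invented for this example and do not come from the Kuura code base.

```python
import json


def escape_measurement(name):
    # Line protocol requires commas and spaces in measurement names to be escaped.
    return name.replace(",", r"\,").replace(" ", r"\ ")


def escape_key(name):
    # Field keys additionally escape the equals sign.
    return escape_measurement(name).replace("=", r"\=")


def to_line_protocol(topic, payload):
    """Convert one MQTT message into an InfluxDB line protocol record."""
    # Assumed layout: everything before the last "/" names the point,
    # the last topic level names the field.
    measurement, _, field = topic.rpartition("/")
    if not measurement or not field:
        raise ValueError(f"unexpected topic: {topic!r}")
    data = json.loads(payload)
    value = float(data["value"])  # stored as a float field
    ts_ns = int(data["ts_ns"])    # provider-side time in nanoseconds
    return f"{escape_measurement(measurement)} {escape_key(field)}={value!r} {ts_ns}"


line = to_line_protocol(
    "smad/toyota/speed", '{"value": 42, "ts_ns": 1700000000000000000}'
)
print(line)  # smad/toyota speed=42.0 1700000000000000000
```

Because the record carries the provider's own timestamp, writing it later or in a batch does not change the time stored with the point.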
3.6. Simulation Environment
Multiple reasons contributed to the choice of the Unreal Engine 5 game engine, including the capacity to create realistic simulations of real car driving and the possibility of driving a car in a simulation, thereby generating corresponding data. The research aimed to ensure uniformity between the simulation and actual driving, thus requiring realistic simulations. The source code of Unreal Engine 5 is also openly accessible, which meets one of the implementation principles of the study, making future development as easy as possible.
The research utilised the MQTT protocol, one of the key IoT connection and data collection components. Unreal Engine does not have native MQTT support. For this reason, we used the NinevaStudios MQTT-utilities extension with some modifications. This extension allowed MQTT data communication, which is essential for collecting data from the simulation, with minor adjustments made to transfer it to cloud storage securely. Through this connection, it was possible to develop a dynamic and interactive simulation environment.
Lastly, we simulate a car running along received GPS points as shown in Figure 4. In the simulation, the vehicle’s movement was driven by speed data acquired from the MQTT broker. As a result, real-time synchronisation between the GPS points and speed data gave an actual representation of the journey made by the vehicle, allowing its performance in different circumstances to be examined in detail. Such a holistic approach to data storage, processing, transmission, and visualisation shows how diverse technologies can be integrated into high-level vehicular data analysis and simulation.
Figure 4: Simulated route in the virtual environment, based on the real data points collected during the experiments.
The initial version of Kuura presented in this paper has a dynamic road generated as the car moves around, thus simplifying testing by making it independent of environmental conditions. This method enables better flexibility in the testing process because it does not require a predefined route or special environmental circumstances. Generating dynamic roads is essential to ensure the reliability of the data collection system. This phase, built on the multiple approaches used in the study, emphasises adaptability and precision. By generating the road during runs, the accuracy of collected data could be evaluated instantly. It is particularly advantageous to work within this dynamic environment for the purposes of identifying and solving prospective issues within a workflow for data processing that would make it strong and efficient. This technique also makes the Unreal Engine simulation more elaborate. It allows different scenarios to be run on a platform without sticking to a single static map, giving the evaluation process more flexibility.
4. Experimentation
4.1. Real-life Experiment
The real-world tests were conducted in the OuluZone vehicle testing area using a Toyota RAV4 Hybrid 2019 vehicle. A closed area, such as OuluZone, was chosen because it allows for assessing the drives and their safety. The significance of this place is that it helped gather and analyse information in real-life scenarios, thus allowing comparison and verification with data collected from virtual and actual driving instances. Besides being a recreational driving and sports centre, OuluZone is also a notable site for research and learning, especially on autonomous cars and related technologies.
Several laps were driven during the tests, some with the cruise control set at different speeds (30 km/h, 40 km/h, and 50 km/h) to facilitate the validation of results in the simulation with data collected at a constant speed. Laps were also driven without cruise control at varying speeds. Driving data was collected during the test via an OBD-II Bluetooth adapter connected to a laptop running Linux. This allowed for the vehicle data to be logged and its format managed. Towards the end of the tests, a USB adapter was used to enhance data collection.
4.2. Virtual Experiment
Our virtual experiment utilised Unreal Engine 5.3.2 to run test drive scenarios comparable to our real-world data collection efforts. In this experiment, we used the same logger used during actual test drives with a real car, ensuring a uniform approach to data acquisition and proving that the logger could be used without changes in both
environments. We gathered data on speed and time from the virtual test drive, which can be cross-verified with the real car's outputs. The current limitation of real-world data collection stems from the OBD-II interface's inability to provide comprehensive vehicle diagnostics. In the virtual setting, we collected additional data such as gear, throttle, brake application, and steering angle. These were predominantly included for illustrative purposes, aiming to demonstrate the extensive data collection possibilities within a simulated environment. Verifying these additional parameters will become feasible with future access to the CAN bus, allowing for a more detailed and accurate comparison between virtual and real vehicle data.
4.3. Experimentation Results
In our validation process, we specifically focused on comparing the collected GPS data and speed data between the actual and virtual driving tests conducted in Unreal Engine 5. As shown in Figure 5, the same InfluxDB database contains both real-world data (smad/toyota) and virtually collected data (unreal/toyota). This design will further allow simultaneous analysis of both virtual and real-world data sets, allowing us to expand the digital twin creation capabilities with virtual realities and actual real-life test runs, independently of the data source.
Figure 5: Screenshot of InfluxDB which contains both real-world data (smad/toyota) and virtually collected data (unreal/toyota).
As illustrated in Figure 6, we mapped the collected GPS data from the cloud onto the 3D model of the racetrack at runtime and verified its accuracy. This demonstrates that our virtual environment can accurately replicate real driving conditions. The speed data collected in the database corresponded with the data obtained in the Unreal Engine 5 simulation, confirming the consistency of data in both real and virtual driving scenarios. While the data transmitted from the game engine to the server was also accurate, at this stage, our primary focus was on verifying the accuracy of speed and time information. Expanding this experimentation to cover a wider range of variables is possible in future research.
Figure 6: A picture of the car driving in the virtual OuluZone 3D environment using the data collected in the real OuluZone.
5. Discussion and Conclusions
In this study, we have aimed to bring new insights into vehicular data collection and the creation of digital twins by using the Eclipse Kuksa platform and Unreal Engine 5 to simulate driving scenarios. Our main focus was providing an overview of a simplified vehicular data collection architecture that can be easily developed for further projects and verifying the consistency between real and simulated vehicular data through practical real-world experimentation.
Using the MQTT protocol for sending data and Unreal Engine 5 for simulation has allowed us to compare real driving data with simulated data. This method makes digital twins more reliable and allows later use for testing in many conditions that are hard or expensive to create in real life, like very bad weather or different kinds of traffic situations.
We encountered challenges in data collection via the OBD-II protocol because it is filtered and does not allow the collection of all possible data. This limitation highlighted the need for more comprehensive data acquisition methods like the CAN bus. The data collection limitations prompted us to consider future enhancements in our methodology to achieve a more accurate and encompassing digital representation of the vehicle.
Our findings open up possibilities for future research directions, including optimising data transmission methods for improved efficiency and exploring bi-directional data flow between the digital twin and the vehicle. Such advancements could potentially enable real-time vehicle control based on digital twin data.
By integrating additional simulation models and considering more sophisticated data collection interfaces, we anticipate that future iterations of this work will address the current limitations and unlock new capabilities for digital twins in automotive research and development. The potential for these technologies to improve vehicle safety, efficiency, and innovation is immense, paving the way for a more interconnected and intelligent transportation ecosystem.
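The comparison between real and simulated speed data discussed in this section can be illustrated with a small sketch. The paper does not state which metric was used, so the approach here (matching each real sample to the simulated sample nearest in time and taking the mean absolute difference) is an assumption, as are the function names, the sample values, and the tolerance. Only the series labels smad/toyota and unreal/toyota come from the text.

```python
from bisect import bisect_left
from statistics import mean


def nearest_speed(samples, t):
    """Return the speed of the sample whose timestamp is closest to t.

    samples: non-empty list of (time_s, speed_kmh) tuples sorted by time.
    """
    times = [ts for ts, _ in samples]
    i = bisect_left(times, t)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = samples[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda s: abs(s[0] - t))[1]


def mean_abs_speed_error(real, simulated):
    """Mean absolute speed difference (km/h) over the real drive's timestamps."""
    return mean(abs(v - nearest_speed(simulated, t)) for t, v in real)


# Hypothetical samples, invented for illustration; not measurements from the paper.
real_drive = [(0.0, 30.0), (1.0, 30.5), (2.0, 29.5)]       # e.g. series smad/toyota
simulated_drive = [(0.1, 30.0), (1.1, 30.0), (2.1, 30.0)]  # e.g. series unreal/toyota

error = mean_abs_speed_error(real_drive, simulated_drive)
TOLERANCE_KMH = 1.0  # assumed acceptance threshold, not taken from the paper
print(f"mean abs error: {error:.2f} km/h, consistent: {error <= TOLERANCE_KMH}")
```

Nearest-timestamp matching is used because the two loggers are unlikely to sample at identical instants. Interpolating between samples would be a reasonable alternative when the sampling rates differ considerably.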
Future work should use the CAN bus instead of OBD-II to improve accuracy and completeness and to access all the data the vehicle provides. Reconsidering data transmission methods, such as MQTT, for more efficient data multicasting is another possible direction. We are also looking into sending data from the game engine to the car instead of only storing it in the cloud, so that the car drives in real life and in the game engine simultaneously with as little latency as possible, and into integrating Eclipse Arrowhead to extend the possibilities with simulation models, such as using the architecture with Matlab Simulink or corresponding open-source physics modelling software.
Acknowledgments
The work has been supported by the EU HORIZON project CHIPS-JU CIA FEDERATE (grant number 101139749), Business Finland project 6G Visible (grant number 10743/31/2022), and the Finnish Research Council project Northern Utility Vehicle Laboratory Consortium GO!-RI (grant number 352726).