Kuura: Leveraging Eclipse Kuksa in Vehicular Data Collection and Digital Twin Creation Environment

Olli Timonen, Toni Bomström, Nicklas Stafford, Samuli Määttä, Alireza Bakhshi Zadi Mahmoodi, Tero Päivärinta and Ella Peltonen

Empirical Software Engineering in Software, Systems, and Services, University of Oulu, Finland

TKTP 2024: Annual Doctoral Symposium of Computer Science, 10.-11.6.2024 Vaasa, Finland
Email: olli.timonen@oulu.fi (O. Timonen); toni.bomstrom@proton.me (T. Bomström); Nicklas.Stafford@oulu.fi (N. Stafford); Alireza.BakhshiZadiMahmoodi@oulu.fi (A. B. Z. Mahmoodi); tero.paivarinta@oulu.fi (T. Päivärinta); ella.peltonen@oulu.fi (E. Peltonen)
ORCID: 0009-0006-4132-9496 (O. Timonen); 0009-0004-3192-7711 (A. B. Z. Mahmoodi); 0000-0002-3374-671X (E. Peltonen)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

Abstract

Increased sensing and computing capabilities in cars are crucial for advanced traffic and driving automation. However, novel data delivery, testing, and machine learning pipelines are still needed to harness the full capabilities of automotive sensing solutions. At the same time, vehicular digital twins are needed to enable versatile testing and simulation capabilities. This paper describes the Vehicle-In-The-Loop (VIL) cloud interface and verifies data consistency regardless of the source. The study aims to determine how data collected from simulation corresponds to real test drive data. The data is collected from both simulation and actual test drives. Utilising the MQTT protocol, data is stored on a cloud server and further fed into Unreal Engine 5, where the test drive is replayed and its correspondence to the real drive is checked. This work offers a new perspective on verifying data consistency between simulated and real test drives and complements the vehicle abstraction opportunities provided by Eclipse KUKSA. Our results highlight digital twin creation as a part of automotive software development and set premises for testing and validating complex use cases, such as traffic accidents and extreme weather, that can be tested in real-life situations only rarely or at severe expense.

Keywords

Vehicular Computing, Data Transfer, Digital Twins

1. Introduction

Today's cars hold considerable computational and sensing capabilities that are crucial for advanced traffic and driving automation, with applications spanning from safety features to fully automated vehicles. However, the data delivery and management protocols and interfaces required for machine learning pipelines are still primarily closed in company-specific silos [1]. The first efforts to create open-source data transfer protocols and interfaces from the car to the cloud environment include Eclipse Kuksa [2], on which this work is also based, but considerable work is still required to validate the data and benchmark the efficiency of the proposed frameworks. Data sharing through open interfaces can boost innovation by enabling more efficient and accurate machine learning models that cover more expansive geographic areas and use cases [3].

At the same time, digital twins can enable versatile testing and simulation capabilities, as seen in applications in the industry, energy, and transportation verticals [4]. Indeed, digital twins have attracted much research interest in recent years [5, 6]. The concept of the digital twin is broad, and definitions may vary. Still, the main idea is to model physical systems with digital means and update these digital models dynamically based on measurement data. In essence, digital twin methods provide digital spaces where reality can be modelled virtually [7] as it is or as it would be in unseen but possible situations. Indeed, creating a digital twin as part of automotive software development sets premises for testing and validating complex use cases, such as traffic accidents and extreme weather, that can be tested in real-life situations only rarely or at severe expense. The utilisation of digital twins for automotive software development opens avenues for testing different sensors and components in actual use cases, such as studying the longevity of such components and proposing novel learning strategies that combine multiple data sources.

This paper describes the Vehicle-In-The-Loop (VIL) cloud interface. It verifies data consistency regardless of the source: a real car on the road or a virtual object in the digital twin environment. The overview of the Kuura platform is provided in Figure 1. We use KUKSA.val [8], which provides a vehicle abstraction layer to enable the management and use of vehicle signals. As a digital twin modelling framework, we use Unreal Engine 5. The capabilities of such a game engine in digital twin creation have been successfully demonstrated for wind power plants [9] and cultural tourism [10]. Using a game engine and a VIL cloud interface enables visual simulation that complements the capabilities provided by KUKSA.val; KUKSA.val is used to collect data from a test drive in a real car. For validation, we determine how data collected from the simulation corresponds to real test drive data in a real-life driving scenario. Utilising the MQTT protocol, data is stored on a cloud server and further fed into Unreal Engine 5, where the test drive is replayed and its correspondence to the real drive is checked. This work offers a new perspective on verifying data consistency between simulated and real test drives and complements the vehicle abstraction opportunities provided by Eclipse KUKSA.

Figure 1: Overview of Kuura's overall architecture.

The main contributions of this work are the following:
1) We provide the Kuura platform for collecting vehicular sensor data in the same way from a real car and from comparable test runs in a digital twin environment. With this, we extend the KUKSA.val environment to better fit digital testing and validation tasks.
2) We explore Unreal Engine 5 as a vehicular digital twin environment and provide a pipeline to deploy such digital twins with simulated and real test drives.
3) We experiment with the data consistency between simulated and real test drives and further demonstrate the power of game engine-based digital twins in vehicular computing and sensing scenarios.

2. Background

2.1. Vehicle as a Sensing Device

Modern cars implement technologies for automatic braking, Cooperative Adaptive Cruise Control (CACC), prevention of unwanted lane crossing, distance keeping, and so on, to supplement drivers' own cognition and prevent accidents. For further technological advancement, vehicles will require artificial intelligence (AI) and machine learning (ML) capabilities that depend on effective data transfer and management systems. With increased networking and computing capabilities, vehicles and their supporting edge and cloud back-ends can perform challenging inference and learning tasks to support drivers' cognition and automate the driving scenario [11, 12]. Such intelligent systems demand training data, which in-vehicle sensors and external databases can provide. How to make the data available, processed, and utilised in a challenging real-time and mobile environment is a timely research question.

Vehicular safety systems (and any other relevant applications) at any level of driving autonomy require data from in-vehicle sensors [13], such as cameras, LiDARs, radars, and speedometers [14, 15]. This information can be used to, for example, improve lane [16] and road pothole [17] recognition. Solutions for detecting drivers' smartphone use while driving [18] and drunk driving [19] have been explored. However, the results underline that human drivers' perception and reasoning still maintain an advantage compared to fully automatic vehicles [20].

Moreover, most in-vehicle sensors, drivers' personal sensors, and their interfaces are brand-specific or closed, limiting access to the data, computing, and networking capabilities and thus hindering vehicular application development. To enable connected vehicles to utilise all the available data sources, AI/ML computing resources, and networking capabilities, open-sourced general interfaces and software platforms need to be defined [1]. The on-board diagnostics (OBD) protocol refers to a vehicle's self-diagnostic and reporting capability. The more advanced OBD-II is a protocol standardised and built into the vehicle itself, allowing software-defined onboard operations and, most importantly, the collection of a wide range of vehicular data for software-defined vehicle use cases. This includes but is not limited to engine load, coolant temperature, fuel pressure, engine revolutions per minute (RPM), vehicle speed, intake air temperature, airflow rate, throttle position, and many types of sensor data such as oxygen sensors and fuel system status. OBD-II offers relatively easy access to the mentioned sensors, which is enough to prove the concept. In the future, research conducted with vehicular sensors can utilise direct access to the vehicle's controller area network (i.e. CAN bus) and standardised architectures such as AUTOSAR for wider data access.

2.2. Automotive Simulations

Driving and traffic simulators are used in the automotive industry as an alternative to costly and potentially dangerous real-life testing [21]. Such simulators are effective for analysing human driving behaviour and essential traffic situations that are often too hazardous to test in real life, such as extreme weather, congestion, and accidents [22]. This is especially relevant given the increasing reliance on simulation technologies for assessing human driving factors. More lifelike, photo-realistic simulations enable simultaneous testing of vehicle dynamics and stochastic pedestrian, driver, and vehicle interactions in various scenarios [23].

However, traditional simulators often have limitations in emulating real-life behaviour and perception. This has led to a growing interest in game engines as simulation platforms for developing and testing autonomous vehicle control systems [24]. Several vehicular simulators, such as CARLA, AirSim and CarSim, provide simulation capabilities and environments to support vehicle research and development. These platforms have been used to study vehicle autonomy, safety and performance. CARLA is a free open-source simulator that supports the development, training, and validation of autonomous vehicle systems. AirSim is a simulator for drones and cars developed by Microsoft. It also makes it possible to experiment with deep learning, computer vision and reinforcement learning algorithms for autonomous vehicles and to create complex and changeable environments and additional sensor modalities [25]. CarSim is a vehicle dynamics simulation platform that allows the simulation of vehicle behaviour in different conditions and environments, including motor dynamics, through Simulink models. It can be used to create accurate models of vehicles and simulate their behaviour under different road surfaces, weather conditions, and traffic situations [26], but it is not open-sourced.

In this study, we use Unreal Engine, renowned for its versatility, high-quality graphics and realistic physics simulation, which is useful for simulating vehicles [21]. Competing game engines include Unity and CryEngine, of which CryEngine is the smaller project. The main arguments in favour of Unreal Engine are that it is free of cost for research and for commercial projects until they make one million in revenue, and that its source code is openly accessible, although, strictly speaking, it is distributed under a source-available licence. This makes it suitable for various applications in autonomous and driving-support test cases [22]. The key differences between Unity and Unreal Engine are summarised in Table 1. Unreal Engine has typically been considered a better choice for 3D games, while Unity has been considered a strong choice for 2D games.

| Feature | Unreal Engine | Unity |
|---|---|---|
| Developer | Epic Games | Unity Technologies |
| Programming Languages | C++, Blueprint | C# |
| Source Code | Source available | Not open source |
| Pricing | Free for research and for commercial use up to 1 million revenue, 5% commission after that | Free version available |
| Learning Curve | Steep | Easy to learn with intuitive user interface |
| Graphics | Photorealistic graphics, used in AAA games | High-quality graphics, but not as refined as Unreal |
| Physics and Simulation | Ragdoll physics, physics-based destruction, fluid simulation | Easily integrated and well-rounded with other engine features |
| 2D vs. 3D | Excellent 3D development, especially for creating photorealistic environments and visual effects | Strong 2D development capability, excellent choice for 2D game projects |

Table 1: Comparison of Unreal Engine and Unity

The previous literature emphasises the importance of determinism in simulation environments to ensure repeatability, allowing for trustworthy and easily debuggable results. Game engines may still exhibit non-deterministic behaviour. For example, the investigation by Chance et al. [24] reveals significant simulation variance in CARLA, particularly due to actor collisions and system-level resource utilisation. As such, investigating accuracy is one of the key goals of the preliminary work presented in this paper.

2.3. Automotive Digital Twins

Digital twins (DT) in the context of vehicles are an emerging field that has attracted significant attention in both industry and academia [27]. Digital twins are virtual representations of physical entities, such as vehicles, that aim to mirror the lives and behaviours of their real-world counterparts [28]. These digital replicas use the best available physical models, sensor updates, and other data sources to simulate and predict the behaviour of the corresponding physical twin [29, 30]. One area where digital twins have shown great potential is the automotive industry, particularly for electric vehicles [31]. Digital twins can greatly benefit electric vehicles, which have gained greater market share in recent years. By creating a digital twin of an electric vehicle, manufacturers and researchers can simulate and optimise its performance, energy consumption and other key parameters. Unlike traditional simulators, digital twins provide additional capabilities for human-machine interaction and for performing data-driven actions in real-world scenarios.

Digital twins also play a key role in the design and development of autonomous vehicles [32]. The concept of digital twins is closely related to the transition to data-driven vehicles, as it enables the analysis and validation of autonomous vehicle designs [33]. By exploiting digital twin technologies, researchers can assess the safety and security of autonomous vehicles and identify potential risks and vulnerabilities. Furthermore, combining digital twins with connected vehicle technology and cloud computing has led to the development of the Mobility Digital Twin (MDT) framework [34]. These frameworks consist of digital representations of people, vehicles, and transport, which enable the analysis and optimisation of mobility and large-scale traffic systems. By exploiting real-time data and simulations, MDT frameworks can support decision-making processes and improve the efficiency and safety of transportation systems.

Digital twins enable the simulation, optimisation, and analysis of vehicle performance, energy consumption, and safety and security. Combining digital twins with connected vehicle technology and cloud computing will extend their capabilities to optimise mobility systems. As technology advances, digital twins can be expected to play a key role in shaping the future of vehicles and transport systems. As such, current technologies aim to create models for distributed multi-agent cyber-physical systems using co-simulation [35]. Such large-scale digital twins should be able to make predictions about the future condition and behaviour of the vehicle [36]. However, AI-based digital twin capabilities require data cooperation as well as load-balancing, scheduling, and network security schemes over the vehicle-to-cloud computing continuum [37].

2.4. Open-sourced Automotive Software

Open-source software refers to software that has publicly available and editable source code. This allows collaborative development and innovation. One of the most remarkable benefits of open-source software is its flexibility and customisability, as user communities can adapt the software to their specific needs. Open source is also cost-effective, as it is free and reduces dependencies on specific software providers. The use of open source also offers opportunities for innovation in automotive software development and promotes the use of new technologies and solutions [38, 39].

For the automotive industry, open-source software presents some unique challenges, as vehicular software, by default, has life-critical safety and reliability requirements [40]. Technically, anyone can modify the source code, which may create unwanted surprises and vulnerabilities. The maintenance and support of open-source software can be uncertain if its developers and community are not active or committed. While open-source software is dynamic and constantly changing, vehicles purchased today will remain in traffic for decades. In addition, there is a need for precise quality control and software certification in the automotive industry, which can be challenging to implement in an open-source environment because access to representative designs and industry-standard methodologies is limited. This limitation challenges researchers, as automotive companies do not openly share their development life-cycles and verification methods, each maintaining proprietary techniques. Given this scenario, there is a growing demand for open-source solutions to support the development and research of automotive applications, emphasising the need for open-source benchmarks to facilitate research across various aspects of automotive application development [41].

3. Kuura Implementation

3.1. Design Principles

Our framework is grounded in the principle of open-source development, ensuring transparency and collaborative potential. Simplicity is at the core, paving the way for effortless future evolution. Our design philosophy revolves around creating a system that is not just functional today but remains adaptable and maintainable for tomorrow's innovations. The essence of this framework is to avoid complexity, instead embracing a minimalist approach that prioritises ease of understanding and operation. One must consider the life cycle of software components, as updates and dependencies are inevitable. The framework architecture is designed to handle these, avoiding obsolescence and incompatibility.

3.2. Kuura Architecture Design

The general architecture of Kuura is shown in Figure 2. We chose the Unreal Engine 5 game engine because of its versatility in creating realistic simulations. This is an essential part of the research objective to verify the consistency of the data between the simulation and the real test runs. The MQTT protocol was chosen to collect and transfer the data to the cloud server and the game engine, as it is reliable and efficient for real-time data transfer; real-time operation may be the next step in the research, which makes these properties critical requirements in our study. The MQTT protocol operates asynchronously and is considered an ideal choice for IoT applications that often operate on limited devices or low-bandwidth networks.

Figure 2: Deployment diagram of the Kuura vehicular data collection system.

The Kuura framework presents a cohesive suite of components, each selected for robustness and simplicity. It builds on the succession of Kuksa, SMAD, and Kuura, a timeline of iterative progress detailed in Table 2. Each iteration is a response to the evolving needs and challenges encountered. Kuksa, initially misaligned with our needs due to its focus on automotive app stores and firmware updates, has since been archived [8, 2]. SMAD was unsustainable due to its complexity and poor documentation [42]. Our simplified stack emerges as a response, stripping away the superfluous to focus on functionality. It leverages OpenShift (run on the CSC Rahti container cloud, https://rahti.csc.fi/) for its cost-efficiency compared to Microsoft Azure. This pragmatic approach is engineered to reduce complexity, cost, and maintenance overhead, streamlining operations without compromising capability.

| Purpose | Eclipse Kuksa Cloud | SMAD stack | Kuura (this paper) |
|---|---|---|---|
| Cloud Service Provider | Microsoft Azure | Microsoft Azure | OpenShift |
| Deployment Platform | Kubernetes | Kubernetes | OKD |
| Client-Server Messaging Infrastructure Broker | Eclipse Hono | Eclipse Hono | Eclipse Mosquitto |
| Serverside Messaging Infrastructure | - | Ambassador and Kafka with Zookeeper | Python script |
| Client Message Persistence | InfluxDB | MongoDB | InfluxDB |
| Client Message Data Modelling | Kuksa.VAL | Kuksa.VAL | Client implementation |
| Client Firmware Updates | Eclipse hawkBit | - | - |
| Client Appstore | Kuksa Appstore | - | - |
| Messaging Telemetry Storage | - | MongoDB | - |
| Data Visualization | - | Node-RED | Grafana |
| Deployment Monitoring | - | Prometheus Monitoring, InfluxDB, and Grafana | - |
| Message Tracing | - | Jaeger Trace | - |

Table 2: Eclipse Kuksa Cloud, SMAD stack, and Kuura software components.

Each framework iteration (Kuksa, SMAD, and Kuura) brings new insights. Kuksa's archival signals a further pivot away from its original automotive-centric focus. SMAD's downfall was its complexity and reliance on now-inaccessible Kubernetes Helm charts. Kuura emerges as the distilled essence of its predecessors, embodying simplicity and sustainability. By eliminating non-essential components, Kuura reproduces existing functionality with more straightforward tools, significantly reducing cost and complexity and enabling an environment conducive to continuous development and operation.

3.3. Vehicle Data Reader

OBD-II is a port designed for diagnostic purposes. It has multiple buses available, including the CAN bus, SAE J1850 and ISO 9141-2. Automotive manufacturers can also provide other networks at their discretion [43]. The bus we are most interested in is the CAN bus. On some vehicles, the CAN bus available at the OBD connector can be protected by a gateway device that restricts access to some data from the OBD port. Unlike the CAN bus inside the car, the OBD port must be polled to receive any data. While we could get most of the data we wanted from the OBD port, some data, like the steering wheel position, was unavailable. This limits the suitability of the OBD port as a data source for our purposes, as without such signals it is quite difficult to drive the virtual car in Unreal accurately.

In the evaluation phase, we collected data from an OBD-II Bluetooth adapter connected to a Toyota RAV4 car. A laptop computer running Linux was connected to the adapter, and a script was run to record data from the vehicle in a log file. The successful log file collection was important for developing the auto-client script for future larger tests and for ensuring the whole system's functionality. Practical testing in the first phase was carried out by driving the car and ensuring that the data was stored correctly and its format was manageable.

3.4. Data Transmission

MQTT makes it trivial to multi-cast the collected data if we want to enable multiple clients to listen to the generated data simultaneously. One example of such a scenario is live visualisation of the data while saving it to a database without additional latency. While we could also save the data and then fetch it from the database, this would add latency to the visualisation. MQTT also has built-in, easy-to-configure security mechanisms. Setting up MQTT with SSL is straightforward, as is configuring the MQTT broker to require client certificates for communication. The connection can also be set up to require a username and password.

We could also use HTTP or raw TCP/UDP sockets as an alternative to MQTT. While HTTP offers security measures similar to MQTT, it does not have multi-cast by default. Multi-cast is not hard to implement, but MQTT has it built in, and the built-in implementation is most likely already done correctly. One advantage HTTP has over MQTT is the ability to communicate directly between two applications, eliminating the need for a broker in cases where there is only one client.

Raw sockets are the most basic option, and they do not come with any of the advanced features included in MQTT out of the box. However, they are very versatile and can be used for various purposes. One advantage sockets would offer is the ability to write raw data as is to the socket. This would enable saving raw CAN dumps in a database with minimal overhead if we ever needed to support it. One problem with multi-cast solutions is that the provider has no idea whether any clients are listening for the sent data unless the clients have been programmed to provide feedback when they are listening. This makes it harder to implement the provider so that it holds the messages in memory or saves them locally in case nobody is receiving the data.

In the experiments, a laptop running Ubuntu 22.04 LTS was used as the in-vehicle client, and the script collecting the data was written in Python using an OBD library [44]. The script writes the values it reads into a local CSV file and publishes them using the MQTT protocol. The back-end was deployed on CSC's Rahti as Red Hat OpenShift deployments. On the server side, the Mosquitto MQTT broker forwards the published messages to subscribers. The most important subscriber is a Python service that stores received messages in an InfluxDB instance. As an additional demonstration, Grafana was deployed to provide a real-time dashboard for the published and stored data. The sequence diagram is provided in Figure 3.

Figure 3: Sequence diagram of the vehicular data transferred to Unreal Engine 5.

3.5. Cloud Environment

The cloud environment receives data from the MQTT broker. The environment also has a Python script that connects to the broker to receive the data from the vehicle. The script stores all received messages in InfluxDB. The point name and field are derived from the MQTT topic, and the timestamp is taken from the MQTT message payload. Since the timestamp is included in the message, we could use any database solution to store the data. If the timestamp were missing, however, a time series database would be our only option, since the message times are crucial for playback at a later time. By getting the message time from the provider, we can ensure that network conditions do not affect the accuracy of the recorded timestamps.

At the onset of the process, the data was stored in InfluxDB, a time series database suited to large amounts of time-stamped data due to its high performance and scalability. Storage is essential in handling the large amounts of data that emanate from driving vehicles. A Python script was then used in the next stage of the data-processing workflow. This script had two main functionalities. First, it reads GPS point data pre-recorded into a JSON file, which is vital in mapping out the routes of vehicles. Second, it establishes a connection with InfluxDB to retrieve the data within a particular time range. This retrieval is critical for evaluating the vehicle's performance and environmental conditions during various experiment stages.

At this point, the processed data is sent onward through an MQTT broker by a Python script.
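
To make the storage and playback steps of this section more concrete, the following Python sketch shows one way they could fit together. It is illustrative only and is not the script used in Kuura: the topic layout (`<source>/<vehicle>/<signal>`), the JSON payload keys `value` and `ts`, and the names `Sample`, `topic_to_point`, `parse_message`, `replay_plan` and `replay` are all assumptions made for this example. Database and broker access are deliberately left out; `publish` and `sleep` are passed in as plain callables.

```python
import json
from typing import Callable, Iterable, Iterator, NamedTuple, Optional, Tuple


class Sample(NamedTuple):
    measurement: str   # e.g. "unreal/toyota" (assumed: data source and vehicle)
    field: str         # e.g. "speed"
    value: float
    timestamp: float   # seconds, set by the in-vehicle client (assumed unit)


def topic_to_point(topic: str) -> Tuple[str, str]:
    """Split a topic such as 'smad/toyota/speed' into (measurement, field).

    Assumed layout: everything before the last '/' names the measurement,
    the last level names the field.
    """
    measurement, _, field = topic.rpartition("/")
    if not measurement or not field:
        raise ValueError(f"unexpected topic layout: {topic!r}")
    return measurement, field


def parse_message(topic: str, payload: bytes) -> Sample:
    """Build a Sample from one MQTT message (hypothetical JSON payload)."""
    body = json.loads(payload)
    measurement, field = topic_to_point(topic)
    # The timestamp comes from the payload, not from the arrival time,
    # so network delay does not distort the recorded times.
    return Sample(measurement, field, float(body["value"]), float(body["ts"]))


def replay_plan(samples: Iterable[Sample]) -> Iterator[Tuple[float, Sample]]:
    """Yield (delay_seconds, sample) pairs that preserve the recorded gaps."""
    previous: Optional[float] = None
    for sample in sorted(samples, key=lambda s: s.timestamp):
        delay = 0.0 if previous is None else sample.timestamp - previous
        previous = sample.timestamp
        yield delay, sample


def replay(
    samples: Iterable[Sample],
    publish: Callable[[str, str], None],
    sleep: Callable[[float], None],
) -> None:
    """Publish samples in recorded order, waiting the recorded gap between them."""
    for delay, sample in replay_plan(samples):
        sleep(delay)
        topic = f"{sample.measurement}/{sample.field}"
        publish(topic, json.dumps({"value": sample.value, "ts": sample.timestamp}))
```

In a deployment, `publish` would typically wrap the publish call of an MQTT client and `sleep` would be `time.sleep`; injecting both keeps the timing logic testable without a broker or a database.
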
Once more, MQTT is used because it provides lightweight messaging with the fast and reliable real-time transmission that the simulation environment needs.

3.6. Simulation Environment

Multiple reasons contributed to the choice of the Unreal Engine 5 game engine, including the capacity to create realistic simulations of real car driving and the possibility of driving a car in a simulation, thereby generating corresponding data. The research aimed to ensure uniformity between the simulation and actual driving, thus requiring realistic simulations. Unreal Engine 5's source code is also openly accessible, which meets one of the implementation principles of the study and makes future development as easy as possible.

The research utilised the MQTT protocol, one of the key IoT connectivity and data collection components. Unreal Engine does not have native MQTT support. For this reason, we used the NinevaStudios MQTT-utilities extension with some modifications. This extension allowed MQTT data communication, which is essential for collecting data from the simulation, with minor adjustments made to transfer it to cloud storage securely. Through this connection, it was possible to develop a dynamic and interactive simulation environment.

Lastly, we simulate a car running along the received GPS points, as shown in Figure 4. In the simulation, the vehicle's movement was driven by speed data acquired from the MQTT broker. As a result, real-time synchronisation between the GPS points and speed data gave an accurate representation of the journey made by the vehicle and allowed its performance in different circumstances to be examined in detail. Such a holistic approach to data storage, processing, transmission, and visualisation shows how diverse technologies can be integrated into high-level vehicular data analysis and simulation.

Figure 4: Simulated route in the virtual environment, based on the real data points collected during the experiments.

The initial version of Kuura presented in this paper generates a dynamic road as the car moves around, thus simplifying testing by making it independent of environmental conditions. This method enables better flexibility in the testing process because it does not require a predefined route or special environmental circumstances. Generating dynamic roads is essential to ensure the reliability of the data collection system. This phase, built on the multiple approaches used in the study, emphasises adaptability and precision. By generating the road during runs, the accuracy of the collected data could be evaluated instantly. Working within this dynamic environment is particularly advantageous for identifying and solving prospective issues in the data-processing workflow, making the workflow robust and efficient. This technique also makes the Unreal Engine simulation more elaborate. It allows different scenarios to be run on the platform without sticking to a single static map, giving the evaluation process more flexibility.

4. Experimentation

4.1. Real-life Experiment

The real-world tests were conducted in the OuluZone vehicle testing area using a Toyota RAV4 Hybrid 2019 vehicle. A closed area, such as OuluZone, was chosen because it allows the drives and their safety to be assessed. The significance of this place is that it helped gather and analyse information in real-life scenarios, thus allowing comparison and verification with data collected from virtual and actual driving instances. Besides being a recreational driving and sports centre, OuluZone is also a notable site for research and learning, especially on autonomous cars and related technologies.

Several laps were driven during the tests, some with the cruise control set at different speeds (30 km/h, 40 km/h, and 50 km/h) to facilitate the validation of results in the simulation with data collected at a constant speed. Laps were also driven without cruise control at varying speeds. Driving data was collected during the test via an OBD-II Bluetooth adapter connected to a laptop running Linux. This allowed the vehicle data to be logged and its format managed. Towards the end of the tests, a USB adapter enhanced data collection.

4.2. Virtual Experiment

Our virtual experiment utilised Unreal Engine 5.3.2 to run test drive scenarios comparable to our real-world data collection efforts. In this experiment, we used the same logger as in the actual test drives with a real car, ensuring a uniform approach to data acquisition and showing that the logger could be used without changes in both environments. We gathered data on speed and time from the virtual test drive, which can be cross-verified with the real car's outputs. The current limitation of real-world data collection stems from the OBD-II interface's inability to provide comprehensive vehicle diagnostics. In the virtual setting, we collected additional data such as gear, throttle, brake application, and steering angle. These were predominantly included for illustrative purposes, aiming to demonstrate the extensive data collection possibilities within a simulated environment. It is important to note that verifying these additional parameters will become feasible with future access to the CAN bus, allowing for a more detailed and accurate comparison between virtual and real vehicle data.

4.3. Experimentation Results

In our validation process, we specifically focused on comparing the collected GPS data and speed data between the actual driving tests and the virtual driving tests conducted in Unreal Engine 5. As shown in Figure 5, the same InfluxDB database successfully contains both real-world data (smad/toyota) and virtually collected data (unreal/toyota). This design will further allow simultaneous analysis of both virtual and real-world data sets, allowing us to expand the digital twin creation capabilities with virtual realities and actual real-life test runs, independently of the data source.

Figure 5: Screenshot of InfluxDB, which contains both real-world data (smad/toyota) and virtually collected data (unreal/toyota).

As illustrated in Figure 6, we successfully mapped the collected GPS data onto the 3D model of the racetrack at runtime from the cloud and verified its accuracy. This demonstrates that our virtual environment can accurately replicate real driving conditions. The speed data collected in the database corresponded with the data obtained in the Unreal Engine 5 simulation, confirming the consistency of data in both real and virtual driving scenarios. While the data transmitted from the game engine to the server was also accurate, at this stage, our primary focus was on verifying the accuracy of speed and time information. Expanding this experimentation to cover a wider range of variables is possible in future research.

Figure 6: A picture of the car driving in the virtual OuluZone 3D environment using the data collected in the real OuluZone.

5. Discussion and Conclusions

In this study, we have aimed to bring new insights into vehicular data collection and the creation of digital twins by using the Eclipse Kuksa platform and Unreal Engine 5 to simulate driving scenarios. Our main focus was on providing an overview of a simplified vehicular data collection architecture that can be easily developed for further projects and on verifying the consistency between real and simulated vehicular data through practical real-world experimentation.

Using the MQTT protocol for sending data and Unreal Engine 5 for simulation has allowed us to compare real driving data with simulated data. This method makes digital twins more reliable and allows later use for testing in many conditions that are hard or expensive to create in real life, like very bad weather or different kinds of traffic situations.

We encountered challenges in data collection via the OBD-II protocol because it is filtered and does not allow the collection of all possible data. This limitation highlighted the need for more comprehensive data acquisition methods like the CAN bus. The data collection limitations prompted us to consider future enhancements in our methodology to achieve a more accurate and encompassing digital representation of the vehicle.

Our findings open up possibilities for future research directions, including optimising data transmission methods for improved efficiency and exploring bi-directional data flow between the digital twin and the vehicle. Such advancements could potentially enable real-time vehicle control based on digital twin data.

By integrating additional simulation models and considering more sophisticated data collection interfaces, we anticipate that future iterations of this work will address the current limitations and unlock new capabilities for digital twins in automotive research and development. The potential for these technologies to improve vehicle safety, efficiency, and innovation is immense, paving the way for a more interconnected and intelligent transportation ecosystem.

Future efforts should use the CAN bus instead of OBD-II to improve accuracy and completeness and to have access to all possible data the vehicle provides.

tions, and design implications, IEEE Access 7 (2019) 167653–167671.
[8] A. Banijamali, P. Jamshidi, P. Kuvaja, M. Oivo, Kuksa: A cloud-native architecture for enabling continuous delivery in the automotive domain,
Reconsidering data transmission methods, like in: International Conference on Product-Focused MQTT, for more efficient data multicasting is also a pos- Software Process Improvement, Springer, 2019, pp. sible future direction. In the future, we are also looking 455–472. into sending data from the game engine to the car instead [9] J. V. Sørensen, Z. Ma, B. N. Jørgensen, Potentials of just storing it in the cloud, having the car drive in real of game engines for wind power digital twin de- life and the game engine simultaneously with as little velopment: an investigation of the unreal engine, latency as possible and importing Eclipse Arrowhead to Energy Informatics 5 (2022) 1–30. extend possibilities with simulation models, such as using [10] F. Sang, H. Wu, Z. Liu, S. Fang, Digital twin platform the architecture with Matlab Simulink or corresponding design for zhejiang rural cultural tourism based on open-sourced physics modelling software. unreal engine, in: 2022 International Conference on Culture-Oriented Science and Technology (CoST), IEEE, 2022, pp. 274–278. Acknowledgments [11] A. Alhilal, T. Braud, P. Hui, Distributed ve- hicular computing at the dawn of 5g: a survey, The work has been supported by the EU HORI- arXiv:2001.07077 (2020). ZON project CHIPS-JU CIA FEDERATE (grant number [12] Y. Khaled, M. Tsukada, J. Santa, T. Ernst, On the 101139749), Business Finland project 6G Visible (grant design of efficient vehicular applications, in: VTC number 10743/31/2022), and the Finnish Research Coun- Spring 2009-IEEE 69th Vehicular Technology Con- cil project Northern Utility Vehicle Laboratory Consor- ference, IEEE, 2009, pp. 1–5. tium GO!-RI (grant number 352726). [13] S. Baidya, Y. Ku, H. Zhao, J. Zhao, S. Dey, Vehicular and edge computing for emerging connected and References autonomous vehicle applications, in: Proc. of the 57th Design Automation Conference (DAC), 2020. [1] E. Peltonen, A. Sojan, T. Päivärinta, Towards real- [14] M. Munz, M. Mahlisch, K. 
Dietmayer, Generic cen- time learning for edge-cloud continuum with ve- tralized multi sensor data fusion based on proba- hicular computing, in: 2021 IEEE 7th World Fo- bilistic sensor and environment models for driver rum on Internet of Things (WF-IoT), IEEE, 2021, pp. assistance systems, IEEE Intelligent Transportation 921–926. Systems 2 (2010). [2] A. Banijamali, P. Kuvaja, M. Oivo, P. Jamshidi, [15] F. Garcia, D. Martin, A. De La Escalera, J. M. Armin- Kuksa∗: Self-adaptive microservices in automotive gol, Sensor fusion methodology for vehicle detec- systems, in: International Conference on Product- tion, IEEE Int Transportation Systems 9 (2017). Focused Software Process Improvement, Springer, [16] Q. Li, L. Chen, M. Li, S. L. Shaw, A. Nüchter, A 2020, pp. 367–384. sensor-fusion drivable-region and lane-detection [3] J. Nickerson, K. Lyttinen, J. L. King, Automated Ve- system for autonomous vehicle navigation in chal- hicles: A Human/Machine Co-learning Perspective, lenging road scenarios, IEEE Transactions on Ve- Technical Report, SAE Technical Paper, 2022. hicular Technology 63 (2014) 540–555. [4] F. Tao, B. Xiao, Q. Qi, J. Cheng, P. Ji, Digital twin [17] A. Ghose, P. Biswas, C. Bhaumik, M. Sharma, A. Pal, modeling, Journal of Manufacturing Systems 64 A. Jha, Road condition monitoring and alert appli- (2022) 372–389. cation, in: IEEE International Conference on Perva- [5] M. Liu, S. Fang, H. Dong, C. Xu, Review of digital sive Computing and Communications Workshops, twin about concepts, technologies, and industrial IEEE, Lugano, Switzerland, 2012, pp. 489–491. applications, Journal of Manufacturing Systems 58 [18] Y. Wang, J. Yang, H. Liu, Y. Chen, M. Gruteser, R. P. (2021) 346–361. Digital Twin towards Smart Manu- Martin, Sensing vehicle dynamics for determining facturing and Industry 4.0. driver phone use, in: Int. conf. on mobile systems, [6] F. Tao, H. Zhang, A. Liu, A. Y. Nee, Digital twin applications, and services, 2013, pp. 41–54. 