<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>ACM KDD Conference, August</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Integration Use Case</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Amirhossein Ghaffari</string-name>
          <email>amirhossein.ghaffari@oulu.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Huong Nguyen</string-name>
          <email>huong.nguyen@oulu.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alaa Saleh</string-name>
          <email>alaa.saleh@oulu.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lauri Lovén</string-name>
          <email>lauri.loven@oulu.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ekaterina Gilman</string-name>
          <email>ekaterina.gilman@oulu.fi</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Center for Ubiquitous Computing, University of Oulu</institution>
          ,
          <addr-line>Oulu</addr-line>
          ,
          <country country="FI">Finland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Infotech Oulu, University of Oulu</institution>
          ,
          <addr-line>Oulu</addr-line>
          ,
          <country country="FI">Finland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>26</volume>
      <issue>2024</issue>
      <fpage>249</fpage>
      <lpage>253</lpage>
      <abstract>
        <p>This paper presents a system for predicting and warning about traffic accidents in smart cities, aimed at enhancing urban safety through advanced data analysis and explained warnings and reports. Our system emphasizes computational efficiency and data privacy, predicting traffic accident severity with good accuracy. By integrating real data with external knowledge sources, the system produces detailed, contextually relevant reports and warnings. Implemented with effective task orchestration, our system ensures seamless integration and resource management. Evaluation results demonstrate high accuracy and scalability, highlighting its potential for practical application in smart city environments. Future work will focus on further enhancing model efficiency, exploring transfer learning for broader applicability, and conducting real-world deployments to validate system performance.</p>
      </abstract>
      <kwd-group>
        <kwd>Smart City</kwd>
        <kwd>Transportation</kwd>
        <kwd>Federated Learning</kwd>
        <kwd>Edge Computing</kwd>
        <kwd>Generative AI</kwd>
        <kwd>RAG</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>[...] projected to live in urban areas [1]. Urbanization, driven by population growth and migration towards cities, presents both opportunities and challenges, such as overpopulation and traffic congestion [2]. Developing smart cities is a strategic approach to mitigating these challenges.</p>
      <sec id="sec-2-1">
        <p>
          A “smart city” integrates information and communication technology to enhance urban living [
          <xref ref-type="bibr" rid="ref1">3</xref>
          ]. This concept emphasizes the interconnection of community, people, and technology, aiming to prioritize human needs [4].
        </p>
      </sec>
      <sec id="sec-2-2">
        <p>Urban mobility and transportation are significant challenges, with traffic congestion and accidents being major concerns. Annually, traffic accidents result in 1.35 million deaths globally, underscoring the critical need for effective accident prevention measures [5].</p>
      </sec>
      <sec id="sec-2-3">
        <p>In large-scale Internet of Things (IoT) ecosystems, efficient data processing is crucial. Centralized cloud servers face latency and security challenges in many application domains, making real-time processing difficult [</p>
      </sec>
      <sec id="sec-2-4">
        <p>Edge computing aims to address these limitations by bringing computational resources closer to data sources, enabling timely processing and reducing latency [7, 8].</p>
        <p>[...] system, containing two AI modules: first, a Federated Learning (FL) [18] model to predict traffic accident occurrences and estimate severity, and second, Generative Artificial Intelligence (GenAI) to generate reports and warnings.</p>
        <p>Moreover, we utilized k0s, a lightweight Kubernetes distribution, for efficient task orchestration [19]. The task orchestration capabilities of k0s are crucial for seamlessly integrating the FL models and Retrieval-Augmented Generation (RAG) processes across multiple edge nodes. This enables automated deployment, scaling, and management of tasks, ensuring high availability, fault tolerance, and robust performance monitoring for our accident prevention warning system.</p>
        <p>The contributions of this work can be summarized as follows:</p>
        <list list-type="order">
          <list-item><p>We integrate two different kinds of AI modules into a coherent distributed system supporting accident prevention. We comprehensively evaluate this system and analyze the related challenges and opportunities.</p></list-item>
          <list-item><p>We orchestrate tasks and monitor our system, examining its feasibility for real-world smart city environments.</p></list-item>
        </list>
        <p>The remainder of this article is organized as follows. Section 2 discusses related work, Section 3 describes the system design, and Section 4 details the implementation. Section 5 then provides a detailed system evaluation and metrics. The subsequent sections discuss our findings, implications, and future research directions, and conclude the work.</p>
        <p>© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      </sec>
    </sec>
    <sec>
      <title>2. Related Work</title>
      <sec>
        <title>2.1. Intelligent Transportation System in Smart City</title>
        <p>Intelligent Transportation Systems (ITS) are essential for the advancement of smart cities, with many recent studies dedicated to improving urban traffic management and safety. Here, we discuss several key works that have made significant contributions to this field. As an example, Hasan et al. [20] used the Google Distance Matrix and Directions APIs to provide advanced traffic jam alerts. Their Internet of Vehicles (IoV) module detects accidents and, with the assistance of the National Data Warehouse and a GPS module, notifies the nearest clinic. They developed an Android application for routing suggestions and employed an Arduino with a sonar sensor, temperature sensor, gyroscope, piezo sensor, and GSM module as the core processing unit.</p>
        <p>Working on one of the most popular applications, Bortnikov et al. [15] developed a 3D Convolutional Neural Network (CNN) to recognize accidents automatically. They trained the CNN using a custom video game to create accident scenes with various weather and lighting conditions, adding noise to diversify the data. The model was then tested on real traffic videos from YouTube. The novelty of this research lies in the use of video games to generate datasets, which are challenging to replicate in real-life scenarios. Yu et al. [21], with the same aim, proposed a Deep Spatio-Temporal Graph Convolutional Network for traffic accident prediction on Beijing traffic data, which was collected hourly over three months and includes accident records (time and location), vehicle speeds, meteorological conditions, and points of interest. Recent research has considered informing other vehicles after detecting traffic accidents using IoT, IoV, and related technologies. Zhou et al. [22] proposed an accident detection algorithm based on spatio-temporal feature encoding with a multilayer neural network. This method first detects border frames as potential accident frames, then encodes the spatial relationships of detected objects to confirm an accident. The process initially uses Histogram of Oriented Gradients and ordinal features, followed by CNN feature encoding and object relationship detection with a multilayer neural network. A trained Support Vector Machine then confirms the presence of an accident.</p>
        <p>Another approach, which aims to reduce accidents before they occur, is the work of Uma and Eswari [16], who developed a prototype using a Raspberry Pi and Pi Camera, along with sensors to monitor the driver's eye movements, detect yawning, and identify toxic gases and alcohol consumption. This system, employing the Haar Cascade algorithm for face detection and the calculation of Eye Aspect Ratio and Mouth Aspect Ratio, estimates risk through analysis of these features. Besides, to identify accident hot spots, Le et al. [23] used Road Traffic Accident data over three years in Hanoi, Vietnam, to develop a GIS-based statistical analysis technique. This method assesses the influence of accident severity on temporal-spatial patterns, identifying accident hotspots in relation to specific times of day and seasons.</p>
        <p>Beyond the mention in [24] of the potential service support of the cloud for autonomous vehicle applications, edge computing is playing a pivotal role in reshaping traffic management in smart cities. Within this domain, Mohamed's [25] and Zhou's research groups [26] demonstrated substantial improvements in traffic management and reduced congestion durations through an edge-based model for real-time traffic data analysis. Besides, to achieve low latency and high prediction accuracy for vehicle identification at the edge, Wan et al. [27] eliminated redundant frames from collected videos and presented an approach for real-time video processing.</p>
        <p>In a similar manner, Ke et al. [28] developed a multi-thread system for real-time detection of near-crash events in traffic, using video analytics on dashcams. Leveraging edge power, their system efficiently performs object detection and tracking directly from the video feeds on board. This approach removes irrelevant video to conserve bandwidth and storage while collecting diverse and valuable data for traffic safety, such as road user type, vehicle trajectory, vehicle speed, brake switch, and throttle. The approach from Ke et al. demonstrates considerable promise for widespread application due to its low cost, real-time processing, high accuracy, and broad compatibility with various vehicles and camera types.</p>
        <p>Additionally, a recent work by Nguyen et al. [29] utilized Blockchain technology alongside edge computing to develop a reliable and transparent situational awareness system for autonomous vehicles. Their system broadcasts notifications and alternative route suggestions from the nearest edge station when congestion or accidents are detected by other vehicles, using various sensing data sources, including dashcam images and environmental factors like weather, temperature, and humidity. The use of Blockchain in their study ensures data validity and integrity, and facilitates collaboration among different service providers. However, despite the recognized vision and applications, Zhou et al. [30] emphasized that employing edge computing in ITS always comes with inherent challenges related to sensor failure and privacy protection, which must be addressed for effective implementation.</p>
      </sec>
      <sec id="sec-2-4-1">
        <title>2.2. FL in ITS</title>
        <p>Building on the challenges identified by Zhou et al. [30], particularly concerning privacy protection, FL has recently seen growing use in smart cities. Among the many application domains within urban environments, FL in traffic systems is mostly leveraged for traffic monitoring and accident prediction.</p>
        <p>FedGRU, an FL-based Gated Recurrent Unit (GRU) neural network [17], is one of the pioneering works on traffic flow prediction (TFP) with federated deep learning. It performs comparably to other advanced competing methods without compromising the privacy and security of data. Additionally, as shown by experiments, the joint announcement protocol proposed in that paper reduces communication overhead by 64.10% compared with centralized models, indicating the scalability of FedGRU to larger networks.</p>
        <p>With the same motivation to address the privacy exposure risk of centralized machine learning, Qi et al. [31] presented a fully decentralized FL network, utilizing a Blockchain-based FL architecture as opposed to the conventional vanilla framework. The authors employed the local differential privacy technique to protect vehicle location and utilized GRU to achieve accurate TFP. They also compared performance and security across different machine learning models, with and without the use of blockchain.</p>
        <p>Concerning the monitoring of traffic congestion, typical systems begin by detecting vehicles and subsequently estimating traffic flow density. Xu et al. [32] employed remote sensing images for this purpose, while Chougule et al. [33] continuously used the traffic density estimated from intersection-captured images to dynamically adjust the duration of the green light and schedule the timing of signals across all lanes.</p>
        <p>As one of the highlights in the narrow field of applying FL to risk detection in ITS, Yuan et al. [34] introduced FedRD, a framework combining edge-cloud computing, FL, and differential privacy techniques for intelligent road damage detection and warning. The framework not only improves detection performance and coverage area but also addresses privacy concerns through Individualized Differential Privacy with a pixelization technique. Comprehensive evaluations demonstrate FedRD's capability to deliver high detection accuracy and wider coverage while preserving user privacy, even in scenarios where edge devices have limited data. This groundbreaking effectiveness sets a new benchmark in the field.</p>
      </sec>
      <sec id="sec-2-5">
        <title>2.3. GenAI in ITS</title>
        <p>Recently, GenAI has garnered significant attention in several applications, including ITS, due to its advantages and flexibility. By analyzing data from various sources, such as roadside sensors, vehicles, and traffic signals, GenAI enhances urban operations by detecting patterns, identifying trends, and providing accurate predictions and advice. By leveraging natural language processing, GenAI can present these predictions in human-understandable language, making these technologies more accessible and practical for smart services [35]. See prior works [36, 37] for examples of how GenAI is integrated into many services within cities. As another example in ITS, Impedovo et al. [38] propose a deep generative model to predict weekday vehicular traffic flow to prevent accidents in the most critical areas and improve continuity by reducing traffic. More notably, RAG, first introduced by Lewis et al. in 2020 [39], stands out as a part of this GenAI landscape, representing a distinct approach to generating text, informed reasoning, and supporting decision-making.</p>
        <p>Its application in ITS is not yet widespread; however, there are some notable works. For instance, Dai et al. [40] integrated RAG into autonomous driving systems to enhance decision-making processes. According to the authors, the use of RAG in their work addresses the problem of impractical generated content from current mainstream foundation models, such as GPT4 or LLaMa. It helps these models enhance the reliability of their outputs during the generation phase by dynamically retrieving accurate contextual information from external databases (e.g., updated traffic rules, driving experiences, or human preference). Similarly, Ding et al. [41] utilized RAG for more controlled generation of traffic scenarios. Specifically, RealGen [41] synthesizes new scenarios by combining behaviors from multiple retrieved examples in a gradient-free manner, using templates or tagged scenarios. This in-context learning framework provides versatile generative capabilities, including scenario editing, behavior composition, and the creation of critical scenarios, thus enhancing the adaptability and precision of synthetic data generation for various applications. Most recently, in his Master's thesis, Mohanan [42] evaluated eight embedding RAG models for a chatbot tailored to Indian Motor Vehicle Law.</p>
        <p>As can be seen, prior research typically focuses on a single module, such as risk estimation or warning generation, limiting possible support for ITS. This raises an open question: “Is it possible to integrate all diverse components into a cohesive and comprehensive ITS framework?” This is where our work is positioned.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. System Design</title>
      <p>This article presents a system for predicting and preventing traffic accidents. It is capable of predicting possible accidents based on traffic conditions and other available data, and provides detailed textual comments to the user explaining the grounds leading to such estimation. Figure 1 illustrates the overall system flow, highlighting the interplay between the key components: Federated Learning (FL) and Retrieval-Augmented Generation (RAG).</p>
      <p>[Figure 1: Overall system flow, combining severity estimation by FL and accident report generation by RAG. Figure labels: Severity Estimation by FL; Sensors Data; Preprocessing; Accident Severity Prediction Model; Estimated Accident Severity; Accident Report Generation by RAG; Query; Semantic Meaning Model; Embeddings; Similarity Search; Library; Relevant Chunks; Comprehensive analysis of US accident data; Warning Generation Model; Traffic Accident Report.]</p>
      <p>This integrated system combines the strengths of RAG and FL to ensure high-quality outputs while maintaining data privacy and relevance. FL enhances the accident severity prediction model while maintaining data privacy. The RAG system integrates the warning generation model with the knowledge retrieval model to enhance the generation process with relevant external data, improving context and accuracy.</p>
      <p>Our training approach starts from data preprocessing. The preprocessed dataset is then used to train the FL model for traffic accident risk estimation. The predictions, along with the sensors' real-time data, are utilized as input for the RAG model. The RAG model integrates advanced retrieval mechanisms with state-of-the-art language generation capabilities to produce detailed warnings and reports for traffic accidents.</p>
      <p>To efficiently manage and deploy these components, we use a task orchestration tool. This tool ensures seamless integration and coordination among the various models, automates deployment, and scales the system as needed. Additionally, it facilitates robust performance monitoring, ensuring high availability and fault tolerance across the system.</p>
      <sec id="sec-2-5-1">
        <title>3.1. Dataset</title>
        <p>This study uses the US Accidents (2016-2023) dataset<sup>1</sup> [43] from Kaggle, distributed under the CC BY-NC-SA 4.0 license. The dataset comprises over 7.7 million (7,728,394) traffic accident records, covering 49 states of the USA from February 2016 to March 2023. The accident data were collected using multiple APIs that provide streaming traffic incident data captured by various entities, including the US and state departments of transportation, law enforcement agencies, traffic cameras, and traffic sensors within the road networks. The data includes detailed information on accident severity, location, time, and weather conditions. This dataset was utilized to train the FL models for traffic accident prediction.</p>
      </sec>
      <sec>
        <title>3.2. Federated Learning</title>
        <p>Our application relies on an FL model for accident risk estimation. FL was selected based on two primary considerations: data privacy and collaborative enhancement.</p>
        <list list-type="order">
          <list-item><p>Privacy: Addressing privacy concerns, vehicles in a real scenario do not transmit raw data, which could potentially reveal sensitive information. Instead, only model parameters are sent, ensuring that individual data remains secure and private. This is not possible with traditional centralized learning, where all data need to be sent to a central server for training.</p></list-item>
          <list-item><p>Collaboration: When a vehicle updates and shares its model parameters, it contributes to the overall learning process. This collective effort leads to an improvement in the overall model's performance, as it can learn from a wide range of diverse and localized inputs. The shared knowledge enables more accurate and robust risk estimation.</p></list-item>
        </list>
        <p>The training data features provide a detailed view of accident records, including the specifics of the accidents, the geographic locations, the prevailing weather conditions at the time of the accidents, and various environmental and contextual factors that may be relevant to analyzing the accidents. In a real scenario, the vehicle's onboard computing system uses these inputs to continuously update its local model, learning from real data.</p>
        <p>Once the training is done, the model parameters are sent to the nearby edge server. After receiving a sufficient number of models, the server aggregates them to obtain the global model, which is then sent back to the participating vehicles. When this whole process is complete, one communication round is finished and the next round begins.</p>
      </sec>
      <sec>
        <title>3.3. Retrieval-Augmented Generation</title>
        <p>RAG combines an information retrieval component with a text generator model to provide situational information and guidance [44]. In the ITS context, RAG can integrate various external data sources to analyze and report traffic accidents, identifying risk factors and details [45]. This makes the system more dynamic and adaptable to new information. In our system (see Figure 1), RAG provides textual accident warnings to the end user, along with explanations of how the estimates were derived.</p>
        <p>Knowledge retrieval model: It is designed to find the most relevant information from an external knowledge base in response to the query. This enriches the FL model output and sensor data with relevant information. We use SentenceTransformers<sup>2</sup> as a retrieval model based on similarity search.</p>
        <p>Warning generation model: It is designed to generate new content using language models. It uses the information retrieved by the retrieval model and the FL output details to generate a response. For our system, we use gpt-3.5-turbo-0613<sup>3</sup> to create contextually relevant warnings and detailed reports. The accident report includes the severity of the accident, the location and traffic control procedures, and guidance and actions.</p>
      </sec>
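      <p>The communication round described in Section 3.2 (local training on each vehicle, upload of model parameters, aggregation at the edge server, and broadcast of the global model) can be sketched as follows. This is a minimal illustration only: the text does not state the aggregation rule, so a sample-size-weighted average in the style of FedAvg is assumed, and the parameter layout (flat lists of floats keyed by a layer name) and the function name are hypothetical.</p>
      <preformat>
```python
from typing import Dict, List

# Assumption: each model is a dict mapping a (hypothetical) layer name
# to a flat list of float parameters.
Params = Dict[str, List[float]]


def federated_average(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Aggregate client models into a global model.

    Assumed rule: sample-size-weighted averaging (FedAvg-style). With equal
    client dataset sizes this reduces to a plain mean of the parameters.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total == 0:
        raise ValueError("total sample count must be positive")

    global_params: Params = {}
    for name in client_params[0]:
        width = len(client_params[0][name])
        merged = [0.0] * width
        for params, size in zip(client_params, client_sizes):
            weight = size / total  # share of all samples held by this client
            for i in range(width):
                merged[i] += weight * params[name][i]
        global_params[name] = merged
    return global_params


if __name__ == "__main__":
    # Two toy "vehicles" that upload parameters only, never raw data.
    vehicle_a = {"layer1": [1.0, 3.0]}
    vehicle_b = {"layer1": [3.0, 5.0]}
    print(federated_average([vehicle_a, vehicle_b], [1, 1]))
```
      </preformat>
      <p>Only the parameter dictionaries cross the network in this sketch, which mirrors the privacy argument above: the raw accident records stay on the device that produced them.</p>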
      <sec id="sec-3-1">
        <title>3.4. Task Orchestration and Monitoring</title>
        <p>Effective resource management and device health monitoring are essential for enhancing the responsiveness of smart city services. This requires comprehensive system monitoring that spans from edge devices to the cloud. The deployment of applications on edge devices necessitates advanced task orchestration platforms, which must be carefully selected based on specific requirements. Given that edge devices typically have limited resources, the chosen tool must operate smoothly under such constraints. For the proposed system, k0s<sup>4</sup> has been selected. We selected k0s because of its minimal resource consumption on edge devices and its straightforward and rapid implementation process, supported by comprehensive documentation and active developer forums. It typically operates with as little as 1 CPU and 512 MB of RAM on each controller node and 1 GB of RAM on each worker node, which aligns well with the capabilities of edge devices. However, the minimum requirements increase as the number of worker nodes grows. Additionally, numerous monitoring options compatible with k0s are available. k0s is packaged as a single, self-extracting binary that embeds the Kubernetes binaries. Among its benefits, it has no OS-level dependencies, and everything can be, and is, statically compiled.</p>
      </sec>
      <fn-group>
        <fn><label>1</label><p>https://www.kaggle.com/datasets/sobhanmoosavi/us-accidents</p></fn>
        <fn><label>2</label><p>https://sbert.net/</p></fn>
        <fn><label>3</label><p>https://platform.openai.com/docs/models/gpt-3-5-turbo</p></fn>
        <fn><label>4</label><p>https://docs.k0sproject.io/stable/</p></fn>
      </fn-group>
    </sec>
    <sec id="sec-4">
      <title>4. System Implementation</title>
      <sec>
        <title>4.1. Risk Estimation with FL</title>
        <sec>
          <title>4.1.1. Preprocessing</title>
          <p>The preprocessing phase for our system includes a series of essential data preparation steps to ensure the quality of the dataset for further analysis:</p>
          <list list-type="order">
            <list-item><p>Data Cleaning: Duplicated and missing values were removed.</p></list-item>
            <list-item><p>Feature Engineering: To enhance the informativeness of the dataset, a new feature called “Comfort_Index” is created following Equation 1.
              <disp-formula><label>(1)</label><tex-math>\mathrm{Comfort\_Index} = \left(\mathrm{Temperature(F)} - 32\right) \times \frac{\mathrm{Humidity(\%)}}{100}</tex-math></disp-formula></p></list-item>
            <list-item><p>Data Resampling: To address class imbalance, both random oversampling and undersampling of the data were applied so that each label had an equal distribution.</p></list-item>
            <list-item><p>Data Transformation: Done according to feature type:</p>
              <list list-type="bullet">
                <list-item><p>Categorical Data: One-hot encoding was applied to categorical columns, except for “Street,” “State,” and the target label “Severity”.</p></list-item>
                <list-item><p>Boolean Data: Columns with two distinct values were binarized, converting them to 0 and 1.</p></list-item>
                <list-item><p>Numeric Data: Columns containing numeric data were left unchanged, preserving their original values.</p></list-item>
              </list></list-item>
            <list-item><p>Standardization: The dataset was then standardized with StandardScaler, so that all features have zero mean and unit variance and are therefore on comparable scales.</p></list-item>
          </list>
        </sec>
        <sec>
          <title>4.1.2. FL Training and Prediction</title>
          <p>To simulate a real-world scenario using our chosen dataset, we distributed the data across several nodes and established certain assumptions. This section elaborates on those details.</p>
          <p>Distribution: The data is divided into five equal parts, corresponding to five nodes in the system. We also ensure that the samples of each label are distributed equally among clients.</p>
          <p>Model Training: Each client trains its local model, consisting of three fully connected layers. Training specifications include the cross-entropy loss function, the Adam optimizer with a learning rate of 1e-3, and a batch size of 32. After ten training epochs, the locally trained models are aggregated by the server into a global model, and the global parameters are saved at each checkpoint, here at each communication round, before being sent back to the participants for training in the next round.</p>
          <p>The FL training process concludes after ten communication rounds. At this stage, various model architectures, encompassing differing layer counts and hyperparameters, were evaluated over 50 communication rounds to observe the trend and convergence of their performance. The selected model outperformed the alternatives: models with fewer layers demonstrated inferior outcomes (3-4%), while configurations with additional layers, despite a 3% accuracy improvement, incurred prolonged training duration and converged to local, rather than global, optima. See Table 1 for details.</p>
        </sec>
      </sec>
      <p>Using the RAG model, we retrieve text passages using an input sequence. During the generation of the target sequence, we include these passages as additional context. Our model leverages two components, a retriever and a generator, which are implemented in LangChain<fn><label>5</label><p>https://www.langchain.com/</p></fn>. The retriever retrieves relevant text snippets in response to a user's query or prompt from a knowledge source, which is uploaded using a built-in document loader from LangChain.</p>
      <p>In our system, we rely on the US traffic accident database as an external knowledge source, containing a comprehensive analysis of US traffic accident data [46]. This report provides insight into preventive measures and policy recommendations for decreasing traffic accidents in the US, based on detailed analyses by state, time, and contributing factors such as weather. The retrieval process begins with loading documents using a tool in LangChain. This process is enhanced by a splitter tool, also integrated into LangChain, which segments extensive texts into smaller chunks of a specified chunk size by examining characters recursively. This is crucial for the efficient handling of large textual data.</p>
      <p>For the creation of text embeddings, we employ HuggingFaceEmbeddings, a specialized embedding model from the Hugging Face library<fn><label>6</label><p>https://huggingface.co/</p></fn> within LangChain. This model transforms the segmented text chunks into numerical vectors, facilitating their computational handling.</p>
      <p>To store these embedding vectors in a vector store, we utilize the FAISS library<sup>7</sup>, a robust vector database. It enables effective similarity search by identifying the text chunk vectors most similar to the question vector. This process is vital for determining which portions of the knowledge source are most pertinent to the input query. At query time, retrieval is controlled by the k argument, which selects the top k most relevant text chunk vectors for each query. Table 2 summarizes the RAG parameters used.</p>
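      <p>The top-k lookup described above can be illustrated with a brute-force sketch. It is not the FAISS API, which performs the same kind of search over an index; it only shows the idea of ranking chunk vectors against a query vector. Cosine similarity is an assumption here, since the text does not name the distance metric, and the function names, toy vectors, and default k are hypothetical.</p>
      <preformat>
```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def top_k_chunks(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    chunks: Sequence[str],
    k: int = 3,  # hypothetical default; the actual value is a RAG parameter
) -> List[Tuple[float, str]]:
    """Return the k text chunks whose embeddings are most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for vec, text in zip(chunk_vecs, chunks)
    ]
    # Highest similarity first, then keep only the k best matches.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for real sentence embeddings.
    texts = ["rain-related crashes", "fog-related crashes", "clear-weather crashes"]
    vectors = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
    for score, text in top_k_chunks([1.0, 0.1], vectors, texts, k=2):
        print(round(score, 3), text)
```
      </preformat>
      <p>A vector store replaces the linear scan with an index, which matters once the knowledge source is split into many chunks, but the ranking criterion stays the same.</p>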
      <p>The generator creates a more detailed, factual, and relevant response based on the original input and retrieved documents. The original input represents the severity of an accident, derived from the FL output and complemented by real-time sensor data. To generate coherent and contextually relevant text, the original input and the retrieved documents are fed into gpt-3.5-turbo-0613, a pre-trained language model.</p>
      <sec id="sec-4-1">
        <p>Based on the content of these documents, the model generates coherent and contextually relevant text grounded in real-world information. Figure 2 illustrates an example of a traffic accident report generated by RAG.</p>
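        <p>How the generator input might be assembled from the FL output, the sensor data, and the retrieved passages can be sketched as below. The prompt wording, the function name, and the sensor field names are hypothetical and are not taken from the implementation. The sketch only builds the text that would be passed to the language model and makes no API call.</p>
        <preformat>
```python
from typing import Dict, List


def build_report_prompt(
    severity: int,
    sensor_data: Dict[str, object],
    passages: List[str],
) -> str:
    """Combine FL output, sensor readings and retrieved text into one prompt.

    Assumption: the prompt template below is illustrative only.
    """
    # Retrieved passages become a bulleted context block.
    context = "\n".join(f"- {passage}" for passage in passages)
    # Sort the readings so the prompt is deterministic for the same input.
    readings = ", ".join(
        f"{key}={value}" for key, value in sorted(sensor_data.items())
    )
    return (
        "You are a traffic safety assistant.\n"
        f"Predicted accident severity (FL model output): {severity}\n"
        f"Real-time sensor data: {readings}\n"
        "Retrieved background passages:\n"
        f"{context}\n"
        "Write a warning and a short report covering the severity, the location "
        "and traffic control procedures, and recommended guidance and actions."
    )


if __name__ == "__main__":
    print(
        build_report_prompt(
            3,
            {"weather": "rain", "speed_kmh": 80},
            ["Wet roads increase braking distance."],
        )
    )
```
        </preformat>
        <p>Keeping prompt construction in a pure function makes it possible to inspect exactly what context the language model receives for a given severity estimate.</p>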
        <sec id="sec-4-1-1">
          <title>4.3. Task Orchestration and Monitoring</title>
          <p>As discussed in Sub-section 3.4, we opted for k0s, which
is ideal for our needs and simple to implement. We
used Lens IDE (footnote 8), a Kubernetes IDE, to manage
the cluster and monitor the whole system. It
allows for comprehensive oversight of nodes, pods, and
resource usage. Monitoring involves tracking the
usage of CPU, memory, storage, and network bandwidth,
as well as device safety and functionality, to detect
any potential problem. We containerized our application
using Docker (footnote 9) and deployed it using Lens
IDE and the k0s task orchestration tool. We used the cluster
metrics in Lens IDE to monitor the resources efficiently.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. System Evaluation</title>
      <p>To assess the system’s performance, several key metrics
were employed. We want to ensure that all the
components work correctly both independently and in the
integrated system. First, we monitored the accuracy of
the FL model for risk estimation, assessing its ability to
predict traffic accident severity. This evaluation utilized
the dataset for training the model. Additionally, the
quality and relevance of warnings and reports generated by
the RAG model were assessed. The system’s prompt
responsiveness was also tested, particularly how quickly
it can generate alerts and warnings based on incoming
data. Furthermore, the resource management aspect was
evaluated to ensure that the system’s resource usage is
optimized and well-maintained. The developed system
was deployed and tested on a real cluster of three nodes
with k0s equipped with the monitoring application.</p>
      <p>5.1. Risk Estimation Evaluation</p>
      <p>5.1.1. Accuracy</p>
      <sec id="sec-5-1">
        <title>We monitor the training process of the FL model in terms of accuracy, loss, and convergence. The training for 50 communication rounds with 5 training clients takes up to 4.042 hours.</title>
        <p>The training accuracy is shown in the upper graph
and the training loss in the lower graph. The model
converges at approximately round 30 with an accuracy
of 71.15%, as depicted in the upper plot. Initially, model
accuracy exhibits an upward trend from round 0 to 30,
albeit with fluctuations around rounds 15-17
and 21. After round 30, the risk estimation
model reaches a plateau in accuracy,
indicating convergence. This is also reflected in the lower
graph of training loss.</p>
        <p>[Figure: training accuracy (upper graph) and training loss (lower graph) over communication rounds]</p>
      </sec>
      <sec id="sec-5-2">
        <title>It is, however, possible for devices with limited power resources to terminate the training process at an earlier stage, such as after round 10 or 20, with negligible trade-offs in accuracy.</title>
        <p>5.1.2. Total latency trends</p>
      </sec>
      <sec id="sec-5-3">
        <title>The bar graph (see Fig. 4) depicting the total latency for predictions reveals a clear trend: as the number of inputs processed simultaneously increases, so does the time required for prediction.</title>
        <p>[Figure 4: bar graph of time response in seconds versus number of inputs; labeled values range from 0.3931 to 0.9463]</p>
      </sec>
      <sec id="sec-5-4">
        <title>Starting from a swift 0.3931 seconds for a single input,</title>
        <p>the latency rises moderately for batches of 10 and 100
inputs, reaching 0.4062 seconds, suggesting that the model
handles small to moderate increases in input size
efficiently. However, as input sizes increase to 1,000 and 10,000,
the total latency grows more substantially, reaching 0.4487
seconds for 10,000 inputs. This increase continues,
even more sharply, with the model taking 0.9463 seconds
to predict outcomes for 100,000 inputs concurrently.
Overall, this evaluation underscores the FL
model’s scalability: total latency stays below one second
for small input batches and for batches as large as
100,000 inputs. Nevertheless, it should be noted that the
measured time can differ across devices.</p>
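        <p>The scalability claim can be made concrete by converting the reported total latencies into an amortized per-input latency and a throughput. The sketch below uses only the figures quoted in the text; attaching the 0.4062 s value to the batch of 100 is an assumption, since the text reports it jointly for batches of 10 and 100.</p>
        <preformat>
```python
# Total prediction latency in seconds, keyed by batch size (values quoted in the text).
# Assumption: 0.4062 s is attached to the batch of 100.
total_latency_s = {
    1: 0.3931,
    100: 0.4062,
    10_000: 0.4487,
    100_000: 0.9463,
}

def per_input_latency(latencies):
    """Amortized latency per input (seconds) for each batch size."""
    return {n: t / n for n, t in latencies.items()}

def throughput(latencies):
    """Inputs processed per second for each batch size."""
    return {n: n / t for n, t in latencies.items()}

amortized = per_input_latency(total_latency_s)
rate = throughput(total_latency_s)
for n in sorted(total_latency_s):
    print(f"batch={n:>7}  total={total_latency_s[n]:.4f} s  "
          f"per-input={amortized[n]:.3e} s  throughput={rate[n]:,.0f} inputs/s")
```
        </preformat>
        <p>Under these numbers the total latency grows by a factor of roughly 2.4 while the batch size grows by a factor of 100,000, so the amortized cost per input keeps falling as batches get larger.</p>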
        <sec id="sec-5-4-1">
          <title>5.2. Accident Warning Report Evaluation</title>
        </sec>
      </sec>
      <sec id="sec-5-5">
        <title>To evaluate the quality of the accident warning reports generated</title>
        <p>by RAG, we used correctness, relevance, and
faithfulness as criteria to assess LLM outputs (footnote 10). We used
gpt-3.5-turbo-0613 for the evaluation task to contextually
analyze and interpret the generated reports according to these
criteria.</p>
        <p>Correctness is based on the LLM’s internal knowledge.
However, given the potential unreliability of the LLM’s
knowledge base, we enhanced the evaluation method by
incorporating reference labels, which provide an
external benchmark for correctness. The evaluation process
produces a dictionary containing three keys: “score”, a
binary integer (0 or 1) indicating compliance with
the criteria; “value”, which is either “Y” (Yes) or “N” (No)
based on the score; and “reasoning”, which outlines the
LLM’s chain of thought. Relevance evaluates the
relevance and focus of the generated answer in relation to
the provided prompt. Faithfulness assesses the factual
consistency of the generated answer against the given
context and reference documents. Using this approach,
we ensure that the generated content not only meets the
prompt’s specific requirements but also remains true to
the factual information provided in the reference
material. Figure 5 illustrates an example of RAG output
evaluation.</p>
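        <p>The structure of the evaluation result described above lends itself to a simple consistency check. The following sketch assumes one result dictionary per criterion with the keys “score”, “value”, and “reasoning”; the helper names and the example results are hypothetical and are not taken from the evaluation library.</p>
        <preformat>
```python
def is_consistent(result):
    """Check that a result has the three expected keys and that score and value agree."""
    required = {"score", "value", "reasoning"}
    if not required.issubset(result):
        return False
    if result["score"] not in (0, 1) or result["value"] not in ("Y", "N"):
        return False
    # A score of 1 must pair with "Y", and a score of 0 with "N".
    return (result["score"] == 1) == (result["value"] == "Y")

def pass_rate(results):
    """Fraction of criteria whose score is 1; results maps criterion name to result dict."""
    if not results:
        return 0.0
    return sum(r["score"] for r in results.values()) / len(results)

# Hypothetical results for one generated report.
results = {
    "correctness": {"score": 1, "value": "Y", "reasoning": "Matches the reference label."},
    "relevance": {"score": 1, "value": "Y", "reasoning": "Addresses the prompt."},
    "faithfulness": {"score": 0, "value": "N", "reasoning": "One claim is not in the context."},
}
all_valid = all(is_consistent(r) for r in results.values())
print(all_valid, round(pass_rate(results), 3))
```
        </preformat>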
        <p>Based on the correctness, relevance, and faithfulness
criteria, the evaluation shows that the output accurately
refers to a real quote from the text. According to the evaluation
output, the response addresses all necessary elements in a
comprehensive, well-structured, and well-written manner.
It summarizes the accident data and provides a comprehensive
analysis of weather conditions at the time of the accident,
including visibility, and of the accident’s severity. Additionally,
it provides recommendations for preventing future accidents
that are relevant to the reference.</p>
        <p>10 https://python.langchain.com/docs/guides/evaluation/string/criteria_eval_chain</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion and Future Work</title>
      <p>Correctness_criteria:
{'reasoning': To determine if the submission meets the criteria, we need to evaluate the
correctness, accuracy, and factual nature of the submission.
1. Check if the submission correctly presents the accident data, including the street,
state, latitude, longitude, and various factors related to the accident.
2. Verify if the submission accurately describes the weather conditions at the time of the
accident, including temperature, wind chill, humidity, pressure, visibility, wind
direction, and precipitation.
3. Assess whether the submission accurately provides information about the severity of the
accident, distance, sunrise/sunset, and comfort index.
4. Evaluate if the recommendations for future prevention are reasonable and relevant to
the accident scenario.</p>
      <p>Based on the above reasoning, the submission meets the criteria if all the above conditions
are satisfied. 'score': 1, 'value': 'Y'}
Relevance_criteria:
{'reasoning': To determine if the submission meets the criteria of relevance, we need to
compare the content of the submission with the provided data.</p>
      <p>We will check if the submission accurately refers to a real quote from the text.
- The submission provides a detailed analysis of the accident data, including the street,
state, and various accident factors. It also mentions the weather conditions, severity,
and recommendations for future prevention based on the given data.
- The submission accurately reflects the information provided in the data.
- Therefore, the submission meets the criteria of relevance.</p>
      <p>Based on the above reasoning, the conclusion is that the submission meets all the
criteria. 'score': 1, 'value': 'Y'}
Faithfulness_criteria:
The assistant's response is faithful to the reference context. It accurately summarizes
the accident data provided in the user question and provides a detailed analysis of the
accident. It also offers recommendations for future prevention. The response is
comprehensive and covers all the relevant aspects of the accident data.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
    </sec>
  </body>
  <back>
  </back>
</article>