<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Evaluation of “Crash Prediction Networks” (CPN) for Autonomous Driving Scenarios in CARLA Simulator</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Saasha Nair</string-name>
          <email>saasha.nair@santannapisa.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sina Shafaei</string-name>
          <email>sina.shafaei@tum.de</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Auge</string-name>
          <email>daniel.auge@tum.de</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alois Knoll</string-name>
          <email>knoll@in.tum.de</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Excellence in Robotics &amp; AI, Scuola Superiore Sant'Anna</institution>
          ,
          <addr-line>Pisa</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>TeCIP Institute, Scuola Superiore Sant'Anna</institution>
          ,
          <addr-line>Pisa</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Technical University of Munich</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Safeguarding the trajectory planning algorithms of autonomous vehicles is crucial for safe operation in mixed traffic scenarios. This paper proposes the use of an ensemble of neural networks to work together under the moniker of “Crash Prediction Networks”. The system comprises multiple independent networks, each focusing on a different subset of sensory inputs. The aim is for the networks to work together in unison to reach a consensus of whether a vehicle might enter a catastrophic state to trigger an appropriate intervention. The proposed approach would act as an additional layer of safety by supervising the decision making module of an autonomous vehicle. Though the proposed approach encompasses all the sensors and allied paraphernalia, the scope of this paper is exclusively limited to exploring safety monitors for visual sensors. The approach can, however, be extrapolated to other sensors. The evaluation was conducted using the CARLA simulator for simple driving scenarios studying the benefits of modeling temporal features to capture the motion in the environment. Additionally, the paper studies the importance of 'accounting for uncertainty' in models dealing with vehicle safety.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Safety, as defined by Avizienis (Avizienis et al. 2004), is the
“absence of catastrophic consequences on the user(s) and the
environment”. Therefore, safety engineering (
        <xref ref-type="bibr" rid="ref22">Törngren et al.
2018</xref>
        ) relates to “methods used to assess, eliminate and/or
reduce risk to acceptable level”. Safety in vehicles requires
implementing vehicle-level behaviours
        <xref ref-type="bibr" rid="ref10">(Koopman and
Wagner 2017)</xref>
        that ensure safe and reliable performance across
different contexts
        <xref ref-type="bibr" rid="ref14">(McAllister et al. 2017)</xref>
        . This could entail
safety by construction, analysis, and verification/validation.
Currently, however, testing is the main method used for
ensuring safety in automated vehicles, and includes the
following types
        <xref ref-type="bibr" rid="ref7">(Huang et al. 2016)</xref>
        :
1. Software testing: helps to ensure the correctness of the
program at the source code level via unit tests and testing
tools.
2. X-in-the-Loop (XiL) testing: as introduced in
        <xref ref-type="bibr" rid="ref17 ref21">(Stellet
et al. 2015; Riedmaier et al. 2018)</xref>
        , combines real-world
and simulated components for testing the functionality of
parts of the system. The software, hardware and the
vehicle itself can be tested while simulating the residual parts.
3. Testing in real traffic: allows studying how the
autonomous vehicle reacts to real-world driving scenarios
on public roads with other participating actors.
      </p>
      <p>
        Safety concerns are still seen to be a major impediment in
the wide-scale adoption of autonomous vehicles
        <xref ref-type="bibr" rid="ref14 ref18 ref8 ref9">(Rudolph,
Voget, and Mottok 2018; Kalra and Paddock 2016;
McAllister et al. 2017; Koopman and Wagner 2016)</xref>
        . Safety
monitors can potentially help to alleviate the safety concerns
surrounding autonomous vehicles, by observing the inputs to
and outputs from the various modules contained in the
driving pipeline. These monitors usually differentiate between
three main states that the system of interest, in this case,
the car, may end up in
        <xref ref-type="bibr" rid="ref13">(Machin et al. 2016)</xref>
        . The first
being the catastrophic state, that is one where the damage has
already been done and from which the system cannot be
recovered. The remaining two states, namely the safe state and
the warning state, are non-catastrophic. In the safe state the
system behaves as expected, without any constraints. The
warning state is where the system is close to being involved
in a catastrophe but can still be salvaged. For a system to
transition from a safe to a catastrophic state, it must always
pass through the warning state, thus the warning state can
be thought of as the margin, where applying the correct
intervention brings the system back to the safe state. Thus,
it requires defining a safety strategy that identifies potential
warning states and specifies corresponding safety rules to
guide the response of the autonomous system.
      </p>
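      <p>
        The three-state model above can be sketched as a minimal monitor. This is an illustrative sketch only: the state names and the rule that a system must pass through the warning state before a catastrophe follow the text, while the helper function and its arguments are assumptions.

```python
# Minimal sketch of the three-state safety monitor described above.
# The states and the safe -> warning -> catastrophic ordering come from
# the text; the transition helper itself is a hypothetical placeholder.

SAFE, WARNING, CATASTROPHIC = "safe", "warning", "catastrophic"

def next_state(current, margin_violated, crash_occurred):
    if crash_occurred:
        # Only the warning state can degrade into a catastrophe;
        # a safe system must first pass through the warning margin.
        return CATASTROPHIC if current == WARNING else WARNING
    if margin_violated:
        return WARNING
    # A correct intervention in the warning state restores the safe state.
    return SAFE

assert next_state(SAFE, margin_violated=True, crash_occurred=False) == WARNING
assert next_state(WARNING, margin_violated=False, crash_occurred=False) == SAFE
```
      </p>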
      <p>
        The problem with existing approaches to safety
monitoring, such as “Safety Monitoring Framework
(SMOF)”
        <xref ref-type="bibr" rid="ref13">(Machin et al. 2016)</xref>
        , is that they require
handpicked and hard-coded limiters defined at the design stage
describing what counts as safe or unsafe intervals of input
and output. However, in neural networks, rather than the
behaviour being hard-coded into the system, the models
learn to detect patterns from the data they were trained on.
This makes it difficult to combine neural networks with
the existing safety monitoring techniques, and might in
fact limit the expressive power and freedom of neural
networks
        <xref ref-type="bibr" rid="ref1">(Ashmore, Calinescu, and Paterson 2019)</xref>
. Thus this
paper introduces “Crash Prediction Networks” (CPN)
(Nair et al. 2019) as a ‘safety by construction’ solution for
monitoring modules composed of neural networks. The
aim of this paper is to explore the effectiveness of CPN
in monitoring various aspects of vehicular safety. CPN for
this purpose has been evaluated in a simulated environment
and in the presence of dynamic obstacles. Additionally, the
potential enhancement of prediction accuracy has also been
investigated using temporal features in the architecture of
the networks.</p>
      <sec id="sec-1-1">
        <title>Overall Architecture</title>
        <p>Consider an end-to-end setup for the driving system with a
range of sensors that can be trained in simulation. The
driving module uses information about the state of the
environment to decide whether to continue straight, turn, or apply
the brakes. The Crash Prediction Network can be thought of as
an envelope around this driving module. The aim of CPN
is to study the action decision obtained as output from the
driving module, to determine whether it is likely to lead to
a crash given the sensory information about the state. Safety
monitors need to be extremely robust and reliable, thus to
this effect CPN is suggested to be implemented as an
ensemble (Zhao, Gao, and Yang 2005) of neural
networks. Each network differs either in architecture or the
subset of sensory data it consumes as input. The networks
then work in unison to reach a consensus on whether the
vehicle should continue with the current action or trigger an
intervention.</p>
        <p>During the training phase for CPN (Figure 1), a dataset is
created by allowing a Reinforcement Learning driving agent
to interact with the environment, and storing the information
of the states encountered and the action taken along with the
outcome. This allows the training of the CPN to be posed as
a classification problem with two classes ‘safe’ or ‘unsafe’
indicating whether the action decision with the given state
information led to a collision or not. During the operational
phase (Figure 2), the ensemble of networks that compose
CPN observe the input from the various sensors and the
decision proposed by the driving module to predict the
“safeness” of the outcome. If the proposed action is likely to lead
to a catastrophic state, a predefined intervention is triggered
thereby filtering out potentially dangerous action decisions,
else, the proposed action is executed. The predefined
intervention, also referred to as the “Fail Safe Mode” (as seen in
Figure 2), can take the form of transferring control to the
human driver, or pulling over to the shoulder of the road, and
so on. This paper does not delve into the details of the
intervention, but rather focuses on developing the technique to
identify when the intervention should be triggered.</p>
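        <p>
          The operational-phase gating described above can be sketched as follows. Only the threshold-and-intervene logic and the 0.6 decision threshold (stated in the dataset section) come from the paper; the ensemble members, their weights, and all names in this sketch are illustrative assumptions.

```python
# Sketch of the operational phase: the CPN ensemble inspects the state and
# the proposed action, and either passes the action through or triggers the
# fail-safe mode. The toy "networks" and weights are illustrative only.

THRESHOLD = 0.6  # decision threshold from the dataset section

def gate_action(state, action, ensemble, weights):
    """Weighted consensus of the ensemble; returns the action or a fail-safe flag."""
    scores = [net(state, action) for net in ensemble]
    consensus = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    if consensus > THRESHOLD:      # predicted unsafe
        return "FAIL_SAFE_MODE"    # e.g. hand control to the driver, pull over
    return action                  # proposed action is considered safe

# Two toy members that output constant crash probabilities:
# consensus = (1.0*0.9 + 2.0*0.7) / 3.0, about 0.77, i.e. unsafe.
result = gate_action(None, "brake", [lambda s, a: 0.9, lambda s, a: 0.7], [1.0, 2.0])
assert result == "FAIL_SAFE_MODE"
```
        </p>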
        <p>A potential problem that might be encountered is that
CPN might lose its relevance in the real-world over time due
to the dynamic and constantly evolving state of the
operational environment. The setup of CPN makes it possible to
train the network in an iterative manner, which can be used
to combat this problem. Implementing iterative training with
CPN would require the continuous collection of live
driving data during the operational phase and training CPN on
the newly collected data at regular intervals. Additionally, an
advantage with this technique is that it is not tightly coupled
with the nature of the driving agent, though the experiments
use an RL-based driving agent, one could easily swap it for
any other driving agent.</p>
        <p>
          Though the proposed architecture and techniques suggest
the use of an ensemble of networks each focusing on a
subgroup of different sensors, the scope of this evaluation is
limited only to visual data collected using RGB cameras
mounted on the hood of the car in simulation. The paper
studies the importance of accounting for temporal features
in the data on the prediction accuracy of CPN by modeling
two different network architectures, namely “Simple CPN”
and “Spatio-Temporal (ST)-CPN” as depicted in Figure 3.
Simple CPN uses a single frame, i.e., an image of the
current state to make a prediction about the level of safety of the
next state based on the proposed action of the driving
module. This is achieved by using a VGG network
          <xref ref-type="bibr" rid="ref20">(Simonyan
and Zisserman 2014)</xref>
          (trained from scratch with the dataset
collected in CARLA as described in the following section)
for feature extraction, which is then concatenated with the
action decision to perform the classification. The ST-CPN
architecture, on the other hand, takes as input an N-frame
long history, i.e., the last N-frames encountered before the
current state, along with the proposed driving decision.
        </p>
        <p>
          Additionally, the importance of accounting for
uncertainty is also studied in this evaluation. Standard deep
learning uses point estimates for predictions
          <xref ref-type="bibr" rid="ref6">(Goodfellow,
Bengio, and Courville 2016)</xref>
          . Thus, even when the model
encounters inputs that are dissimilar to the ones that it was
trained on, it might counterintuitively generate a high
probability score, thereby making probability scores an unreliable
estimate of the model’s confidence. This can be combated
with the use of Bayesian deep learning which allows for
a probabilistic approach to predictions by inferring
distributions over the model parameters
          <xref ref-type="bibr" rid="ref4 ref11">(Gal 2016; Kwon et al.
2020)</xref>
          . Beside generating uncertainty estimates, Bayesian
deep learning also helps reduce over-fitting. However, such
models are difficult to train, and usually have intractable
objective functions. Thus, in this work, we explore the need
for accounting for uncertainty in safety monitors, like CPN,
with the use of MC-Dropout
          <xref ref-type="bibr" rid="ref5">(Gal and Ghahramani 2016)</xref>
          to
approximate the Bayesian function.
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Experiments and Evaluations</title>
      <p>
        As stated in the previous section, the scope of the
experiments was limited to inputs from RGB cameras placed on
the hood of the car. Additionally, the focus of the networks
was on preventing “locally avoidable catastrophes”
        <xref ref-type="bibr" rid="ref22">(Saunders et al. 2018)</xref>
        , i.e., ones that can be avoided by adjusting
the course of action when danger is imminent. This
simplifying assumption eliminates the need for long-term strategic
planning and focuses only at the point of failure. The
experiments were conducted using Carla 0.9.6, “an open-source
simulator for urban driving”
        <xref ref-type="bibr" rid="ref3">(Dosovitskiy et al. 2017)</xref>
        . The
CARLA simulator provides a “Scenario Runner”, which
acts as an additional layer over the simulator, to support the
testing of driving scenarios laid out by NHTSA as a list of
pre-crash typologies
        <xref ref-type="bibr" rid="ref16">(Najm et al. 2007)</xref>
        . To study the
importance of temporal features in safety prediction, ST-CPN
is compared against the single frame input of Simple CPN.
The evaluation in this paper, uses a history length of 10,
meaning the last 10 image frames encountered by the ego
vehicle are fed as input to the ST-CPN model. Both the
models are extended for further experiments with uncertainty by
applying MC-Dropout
        <xref ref-type="bibr" rid="ref5">(Gal and Ghahramani 2016)</xref>
        . To this
effect, a Dropout layer with a probability of 0.4 is applied
during training and inference after each of the trainable
layers (namely, conv, convlstm and dense) in the models.
      </p>
      <sec id="sec-2-1">
        <title>Dataset</title>
        <p>Creating a representative dataset is a vital part of the deep
learning pipeline. Data for the experiments in this paper was
collected by allowing the ego vehicle backed by an
RL-agent to drive in and interact with the simulated environment
in CARLA. The simulator provides pre-built environments,
called “Towns”. Towns 01, 03 and 04 were used for training
and validation, while towns 02 and 05 were used for testing.
For the initial tests of the proposed approach presented here,
the scale of the experiment was fairly limited with 18000
images (12000 safe and 6000 unsafe) used for training and
9000 (6000 safe and 3000 unsafe) used for testing.</p>
        <p>For the experiments discussed here, the ego vehicle used
in the simulation was equipped with three RGB cameras,
placed on the left, right and center of the far-front of the
hood of the car. The cameras enabled the ego vehicle to
better perceive its surroundings, allowing for a wider field of
view. As mentioned before, the two network architectures
required two different formats of data. Thus, for the Simple
CPN model single frames of images were stored. The RGB
images from the three cameras were first converted to
grayscale to reduce the effect of color on the decision making of
the neural network since the network only needs to detect
the presence of an obstacle, and not the type of the obstacle.
The single-channel gray-scale images from the three
cameras were then combined depth-wise to create a single
three-channel image of dimension 84x84x3 (as depicted in
Figure 4), such that each channel represented one of the
grayscale images. To extend the data to the ST-CPN model, a
similar procedure was followed to store a concatenation of
the last 10 image frames per step, such that the 10 frames
were stacked vertically to generate an (84x10)x84x3 image.
Before being fed to the model as input, the single long
image was processed into a series of 10 images of dimensions
84x84x3 akin to a short video. The two networks perform
binary classification, such that the final dense layer contains
a single neuron. Thus, a decision threshold value of 0.6 was
used, such that if the output layer neuron produces a value
greater than 0.6 then the state-action pair is classified as
unsafe.</p>
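        <p>
          The preprocessing steps above can be sketched as follows. The shapes, the depth-wise stacking, and the 0.6 threshold come from the text; the grayscale luminance weights are a common convention and an assumption, as the paper does not specify the conversion.

```python
import numpy as np

# Sketch of the preprocessing described above: three camera frames are
# converted to grayscale and combined depth-wise into one 84x84x3 input;
# for ST-CPN, a vertically concatenated (84*10)x84x3 strip is split back
# into a 10-frame history. Luminance weights are an assumed convention.

def to_gray(rgb):
    return rgb @ np.array([0.299, 0.587, 0.114])   # (84, 84, 3) -> (84, 84)

def stack_cameras(left, center, right):
    """Each channel of the output holds one camera's grayscale view."""
    return np.stack([to_gray(left), to_gray(center), to_gray(right)], axis=-1)

def to_history(long_image, n=10):
    """Split the stored (84*n)x84x3 strip into n frames, like a short video."""
    return long_image.reshape(n, 84, 84, 3)

def classify(prob, threshold=0.6):
    """A single output neuron above the 0.6 threshold means 'unsafe'."""
    return "unsafe" if prob > threshold else "safe"

frame = np.zeros((84, 84, 3))
assert stack_cameras(frame, frame, frame).shape == (84, 84, 3)
assert to_history(np.zeros((840, 84, 3))).shape == (10, 84, 84, 3)
assert classify(0.7) == "unsafe"
```
        </p>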
      </sec>
      <sec id="sec-2-2">
        <title>Evaluation Metrics</title>
        <p>In real world scenarios, unsafe crash states are
comparatively rare, which often leads to imbalanced datasets. This
information is captured in the collected dataset by
enforcing a mild imbalance, as can be noticed in the description of
the dataset in the previous section. Thus, accuracy, the most
commonly used metric in deep learning, does not suffice as
it could lead to a false sense of success. Since falsely
classifying unsafe states as safe is much worse than vice versa, the
main focus of the models should be on reducing false
negatives. Therefore, recall is an important metric for the CPN
models, which has been captured in this paper via
precision-recall curves.</p>
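        <p>
          The point above can be made concrete with a small worked example; the counts are illustrative, chosen only to mirror the 2:1 safe/unsafe imbalance of the collected dataset.

```python
# Why accuracy alone misleads on imbalanced crash data: a monitor can score
# a high accuracy while missing a large share of truly unsafe states. The
# confusion counts below are illustrative, not results from the paper.

def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)   # fraction of truly unsafe states caught
    return precision, recall

# 6000 safe and 3000 unsafe samples; 1200 unsafe states are missed.
tp, fp, fn, tn = 1800, 300, 1200, 5700
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall = precision_recall(tp, fp, fn)
print(round(accuracy, 3), round(precision, 3), round(recall, 3))
# prints: 0.833 0.857 0.6 -- decent accuracy, yet 40% of crashes are missed
```
        </p>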
      </sec>
      <sec id="sec-2-3">
        <title>Simple CPN with Static Obstacles Only</title>
        <p>Before moving on to complex scenarios, it was necessary to
test if a deep-learning based model could help predict the
possibility of a crash based on state and action information.
As a sanity check, in this first experiment the Simple CPN
model was tested with only static obstacles, which meant
crashes into walls, fences, rocks, crates on the road and other
static objects in an urban scenario. With a test accuracy of
0.7907, the model was able to predict crash situations.
However, the accuracy was inadequate to be practically usable.</p>
      </sec>
      <sec id="sec-2-4">
        <title>Simple CPN with Dynamic Obstacles</title>
        <p>Seeing the results of the previous experiment, the simulation
environment for the following experiments was extended to
include dynamic obstacles in the form of 2- and 4- wheeler
vehicles. The Simple CPN model was tasked with taking as
input an image of the current state along with the proposed
action decision to predict if the next state would be ‘unsafe’.
The model was able to replicate the success of the previous
experiment, achieving a test accuracy of 0.8018 and
AUCPRC score of 0.7624.</p>
        <p>As per the confusion matrix depicted in Figure 5, the
number of false negatives were quite high. However, for
safetycritical applications such as autonomous vehicles, it is
important to reduce false negatives to as low as possible. In
order to improve the results of the classification, class weights
were introduced in the Simple CPN architecture. The class
weights were used during training in the ratio of 1:2, such
that the penalty of wrongly classifying a crash case applied
to the loss was double that of the penalty of wrongly
classifying a non-crash case. The results summarized in table 1
show a slight improvement of the overall accuracy and an
increased recall score.</p>
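        <p>
          The 1:2 class weighting described above can be sketched as a weighted binary cross-entropy; the weighting ratio comes from the text, while the function itself and its exact form are assumptions.

```python
import math

# Sketch of 1:2 class weights in the loss: wrongly classifying a crash
# (label 1, "unsafe") is penalised twice as hard as wrongly classifying
# a non-crash. The loss form is plain binary cross-entropy.

def weighted_bce(y_true, y_pred, w_safe=1.0, w_unsafe=2.0):
    losses = []
    for y, p in zip(y_true, y_pred):
        w = w_unsafe if y == 1 else w_safe
        losses.append(-w * (y * math.log(p) + (1 - y) * math.log(1 - p)))
    return sum(losses) / len(losses)

# A missed crash (label 1 predicted 0.1) costs exactly twice as much as an
# equally wrong false alarm (label 0 predicted 0.9).
assert round(weighted_bce([1], [0.1]), 9) == round(2 * weighted_bce([0], [0.9]), 9)
```
        </p>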
      </sec>
      <sec id="sec-2-5">
        <title>ST-CPN with Dynamic Obstacles</title>
        <p>Despite the presence of dynamic objects in the environment,
the Simple CPN made its decision based on only a single
frame of information. This does not allow the network to
model the motion of the ego vehicle as well as other
obstacles in the environment. To better deal with moving objects,
the ST-CPN model was introduced, which uses ConvLSTM
to process a contiguous series of 10 frames of images.</p>
        <p>Looking at Figure 7 and Figure 5, it is evident that the
ST-CPN model was able to much better identify “unsafe”
situations, thereby reducing the number of false negatives.
This was further visible on comparing the results of the
ST-CPN model against the Simple CPN model in table 1 on
the test set. The disadvantage however was that, due to the
higher complexity of the model, the inference time of
ST-CPN increases by a factor of 10 when compared to that of
Simple CPN.</p>
        <p>Following the performance of the ST-CPN model, a study
on the reaction of the model to out-of-distribution data, i.e.,
data that is slightly different from the conditions that the
model was trained on, was performed. For this 3000 images
with clear weather, similar to training conditions, and 3000
images with rainy weather were collected. These were
referred to as “Test small” and “Test rainy”, respectively. As
can be seen in table 2 and Figure 8, the rainy condition
causes the model to misclassify comparatively more
“unsafe” scenarios, thereby dropping the performance of the
model.</p>
      </sec>
      <sec id="sec-2-6">
        <title>Simple CPN with Uncertainty in Dynamic Environment</title>
        <p>
          Though accounting for temporal features in the prediction
model helped improve the performance, it suffered from
drop in performance when encountering out-of-distribution
data. Since the real-world is constantly evolving and
cannot be modeled completely in the training data, it is
necessary to have in place techniques for the models to deal with
such data. As pointed out in an earlier section, standard deep
learning gives no information about the confidence of the
model in its prediction. Thus, both the models from the
previous experiments were extended with “MC-Dropout”
          <xref ref-type="bibr" rid="ref5">(Gal
and Ghahramani 2016)</xref>
          by placing a dropout layer after
every convolutional and dense layer, except the output layer.
The dropout was then applied not just during training but
also during testing, wherein each input was used to generate
T predictions to calculate the mean and variance. The
variance on each observation/data point is an indication of how
certain the model is in its prediction, which in turn shows
how similar the test data is to the training data. Since
performance improvements of using class weights were negligible,
the Bayesian version of the Simple CPN model was trained
without class weights.
        </p>
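        <p>
          The procedure described above can be sketched on a toy model. The active-at-inference dropout, the 0.4 rate, the T stochastic passes, and the mean/variance summary follow the text; the one-layer "network" is a stand-in and not the CPN architecture from the paper.

```python
import numpy as np

# Sketch of MC-Dropout inference: dropout stays active at test time, the
# same input is pushed through T stochastic forward passes, and the mean
# and variance of the outputs act as prediction and confidence estimate.

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 1))           # toy stand-in for a trained model

def forward(x, p_drop=0.4):            # 0.4 matches the paper's dropout rate
    mask = rng.random(x.shape) >= p_drop
    h = x * mask / (1.0 - p_drop)      # inverted-dropout scaling
    return 1.0 / (1.0 + np.exp(-(h @ W)))   # sigmoid output neuron

def mc_dropout_predict(x, T=50):
    preds = np.array([forward(x) for _ in range(T)])
    # High variance signals an input unlike the training data.
    return preds.mean(), preds.var()

mean, var = mc_dropout_predict(rng.random(16))
assert var >= 0.0 and 1.0 > mean > 0.0
```
        </p>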
        <p>The advantages of using uncertainty become clear when
using a dataset that differs considerably from the training
data. Thus, the performance of Bayesian Simple CPN is
evaluated on “Test small” and “Test rainy” from the
previous experiment. As can be seen from table 2, the test set
with clear weather has similar uncertainty estimates as the
training set, however, while using the test set with the rainy
weather, the uncertainty estimate increases. Thus, even
a change as small as a variation in weather can increase
uncertainty. The uncertainty estimates can therefore be useful
in building trust in the prediction of the CPN models.</p>
        <p>The benefit of estimating the confidence of the model in
its decision however comes at the cost of a significantly
longer inference time. During evaluations, the model took
10 times longer to compute the class labels and their
corresponding confidence values.</p>
      </sec>
      <sec id="sec-2-7">
        <title>ST-CPN with Uncertainty in Dynamic Environment</title>
        <p>To study the effect of uncertainty estimates, combined with
the benefits of modeling temporal features, the ST-CPN
network, similar to the previous experiment with Simple
CPN, was extended using MC-Dropout. To ensure that the
model was comparable to the Bayesian version of the
Simple CPN model of the previous experiments, both of the
models were trained to have a similar validation accuracy
of about 0.80. Additionally, to capture the essence of the
proposed CPN model with multiple independent neural
networks, the outputs of the “Bayesian Simple CPN” model
and the “Bayesian ST-CPN” model were combined as a
weighted average with higher weight being assigned to the
latter. This model is referred to as the “Combined” model.</p>
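        <p>
          The weighted combination described above can be sketched as follows; the paper states only that the ST-CPN output receives the higher weight, so the 1:2 ratio used here is an illustrative assumption.

```python
# Sketch of the "Combined" model: the Bayesian Simple CPN and Bayesian
# ST-CPN crash probabilities are merged as a weighted average, with the
# ST-CPN weighted more heavily (the exact 1:2 ratio is assumed).

def combined_score(simple_cpn_prob, st_cpn_prob, w_simple=1.0, w_st=2.0):
    return (w_simple * simple_cpn_prob + w_st * st_cpn_prob) / (w_simple + w_st)

# ST-CPN dominates the consensus: a confident "unsafe" from ST-CPN can pull
# an ambivalent Simple CPN prediction past the 0.6 decision threshold.
score = combined_score(0.45, 0.80)
assert round(score, 3) == 0.683
```
        </p>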
        <p>When evaluated over a range of probability
thresholds, the combined model performs slightly better
than both individual models, as seen in Figure 9, thereby
showing the benefit of modeling CPN as an ensemble of
diverse networks working in unison to make a prediction about
the future state.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>The paper evaluated the proposed system of “Crash
Prediction Network” and demonstrated its advantage in exploiting
the expressiveness of neural networks, while also
maintaining robustness due to the ensemble-like setup. CPN provides
redundancy to the setup by adding a layer of “safety
prediction”. CPN has the power to re-check the decisions of
the driving module, thereby, sharing the load of safe
decision making with the driving module. More importantly, it
should be noted that CPN is not just a solution for
futuristic autonomous vehicles, but can in fact be integrated with
current levels of automation to monitor driving decisions of
human drivers to ensure safer and more conservative
driving, thereby reducing human error on roads.</p>
      <p>Based on the evaluations discussed in the paper, the
importance of accounting for temporal features to model
motion in the environment is evident. Thus, future work
involves extending the CPN model to support other sensory
data as input, along with examining longer history lengths.
Furthermore, one could increase the level of granularity of
the prediction label beyond just ‘safe’ and ‘unsafe’. This
would allow for a more sensitive setup with the ability
to trigger situation-specific interventions, which is not
currently possible. Additionally, the networks that form the
ensemble could be extended beyond merely sensory inputs to
make it geographical region and/or rule-specific with
considerably little effort.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Ashmore</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Calinescu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Paterson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>Assuring the machine learning lifecycle: Desiderata, methods, and challenges</article-title>
          . arXiv preprint arXiv:1905.04223.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Avizienis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Laprie</surname>
            ,
            <given-names>J.-C.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Randell</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Landwehr</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2004</year>
          .
          <article-title>Basic concepts and taxonomy of dependable and secure computing</article-title>
          .
          <source>IEEE Transactions on Dependable and Secure Computing</source>
          <volume>1</volume>
          (
          <issue>1</issue>
          ):
          <fpage>11</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Dosovitskiy</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ros</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Codevilla</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Lopez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Koltun</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>CARLA: An open urban driving simulator</article-title>
          .
          <source>arXiv preprint arXiv:1711.03938</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Gal</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Uncertainty in deep learning</article-title>
          . University of Cambridge 1.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Gal</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Ghahramani</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Dropout as a bayesian approximation: Representing model uncertainty in deep learning</article-title>
          .
          <source>In international conference on machine learning</source>
          ,
          <fpage>1050</fpage>
          -
          <lpage>1059</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Goodfellow</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Courville</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Deep learning</article-title>
          . MIT press.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Lv</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Autonomous vehicles testing methods review</article-title>
          .
          <source>In 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC)</source>
          ,
          <fpage>163</fpage>
          -
          <lpage>168</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Kalra</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Paddock</surname>
            ,
            <given-names>S. M.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability</article-title>
          ?
          <source>Transportation Research Part A: Policy and Practice</source>
          <volume>94</volume>
          :
          <fpage>182</fpage>
          -
          <lpage>193</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Koopman</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Wagner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Challenges in autonomous vehicle testing and validation</article-title>
          .
          <source>SAE International Journal of Transportation Safety</source>
          <volume>4</volume>
          (
          <issue>1</issue>
          ):
          <fpage>15</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Koopman</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Wagner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Autonomous vehicle safety: An interdisciplinary challenge</article-title>
          .
          <source>IEEE Intelligent Transportation Systems Magazine</source>
          <volume>9</volume>
          (
          <issue>1</issue>
          ):
          <fpage>90</fpage>
          -
          <lpage>96</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Kwon</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Won</surname>
            ,
            <given-names>J.-H.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>B. J.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Paik</surname>
            ,
            <given-names>M. C.</given-names>
          </string-name>
          <year>2020</year>
          .
          <article-title>Uncertainty quantification using Bayesian neural networks in classification: Application to biomedical image segmentation</article-title>
          .
          <source>Computational Statistics &amp; Data Analysis</source>
          <volume>142</volume>
          :
          <fpage>106816</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Machin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Guiochet</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Waeselynck</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Blanquart</surname>
            ,
            <given-names>J.-P.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Roy</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Masson</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>SMOF: A safety monitoring framework for autonomous systems</article-title>
          .
          <source>IEEE Transactions on Systems, Man, and Cybernetics: Systems</source>
          <volume>48</volume>
          (
          <issue>5</issue>
          ):
          <fpage>702</fpage>
          -
          <lpage>715</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>McAllister</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Gal</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kendall</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Van Der Wilk</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Cipolla</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Weller</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Concrete problems for autonomous vehicle safety: Advantages of bayesian deep learning</article-title>
          .
          <source>International Joint Conferences on Artificial Intelligence</source>
          , Inc.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Nair</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Shafaei</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kugele</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Osman</surname>
            ,
            <given-names>M. H.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Knoll</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>Monitoring safety of autonomous vehicles with crash prediction networks</article-title>
          .
          <source>In SafeAI@AAAI.</source>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Najm</surname>
            ,
            <given-names>W. G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>J. D.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Yanagisawa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ; et al.
          <year>2007</year>
          .
          <article-title>Precrash scenario typology for crash avoidance research</article-title>
          .
          <source>Technical report, United States. National Highway Traffic Safety Administration.</source>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Riedmaier</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Nesensohn</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Gutenkunst</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Düser</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Schick</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Abdellatif</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Validation of X-in-the-Loop Approaches for Virtual Homologation of Automated Driving Functions</article-title>
          .
          <source>In GSVF-Symposium. Graz.</source>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Rudolph</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Voget</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Mottok</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>A consistent safety case argumentation for artificial intelligence in safety related automotive systems</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Saunders</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sastry</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Stuhlmüller</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Evans</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Trial without error: Towards safe reinforcement learning via human intervention</article-title>
          .
          <source>In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems</source>
          ,
          <fpage>2067</fpage>
          -
          <lpage>2069</lpage>
          . International Foundation for Autonomous Agents and Multiagent Systems.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Simonyan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Zisserman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>Very deep convolutional networks for large-scale image recognition</article-title>
          .
          <source>arXiv preprint arXiv:1409.1556</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Stellet</surname>
            ,
            <given-names>J. E.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zofka</surname>
            ,
            <given-names>M. R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Schumacher</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Schamm</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Niewels</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Zöllner</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Testing of advanced driver assistance towards automated driving: A survey and taxonomy on existing approaches and open questions</article-title>
          .
          <source>In 2015 IEEE 18th International Conference on Intelligent Transportation Systems</source>
          ,
          <fpage>1455</fpage>
          -
          <lpage>1462</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Törngren</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Mohan</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Becker</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Svensson</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Tao</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>D.-J.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Westman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Architecting Safety Supervisors for High Levels of Automated Driving</article-title>
          .
          <source>In 2018 21st International Conference on Intelligent Transportation Systems (ITSC)</source>
          ,
          <fpage>1721</fpage>
          -
          <lpage>1728</lpage>
          . IEEE.
        </mixed-citation>
      </ref>
      </ref>
    </ref-list>
  </back>
</article>