Monitoring Safety of Autonomous Vehicles with Crash Prediction Network

Saasha Nair, Sina Shafaei, Stefan Kugele, Mohd Hafeez Osman and Alois Knoll
Technical University of Munich, Munich, Germany
saasha.nair@tum.de, sina.shafaei@tum.de, stefan.kugele@tum.de, hafeez.osman@tum.de, knoll@in.tum.de

Abstract

Automation needs safety built into the system, so that it behaves at least as well as a diligent human in unforeseen circumstances, if not better. The machine must therefore learn to behave intuitively, predicting future occurrences and acting accordingly, which means that machine learning techniques have to address safety explicitly. Ongoing technological development and the environmental changes it brings will only push safety requirements higher, demanding that artificial intelligence fill the resulting gaps. The purpose of this paper is to study the artificial-intelligence perspective on the safety challenges and concerns of such systems through an extensive literature review, and to propose a forward-looking and easily adaptable system based on deep learning. The paper focuses primarily on the safety aspects of autonomous vehicles, using a Bayesian deep learning method.

Introduction

Current trends in the automotive industry are introducing new, increasingly complex software functions into vehicles (Broy 2006). The ever-growing availability of computing resources, memory, and new technologies allows for new levels of automated and intelligent systems. Driving at a high level of driving automation (i.e., levels 3 to 5) according to SAE J3016 (SAE On-Road Automated Vehicle Standards Committee 2014) is just one example that has been discussed recently and is no longer a mere future vision. Vehicles driving at levels 3 to 5 are hereafter referred to as Autonomous Vehicles (AV) and the corresponding task as Autonomous Driving (AD).

Success stories in deep learning have made AVs more or less a reality; commercializing such vehicles, however, has not yet come to fruition.
Recent accidents, especially those involving cars driving at as low as SAE Level 2, show that engineers still face substantial challenges, and the major impediment standing in the way of large-scale adoption of AVs is the associated safety concerns (Kalra and Paddock 2016; Fagnant and Kockelman 2015; McAllister et al. 2017).

Although there is no definitive solution to these safety concerns, several researchers have outlined the challenges and proposed recommendations. Salay, Queiroz, and Czarnecki (2017) analyzed the impact that the use of ML-based software has on various parts of ISO 26262, especially with respect to hazard analysis and risk assessment (HARA). Within the scope of highly automated driving (i.e., level 4), Burton, Gauerhof, and Heinzemann (2017) explored assurance-case approaches that can be applied to the problem of arguing the safety of machine learning. From the ISO 26262 V-model perspective, Koopman and Wagner (2016) identified several testing challenges for autonomous vehicles. Monkhouse et al. (2017) reported several safety concerns for highly automated driving from the functional safety engineers' perspective. This paper explores the challenges in developing and monitoring AI-based components for an end-to-end deep learning AV. The presented approach can minimize the apparent risk when dealing with machine-learning-based components of Autonomous Driving; a more fine-grained safety assessment, such as deriving safety requirements and performing risk assessment, remains future work. This research endeavors to answer the following research questions (RQ):

RQ1 What are the challenges involved in ensuring the safety of highly critical systems when they are augmented with machine-learning-based components?

RQ2 What are the existing approaches used to ensure the safety of learning systems?

RQ3 What are the shortcomings of the existing approaches and how can they be overcome?

Traditional Safety Techniques and Neural Networks

The main challenges (RQ1) associated with applying traditional safety assurance methodologies to NNs, as explained in (Cheng et al. 2018), are as follows:

(i) Implicit Specification – Traditional Verification and Validation (V&V) methods (as suggested in the ISO 26262 V-model) place great importance on ensuring that the functional requirements specified at design time are met. NN-based systems, however, depend solely on the training data to infer the specification of the model and do not rely on any explicit list of requirements, which is problematic when applying traditional V&V methods.

(ii) Black-Box Structure – While writing the code for a NN, one specifies the details of the layers and the activation functions, but, unlike in traditional software, the control flow is not explicitly coded; NNs are therefore referred to as black-box structures. Traditional white-box testing techniques such as code coverage and decision coverage cannot be applied directly to NNs, so new paradigms for adaptive software systems need to be constructed.

Related Work

We distinguish between the existing approaches by categorizing them into two groups: (i) 'Training phase', i.e., approaches that are used solely during the development and training phase of the neural network, and (ii) 'Operational phase', i.e., those that are used in the run-time environment of the neural network to ensure proper functioning (RQ2).

Training Phase

The existing approaches that fall under this category are:

(i) Train/Validation/Test split – This method is used to ensure that the developed adaptive system works satisfactorily for a given set of inputs. The available data is split into three sets, such that the largest set is used solely for training; of the remaining two sets, one is used for fine-tuning the hyperparameters of the NN, and the second is used to test the trained network and study how well it reacts to previously unseen data points. Though this method helps verify the working of the NN, it is not extensive enough to be considered a safety guarantee (Taylor, Darrah, and Moats 2003) in high-criticality systems.
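As a concrete illustration, a minimal Python sketch of such a three-way split using scikit-learn; the 70/15/15 ratio, the synthetic feature matrix, and the crash labels are assumptions made for the example, not something the method prescribes:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical sensor feature matrix X and crash labels y; the shapes
# and values are illustrative only.
X = np.random.rand(10_000, 32)
y = np.random.randint(0, 2, size=10_000)

# First carve out the training set (the largest of the three sets) ...
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.30, random_state=42)

# ... then split the remainder into a validation set (hyperparameter
# tuning) and a held-out test set (previously unseen data points).
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 7000 1500 1500
```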
(ii) Automated test data generation – The lack of trust in the train/validation/test split method stems from the fact that one is left with very few data samples to test against, so cases of high interest may be missed in the testing phase. A way to overcome this problem is to use test data generation tools to produce synthetic data points that can be used to test the trained neural network. Tools such as Automated Test Trajectory Generation (ATTG) (Taylor 2006) and the more recent approach of generating scenes that an AV might encounter using ontologies (Bagschik, Menzel, and Maurer 2017) fall under this category. This approach can support the V&V procedure for NNs by unveiling missing knowledge in fixed NNs and increasing confidence in the working of adaptive NNs (Taylor, Darrah, and Moats 2003).

(iii) Formal Methods – Formal verification (Ray 2010) refers to the use of mathematical specifications to model and analyse a system. Though these methods work well for traditional software, they have not shown much success for adaptive software systems. This is due to challenges (Seshia, Sadigh, and Sastry 2016) in modeling the non-deterministic nature of the environment, the difficulty of establishing a formal specification that encodes the desired and undesired behavior of the system, and the need to account for the system's adaptive behavior. Formal verification techniques for NNs therefore deal instead with proving convergence and stability (Fuller, Yerramalla, and Cukic 2006) of the system, using methods such as Lyapunov analysis (Yerramalla et al. 2003).

(iv) Rule extraction – Rules (Darrah and Taylor 2006) are viewed as a descriptive representation of the inner workings of a neural network. Rule extraction algorithms, such as KT (Fu 1994), Validity Interval Analysis (VIA) (Thrun 1995), and DeepRED (Zilke, Mencía, and Janssen 2016), can be used to model the knowledge that a neural network has acquired during the training phase. The extracted rules can be expressed as easy-to-understand 'if-then' statements, which can either be verified manually, owing to their human-readable format, or checked automatically with a model checker. This method helps establish trust in the system, as it improves the system's explainability (Gasser and Almeida 2017). It also aids requirements traceability, since one can verify whether the rules reflect the functional requirements specified for the system. Rules can further help to examine the various functional modes of the system and to ensure that certain inputs induce a safe operation mode while respecting the expected safety limits. Though this method brings considerable advantages, it is mainly applicable to offline learning systems, where the V&V practitioner can extract rules from the network after training is complete.
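KT, VIA, and DeepRED are specialized algorithms; purely as a hypothetical stand-in for the underlying idea, the sketch below distills a trained network's input-output behavior into a shallow decision tree and prints its branches as 'if-then' statements. The synthetic data, the small MLP, and the surrogate-tree shortcut itself are illustrative assumptions, not the cited algorithms:

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Train a stand-in network on synthetic data (illustrative only).
X, y = make_classification(n_samples=2000, n_features=4, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                    random_state=0).fit(X, y)

# Fit a shallow surrogate tree to the *network's* predictions, then
# read its branches as human-checkable 'if-then' rules.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, net.predict(X))
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(4)]))
```

Each printed branch corresponds to one rule of the form "if x0 <= a and x2 > b then class 1", which a V&V practitioner could compare against the specified functional requirements.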
Operational Phase

The solutions that fall under this category can more accurately be referred to as 'online monitoring techniques': they involve one or more monitors working as an oracle to ensure the continued proper functioning of the neural network over time (Cukic et al. 2006). The goal is to ensure that the adaptation dynamics do not cause the network to diverge and thereby trigger unpredictable behavior.

Data Sniffing (Liu, Menzies, and Cukic 2002) is an example of this technique; it studies the data entering and exiting a neural network. If a certain input could produce negative results, the monitors generate an alert and may even flag the data, preventing it from entering the system. This method is particularly useful in cases where outliers could degrade the functioning of the system.
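A minimal sketch of this monitoring idea, assuming a simple per-feature z-score test in front of the network; the threshold and the statistics used are illustrative choices, not details of the cited Data Sniffing method:

```python
import numpy as np

class InputMonitor:
    """Flags inputs that fall outside the training distribution before
    they reach the neural network (a simplified sketch of data
    sniffing; real monitors are considerably more elaborate)."""

    def __init__(self, train_data: np.ndarray, z_threshold: float = 4.0):
        self.mean = train_data.mean(axis=0)
        self.std = train_data.std(axis=0) + 1e-9  # avoid division by zero
        self.z_threshold = z_threshold

    def is_suspicious(self, x: np.ndarray) -> bool:
        # Alert if any feature deviates too far from what the network
        # saw during training.
        z = np.abs((x - self.mean) / self.std)
        return bool((z > self.z_threshold).any())

monitor = InputMonitor(np.random.rand(5000, 8))
sample = np.full(8, 10.0)  # an obvious outlier
if monitor.is_suspicious(sample):
    print("alert: input flagged, not forwarded to the network")
```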
Proposed Approach

The majority of contemporary approaches, as evident from the "Related Work" section, relate to testing a developed model before it is deployed in the operational environment. ML-based components, however, suffer from problems such as the operational data or platform differing from what the model was trained on, uncertainty about new inferences drawn from operational data, and even wear and tear of hardware and software. This leaves ML-based components vulnerable to errors. It is therefore necessary to focus on monitoring-based approaches, which are starting to gain interest (Fridman, Jenik, and Reimer 2017), to help alleviate the safety concerns associated with such systems.

To elaborate the specifics of the proposed solution, an end-to-end deep learning model for lane-change maneuvers has been chosen. Such a model uses a deep neural net that takes input data from sensors representing the environment around the ego vehicle and generates one of three actions, allowing the ego vehicle to continue driving in the current lane or to switch to the left or right lane, depending on the presence of obstacles.

The proposed solution, referred to as the 'Crash Prediction Network', involves a neural network model tasked with determining the likelihood and severity of a crash at any given time step (RQ3). The model takes into consideration multiple features, such as the output of the vehicle's perception module, the planned trajectory/action of the ego vehicle, the predicted (or intended, if available via V2V communication) trajectories of the obstacles, and possibly also information such as the number and severity of previous crashes that the ego vehicle and the obstacles were involved in. The specifics of the system can be understood by distinguishing between the training and operational (after deployment) phases of the model.

The training phase (shown in Fig. 1) relies on the model receiving the required input values for the feature set described above, together with knowledge of whether a crash occurred. The model therefore requires an architecture involving a Reinforcement Learning (RL) environment, which lets the model know the outcome at every time step for a given set of feature values. This also allows the vehicle to crash often, as is characteristic of RL agents, especially at the start of training. We therefore propose to train the model by letting it spar with an RL agent, such that the ego vehicle closely imitates a real-world vehicle performing tasks similar to the lane-change use case described above. At each time step, the RL agent and the Crash Prediction Network have access to information about the vehicle's environment; the Crash Prediction Network predicts whether a crash will occur, while, simultaneously, the RL agent interacts with the environment to determine whether a crash really occurred. Based on the difference between the outputs of the two networks, the Crash Prediction Network is updated until it can eventually predict crashes with a high level of accuracy.

Figure 1: Training of the Crash Prediction Network. (The RL agent sends action commands to the environment and receives state and reward; the Crash Prediction Network receives the state and the action command, predicts a crash, and its prediction is compared against the environment's isCrashed signal to update the network.)
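The training loop of Fig. 1 could be organized roughly as sketched below; ToyEnvironment, ToyAgent, and ToyCrashPredictor are hypothetical placeholders standing in for the simulator, the RL driving policy, and the Bayesian model, none of which are specified in this paper:

```python
import random

class ToyEnvironment:
    """Placeholder simulator; emits the ground-truth isCrashed signal."""
    def reset(self):
        self.t = 0
        return [0.0]  # toy state vector

    def step(self, action):
        self.t += 1
        is_crashed = random.random() < 0.1
        done = is_crashed or self.t >= 100
        reward = -1.0 if is_crashed else 0.1
        return [float(self.t)], reward, done, is_crashed

class ToyAgent:
    """Placeholder RL driving policy for the lane-change use case."""
    def act(self, state):
        return random.choice(["keep", "left", "right"])

class ToyCrashPredictor:
    """Placeholder Crash Prediction Network."""
    def predict(self, state, action):
        return random.random()  # predicted crash likelihood

    def update(self, prediction, target):
        pass  # a gradient step on the prediction error would go here

env, agent, cpn = ToyEnvironment(), ToyAgent(), ToyCrashPredictor()
for episode in range(10):
    state, done = env.reset(), False
    while not done:
        action = agent.act(state)
        predicted = cpn.predict(state, action)
        state, reward, done, is_crashed = env.step(action)
        # Compare the prediction with the environment's isCrashed
        # signal and update the Crash Prediction Network accordingly.
        cpn.update(predicted, target=float(is_crashed))
```

The essential point is the compare/update step: at every time step, the network's predicted crash likelihood is confronted with the environment's ground-truth crash signal.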
The operational stage of the model (shown in Fig. 2) is designed such that the inputs are, as usual, fed to the ML-based component responsible for determining the lane-change maneuver to be carried out by the ego vehicle. The vehicle, however, does not act directly on the generated lane-change command. The action command, along with the environmental inputs in the form of sensor data, is directed to the Crash Prediction Network, which predicts the likelihood of a crash. Only if the likelihood is low is the vehicle allowed to perform the desired action; otherwise the vehicle is pushed into a fail-safe mode that varies depending on the predicted severity of the crash. It is important to note that, for the model to stay relevant to its environment, it needs to learn and improve even in the operational stage. Thus, as in the training stage, the difference between the actual and predicted outputs is used to update the model.

Figure 2: Operation of the Crash Prediction Network. (Input flows to the NN-based maneuver-planning component; its action is checked by the Crash Prediction Network and is either validated for execution or rejected in favor of the fail-safe mode.)

The Crash Prediction Network is based on Bayesian Deep Learning (BDL). The reason is that the other deep learning methods currently in use are known to make hard classifications based on what they see and perceive. The disadvantage of this becomes apparent in a system such as an AV, where multiple components come together to form a complex whole: an error in one component can have a snowball effect up the pipeline, leading to catastrophic outputs in later components. A way to overcome this problem is to use BDL (McAllister et al. 2017). Bayesian models would provide better results (Kendall and Gal 2017), because such models output a probability distribution that accounts for uncertainty, which can be exploited for the crash-likelihood output the model is expected to generate. Additionally, the model would propagate not only the classification output but also the model uncertainty associated with it, so that higher-level components can be developed to behave conservatively when the uncertainty of earlier components in the pipeline is high.
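As one possible realization, a minimal sketch using Monte Carlo dropout, a common approximation to Bayesian deep learning; the network architecture, the number of samples, and the decision thresholds are illustrative assumptions rather than the authors' design:

```python
import torch
import torch.nn as nn

class CrashPredictionNet(nn.Module):
    """Toy crash-likelihood model with dropout for MC sampling."""
    def __init__(self, n_features: int = 16):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, x):
        return self.layers(x)

def crash_likelihood(model, x, n_samples: int = 50):
    model.train()  # keep dropout active at inference time (MC dropout)
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    # Mean as the predicted likelihood, spread as the model uncertainty.
    return samples.mean().item(), samples.std().item()

model = CrashPredictionNet()
features = torch.randn(1, 16)  # hypothetical fused sensor/action features
p_crash, uncertainty = crash_likelihood(model, features)

# Gating in the spirit of Fig. 2: act only when the predicted
# likelihood is low AND the model is confident; otherwise fall back
# to the severity-dependent fail-safe mode.
if p_crash < 0.1 and uncertainty < 0.05:
    print("action validated: execute lane change")
else:
    print("fail-safe mode engaged")
```

The gating at the end mirrors Fig. 2: both the likelihood and the uncertainty feed the decision, so the system behaves conservatively exactly when the model is unsure.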
The proposed system has definite advantages. Most importantly, it is not restricted to futuristic autonomous vehicles; it can also be used in present-day Advanced Driver Assistance Systems (ADAS), thereby allowing a smoother transition to Autonomous Vehicles in the future. Secondly, the model can be seen as making an intuitively 'informed decision' by taking data from multiple sources into consideration. Additionally, such a system would also generalize and scale well to the different scenarios that the vehicle might encounter. One of the major problems to be addressed during the development of the model, however, is handling input data received from different sources in varied formats. Next, redundancy needs to be built in to compensate for sensor failures and malfunctions, so that the failure of a single sensor does not affect the accuracy of the system. Another major aspect of this methodology that needs experimentation and validation is that of having one ML-based component supervise another.

Conclusion

This work covered the different aspects of safety for intelligent components that employ machine learning techniques, in order to enable the integration of artificial intelligence into autonomous driving. The focus was on the main concerns and challenges in ensuring safety in highly critical applications based on machine learning methods, with special emphasis on neural networks. Traditional safety approaches are not sufficiently poised for such systems; there is therefore a need for more concrete methods, such as monitoring techniques like the proposed Crash Prediction Network, which aims to guarantee an acceptable level of safety for the system's functions. The team is in the process of implementing and evaluating the proposed approach.

References

Bagschik, G.; Menzel, T.; and Maurer, M. 2017. Ontology based scene creation for the development of automated vehicles. arXiv preprint arXiv:1704.01006.

Broy, M. 2006. Challenges in automotive software engineering. In Proceedings of the 28th International Conference on Software Engineering, 33–42. ACM.

Burton, S.; Gauerhof, L.; and Heinzemann, C. 2017. Making the case for safety of machine learning in highly automated driving. In International Conference on Computer Safety, Reliability, and Security, 5–16. Springer.

Cheng, C.-H.; Diehl, F.; Hinz, G.; Hamza, Y.; Nührenberg, G.; Rickert, M.; Ruess, H.; and Truong-Le, M. 2018. Neural networks for safety-critical applications: Challenges, experiments and perspectives. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2018, 1005–1006. IEEE.

Cukic, B.; Fuller, E.; Mladenovski, M.; and Yerramalla, S. 2006. Run-Time Assessment of Neural Network Control Systems. Boston, MA: Springer US. 257–269.

Darrah, M., and Taylor, B. J. 2006. Rule Extraction as a Formal Method. Boston, MA: Springer US. 199–227.

Fagnant, D. J., and Kockelman, K. 2015. Preparing a nation for autonomous vehicles: Opportunities, barriers and policy recommendations. Transportation Research Part A: Policy and Practice 77:167–181.

Fridman, L.; Jenik, B.; and Reimer, B. 2017. Arguing machines: Perception-control system redundancy and edge case discovery in real-world autonomous driving. arXiv preprint arXiv:1710.04459.

Fu, L. 1994. Rule generation from neural networks. IEEE Transactions on Systems, Man, and Cybernetics 24(8):1114–1124.

Fuller, E. J.; Yerramalla, S. K.; and Cukic, B. 2006. Stability properties of neural networks. In Methods and Procedures for the Verification and Validation of Artificial Neural Networks. Springer. 97–108.

Gasser, U., and Almeida, V. A. 2017. A layered model for AI governance. IEEE Internet Computing 21(6):58–62.

Kalra, N., and Paddock, S. M. 2016. Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability? Transportation Research Part A: Policy and Practice 94:182–193.

Kendall, A., and Gal, Y. 2017. What uncertainties do we need in Bayesian deep learning for computer vision? In Advances in Neural Information Processing Systems, 5574–5584.

Koopman, P., and Wagner, M. 2016. Challenges in autonomous vehicle testing and validation. SAE International Journal of Transportation Safety 4(1):15–24.

Liu, Y.; Menzies, T.; and Cukic, B. 2002. Data sniffing: Monitoring of machine learning for online adaptive systems. In Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2002), 16–21. IEEE.

McAllister, R.; Gal, Y.; Kendall, A.; Van Der Wilk, M.; Shah, A.; Cipolla, R.; and Weller, A. 2017. Concrete problems for autonomous vehicle safety: Advantages of Bayesian deep learning. International Joint Conferences on Artificial Intelligence, Inc.

Monkhouse, H.; Habli, I.; McDermid, J.; Khastgir, S.; and Dhadyalla, G. 2017. Why functional safety experts worry about automotive systems having increasing autonomy. In International Workshop on Driver and Driverless Cars: Competition or Coexistence.

Ray, S. 2010. Scalable Techniques for Formal Verification. Springer Science & Business Media.

SAE On-Road Automated Vehicle Standards Committee. 2014. Taxonomy and definitions for terms related to on-road motor vehicle automated driving systems. SAE Standard J3016:1–16.

Salay, R.; Queiroz, R.; and Czarnecki, K. 2017. An analysis of ISO 26262: Using machine learning safely in automotive software. CoRR abs/1709.02435.

Seshia, S. A.; Sadigh, D.; and Sastry, S. S. 2016. Towards verified artificial intelligence. arXiv preprint arXiv:1606.08514.

Taylor, B. J.; Darrah, M. A.; and Moats, C. D. 2003. Verification and validation of neural networks: A sampling of research in progress. In Intelligent Computing: Theory and Applications, volume 5103, 8–17. International Society for Optics and Photonics.

Taylor, B. J. 2006. Automated Test Generation for Testing Neural Network Systems. Boston, MA: Springer US. 229–256.

Thrun, S. 1995. Extracting rules from artificial neural networks with distributed representations. In Tesauro, G.; Touretzky, D.; and Leen, T., eds., Advances in Neural Information Processing Systems (NIPS) 7. Cambridge, MA: MIT Press.

Yerramalla, S.; Fuller, E.; Mladenovski, M.; and Cukic, B. 2003. Lyapunov analysis of neural network stability in an adaptive flight control system. In Symposium on Self-Stabilizing Systems, 77–92. Springer.

Zilke, J. R.; Mencía, E. L.; and Janssen, F. 2016. DeepRED: Rule extraction from deep neural networks. In International Conference on Discovery Science, 457–473. Springer.