Monitoring Safety of Autonomous Vehicles with Crash Prediction Network

Saasha Nair, Sina Shafaei, Stefan Kugele, Mohd Hafeez Osman and Alois Knoll
Technical University of Munich, Munich, Germany
saasha.nair@tum.de, sina.shafaei@tum.de, stefan.kugele@tum.de, hafeez.osman@tum.de, knoll@in.tum.de

Abstract

Automation needs safety built into the system, so that it behaves at least as well as a diligent human in unforeseen circumstances, if not better. The machine must therefore learn to behave intuitively, predicting future occurrences and acting accordingly, which means that machine learning techniques have to address safety explicitly. Ongoing technological development and the environmental changes it brings will only push safety requirements higher, demanding that artificial intelligence fill the resulting gaps. The purpose of this paper is to study the artificial-intelligence perspective on the safety challenges and concerns of such systems through an extensive literature review, and to propose a forward-looking and easily adaptable system based on deep learning. The paper focuses primarily on the safety aspects of autonomous vehicles, using a Bayesian deep learning method.

Introduction

Current trends in the automotive industry are introducing new, increasingly complex software functions into vehicles (Broy 2006). The ever-growing availability of computing resources, memory, and new technologies allows for new levels of automated and intelligent systems. Driving at a high level of driving automation (i.e., levels 3 to 5) according to SAE J3016 (SAE On-Road Automated Vehicle Standards Committee 2014) is just one example that has been discussed recently and is no longer a mere future vision. Vehicles driving at levels 3 to 5 are hereafter referred to as Autonomous Vehicles (AV) and the corresponding task as Autonomous Driving (AD).

Success stories in deep learning have made AVs more or less a reality; commercializing such vehicles, however, has not yet come to fruition.
Recent accidents, especially those involving cars driving at as low as SAE Level 2, show that engineers still face substantial challenges, and the major impediment standing in the way of large-scale adoption of AVs is the associated safety concerns (Kalra and Paddock 2016; Fagnant and Kockelman 2015; McAllister et al. 2017).

Although there is no definitive solution to these safety concerns, several researchers have outlined the challenges and proposed recommendations. Salay, Queiroz, and Czarnecki (2017) analyzed the impact that the use of ML-based software has on various parts of ISO 26262, especially with respect to hazard analysis and risk assessment (HARA). Within the scope of highly automated driving (i.e., level 4), Burton, Gauerhof, and Heinzemann (2017) explored assurance-case approaches that can be applied to the problem of arguing the safety of machine learning. From the ISO 26262 V-model perspective, Koopman and Wagner (2016) identified several testing challenges for autonomous vehicles. Monkhouse et al. (2017) reported several safety concerns for highly automated driving from the functional safety engineers' perspective. This paper explores the challenges in developing and monitoring AI-based components for an end-to-end deep learning AV. The presented approach can minimize the apparent risk when dealing with machine-learning-based components of Autonomous Driving; a more fine-grained safety assessment, such as deriving safety requirements and performing risk assessment, remains future work. This research endeavors to answer the following research questions (RQ):

RQ1 What are the challenges involved in ensuring the safety of highly critical systems when they are augmented with machine-learning-based components?

RQ2 What are the existing approaches used to ensure the safety of learning systems?

RQ3 What are the shortcomings of the existing approaches and how can they be overcome?

Traditional Safety Techniques and Neural Networks

The main challenges (RQ1) associated with applying traditional safety assurance methodologies to NNs, as explained in (Cheng et al. 2018), are as follows:

(i) Implicit Specification – Traditional Verification and Validation (V&V) methods (as suggested in the ISO 26262 V-model) place great importance on ensuring that the functional requirements specified at design time are met. NN-based systems, however, depend solely on the training data to infer the specification of the model and do not rely on any explicit list of requirements, which is problematic when applying traditional V&V methods.

(ii) Black-Box Structure – While writing the code for a NN, one specifies the details of the layers and the activation functions, but, unlike in traditional software, the control flow is not explicitly coded; NNs are therefore referred to as black-box structures. Traditional white-box testing techniques such as code coverage and decision coverage cannot be applied directly to NNs, so new paradigms for adaptive software systems need to be constructed.

Related Work

We distinguish between the existing approaches by categorizing them into two groups: (i) 'Training phase', i.e., approaches that are used solely during the development and training phase of the neural network, and (ii) 'Operational phase', i.e., those that are used in the run-time environment of the neural network to ensure proper functioning (RQ2).

Training Phase

The existing approaches that fall under this category are:

(i) Train/Validation/Test split – This method is used to ensure that the developed adaptive system works satisfactorily for a given set of inputs. The available data is split into three sets, such that the largest set is used solely for training; of the remaining two sets, one is used for fine-tuning the hyperparameters of the NN, and the second is used to test the trained network and study how well it reacts to previously unseen data points. Though this method helps verify the working of the NN, it is not extensive enough to be considered a safety guarantee (Taylor, Darrah, and Moats 2003) in high-criticality systems.
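As a concrete illustration, a minimal Python sketch of such a three-way split using scikit-learn; the 70/15/15 ratio, the synthetic feature matrix, and the crash labels are assumptions made for the example, not something the method prescribes:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical sensor feature matrix X and crash labels y; the shapes
# and values are illustrative only.
X = np.random.rand(10_000, 32)
y = np.random.randint(0, 2, size=10_000)

# First carve out the training set (the largest of the three sets) ...
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.30, random_state=42)

# ... then split the remainder into a validation set (hyperparameter
# tuning) and a held-out test set (previously unseen data points).
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 7000 1500 1500
```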
(ii) Automated test data generation – The lack of trust in the train/validation/test split method stems from the fact that one is left with very few data samples to test against, so cases of high interest may be missed in the testing phase. A way to overcome this problem is to use test data generation tools to produce synthetic data points that can be used to test the trained neural network. Tools such as Automated Test Trajectory Generation (ATTG) (Taylor 2006) and the more recent approach of generating scenes that an AV might encounter using ontologies (Bagschik, Menzel, and Maurer 2017) fall under this category. This approach can support the V&V procedure for NNs by unveiling missing knowledge in fixed NNs and increasing confidence in the working of adaptive NNs (Taylor, Darrah, and Moats 2003).

(iii) Formal Methods – Formal verification (Ray 2010) refers to the use of mathematical specifications to model and analyse a system. Though these methods work well for traditional software, they have not shown much success for adaptive software systems. This is due to challenges (Seshia, Sadigh, and Sastry 2016) in modeling the non-deterministic nature of the environment, the difficulty of establishing a formal specification that encodes the desired and undesired behavior of the system, and the need to account for the system's adaptive behavior. Formal verification techniques for NNs therefore deal instead with proving convergence and stability (Fuller, Yerramalla, and Cukic 2006) of the system, using methods such as Lyapunov analysis (Yerramalla et al. 2003).

(iv) Rule extraction – Rules (Darrah and Taylor 2006) are viewed as a descriptive representation of the inner workings of a neural network. Rule extraction algorithms, such as KT (Fu 1994), Validity Interval Analysis (VIA) (Thrun 1995), and DeepRED (Zilke, Mencía, and Janssen 2016), can be used to model the knowledge that a neural network has acquired during the training phase. The extracted rules can be expressed as easy-to-understand 'if-then' statements, which can either be verified manually, owing to their human-readable format, or checked automatically with a model checker. This method helps establish trust in the system, as it improves the system's explainability (Gasser and Almeida 2017). It also aids requirements traceability, since one can verify whether the rules reflect the functional requirements specified for the system. Rules can further help to examine the various functional modes of the system and to ensure that certain inputs induce a safe operation mode while respecting the expected safety limits. Though this method brings considerable advantages, it is mainly applicable to offline learning systems, where the V&V practitioner can extract rules from the network after training is complete.
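KT, VIA, and DeepRED are specialized algorithms; purely as a hypothetical stand-in for the underlying idea, the sketch below distills a trained network's input-output behavior into a shallow decision tree and prints its branches as 'if-then' statements. The synthetic data, the small MLP, and the surrogate-tree shortcut itself are illustrative assumptions, not the cited algorithms:

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Train a stand-in network on synthetic data (illustrative only).
X, y = make_classification(n_samples=2000, n_features=4, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                    random_state=0).fit(X, y)

# Fit a shallow surrogate tree to the *network's* predictions, then
# read its branches as human-checkable 'if-then' rules.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, net.predict(X))
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(4)]))
```

Each printed branch corresponds to one rule of the form "if x0 <= a and x2 > b then class 1", which a V&V practitioner could compare against the specified functional requirements.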
Operational Phase

The solutions that fall under this category can more accurately be referred to as 'online monitoring techniques': they involve one or more monitors working as an oracle to ensure the continued proper functioning of the neural network over time (Cukic et al. 2006). The goal is to ensure that the adaptation dynamics do not cause the network to diverge and thereby trigger unpredictable behavior.

Data Sniffing (Liu, Menzies, and Cukic 2002) is an example of this technique; it studies the data entering and exiting a neural network. If a certain input could produce negative results, the monitors generate an alert and may even flag the data, preventing it from entering the system. This method is particularly useful in cases where outliers could degrade the functioning of the system.
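A minimal sketch of this monitoring idea, assuming a simple per-feature z-score test in front of the network; the threshold and the statistics used are illustrative choices, not details of the cited Data Sniffing method:

```python
import numpy as np

class InputMonitor:
    """Flags inputs that fall outside the training distribution before
    they reach the neural network (a simplified sketch of data
    sniffing; real monitors are considerably more elaborate)."""

    def __init__(self, train_data: np.ndarray, z_threshold: float = 4.0):
        self.mean = train_data.mean(axis=0)
        self.std = train_data.std(axis=0) + 1e-9  # avoid division by zero
        self.z_threshold = z_threshold

    def is_suspicious(self, x: np.ndarray) -> bool:
        # Alert if any feature deviates too far from what the network
        # saw during training.
        z = np.abs((x - self.mean) / self.std)
        return bool((z > self.z_threshold).any())

monitor = InputMonitor(np.random.rand(5000, 8))
sample = np.full(8, 10.0)  # an obvious outlier
if monitor.is_suspicious(sample):
    print("alert: input flagged, not forwarded to the network")
```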
Proposed Approach

The majority of contemporary approaches, as evident from the "Related Work" section, relate to testing a developed model before it is deployed in the operational environment. ML-based components, however, suffer from problems such as the operational data or platform differing from what the model was trained on, uncertainty about new inferences drawn from operational data, and even wear and tear of hardware and software. This leaves ML-based components vulnerable to errors. It is therefore necessary to focus on monitoring-based approaches, which are starting to gain interest (Fridman, Jenik, and Reimer 2017), to help alleviate the safety concerns associated with such systems.

To elaborate the specifics of the proposed solution, an end-to-end deep learning model for lane-change maneuvers has been chosen. Such a model uses a deep neural net that takes input data from sensors representing the environment around the ego vehicle and generates one of three actions, allowing the ego vehicle to continue driving in the current lane or to switch to the left or right lane, depending on the presence of obstacles.

The proposed solution, referred to as the 'Crash Prediction Network', involves a neural network model tasked with determining the likelihood and severity of a crash at any given time step (RQ3). The model takes into consideration multiple features, such as the output of the vehicle's perception module, the planned trajectory/action of the ego vehicle, the predicted (or intended, if available via V2V communication) trajectories of the obstacles, and possibly also information such as the number and severity of previous crashes that the ego vehicle and the obstacles were involved in. The specifics of the system can be understood by distinguishing between the training and operational (after deployment) phases of the model.

The training phase (shown in Fig. 1) relies on the model receiving the required input values for the feature set described above, together with knowledge of whether a crash occurred. The model therefore requires an architecture involving a Reinforcement Learning (RL) environment, which lets the model know the outcome at every time step for a given set of feature values. This also allows the vehicle to crash often, as is characteristic of RL agents, especially at the start of training. We therefore propose to train the model by letting it spar with an RL agent, such that the ego vehicle closely imitates a real-world vehicle performing tasks similar to the lane-change use case described above. At each time step, the RL agent and the Crash Prediction Network have access to information about the vehicle's environment; the Crash Prediction Network predicts whether a crash will occur, while, simultaneously, the RL agent interacts with the environment to determine whether a crash really occurred. Based on the difference between the outputs of the two networks, the Crash Prediction Network is updated until it can eventually predict crashes with a high level of accuracy.

Figure 1: Training of the Crash Prediction Network. (The RL agent sends action commands to the environment and receives state and reward; the Crash Prediction Network receives the state and the action command, predicts a crash, and its prediction is compared against the environment's isCrashed signal to update the network.)
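The training loop of Fig. 1 could be organized roughly as sketched below; ToyEnvironment, ToyAgent, and ToyCrashPredictor are hypothetical placeholders standing in for the simulator, the RL driving policy, and the Bayesian model, none of which are specified in this paper:

```python
import random

class ToyEnvironment:
    """Placeholder simulator; emits the ground-truth isCrashed signal."""
    def reset(self):
        self.t = 0
        return [0.0]  # toy state vector

    def step(self, action):
        self.t += 1
        is_crashed = random.random() < 0.1
        done = is_crashed or self.t >= 100
        reward = -1.0 if is_crashed else 0.1
        return [float(self.t)], reward, done, is_crashed

class ToyAgent:
    """Placeholder RL driving policy for the lane-change use case."""
    def act(self, state):
        return random.choice(["keep", "left", "right"])

class ToyCrashPredictor:
    """Placeholder Crash Prediction Network."""
    def predict(self, state, action):
        return random.random()  # predicted crash likelihood

    def update(self, prediction, target):
        pass  # a gradient step on the prediction error would go here

env, agent, cpn = ToyEnvironment(), ToyAgent(), ToyCrashPredictor()
for episode in range(10):
    state, done = env.reset(), False
    while not done:
        action = agent.act(state)
        predicted = cpn.predict(state, action)
        state, reward, done, is_crashed = env.step(action)
        # Compare the prediction with the environment's isCrashed
        # signal and update the Crash Prediction Network accordingly.
        cpn.update(predicted, target=float(is_crashed))
```

The essential point is the compare/update step: at every time step, the network's predicted crash likelihood is confronted with the environment's ground-truth crash signal.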
The operational stage of the model (shown in Fig. 2) is designed such that the inputs are, as usual, fed to the ML-based component responsible for determining the lane-change maneuver to be carried out by the ego vehicle. The vehicle, however, does not act directly on the generated lane-change command. The action command, along with the environmental inputs in the form of sensor data, is directed to the Crash Prediction Network, which predicts the likelihood of a crash. Only if the likelihood is low is the vehicle allowed to perform the desired action; otherwise the vehicle is pushed into a fail-safe mode that varies depending on the predicted severity of the crash. It is important to note that, for the model to stay relevant to its environment, it needs to learn and improve even in the operational stage. Thus, as in the training stage, the difference between the actual and predicted outputs is used to update the model.

Figure 2: Operation of the Crash Prediction Network. (Input flows to the NN-based maneuver-planning component; its action is checked by the Crash Prediction Network and is either validated for execution or rejected in favor of the fail-safe mode.)

The Crash Prediction Network is based on Bayesian Deep Learning (BDL). The reason is that the other deep learning methods currently in use are known to make hard classifications based on what they see and perceive. The disadvantage of this becomes apparent in a system such as an AV, where multiple components come together to form a complex whole: an error in one component can have a snowball effect up the pipeline, leading to catastrophic outputs in later components. A way to overcome this problem is to use BDL (McAllister et al. 2017). Bayesian models would provide better results (Kendall and Gal 2017), because such models output a probability distribution that accounts for uncertainty, which can be exploited for the crash-likelihood output the model is expected to generate. Additionally, the model would propagate not only the classification output but also the model uncertainty associated with it, so that higher-level components can be developed to behave conservatively when the uncertainty of earlier components in the pipeline is high.
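As one possible realization, a minimal sketch using Monte Carlo dropout, a common approximation to Bayesian deep learning; the network architecture, the number of samples, and the decision thresholds are illustrative assumptions rather than the authors' design:

```python
import torch
import torch.nn as nn

class CrashPredictionNet(nn.Module):
    """Toy crash-likelihood model with dropout for MC sampling."""
    def __init__(self, n_features: int = 16):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, x):
        return self.layers(x)

def crash_likelihood(model, x, n_samples: int = 50):
    model.train()  # keep dropout active at inference time (MC dropout)
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    # Mean as the predicted likelihood, spread as the model uncertainty.
    return samples.mean().item(), samples.std().item()

model = CrashPredictionNet()
features = torch.randn(1, 16)  # hypothetical fused sensor/action features
p_crash, uncertainty = crash_likelihood(model, features)

# Gating in the spirit of Fig. 2: act only when the predicted
# likelihood is low AND the model is confident; otherwise fall back
# to the severity-dependent fail-safe mode.
if p_crash < 0.1 and uncertainty < 0.05:
    print("action validated: execute lane change")
else:
    print("fail-safe mode engaged")
```

The gating at the end mirrors Fig. 2: both the likelihood and the uncertainty feed the decision, so the system behaves conservatively exactly when the model is unsure.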
The proposed system has definite advantages. Most importantly, it is not restricted to futuristic autonomous vehicles; it can also be used in present-day Advanced Driver Assistance Systems (ADAS), thereby allowing a smoother transition to Autonomous Vehicles in the future. Secondly, the model can be seen as making an intuitively 'informed decision' by taking data from multiple sources into consideration. Additionally, such a system would also generalize and scale well to the different scenarios that the vehicle might encounter. One of the major problems to be addressed during the development of the model, however, is handling input data received from different sources in varied formats. Next, redundancy needs to be built in to compensate for sensor failures and malfunctions, so that the failure of a single sensor does not affect the accuracy of the system. Another major aspect of this methodology that needs experimentation and validation is that of having one ML-based component supervise another.

Conclusion

This work covered the different aspects of safety for intelligent components that employ machine learning techniques, in order to enable the integration of artificial intelligence into autonomous driving. The focus was on the main concerns and challenges in ensuring safety in highly critical applications based on machine learning methods, with special emphasis on neural networks. Traditional safety approaches are not sufficiently poised for such systems; there is therefore a need for more concrete methods, such as monitoring techniques like the proposed Crash Prediction Network, which aims to guarantee an acceptable level of safety for the system's functions. The team is in the process of implementing and evaluating the proposed approach.

References

Bagschik, G.; Menzel, T.; and Maurer, M. 2017. Ontology based scene creation for the development of automated vehicles. arXiv preprint arXiv:1704.01006.

Broy, M. 2006. Challenges in automotive software engineering. In Proceedings of the 28th International Conference on Software Engineering, 33–42. ACM.

Burton, S.; Gauerhof, L.; and Heinzemann, C. 2017. Making the case for safety of machine learning in highly automated driving. In International Conference on Computer Safety, Reliability, and Security, 5–16. Springer.

Cheng, C.-H.; Diehl, F.; Hinz, G.; Hamza, Y.; Nührenberg, G.; Rickert, M.; Ruess, H.; and Truong-Le, M. 2018. Neural networks for safety-critical applications: Challenges, experiments and perspectives. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2018, 1005–1006. IEEE.

Cukic, B.; Fuller, E.; Mladenovski, M.; and Yerramalla, S. 2006. Run-Time Assessment of Neural Network Control Systems. Boston, MA: Springer US. 257–269.

Darrah, M., and Taylor, B. J. 2006. Rule Extraction as a Formal Method. Boston, MA: Springer US. 199–227.

Fagnant, D. J., and Kockelman, K. 2015. Preparing a nation for autonomous vehicles: Opportunities, barriers and policy recommendations. Transportation Research Part A: Policy and Practice 77:167–181.

Fridman, L.; Jenik, B.; and Reimer, B. 2017. Arguing machines: Perception-control system redundancy and edge case discovery in real-world autonomous driving. arXiv preprint arXiv:1710.04459.

Fu, L. 1994. Rule generation from neural networks. IEEE Transactions on Systems, Man, and Cybernetics 24(8):1114–1124.

Fuller, E. J.; Yerramalla, S. K.; and Cukic, B. 2006. Stability properties of neural networks. In Methods and Procedures for the Verification and Validation of Artificial Neural Networks. Springer. 97–108.

Gasser, U., and Almeida, V. A. 2017. A layered model for AI governance. IEEE Internet Computing 21(6):58–62.

Kalra, N., and Paddock, S. M. 2016. Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability? Transportation Research Part A: Policy and Practice 94:182–193.

Kendall, A., and Gal, Y. 2017. What uncertainties do we need in Bayesian deep learning for computer vision? In Advances in Neural Information Processing Systems, 5574–5584.

Koopman, P., and Wagner, M. 2016. Challenges in autonomous vehicle testing and validation. SAE International Journal of Transportation Safety 4(1):15–24.

Liu, Y.; Menzies, T.; and Cukic, B. 2002. Data sniffing: Monitoring of machine learning for online adaptive systems. In Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2002), 16–21. IEEE.

McAllister, R.; Gal, Y.; Kendall, A.; Van Der Wilk, M.; Shah, A.; Cipolla, R.; and Weller, A. 2017. Concrete problems for autonomous vehicle safety: Advantages of Bayesian deep learning. International Joint Conferences on Artificial Intelligence, Inc.

Monkhouse, H.; Habli, I.; McDermid, J.; Khastgir, S.; and Dhadyalla, G. 2017. Why functional safety experts worry about automotive systems having increasing autonomy. In International Workshop on Driver and Driverless Cars: Competition or Coexistence.

Ray, S. 2010. Scalable Techniques for Formal Verification. Springer Science & Business Media.

SAE On-Road Automated Vehicle Standards Committee. 2014. Taxonomy and definitions for terms related to on-road motor vehicle automated driving systems. SAE Standard J3016:1–16.

Salay, R.; Queiroz, R.; and Czarnecki, K. 2017. An analysis of ISO 26262: Using machine learning safely in automotive software. CoRR abs/1709.02435.

Seshia, S. A.; Sadigh, D.; and Sastry, S. S. 2016. Towards verified artificial intelligence. arXiv preprint arXiv:1606.08514.

Taylor, B. J.; Darrah, M. A.; and Moats, C. D. 2003. Verification and validation of neural networks: A sampling of research in progress. In Intelligent Computing: Theory and Applications, volume 5103, 8–17. International Society for Optics and Photonics.

Taylor, B. J. 2006. Automated Test Generation for Testing Neural Network Systems. Boston, MA: Springer US. 229–256.

Thrun, S. 1995. Extracting rules from artificial neural networks with distributed representations. In Tesauro, G.; Touretzky, D.; and Leen, T., eds., Advances in Neural Information Processing Systems (NIPS) 7. Cambridge, MA: MIT Press.

Yerramalla, S.; Fuller, E.; Mladenovski, M.; and Cukic, B. 2003. Lyapunov analysis of neural network stability in an adaptive flight control system. In Symposium on Self-Stabilizing Systems, 77–92. Springer.

Zilke, J. R.; Mencía, E. L.; and Janssen, F. 2016. DeepRED: Rule extraction from deep neural networks. In International Conference on Discovery Science, 457–473. Springer.