=Paper=
{{Paper
|id=Vol-3087/paper_49
|storemode=property
|title=The wall of safety for AI: approaches in the Confiance.ai program
|pdfUrl=https://ceur-ws.org/Vol-3087/paper_49.pdf
|volume=Vol-3087
|authors=Bertrand Braunschweig,Rodolphe Gelin,François Terrier
|dblpUrl=https://dblp.org/rec/conf/aaai/BraunschweigGT22
}}
==The wall of safety for AI: approaches in the Confiance.ai program==
Bertrand Braunschweig (1), Rodolphe Gelin (2), François Terrier (3)

(1) Institut de Recherche Technologique SystemX, 2 boulevard Thomas Gobert, Bâtiment 863, F-91120 Palaiseau, France, bertrand.braunschweig@ext.irt-systemx.fr
(2) Renault Group, TCR, 1 avenue du Golf, 78084 Guyancourt, France, rodolphe.gelin@renault.com
(3) Université Paris-Saclay, CEA, List, F-91120 Palaiseau, France, francois.terrier@cea.fr

===Abstract===
AI is advancing at high pace towards several "walls". Beyond social and ethical considerations, these walls concern several strongly interdependent subjects, each gathering concerns from the AI community in use, design and research: trust, safety, security, energy, human-machine cooperation, and "inhumanity". Safety questions are particularly important for all of them. The Confiance.ai industrial program aims at solving some of these issues by developing seven interrelated projects that address these aspects from different viewpoints and integrate them into an engineering environment for AI-based systems. We present the concrete approach taken by Confiance.ai and its validation strategy, based on real-world industrial use cases provided by the members.

Copyright © 2022 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===The walls of AI and their relation with safety===
Artificial intelligence is advancing at a very fast pace, both in terms of research and applications, and is raising societal questions that are far from being answered. But as it moves forward rapidly, it runs into what we call the five walls of AI, walls that it is likely to crash into if we do not take precautions. Any one of these five walls is capable of halting its progress, which is why it is essential to know what they are and to seek answers in order to avoid a so-called third winter of AI, a winter that would follow the first two (in the 1970s and the 1990s), during which AI research and development came to a virtual standstill for lack of budget and community interest.

The five walls are those of trust, energy, safety, human interaction and inhumanity. They each contain a number of ramifications, and they obviously interact.

There are different opinions on this matter. The paper by Yoshua Bengio, Yann LeCun and Geoffrey Hinton (Bengio et al. 2021), written after their collective Turing Award, provides insights into the future of AI through deep learning and neural networks without addressing the same topics; the 2021 progress report of Stanford's 100-year longitudinal study (Littman et al. 2021) examines AI advances to date and presents challenges for the future, very complementary to those we discuss here; the recent book by César Hidalgo (2021) looks at how humans perceive AI (and machines); and the book "Human Compatible" by Stuart Russell (2019) is interested in the compatibility between machines and humans, a subject we treat differently when we talk about the interaction wall.
====Trust and safety====
If people do not trust the AI systems they interact with, they will reject them. Several organizations are trying to define what trust in artificial intelligence systems is; it has been the main subject of the group of experts mobilized by the European Commission, whose work is entirely framed in the "trustworthy AI" perspective (EC 2019). The international standardization organization, ISO (2020a, 2020b), considers about eleven different objectives, with ramifications, related to trustworthy AI: fairness, security, safety, privacy, reliability, transparency/explainability, accountability, availability, maintainability, integrity, duty of care, social responsibility, environmental impact, availability and quality of training data, and AI expertise. This is probably not a definitive list of the dimensions of trust, and all these terms would require a precise definition and the development of a dedicated ontology to identify their meanings and the relations among them, in particular the relations with safety. This point motivates activities in Confiance.ai to build a taxonomy gathering inputs from all identified sources. However, as this taxonomy is not yet stabilized, we use the terms here with their inherent ambiguities.

Trust, especially in digital artifacts of which AI is a part, is a combination of technological and sociological factors. Technological factors include the ability to verify the correctness of a conclusion, robustness to perturbations, the handling of uncertainty, etc. All these technological factors are related to safety; they constitute the kernel of the Confiance.ai program and gather the main part of its activities. Sociological factors, such as validation by peers, reputation in social networks, or the attribution of a label by a trusted third party, will complete the assessment of AI-based system safety and improve adoption.

====Focus on security aspects====
Security is considered here from the point of view of cybersecurity. It is a key dimension of trust that can be included in safety considerations, since attacks can trigger critical safety issues, but it is also often treated separately, for example for privacy aspects that do not always raise safety questions. Concerning attacks, while AI systems are, like all digital systems, susceptible to being attacked, hacked or compromised by "usual" methods (intrusion, decryption, viruses, saturation, etc.), they have particular characteristics that make them especially fragile to other, more specific types of attack. Adversarial attacks consist in injecting minor variations into the input data, during the inference phase, in order to significantly modify the system output. Since the famous example of the STOP sign not being recognized when tagged with stickers, and the example of the panda being mistaken for a gibbon when a noise component is added, it is known that an attack can be composed in such a way as to strongly modify the interpretation of the data made by a neural network. And this does not only concern images: one can conceive adversarial attacks on text, or on temporal signals (audio in particular, but also any physical measurement). The consequences of such an attack can be dramatic: a bad interpretation of the input data can lead to a decision in the wrong direction (for example, accelerating instead of stopping, for a car). The report by NIST (NISTIR 2019) establishes an interesting taxonomy of attacks and corresponding defenses. In particular, it shows that attacks during the inference phase are not the only ones of concern: for instance, it is possible to pollute the learning bases with antagonistic examples, which naturally compromises the systems trained from these bases. If we add to this the "usual" security issues, as well as the multiple problems caused by deepfakes, it is clear that the AI security wall is now solid enough and close enough that it is essential to protect ourselves from it.

Even if the first step of Confiance.ai is focused on safety, security issues are considered and will be the subject of dedicated work in the next phases.
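To make the adversarial-attack mechanism described above concrete, here is a minimal sketch of the classical fast gradient sign method, assuming a differentiable PyTorch classifier; the model, inputs and perturbation budget are hypothetical, and this is an illustration of the attack family, not a Confiance.ai tool.

```python
# Minimal FGSM sketch (assumes PyTorch and a differentiable classifier `model`).
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.03):
    """Return an adversarial copy of `image` that tries to flip the prediction."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Shift every pixel by +/- epsilon in the direction that increases the loss.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()

# Usage sketch: a perturbation barely visible to a human may change the predicted class.
# adv = fgsm_attack(model, images, labels)
# print(model(images).argmax(1), model(adv).argmax(1))
```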
====Focus on interaction aspects====
Interaction with humans can take various forms: speech, text, graphics, signs, etc.; in any case it does not necessarily take the form of sentences. Interaction problems (in both directions) between AI systems and human operators or users can obviously cause safety issues if critical situations are misunderstood. For example, if requests made by users are ambiguous for the machine because of interaction problems, a wrong interpretation of the instructions can lead to undesired behavior (e.g. targeting a friendly vehicle, or supplying an inappropriate medication). The requirements for proper interaction mechanisms in the case of autonomous vehicles are well described in (Daimler et al. 2020), section 2.2.2.14, whose introduction we quote: "Human-machine interaction (HMI) is considered a crucial element for the safe operation of SAE L3, L4 or L5 vehicles ... HMI should be carefully designed to consider the psychological and cognitive traits and states of human beings with the goal of optimizing the human's understanding of the task and situation and of reducing accidental misuse or incorrect operations".

The need for explanations of artificial intelligence systems is one of the measures of the regulation proposed by the European Commission (EC 2021) and of a draft standard concerning the certification of the development process of these systems (LNE 2021). As these are key issues for safety, they will be considered by Confiance.ai in the next phases.

====Energy====
The energy wall is well identified by some deep learning researchers. The seminal paper by Emma Strubell et al. (2019) found that training a large transformer-based natural language processing neural network, with optimization of the network architecture, consumed as much energy as five passenger cars over their lifetime. The paper by Thompson et al. (2020) went further, concluding that "the computational limits of deep learning will soon be constraining for a range of applications, making the achievement of important benchmark milestones impossible if current trajectories hold." This is a key issue, in particular if we consider the subject more broadly in terms of the required or desired frugality of AI in data, algorithms and computational resources. As embedded systems are natural targets of Confiance.ai, this subject will be considered through the angle of the impact of resource optimization (energy, memory, data) on AI-based system safety.
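As a purely illustrative order-of-magnitude sketch (every figure below, accelerator count, power, utilization, duration and PUE, is an assumption, not a value taken from Strubell et al. 2019), training energy can be estimated from accelerator power and wall-clock time:

```python
# Back-of-envelope training-energy estimate; all numbers are illustrative assumptions.
def training_energy_kwh(num_gpus, gpu_power_kw, hours, utilization=0.8, pue=1.5):
    """Grid energy drawn by a training run, including data-centre overhead (PUE)."""
    return num_gpus * gpu_power_kw * utilization * hours * pue

# Hypothetical run: 512 accelerators at 0.3 kW for two weeks.
print(f"{training_energy_kwh(512, 0.3, 14 * 24):,.0f} kWh")  # about 62,000 kWh here
```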
====(Non-)humanity====
Finally, a fifth wall is that of the humanity of machines, or rather of their "inhumanity" (in the sense of "not being human"). It gathers several hot subjects: the acquisition of common sense; causal reasoning; the transition to System 2 thinking in the sense of Kahneman (2013). These are all capabilities that we humans naturally possess and that artificial intelligence systems do not have. Even if this is a crucial set of issues that could completely change our relation to AI and its safety, it still seems to require long-term research, and it is therefore not addressed by the Confiance.ai program.

===Overview of the Confiance.ai approach===
The Confiance.ai program is the technological pillar of the Grand Challenge "Securing, certifying and enhancing the reliability of systems based on artificial intelligence" launched by the French Innovation Council. The two other pillars focus on standardization (norms, standards and regulation towards certification) and on application evaluation. Confiance.ai is the largest technological research program in the #AIforHumanity (2018) plan. It tackles the challenge of AI industrialization, as the very large-scale deployment of industrial systems integrating AI is a crucial stake for industrial and economic competitiveness. It has a strong ambition: breaking down the barriers associated with the industrialization of AI and equipping industrial players with methods adapted to their engineering. One originality of the program lies in its integrative strategy: it addresses the scientific challenges related to trustworthy AI and provides tangible solutions that can be applied in the real world and are ready for deployment in operations.

As defined by the European Commission (EC 2020, EC 2021), trust is the key objective for a deployment that respects European values. It can be defined from various points of view and levels of detail, and it encompasses both engineering and usage aspects. Even if Confiance.ai has to consider all these aspects, a particular effort is made to propose concrete and pragmatic answers for system and software engineering methods able to support the certification of AI-based systems according to their criticality levels.

Confiance.ai has organized the program upon the four main stages of ML component development also identified by R. Ashmore, R. Calinescu and C. Paterson (2019): data management, model learning (or "design"), model verification and model deployment. The structure is completed by a transversal objective to define the methodologies for the engineering and certification of AI-based systems (Figure 1).

Figure 1: Confiance.ai program architecture.
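As a purely illustrative sketch of this four-stage structure (the evidence items listed are hypothetical examples, not the program's actual engineering environment), the lifecycle stages cited from Ashmore, Calinescu and Paterson (2019) can be seen as the skeleton of a qualification file:

```python
# Illustrative only: the four ML lifecycle stages named above, with hypothetical
# examples of the evidence a qualification file might collect at each stage.
LIFECYCLE_EVIDENCE = {
    "data management": ["dataset specification vs. ODD", "labelling quality report"],
    "model learning": ["training configuration", "robustness-aware training report"],
    "model verification": ["robustness evaluation", "explainability analysis"],
    "model deployment": ["embedded resource budget", "runtime monitoring plan"],
}

def missing_evidence(collected):
    """Per stage, the evidence items still missing from a qualification file."""
    return {stage: sorted(set(items) - set(collected.get(stage, [])))
            for stage, items in LIFECYCLE_EVIDENCE.items()}

print(missing_evidence({"data management": ["dataset specification vs. ODD"]}))
```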
Each subject triggers several focused actions, evaluated on use cases, to help identify and assess the capacity of the technologies to provide valuable arguments for safety assessment. The program adopts a strategy of progressive advancement: during the first year of the program, data-based AI solutions, mainly using neural networks, are the focus of research, with applications to image processing, time series and structured data. Then, in the following years, more complex problems and relevant industrial use cases will be looked at: use cases using video, audio and text data will be added, as well as other AI formalisms including knowledge-based and hybrid approaches. At the end of the program, the program will cover the whole spectrum of critical systems.

===Technological and scientific challenges===
More precisely, we identified more than 40 detailed technological and scientific challenges for the program. The list of challenges is subject to change as we progress; it has already evolved since the launch of the program one year ago. The program adopted the term "trust" to remain open to all possible factors ensuring an AI deployment that is beneficial for humans. In practice, at least for the first phases of Confiance.ai, "trustworthy" can be understood as "safe", as the focus is on ensuring, evaluating and certifying the safety of the AI-based system. As of now, the challenges belong to three main categories and eight subcategories:

1. Trustworthy system engineering with AI components
* Qualifying AI-based components and systems
* Building AI components with controlled trust
* Embeddability of trustworthy AI
2. Trust and learning data
* Qualifying data/knowledge for learning
* Building data/knowledge to increase confidence in learning
3. Trust and human interaction
* Trust-generating interaction between users and AI-based systems
* Trust-generating interaction between designers/certifiers and AI-based systems

The first category gathers all aspects of designing and evaluating AI components for trust (safety). Issues such as performance, robustness, verification, proof, monitoring and supervision, as well as hybrid systems mixing data-based and knowledge-based solutions, belong to this category. Since the major application area of the program is critical systems, we also put an emphasis on embedded AI, aiming at maintaining the desired properties in environments where memory, computation capacity, energy usage and real-time behavior are constrained.

The second category deals with data and knowledge. Here we consider subjects such as data preparation, data augmentation (when the available data are not sufficient), heterogeneity of data, domain adaptation, and mixing data-based and knowledge-based models. Another key consideration is that of the ODD (operational design domain) in which an automated function or system is designed to operate properly.

The third category puts the emphasis on proper interaction between humans and AI-based systems, focusing on three types of interaction: (i) during the design phase; (ii) for certification by authorities; (iii) when in the hands of final users, with transparency and explainability as the major issues.

To make things more concrete, let us take two examples of detailed challenges. (i) In the first category, we aim to develop components integrating self-monitoring of staying within the ODD boundaries. For this purpose, we need a definition of the ODD that is as formal as possible, alert mechanisms when the system approaches the boundaries, and stopping mechanisms when the system has exited the ODD (a minimal illustration is sketched below). (ii) In the third category, we look at methods of explanation corresponding to the needs of users, designers and certifiers. There is a variety of explanation methods for different kinds of problems and data (e.g. saliency maps for images, logic-based explanations for numerical data, text-based explanations for knowledge-based approaches); we analyzed and tested a dozen available explanation methods and tools, but at this time none of them brings a full solution to the question, and more research is needed.
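The following sketch illustrates the ODD self-monitoring idea of example (i) under stated assumptions: the operating conditions, ranges, margins and the three-state logic are hypothetical, not the program's actual design.

```python
# Illustrative ODD runtime monitor; all ranges and margins are hypothetical.
from dataclasses import dataclass

@dataclass
class Range:
    low: float
    high: float
    margin: float  # distance to a bound below which an alert is raised

    def status(self, value):
        if not (self.low <= value <= self.high):
            return "EXIT"   # outside the ODD: trigger the stopping mechanism
        if min(value - self.low, self.high - value) < self.margin:
            return "ALERT"  # approaching a boundary: warn the operator
        return "OK"

# Hypothetical ODD: vehicle speed in km/h and ambient illuminance in lux.
ODD = {"speed": Range(0, 60, margin=5), "illuminance": Range(500, 100_000, margin=1_000)}

def odd_status(measurements):
    statuses = [ODD[name].status(value) for name, value in measurements.items()]
    return "EXIT" if "EXIT" in statuses else "ALERT" if "ALERT" in statuses else "OK"

print(odd_status({"speed": 58.0, "illuminance": 20_000}))  # ALERT: speed close to its bound
```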
===A validation strategy based on industrial use cases===
Confiance.ai is an industry-oriented project. Its outputs are expected to be usable by the industrial partners within their software engineering processes. One way to achieve this objective is to validate the produced methods and tools on industrial use cases.

Use cases are formally defined by:
* a feature implemented with AI-based technologies;
* an acceptability issue raised by some kind of authority;
* access to the data or to the knowledge base used by the feature;
* involvement of the feature's product owner in the evaluation of the proposed methods and tools.

To reach this goal, the project must thoroughly understand the arguments that will convince the validation authorities; that is the reason why the involvement of the product owner is crucial. Each tool provided by the project should be a step towards the demonstration of the AI-based system's safety. Furthermore, because this demonstration will rely on the way the function has been developed and validated, the use case carrier must be transparent about the way the function was generated: development process, source code, training and validation databases, validation process, etc.

Providing a use case to Confiance.ai is thus not that simple. It is sometimes difficult to share data or knowledge without sharing intellectual property or confidential information; part of a competitive advantage could lie in the selected network architecture. These issues can be circumvented by providing representative publicly available data or well-known networks instead of the real artifacts used by the industrial partner. But such public use cases rarely come with all the information on the development context, in particular regarding the quality process, and can consequently be used only partially: they allow assessing the tool results on the AI function, but are less useful for evaluating the soundness of the methodological proposals at system level.

As they are developed in a research project, the AI-based features are often under development or at proof-of-concept status; their integration in critical industrial systems is not expected in the short term, when plenty of other critical issues have to be managed today. The connection between system safety requirements and the proposed software technical solution at component level will be the major challenge of the project.

Nevertheless, a first set of use cases has been proposed by Confiance.ai partners. They are all supported by data-based AI (implemented with artificial neural network technologies) dealing with vision, time series and surrogate models:

{| class="wikitable"
! Domain !! Use case !! Partner
|-
| 2D vision || Road scene understanding || Valeo
|-
| 2D vision || Classification in aerial pictures || Airbus
|-
| Visual inspection || Welding quality || Renault
|-
| Visual inspection || Indication detection || Safran
|-
| Surrogate model || Look-up table (ACAS XU) || Airbus
|}

Other use cases will be integrated in the second year to complete the panel of AI challenges, for example concerning natural language processing and audio processing.
To illustrate the context and the working process around a use case, we briefly describe here the "Welding" use case provided by Renault. The implemented feature is a vision-based detector of the quality of a welding point, expected to assist the human operator in tracking possible defects on welding points. The feature has been implemented with neural network techniques because they allow a simple learning phase that can be carried out by non-software experts. These welding points, on the chassis, are involved in the safety of the vehicle, so their control is critical. Despite the very good performance of the classification, the quality management of the factory does not trust the efficiency of the AI-based system. This is a hard acceptability issue. The objective in the Confiance.ai project is to build justification arguments in order to make this feature accepted by the quality management.

===First results on a representative use case===
After less than one year (the project started in 2021), some tools have been evaluated on the selected industrial use cases. For instance, we have developed several ways to evaluate the robustness of a classifier. One of them, illustrated in Figures 2 and 3, is to add noise to the input pictures (lighting conditions, Gaussian blur, motion blur, dead columns, dead pixels, etc.) and to check the evolution of the classification accuracy.

Figure 2: Image perturbation examples for robustness evaluation.

Based on an original welding picture (middle), sensor troubles have been simulated (dead pixels on the left, loss of focus on the right). The associated graphics represent the evolution of the error according to the amplitude of the noise for several pictures. This very simple example illustrates the necessary connection with the use case owner: what kind of noise is relevant? Which noise amplitude is realistic? How should such a robustness evaluation be related to the quality requirement?

Figure 3: Example of tool output evaluating accuracy variation depending on different brightness variations.
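A minimal sketch of this kind of robustness sweep is given below, under stated assumptions: the classifier `predict`, the image format (floats in [0, 1]) and the perturbation families are hypothetical stand-ins for the evaluation described above, not the Confiance.ai tool itself.

```python
# Illustrative robustness sweep: accuracy as a function of perturbation amplitude.
import numpy as np

def perturb(image, kind, amplitude, rng):
    """Apply a simple sensor-like perturbation to one image (values in [0, 1])."""
    if kind == "brightness":
        return np.clip(image + amplitude, 0.0, 1.0)
    if kind == "dead_pixels":
        out = image.copy()
        out[rng.random(image.shape[:2]) < amplitude] = 0.0  # fraction of dead pixels
        return out
    raise ValueError(f"unknown perturbation: {kind}")

def accuracy_curve(predict, images, labels, kind, amplitudes, seed=0):
    """Classification accuracy of `predict` for each perturbation amplitude."""
    rng = np.random.default_rng(seed)
    curve = []
    for amplitude in amplitudes:
        predictions = [predict(perturb(img, kind, amplitude, rng)) for img in images]
        curve.append(float(np.mean([p == y for p, y in zip(predictions, labels)])))
    return curve

# Usage sketch, to be plotted in the manner of Figure 3:
# accuracy_curve(predict, welding_images, labels, "brightness", [0.0, 0.1, 0.2, 0.3])
```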
Explainability is also an important aspect for reaching the acceptability of AI. We have evaluated several existing methods: RISE (Petsiuk et al. 2018), LIME (Ribeiro et al. 2016), Occlusion (Zeiler and Fergus 2014), KernelSHAP (Lundberg and Lee 2017), etc. The methods proposed within the Xplique and GemsAI libraries (developed by ANITI and the DEEL project) have been used to highlight the parts of the picture that were used to take the decision about the quality of the welding. Figure 4 shows that the AI system pays particular attention to a certain part of the welding in order to classify it.

Figure 4: Explainability of a classification decision.

The output of explainability can be used by the software developer to validate the good behavior of the developed model. It can also be used by the person in charge of quality checks on the manufacturing line. Finally, it can be used to convince the quality manager that the AI is trustworthy because it takes its decision based on the right observation.
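Among the methods listed above, occlusion sensitivity is the simplest to sketch. The following is a minimal illustration in the spirit of Zeiler and Fergus (2014); the scoring function `predict_proba`, patch size, stride and fill value are assumptions, not the Xplique or GemsAI implementations.

```python
# Minimal occlusion-sensitivity sketch: how much the class score drops when a patch is hidden.
import numpy as np

def occlusion_map(predict_proba, image, class_index, patch=16, stride=8, fill=0.5):
    """Heatmap of score drops; higher values mark regions the decision relies on."""
    h, w = image.shape[:2]
    base_score = predict_proba(image)[class_index]
    heatmap = np.zeros(((h - patch) // stride + 1, (w - patch) // stride + 1))
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill  # grey out one patch
            heatmap[i, j] = base_score - predict_proba(occluded)[class_index]
    return heatmap  # upsample and overlay on the welding picture to obtain a saliency view
```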
Many other tools have been developed to characterize and monitor the behavior of AI-based components. We also propose methods to improve the robustness of neural networks: 1-Lipschitz networks (Tsuzuku et al. 2018), randomized smoothing (Cohen et al. 2019), adversarial training (Balunovic and Vechev 2019), etc., but also, for instance, methods to validate the quality and the completeness of the data used for training: Pixano, developed by CEA (Dupont 2020), and Debiai, by IRT SystemX.

===The black box cases===
Even if some AI components are developed internally by the industrial partners, others will be bought off the shelf. For instance, the automotive industry uses smart cameras developed by other companies. These cameras integrate AI-based features for which the source code, training database and development methods are not accessible. Such use cases are to be considered as well by the project, which will develop tools and methods to evaluate, validate and monitor such features without requiring the four criteria exposed before.

In this case, what is required is of course access to the device but, mainly, a clear statement of the product owner's expectations. What should be demonstrated? Which kind of validation could be decisive for the owner? Here, the output of the project will be more the good questions to ask the supplier than technical tools to answer these questions.

===The confidential use cases===
For some of the reasons mentioned above, some partners will not be able to share their use cases. Nevertheless, they want to validate the methods and tools proposed by the project. As a matter of fact, this represents another way for the project to validate its outputs. Instead of providing a "certified" use case, as it does for the selected use cases, the project will provide methods and tools to be used to qualify or certify a use case. Each partner can use these methods and tools, either through the whole environment provided by the project or by integrating them in its own development cycle, and will validate internally the efficiency of the provided outputs. If the qualification or certification is doable internally, the project is successful: the aim of the project is not to provide "certified" use cases but methods and tools to certify. If the provided methods and tools are not good enough, this will be very rich feedback to improve them within the project. In a way, non-sharable use cases will almost be more useful than sharable ones, because they will demonstrate the relevance of the Confiance.ai project, able to deal with use cases it was not specifically designed for.

===Conclusion===
Confiance.ai is the largest French project on AI focusing on trust, with particular concerns for safety-critical applications at different levels of criticality. It targets setting up a complete tool chain for the development of trustworthy AI-based systems. To that end, Confiance.ai encompasses the whole cycle, with the focus of ensuring trust at each stage, from data management, AI design and AI validation to deployment. This includes system qualification, by defining the elements required for qualification according to the requirements of the respective application domains (aeronautics, automotive, defense, energy, etc.).

The working process is iterative and incremental, and strongly attached to real operational industrial use cases on which all the different tools and methods (both existing ones and those developed in Confiance.ai) are evaluated. For the first year, the focus has been on neural-network-based AI for applications requiring real qualification but with low criticality (for example with a human remaining in the loop). First results show that mathematical approaches for robustness or explainability could provide interesting elements to ease the qualification. Next steps will be completing the chain, for example by addressing the question of ODD definition and management, and by integrating applications using hybrid AI, with the objective of obtaining, within the four years of the project, both methodological guidelines and tool chains adapted to each partner's engineering context.

===References===
AI4Humanity (2018). https://www.aiforhumanity.fr/

Ashmore, R., Calinescu, R., and Paterson, C. (2019). Assuring the machine learning lifecycle: Desiderata, methods, and challenges.

Balunovic, M., and Vechev, M. (2019). Adversarial training and provable defenses: Bridging the gap. In International Conference on Learning Representations.

Bengio, Y., LeCun, Y., and Hinton, G. (2021). Deep Learning for AI. Communications of the ACM, Vol. 64, No. 7, pp. 58-65, July 2021.

Cohen, J., Rosenfeld, E., and Kolter, Z. (2019). Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning (pp. 1310-1320). PMLR.

Daimler et al. (2020). Safety first for autonomous driving. https://www.daimler.com/innovation/case/autonomous/safety-first-for-automated-driving-2.html

Darpa (2017). Explainable Artificial Intelligence. https://www.darpa.mil/program/explainable-artificial-intelligence

Dupont, C., Ouakrim, Y., and Pham, Q. C. (2020). UCP-Net: Unstructured Contour Points for Instance Segmentation. IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2020.

EC (2019). https://ec.europa.eu/futurium/en/system/files/ged/ai_hleg_policy_and_investment_recommendations.pdf

EC (2020). https://ec.europa.eu/.../commission-white-paper-artificial-intelligence-feb2020_en.pdf

EC (2021). https://ec.europa.eu/france/news/20210421 nouvelles_regles_europeennes_intelligence_artificielle_fr

GemsAI. https://github.com/XAI-ANITI/ethik

Hidalgo, C. (2021). https://www.judgingmachines.com/

ISO (2020a). https://www.iso.org/obp/ui#iso:std:iso-iec:tr:24028:ed-1:v1:en

ISO (2020b). Information technology, Artificial intelligence, Risk management. ISO/IEC CD 23894.

Kahneman, D. (2013). Thinking, Fast and Slow. Farrar, Straus and Giroux.

Littman, M. L., Ajunwa, I., Berger, G., Boutilier, C., Currie, M., Doshi-Velez, F., Hadfield, G., Horowitz, M. C., Isbell, C., Kitano, H., Levy, K., Lyons, T., Mitchell, M., Shah, J., Sloman, S., Vallor, S., and Walsh, T. (2021). "Gathering Strength, Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100) 2021 Study Panel Report." Stanford University, Stanford, CA, September 2021. http://ai100.stanford.edu/2021-report

LNE (2021). Laboratoire National de Métrologie et d'Essais; Processus de conception, de développement, d'évaluation et de maintien en conditions opérationnelles des intelligences artificielles.

Lundberg, S. M., and Lee, S. I. (2017). A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems (pp. 4768-4777).

NISTIR draft 8269 (2019). Tabassi, E., et al. A Taxonomy and Terminology of Adversarial Machine Learning. https://nvlpubs.nist.gov/nistpubs/ir/2019/NIST.IR.8269-draft.pdf

Petsiuk, V., Das, A., and Saenko, K. (2018). RISE: Randomized input sampling for explanation of black-box models. arXiv preprint arXiv:1806.07421.

Ribeiro, M. T., Singh, S., and Guestrin, C. (2016). "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135-1144).

Russell, S. (2020). Human Compatible: Artificial Intelligence and the Problem of Control. Penguin Books.

Strubell, E., Ganesh, A., and McCallum, A. (2019). Energy and Policy Considerations for Deep Learning in NLP. https://arxiv.org/abs/1906.02243v1

Thompson, N., et al. (2020). The Computational Limits of Deep Learning. arXiv:2007.05558v1.

Tsuzuku, Y., Sato, I., and Sugiyama, M. (2018). Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks. arXiv preprint arXiv:1802.04034.

Xplique. https://github.com/deel-ai/xplique

Zeiler, M. D., and Fergus, R. (2014). Visualizing and understanding convolutional networks. In European Conference on Computer Vision (pp. 818-833). Springer, Cham.