=Paper=
{{Paper
|id=Vol-3087/paper_40
|storemode=property
|title=Combining Data-Driven and Knowledge-Based AI Paradigms for Engineering AI-Based Safety-Critical Systems
|pdfUrl=https://ceur-ws.org/Vol-3087/paper_40.pdf
|volume=Vol-3087
|authors=Juliette Mattioli,Gabriel Pedroza,Souhaiel Khalfaoui,Bertrand Leroy
|dblpUrl=https://dblp.org/rec/conf/aaai/MattioliPKL22
}}
==Combining Data-Driven and Knowledge-Based AI Paradigms for Engineering AI-Based Safety-Critical Systems==
Juliette MATTIOLI (1), Gabriel PEDROZA (2), Souhaiel KHALFAOUI (3,5), Bertrand LEROY (4,5)

(1) Thales, France; (2) CEA List, U. Paris-Saclay, France; (3) Valeo, France; (4) Renault, France; (5) IRT SystemX, France

juliette.mattioli@thalesgroup.com, gabriel.pedroza@cea.fr, souhaiel.khalfaoui@valeo.com, bertrand.leroy@renault.com

This work received French government aid under the Investments for the Future program (PIA) within the framework of the SystemX Technological Research Institute.

Copyright © 2022 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===Abstract===

The development of AI-based systems entails a manifold of doubly hard challenges. They are mainly due, on one side, to the technical debt of the involved engineering disciplines (systems, safety, security), their inherent complexity and their yet-to-solve concerns, and, on the other side, to the emergent risks of AI autonomy, the trade-offs between AI heuristics and required determinism, and, overall, the difficulty of defining, characterizing, assessing and proving that AI-based systems are sufficiently safe and trustworthy. Despite the vast amount of research contributions and the undeniable progress in many fields over the last decades, a gap still exists between experimental and certifiable AIs. The present paper aims at bridging this gap "by design". Considering engineering paradigms as a basis to specify, relate and infer knowledge, a new paradigm is proposed to achieve AI certification. The proposed paradigm recognizes existing AI approaches, namely connectionist, symbolic and hybrid, and proposes to leverage their essential traits captured as knowledge. A conceptual meta-body is thus obtained, containing categories for Data-, Knowledge- and Hybrid-driven approaches. Since it is observed that research strays from Knowledge-driven approaches and rather strives for Data-driven ones, our paradigm calls for empowering Knowledge Engineering, relying upon Hybrid-driven approaches to improve their coupling and benefit from their complementarity.

===Introduction===

Safety can be defined as "freedom from risk which is not tolerable" (ISO). This definition implies that a safe system is one in which scenarios with non-tolerable consequences have a sufficiently low probability, or frequency, of occurring. Thus, safety-critical systems must be dependable during their whole life-cycle, supporting evolution without incurring prohibitive costs. It becomes mandatory that an AI-based Critical System (AICS) does what it has been specified to do (correctness). In the near term, the goal of deploying AI in critical systems motivates research to handle accountability, reliability, suitability, timeliness, etc. Moreover, an AICS needs to be resilient, safe and (cyber-)secure. Complex mechanisms have to be integrated to ensure both responsibility and accountability of the AICS and its outcomes. As any critical system, an AICS needs to be verified, validated, qualified and even certified, following a suitable development methodology. Explainability is also an issue, because AI systems and their decisions need to be explained to integration and maintenance teams as well as to end-users in an understandable manner. Auditability, for assessing algorithms, data, knowledge, design and integration processes, is also a key property that has to be handled. Fig. 1 highlights various non-functional requirements that have to be verified and demonstrated for a sound deployment of an AI-based component within an AICS. All such requirements call for a sound and trustworthy AI engineering methodology with efficient supporting tools, addressing various levels of granularity. On one hand, they must encompass the engineering of the specific algorithmic domains, including the associated data, models and knowledge representations. On the other hand, they must guarantee architecture design correctness up to the complete system engineering cycle.

Figure 1: AISCS-induced non-functional requirements
Given the taxonomy of disciplines involved (Systems, Knowledge, Algorithms, Safety and Security Engineering), their inherent complexity, the yet-to-solve concerns within each of them (technical debt), and the fact that AICS lie at the intersection of those disciplines, AICS development becomes a doubly hard challenge. Indeed, from an engineering perspective, questions arise such as: Which are the fundamental notions and features needed to characterize AICS? Which development languages and methods can sufficiently comprehend and interrelate those notions? How can knowledge be structured to suitably elicit, fulfill and verify requirements? And, last but not least, can the bundle of criteria (ranging from data sensitivity to explainability) be harmonized, and how?

This position paper introduces the need for a conceptual paradigm which aims to provide a basis upon which answers to the previous questions can be elicited. To handle the complexity of the subject, several choices are made. First, since autonomy is a distinctive feature of AICS and is intrinsic to safety, the latter is placed as the top-priority criterion. It is then used to guide design, conduct analyses and align the rest of the criteria: all in all, we target Safety-Critical AICS (AISCS). Secondly, the proposed paradigm assumes that fundamental notions, as such, should appear in it, irrespective of the development methodology or framework. Last but not least, the paradigm also assumes that the challenges of AISCS development can be addressed via a body of knowledge amenable to (1) emulating whatever human beings perceive as intelligence and (2) integrating related concerns, in particular safety. Overall, this paper is dedicated to providing a first specification of a conceptual paradigm as a basis for safe AI-based system development, in order to design a sound and tooled AI engineering methodology that encompasses, with the objective of trustworthiness, AI algorithm engineering, data engineering, knowledge engineering and AI system engineering.

===Data-Knowledge-Based Paradigm for Safe AI===

In this Section, we shortly describe the background taken as reference for the description of our paradigm and how it is leveraged to constitute the expected foreground.

====Paradigm Background====

After conducting a brief survey of approaches for AI development (Foggia, Genna, and Vento 2001) (Sun 2015) (Besold et al. 2017), it is observed that research production is mostly distributed over two big fields, namely symbolic (Belle 2020) and connectionist (Kasabov 2012). Symbolic approaches are based upon a syntax that is endowed with formal semantics (meaning) useful for property expression and verification. They have been successfully applied to increase system trustworthiness in different application domains like health care, automotive, aeronautics and railway (Hofer-Schmitz and Stojanović 2020). Contrary to symbolic approaches, connectionist approaches are not built upon an explicit representation of human expertise: the behaviour is learned from data instances. Connectionist algorithms are based upon a statistical or probabilistic model which is tightly coupled to data sets, first used for model training and, once the model is tuned, for performance evaluation. The model is often structured as a set of nodes defined by multi-value functions or random variables. The nodes are interconnected, and the links can be randomly weighted by values influencing node inputs/outputs (Kasabov 2012).
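To make the connectionist description above concrete, the following minimal sketch trains a small neural network classifier purely from data instances, with no explicit representation of expertise. The data set, network size and library choice (scikit-learn, which is also cited later in this paper) are illustrative assumptions, not prescriptions of the paper.

<pre>
# Minimal sketch of a connectionist (data-driven) model: behaviour is
# learned from data instances rather than encoded as explicit knowledge.
# The data set and hyperparameters below are illustrative assumptions.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic data instances standing in for sensor observations.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# A small network: interconnected nodes with weighted links, tuned on the
# training set and then assessed on held-out data for performance evaluation.
model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.3f}")
</pre>

Held-out evaluation of such a model yields a quantitative performance figure but, as argued below, no argumentation usable for explainability; that gap motivates the hybrid coupling discussed next.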
More recently (Sun 2015), hybrid approaches integrating the symbolic and connectionist paradigms have been proposed. Hybrid approaches aim to profit from the salient features of both symbolic and connectionist techniques while leaving out any potential concurrence (Foggia, Genna, and Vento 2001), (Garnelo and Shanahan 2019). In certain cases, the complementarity between techniques even allows the limitations of each to be overcome by the other. Indeed, on one side, data-driven AI approaches successfully characterize and capture the salient traits of data sets. However, since connectionist models are heuristic and agnostic of typical notion-encapsulation archetypes, they lack the argumentation necessary for explainability. On the other side, symbolic approaches introduce a semantic layer aligned with the notion-encapsulation archetype, which is amenable to expressing domain knowledge and concerns useful for validation and argumentation.

====Paradigm Foreground====

The proposed paradigm leverages the background described in the previous Subsection in the following manner. First, symbolic, connectionist and hybrid are methods and techniques with distinctive features which are applied to achieve the intended AI functionality. In that respect, once a technique is selected, the engineering choices are mostly oriented towards exploring and deciding HOW and WHEN to apply it. Given the referred challenges of AISCS development, and in order to extend the space of engineering choices, it is proposed to consider such techniques as instances of a more abstract meta-structure: an AI body of knowledge. Such a body is meant to include structured information amenable to reasoning about WHAT and WHY in the first place, and in addition HOW and WHEN. Thus, since the connectionist approaches treat and depend upon data sets, the AI body of knowledge includes the category of approaches driven by data, i.e. Data-driven AI. Similarly, since the symbolic approaches rely upon rules allowing reasoning on terms and notions to infer further knowledge, the AI body contains a category of approaches driven by knowledge, i.e. Knowledge-driven AI. A mix of Data- and Knowledge-driven yields the third category, named Hybrid-driven AI.

From the literature survey, research seems to stray from symbolic AI methods and instead leverages learning-based artificial neural networks. However, some underlying issues of current data-driven approaches, such as robustness, fairness, explainability and maintainability, are leading some to call for a return to knowledge-based AI or for some reconciliation of the two main paradigms through Hybrid AI (Garnelo and Shanahan 2019). Following this call, the paradigm proposed hereinafter strives to strengthen Knowledge-driven AIs, targeting a tighter coupling to Data-driven AIs and relying upon the Hybrid-driven approaches.
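The following sketch illustrates, under stated assumptions, what a Hybrid-driven coupling can look like in code: a data-driven classifier proposes a decision and a small knowledge-driven rule layer vetoes it whenever a declared domain constraint is violated. The classifier stand-in, rules and thresholds are hypothetical illustrations, not part of the paper's specification.

<pre>
# Minimal sketch of a Hybrid-driven coupling (illustrative assumptions):
# a data-driven model proposes, a knowledge-driven rule base disposes.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Rule:
    """A declarative domain constraint: name + predicate over the situation."""
    name: str
    holds: Callable[[dict], bool]

# Knowledge-driven part: explicit, auditable domain rules (hypothetical).
RULES: List[Rule] = [
    Rule("min_obstacle_distance", lambda s: s["obstacle_distance_m"] > 2.0),
    Rule("confidence_floor", lambda s: s["model_confidence"] >= 0.9),
]

def data_driven_decision(situation: dict) -> str:
    """Stand-in for a learned model (e.g. a neural classifier)."""
    return "proceed" if situation["model_confidence"] > 0.5 else "stop"

def hybrid_decision(situation: dict) -> str:
    """Accept the learned proposal only if every symbolic rule holds."""
    proposal = data_driven_decision(situation)
    violated = [r.name for r in RULES if not r.holds(situation)]
    return proposal if not violated else f"fallback (violated: {violated})"

print(hybrid_decision({"obstacle_distance_m": 5.0, "model_confidence": 0.95}))
print(hybrid_decision({"obstacle_distance_m": 1.2, "model_confidence": 0.95}))
</pre>

Because the rule layer is declarative, it can be reviewed, traced and argued about independently of the learned component, which is the kind of coupling the paradigm advocates.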
====AI Safety Stakes====

The conceptual paradigm aims to address several stakes of AISCS. Some of the most salient are listed below.

* Quantitative safety metrics and methods. Metrics based upon failure rates have proven effective for achieving suitable levels of safety. However, new metrics and methods to measure errors and misbehaviours of AI modules are yet to be defined and incorporated (a minimal sketch of such a measurement is given after this list).
* Qualitative safety metrics and methods. Human interpretability calls for qualitative metrics. Indeed, explainability, risks, and even the safety of AI are notions that require qualitative interpretation (meaning) in order to be assessed.
* Data modeling and quality. Data modeling is proposed as a means for assessing data features. The influence of data traits (e.g., data diversity (Ashmore and Mdahar 2019), existing vs. possible input values) on the AI modules and their intended functionality needs to be assessed and integrated during design.
* Traceability of safety-related events. Irrespective of the design method, safety events need to be traceable. In a top-down perspective, high-level safety scenarios should cascade down over the detailed architecture so as to infer dependencies and identify critical subsystems. In a bottom-up perspective, errors/failures at component level need to be propagated upwards to determine safety effects. A structuring layer to support such analyses seems necessary but is still missing.
* AI safety levels and certification. AI errors and misbehaviours shall be characterized in such a manner that their effects only lead to bearable risks. Assurance and trust in AI can be built upon provable levels of safety incorporated and evaluated all along the development cycle.

The conceptual paradigm aims to be the basis of a framework for the representation and integration of the previous aspects, as well as for inference and assessment. The building process consists in empowering Knowledge Engineering through a better coupling of Data- and Knowledge-driven approaches. The rest of the paper is dedicated to describing the salient aspects and constituents of the proposal.
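As a complement to the first bullet above, the sketch below shows one hedged way to quantify the misbehaviour rate of an AI module from test outcomes: an observed error rate plus a conservative upper confidence bound (Clopper-Pearson), which can then be compared against a tolerable risk target. The numbers, the target and the use of scipy are illustrative assumptions, not metrics prescribed by the paper.

<pre>
# Sketch of a quantitative safety-style metric for an AI module
# (illustrative assumption; not a metric prescribed by the paper).
from scipy.stats import beta

def misbehaviour_upper_bound(errors: int, trials: int, confidence: float = 0.99) -> float:
    """One-sided Clopper-Pearson upper bound on the true misbehaviour rate."""
    if errors >= trials:
        return 1.0
    return float(beta.ppf(confidence, errors + 1, trials - errors))

errors, trials = 3, 10_000          # outcomes of a test campaign (made up)
observed = errors / trials
bound = misbehaviour_upper_bound(errors, trials)
target = 1e-3                        # hypothetical tolerable misbehaviour rate

print(f"observed rate: {observed:.2e}, 99% upper bound: {bound:.2e}")
print("claim supported" if bound <= target else "more evidence needed")
</pre>

The same scheme could be applied per operating-condition slice, which connects this stake to the ODD-based problem specification discussed later in the pipeline section.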
===Empowering Knowledge-Based AI Systems by Knowledge Engineering===

Introduced in 1956, AI is a computer science discipline concerned with the theory and development of artificial systems able to perform cognitive activities such as reasoning, knowledge representation, planning, learning, natural language processing, perception and decision making. AI includes a wide range of technologies, which can be divided into two broad categories: (1) Data-driven AI, which includes neural networks, statistical learning, evolutionary computing, etc.; and (2) Knowledge-based AI, which focuses on the development of ontologies and semantic graphs, knowledge-based systems and reasoning. However, each AI paradigm only focuses on portions of the information and decision chain, leading to solutions that are not driven by the global "good decision" goal, making them globally inefficient.

====The Main AI Paradigms====

The premises of data-driven AI and knowledge-based AI are fundamentally different (see Figure 2). The paradigm of data-driven AI is based on brain-style learning such as neural networks, whereas knowledge-based AI approaches employ models and knowledge reasoning.

* Often used in the context of pattern recognition, classification, clustering or perception, data-driven AI such as machine learning aims at capturing tacit knowledge, i.e. knowledge which is difficult or impractical to explicitly or analytically define, through statistical approaches, by inferring the inherent structure of a set of examples (input data) that can be used for mapping new data samples.
* (Newell and Simon 2007) claimed that "Symbols lie at the root of intelligent action" and should therefore be a central component in the design of artificial intelligence. In its initial form, knowledge-based AI focused on the transfer process: transferring the expertise of a problem-solving human into a program that could take the same data and make the same conclusions.

Figure 2: Data-driven AI, Knowledge-based AI and Hybrid AI paradigms illustrated with some techniques

====Knowledge-Based AI====

In Software Engineering, the distinction between a functional specification and the design/implementation of a system is often discussed as a separation of what and how. During the specification phase, what the system should do is established in interaction with the users. How the system functionality is realized is defined during design and implementation (e.g., which algorithmic solution can be applied). This separation does not work in the same way for a knowledge-based system (KBS), which is a computer system that represents and uses knowledge to carry out a task, using inference procedures to solve problems that are difficult enough to require significant human expertise for their resolution. Thus, such a system has two distinctive features: a knowledge base and an inference engine. Knowledge is assumed to be given "declaratively" by a set of Horn clauses, production rules, constraints or frames, while inference mechanisms like unification, forward or backward resolution, and inheritance capture the dynamic part of deriving new information.
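The split between a declarative knowledge base and an inference engine described above can be made concrete with a very small forward-chaining sketch. The rules and facts are hypothetical, and the engine is deliberately naive (propositional facts only, no conflict resolution) compared with real KBS shells.

<pre>
# Minimal knowledge base + forward-chaining inference engine
# (naive propositional sketch; rules and facts are hypothetical).

# Declarative knowledge: production rules "if all conditions then conclusion".
RULES = [
    ({"lidar_degraded", "camera_degraded"}, "perception_degraded"),
    ({"perception_degraded"},               "reduce_speed"),
    ({"heavy_rain"},                        "camera_degraded"),
]

def forward_chain(facts):
    """Repeatedly fire rules whose conditions hold until no new fact is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print(forward_chain({"heavy_rain", "lidar_degraded"}))
# -> includes 'camera_degraded', 'perception_degraded', 'reduce_speed'
</pre>

Because every derived fact can be traced back to the rules that produced it, this style of reasoning directly supports the explainability and auditability requirements raised in the Introduction.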
For instance, constraint programming (CP) is a knowledge-based AI approach (Rossi, Van Beek, and Walsh 2008) for solving combinatorial problems, where constraints model the problem and a general-purpose constraint solver is used to solve it. The main idea is to propose (1) a modeling language for combinatorial optimization problems (through variables and constraints) and (2) a generic search algorithm able to solve a combinatorial problem described using the modeling language. A constraint solver mainly aims at reducing the implementation cost of dedicated algorithms and constitutes a framework for reuse in combinatorial optimization. In other words, the essence of CP is a clean separation between the statement of the problem (the variables and the constraints) and the resolution of the problem (the algorithms) (Heipcke 1999).
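The sketch below illustrates that separation on a toy map-colouring problem, assuming the third-party python-constraint package is available; the problem, domains and package choice are illustrative assumptions rather than tooling endorsed by the paper.

<pre>
# Constraint programming sketch: the model states WHAT must hold,
# the generic solver decides HOW to search (illustrative assumption:
# the third-party 'python-constraint' package is installed).
from constraint import Problem

REGIONS = ["north", "south", "east", "west"]
BORDERS = [("north", "east"), ("north", "west"),
           ("south", "east"), ("south", "west")]
COLOURS = ["red", "green", "blue"]

problem = Problem()
problem.addVariables(REGIONS, COLOURS)           # variables and their domains
for a, b in BORDERS:                             # declarative constraints
    problem.addConstraint(lambda x, y: x != y, (a, b))

solution = problem.getSolution()                 # generic search, no custom algorithm
print(solution)
</pre>

The same problem statement could be handed to a different solver without being rewritten, which is exactly the reuse argument made above.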
====Knowledge Engineering====

Knowledge engineering (KE) is the process of understanding and then representing human knowledge in data structures, semantic models (conceptual diagrams of the data as it relates to the real world) and heuristics. Expert systems, constraint programming and ontologies are examples that form the basis of the representation and application of this knowledge. The basic assumption is that both knowledge and experience can be captured and archived in textual or rule-based form, using formalization methods. In its initial form, KE focused on transferring the expertise of a problem-solving human into a program that could take the same data and make the same conclusions. In the 1990s, the KE community shifted gradually to domain knowledge, in particular reusable representations in the form of ontologies. This evolution aimed at alleviating KE's limited ability to accurately reflect how humans make decisions, and more specifically its failure to take into account intuition and "gut feeling", known as "reasoning by analogy". Nevertheless, designing a knowledge-based AI component raises some general fundamental problems:

# Knowledge discovery: how do we translate knowledge as it currently exists in textbooks, articles, databases and human skills into abstract representations in a computer?
# Knowledge representation: how do we represent human knowledge in terms of data structures that can be processed by a computer? How do we determine the best representation for any given problem?
# Knowledge reasoning: how do we use these abstract data structures to generate useful information in the context of a specific case? How do we manipulate the knowledge to provide explanations to the user?
# KBS development lifecycle: how do we verify and update the knowledge base? How do we evaluate and validate knowledge-based systems?

====Knowledge Discovery====

While some knowledge is easy to obtain and understand, other knowledge may be difficult to obtain or interpret. In many situations, experts do not have any formal basis for problem solving or for explaining their reasoning process, so they tend to use "rules of thumb" (heuristics), developed on the basis of their experience, to help them make decisions. Thus, knowledge discovery is the process of collecting, extracting, transferring, accumulating, structuring, transforming and organizing (domain) knowledge (e.g., problem-solving expertise) from data and information or from the synthesis of prior knowledge. One of the most important key points of knowledge discovery is to ensure that correct and relevant knowledge is extracted and represented to the stakeholders and decision makers. No matter what kind of knowledge is collected, this process can be realized in a manual way or in an automatic way.

Even if knowledge discovery is today dominated by machine learning (ML) approaches, the iterative execution of the CRISP-DM methodology (CRoss Industry Standard Process for Data Mining, proposed by a consortium initially composed of DaimlerChrysler, SPSS and NCR) (Chapman et al. 1999), which is today considered the de facto standard for knowledge discovery projects, assumes an interaction between domain experts and data scientists. In practice, the ML model creation process tends to involve a highly iterative exploratory process. In this sense, an effective ML modeling process requires solid knowledge and understanding of the different types of ML algorithms and their parameter tuning (Maher and Sakr 2019), which can be guided by domain knowledge or heuristics (Gibert et al. 2018).
====Knowledge Representation and Reasoning====

Knowledge Representation and Reasoning (KRR) represents information from the real world so that a computer can understand it and then utilize this knowledge to solve complex real-life problems. KRR is not just about storing data in a database; it is the study of how what we know can at the same time be represented as comprehensibly as possible and reasoned with as effectively as possible. One of the main issues is to find the best trade-off between these two concerns. For (Sowa 2000), "Knowledge Representation is the application of logic and ontology to the task of constructing computable models for some domain". Therefore, the way a knowledge representation is conceived reflects a particular insight or understanding of how people reason. The selection of any of the currently available representation technologies (such as logic, knowledge bases, ontologies, semantic networks...) commits one to fundamental views on the nature of intelligent reasoning and, consequently, to very different goals and definitions of success. As we manipulate concepts with words, all ontologies use human language to "represent" the world. Thus, an ontology is expressed as a formal representation of knowledge by a set of concepts within a domain and the relationships between these concepts. Nevertheless, the "fidelity" of the representation depends on what the knowledge-based system captures from the real thing and what it omits. If such a system has an imperfect model of its universe, knowledge exchange or sharing may increase or compound errors during the reasoning process. As such, a fundamental step is to establish effective knowledge representations (symbolic representations) that can be used by future hybrid systems. Symbolic methods may be better adapted to dealing with sparse data, support enhanced explainability and incorporate past human knowledge, while machine learning methods excel at pattern recognition and data clustering/classification problems.
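A minimal way to see "concepts within a domain and the relationships between these concepts" in code is a small triple store with a transitive subsumption query. The concepts and relations below are hypothetical, and the representation is deliberately simpler than a real ontology language such as OWL.

<pre>
# Tiny ontology sketch: concepts, relations, and a transitive "is_a" query
# (hypothetical domain; far simpler than a real ontology language).
TRIPLES = {
    ("emergency_braking", "is_a", "safety_function"),
    ("safety_function",   "is_a", "system_function"),
    ("emergency_braking", "uses", "obstacle_detector"),
    ("obstacle_detector", "is_a", "ml_component"),
}

def is_a(concept, ancestor):
    """Transitive subsumption over 'is_a' edges."""
    parents = {o for (s, p, o) in TRIPLES if s == concept and p == "is_a"}
    return ancestor in parents or any(is_a(p, ancestor) for p in parents)

print(is_a("emergency_braking", "system_function"))   # True
print(is_a("obstacle_detector", "safety_function"))   # False
</pre>

Even this toy structure shows how reasoning quality is bounded by what the representation captures and what it omits, which is the "fidelity" issue raised above.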
====The Symbol Grounding Problem====

Developers building knowledge-based systems usually create knowledge bases from scratch through a tedious and time-consuming process. First, they have to deal with the diversity and heterogeneity of knowledge representation formalisms and with the modeling, taxonomical and terminological mismatches of different knowledge items, even when they belong to the same application domain. Thus, while data engineers focus on building the data pipes and data scientists focus on inference methods, knowledge engineers focus on modeling structural use cases and detailing the concepts of expert knowledge. Knowledge engineering methods adapt to the use cases of knowledge, can model specific requirements and in many cases produce reusable formats. The main limitations of knowledge-based systems lie in the abstract nature of the considered knowledge, in acquiring and manipulating large volumes of information or data, and in the limitations of cognitive and other scientific techniques.

Despite the progress in KE and ontology engineering in the last decade, obstacles remain. Modeling is still a difficult task, as is the choice of a suitable knowledge-based AI technology. Like every model, such a model is only an approximation of reality, and the modeling process is often cyclic. Expert knowledge may introduce modeling bias, whereby a human manually designing a model (or part of a model) does not take into account some aspects of the environment, consciously or unconsciously. New observations may lead to a refinement, modification or completion of the already built-up model; conversely, the model may guide the further acquisition of knowledge. Therefore, an evaluation of the model with respect to reality is indispensable for the creation of an adequate model. These limitations relate to the so-called symbol grounding problem (Harnad 1990) and concern the extent to which representational elements are hand-crafted rather than learned from data. By contrast, one of the strengths of machine learning methods is their ability to discover features in high-dimensional data with little or no human intervention.

Several features must be taken into account when developing a KBS (a minimal redundancy check is sketched after the lists below):

* Redundancy: is there identical or equivalent knowledge (such as rules within expert systems, concepts within ontologies, constraints within constraint solving) that is a special case of another item (subsumed)?
* Consistency: is there ambiguous or conflicting knowledge, or indeterminacy in its application? Is it intended? Are several outcomes possible, for example depending on the strategy (the order in which the knowledge models are applied)?
* Minimality: can the knowledge set be reduced and simplified? Is the reduced form logically equivalent to the original one?
* Completeness: are all possible entries covered by the knowledge of the set?

Thus, a good KBS must have properties such as:

* Representational Accuracy: it should represent all kinds of required knowledge.
* Inferential Adequacy: it should be able to manipulate the representational structures to produce new knowledge corresponding to the existing structure.
* Inferential Efficiency: the ability to direct the inferential mechanism in the most productive directions by storing appropriate guides.
* Acquisitional Efficiency: the ability to acquire new knowledge easily using automatic methods.

Another key concern in knowledge-based modeling is stability. How much variability is there between instances of the problem? How stable is the solution method to small changes? Is the problem very dynamic? What happens if (a small amount of) the data changes? Do solutions need to be robust to small changes? Many such questions need to be answered before we can be sure that a particular knowledge-based AI technique is a suitable technology.
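As announced above, the following sketch shows one hedged way to automate part of the Redundancy check: a rule whose conditions are a superset of another rule's conditions, with the same conclusion, is flagged as subsumed. The rule format matches the earlier forward-chaining sketch and is a hypothetical simplification of real knowledge-base verification.

<pre>
# Sketch of a redundancy (subsumption) check over a propositional rule base,
# reusing the (conditions, conclusion) format of the earlier forward-chaining sketch.
RULES = [
    ({"heavy_rain"},                        "camera_degraded"),
    ({"heavy_rain", "night"},               "camera_degraded"),   # subsumed by the rule above
    ({"lidar_degraded", "camera_degraded"}, "perception_degraded"),
]

def subsumed_rules(rules):
    """Return pairs (specific, general) where 'specific' adds nothing new."""
    findings = []
    for i, (cond_i, concl_i) in enumerate(rules):
        for j, (cond_j, concl_j) in enumerate(rules):
            if i != j and concl_i == concl_j and cond_j < cond_i:
                findings.append((rules[i], rules[j]))
    return findings

for specific, general in subsumed_rules(RULES):
    print(f"redundant: {specific} is subsumed by {general}")
</pre>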
===Knowledge-Driven AISCS Engineering===

ML-based AISCS engineering is often portrayed as the creation of an ML/DL model and its deployment. In practice, however, the ML/DL model is only a small part of the overall system, and significant additional functionality is required to ensure that the ML/DL model can operate in a reliable and predictable fashion, with proper engineering of data pipelines, monitoring, logging, etc. To capture these aspects of AI engineering, we defined the ML algorithm engineering pipeline (see Figure 3), where we distinguish between requirements-driven development, outcome-driven development and AI-driven development. As the starting point, data must be available for training. Based on data engineering, there are various ways to collect and qualify a data set and divide it into training, testing and cross-validation sets. Engineering activities have to be encapsulated as a series of steps within the pipeline, such as:

* 1) Problem specification, including the Operational Design Domain (ODD), that is, the description of the specific operating condition(s) in which a safety-critical function or system is designed to properly operate, including but not limited to environmental conditions and other domain constraints. These requirements describe the specific function that the ML items should implement, as well as the safety, performance and other requirements that the ML items should achieve.
* 2) Data engineering, including guidelines for data collection, preparation and data segregation. A machine learning model requires large amounts of data, which help the model learn about system objectives and purpose. Before it can be used, data needs to be collected and usually also prepared. Data collection is the process of aggregating data from multiple sources; the collected data needs to be sizable, accessible, understandable, reliable and usable. Data preparation, or data pre-processing, is the process of transforming raw data into usable information.
* 3) ML algorithm design. After feeding the training set to the ML algorithm, it can learn appropriate parameters and features. Once training is complete, the model is refined using the validation data set. This may involve modifying or discarding variables and includes a process of tweaking model-specific settings (hyperparameters) until an acceptable accuracy level is reached.
* 4) Implementation. To develop ML components, we have to decide on the targeted hardware platform, the IDE (Integrated Development Environment) and the language for development. There are several choices available; most of them would meet our requirements easily, as all of them provide implementations of the AI algorithms discussed so far, but sometimes we have to take embedded constraints into account.
* 5) Evaluation and verification. After an acceptable set of hyperparameters is found and the model accuracy is optimized, we can finally test our model. Testing uses our test data set and is meant to verify/demonstrate that our models are correct and guarantee some required properties such as robustness and/or explainability. Based on the feedback, we may return to training the model to improve correctness, accuracy and robustness, adjust output settings, or deploy the model as needed.
* 6) Model deployment in the overall system with respect to safety and cyber-security system requirements. Learning assurance case methods can be used.

Figure 3: Proposed ML algorithm engineering pipeline

Then, an ML algorithm has to be designed or selected from an existing ML library (such as Scikit-learn (Pedregosa et al. 2011)) to provide an ML model together with its hyperparameters. Next, the model is trained with the training data. During the training phase, the system is iteratively tuned so that its output matches well the "right answers" in the training material. This trained model can also be validated with different data. If this validation is successful, with whatever criteria we decide to use, the model is ready for deployment, similarly to any other component. A condensed sketch of steps 2) to 5) is given below.
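This condensed sketch covers data segregation, algorithm selection with hyperparameter tuning, and held-out evaluation from the pipeline above, using Scikit-learn as cited in the text. The data set, model family and parameter grid are illustrative assumptions, and the problem-specification, implementation and deployment steps are intentionally out of scope.

<pre>
# Condensed sketch of pipeline steps 2) to 5) with Scikit-learn
# (data set, model family and grid are illustrative assumptions).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# 2) Data engineering: collect (here: a bundled data set) and segregate.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# 3) ML algorithm design: preprocessing + model, tuned by cross-validation.
pipeline = Pipeline([("scale", StandardScaler()),
                     ("clf", RandomForestClassifier(random_state=0))])
grid = {"clf__n_estimators": [100, 300], "clf__max_depth": [None, 10]}
search = GridSearchCV(pipeline, grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# 5) Evaluation and verification on data the model has never seen.
print("selected hyperparameters:", search.best_params_)
print(classification_report(y_test, search.predict(X_test)))
</pre>

In an AISCS setting, the evaluation step would additionally have to demonstrate the safety-related properties discussed earlier (robustness, coverage of the ODD), not just accuracy.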
===Conclusion===

"Data-driven AI is the AI of the senses, and knowledge-based AI is the AI of meaning" (David Sadek, VP Research, Technology and Innovation at Thales). This is why, in order to cover all cognitive capacities, the future lies in the hybridization of these two paradigms, which are too often placed in opposition. Indeed, the shortcomings of deep learning align with the strengths of knowledge-based AI, which raises the possible benefits of hybridization. First, thanks to their declarative nature, symbolic representations can easily be reused in multiple tasks, which promotes data efficiency. Second, symbolic representations tend to be high-level and abstract, which facilitates generalization. Lastly, because of their propositional nature, symbolic representations are amenable to human understanding. AI algorithms need relevant observations to be able to predict the outcome of future scenarios accurately, and thus data-driven models alone may not be sufficient to ensure safety, as we usually do not have exhaustive and fully relevant data. Nevertheless, like any critical system, an AISCS needs well-defined development methods from its design to its deployment and qualification. This requires a complete tool chain ensuring trust at all stages, such as:

# Specification, knowledge and data management,
# Algorithm and system architecture design,
# AI function characterization, verification and validation,
# Deployment, particularly on embedded architectures,
# Qualification and certification from a system point of view.

Figure 4: Revisiting all engineering disciplines for a sound deployment of AISCS

All this demands a sound and tooled AI engineering methodology that encompasses, with the objective of trustworthiness, AI algorithm engineering, data engineering, knowledge engineering and AI system engineering, by addressing the issues described above. Academic research already proposes solutions towards AI certification (see, e.g., https://www.deel.ai/); industry should now take over (see Figure 4). At the French national level, major industrial players in the fields of Automotive, Aeronautics, Defense, Manufacturing and Energy (Air Liquide, Airbus, Atos, EDF, Naval Group, Renault, Safran, Sopra Steria, Thales, Total and Valeo), with the support of academic partners (CEA, INRIA, IRT Saint Exupéry and IRT SystemX), are collaborating to address such issues through the French National Program "Confiance.ai" (https://www.confiance.ai/). Based on the specifications described above, this program aims to bridge the gap between AI proofs of concept and AI deployment within critical systems, toward certification, by providing an interoperable engineering workbench to support AI processes and practices through methods and tools during the overall lifecycle of the AI-based system.

===References===

* Ashmore, R.; and Mdahar, B. 2019. Rethinking Diversity in the Context of Autonomous Systems. Safety-Critical Systems Symposium 2019, 175–192.
* Belle, V. 2020. Symbolic Logic meets Machine Learning: A Brief Survey in Infinite Domains. CoRR, abs/2006.08480.
* Besold, T. R.; d'Avila Garcez, A.; Bader, S.; Bowman, H.; Domingos, P.; Hitzler, P.; Kuehnberger, K.-U.; Lamb, L. C.; Lowd, D.; Lima, P. M. V.; de Penning, L.; Pinkas, G.; Poon, H.; and Zaverucha, G. 2017. Neural-Symbolic Learning and Reasoning: A Survey and Interpretation. arXiv:1711.03902.
* Chapman, P.; Clinton, J.; Kerber, R.; Khabaza, T.; Reinartz, T.; Shearer, C.; and Wirth, R. 1999. The CRISP-DM User Guide. In 4th CRISP-DM SIG Workshop.
* Foggia, P.; Genna, R.; and Vento, M. 2001. Symbolic vs. connectionist learning: an experimental comparison in a structured domain. IEEE Transactions on Knowledge and Data Engineering, 13(2): 176–195.
* Garnelo, M.; and Shanahan, M. 2019. Reconciling deep learning with symbolic artificial intelligence: representing objects and relations. Current Opinion in Behavioral Sciences, 29: 17–23.
* Gibert, K.; Izquierdo, J.; Sànchez-Marrè, M.; Hamilton, S. H.; Rodríguez-Roda, I.; and Holmes, G. 2018. Which method to use? An assessment of data mining methods in Environmental Data Science. Environmental Modelling & Software, 110: 3–27.
* Harnad, S. 1990. The symbol grounding problem. Physica D: Nonlinear Phenomena, 42(1-3): 335–346.
* Heipcke, S. 1999. Comparing constraint programming and mathematical programming approaches to discrete optimisation: the change problem. Journal of the Operational Research Society, 50(6): 581–595.
* Hofer-Schmitz, K.; and Stojanović, B. 2020. Towards formal verification of IoT protocols: A Review. Computer Networks, 174: 107233.
* Kasabov, N. 2012. Evolving spiking neural networks for spatio- and spectro-temporal pattern recognition. In 2012 6th IEEE International Conference Intelligent Systems, 27–32.
* Maher, M.; and Sakr, S. 2019. SmartML: A meta learning-based framework for automated selection and hyperparameter tuning for machine learning algorithms. In The 22nd EDBT.
* Newell, A.; and Simon, H. A. 2007. Computer science as empirical inquiry: Symbols and search. In ACM Turing Award Lectures, 1975.
* Pedregosa, F.; Varoquaux, G.; Gramfort, A.; et al. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12: 2825–2830.
* Rossi, F.; Van Beek, P.; and Walsh, T. 2008. Constraint programming. Foundations of Artificial Intelligence, 3: 181–211.
* Sowa, J. F. 2000. Guided Tour of Ontology. Retrieved from.
* Sun, R. 2015. Artificial Intelligence: Connectionist and Symbolic Approaches. In Wright, J. D., ed., International Encyclopedia of the Social & Behavioral Sciences (2nd Edition), 35–40. Oxford: Elsevier.