CEUR Workshop Proceedings Vol-2808, Paper 22 — https://ceur-ws.org/Vol-2808/Paper_22.pdf
AI-Blueprint for Deep Neural Networks

Ernest Wozniak1, Henrik Putzer1,2, Carmen Cârlan1
1 fortiss GmbH, Guerickestr. 25, 80807 Munich, Germany, lastname@fortiss.org
2 cogitron GmbH, Stefaniweg 4, 85652 Pliening, Germany, henrik.putzer@cogitron.de




Abstract

Development of trustworthy (e.g., safety- and/or security-critical) hardware/software-based systems needs to rely on well-defined process models. However, the engineering of trustworthy systems implemented with artificial intelligence (AI) is still poorly discussed. This is, to a large extent, due to the standpoint in which AI is a technique applied within software engineering. This work follows a different viewpoint, in which AI represents a 3rd kind of technology (next to software and hardware), with close connections to software. Consequently, the contribution of this paper is the presentation of a process model tailored to AI engineering. Its objective is to support the development of trustworthy systems for which parts of their safety- and/or security-critical functionality are implemented with AI. As such, it considers methods and metrics at different AI development phases that shall be used to achieve higher confidence in the satisfaction of trustworthiness properties of a developed system.

Introduction

A common deficiency of safety standards like ISO 26262 [ISO 26262:2018], in the automotive domain, is that they do not account explicitly for Artificial Intelligence (AI) technology [Putzer, H.J.]. However, contrary to older rumors, safety standards do not prohibit the use of AI; they just do not provide any guidelines on how to use this technology. Lately, AI has attracted considerable attention in the automotive, healthcare, and defense industries. This is due to the capabilities of AI during design (shorter time to market; implementation of implicit requirements) and during operation (improved performance). To achieve the vision in which AI not only supports, but also provides safety- and/or security-critical functionality, systems must be assured by evidence for trustworthy behavior of AI components.

There is a tendency for AI to be regarded as software. It is suggested that the application of existing standards in their current form is adequate and that already available processes, methods, and metrics specific to software components can be reused for AI. This work advocates a different approach, in which AI is considered a 3rd kind of technology (next to software (SW) and hardware (HW)) that requires its own process model to ensure trustworthiness. This is because AI, in particular Machine Learning (ML), is a new, data-driven technology, which requires a specific engineering process with specifically tailored methods for assuring trustworthiness. Such a structured engineering process is introduced in this paper, which also discusses the integration of this process into the overall system lifecycle, as presented in the VDE-AR-E 2842-61 standard [VDE-AR-E 2842-61:2020].

This paper is structured as follows. The Foundations section presents the VDE-AR-E 2842-61 standard, which provides the main context for this work. The next section presents a generic process model called the AI-Blueprint, upon which process models tailored to specific AI techniques (e.g., Deep Neural Networks (DNNs), Reinforcement Learning) can be built. After that, a specific instance of the AI-Blueprint for DNNs is provided, with the follow-up section showing its application to the development of a CNN (Convolutional Neural Network)-based pedestrian detection component. Finally, related work is presented, i.e., process models for the development of AI components, especially in the context of the development of safety- and security-critical systems. In the last section, we draw some conclusions and discuss future work.

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Foundations

Fig. 1 presents the reference lifecycle defined in the VDE-AR-E 2842-61 standard. The standard will consist of six parts, of which three are already published. The reference lifecycle can be used as a reference for a process model that supports the development and assurance of a concrete trustworthy system. Trustworthiness is considered a more generic concept that combines a user-defined and potentially project-specific set of aspects. These aspects include, but are not limited to, (functional) safety, security, privacy, usability, ethical and legal compliance, reliability, availability, maintainability, and (intended) functionality.

Fig. 1: Reference Lifecycle of VDE-AR-E 2842-61

The reference lifecycle defines the logical flow of assurance activities and is inspired by the structure of the ISO 26262 safety lifecycle. However, it is domain-independent. A detailed description of the phases can be found in [Putzer, H.J.]. In this work we focus only on the development phase dedicated to the Technological Element (see Fig. 1). The scope of this phase is to provide guidance for the implementation of elements based on a single technology (e.g., SW, HW, and especially AI). With a suitable process interface, when using the VDE-AR-E 2842-61 standard, these process models can be borrowed from other suitable standards. For example, in automotive, SW- or HW-based components can be developed following the V models defined in ISO 26262. However, there is no standardized process model that can be used for AI components. Consequently, the following two sections present the concept of the AI-Blueprint, which acts as a template for constructing process models for specific AI technologies, such as the one presented later, i.e., a process model for DNNs.

The AI-Blueprint

The development of AI components does not fit into existing process models (e.g., those for classical software) due to the specific nature of the AI data-driven implementation. Even inside the field of AI, different methodologies and solution concepts can have very specific requirements towards the underlying process model. This calls for a new approach in which the specific characteristics of a certain AI technology are targeted by a specific process model.

In this paper, we introduce the concept of the AI-Blueprint. It is a template process that shall be refined for a specific AI technique. The AI-Blueprint is characterized by Input and Output Interfaces, Structure (development phases), and Qualifications (e.g., used for the first time, or proven, i.e., used with success in many projects). The execution of an instance of the AI-Blueprint provides an AI element characterized by a predefined quality level, including guarantees to meet defined trustworthiness requirements.

The development phase at system level provides as inputs to the AI-Blueprint the system and trustworthiness requirements, together with the desired Trustworthiness Performance Level (TPL). TPL is a risk classification scheme similar to the Automotive Safety Integrity Level (ASIL), with the exception that it concerns trustworthiness, not only safety. It appoints, in a systematic approach, the selection of qualitative methods and metrics (M&M-s) needed to achieve a certain TPL level.

The AI-Blueprint outputs the AI element and the value of the UCI (Uncertainty-related Confidence Indicator), which are provided back to the development at system level. UCI is a quantitative indicator that describes the achieved confidence in the trustworthiness of the AI component, which can be combined (in a statistically principled way) to calculate the failure rate at the system level [Zhao, X.]. It represents a quantitative guarantee that a component can deliver as part of the trustworthiness contract established with the rest of a system. The desired value of the UCI is defined via the assigned TPL. Conceptually, the UCI can be compared to the idea of expressing random hardware failures (ISO 26262 part 5).

Fig. 2: AI-blueprint for DNN

AI-Blueprint for DNN

In this section, we instantiate the concept of the AI-Blueprint for a certain technology, namely DNNs. For each phase in the process model, we discuss the objectives to be achieved and the methods and metrics (M&M-s) that can be used to ensure higher trustworthiness.

Fig. 3: Left DNN Blueprint Branch M&M-s for Trustworthiness Assurance
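The input/output contract of the AI-Blueprint described above (requirements and an assigned TPL flowing in; an AI element and its UCI flowing out) can be sketched as follows. This is a minimal illustration, not part of the standard: all names and, in particular, the TPL-to-UCI threshold mapping are hypothetical assumptions, since VDE-AR-E 2842-61 does not prescribe concrete numeric values.

```python
# Hypothetical sketch of the AI-Blueprint input/output contract.
# The TPL -> required-UCI mapping below is invented for illustration only;
# the standard does not define concrete numeric thresholds.
from dataclasses import dataclass

# Illustrative minimum UCI (confidence) required per TPL class.
REQUIRED_UCI = {"A": 0.90, "B": 0.99, "C": 0.999, "D": 0.9999}

@dataclass
class BlueprintInput:
    requirements: list      # system + trustworthiness requirements (strings)
    tpl: str                # assigned TPL: "A".."D" ("D" = highest criticality)

@dataclass
class BlueprintOutput:
    ai_element: str         # identifier of the trained/verified model artefact
    uci: float              # achieved Uncertainty-related Confidence Indicator

def meets_contract(inp: BlueprintInput, out: BlueprintOutput) -> bool:
    """True if the achieved UCI satisfies the UCI demanded by the assigned TPL."""
    return out.uci >= REQUIRED_UCI[inp.tpl]
```

If `meets_contract` returned False, the blueprint would mandate redesign either at the AI level or at the system level (e.g., adding a redundant element to lower the assigned TPL).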
A significant area of research is devoted to the identification of methods (apart from standard testing on validation and test sets) and metrics that can be used to reason about the trustworthiness of DNNs. Their usage (or not) is part of the systematic approach to achieve a certain TPL level. Additionally, a subset of them will contribute to the estimation of the UCI. This, however, requires the provision of a "bridge" between M&M-s and estimates of the UCI, similarly as it could possibly be done for SW (see [Rushby, J.]). Fig. 3 and Fig. 4 present examples of M&M-s grouped along the development phases. Still an open problem are the requirements imposed on their usage, depending on the assigned TPL. For example, ISO 26262 part 6 provides a corresponding set of methods to assure confidence in the fulfillment of assurance objectives for a SW component. These methods (e.g., usage of strongly typed programming languages, formal verification, etc.) are, in the context of a particular ASIL level (A, B, C, or D), highly recommended, recommended, or have no recommendation for/against their usage. A similar set of rules shall also be worked out for this DNN blueprint. This is currently left for future work.

Fig. 4: Right DNN Blueprint Branch M&M-s for Trustworthiness Assurance

Initiation Phase
During this phase, the team that will develop the DNN component is assembled. Then, all requirements allocated to the DNN are collected and harmonized. These are product-related and trustworthiness requirements specified during the system-level development phase. Further, the acceptance criteria for the DNN are defined. M&M-s that can be used at this phase refer to requirements engineering.

Data Preparation Phase
The first objective of this phase is to derive data-related requirements from system-level requirements. The second objective is to gather proper data (according to the requirements) and to group it into training, validation, and test sets. The validation set is used during the training, but with the purpose of assessing the model convergence after each epoch. This set can further be used during the design phase to verify a design. The set of possible M&M-s that shall increase confidence in the trustworthiness of an AI element (i.e., a DNN) and which regard data preparation has been intensively researched. For instance, data shall account for corner cases or adversarial examples. Specific use cases may be obtained through synthetic data generated using, for instance, Generative Adversarial Networks (GANs) [Esteban, C.]. Further, Variational Autoencoders (VAEs) may be applied to see whether data falls within the ODD (Operational Design Domain) [Vernekar, S.]. M&M-s can also be used to improve the quality of data labeling. For example, labeling inaccuracies may be circumvented by providing datasets labeled by different teams and/or technologies.

NN Design Phase
This phase outputs as its main artefact the DNN design. The main difference between the DNN design and the DNN model is that the latter also contains information about trained weights. Consequently, the main objectives of this phase are the specification of design-related requirements and the development of a model that satisfies them. The M&M-s at this phase shall contribute to the higher trustworthiness of the AI element by making the design robust to failures (e.g., usage of redundancy) or noisy data (e.g., uncertainty calculation with MC-dropout), contributing to the generalization property (e.g., design guidelines to select an appropriate activation function, etc.), and other non-functional properties which impact trustworthiness.

Implementation & Training Phase
This phase considers as input the DNN design and the training dataset. In order to assure higher confidence in the DNN training, the NN developer shall follow good coding practices (e.g., for high-criticality functionality only strongly typed languages may have to be permitted). Other M&M-s can be used to optimize the training. These include, but are not limited to: cropping, subsampling, scaling, etc. Higher trust can also be achieved by defining and following criteria for the training platform (e.g., level of coherence with the final execution platform). The output artefact of this phase is a DNN model, which is the main input to the verification and validation phases.

Training Verification Phase
This phase is part of the training procedure, with the main purpose of controlling and verifying it. Based on the validation dataset and predefined validation metrics, the evolving DNN model is verified after each epoch. A negative outcome of the verification may require changes in the training (e.g., resignation from subsampling), or may require changes of hyperparameters defined during the design phase.
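The epoch-wise control described for the Training Verification Phase can be sketched in a few lines of plain Python. The patience setting and the loss values used in the example are illustrative assumptions, not prescriptions of the blueprint.

```python
# Minimal sketch of the Training Verification Phase: after each epoch the
# evolving model is checked against a validation metric, and training is
# stopped early once the metric stops improving (i.e., before the model
# starts to fit the noise). The patience value is an illustrative choice.

def train_with_verification(epoch_validation_losses, patience=2):
    """Return the index of the epoch whose weights would be released.

    epoch_validation_losses: validation loss measured after each epoch.
    patience: number of consecutive non-improving epochs tolerated before
              early stopping is triggered.
    """
    best_epoch, best_loss, bad_epochs = 0, float("inf"), 0
    for epoch, loss in enumerate(epoch_validation_losses):
        if loss < best_loss:            # verification passed: model improved
            best_epoch, best_loss, bad_epochs = epoch, loss, 0
        else:                           # negative outcome of the verification
            bad_epochs += 1
            if bad_epochs >= patience:  # trigger early stopping
                break
    return best_epoch
```

A negative verification outcome here simply ends the training run; in the full process it could equally trigger a return to the Design Phase to revise hyperparameters.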
Design Verification
This phase aims at the verification of the DNN by investigating possible problems related to the NN design. The NN developer shall evaluate the result of the training, using the validation dataset, and assess possible problems that may have arisen due to bad design decisions (e.g., an inappropriate activation function). The NN developer may request follow-up iterations over the epochs or, if needed, the redesign of NN hyperparameters (return to the Design Phase).

NN Verification Phase
The IV&V (Independent Verification and Validation) engineer shall execute a set of tests in order to judge the success or failure of the NN with respect to generalization, brittleness, robustness, or efficiency. The judgement should be primarily based on the measured accuracy of the NN over the test dataset and/or identified and/or generated adversarial examples and/or corner cases. Tests may also involve fault injection or endurance tests to measure the sustainability of an NN.

NN Validation, Deployment, and Release
During this final phase, the IV&V engineer shall assess whether the AI element complies with all product and trustworthiness requirements. The NN shall then be integrated with hardware and/or software libraries in order to be deployed in the overall system. The resulting AI element shall be validated while running on the target platform. The final objective of this phase is to calculate the UCI in order to assess compliance of the AI element with the initially assigned TPL. If the obtained UCI value does not correspond to the TPL level, redesign decisions either at the AI level (e.g., design changes, collection of additional data) or at the system level (e.g., introduction of a redundant element to decrease the assigned TPL level) shall be planned and executed.

Practical Example

The objective of this section is to showcase the traversal of the proposed DNN process model for developing a Convolutional Neural Network (CNN) for pedestrian detection. The overall context for this use case, i.e., the System of Interest (SoI), is a pedestrian collision avoidance system. This system entails several components, SW- or HW-based, among which there is the AI component with the main requirement to detect pedestrians (i.e., 2D bounding box detection of pedestrians) based on the analysis of video data acquired from a single camera.

Input from the System-level
From the system-level requirements, we derive AI functional requirements, such as: AIR01: relevant (defined via reachability zone) pedestrians (any person who is afoot or who is using a wheelchair or a means of conveyance propelled by human power other than a bicycle) are properly detected; and AIR02: the ODD is defined through the European roads. Further, we also derive trustworthiness AI requirements, purposed mainly to counteract identified hazards, e.g.: AITR01: the DNN shall output for each detected relevant pedestrian a bounding box with accurately estimated size and position accuracy in the velocity-dependent detection zone, in all situations the Ego Vehicle may encounter while being in the ODD; AITR02: pedestrians occluded up to 95% shall be properly identified; and AITR03: the DNN component shall not output false positives in the detection zone more than once in a sequence of 5 video frames. Next to these requirements, there is also the value of the TPL, which is assigned at the system level. On a scale from A to D (the highest trustworthiness criticality level), in case there are no redundant components to the pedestrian detection component, the assigned TPL value is D, due to the high criticality of the functionality that it provides.

Example of Activities to fulfill Phases Objectives
This subsection presents examples of activities that can be executed over the different phases of the process model for the CNN in order to meet the objectives of the phases and provide as a final outcome an AI element together with a trustworthiness guarantee expressed by the value of the UCI.

During the Initiation Phase, system-level requirements are refined. In the example of pedestrian recognition, examples of the refined product requirements are: AIR01 → AIR01.01: pedestrians of min. width (20 pixels) and min. height (20 pixels) shall be classified; and AIR02 → AIR02.01: the ODD shall consider right-lane and left-lane traffic. Further, we refine the AI trustworthiness requirements as follows: AITR01 → AITR01.01: the value of mean average precision shall be greater than or equal to 97.9%; and AITR02 → AITR02.01: the data samples shall include examples with a sufficient range of levels of occlusion, giving partial views of pedestrians at crossings. Next, to fulfill further objectives of this phase, the team developing the AI element has to be assembled and, if necessary, the DNN blueprint needs to be adjusted to reflect further identified needs expressed by the team.

The first activity during the Data Preparation Phase is to look over the requirements and process those that impact the data gathering and labeling activities. For instance, AITR02 has an implicit impact on the data because, to properly train and test the model, the data shall contain pedestrians with different levels of occlusion (up to 95%).
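As an illustration of such a data-related check, the occlusion coverage demanded by AITR02.01 could be verified with a simple binning sketch. The bin edges and the per-bin minimum below are hypothetical choices made for the example; the requirement itself fixes only the 95% upper bound.

```python
# Sketch of a Data Preparation Phase check for AITR02.01: verify that the
# labeled dataset covers the whole required occlusion range (0% up to 95%).
# The bin edges and min_per_bin are illustrative, not values taken from the
# requirement itself.

def occlusion_coverage_gaps(occlusions,
                            edges=(0.0, 0.25, 0.5, 0.75, 0.95),
                            min_per_bin=1):
    """Return the (low, high) occlusion bins with fewer than min_per_bin samples.

    occlusions: per-sample pedestrian occlusion levels in [0.0, 1.0].
    """
    gaps = []
    for low, high in zip(edges, edges[1:]):
        # Edges are inclusive on both sides, so a sample lying exactly on a
        # boundary counts towards both adjacent bins; fine for a coverage check.
        count = sum(1 for o in occlusions if low <= o <= high)
        if count < min_per_bin:
            gaps.append((low, high))
    return gaps
```

A non-empty result would indicate occlusion ranges for which additional samples must be gathered or synthesized before training.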
There could also exist data-related concerns explicitly expressed, such as: AIR03: the data samples shall include a sufficient range of examples reflecting the effects of identified system failure modes; AIR04: the format of each data sample shall be representative of that which is captured using sensors deployed on the ego vehicle; or AIR05: the data samples shall include a sufficient range of pedestrians within the scope of the ODD. The data shall then be gathered, labeled, and properly stored, according to the identified requirements.

The Design Phase shall first analyze requirements which may have implications on the CNN design. Examples of such requirements are: AIR06: the DNN shall be robust against all types of foreseeable noise; AIR07: a diagnosis function shall exist in order to detect distributional shift in the environment (out-of-ODD detection); or AIR08: plausibility checks of detected bounding boxes are necessary (e.g., pedestrians usually do not fly). The designer shall then specify the CNN in terms of the number of convolutional layers, kernel size, number of fully connected layers, neurons in each layer, loss function, and other hyperparameters, to provide a design that will best serve the intended purpose. The analyzed requirements shall also steer design activities. For instance, AIR06 requests robustness to various types of noise which can occur in the input data. The presence of noise may also be problematic even during the training, as it may lead to overfitting. Further, AIR07 may be handled either by introducing an additional component (in such a case the solution would affect the system level) based on a VAE, which can identify distributional shift, or the CNN itself may use MC-dropout to calculate uncertainties, where high uncertainty may result from out-of-ODD input. AIR08 may be accommodated by design through additional knowledge injected into the CNN (neural symbolic integration) that rejects labels that, based on human knowledge, make no sense.

The first activity of the Implementation and Training Phase is to provide the code for the CNN. The coding activity can follow standard SW development practices recommended in ISO 26262 part 6. However, certain differences such as tooling, libraries, or available programming languages (Python or preferably C++) create additional challenges for this activity. Next, the training platform needs to be selected with justification, and training-related parameters shall be provided, e.g., batch size = 1024, number of epochs = 4, number of iterations = 10, learning rate = 0.001, or decay factor = 0.9. Then, according to the predefined parameters, the training shall be assessed during the Training Verification Phase. The parameters may be tuned if necessary (e.g., the model does not converge), or early stopping triggered if the model has learned to extract all the meaningful relationships from the data before starting to model the noise.

Having the trained model, verification and validation activities can be performed. Verification shall primarily investigate key trustworthiness concerns which are specific to CNNs. These are robustness, brittleness, efficiency, ODD coverage, distributional shift, and unknown behavior in rare situations (corner cases or adversarial examples). Verification starts at the Design Verification Phase, in which the performed activities shall uncover possible problems regarding the mentioned properties in relation to the design. For instance, one could perform a design investigation to analyze neuron activations. This contributes to the explainability of how pedestrians are identified and may allow pruning those neurons which do not play any role in the decision process. Pruning may also be used to limit the number of neurons, to possibly eliminate problems of overfitting. The M&M-s that shall be applied can be classified as white-box, because they refer to internal characteristics of CNNs. Functional Verification Phase activities shall also target the verification of key trustworthiness concerns, however more from the grey-/black-box perspective. Here, not only the CNN itself is verified, but also the data. For example, if the CNN does not detect pedestrians in a wheelchair, most likely the CNN was never fed with such training examples. Explainability could be further enhanced by using attention-based methods. For example, heat maps (a grey-box method) may reveal those features from the image which are used to identify pedestrians. These may be different body parts, or maybe just vertical lines. The result highly depends on the level of feature annotation performed during the labeling. If it is not detailed enough, the NN tester may request extended feature annotation to be performed at the data preparation phase.

The validation is executed in the last phase, i.e., NN Validation, Deployment, and Release. Its main activity centers on the validation of the requirements provided as input to the initiation phase, and their refined versions elaborated in that phase. For instance, to validate AIR01.01, one has to identify input images within the test set in which there are pedestrians with height and width close to 20 pixels and see whether these are properly detected. In case they are not, either the requirements shall be changed or the training data shall be checked to identify whether enough samples were available to train the model for this requirement.

Output provided to the System-level
The artefact output by the DNN process model is a trained and verified CNN model for pedestrian detection. Next to it, the UCI value is computed, which accounts for the M&M-s being used throughout the blueprint and their efficiency in minimizing the risks of possible hazards which may occur.

Related Work

The fact that there is a need for a dedicated process model for the development of AI components within safety- and/or security-critical systems was underlined more than 20 years ago by Rodvold, who proposed a formal development methodology and a validation technique for AI trustworthiness assurance [Rodvold, D.M.]. While the phases in the process model proposed by Rodvold resemble the phases of our proposed AI-Blueprint, Rodvold does not discuss the
metrics and corresponding methods that can be used for the implementation and the verification of the considered trustworthiness requirements.

Microsoft presents a nine-stage ML workflow for the development of AI-based applications [Amershi, S.]. The workflow is claimed to be used by multiple teams inside Microsoft, for diverse applications, and to have been integrated into overall, preexisting agile software engineering processes. Amershi et al. categorize the workflow stages as data-oriented (e.g., collection, cleaning, and labeling) and model-oriented (e.g., model requirements, feature engineering, training, evaluation, deployment, and monitoring). While these stages are similar to the ones in our proposed AI-Blueprint, the Microsoft workflow does not consider activities specific to trustworthiness assurance, as their workflow is only intended to be used for the implementation of non-critical functionality.

Ashmore et al. present a process model for ML components in critical systems, consisting of four phases: Data Management, Model Learning, Model Verification, and Model Deployment [Ashmore, R.]. For each phase in the model, they define the assurance-related desiderata (i.e., objectives) and discuss how state-of-the-art methods may contribute to the achievement of those desiderata. The work presented in this paper is complementary to the work of Ashmore et al. First, the AI-Blueprint for DNNs elaborates more on the validation and verification of AI components, having separate phases for design verification, verification of functional requirements, and validation of the AI component w.r.t. trustworthiness requirements.

Conclusions and Future Work

This paper presented a process model for DNNs as part of a framework for the development of trustworthy autonomous/cognitive systems, regulated in the upcoming VDE-AR-E 2842-61 standard. This work, however, is still in its early phases. As future work, recommendations for specific metrics and methods, advertised along the DNN blueprint, should be established, based on the TPL levels assigned to the AI element. Also, the current research status regarding the feasibility or performance of these methods should be investigated, to eliminate those which cannot be used while developing industry-size DNNs. Next, further research on how to calculate the value for the newly introduced UCI concept is necessary. Finally, having the concept of the AI-Blueprint, new blueprints, such as for reinforcement learning or neural symbolic integration, could be derived.

References

Putzer, H.J. and Wozniak, E., 2020. A Structured Approach to Trustworthy Autonomous/Cognitive Systems. arXiv preprint arXiv:2002.08210.
ISO 26262 Road vehicles – Functional safety, 2018.
VDE-AR-E 2842-61 – Design and Trustworthiness of autonomous/cognitive systems, 2020.
Esteban, C., Hyland, S.L. and Rätsch, G., 2017. Real-valued (medical) time series generation with recurrent conditional GANs. arXiv preprint arXiv:1706.02633.
Vernekar, S., Gaurav, A., Abdelzad, V., Denouden, T., Salay, R. and Czarnecki, K., 2019. Out-of-distribution detection in classifiers via generation. arXiv preprint arXiv:1910.04241.
Rodvold, D.M., 1999, July. A software development process
worthiness requirements. Second, instead of proposing a           model for artificial neural networks in critical applications.
general AI process model, we advocate the need for both a         In IJCNN'99. International Joint Conference on Neural Networks.
higher-level template for process models guiding the devel-       Proceedings (Cat. No. 99CH36339) (Vol. 5, pp. 3317-3322).
opment of AI components (i.e., AI-Blueprint), and more            IEEE.
concrete process models for particular AI technologies (e.g.,     Amershi, S., Begel, A., Bird, C., DeLine, R., Gall, H., Kamar, E.,
                                                                  Nagappan, N., Nushi, B. and Zimmermann, T., 2019, May. Soft-
AI-Blueprint for DNNs). Third, we discuss the process mod-
                                                                  ware engineering for machine learning: A case study. In 2019
els in the context of the overall system lifecycle.               IEEE/ACM 41st International Conference on Software Engineer-
Toreini et. al. examine the qualities technologies should         ing: Software Engineering in Practice (ICSE-SEIP) (pp. 291-300).
have to support trust in AI-based systems, but from the per-      IEEE.
spective of social sciences [Toreini, E]. They present an in-     Ashmore, R., Calinescu, R. and Paterson, C., 2019. Assuring the
teresting, but abstract machine learning pipeline, whose          machine learning lifecycle: Desiderata, methods, and challenges.
phases could be aligned to those in the AI-Blueprint for          arXiv preprint arXiv:1905.04223.
DNNs. Nevertheless, they do not offer detailed description        Toreini, E., Aitken, M., Coopamootoo, K., Elliott, K., Zelaya, C.G.
                                                                  and van Moorsel, A., 2020, January. The relationship between trust
of each of the phases, neither they deliberate on specific
                                                                  in AI and trustworthy machine learning technologies. In Proceed-
M&M-s and how they shall be used to increase the confi-           ings of the 2020 Conference on Fairness, Accountability, and
dence in trustworthy solution.                                    Transparency (pp. 272-283).
                                                                  Rushby, J., 2009, November. Software verification and system as-
                                                                  surance. In 2009 Seventh IEEE International Conference on Soft-
          Conclusions and Future Work                             ware Engineering and Formal Methods (pp. 3-10). IEEE.
This paper presented the concept of AI-Blueprint and an ex-       Zhao, X., Robu, V., Flynn, D., Salako, K. and Strigini, L., 2019,
                                                                  October. Assessing the safety and reliability of autonomous vehi-
ample of how to use this blueprint for tailoring a process        cles from road testing. In 2019 IEEE 30th International Sympo-
model for a certain AI technology (i.e., DNNs), with the          sium on Software Reliability Engineering (ISSRE) (pp. 13-23).
scope of supporting trustworthiness assurance. We also dis-       IEEE.
cussed how the proposed AI-Blueprint fits in an overall