=Paper=
{{Paper
|id=Vol-3672/RE4AI-paper1
|storemode=property
|title=Documentation of non-functional requirements for systems with machine learning
components
|pdfUrl=https://ceur-ws.org/Vol-3672/RE4AI-paper1.pdf
|volume=Vol-3672
|authors=Elma Bajraktari,Thomas Krause,Christian Kücherer
|dblpUrl=https://dblp.org/rec/conf/refsq/BajraktariKK24
}}
==Documentation of non-functional requirements for systems with machine learning components==
Documentation of Non-Functional Requirements for
Systems with Machine Learning Components
Elma Bajraktari¹, Thomas Krause² and Christian Kücherer³
¹ adesso SE, Stockholmer Platz 1, 70173 Stuttgart, Germany
² Serapion GmbH, Schäufeleinstr. 7, 80687 München, Germany
³ Reutlingen University, 72762 Reutlingen, Germany
Abstract
[Context and motivation] Many of today's systems use artificial intelligence (AI), of which machine learning (ML) is a subfield. Requirements engineering (RE) addresses the needs of the stakeholders in systems development. In particular, systems with ML components require specific non-functional requirements (NFRs) to define ML-relevant details, such as quality aspects of training datasets, retrainability of ML models or specifics of the ML training pipeline. [Problem] The practical application of RE techniques to systems with ML components is not yet completely understood. It is not clear which techniques for the elicitation and documentation of requirements can be used efficiently for ML-based systems. [Ideas and results] Based on a systematic mapping study, we identify 58 NFRs used in studies to describe particular ML requirements. Through an online survey and expert interviews, we identified 30 NFRs that need to be considered in particular for systems with ML components. For the documentation of the highly relevant NFRs, a template was designed, evaluated and optimized in two IT companies. This template helps to ensure consistent documentation of the NFRs. [Contribution] Based on the systematic mapping study, the online survey and the expert interviews, we provide a list of relevant NFRs and a template for documenting the NFRs for systems with ML components. We validated the proposed template using a real-world case in the context of two IT companies and several software projects. The evaluation shows an increased completeness of requirements.
Keywords
Requirements elicitation and documentation, machine learning, non-functional requirements
1. Introduction
Requirements Engineering (RE) is a process used in the development of software-based systems
to address the needs of stakeholders. During the RE process, requirements are elicited, docu-
mented, validated, and managed. A requirement is a statement that reflects the needs of the
stakeholder, such as the capabilities or characteristics that the software to be developed must
have [1, 2]. A distinction is made between functional and non-functional requirements (NFRs).
Functional requirements describe functionalities that must be provided by a system. NFRs are
understood to be quality requirements on the one hand and constraints on the other [1, 3].
_________________________
In: D. Mendez, A. Moreira, J. Horkoff, T. Weyer, M. Daneva, M. Unterkalmsteiner, S. Bühne, J. Hehn, B. Penzenstadler,
N. Condori-Fernández, O. Dieste, R. Guizzardi, K. M. Habibullah, A. Perini, A. Susi, S. Abualhaija, C. Arora, D. Dell’Anna,
A. Ferrari, S. Ghanavati, F. Dalpiaz, J. Steghöfer, A. Rachmann, J. Gulden, A. Müller, M. Beck, D. Birkmeier, A. Herr-
mann, P. Mennig, K. Schneider. Joint Proceedings of REFSQ-2024 Workshops, Doctoral Symposium, Posters & Tools Track,
and Education and Training Track. Co-located with REFSQ 2024. Winterthur, Switzerland, April 8, 2024.
elma.bajraktari@adesso.de (E. Bajraktari); thomas.krause@serapion.net (T. Krause); christian.kuecherer@reutlingen-
university.de (C. Kücherer);
0009-0002-2340-502X (T. Krause) 0000-0001-5608-482X (C. Kücherer)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
CEUR
http://ceur-ws.org
Workshop ISSN 1613-0073
Proceedings
In this paper, only NFRs are considered. Quality requirements describe quality criteria that the
system must meet, such as performance and availability, and constraints describe conditions
and restrictions that can influence the development of the system, such as laws, regulations or
standards [1, 3]. According to Ahmad et al. [3], RE has been sufficiently explored in the devel-
opment of classical software-based systems, and it is clear which techniques can be used in RE
activities. Recent software systems increasingly include artificial intelligence (AI) and machine learning (ML) components [3]. Kreutzer and Sirrenberg [4] describe AI as an overarching term for systems that automatically generate intelligent solutions to a problem. ML is a subfield of AI in which self-learning algorithms are used. Such systems are able to learn and solve problems on their own without a human being programming them to do so [4, 5]. Unlike classical software-based systems, systems with ML components are probabilistic [6]. According to Ahmad et al. [3], the use of RE in the development of systems with AI or ML components has not been sufficiently explored. This statement is confirmed by Yoshioka et al. [6]
and Villamizar et al. [7] in that research regarding techniques of RE in the development of ML-
based systems is insufficient. According to Ahmad et al. [3], new techniques are needed for RE
activities that can be used in the development of systems with AI or ML components. According
to Rupp and the SOPHISTs [2], the specification of ML-based systems differs from classical
systems in that this type of system focuses on the quality of the ML model used as well as the
training, validation, and test data. Greater emphasis must therefore be placed on NFRs in ML-based systems [2]. An ML model is built by training a selected ML algorithm on training data to solve a particular problem [4, 5, 8]. Validation and test data are then used to check the quality of the ML model before it is deployed [9]. The quality aspects of NFRs that must be taken into account
when developing classic software-based systems are defined and categorized in ISO/IEC 25010
[10]. This standard does not yet explicitly consider quality criteria related to systems with ML
components. Quality criteria should not only refer to the end product, but also to the previous
aspects such as the training data used, or the ML model used.
In summary, there is a need for ML-specific requirements documentation techniques. This need is also confirmed by employees of adesso SE. With over 9,000 employees and annual sales of EUR 900.3 million in 2022, adesso Group is one of Germany's largest IT service providers. Due to the lack of best practices in this area, it is still unclear which documentation techniques are suitable for systems with ML components. We aim to provide a template that directs requirements analysts to ML specifics and improves the completeness of requirements by specifying ML quality requirements. This approach promises to improve project success and to help save development costs. The goal of this paper is twofold: (1) to identify the NFRs of systems with ML components that need to be specified in greater detail when applying ML and (2) to conceptualize a template for the documentation of the most relevant NFRs. This paper answers the following research questions (RQ):
• RQ1: How are requirements for systems with ML components documented?
• RQ2: Which NFRs are specific to systems with ML components?
• RQ3: Which documentation styles for systems with ML components are most used in
industrial projects at adesso SE?
• RQ4: Which problems do current elicitation and documentation techniques have for ML
components in industrial projects at adesso SE?
• RQ5: What is the template design to document NFRs for ML systems?
2. Related Works
The following previous works are relevant. The systematic mapping study of Villamizar et al. [7] covers RE for ML-based systems. They provide an overview of 35 studies (2018-2020), starting from the results of a Scopus search and snowballing. They investigated NFRs specific to ML-based systems; the frequency of each NFR in the primary studies is given in brackets.
The ML specific requirements are Usability (1), Scalability (1), Modularity (1), Robustness (1),
Autonomy (1), Uncertainty (1), Suitability (1), Accuracy (2), Ethics (2), Accountability (2), Test-
ability (2), Legal requirements (2), Maintainability (3), Performance (3), Safety (4), Reliability (4),
Transparency (5), Fairness (5), Data quality (5), Privacy (6), Explainability (6) and Security (6).
The systematic literature review (SLR) of Yoshioka et al. [6] covers 32 papers (2017-2021), in-
vestigating which techniques are used to document ML system requirements with concrete ex-
amples. They found GORE (i* and KAOS) in 10 cases, UML in 7 and safety cases in 1 case. The
ML specific requirements are Overfitting (2), Fairness (2), dataset requirements (6), Robustness
(6), Accuracy (7), Explainability, Transparency and Accountability (9).
The literature review of Ahmad et al. [3] covers RE for AI- and ML-based systems. They analyzed 27 studies (2011-2020). The authors focused on documentation techniques for the requirements of systems with AI or ML components, with an emphasis on their specific NFRs. The GORE notations FLAGS, CORE, GRL, i* and GORE-MLOps are used most often (5), along with UML and SysML notations at the same frequency (5) and conceptual models (CM). They further summarized ML-relevant NFRs: Transparency, Trust, Privacy, Safety, Reliability, Security, Fairness, Explainability, Ethics, Robustness, Accuracy, Uncertainty, Data quality, Testability, Legal requirements and Availability of training, validation and test data.
The literature review of Gjorgjevikj et al. [54] presents a mixed-method study on the use of requirements for ML in projects. They confirm that RE activities are crucial to the ML development process. Requirements should cover ML-specific qualities, namely Interpretability, Fairness, Robustness, Security, Privacy and Safety, which also occur in our research. Most importantly, they state that future research should focus on adjusting the RE activities to fit ML development. The template described in this study contributes further research in this direction.
The existing SLRs show documentation forms for ML components (RQ1) and NFRs specific to ML components (RQ2) until 2021. We extend this view of related works to June 2022.
3. State of the Art
In this article, we perform a systematic mapping study according to the principles of Petersen [53] to provide an overview of the state of the art regarding the use of NFRs for ML systems, answering RQ1 and RQ2. The mapping study, particularly the data gathering, was performed by the first author and reviewed by the third author. We used the principles of Kitchenham and Charters [11] for study selection and search term construction. Data was gathered in June 2022.
3.1. Method of Literature Review
As RQ1 and RQ2 are closely related, we use one search term for the literature acquisition:
(("requirements engineering") AND (documentation OR specification OR notation OR "modeling language") AND "machine learning" AND (software OR application OR system)).
To keep the search broad and achieve a high level of completeness around machine learning, we did not search for additional ML sub-terms. We used the following databases: (i) SpringerLink¹, as it has a broad basis in RE and AI, (ii) ScienceDirect², which focuses on the engineering of AI systems and architectures, and (iii) IEEE Xplore³ and (iv) ACM Digital Library⁴, which are engineering databases for studies with an emphasis on AI applications. Inclusion (I) and exclusion (E) criteria are shown in Table 1. I1 selects studies published after the SLRs discussed in the related works. I2 ensures that studies meet a minimal scientific standard; the selected databases list peer-reviewed articles only. I3 includes papers that contribute to the topic of this article: a paper was selected if a review showed that it contributes to documentation techniques or NFRs for systems with ML components. E1 filters out studies that use ML techniques to support RE: if a paper's main focus was on approaches that support requirements activities by means of AI or ML, the paper was excluded, as this was outside our scope. This criterion was validated by a detailed review of the article. E2 excludes SLRs and mapping studies, which we already addressed as related works. E3 deselects similar studies from the same authorship.
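The boolean structure of the search term can be illustrated with a minimal Python sketch. The function names `contains` and `matches_search_term` are hypothetical, and plain case-insensitive substring matching is an assumption made for illustration; the search engines of the literature databases apply their own matching rules.

```python
def contains(text: str, phrase: str) -> bool:
    """Case-insensitive substring match (an assumed simplification)."""
    return phrase.lower() in text.lower()


def matches_search_term(text: str) -> bool:
    """Evaluate the AND/OR structure of the search term against a text."""
    documentation_terms = ("documentation", "specification", "notation", "modeling language")
    system_terms = ("software", "application", "system")
    return (
        contains(text, "requirements engineering")
        and any(contains(text, term) for term in documentation_terms)
        and contains(text, "machine learning")
        and any(contains(text, term) for term in system_terms)
    )
```

A title or abstract only matches if all four AND-groups are satisfied; within the two OR-groups, a single term is sufficient.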
Table 1. In- and exclusion criteria
{| class="wikitable"
! ID !! Criterion
|-
| I1 || publication date Sept. 2021 to June 2022
|-
| I2 || English language, peer reviewed
|-
| I3 || contributes to RQ1 or RQ2
|-
| E1 || Literature focusing on ML for RE
|-
| E2 || systematic literature reviews or systematic mapping studies
|-
| E3 || duplicates
|}
Fig. 1. Study selection chart
Study Selection. Fig. 1 shows the four phases of study selection. In phase 1, we queried the selected databases with the presented search term and applied I1 and I2, resulting in 1233 publications. In phase 2, the titles, keywords and abstracts were assessed using I3 and E1. In phase 3, we reviewed the content and filtered according to I3 and E2. As there were no duplicates (E3), we ended up with 15 relevant publications, shown in Table 3. No relevant publication could be identified from ScienceDirect: the contents of the publications listed there did not cover documentation techniques or NFRs for systems with ML components. The references of the literature from phase 2 onwards are available for download in the last section.
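The phase-wise application of the criteria from Table 1 can be sketched as a simple filter chain. This is an illustrative sketch only: the `Publication` record and its fields are hypothetical, the boolean fields stand for manual reviewer judgments, the exact day boundaries of the I1 window are assumed, and duplicate detection (E3) is approximated by a normalized title.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Publication:
    """Hypothetical record; boolean fields represent manual reviewer judgments."""
    title: str
    published: date
    english: bool
    peer_reviewed: bool
    contributes_to_rq: bool  # I3: contributes to RQ1 or RQ2
    ml_for_re: bool          # E1: focuses on ML for RE
    secondary_study: bool    # E2: SLR or systematic mapping study


def select_studies(candidates: list[Publication]) -> list[Publication]:
    # Phase 1: I1 (assumed window 2021-09-01 to 2022-06-30) and I2.
    start, end = date(2021, 9, 1), date(2022, 6, 30)
    phase1 = [p for p in candidates
              if start <= p.published <= end and p.english and p.peer_reviewed]
    # Phase 2: screening of title, keywords and abstract with I3 and E1.
    phase2 = [p for p in phase1 if p.contributes_to_rq and not p.ml_for_re]
    # Phase 3: content review with I3 and E2.
    phase3 = [p for p in phase2 if p.contributes_to_rq and not p.secondary_study]
    # E3: drop duplicates, approximated here by a normalized title.
    seen: set[str] = set()
    selected = []
    for p in phase3:
        key = p.title.strip().lower()
        if key not in seen:
            seen.add(key)
            selected.append(p)
    return selected
```

In the study itself, I3, E1 and E2 were assessed by reading titles, abstracts and full texts; the boolean fields only record the outcome of that manual assessment.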
3.2. Result of the Literature Review
Documentation Techniques of Systems with ML Components.
Four of the 15 selected studies cover techniques for requirements documentation. The other 11 studies cover NFRs of systems with ML components. Zaidi [12] shows the use of conceptual models (CM) in various phases of ML to document the requirements and goals of the project. The models use i*, UML, Business Process Model and Notation (BPMN) [13] and Building Information Modeling (BIM). The latter is a domain-specific modeling notation for buildings [14]. Tun et al. [15] found several requirements documentation notations: An AI Project Canvas is used to document decision-making considerations and to capture the impact on organizational structure due to
¹ https://link.springer.com/
² https://www.sciencedirect.com/
³ https://ieeexplore.ieee.org/Xplore/home.jsp
⁴ https://dl.acm.org/
the ML components. Safety Cases and System-Theoretic Accident Model and Processes (STAMP) are used for safety analysis, modelling possible accidents and providing evidence for the system's suitability [16]. The architectural design of the system is described with SysML [17] and the functional requirements with a goal model (GORE) notation.
Khan et al. [18] investigate the use of NFRs for ML systems. They propose an extended SysML requirements diagram and a GORE-MLOps model that addresses uncertainty and unpredictability in ML systems during RE. Husen et al. [19] propose a framework for safety-critical ML systems consisting of the documentation techniques AI project canvas, ML Canvas, KAOS, UML Component Diagram, STAMP/Systems Theoretic Process Analysis (STPA) and Safety Cases. The AI project canvas details early business requirements from the system specification. KAOS is used to document functional requirements, which are detailed into an architectural component model. Safety cases are used to define countermeasures for the risks identified through STAMP/STPA, where STAMP is a method for risk analysis [19, 20]. Table 2 summarizes these techniques.
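Table 2. Identified documentation techniques for ML requirements
{| class="wikitable"
! Documentation technique / Study !! Zaidi [12] !! Tun et al. [15] !! Khan et al. [18] !! Husen et al. [19]
|-
| UML, SysML || CM || X || SysML Req. diagr. || X
|-
| BPMN || CM || || ||
|-
| BIM || CM || || ||
|-
| AI/ML Project Canvas || || X || || X
|-
| Safety Cases || || X || || X
|-
| Goal Model || i* || X || GORE-MLOps || KAOS
|-
| STAMP || || X || || X
|}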
In summary, GORE notations and UML or SysML diagrams are the most frequently used techniques for ML requirements. Three studies provide details about GORE: i* [12], KAOS [19] and GORE-MLOps [18]. All four works use UML/SysML diagrams; two papers detail these as component diagrams [19] and extended SysML requirements diagrams [18]. For the documentation of NFRs, GORE [15], GORE-MLOps and an extension of the SysML requirements diagram [18] are mentioned.
NFRs of Systems with ML Components
Within 14 of the 15 selected studies, we identified a total of 54 NFRs for ML components, as shown in the extraction table available for download as open research data (see last section). The related works contained four more NFRs that were not included in the 14 selected studies, which we added to the results: (i) Autonomy of ML algorithm, (ii) Suitability, (iii) Legal requirements and (iv) Dataset requirements. As there is no common set of ML-specific NFRs yet, several of the identified NFRs use synonymous terms or describe similar phenomena. We therefore consolidated the 58 identified NFRs into 33 relevant NFRs with the following descriptions. An overview of these results is given in Table 3.
• NFR-1 Suitability describes the appropriateness of using ML and the extent to which the ML solves the given problem [7, 21].
• NFR-2 Explainability and Interpretability describes mechanisms that allow users to comprehend the system's results [22–25].
• NFR-3 Justifiability: users require the system's decisions to be rational [26].
• NFR-4 Transparency describes details about ML algorithms and training and test data that are needed to validate the system's decisions [22, 24, 26, 27].
• NFR-5 Traceability: the source of training and validation data, artifacts and processes must be documented for users and developers [22, 28].
• NFR-6 Fairness describes that systems must respect human rights, equal rights and equal opportunity for all users, and follow democratic principles. The results of AI are supposed to be unbiased and free of discrimination [22, 25, 29, 30].
• NFR-7 Safety describes requirements that avoid risks to humans or the environment [23].
• NFR-8 Trust describes users' expectations regarding the reliability of system decisions and the correctness of results [10, 24].
• NFR-9 Efficiency of ML algorithm describes quality aspects of the ML training and prediction algorithms [26].
• NFR-10 Performance of ML model describes the expectations regarding the performance and correctness of the ML model for prediction and the resources needed to obtain these predictions [22]. This incorporates accuracy as an internal performance metric for the prediction quality [23, 31, 32] and the correctness of the decisions or prediction results [22, 23].
• NFR-11 Latency of ML model defines the acceptable time between data acquisition and the result of the prediction [33].
• NFR-12 Security defines the access to used and created data (sets) [22, 23].
• NFR-13 Privacy considers the right to privacy with regard to confidentiality and protection of data [22, 23].
• NFR-14 Integrity of data defines quality attributes of training, validation and test data, e.g. correctness, preprocessing and data integrity [22, 23, 27].
• NFR-15 Dataset requirements capture requirements for data records such as topic, domain, context and origin of data [34], and quantity of data [9, 27].
• NFR-16 Accountability describes the extent to which the system takes ownership of and responsibility for its decisions [26, 35].
• NFR-17 Reliability captures the expectations regarding the functioning of the system [23, 25].
• NFR-18 Reproducibility and Repeatability address the necessity that predictions be identical for multiple requests with the same data [23, 26, 36]. This is synonymously referred to as consistency [37].
• NFR-19 Fault tolerance describes the expected resilience to incorrect data or partial system failures in order to avoid complete failure [38], sometimes referred to as robustness [22, 39].
• NFR-20 Autonomy of ML algorithm expresses the independence or encapsulation of ML algorithms. Retraining an ML subsystem must be possible without depending on the application's development [4, 7].
• NFR-21 Maintainability describes the needs regarding further development, such as extensions or system evolution [10, 40].
• NFR-22 Modularity defines the required level of maintainability: a system consists of several modules that work independently but collectively provide the necessary functionality [10, 40].
• NFR-23 Reusability describes details about the possibility of reusing parts or components of the system in other contexts or systems. In ML contexts, this could be the reuse of ML models or data sets [40]. This NFR also covers domain adaptation, where labelled data are used in transfer learning scenarios to create models for other domains [41].
• NFR-24 Modifiability describes the extent to which a module can be changed without affecting the quality of the ML module [10, 40].
• NFR-25 Retrainability describes that ML models must be retrainable with data other than the initial data [27].
• NFR-26 Testability defines how ML models and their prediction components can be tested and what the resulting quality of decisions can be [40].
• NFR-27 Usability describes the same aspects as in ISO/IEC 25010, with a focus on ML models and the presentation of their decisions or predictions to users [10, 27].
• NFR-28 Interoperability describes the requirements that allow communication and data exchange with other systems [10, 42].
• NFR-29 Portability defines to what extent ML models are supposed to be transferable to other contexts, e.g. from a classification of blood diseases for cancer patients to non-cancer patients [10, 32].
• NFR-30 Adaptability (portability in ISO/IEC 25010) describes how the ML model can be used in other contexts such as operating systems or system environments [10, 27].
• NFR-31 Scalability of ML pipeline defines requirements for processing tools with regard to data volumes, large data sets and runtime issues during training [37, 43].
• NFR-32 Complexity of ML model describes the number of features the ML model can handle. Too high a model complexity leads to overfitting, and too low a complexity leads to underfitting [44].
• NFR-33 Legal requirements covers the specification of regulatory needs such as standards, acts and laws [1, 45].
In addition to these NFRs, further requirements were mentioned in the primary studies. We assigned these requirements to the NFRs as follows: Ethics is part of NFR-33; Uncertainty is part of NFR-17; data issues, covering the cleaning of datasets, are details of the dataset requirements in NFR-15; revision and transition are part of Maintainability (NFR-21). Completeness is discussed in Habibullah and Horkoff [37] but is not clearly defined. Flexibility covers the flexibility of the ML pipeline and is categorized as Reusability (NFR-23).
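The consolidation of synonymous or overlapping terms into the NFRs above can be sketched as a lookup table. The mapping below contains only the assignments named in this section; the names `CANONICAL_NFR` and `consolidate` are hypothetical, and the complete mapping of all 58 terms is part of the extraction table, not of this sketch.

```python
# Term-to-NFR assignments stated in Section 3.2 (keys are lower-cased).
CANONICAL_NFR = {
    "accuracy": "Performance of ML model",             # NFR-10
    "consistency": "Reproducibility and Repeatability",  # NFR-18
    "robustness": "Fault tolerance",                   # NFR-19
    "ethics": "Legal requirements",                    # NFR-33
    "uncertainty": "Reliability",                      # NFR-17
    "data issues": "Dataset requirements",             # NFR-15
    "revision": "Maintainability",                     # NFR-21
    "transition": "Maintainability",                   # NFR-21
    "flexibility": "Reusability",                      # NFR-23
}


def consolidate(terms: list[str]) -> list[str]:
    """Map each term to its consolidated NFR and drop repeats, keeping order."""
    mapped = (CANONICAL_NFR.get(term.strip().lower(), term.strip()) for term in terms)
    return list(dict.fromkeys(mapped))
```

Terms without an entry in the mapping are passed through unchanged, so an NFR name that is already consolidated stays as it is.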
4. Handling NFRs in Systems with ML Components in Practice
An online survey (12 persons) and expert interviews (4 experts) with employees were conducted
at adesso SE, to identify the special needs of the documentation of NFRs for systems with ML
components. We used this case to prove the relevance of the NFRs identified by the literature
review and to identify further NFRs (RQ2). Moreover, we wanted to understand the documen-
tation techniques used by adesso SE (RQ3) and related problems of the documentation (RQ4).
4.1. Relevant NFRs of Systems with ML Components
After the analysis of the systematic mapping study, the online survey and the expert interviews, 30 of the 33 NFRs listed in Table 3 were identified as especially relevant for ML components; they provide the final answer to RQ2. The NFRs marked in blue in Table 3 were classified as relevant both in the literature research and in the online survey and expert interviews at adesso SE. The NFRs marked in grey were additionally classified as relevant in the online survey and in the expert interviews. The online survey questions are available for download (see last section). Based on the online survey, the following NFRs were rated as the most relevant (all participants gave a positive response): Suitability, Integrity of data, Reliability, Latency of ML model, Testability, Explainability and Interpretability, Performance of ML model, Security and Retrainability. Through the expert interviews, it was determined that the NFRs Accountability and Transparency must be considered relevant, even though they were not classified as relevant by the online survey. The expert interviews showed that Integrity of data always appears among the top 2 most important NFRs. This is justified by the fact that data is the basic building block of ML and its integrity is therefore necessary. The questions of the expert interviews are available for download (see last section).
Table 3. Frequency of NFRs from the systematic mapping study. The last column shows the number of people who have classified the NFR as relevant and the number of persons who gave an answer (n) in the online survey.
Primary studies (one column each): Llinas et al. [24], de Oliveira Carvalho et al. [25], Husen et al. [19], Zhang et al. [48], Nahar et al. [27], Habibullah and Horkoff [37], Khan et al. [18], Tun et al. [15], Hutchinson et al., McDermott et al., Pankiewicz et al., Razaulla et al., Yurrita et al. and Russell et al. ([22], [26], [32], [42], [46], [47]).
{| class="wikitable"
! ID !! NFR !! ∑ Primary Studies !! Importance through online survey (#relevant/n)
|-
| NFR-1 || Suitability || 0 || 11/11
|-
| NFR-2 || Explainability and Interpretability || 11 || 9/9
|-
| NFR-3 || Justifiability || 3 || 7/8
|-
| NFR-4 || Transparency || 9 || 3/8
|-
| NFR-5 || Traceability || 3 || 6/9
|-
| NFR-6 || Fairness || 10 || 6/9
|-
| NFR-7 || Safety || 9 || 9/11
|-
| NFR-8 || Trust || 8 || 8/11
|-
| NFR-9 || Efficiency of ML algorithm || 2 || 7/10
|-
| NFR-10 || Performance of ML model || 11 || 9/9
|-
| NFR-11 || Latency of ML model || 1 || 10/10
|-
| NFR-12 || Security || 7 || 9/9
|-
| NFR-13 || Privacy || 7 || 10/12
|-
| NFR-14 || Integrity of data || 5 || 11/11
|-
| NFR-15 || Dataset requirements || 0 || 4/6
|-
| NFR-16 || Accountability || 5 || 4/8
|-
| NFR-17 || Reliability || 7 || 11/11
|-
| NFR-18 || Reproducibility and Repeatability || 6 || 8/9
|-
| NFR-19 || Fault tolerance || 7 || 9/10
|-
| NFR-20 || Autonomy of ML algorithm || 0 || 3/9
|-
| NFR-21 || Maintainability || 3 || 8/9
|-
| NFR-22 || Modularity || 1 || 6/7
|-
| NFR-23 || Reusability || 2 || 4/6
|-
| NFR-24 || Modifiability || 3 || 6/7
|-
| NFR-25 || Retrainability || 2 || 7/7
|-
| NFR-26 || Testability || 5 || 10/10
|-
| NFR-27 || Usability || 3 || 6/9
|-
| NFR-28 || Interoperability || 2 || 5/6
|-
| NFR-29 || Portability || 2 || 4/9
|-
| NFR-30 || Adaptability || 2 || 5/8
|-
| NFR-31 || Scalability of ML pipeline || 3 || 6/8
|-
| NFR-32 || Complexity of ML model || 6 || 1/8
|-
| NFR-33 || Legal requirements || 0 || 8/9
|-
| ∑ || per primary study: 8 3 2 8 10 10 12 3 9 19 7 25 15 14 || - || -
|}
Table 3 shows that the NFRs occur with different frequencies in the articles. In particular, the work of Habibullah and Horkoff [37] covers the most NFRs, as they performed a comprehensive qualitative interview study in which NFRs for ML were explored in detail. Part of the interview study was to identify NFRs that are more or less important in industry. The Importance through online survey column shows how many people classified the respective NFR as relevant in the online survey. For each NFR, the participants were asked to state whether it is a relevant NFR for systems with ML components. The participants could respond to the statement as follows: strongly disagree, disagree, neither, agree and strongly agree. When evaluating the relevance of the NFRs, answers of Neither were excluded as neutral. An NFR was categorized as relevant for systems with ML components if more than 50% of the participants in the online survey agreed with the statement. Only 14 of the 15 papers identified in Fig. 1 are listed in this table, as the paper by Zaidi [12] only deals with documentation techniques and therefore makes no reference to the NFRs of systems with ML components. This table answers RQ2.
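The relevance rule can be reproduced from the survey column of Table 3 with a short Python sketch. The names `SURVEY`, `EXPERT_ADDITIONS` and `relevant_nfrs` are hypothetical; the sketch assumes that n already excludes the neutral answers and that Accountability and Transparency are added on the basis of the expert interviews, as described in Section 4.1.

```python
# (#relevant, n) per NFR, taken from the last column of Table 3.
SURVEY = {
    "Suitability": (11, 11), "Explainability and Interpretability": (9, 9),
    "Justifiability": (7, 8), "Transparency": (3, 8), "Traceability": (6, 9),
    "Fairness": (6, 9), "Safety": (9, 11), "Trust": (8, 11),
    "Efficiency of ML algorithm": (7, 10), "Performance of ML model": (9, 9),
    "Latency of ML model": (10, 10), "Security": (9, 9), "Privacy": (10, 12),
    "Integrity of data": (11, 11), "Dataset requirements": (4, 6),
    "Accountability": (4, 8), "Reliability": (11, 11),
    "Reproducibility and Repeatability": (8, 9), "Fault tolerance": (9, 10),
    "Autonomy of ML algorithm": (3, 9), "Maintainability": (8, 9),
    "Modularity": (6, 7), "Reusability": (4, 6), "Modifiability": (6, 7),
    "Retrainability": (7, 7), "Testability": (10, 10), "Usability": (6, 9),
    "Interoperability": (5, 6), "Portability": (4, 9), "Adaptability": (5, 8),
    "Scalability of ML pipeline": (6, 8), "Complexity of ML model": (1, 8),
    "Legal requirements": (8, 9),
}

# NFRs considered relevant because of the expert interviews (Section 4.1).
EXPERT_ADDITIONS = {"Accountability", "Transparency"}


def relevant_nfrs() -> list[str]:
    """NFRs with more than 50% agreement, plus the expert additions."""
    return [name for name, (agreed, n) in SURVEY.items()
            if 2 * agreed > n or name in EXPERT_ADDITIONS]


def unanimous_nfrs() -> list[str]:
    """NFRs for which every answering participant agreed."""
    return [name for name, (agreed, n) in SURVEY.items() if agreed == n]
```

Under these assumptions, `relevant_nfrs()` returns 30 NFRs, which matches the number reported in Section 4.1, and `unanimous_nfrs()` returns the nine NFRs rated as most relevant there.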
4.2. Documentation Techniques for Systems with ML Components
The online survey at adesso SE showed that employees use the following techniques to document NFRs (in order of occurrence): user stories, sentence templates, to-be concepts, UML diagrams (e.g. use case diagrams), entity relationship models (ERM), tests, service level agreements (SLA), authorization concepts, mockups and usability standards. The first two techniques were also identified as appropriate by online survey participants. These findings answer RQ3: using user stories and sentence templates, NFRs can be documented in natural language. The expert interviews confirm this finding.
4.3. Problems with the Documentation of NFRs
Through the online survey and expert interviews, the following problems in documenting NFRs were identified, which provide the answer to RQ4: (1) no consistent, appropriate or proper format in the documentation, (2) definition problems in certain situations, (3) insufficient specification and thus missing information, which later led to problems during implementation, and (4) missing technical aspects in the requirements documentation.
4.4. Design of the Template for the Documentation of NFRs
Based on project examples, the specifics of documenting NFRs for systems with ML components were examined in the expert interviews. Table 4 shows the roles, skills and work experience in years of the experts. The projects did not use a standard template for the documentation of NFRs. For example, in one project a combination of user story and key results was used to document the requirements, and in another project the requirements were documented without using an appropriate sentence template or similar. However, each requirement was prioritized using the MoSCoW prioritization technique [1]. Based on the results of the expert interviews and a literature review, a template was designed that can be used to document NFRs in a uniform manner as a natural language documentation technique. A natural language documentation technique was chosen because such techniques are used most frequently at adesso SE. This does not exclude the possibility that the specified template can also be supplemented by model-based documentation techniques. After the initial conception, the template was evaluated with a further five experts in semi-structured expert interviews at adesso SE. The template used for the evaluation is available for download in the last section.
Table 4. Details of the experts from adesso SE, data & analytics division for the interview
{| class="wikitable"
! Expert !! Professional Role !! Skills !! Work experience
|-
| Person 1 || Data Scientist and ML-Engineer || high || five years
|-
| Person 2 || Architect, Consultant, Requirements Engineer and Project Manager || high || nine years
|-
| Person 3 || Data Scientist || high || ten years
|-
| Person 4 || AI Consultant || high || n/a
|}
The expert interviews showed beneficial aspects of the template and some need for optimization. The experts were asked whether they would recommend the first conception of the template to colleagues, measured by the Net Promoter Score (NPS). NPS is a metric that categorizes people into three groups (promoters, passives and detractors) based on a question that is answered on a scale of 0-10. Only the numbers of detractors and promoters are required to calculate the NPS, as the percentage of detractors is subtracted from the percentage of promoters [49]. Out of five experts, two have a neutral attitude (passives), one has a critical attitude (detractor) and two have a positive attitude (promoters) towards the present template. The expert with the critical attitude mentioned that s/he would rate the template much higher if the role and the benefit were taken out of the template. The benefit field is covered by the justification of priority, which is why this field does not offer any added value. Another expert agreed that the role should be removed, as it does not add any value. The benefit field and the role in the sentence template have therefore been removed, and the Other notes (optional) field was added to the template. The following NPS was determined for the first draft of the template: NPS = ((2 - 1) / 5) × 100 = 20%. According to Lee [50], 20% is a good value. The improved template can be found in Table 5 and provides the answer to RQ5. Furthermore, an example can be seen in Table 6. The template supports the management of requirements through consistent documentation and through the relation to the NFR-classes in Table 3, which is defined by the NFR-class and the object of consideration in the template. The NFR-class, for example, is relevant because different metrics must be defined in the acceptance criteria depending on the NFR-class. These metrics are not considered in more detail in this paper and would need to be analyzed and defined in further research. The NFR-classes in our template are only an example and can be extended by further NFR-classes. In addition, further research could determine whether the template could be expanded to include further elements. The template was additionally evaluated in a second company, Serapion, as presented below.
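The NPS calculation described in this section can be written down as a small Python sketch. The function names are hypothetical. The score thresholds in `classify` (9-10 promoter, 7-8 passive, 0-6 detractor) follow the common NPS convention and are an assumption here, since this section only reports the resulting group sizes.

```python
def classify(score: int) -> str:
    """Assign a 0-10 rating to an NPS group (common convention, assumed here)."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


def nps_from_counts(promoters: int, passives: int, detractors: int) -> float:
    """Percentage of promoters minus percentage of detractors."""
    total = promoters + passives + detractors
    if total == 0:
        raise ValueError("at least one respondent is required")
    return 100 * (promoters - detractors) / total


def nps_from_scores(scores: list[int]) -> float:
    """Compute the NPS directly from individual 0-10 ratings."""
    groups = [classify(score) for score in scores]
    return nps_from_counts(
        groups.count("promoter"), groups.count("passive"), groups.count("detractor")
    )
```

With the group sizes reported for the first draft of the template, `nps_from_counts(2, 2, 1)` gives 20.0. Passives only enter the calculation through the total number of respondents.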
Table 5. Customized template for the documentation of the most relevant NFRs
Identificator .
Acceptance criteria AC1: (expandable list)
NFR-class (according to Tab. 3) [Suitability, Integrity of data, Reliability, Latency of ML model, Testability, Explainability and Interpretability, Performance of ML model, Security or Retrainability]
Object of consideration