<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Requirements Formalization for Automatic Test Case Generation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Robin Gröpler</string-name>
          <email>robin.groepler@ifak.eu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viju Sudhi</string-name>
          <email>viju.sudhi@ifak.eu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emilio José Calleja García</string-name>
          <email>emiliojose.calleja@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andre Bergmann</string-name>
          <email>andre.bergmann@akka.eu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>AKKA Germany GmbH</institution>
          ,
          <addr-line>80807 München</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ifak Institut für Automation und Kommunikation e.V.</institution>
          ,
          <addr-line>39106 Magdeburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Due to the growing complexity and rapid changes of software systems, the assurance of their quality becomes increasingly difficult. Model-based testing in agile development is a way to overcome these difficulties. However, major effort is still required to create specification models from a large set of functional requirements provided in natural language. This paper presents an approach for a machine-aided requirements formalization technique based on Natural Language Processing (NLP) to be used for automatic test case generation. The goal of the presented method is to automate the process of model creation from requirements in natural language by utilizing appropriate algorithms, thus reducing cost and effort. The application of our procedure will be demonstrated using an industry example from the e-mobility domain. In this example, requirement models are generated for a charging approval system within a larger vehicle battery charging application. Additionally, existing tools for automated model synthesis and test case generation are applied to our models to evaluate whether valid test cases can be generated.</p>
      </abstract>
      <kwd-group>
        <kwd>Generation</kwd>
        <kwd>Requirements analysis</kwd>
        <kwd>natural language processing</kwd>
        <kwd>test generation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In the life cycle of a device, component or system in industrial use, a rapidly changing and
growing number of requirements and the associated increase in features and feature changes
inevitably lead to an increasing effort for verifying requirements and testing of the
implementation. To manage test complexity and reduce necessary test effort and cost, agile methods
for model-based testing have been developed [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The effectiveness of model-based testing
highly depends on the quality of the used specification model. The creation and maintenance
of well-defined specification models is therefore crucial and usually comes with high effort and
cost. This is especially true in agile development, where requirements are subject to frequent
changes.
      </p>
      <p>In this context, an approach for requirements-based testing was developed that enables
efficient test processes, see Fig. 1. Model synthesis and model-based test generation methods are
used to systematically and efficiently create a test suite that contains suitable test cases. This
approach is based on behavioral requirements that serve as input for model synthesis. The only
time-consuming manual step is the creation of requirement models from textual requirements
documents.</p>
      <p>[Figure 1: Toolchain for requirements-based testing. Functional requirements (text documents) are turned into sequence models (Req. 1, Req. 2, ..., Req. n) by Requirements Formalization (semi-automated, ReForm). The sequence models are combined into a specification model (UML state machine) by Model Synthesis (automated, ModGen). Abstract test cases are then derived by Test Case Generation (automated, TCG).]</p>
      <p>
        Recent advances in natural language processing show promising results in organizing and
identifying desired information in raw text. As a result, there is growing interest in using NLP
techniques to automate various software development activities such as test case generation. Several NLP
approaches and tools have been investigated in recent years aiming to generate test cases from
preliminary requirements documents [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. A major drawback of existing methods is the use
of controlled natural language or templates that force the requirements engineer or designer
to concentrate not only on the content but also on the syntax of the requirement. Furthermore,
those algorithms are in general not applicable to existing requirements.
      </p>
      <p>In this work, we propose a new, semi-automated technique for requirements-based model
generation that reduces human effort and supports frequent requirements changes and extensions.
The aim of our approach is to develop a method that
1) can handle an extended range of domains and formats of requirements, i.e. it is not limited
to a specific template or controlled natural language, and
2) provides enhanced but easily interpretable intermediate results in the form of a textual
and graphical representation of UML sequence diagrams.</p>
      <p>Our approach utilizes an existing NLP parser to obtain basic syntactic information about the
words and their relationship to each other. Based upon this information, several rule-based
steps are performed in order to identify relevant syntactic entities which are then mapped to
semantic entities. Finally, these entities are used to form requirement models as UML sequence
diagrams. The main contributions of this work are
1) the development of a rule-based approach based on NLP information that automates the
various steps involved in deriving requirement models, and
2) the evaluation on an industrial use case using meaningful metrics that demonstrates the
good quality of our approach.</p>
      <p>The paper is structured as follows. In Section 2, we briefly outline related work on
NLP-based requirements formalization methods. In Section 3, we present the individual steps of our
methodology for deriving requirement models from textual descriptions. The method is applied
to the battery charging approval system presented in Section 4. In Section 5, we define several
evaluation metrics and demonstrate the results of the application. Finally, a conclusion and
outlook are given in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        In order to circumvent the challenges of analyzing highly complex requirements, many authors
restrict their NLP approaches to a specific domain or a prescribed format. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the authors
propose an algorithm creating activity diagrams from requirements following a predefined
structure. They consider the SOPHIST method which performs a refinement and formalization
of structured texts by introducing text templates with a defined syntactical structure [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], a
small set of structural rules was developed to address common requirement problems including
ambiguity, complexity and vagueness. In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], requirements are expected to be written in a
controlled natural language and are supposed to be from Data-Flow Reactive Systems (DFRS).
The approach in [9] is to generate test cases from use cases or user stories, both of which
have to comply with a specified format. In [10], requirements engineers shall be supported
with formalization of use case descriptions by writing pre-conditions and post-conditions in a
predefined format from which test cases can be generated automatically. Likewise, [11] aims to
find and complete missing test cases by parsing exceptional behaviors from Javadoc comments
written in natural language, provided the documentation is in a specified template. [12] relies
on the artifacts that the programmers create while developing the product which belong to a
smaller subset of specifications.
      </p>
      <p>Even for simple syntactical structures of requirements it is still necessary to enable the
requirements engineer to review the intermediate results, i.e. the generated model artifacts, and
to adjust them where necessary. The toolchain of [13] involves eliciting requirements according
to Restricted Use Case Modeling (RUCM) specifications. This also applies to the work of [14], where
the authors attempt to generate executable test cases from a Restricted Test Case Modeling
(RTCM) language which restricts the style of writing test cases. This becomes an additional
overhead for the requirements engineers who draft formal requirements. Additionally, the users
are expected to inspect the generated OCL constraints before proceeding to test case generation.
Similarly, in [15] the authors explore the possibility of test case generation using Petri Net
simulation; however, the interpretability of Colored Petri Nets as proposed in the approach
may vary depending on the user’s level of expertise. These intermediate results may not be
easily understood by the users, and it may be cumbersome for them to fine-tune or modify the
predictions before generating reliable test cases.</p>
      <p>A notable work from the authors of [16] makes use of recursive dependency matching to
formulate test cases. Though our approach aligns with theirs in this step, we attempt to generate
test cases from a broader set of functional requirements while they restrict themselves to
user stories from which a cause-effect relationship can be learnt.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>We utilize an existing NLP parser and use a rule-based algorithm to perform the transformation
from requirements written in natural language to requirement models. Our rule set aims to
cover the rules needed to parse the input behavioral requirement satisfactorily and
extract its semantic content.</p>
      <sec id="sec-3-1">
        <title>3.1. Linguistic pre-processing</title>
        <p>The behavioral requirements are, in general, complex by nature. In order to reliably extract the
syntactic and semantic content of these requirements, a thorough linguistic pre-processing is
indispensable. For this stage, we rely on spaCy (v2.1.8) [17] - a free, open-source library for
advanced Natural Language Processing. We follow the basic NLP pipeline including tokenization,
lemmatization, part-of-speech (POS) tagging and dependency parsing in various stages of the
algorithm.</p>
        <sec id="sec-3-1-1">
          <title>3.1.1. Pronoun resolution</title>
          <p>Though the formal requirements tend to avoid first person (I, me, etc.) or second person (you,
your, etc.) pronouns, they may contain third person neutral pronouns (it, they, etc.) [18].
These pronouns are identified and resolved with the farthest subject, in line with the algorithm
proposed in [19] and [20]. Owing to the simplicity of the task, we assume there is no particular
need to use more sophisticated algorithms checking grammatical gender and person while
resolving pronouns. However, we attempt to resolve pronouns only if the grammatical number
of the pronoun agrees with that of the antecedent. Since pleonastic pronouns (pronouns without
a direct antecedent) do not affect the algorithm, they are identified but not replaced.
Example: Consider the requirement “If the temperature of the battery is below Tmin or it
exceeds Tmax, charging approval has to be withdrawn”. Here, the pronoun it is resolved with
its antecedent the temperature of the battery.</p>
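          <p>The resolution rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the <monospace>Token</monospace> class is a hypothetical stand-in for parser output (no NLP library is called), noun phrases are assumed to be merged into single tokens, and all tags in the example are annotated by hand.</p>
          <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """Hypothetical stand-in for one parsed token (or pre-merged noun phrase)."""
    text: str
    pos: str     # coarse part-of-speech tag, e.g. "PRON"
    dep: str     # dependency label, e.g. "nsubj"
    number: str  # grammatical number, "Sing" or "Plur" ("" if not applicable)


# Illustrative subset of third person neutral pronouns.
NEUTRAL_PRONOUNS = {"it", "they"}


def find_antecedent(preceding: List[Token], number: str) -> Optional[str]:
    """Return the farthest preceding non-pronoun subject that agrees in number."""
    for cand in preceding:  # a left-to-right scan reaches the farthest subject first
        if cand.dep.startswith("nsubj") and cand.pos != "PRON" and cand.number == number:
            return cand.text
    return None


def resolve_pronouns(tokens: List[Token]) -> str:
    """Replace resolvable pronouns; pronouns without an antecedent stay unchanged."""
    words = []
    for i, tok in enumerate(tokens):
        antecedent = None
        if tok.pos == "PRON" and tok.text.lower() in NEUTRAL_PRONOUNS:
            antecedent = find_antecedent(tokens[:i], tok.number)
        words.append(antecedent if antecedent is not None else tok.text)
    return " ".join(words)


# Condition part of the example requirement; all annotations are assumptions.
example = [
    Token("If", "SCONJ", "mark", ""),
    Token("the temperature of the battery", "NOUN", "nsubj", "Sing"),
    Token("is", "AUX", "advcl", ""),
    Token("below", "ADP", "prep", ""),
    Token("Tmin", "PROPN", "pobj", "Sing"),
    Token("or", "CCONJ", "cc", ""),
    Token("it", "PRON", "nsubj", "Sing"),
    Token("exceeds", "VERB", "conj", ""),
    Token("Tmax", "PROPN", "dobj", "Sing"),
]
resolved = resolve_pronouns(example)
print(resolved)
```
]]></preformat>
          <p>In this sketch a pleonastic pronoun simply finds no agreeing subject before it and is therefore left in place, which mirrors the behavior described above.</p>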
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Decomposition</title>
          <p>Textual requirements with multiple conditions and conjunctions are hard to transform
and map to individual relations. This demands decomposition of complex requirements
into simple clauses [21]. Multiple conditions (sentences with multiple ifs, whiles, etc.), root
conjunctions (sentences with multiple roots connected with a conjunction) and noun phrase
conjunctions (sentences with multiple subjects and/or objects connected with a conjunction)
are decomposed to simple primitive clauses.</p>
          <p>We resort to the syntactic dependencies obtained from the parser to decompose requirements.
The algorithm considers the token(s) with dependency mark to decompose multiple conditions
and dependency conj for decomposing root and noun phrase conjunctions. The span of the
sub-requirement can then be defined by identifying the edges of the token of interest (e.g. the
left-most edge refers to the token towards the left of the dependency graph with which the parent
token holds a syntactic dependency).</p>
          <p>Example: In the requirement ”If the temperature of the battery is below Tmin or the
temperature of the battery exceeds Tmax, charging approval has to be withdrawn”, the root conjunction
(arising from the two roots is and exceeds) and the subsequent multiple conditions (arising
from if ) are decomposed to three sub-requirements as ”[if the temperature of the battery is
below Tmin] or [if the temperature of the battery exceeds Tmax], [charging approval has to be
withdrawn]”.</p>
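          <p>The decomposition step can be illustrated with a small sketch that works on a hand-annotated dependency tree instead of real parser output. Everything in it is an assumption made for illustration: the <monospace>Tok</monospace> type, the helper names, the shortened example sentence and its dependency labels (in particular, the main verb is annotated as the root). It splits conditions at the <monospace>mark</monospace> token and conjuncts at <monospace>conj</monospace>, and copies the condition marker to every conjunct.</p>
          <preformat><![CDATA[
```python
from typing import List, NamedTuple, Set


class Tok(NamedTuple):
    """Hypothetical stand-in for a token of a dependency parse."""
    i: int     # position in the sentence
    text: str
    dep: str   # dependency label
    head: int  # index of the syntactic head (the root points to itself)


def children(tokens: List[Tok], i: int) -> List[Tok]:
    return [t for t in tokens if t.head == i and t.i != i]


def subtree(tokens: List[Tok], i: int, stop_deps: Set[str]) -> List[int]:
    """Indices of token i and its descendants, skipping children labelled in stop_deps."""
    found = [i]
    for child in children(tokens, i):
        if child.dep not in stop_deps:
            found.extend(subtree(tokens, child.i, stop_deps))
    return sorted(found)


def span_text(tokens: List[Tok], indices: List[int]) -> str:
    return " ".join(tokens[k].text for k in indices)


def decompose(tokens: List[Tok]) -> List[str]:
    """Split a requirement into condition clauses and the main clause."""
    root = next(t for t in tokens if t.dep == "ROOT")
    clauses = []
    for cond in (c for c in children(tokens, root.i) if c.dep == "advcl"):
        marks = [c.text.lower() for c in children(tokens, cond.i) if c.dep == "mark"]
        prefix = marks[0] + " " if marks else ""
        conjuncts = [cond] + [c for c in children(tokens, cond.i) if c.dep == "conj"]
        for head in conjuncts:  # every conjunct inherits the condition marker
            indices = subtree(tokens, head.i, {"mark", "cc", "conj"})
            clauses.append(prefix + span_text(tokens, indices))
    main = subtree(tokens, root.i, {"advcl", "punct"})
    clauses.append(span_text(tokens, main))
    return clauses


# Shortened example requirement with hand-written (assumed) dependencies.
PARSE = [
    ("If", "mark", 3), ("the", "det", 2), ("temperature", "nsubj", 3),
    ("is", "advcl", 17), ("below", "prep", 3), ("Tmin", "pobj", 4),
    ("or", "cc", 3), ("the", "det", 8), ("temperature", "nsubj", 9),
    ("exceeds", "conj", 3), ("Tmax", "dobj", 9), (",", "punct", 17),
    ("charging", "compound", 13), ("approval", "nsubjpass", 17),
    ("has", "aux", 17), ("to", "aux", 17), ("be", "auxpass", 17),
    ("withdrawn", "ROOT", 17),
]
tokens = [Tok(i, text, dep, head) for i, (text, dep, head) in enumerate(PARSE)]
sub_requirements = decompose(tokens)
for clause in sub_requirements:
    print("[" + clause + "]")
```
]]></preformat>
          <p>For the annotated sentence the sketch yields the two conditions and the main clause as separate sub-requirements. Real parser output will differ in its labels, so the stop sets would have to be adapted to the parser actually used.</p>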
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Syntactic entity identification</title>
        <p>Almost all behavioral requirements describe a particular action (linguistically, verb) done by an
agent (linguistically, subject) on the system of interest (linguistically, object). This motivates
the idea of identifying syntactic entities from the requirements. The algorithm identifies these
syntactic entities by checking the dependencies of tokens with the root.</p>
        <p>1) Action: The main action verb in the requirement (mostly, with the dependency ROOT )
is identified and called an action. The algorithm particularly distinguishes the type of
actions as: Nominal action which has a noun and a verb together (e.g. send a message),
Boolean action which can take a Boolean constraint (e.g. is withdrawn) and Simple action
which has only an action verb (e.g. send).</p>
        <p>In addition, the algorithm also tries to identify the verb type(s) (transitive, dative,
prepositional, etc.) as suggested in [21] to supplement the syntactic significance of action types.
This is essential particularly when we rely on action types for relation formulation.</p>
        <p>2) Subjects and Objects: The tokens with dependencies subj and obj (and their variants
like nsubj, pobj, dobj, etc.) are identified mostly as Subjects and Objects, respectively.
They can be noun chunks (e.g. temperature of the battery), compound nouns (e.g. battery
temperature) or single tokens (e.g. battery) in the requirement.</p>
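        <p>As a rough illustration of the two identification steps above, the following sketch collects subjects, objects and the action from tokens that already carry dependency labels. The <monospace>Token</monospace> class, the pre-merged noun chunks and especially the heuristic that decides between Nominal, Boolean and Simple actions are assumptions made for this example; the actual rules of the algorithm are more detailed.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Token:
    """Hypothetical stand-in for a parsed token; noun chunks are pre-merged."""
    text: str
    dep: str  # dependency label relative to the head
    pos: str  # coarse part-of-speech tag


def identify_entities(tokens: List[Token]) -> Dict[str, object]:
    subjects = [t.text for t in tokens if "subj" in t.dep]     # nsubj, nsubjpass, ...
    objects = [t.text for t in tokens if t.dep.endswith("obj")]  # dobj, pobj, ...
    root = next(t for t in tokens if t.dep == "ROOT")

    # Assumed heuristic: passive or copular clauses are treated as Boolean
    # actions, verbs with a direct object as Nominal, everything else as Simple.
    if any(t.dep == "auxpass" for t in tokens) or root.pos == "AUX":
        action_type = "Boolean"
    elif any(t.dep == "dobj" for t in tokens):
        action_type = "Nominal"
    else:
        action_type = "Simple"
    return {
        "subjects": subjects,
        "objects": objects,
        "action": root.text,
        "action_type": action_type,
    }


# Hand-annotated clauses (annotations are assumptions, not parser output).
withdraw = identify_entities([
    Token("charging approval", "nsubjpass", "NOUN"),
    Token("has", "aux", "AUX"),
    Token("to", "aux", "PART"),
    Token("be", "auxpass", "AUX"),
    Token("withdrawn", "ROOT", "VERB"),
])
send = identify_entities([
    Token("the system", "nsubj", "NOUN"),
    Token("sends", "ROOT", "VERB"),
    Token("a message", "dobj", "NOUN"),
])
print(withdraw)
print(send)
```
]]></preformat>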
        <p>Also, we noted that there are several requirements involving a logical comparison (identified as
an adjective or an adverb) between the expressed quantities. In order to identify comparisons
(e.g. greater than, exceeds, etc.) in the requirement, we utilize the exhaustive synonym hyperlinks
from Roget’s Thesaurus [22] and map them to the corresponding equality (=), inequality (!=),
inferiority (&lt;, &lt;=) and superiority (&gt;, &gt;=) symbols.</p>
        <p>Example: From the sub-requirements ”[if the temperature of the battery is below Tmin] or
[if the temperature of the battery exceeds Tmax], [charging approval has to be withdrawn]”,
the system identifies Battery_Temperature and Charging_Approval as Subjects, Tmin and Tmax
as Objects and withdrawn as a Boolean Action. Also, the comparison term below is mapped as &lt;
and exceeds is mapped as &gt;.</p>
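        <p>The mapping from comparison terms to symbols can be pictured as a dictionary lookup. In the sketch below, the short term list is hand-picked for illustration and merely stands in for the synonym lists taken from Roget’s Thesaurus; the function name and the longest-match strategy are likewise assumptions.</p>
        <preformat><![CDATA[
```python
from typing import Optional

# Hand-picked comparison terms (assumption); the synonym lists used in this
# work come from Roget's Thesaurus and are far more extensive.
COMPARISON_TERMS = {
    "equal to": "=", "equals": "=",
    "not equal to": "!=", "differs from": "!=",
    "below": "<", "less than": "<", "lower than": "<",
    "at most": "<=", "less than or equal to": "<=",
    "exceeds": ">", "above": ">", "greater than": ">",
    "at least": ">=", "greater than or equal to": ">=",
}


def map_comparison(clause: str) -> Optional[str]:
    """Return the symbol of the longest comparison term found in the clause."""
    padded = " " + " ".join(clause.lower().split()) + " "
    # Longest terms first, so that "not equal to" wins over "equal to".
    for term in sorted(COMPARISON_TERMS, key=len, reverse=True):
        if " " + term + " " in padded:
            return COMPARISON_TERMS[term]
    return None


print(map_comparison("if the temperature of the battery is below Tmin"))
print(map_comparison("if the temperature of the battery exceeds Tmax"))
```
]]></preformat>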
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Semantic entity identification</title>
        <p>Semantic entities are tightly coupled with the end application which translates the parsed
syntactic information to sequence diagrams and then to abstract test cases. The semantic
entities are defined from the perspective of interactions in a sequence diagram and are outlined
below. The algorithm derives these entities from their syntactic counterparts<sup>1</sup>.
1) Actor or Component: The participants involved in an interaction are defined as actors
and components. To differentiate other participants from the system under test (SUT),
the component is always considered to be the SUT.
2) Signal: The interaction between different participants is defined as a signal.
3) Attributes: The variables holding the status at different points of interaction are defined
as attributes.
4) State: This refers to the initial, intermediate and final states of an interaction.</p>
        <p>Semantic entities demand additional details for completeness. For example, if the value of an
attribute is not given, it cannot be initialized in its corresponding test case. Likewise, for each
signal the corresponding actor needs to be identified. For each requirement, the direction of
communication (incoming: towards the system under test or outgoing: from the system under
test) should be identified. In cases where the algorithm lacks the desired ontology information,
the user is asked to provide these values.</p>
        <p>It is worth noting that the separation of the entities as syntactic (application independent but
grammar dependent) and semantic (application dependent but grammar independent) gives
more flexibility to the algorithm, so that parts of it can also be used in an environment other than the
description language considered here. The mapping from the syntactic entities to their
semantic counterparts can either be completely automated with stricter rules or be accomplished
with user intervention and validation.</p>
        <p>Example: From the sub-requirements ”[if the temperature of the battery is below Tmin] or
[if the temperature of the battery exceeds Tmax], [charging approval has to be withdrawn]”,
the identified Subjects (Battery_Temperature and Charging_Approval) are mapped as Signals and
the identified Objects (Tmin and Tmax) are mapped as Attributes. Here, the identified Action
withdrawn is also considered as an Attribute owing to the semantics of its corresponding Boolean
Signal. Additionally, we can arrive at the Actor for Battery_Temperature as battery. However,
the Actor of Charging_Approval is ambiguous (or rather unknown). Likewise, Attribute values
should either be passed by the user or they remain uninitialized in the resulting test case.</p>
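        <p>The mapping in this example, together with the open points that require user input, can be sketched as follows. The function and field names are hypothetical, the actor lookup table plays the role of the domain knowledge, and only the primitive rules visible in the example are covered (Subjects become Signals, Objects and the Boolean Action become Attributes).</p>
        <preformat><![CDATA[
```python
from typing import Dict, List, Optional

UNKNOWN_ACTOR = "unknown_actor"


def map_to_semantic(subjects: List[str], objects: List[str],
                    boolean_action: Optional[str],
                    actor_of: Dict[str, str],
                    attribute_values: Optional[Dict[str, str]] = None) -> dict:
    """Map syntactic entities to signals and attributes and list what is missing."""
    attribute_values = attribute_values or {}
    # Subjects become signals; the actor is looked up in the domain knowledge.
    signals = [{"name": s, "actor": actor_of.get(s, UNKNOWN_ACTOR)} for s in subjects]
    # Objects and the Boolean action become attributes; values may be missing.
    names = list(objects) + ([boolean_action] if boolean_action else [])
    attributes = [{"name": n, "value": attribute_values.get(n)} for n in names]
    # Everything still unknown has to be supplied or confirmed by the user.
    open_points = [s["name"] for s in signals if s["actor"] == UNKNOWN_ACTOR]
    open_points += [a["name"] for a in attributes if a["value"] is None]
    return {"signals": signals, "attributes": attributes, "open_points": open_points}


result = map_to_semantic(
    subjects=["Battery_Temperature", "Charging_Approval"],
    objects=["Tmin", "Tmax"],
    boolean_action="withdrawn",
    actor_of={"Battery_Temperature": "battery"},  # assumed domain knowledge
)
print(result["signals"])
print(result["open_points"])
```
]]></preformat>
        <p>With the inputs of the example, the actor of Charging_Approval and all attribute values end up in the list of open points, which corresponds to the items that the user is asked to complete.</p>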
        <p><sup>1</sup> Note that the algorithm maps syntactic to semantic entities with more complex rules (including action types
and verb types). In Table 1, we have presented only the most primitive ones for brevity. This difference is also
detailed in the example where an Action is considered as an Attribute and a Subject is mapped to a Signal.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Transformation to requirement model</title>
        <p>For the description of the formal requirements a simple text-based domain-specific language
(DSL) is used, the ifak requirements description language (IRDL) [23]. This notation for
requirement models was developed on the basis of UML sequence diagrams and is specially adapted to
the needs of describing requirements as sequences. The IRDL defines a series of model elements
(e.g. components, messages) with associated attributes (name, description, recipient, sender,
etc.) and special model structures (behavior based on logical operators or time). Functional,
behavior-based requirements are described textually using IRDL and can then be visualized
graphically as sequence diagrams (Fig. 2).</p>
        <p>Once the entities are mapped and validated, the algorithm forms IRDL relations for each
clause and then combines them together to form relations for the whole requirement. IRDL
defines mainly two types of relations:
1) Incoming messages: SUT receives these messages provided the guard expression evaluates
to be true and then continues to the next sequence in an interaction. IRDL defines this
class of messages as ’Check’.
2) Outgoing messages: SUT sends these messages to other interaction participants with the
content defined in the signal. In IRDL, these messages are denoted as ’Message’.</p>
        <p>Check(Actor-&gt;Component): Signal[guard expression];</p>
        <p>Message(Component-&gt;Actor): Signal(signal content);</p>
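        <p>To make the two relation patterns concrete, the following sketch renders them as strings from already mapped semantic entities. The class and function names are hypothetical and the output format only follows the patterns and the example shown in this section; it is not derived from the IRDL specification itself.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import List


@dataclass
class Check:
    """Incoming message: the SUT receives `signal` from `actor` if the guard holds."""
    actor: str
    component: str
    signal: str
    guards: List[str]     # guard clauses, e.g. "msg.value < Tmin"
    operator: str = "||"  # logical operator joining the guard clauses

    def to_irdl(self) -> str:
        guard = f" {self.operator} ".join(self.guards)
        return f"Check({self.actor}->{self.component}): {self.signal}[{guard}];"


@dataclass
class Message:
    """Outgoing message: the SUT sends `signal` with `content` to `actor`."""
    component: str
    actor: str
    signal: str
    content: str

    def to_irdl(self) -> str:
        return f"Message({self.component}->{self.actor}): {self.signal}({self.content});"


def build_model(req_id: str, component: str, relations: list) -> List[str]:
    """Wrap the relations of one requirement between an initial and a final state."""
    lines = [f"State iState_{req_id} at {component};"]
    lines += [relation.to_irdl() for relation in relations]
    lines.append(f"State fState_{req_id} at {component};")
    return lines


model = build_model("001", "system", [
    Check("battery", "system", "Battery_Temperature",
          ["msg.value < Tmin", "msg.value > Tmax"]),
    Message("system", "unknown_actor", "Charging_Approval", "false"),
])
print("\n".join(model))
```
]]></preformat>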
        <p>As an intermediate result, the user is shown the formulated IRDL relations along with
the sequence diagram corresponding to the requirement and is asked if the IRDL and the
corresponding sequence diagram are correct. In case the user wants to further modify the
relation formulation, the algorithm repeats from the mapping of syntactic entities to semantic
entities. This continues until the user confirms the model is satisfactory.</p>
        <p>Example: The IRDL relations for the example requirement “If the temperature of the battery
is below Tmin or it exceeds Tmax, charging approval has to be withdrawn”, obtained after the
abovementioned steps, are shown in Fig. 2.</p>
        <p>[Figure 2: Textual (IRDL) and graphical representation of the example requirement. The sequence diagram shows the lifelines system, battery and unknown_actor, the states iState_001 and fState_001, and the messages Battery_Temperature and Charging_Approval. The textual representation reads as follows.]</p>
        <p>State iState_001 at system;</p>
        <p>Check(battery-&gt;system): Battery_Temperature[msg.value &lt; Tmin || msg.value &gt; Tmax];</p>
        <p>Message(system-&gt;unknown_actor): Charging_Approval(false);</p>
        <p>State fState_001 at system;</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Model synthesis and test generation</title>
        <p>The formalized requirements of the SUT can be combined to a specification model using existing
methods for model synthesis [23]. The sequence elements described above are transformed
using a rule-based algorithm into equivalent elements of a UML state machine.</p>
        <p>After model synthesis, test cases can be automatically generated from the state machine
using an existing method for model-based test generation [24]. After a specific graph-based
coverage criterion such as “all paths”, “all decisions”, “all places” or “all transitions” has been selected, the state
machine is transformed into a special form of a Petri net from which abstract test cases in the
form of sequence diagrams can be generated. In this way, the approach allows modeling of
even complex system behavior and applying graph-based coverage criteria to the entire system
model.</p>
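        <p>The idea behind a path-based coverage criterion can be illustrated on a toy state machine. Note that the sketch below is a simplified stand-in: it enumerates loop-free paths directly on a transition list by depth-first search and does not reproduce the Petri net transformation of the cited method. The states, transition labels and function name are invented for this illustration.</p>
        <preformat><![CDATA[
```python
from typing import List, Tuple

Transition = Tuple[str, str, str]  # (source state, label, target state)


def all_paths(transitions: List[Transition], start: str, final: str) -> List[List[str]]:
    """Enumerate the label sequences of all loop-free paths from start to final."""
    paths: List[List[str]] = []

    def walk(state: str, labels: List[str], visited: frozenset) -> None:
        if state == final:
            paths.append(labels)
            return
        for source, label, target in transitions:
            if source == state and target not in visited:
                walk(target, labels + [label], visited | {target})

    walk(start, [], frozenset({start}))
    return paths


# Invented toy state machine, loosely inspired by the running example.
TOY_MACHINE = [
    ("init", "Battery_Temperature[in range]", "approved"),
    ("init", "Battery_Temperature[out of range]", "denied"),
    ("approved", "Charging_Approval(true)", "final"),
    ("denied", "Charging_Approval(false)", "final"),
]
test_cases = all_paths(TOY_MACHINE, "init", "final")
for number, labels in enumerate(test_cases, start=1):
    print(f"TC{number}: " + " -> ".join(labels))
```
]]></preformat>
        <p>Each enumerated path corresponds to one abstract test case, i.e. a sequence of stimuli and expected responses. Weaker criteria such as “all transitions” would select a subset of these paths that still touches every transition.</p>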
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Application</title>
      <p>The toolchain for requirements-based model and test case generation presented in the previous
section is applied to an industrial use case from the e-mobility domain. The use case describes
a system for charging approval of an electric vehicle in interaction with a charging station.
The industrial use case was defined by AKKA and aims to provide a typical basic scenario and
development workflow in software development for an automotive electronic control unit (ECU).
It does so by defining requirements, using model-based software development and deploying
the functionality on an ECU.</p>
      <p>The use case has to be seen in the context of an electric car battery that is supposed to be
charged. The function “charging approval” implements a simple function which decides, based on
specific input signals, whether the charging process of the battery is allowed or not. For example,
charging approval is given or withdrawn depending on the battery temperature, voltage or state
of charge, the requested current is adjusted according to the battery temperature, and error
behavior is handled for certain conditions. This is a continuous process, i.e. the signal values
may change over time. A more detailed technical description of the industrial use case can be
found in [25]. To fulfill the requirement of model-based software development, the module is
implemented in Matlab Simulink. Matlab Simulink Coder is used to generate C/C++ code that
can be compiled and deployed to the target. A Raspberry Pi is used to simulate some but not all
aspects of an ECU. A basic overview of the charging approval system and its interfaces to the
environment is given in Fig. 3.</p>
      <p>[Figure 3: Overview of the charging approval system and its interfaces to the environment. Labels in the diagram: Velocity, Parking brake, Ignition, State of Charge, Temperature of Battery, Charging Approval.]</p>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>The battery charging approval system described in the former section is used to evaluate the
proposed method. We first define the used evaluation metrics and then demonstrate the results.
To our knowledge, there are no available tools with similar input and output properties as our
tool that enable a direct comparison.</p>
      <sec id="sec-5-1">
        <title>5.1. Evaluation metrics</title>
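        <p>Let <inline-formula><tex-math>R</tex-math></inline-formula> be the set of textual requirements. For a requirement <inline-formula><tex-math>r \in R</tex-math></inline-formula>, let <inline-formula><tex-math>E_r</tex-math></inline-formula> be the set of expected
artifacts and <inline-formula><tex-math>G_r</tex-math></inline-formula> be the set of generated artifacts. Here, artifacts refer to all the semantic entities
including the relation indicators. Let <inline-formula><tex-math>E = \bigcup_{r \in R} E_r</tex-math></inline-formula> denote the set of expected artifacts in all
requirements and <inline-formula><tex-math>G = \bigcup_{r \in R} G_r</tex-math></inline-formula> the set of generated artifacts in all requirements. Then we define
the following metrics to measure the performance of the method.</p>
        <p>1) Completeness: For an individual requirement, this metric denotes the number of
expected artifacts <inline-formula><tex-math>e \in E_r</tex-math></inline-formula> for which a corresponding (not necessarily identical) generated
artifact <inline-formula><tex-math>g \in G_r</tex-math></inline-formula> exists, in relation to the total number of expected artifacts <inline-formula><tex-math>|E_r|</tex-math></inline-formula>.
2) Correctness: For an individual requirement, this metric denotes the number of
generated artifacts <inline-formula><tex-math>g \in G_r</tex-math></inline-formula> for which a corresponding, semantically identical (up to naming
conventions) expected artifact <inline-formula><tex-math>e \in E_r</tex-math></inline-formula> exists, in relation to the total number of generated
artifacts <inline-formula><tex-math>|G_r|</tex-math></inline-formula>.
3) Consistency: This metric denotes the number of generated artifacts <inline-formula><tex-math>g \in G</tex-math></inline-formula> for which a
corresponding expected artifact <inline-formula><tex-math>e \in E</tex-math></inline-formula> exists and is used identically in all requirements
<inline-formula><tex-math>r \in R</tex-math></inline-formula>, in relation to the total number of generated artifacts <inline-formula><tex-math>|G|</tex-math></inline-formula>.</p>
        <p>The macro average for completeness and correctness, respectively, is then given by the mean
value of all individual percentage values for all <inline-formula><tex-math>r \in R</tex-math></inline-formula>. The micro average is given by the sum of
all values in the numerator divided by the sum of all values in the denominator for all <inline-formula><tex-math>r \in R</tex-math></inline-formula>.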
Example: To illustrate the evaluation metrics in detail, consider the requirement clause
’if the SoC of the battery is below SoC_max’.</p>
        <p>Expected: Check(charging_management-&gt;system): Battery_SoC[msg.value &lt; SoC_max];</p>
        <p>Generated: Check(battery-&gt;system): battery_SoC[msg.value &lt; SoC_max];</p>
        <p>For the metric completeness, we check if all the expected artifacts (i.e. Check,
charging_management, Battery_SoC, etc.) are generated by the algorithm. In this case, we can see that all
of them were generated. For obtaining the correctness, we check if those generated artifacts
are semantically correct. In this case, though we expect the actor charging_management, the
algorithm generates battery. This reduces the value of correctness. If the algorithm generates
battery_SoC for every occurrence of ’SoC of battery’ across all the requirements, it is considered
consistent for this artifact.</p>
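        <p>The difference between the macro and the micro average is easiest to see in a small calculation. The sketch below uses invented artifact counts for two requirements; they only illustrate the arithmetic and are not results of this evaluation. The class and function names are hypothetical as well.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import List


@dataclass
class Counts:
    """Artifact counts for one requirement (values below are invented)."""
    expected: int           # number of expected artifacts
    expected_found: int     # expected artifacts with a corresponding generated one
    generated: int          # number of generated artifacts
    generated_correct: int  # generated artifacts that are semantically correct


def macro_average(numerators: List[int], denominators: List[int]) -> float:
    """Mean of the per-requirement ratios."""
    ratios = [n / d for n, d in zip(numerators, denominators)]
    return sum(ratios) / len(ratios)


def micro_average(numerators: List[int], denominators: List[int]) -> float:
    """Sum of all numerators divided by the sum of all denominators."""
    return sum(numerators) / sum(denominators)


requirements = [
    Counts(expected=5, expected_found=5, generated=5, generated_correct=4),
    Counts(expected=4, expected_found=3, generated=3, generated_correct=3),
]

found = [c.expected_found for c in requirements]
expected = [c.expected for c in requirements]
correct = [c.generated_correct for c in requirements]
generated = [c.generated for c in requirements]

completeness_macro = macro_average(found, expected)     # (5/5 + 3/4) / 2
completeness_micro = micro_average(found, expected)     # (5 + 3) / (5 + 4)
correctness_macro = macro_average(correct, generated)   # (4/5 + 3/3) / 2
correctness_micro = micro_average(correct, generated)   # (4 + 3) / (5 + 3)

print(f"completeness: macro={completeness_macro:.3f}, micro={completeness_micro:.3f}")
print(f"correctness:  macro={correctness_macro:.3f}, micro={correctness_micro:.3f}")
```
]]></preformat>
        <p>The macro average weights every requirement equally, whereas the micro average weights every artifact equally, so requirements with many artifacts influence the micro average more strongly.</p>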
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Requirements formalization</title>
        <p>As part of the demonstrated use case, AKKA has provided functional requirement documents
describing the expected behavior for the relevant SUT. To apply the NLP-based requirements
formalization method, each statement is treated as a separate entity for which a well-defined
requirement model is created. Overall, the charging approval SUT is described by 14 separate
requirement statements.</p>
        <p>The results of our evaluation are shown in Table 2. We have determined the individual
correctness and completeness values and calculated the macro and micro average for them.
We avoided double counting of identical entity detections so as not to skew the results. As
mentioned above, if an actor or value of an attribute is not explicitly mentioned in the textual
requirement, it cannot be detected by the algorithm. Therefore we also show the results using
domain knowledge, which could be in the form of a predefined list of signals, attributes, etc. or
integrated by direct user interaction from an expert with knowledge about the system.</p>
        <p>As one can observe, the method shows good results: most of the signals and other artifacts
were detected correctly and completely. Having a list of artifact declarations in advance
produces even more accurate predictions. Thus, our NLP-based approach delivers results of good quality
and supports the generation of the formal requirement model to a significant extent.</p>
        <p>The time needed to create the model with and without the provided tool was not
measured directly. However, in our experience from former use cases and the one presented here, it takes
a lot of time for a requirements engineer to become familiar with the description language for sequence
diagrams by reading documentation and having discussions, to create the logical structure
and to add all the details to the model manually. The new semi-automated approach supports
the user considerably. It gives a first proposal of the requirement model in a textual and
graphical view and provides options for handling unclear points. This should therefore save a
lot of time, even though a manual review of the created model is still required.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Model synthesis and test generation</title>
        <p>For the next step, the requirement models of the charging approval system were used as the
input for model synthesis using ifak’s prototypical tool ModGen. Since the NLP-based approach
treats every requirement as a separate entity, it is also necessary to connect each requirement by
modelling the boundaries explicitly. As a result, a graph-based representation of the functionality
as described by the requirements was generated in the form of a UML state machine (Fig. 4).
The generated model contains 6 states and 20 transitions with appropriate signals, guards and
actions. The semantic as well as syntactic validity of the generated UML state machine could
be confirmed by a thorough evaluation based on the initial requirement documents and by
checking for deadlocks and livelocks. It could be shown that no further manual editing of the
model is required for a full description of the behavior of the system.</p>
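        <p>To illustrate what a deadlock check on such a state machine involves, the following minimal sketch searches for reachable, non-final states without outgoing transitions. The toy model, its state and signal names, and the helper functions are hypothetical assumptions for illustration only; they are neither the generated model of the charging approval system nor the check implemented in ModGen, and livelocks are not covered.</p>

```python
from collections import deque

# Hypothetical toy state machine: (source state, signal, target state).
# These names are illustrative assumptions, not the model from the paper.
TRANSITIONS = [
    ("Idle", "plug_in", "Checking"),
    ("Checking", "battery_ok", "Approved"),
    ("Checking", "battery_fault", "Denied"),
    ("Approved", "unplug", "Idle"),
    ("Denied", "unplug", "Idle"),
]
INITIAL = "Idle"


def reachable_states(transitions, initial):
    """Return all states reachable from the initial state (breadth-first)."""
    successors = {}
    for source, _signal, target in transitions:
        successors.setdefault(source, []).append(target)
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in successors.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def deadlock_states(transitions, initial, final_states=()):
    """Return reachable, non-final states that have no outgoing transition."""
    sources = {source for source, _signal, _target in transitions}
    reachable = reachable_states(transitions, initial)
    return sorted(
        state for state in reachable
        if state not in sources and state not in final_states
    )
```

        <p>For the toy model above, the check returns an empty list; adding a transition into a state that has no outgoing transition would cause that state to be reported.</p>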
        <p>For test generation, ifak’s prototypical tool TCG was used with the coverage criterion “all-paths”,
which ensures that each possible path in the utilized model is covered by at least one
test case. Using the existing test generation algorithm, a total of 73 test cases were
generated. In Fig. 4, one of the generated test cases is visualized in the form of a sequence
diagram. Here, a test system (TS) interacts with the SUT (charging approval) and provides a
number of parameters, upon which the system decides if charging approval is given.</p>
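        <p>The idea behind a path-based coverage criterion can be sketched as follows. The example enumerates all paths from an initial to a final state of a small state machine and treats each path as one abstract test case, given as a sequence of signals. It is a simplified illustration under stated assumptions: the toy model and the function name are hypothetical, loops are bounded by using each transition at most once per path, and the code is not the algorithm implemented in TCG.</p>

```python
# Hypothetical toy state machine: (source state, signal, target state).
# These names are illustrative assumptions, not the model from the paper.
TRANSITIONS = [
    ("Start", "plug_in", "Checking"),
    ("Checking", "battery_ok", "Approved"),
    ("Checking", "battery_fault", "Denied"),
    ("Checking", "recheck", "Checking"),  # self-loop
    ("Approved", "unplug", "End"),
    ("Denied", "unplug", "End"),
]


def enumerate_paths(transitions, initial, final):
    """Enumerate signal sequences from initial to final state.

    Assumption: each transition is taken at most once per path, which
    keeps the enumeration finite in the presence of loops.
    """
    paths = []

    def extend(state, used, signals):
        if state == final:
            paths.append(list(signals))
            return
        for index, (source, signal, target) in enumerate(transitions):
            if source == state and index not in used:
                extend(target, used | {index}, signals + [signal])

    extend(initial, frozenset(), [])
    return paths
```

        <p>Calling enumerate_paths(TRANSITIONS, "Start", "End") on the toy model yields four signal sequences, two of which pass through the self-loop once. On realistic models the number of paths grows quickly, which is why a bound on loops, or a weaker criterion, is usually needed.</p>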
        <p>Overall, it can be shown that valid abstract test cases are generated based on the specification
model. Using an appropriate framework for test case execution and a suitable test adapter, the
generated test cases could be used for validation of the functional behavior of the SUT.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this work, an NLP-based method for machine-aided model generation from textual
requirements is presented. The method is designed to cover a wide range of requirements formulations
without being restricted to a specific domain or format. Further, the generated requirement
models are given in a user-friendly, comprehensible textual and graphical representation in the
form of sequence diagrams.</p>
      <p>We evaluated our approach on the industrial use case of a battery charging approval system
and showed that the algorithm can produce complete, correct and consistent artifacts to a high
degree. We have also shown how these artifacts are then used to create sequence diagrams
for each requirement and transformed into a state machine for the entire specification model
to finally generate abstract test cases. With the proposed semi-automated approach, we aim
to reduce the human effort of creating test cases from textual requirements to that of validating the
generated requirement models. In future versions of this prototypical implementation, we intend
to refine the rule-based approach further, thus reducing the need for manual modifications. One
possible solution in this regard could be to train a Named Entity Recognition (NER) algorithm
to identify the semantic entities, albeit at the cost of intensive labelling work. Another
solution could be to rely on (pre-trained) Semantic Role Labeling (SRL) models.</p>
      <p>This study is still research in progress, since even more complex textual requirements have
to be considered for future applications. The methodology could also be used in
other domains, such as rail, industrial communication and automotive. In future work, we
therefore intend to analyze how the method can be improved to cover more application domains.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This research was funded by the German Federal Ministry of Education and Research (BMBF)
within the ITEA 3 projects TESTOMAT under grant no. 01IS17026G and XIVT under grant
no. 01IS18059E. We thank our former colleague Martin Reider and our research assistant Libin
Kutty from ifak for the valuable contributions to this paper. We also thank AKKA Germany
GmbH for providing an industrial use case for the evaluation of the presented method.</p>
      <p>[9] S. C. Allala, J. P. Sotomayor, D. Santiago, T. M. King, P. J. Clarke, Towards Transforming User Requirements to Test Cases Using MDE and NLP, in: COMPSAC, 2019, pp. 350–355.</p>
      <p>[10] C. Nebut, F. Fleurey, Y. Le Traon, J.-M. Jezequel, Automatic test generation: A use case driven approach, IEEE Transactions on Software Engineering 32 (2006) 140–155.</p>
      <p>[11] A. Goffi, A. Gorla, M. D. Ernst, M. Pezzè, Automatic generation of oracles for exceptional behaviors, in: ISSTA, 2016, pp. 213–224.</p>
      <p>[12] A. Blasi, A. Goffi, K. Kuznetsov, A. Gorla, M. D. Ernst, M. Pezzè, S. D. Castellanos, Translating code comments to procedure specifications, in: ISSTA, 2018, pp. 242–253.</p>
      <p>[13] C. Wang, F. Pastore, A. Goknil, L. Briand, Automatic Generation of Acceptance Test Cases from Use Case Specifications: an NLP-based Approach, IEEE Transactions on Software Engineering (2020) 1–38.</p>
      <p>[14] T. Yue, S. Ali, M. Zhang, RTCM: a natural language based, automated, and practical test case generation framework, in: ISSTA, 2015, pp. 397–408.</p>
      <p>[15] B. C. F. Silva, G. Carvalho, A. Sampaio, Test Case Generation from Natural Language Requirements Using CPN Simulation, in: SBMF, 2015, pp. 178–193.</p>
      <p>[16] J. Fischbach, A. Vogelsang, D. Spies, A. Wehrle, M. Junker, D. Freudenstein, Specmate: Automated creation of test cases from acceptance criteria, in: ICST, 2020, pp. 321–331.</p>
      <p>[17] spaCy, Industrial-strength Natural Language Processing in Python, 2020. URL: https://spacy.io/.</p>
      <p>[18] H. Yang, A. de Roeck, V. Gervasi, A. Willis, B. Nuseibeh, Analysing anaphoric ambiguity in natural language requirements, Requirements Engineering 16 (2011) 163–189.</p>
      <p>[19] S. Lappin, H. J. Leass, An algorithm for pronominal anaphora resolution, Computational Linguistics 20 (1994) 535–561.</p>
      <p>[20] L. Qiu, M.-Y. Kan, T.-S. Chua, A Public Reference Implementation of the RAP Anaphora Resolution Algorithm, in: LREC, 2004, pp. 291–294.</p>
      <p>[21] D. K. Deeptimahanti, R. Sanyal, An Innovative Approach for Generating Static UML Models from Natural Language Requirements, in: ASEA, 2008, pp. 147–163.</p>
      <p>[22] Roget’s Hyperlinked Thesaurus, Categories of notions, 2020. URL: http://www.roget.org/scripts/hier.php/?class=I&amp;division=0&amp;section=III.</p>
      <p>[23] S. Magnus, T. Ruß, J. Krause, C. Diedrich, Modellsynthese für die Testfallgenerierung sowie Testdurchführung unter Nutzung von Methoden zur Netzwerkanalyse, at - Automatisierungstechnik 65 (2017) 73–86.</p>
      <p>[24] J. Krause, Testfallgenerierung aus modellbasierten Systemspezifikationen auf der Basis von Petrinetzentfaltungen, Ph.D. thesis, Otto-von-Guericke-Universität Magdeburg, 2012.</p>
      <p>[25] D. Grujic, T. Henning, E. J. C. García, A. Bergmann, Testing a Battery Management System via Criticality-based Rare Event Simulation, preprint, arXiv:2107.00530 [cs.SE], 2021.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ammann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Offutt</surname>
          </string-name>
          , Introduction to Software Testing, 2nd ed., Cambridge University Press,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Escalona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mejías</surname>
          </string-name>
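          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Aragón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ramos</surname>
          </string-name>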
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Torres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. J.</given-names>
            <surname>Domínguez</surname>
          </string-name>
          ,
          <article-title>An overview on test generation from functional requirements</article-title>
          ,
          <source>Journal of Systems and Software</source>
          <volume>84</volume>
          (
          <year>2011</year>
          )
          <fpage>1379</fpage>
          -
          <lpage>1393</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>I.</given-names>
            <surname>Ahsan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. H.</given-names>
            <surname>Butt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. W.</given-names>
            <surname>Anwar</surname>
          </string-name>
          ,
          <article-title>A comprehensive investigation of natural language processing techniques and tools to generate automated test cases</article-title>
          , in: ICC,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Garousi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Felderer</surname>
          </string-name>
          ,
          <article-title>NLP-assisted software testing: A systematic mapping of the literature</article-title>
          ,
          <source>Information and Software Technology</source>
          <volume>126</volume>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Riebisch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hubner</surname>
          </string-name>
          ,
          <article-title>Traceability-Driven Model Refinement for Test Case Generation</article-title>
          , in: ECBS,
          <year>2005</year>
          , pp.
          <fpage>113</fpage>
          -
          <lpage>120</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Rupp</surname>
          </string-name>
          ,
          <source>Requirements-Engineering und -Management: Aus der Praxis von klassisch bis agil</source>
          , 6th ed., Hanser
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mavin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harwood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Novak</surname>
          </string-name>
          , Easy Approach to Requirements Syntax (EARS), in: RE,
          <year>2009</year>
          , pp.
          <fpage>317</fpage>
          -
          <lpage>322</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Barros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cavalcanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mota</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sampaio</surname>
          </string-name>
          ,
          <article-title>NAT2TEST Tool: From Natural Language Requirements to Test Cases Based on CSP</article-title>
          , in: SEFM,
          <year>2015</year>
          , pp.
          <fpage>283</fpage>
          -
          <lpage>290</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>