Towards Using Formal Methods in Prototyping:
               Advantage or Impediment?
       Sebastian Schirmer                       Tino Teige                      Christoph Torens                    Udo Brockmeyer
      Unmanned Aircraft       BTC Embedded Systems AG       Unmanned Aircraft       BTC Embedded Systems AG
German Aerospace Center (DLR)    Oldenburg, Germany   German Aerospace Center (DLR)    Oldenburg, Germany
     Brunswick, Germany          tino.teige@btc-es.de      Brunswick, Germany        udo.brockmeyer@btc-es.de
   sebastian.schirmer@dlr.de                              christoph.torens@dlr.de


   Abstract—In aviation and other safety-critical domains, soft-            At the department Unmanned Aircraft of German Aerospace
ware faults are unacceptable. A means of detecting and avoiding          Center (DLR), prototypes are built to foster autonomy of
these faults is to use formal methods. Although formal methods           unmanned aircraft system. This also incorporates to consider
strongly contribute to the reliability and robustness of the system,
some drawbacks prevent their general usage. A drawback is their          AI methods and their validation. One problem in AI is that the
reputation to be hard to apply for non-experts. Non-experts have         function is not written as transparent and human-readable code
to be familiarized with the tools to efficiently make use of them.       which can be manually checked and traced. Instead, the func-
But is this reputation still valid? Over the years, formal methods       tion is learned and tested based on data and, therefore, often
tools have evolved. They are capable to analyze more complex             considered as black-box. Currently, there is a trend towards
system properties. Further, their user experience was addressed
by industrial companies to actually allow non-experts to profit          simulation validation of AI due to the lack of other means. To
from the advantages of formal methods.                                   check black-box systems, e.g. during simulation, one approach
   This paper represents the first step towards putting the men-         is runtime monitoring [3], depicted in Fig. 1. There, the inputs
tioned assumption under test by trying to use formal methods for         and outputs of the black-box are send to a monitor. The
prototyping. We propose an approach for software prototyping             monitor is generated based on a formal specification of desired
which makes use of the formalization of requirements. We depict
advantages and discuss first results of evaluating the commercial        system properties and evaluates whether the requirements
tool BTC EmbeddedPlatform R that we were able to use without             are met by the black-box [4]. Further, in case of an AI
cost in a project cooperation. We plan to continue the project           component, a monitor feedback loop, as additional qualitative
cooperation to answer the headline in future.                            or quantitative features, can also enhance the AI learning
   Index Terms—prototyping, formal specification, monitoring             phase [5]. For runtime assurance purposes, monitoring plays
                                                                         an essential role. Regulatory agencies are investigating the
                        I. I NTRODUCTION                                 means to ensure comparable safety of AI components, which
   In software development, validation is a difficult and error-         are in line with common practices. One promising means
prone task. There are several different development models               seems to be some version of the simplex architecture [6] to
that lead through the software life cycle and support the                sandbox the untrusted AI function. There, a decision module
development of the software product. Common to most of                   decides whether to switch from an untrusted function to a
these methods is a set of requirements from which code                   trusted function. Similar to the fuzzy values mentioned above,
and test cases are created. Complementary, when developing               the decision module requires accurate threshold values to
safety-critical software, recommended software standards such            switch safely and efficiently. An extension to the simplex
as DO-178C [1] also recommend traceability in both directions            architecture [5] was recently presented which incorporates
from requirement to code and from code to test cases. Further,           neural aspects of AI.
it is known that the earlier in the development process a bug
is found the cheaper it can be fixed.
   Often a proof of concept is developed with reduced rigor in
form of a prototype before the final software product is created.
In aviation, ARP4754A [2] even proposes prototyping to iden-
tify incorrect or missing requirements. One of the main reasons
for developing a prototype is improving the understanding of
the target product. Only with this knowledge it is possible
to define good software requirements. Otherwise, the values
used in requirements are unknown or too fuzzy. To derive
these values, we think that a structured way of prototyping
is desirable during the prototyping process, i.e. consistent and
documented adaptation of threshold values.                                     Fig. 1. Using runtime monitoring to check black-box systems.


Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
   In summary, an approach which allows to early identify
requirement violations and to safely (documented, consistent)
adapt requirement threshold values to capture or even bound
the behavior of the target (AI) application is highly desirable
for both developing an end-product as well as prototyping. In
this paper, we propose an approach based on the formalization
of requirements and depict its possible advantages. Further,
we point out first insights whether the commercial tool BTC
EmbeddedPlatform R can be used for the proposed approach.
                     II. R ELATED W ORK
   Imprecise or incorrect requirements is one of the root
causes of software errors and the associated costs to fix
a software error in late stages of development is highest
for errors introduced during requirements elicitation [7][8].
A requirements-based testing process that was supported by
formal methods has been shown to significantly reduce the                       Fig. 2. Proposed approach for prototyping.
number of issues before the implementation phase [9][10].
Previous work targeted this by formalizing requirements first
in a semi-formal natural language template and finally in          be automatically generated. Forth, monitors can be generated
a temporal logic [11]. The formal requirements were then           for each property and attached to the code, checking the
model-checked against a manually developed model of the            adherence to the property online. In the next subsection, we
software. Although this approach was extremely helpful to          describe one way how the formalization can be achieved. The
understand the system, it required a lot of manual development     approach is shown in Fig. 2 and explained in more detail in
effort and the approach did not include support for code and       Subsection III-B.
test development. With this work, we continue these efforts        A. Tool Support
and include additional aspects.
                                                                      BTC EmbeddedPlatform R (EP) is a tool suite for specifica-
                       III. A PPROACH                              tion, testing, and verification of requirements for Simulink R
                                                                   and TargetLink R models and production code, currently
   Often during prototyping, assumptions and requirements on
                                                                   mainly used in the automotive domain. EP has been success-
the system are implicitly given. Still, we think they should
                                                                   fully certified by German TÜV Süd as fit for purpose for the
be explicitly stated during the prototyping process. Especially
                                                                   usage in safety-critical software development projects accord-
functional requirements are interesting for prototyping: every
                                                                   ing to IEC 61508-3:2010, ISO 26262, EN 50128, IEC 62304
100Hz the controller has to output a value; control-flow
                                                                   as well as ISO 25119. However, there is no dedicated tool
properties like if the pilot requests ascending of the drone
                                                                   qualification support for the aerospace domain, as of yet.
then the drone shall increase height within 0.2 to 0.5 seconds;
                                                                      The workflow for the use case of this paper is shown in
or if an obstacle is detected above then the drone shall stop
                                                                   Fig. 3. Textual requirements are first imported into EP. Then
to increase height within 0.1 seconds; or even high-level
                                                                   the BTC specification method guides the user during the com-
assumptions like during daytime tracking of an object works.
                                                                   plex conversion phases from an informal textual requirement to
They can actually help developers to avoid false, e.g. outdated
                                                                   a uniquely defined and correctly formalized requirement. One
due to system changes, assumptions on the current software
                                                                   particularity of the method is that the underlying formalism
state. Of course, writing and checking requirements on the
                                                                   called Simplified Universal Pattern (SUP) inherently reflects
prototype level imply an additional workload. However, it can
                                                                   the intrinsic nature of an informal requirement by describing
guide towards the source of error. Also, after prototyping,
                                                                   a temporal trigger-action relationship, namely in an intuitive
written requirements offer a good starting point for creating
                                                                   and graphical way.
the full set of requirements. In general, prototyping should not
turn into an end-product development, therefore a lightweight
approach is desirable.
   The approach focuses on the formalization of such func-
tional requirements. As a result, the requirements could be
depicted as an automaton or formula. This is an additional
step which needs to be light and efficient. But the main
advantages of formalized requirements are: First, they offer
an unambiguous documentation. Second, at an early stage
consistency can be automatically checked and, therefore, later
code iterations potentially avoided. Third, basic test cases can            Fig. 3. Workflow using BTC EmbeddedPlatform R .
   We briefly introduce the idea behind SUP by means of an
example. For more details about the syntax and semantics
of SUP the reader is referred to [12][13]. Fig. 4 gives two
examples of formal requirements using the temporal trigger-
action relation of the graphical SUP language. The SUPs FR-1
and FR-2 formalize the textual requirements from Section III
“If the pilot requests ascending of the drone then the drone
shall increase height within 0.2 to 0.5 seconds.” and “If
an obstacle is detected above then the drone shall stop to
increase height within 0.1 seconds.”, respectively. An example                       Fig. 5. Example executions of SUPs.
execution of FR-1 is shown in the upper part of Fig. 5: in step 1
the pilot requests ascending of the drone and thus the trigger
of the SUP is activated. This in turn means that the action is      actually test the requirements, most often is a manual and thus
expected to occur two to five steps later (where the step size      very time-consuming task. As shown in Fig. 3, EP facilitates
is defined to be 0.1 seconds). In fact, the height is increased     the automatic generation of test cases by taking into account
in step 5 meaning that the action part of the SUP fires which       the intended behavior described by the requirements, leading
leads to a fulfillment of the SUP in the same step.                 to high-quality test cases, cf. [14]. For the SUPs FR-1 and FR-
                                                                    2, a generated test case is shown in the upper part of Fig. 5.
   Based on formalized specifications, fully automated formal
checks can be executed in order to find and fix problems               Another capability is to export executable monitors, also
in the requirements set at a very early design stage, which         called observers, from formal requirements as depicted in
leads to greater product quality with less cost compared to         Fig. 3. These observers implement the behavior of the formal
traditional processes. As indicated by Fig. 3, we consider the      requirements and establish a very flexible way of monitoring
consistency check of requirements within this paper, i.e. we        the correctness of the behavior of the actual system under
aim at finding any contradiction within the requirements fully      test in external simulation environments. Technically, these
automatically [14]. When applying consistency analysis on the       observers are exported as a functional mock-up unit (FMU)
two SUPs of Fig. 4, EP reveals an inconsistency witnessed           in the tool independent FMI standard1 , cf. [15].
by a violating execution shown in the lower part of Fig. 5:
                                                                    B. Prototyping Using Formal Requirements
the pilot requests ascending of the drone in step 0. Due
to FR-1, this enforces to increase height at least in step 2          We propose an iterative prototyping approach (cf. Fig. 2):
and at most in step 5. In the concrete inconsistency witness,        1 Informal requirements on the application are collected.
height is not increased in steps 2, 3, and 4, i.e. the only          2 We formalize the requirements using EP. Note that EP
chance to fulfill FR-1 is to increase height in step 5. However,       focuses on functional requirements. Requirements on the
in step 5 an obstacle occurs which triggers FR-2. A valid              system design, e.g. structural redundancies, cannot be
behavior by FR-2 is to stop to increase height in the same             expressed in general.
step which in turn contradicts with finishing the action of FR-      3 Then, formal requirements are checked for consistency.
1. This establishes an inconsistency between FR-1 and FR-              An automatic check is highly desirable since the software
2. Though the detection of inconsistencies is fully automatic,         is build iteratively and variables can change at any point
their resolution is a manual task but with the support of a            in the prototype development. Also, especially, wrong
concrete counterexample. In the example above, it was most             timing assumptions are hard to find during later iteration.
likely forgotten to consider detected obstacles for the pilot          Imagine the requirement on component c when input
request in FR-1. One potential solution would be to define an          a is set, output b should be set within 50ms. At a
exit (cf. [12][13]) in FR-1 whenever an obstacle is detected,          later development phase, component c is divided into
with the effect that observation of FR-1 is restarted (without         consecutive subcomponents c1 and c2 with requirements
violating or fulfilling the SUP) whenever an obstacle occurs.          if input a is set then a1 is set after 25ms and when input
   Finding requirement-specific test cases, i.e. test cases that       a1 is set then b is set within 25ms, respectively. Due to
                                                                       the automated check of the formal requirements, we can
                                                                       automatically detect an inconsistency by finding a trace
                                                                       falsifying the initial requirement on c. The required time
                                                                       for the second property is unbounded, after 25ms and
                                                                       could lead to a violation, i.e. b could be set after 50ms.
 (FR-1)                                                              4 Based on formal requirements Test Cases and FMU
                                                                       Observers are generated by EP.
                                                                     5 We instrument the system code to send required data
(FR-2)                                                                 to the generated monitors. The monitors evaluate the

          Fig. 4. Two SUPs FR-1 (upper) and FR-2 (lower).            1 Details about FMU and FMI can be found at https://fmi-standard.org/.
     specified properties and return an early verdict whether       # −−−−−−−−−−−−−−−− M o n i t o r−s i d e −−−−−−−−−−−−−−−−                         1
     the system adheres to the system requirements. Note            fmu = FMU( p a t h )                                                              2
     that it is preferable to monitor the system outline on a       fmu . i n s t a n t i a t e ( . . . )                                             3
                                                                    mapping = { V a r i a b l e −> ( ID , Type ) }                                    4
     dedicated hardware to avoid impacts on the system.             s o c k e t = o p e n S o c k e t ( i p , p o r t , mapping )                     5
  6 Next, we run our system. First, we use the generated Test       while ( t r u e ) :                                                               6
     Cases. Then, simulations are used.                                   ID , d a t a , e v a l u a t e = s o c k e t . r c v ( )                    7
                                                                          fmu . s e t V a l u e ( ID , d a t a )                                      8
After each step, the artifacts are checked for violations,                if evaluate :                                                               9
e.g. during execution the outcome of the FMU Observers are                    fmu . d o S t e p ( . . . )                                             10
                                                                               f o r key i n mapping :                                                11
checked whether any requirements are violated.                                      p r i n t ( key , fmu . g e t B o o l e a n V a l u e ( key ) )   12
                                                                      . . . Teardown                                                                  13
                       IV. D ISCUSSION                              # −−−−−−−−−−−−−−−− System−s i d e −−−−−−−−−−−−−−−−                                14
   Currently, EP is able to realize the required functions. But       ...           s t a r t up . . . i n i t i a l i s e s o c k e t                15
                                                                    while ( t r u e ) :                                                               16
its scalability on real-world example from aviation domain is             hgt cmd , c u r h g t =getHgtCmd ( ) , g e t C u r H e i g h t ( )          17
unclear and needs to be addressed in future. First tests indicate         obs = checkObstacleAbove ( )                                                18
that the formalization of textual requirements is easy to use.            cmd hgt = c a l c u l a t i o n ( hgt cmd , c u r h g t , o b s )           19
                                                                          s e t C m d H e i g h t ( cmd hgt )                                         20
However, currently, one is limited to SUP which represent                 sendData (                                                                  21
temporal trigger-action relationships. Still, based on experi-            [ ’request_ascending’ , ’increase_hgt’ , ’obstacle’ ]                       22
ence, most of the functional requirements do actually fit in this         [ hgt cmd > 0 , cmd hgt > 0 , o b s t a c l e a b o v e ] )                 23
                                                                      . . . Teardown                                                                  24
pattern. For others, other languages exist like the stream-based
specification language Lola [16] which was already used to
                                                                    Fig. 6. Simplified code fragments for the data exchange between the
monitor the status of unmanned aircraft systems during flight.      instrumented system and the generated monitors.
   In future work, we have to investigate the trade-off of the
approach in order to estimate the cost-benefit ratio between
writing and formalizing requirements and the added value of                                         R EFERENCES
better code at early stages, e.g. test cases and monitors. Clear     [1] RTCA, “DO-178C/ED-12C Software Considerations in Airborne Sys-
metrics have to be found to estimate the benefit. Formalization          tems and Equipment Certification,” Radio Technical Commission for
overhead also incorporates mistakes during formalization. To             Aeronautics, 2011
                                                                     [2] SAE International, “ARP 4754A Guidelines for Development of Civil
which degree these kinds of mistakes are automatically de-               Aircraft and Systems,” SAE International, 2010
tected by the consistency checks is also important.                  [3] Y. Falcone, K. Havelund, and G. Reger, “A tutorial on runtime verifica-
   During simulation, the use of generated monitors offer                tion,” In Engineering Dependable Software Systems, 2013
                                                                     [4] K. Y. Rozier, “From Siumlation to Runtime Verification and Back:
clear benefits like early feedback and avoidance of inline               Connecting Single-Run Verification Techniques,” In SpringSim, 2019
assertions. There, the main concern is the integration into          [5] D. Phan, N. Paoletti, R. Grosu, N. Jansen, S. Smolka, and S. Stoller,
the system setup, especially the system instrumentation. The             “Neural Simplex Architecture,” arXiv 1908.00528. 2019
                                                                     [6] D. Seto, B. Krogh, L. Sha, and A. Chutinan, “The Simplex Architecture
monitors need to be able to run along the system while having            for Safe Online Control System Upgrades,” In Proc. 1998 American
a low impact on its execution. Figure 6 exemplary shows                  Control Conference Vol. 6.
our current system instrumentation. We are not limited on            [7] R. R. Lutz, “Analyzing software requirements errors in safety-critical,
                                                                         embedded systems,” Proceedings of the IEEE International Symposium
passively reading databuses. We also instrumented the code to            on Requirements Engineering, 1993
send information about the internal software status via UDP.         [8] Davis, Alan. “Just enough requirements management: where software
                                                                         development meets marketing.” Addison-Wesley, 2013.
                       V. C ONCLUSION                                [9] M. Whalen, D. Cofer, S. Miller, B. Krogh, and W. Storm, ‘Integration
                                                                         of formal analysis into model-based software development process,”
   We motivated a structured approach for prototyping. The               Formal Methods for Industrial Critical Systems, 2008
approach is centered around the formalization of require-           [10] P. Skokovic, “Requirements-Based Testing Process in Practice,” Interna-
ments. It allows to automatically check their consistency and            tional Journal of Industrial Engineering and Management (IJIEM), Vol.1
                                                                         No 4, 2010, pp. 155 - 161, ISSN: 2217-2661, 2010
to generate test cases and monitors. The approach can be            [11] C. Torens, F.-M. Adolf, “Fail-Safe Systems from a UAS Guidance
implemented using open-source tools. As formal specification             Perspective,” In: Encyclopedia of Aerospace Engineering UAS. Wiley.
language, linear temporal logic (LTL) could be a candidate               ISBN 9780470686652
                                                                    [12] T. Teige, T. Bienmüller, and H.J. Holberg, “Universal Pattern: Formal-
for which tools for model checking, runtime monitoring, and              ization, Testing, Coverage, Verification, and Test Case Generation for
test generation do exist. We decided to use a commercial                 Safety-Critical Requirements,” MBMV 2016: 6-9
tool that combines all of these approaches. We showed how           [13] J.S. Becker, “Analyzing Consistency of Formal Requirements,” ECE-
                                                                         ASST 76 (2018)
the integration into development could be done. Then, we            [14] T. Bienmüller, T. Teige, A. Eggers, and M. Stasch, “Modeling Require-
discussed the barriers for an actual integration. In future, we          ments for Quantitative Consistency Analysis and Automatic Test Case
plan to actually apply the approach and to report findings to            Generation,” FM&MDD 2016
                                                                    [15] H. Esen, M. Kneissl, A. Molin, S. vom Dorff, B. Böddeker, E.
come closer to answering the question whether formal methods             Möhlmann, U. Brockmeyer, T. Teige, G.G. Padilla, S. Kalisvaart,
pay off for prototyping.                                                 “Validation of Automated Valet Parking,” Validation and Verification
                                                                         of Automated Systems, Springer, 2019
                                                                    [16] F-M. Adolf, P. Faymonville, B. Finkbeiner, S. Schirmer, and C. Torens,
                                                                         “Stream Runtime Monitoring on UAS,” In Runtime Verification, 2017