Satisfaction Meets Practice and Confidence
                          Tom Bienmüller                                                              Tino Teige
                           SVP Products                                            Chief Research Engineer Innovation & Technology
                     BTC Embedded Systems AG                                                 BTC Embedded Systems AG
                        Oldenburg, Germany                                                       Oldenburg, Germany
                       bienmueller@btc-es.de                                                       teige@btc-es.de


        Abstract—The industrial application of formal methods and          These two partners and BTC-ES strive to providing products
    in particular of software verification tools, e.g. based on            and services for modeling, auto code generation, test
    satisfiability checking and symbolic computation as being in the       automation, and formal verification to customers. dSPACE
    main focus of the SC2 network, necessitates two main                   GmbH and BTC-ES support the tool chain Matlab®,
    requirements. The methods and tools, first, need to actually aim       Simulink®, Stateflow® and dSPACE’s TargetLink®. IBM
    at the problem class that occur in practice and, second, have to       Rational and BTC-ES provide solutions for the tool chains
    guarantee a high level of confidence. In this position paper, we       concerning the modeling tools Statemate® and Rhapsody®
    raise two challenges in the domain of software verification:           UML. Together with dSPACE GmbH, BTC-ES has a strong
    1) enhancing the field of application of software verification tools
                                                                           focus on the automotive domain with key customers in
    in practice by means of efficient support of IEEE-754-based
    floating-point models and 2) enhancing confidence of software
                                                                           Germany and Japan. Due to the cooperation with IBM
    verification tools by means of generating certificates for their       Rational the BTC-ES products are also widespread across
    computed analysis results.                                             several other domains, e.g. aerospace, defense, rail,
                                                                           telecommunication and consumer. The core competences of
       Keywords—formal specification; formal verification; model           BTC Embedded Systems AG are:
    checking; floating-point; proof certificates
                                                                                   model-driven development process

                           I. INTRODUCTION                                         formal specification and verification of requirements
        As an industrial associate partner of the SC2 network, we                  automatic test case generation for models and
    certainly play a minor role in the academic research activities                 requirements
    performed in that network. Even though having a matching
    academic background for this network, the benefit we see in                    automatic test case generation based on code
    our participation lays merely in the contribution we can give:                 automatic test case, trace, and vector execution
    show which topics are very close to the emerging needs real
    end users have who successfully apply formal test and                          model and code coverage
    verification tools already today. This could enable researchers
    to align their focus of work. Furthermore, as we as a tool                                    III. USE CASES
    vendor are close to real world applications, we can also
    provide data on these models coming from practice, allowing                As tool provider for automated testing and formal
    academic partners to validate the efficiency of their                  verification methods in safety critical development BTC-ES is
    approaches in a real world setting. The benefit for us is              challenged to keep pace with the increasing demands from end
    obvious: getting in touch with leading researchers in a very           users. There are two major aspects relevant to the SC2
    important domain and get insight into ideas and technologies           network:
    on how to answer to the emerging needs from end users.                         enhanced C-Code language support derived from the
                                                                                    increasing use of IEEE-754-based floating-point
                   II. BTC EMBEDDED SYSTEMS AG                                      processors, and
        BTC Embedded Systems AG (BTC-ES), a majority                               higher demands on the confidence in formal
    shareholding of BTC Business Technology Consulting AG,                          verification tools in safety-critical development
    provides products and services for formal verification,                         projects.
    automated validation, and automated testing of embedded
    systems software. BTC-ES products significantly reduce                 A. IEEE-754-Based Floating-Point Support
    required efforts for testing of embedded systems software, and
                                                                              Even though not thoroughly established yet, production
    considerably increase the quality of the developed systems.
                                                                           application of formal verification techniques more and more
    Additionally, BTC-ES offers on-site and off-site services
                                                                           requests to efficiently support floating-point arithmetic for
    around the testing products and testing in general in order to
                                                                           basic operations such as addition and multiplication, but also
    support customers in using the BTC-ES products successfully
                                                                           for complex operations such as square-root calculation,
    and efficiently. BTC-ES cooperates with two larger
                                                                           exponential and trigonometric functions. Current and well-
    enterprises, dSPACE GmbH and IBM Rational Software.


Copyright c by the paper’s authors. Copying permitted for private and academic purposes.
In: E. Ábrahám, J. Davenport, P. Fontaine, (eds.): Proceedings of the 1st Workshop on Satisfiability Checking and Symbolic Computation (SC2 ),
Timisoara, Romania, 02-07-2016, published at http://ceur-ws.org
established model checking techniques based on propositional       increasing trend of using floating-point processors and
satisfiability (SAT) checking or binary decision diagrams can      production code in industrial applications.
deal with those floating-point operations, but the computation
performance is very much negatively impacted by the                B. Confidence in Formal Verification Tools in Safety-Critical
necessity to reduce these operations to some Boolean decision          Development Projects
problem aka bit-blasting. Even though it is a good approach to
                                                                       In reactive embedded software development for
reuse the generic and well-established techniques for this type
                                                                   automotive applications, formal verification techniques such
of problem, the obvious drawback is that it can be a hard
                                                                   as model checking reached a high technology readiness level3.
problem for a Boolean solver to deal even with a small amount
                                                                   Off-the-shelf products implementing those techniques are
of combined floating-point variables.
                                                                   available and applied not only in pilot projects and research,
    To illustrate the problem of bit-blasting for more complex     but also in production. This evolution comes along with
arithmetic floating-point functions like the exponential           increasing demands in the technology confidence.
function exp() as provided by the standard C library math.h,       Furthermore, applying exhaustive formal verification
consider the following rather small formula to be solved:          techniques are viewed under return on investment constraints,
                                                                   obviously balancing efforts for formal verification with
                           1.4 == exp(x)                           traditional testing approaches. Besides that, as model checking
    with x being a 64-bit floating-point variable ranging in the   claims to be a “complete test”, it is assumed that all bugs in a
interval [-104.0, 89.0].                                           system under verification are detected using that technology.
                                                                   More precisely, no witness trace showing a violation of a
    To solve above formula, one can apply cbmc1, a bounded         formal requirement specification remains unrevealed. Even
model checker for C and C++ programs [1][2]. Before passing        though the techniques indeed are invented to give the ultimate
the formula to cbmc, one has to cope with a suitable               answer, this expectation leaves one important aspect out of
definition of the function exp() which is not provided by          consideration: what about semantic bugs in the model checker
cbmc. We remark that the C standard does not precisely             itself influencing that ultimate answer? Albeit it is well
define library functions like exp() but just gives a rough         understood that a software product of a size of a model
description. The implementation of exp() in our experiment         checker contains bugs itself, occurrences of such bugs in
consists of about 700 LOC including comments. Using cbmc           production projects obviously have disproportionately
with a SAT solver as its backend, the C code is translated into    dramatic impact on both the quality of the system and the cost
a SAT formula containing about quarter of a million variables      to establish them, but also in the product’s confidence. Note
and more than a million of clauses, provoking a rather high        that we are not talking about tool crashes here, but about bugs
runtime of about 5 minutes.                                        which are based on semantically wrong computations on a
    In recent years there is a trend to extend SAT by dedicated    system under verification: the model checking result is wrong.
theories leading to the problem of satisfiability modulo               Software products of this importance and impact require
theories (SMT). The SMT community currently provides a             dedicated measures to assure quality. Traditional software
large set of theories accompanied with powerful SMT solvers.       quality assurance techniques such as systematic testing are
Concerning above problem, the interval-based SMT solver            applied before shipping the product to end users. This kind of
iSAT32 [3][4] is a promising choice as it supports non-linear      offline quality assurance works well but it can of course not
real arithmetic involving complex mathematical functions like      exhaustively simulate the application at the end-user’s site.
exp(). Due to this built-in support, the internal formula          This means that still some bugs in the tool may remain
representation has no space penalty, yielding a very fast          undetected even though tremendous effort is spent for this
solving time of above formula of fractions of a second. The        kind of offline quality assurance. The arising question
obvious drawback of the iSAT3 approach (currently) is the          therefore is: what about the remaining bugs in model checkers
lack of floating-point interpretations of these complex            which are not detected offline to the end-user’s application?
mathematical functions, though some first steps towards            How can we detect the occurrence of semantic bugs in a
floating-point interpretation of basic operations were             model checker in some kind of online quality assurance in
undertaken most recently [5].                                      parallel to the production operation?
    The SMT approach perfectly fits into the challenge we are         The question of online quality assurance during real
facing. Though there already are first attempts to standardize     production application of model checkers is essentially the
the theory of floating-point arithmetic within the SMT             same as the question that the system under verification has
community [6][7], powerful tool support–in particular not          been translated correctly to a model checkers input plus
relying on bit-blasting–is still a challenging and ongoing task,   correct implementation of the model checking algorithms. In
e.g. [8]. We strongly believe that strengthening the effort on     terms of a model checker, the above question can also be
the development of efficient floating-point SMT solvers being      reduced to answer the following two queries:
able to handle complex mathematical functions will be very
beneficial for our customers, in particular with regard to the           1) Is a detected counter-example a genuine counter-
                                                                            example of the system under verification?

1
    http://www.cprover.org/cbmc
2                                                                  3
    https://projects.avacs.org/projects/isat3                          https://en.wikipedia.org/wiki/Technology_readiness_level
    2) How can be made sure that claiming absence of a              reducing to the cone of influence, unwinding internal loops,
       counter-example is a trustful answer that a counter-         and rewriting to single assignment code.
       example indeed does not exist?
   Intuitively, 1) is quite easy to answer. If a counter-example
is generated, then this counter-example can be viewed as
simulation trace. A simulation trace contains both stimulus
and computed values of the representation in the model
checker. Replaying that trace on the original system under
investigation by checking if some stimulus indeed leads to the
computed reactions as stated in the trace allows to safely judge
whether the counter-example is a genuine one also showing
malfunction of the original design4.
    Giving an answer to 2) is much harder. No counter-
example is generated which would allow for straight-forward
validation of the model checker’s outcome. Alternatively, one
could argue to simply let a second, different model checker
answer the same inquiry in parallel to the first one. Upon
successful application of the second model checker, diversity
arguments could be applied: whenever two different model
checkers return the same result then the probability of a wrong
result is very low. That is true and theoretically the problem is
solved with this argumentation. However and in particular in
real world applications, there could be some constraints                 Fig. 1. Typical verification tool chain starting with a C program and a
hindering the approach of really being applicable:                   specification and leading to a corresponding SAT formula (left) and idea of
                                                                     lifting back a potential proof certificate from the SAT to the C layer (right).
        As we are in a proprietary setting: is there a second
         model checker available at all?
                                                                        When employing an SMT and/or SAT solver as the
        Even if there is a second model checker available, is      backend engines of the analysis tool chain, the code is finally
         that model checker able to produce the same result in      translated into an SMT and/or SAT formula. It usually holds
         terms of complexity?                                       that the SMT/SAT formula is unsatisfiable if and only if the
        Would an end user accept a potential run time and          state where the observer variable is true is unreachable (for
         space increase?                                            some bounded search depth).

    An alternative approach might be based on formal proofs             Whenever it turns out that the SMT/SAT formula is
of absence of bugs, preferably on the original software layer.      unsatisfiable, we know that the given formal requirement is
When reducing the original formal requirement specification         fulfilled–at least in theory. In practice however the verification
to (state or line) reachability in a program then we aim at a       tool chain mentioned above is a very complex collection of
certificate of unreachability (of the corresponding state or        software being prone to error. Efficient SMT and SAT solvers
line). The definition of such certificates desirably should not     in particular are highly optimized, even on bit level. In order
rely on any concrete analysis technique and should be               to enhance trustworthiness of an unreachability result of the
revisable by a small stand-alone proof checker. This ultimate       overall verification tool chain, it is desirable to generate a
goal seems to be a very challenging and hardly investigated         certificate which facilitates a separate validation of the result,
scientific topic but would have a very big impact on the            namely independent from the actual model checking software.
confidence level of the system under verification and on                One natural approach is to generate a certificate on the
formal verification methods in general due to the possibility of    SMT/SAT solver level and then to lift this certificate back
validating verification results online on customer site.            through all the intermediate translation and transformation
    Let us roughly consider a typical verification tool chain as    layers up to the original code level as indicated in Fig. 1.
depicted in Fig. 1: the original C program is instrumented          Though there is an extensive work on certificates of
according to the formal requirement specification, e.g., by         unsatisfiability for SAT formulas (based on resolution proofs
introducing a new Boolean observer variable which is set to         of unsatisfiability, e.g. [9]), it seems that certificates of
true if and only if the given requirement becomes violated.         unreachability for imperative programming languages like C is
Thereafter, the C program is typically translated into some         hardly investigated. We believe that certification of
internal language in order to perform several transformation        verification results will become a major topic in future, in
steps into a syntactically simpler form. Some usual rules are       particular when establishing formal methods and their
inlining of function calls, flattening of structural data types,    trustworthiness in industry.


4
  Here, we abstract from bugs in simulators enabling to replay
a counter-example on the original system under investigation.
                      IV. CONCLUSION                               an inquired formal requirement? We believe that providing
                                                 2
    As industrial associate partner of the SC network we           evidence for results a model checker produces is very
described some major challenges which we believe to be very        beneficial when talking about bringing formal verification into
important to be addressed in order to bring formal verification    production. This is why we propose to think about some kind
alive in broad production application in automotive embedded       of proof certificates, which enable to probe whether a model
software development.                                              checker’s result “is true with almost absolute certainty”.

    First, with raising attractiveness of IEEE-754 floating-                                ACKNOWLEDGMENT
point processors in the automotive domain, adjusted model
checking techniques will be needed. Here SMT solving                  This work was supported by the H2020-FETOPEN-2016-
avoiding bit-blasting holds a lot of promise, not only for basic   2017-CSA project SC² (712689).
floating-point operations but also for complex ones like
exponential function, square-root, etc. We strongly believe                                      REFERENCES
that SMT is the right choice to approach this new problem          [1]   Daniel Kroening, Michael Tautschnig: CBMC - C Bounded Model
class as we obtained promising results here already also in              Checker - (Competition Contribution). TACAS 2014: 389-391
cooperations with other academic partners of the SC2 network.      [2]   Peter Schrammel, Daniel Kroening, Martin Brain, Ruben Martins, Tino
                                                                         Teige, Tom Bienmüller: Successful Use of Incremental BMC in the
    Second, application of model checkers in broad software              Automotive Industry. FMICS 2015: 62-77
production dramatically increases the demand of confidence in      [3]   Martin Fränzle, Christian Herde, Tino Teige, Stefan Ratschan, Tobias
the applied tool suites. Obviously, errors in one of a chain of          Schubert: Efficient Solving of Large Non-linear Arithmetic Constraint
tools used for developing and testing, in particular, safety             Systems with Complex Boolean Structure. JSAT 1(3-4): 209-236 (2007)
critical software may have tremendous impact in efforts and        [4]   Karsten Scheibler, Bernd Becker: Implication Graph Compression inside
costs on end-user’s side for quality assurance. In worst case,           the SMT Solver iSAT3. MBMV 2014: 25-36
the bug is not detected at all leading to issues in the produced   [5]   Karsten Scheibler, Felix Neubauer, Ahmed Mahdi, Martin Fränzle, Tino
system, the car, later on with effects hopefully not causing             Teige, Tom Bienmüller, Detlef Fehrer, Bernd Becker: Accurate ICP-
                                                                         Based Floating-Point Reasoning. FMCAD 2016 (to be published).
harm to human beings.
                                                                   [6]   Philipp Rümmer, Thomas Wahl: An SMT-LIB Theory of Binary
    For model checkers, the worst case scenario is semantic              Floating-Point Arithmetic. SMT 2010.
bugs: the model checker’s analyzed semantics of a system           [7]   Martin Brain, Cesare Tinelli, Philipp Rümmer, Thomas Wahl: An
under verification differs from the semantics of the original            Automatable Formal Semantics for IEEE-754 Floating-Point Arithmetic.
                                                                         ARITH 2015: 160-167
design, leading to wrong verification results. Returning a
wrong witness trace for a formal requirement violation is the      [8]   Martin Brain, Vijay D’Silva, Alberto Griggio, Leopold Haller, Daniel
                                                                         Kroening: Deciding floating-point logic with abstract conflict driven
easy scenario here–this can simply be tackled by subsequent              clause learning. Formal Methods in System Design 45(2): 213-245
simulation against the original semantics. But what is about a           (2014)
model checker returning no counter witness, claiming that          [9]   Lintao Zhang, Sharad Malik: Validating SAT Solvers Using an
there is no bug in the system under verification with respect to         Independent Resolution-Based Checker: Practical Implementations and
                                                                         Other Applications. DATE 2003: 10880-10885