Accurate Dead Code Detection in Embedded C
           Code by Arithmetic Constraint Solving
                             Felix Neubauer∗ , Karsten Scheibler∗ , Bernd Becker∗ , Ahmed Mahdi† ,
                                Martin Fränzle† , Tino Teige‡ , Tom Bienmüller‡ , Detlef Fehrer§
                                  ∗ Chair of Computer Architecture, University of Freiburg, Germany
                     † Research Group Hybrid Systems, Carl von Ossietzky University of Oldenburg, Germany
                                          ‡ BTC Embedded Systems AG, Oldenburg, Germany
                                                      § Sick AG, Waldkirch, Germany


     Abstract—Unreachable code fragments in software, despite             which does not allow complex dead code analysis at compile-
 not having a negative impact on the functional behavior, can             time, but the analysis itself has to be sound, which means that
 have a bad effect in other areas, such as code optimization or           only really unreachable code may be marked as dead. The
 coverage-based code validation and certification. Especially the
 latter is important in industrial, safety critical environments,         consequence of this tradeoff is a vastly incomplete detection,
 where detecting such dead code is a major goal to adjust the             where substantial amounts of dead code are not discovered.
 coverage of software tests.                                              In our use cases we also need a sound method to securely
     In the context of embedded systems we focus on C pro-                detect dead code, but additionally it has to be highly accurate
 grams which are reactive, control-oriented, and floating-point           with respect to both the amount (enhancing the interpretation
 dominated. Such programs are either automatically generated
 from Simulink-plus-Stateflow models or hand-written according
                                                                          of the test coverage) and localization (approching the demand
 to coding guidelines. While there are various techniques –               of dead code free programs) of the dead code detected.
 e. g. abstract interpretation or Counterexample guided abstrac-             In the context of embedded systems our focus lies on
 tion refinement (CEGAR) – to deal with individual issues of this         reactive, control-oriented, and floating-point dominated C
 domain, there is none which can cover all of them. The AVACS             programs, in particular those automatically generated from
 transfer project T1 aims at the combination of these techniques
 in order to provide an accurate and efficient dead code detection.       Simulink-plus-Stateflow models of embedded control appli-
     In this paper we present the ideas and goals of the project as       cations or those hand-written according to coding guidelines.
 well as the current status (including very promising experimental        Dead code detection in this domain is quite difficult because
 results) and future challenges.                                          there might be no clear distinction between control and data
                                                                          in the C code. Considering automatically generated code this
    Index Terms—formal specification, formal verification, SMT,
 floating-point, dead code detection, CEGAR, abstract interpre-
                                                                          lack of distinction is caused by encoding the major part of
 tation                                                                   the model’s control flow into the data part of the C code.
                                                                          Hand-written embedded control software on the other hand
                         I. I NTRODUCTION                                 often has a central control loop which defers obligations to a
                                                                          set of dedicated handlers. Some of these handlers are generic,
    The occurence of dead (i. e. unreachable) code fragments              offering both a state and data dependent control flow. This
 in sizable software development projects is a regular phe-               construction also blurs the distinction between control and
 nomenon – it can be created either intentionally (e. g. defen-           data. An additional difficulty is the dependency of the control
 sive programming) or unintentionally (e. g. reusing of code,             flow on complex floating-point computations.
 modification of program context) in hand-written programs or                The development of a method to handle these challenges,
 as a byproduct of automated code generation. Although dead               namely:
 code has no negative impact on the functional behavior of the            (1) handling the lack of distinction between control and data
 software, it can have a bad effect e. g. on code optimization            (2) dealing with the dominant impact of (often non-linear)
 or coverage-based code validation and certification. Especially              floating-point computations on the reachability of code
 the latter is important in industrial, safety critical environ-          (3) scaling to industrial-size programs
 ments, where standards (e. g. ISO 26262) for high-integrity
 software demand, among other criteria, reaching a certain test           is the main goal of the AVACS1 transfer project T1. This
 coverage or the absence of dead code. Obviously the presence             project is a transregional collaboration between the University
 of dead code violates the latter demand but also affects the             of Freiburg2 , the Carl von Ossietzky University of Oldenburg3
 interpretation of coverage-based criteria negatively. Therefore          and the industrial partners BTC Embedded Systems AG4 in
 it is a major goal to detect dead code from high-integrity                 1 http://www.avacs.org
 software and thereby be able to adjust the test coverage.                  2 http://ira.informatik.uni-freiburg.de
    The classical dead code detection in compilers is not                   3 https://www.uni-oldenburg.de/hs

 suitable to meet these demands. Compilers have to be fast                  4 https://www.btc-es.de


Copyright c by the paper’s authors. Copying permitted for private and academic purposes.
In: E. Ábrahám, J. Davenport, P. Fontaine, (eds.): Proceedings of the 1st Workshop on Satisfiability Checking and Symbolic Computation (SC2 ),
Timisoara, Romania, 02-07-2016, published at http://ceur-ws.org
Oldenburg and Sick AG5 in Waldkirch.                                  1 #include <math.h>
   There are various techniques which are able to deal with           2
                                                                      3 typedef struct { double x, y; } point_t;
individual issues of the list above. Abstract interpretation (AI)     4
[1]–[4] can handle very large programs (3), but lacks exactness       5 double distance(int length, point_t p[]) {
                                                                      6   double sum = 0;
when the data part of the state space can be fractioned               7   unsigned int i;
and reshuffled more or less arbitrarily by data operations.           8   for (i = 0; i < length-1; i++) {
                                                                      9     sum += sqrt((p[i].x - p[i+1].x) * (p[i].y - p[i+1].y)
Counterexample guided abstraction refinement (CEGAR) [5]–                        );
[7] is often able to create concise abstractions of the inter-       10   }
                                                                     11   return sum;
play of control and data (1) but is currently limited to the         12 }
analysis of either programs with confined arithmetic frag-           13
                                                                     14 unsigned char step(int length, point_t p_array[]) {
ments (mostly linear) or of complete hybrid systems with             15   double dist_value = distance(length, p_array);
tiny control skeletons. Regarding the second issue there was         16   unsigned char dist_level_ok;
                                                                     17   if (dist_value <= 10.0) {
much progress lately in the field of non-linear arithmetic           18     dist_level_ok = 0;
constraint solving based on conflict-driven clause learning over     19   } else {
                                                                     20     dist_level_ok = 1;
arithmetic theories (CDCL(T)) [8], [9]. It is now possible to        21   }
solve extremely large arithmetic constraint systems including        22   return dist_level_ok;
                                                                     23 }
a complex Boolean structure and linear, polynomial and even          24
transcendental arithmetic. But there is no support for loops in      25 point_t p_array[3]; // inputs
                                                                     26 unsigned char dist_level_ok; // output
the control flow and the method does not scale well enough           27
for complex programs with a full branching structure.                28 int main(void) {
                                                                     29   while (1) {
   The transfer project aims at the combination of those             30     // receive inputs: p_array
techniques in order to handle all required challenges and to         31     dist_level_ok = step(3, p_array);
                                                                     32     // transmit outputs: dist_level_ok
provide an accurate and efficient dead code detection. This          33     // wait for next iteration
paper presents the current status of the project, ideas and          34   }
                                                                     35 }
possible solutions to unresolved issues and further challenges.
      Structure of the paper: After introducing the goals and                            Listing 1. example 1 orig.c
challenges of the AVACS transfer project in the current section,
we describe the model checking-based tool chain for structural
code coverage analysis of BTC Embedded Systems in Section
                                                                     chain. The “main”-function sketches a typical reactive program
II. In Section III we speak about the integration of our SMT-
                                                                     with an unbounded feedback loop: first, all input data for the
solver iSAT3 as an alternative backend solver into this tool
                                                                     current iteration or step are received, then the actual function
chain and the experimental results vs. the standard backend
                                                                     is executed, and finally the computed outputs are transmitted.
solver. Section IV gives an outlook about future challenges,
                                                                     Thereafter the program waits for the next loop iteration.
before Section V concludes this paper.
                                                                        In each step, the “step”-function takes an array of three
      II. M ODEL C HECKING -BASED T OOL C HAIN FOR                   two-dimensional points (modelled by struct type “point t”
          S TRUCTURAL C ODE C OVERAGE A NALYSIS                      consisting of two variables of scalar type “double”) as input
                                                                     and returns, whether the level of the aggregated distance of
    One of the most prominent test criteria for production code
                                                                     consecutive points is okay or not.
in industrial and in particular safety-critical domains relies on
structural code coverage such as function, statement, branch,           Please note that the C code contains several floating-point
and condition coverage or MC/DC [10]. Such coverage is               computations and, moreover, relies on the function “sqrt” from
often measured on a percentage basis by means of program             the standard C library “math.h”. This actually is representative
simulations stimulated by so-called test cases or vectors,           as there is an increasing trend of using floating-point proces-
i. e. the number of tested coverage goals relative to the number     sors and production code in industrial applications.
of all goals. Intuitively, the higher the code coverage the             With regard to the use of the “math.h” library, we remark
higher the test confidence and thus the lower the probability        that there are two major ways of dealing with such complex
of undetected software bugs.                                         mathematical functions such as “sqrt” in a model checking
    In the following, we describe by way of example a typical        context: first, the production code itself resolves the prob-
tool chain for automatic analysis of code coverage being based       lem, namely by reducing such complex functions to basic
on model checking backend tools.                                     operations, or, second, to keep these functions in the code
    In Listing 1, a small excerpt of a C code under test is given.   and apply a model checker that has dedicated support for
Though – for illustration reasons – the code fragment is rather      such complex functions. Concerning performance, the second
simple, it however contains some frequent code constructs            option has a clear benefit over the first one as implementations
occuring in practice which need to be handled within the tool        of mathematical functions are frequently very complex and
                                                                     contain several hundreds lines of code. A model checker
  5 https://www.sick.com                                             with native “math.h” support can exploit dedicated algorithms
which are generally much more efficient. In the following, we       1   #include <math.h>
assume a model-checking backend tool that is able to handle         2
                                                                    3   // new observer variables for coverage
the function “sqrt”.                                                4   unsigned char LINE_X_REACHED = 0;
   In the first step of the tool chain for analyzing code           5   double p_array_0_x, p_array_1_x, p_array_2_x; // inputs
                                                                    6   double p_array_0_y, p_array_1_y, p_array_2_y; // inputs
coverage, we need to annotate the original program with             7
additional constructs to automatically deal with code coverage      8   int main(void) {
                                                                    9     while (1) {
metrics. In our example we concentrate on statement coverage       10       // receive inputs: p_array
and – for the sake of simplicity – we just aim at covering         11       double dist_value;
                                                                   12       {
the statements in lines 18 and 20. For the purpose of ob-          13         double sum = 0;
serving the reachability of the corresponding lines, two new       14         unsigned int i;
                                                                   15         for (i = 0; i < 2; i++) {
variables “LINE X REACHED”, “LINE Y REACHED” are                   16           switch (i) {
introduced. Listing 2 shows the neccessary changes for one         17             case 0:
                                                                   18               sum += sqrt((p_array_0_x - p_array_1_x) * (
possible annotation method.                                                              p_array_0_y - p_array_1_y));
                                                                   19               break;
                                                                   20             case 1:
   [...]                                                           21               sum += sqrt((p_array_1_x - p_array_2_x) * (
   // new observer variables for coverage                                                p_array_1_y - p_array_2_y));
   unsigned char LINE_X_REACHED = 0;                               22               break;
   unsigned char LINE_Y_REACHED = 0;                               23           }
   [...]                                                           24         }
     if (dist_value <= 10.0) {                                     25         dist_value = sum;
       LINE_X_REACHED = 1;                                         26       }
       dist_level_ok = 0;                                          27       if (dist_value <= 10.0) { LINE_X_REACHED = 1; }
     } else {                                                      28       // transmit outputs: dist_level
       LINE_Y_REACHED = 1;                                         29       // wait for next iteration
       dist_level_ok = 1;                                          30     }
     }                                                             31   }
   [...]
                                                                                 Listing 3. example 345 funcs types cone.c
               Listing 2. example 2 instr.c (changes)


   After this annotation with coverage goals, normally the C            [...]
                                                                        int main(void) {
code is translated to an intermediate language called SMI                 while (1) {
(which is a C-like programming language). To keep this                      // receive inputs: p_array
                                                                            double dist_value;
example simple, we skip this translation and explain the further            {
steps on C code also.                                                         double sum = 0;
                                                                              sum += sqrt((p_array_0_x - p_array_1_x) * (
   The original code is now instrumented in such a way that                        p_array_0_y - p_array_1_y));
automatic analysis w. r. t. code coverage by a model checker                  sum += sqrt((p_array_1_x - p_array_2_x) * (
                                                                                   p_array_1_y - p_array_2_y));
is rendered possible. In order to reach the input language                    dist_value = sum;
of a model checker, e. g. a declarative, logical description                }
                                                                            if (dist_value <= 10.0) { LINE_X_REACHED = 1; }
as in a SAT or SMT solver, the instrumented code needs to                   // transmit outputs: dist_level
be transformed into a syntactically simpler form. Two usual                 // wait for next iteration
                                                                          }
transformation steps are the following:                                 }
   • function calls are resolved
                                                                                    Listing 4. example 6 opt.c (changes)
   • complex structural data types are flattened

   Usually, a model checker is called separately per proof
obligation, in our context per coverage goal. To further op-
                                                                      The resulting code in static single assignment form can
timize the code in that respect, a cone-of-influence reduction
                                                                   then be translated in a rather simple way into a corresponding
is performed, i. e. all code fragments that do not directly
                                                                   SMT formula Φ(k) with loop iteration depth k such that the
or indirectly influence the state of the corresponding goal
                                                                   following property holds:
expression are removed. Listing 3 shows the result of the two
                                                                      Property 1: The goal “variable LINE X REACHED be-
transformation steps mentioned above as well as an example
                                                                   comes 1” is covered within k steps (meaning that the statement
for the cone of the goal “variable LINE X REACHED be-
                                                                   in line 18 of the original code is reached) if and only if the
comes 1”.
                                                                   SMT formula Φ(k) is satisfiable.
   Depending on the capabilities and requirements of the con-
                                                                      One very important feature of the above approach is that a
crete model checker, further transformations might be applied,
                                                                   satisfying assignment of Φ(k) corresponds to a test case of the
such as:
                                                                   original program that proves coverage of the corresponding
   • loop unwinding (naive or optimized; see Listing 4 for the
                                                                   goal. We remark that construction of such a test case is a
     optimized variant)                                            layered approach starting from the satisfying assignment of
   • static single assignment form, cf. Listing 5
                                                                   the lowermost SMT layer up to the original C code layer.
 1    #include <math.h>                                              the SAT-solver. In case of an inconsistency, the theory solver
 2                                                                   identifies an infeasible subset of the theory atoms which is
 3    // new observer variables for coverage
 4    unsigned char LINE_X_REACHED = 0;                              then encoded into a clause and added to the Boolean skeleton.
 5    // inputs                                                      Thus, in lazy SMT the SAT solver is called multiple times
 6    double p_array_0_x, p_array_1_x, p_array_2_x;
 7    double p_array_0_y, p_array_1_y, p_array_2_y;                  and operates together with a theory solver in order to refine
 8                                                                   the Boolean skeleton incrementally. Therefore, this scheme is
 9    int main(void) {
10      while (1) {                                                  also abbreviated as DPLL(T) or CDCL(T) – with T being the
11        // receive inputs: p_array                                 theory used within the atoms.
12        double dist_value;
13        double sum1;                                                  In contrast to CDCL(T), the iSAT algorithm [8], [11]
14        unsigned char dist_value_leq_10;                           does not have such a separation between the SAT and the
15        sum1 = sqrt((p_array_0_x - p_array_1_x) * (
               p_array_0_y - p_array_1_y));                          theory part. Instead, interval constraint propagation (ICP, see
16        dist_value = sum1 + sqrt((p_array_1_x - p_array_2_x)       e. g. [12]) is tightly integrated into the CDCL framework in
               * (p_array_1_y - p_array_2_y));
17        dist_value_leq_10 = (dist_value <= 10.0);                  order to reason about the theory atoms directly – yielding
18        LINE_X_REACHED = dist_value_leq_10 ? 1 :                   a unified lattice-based view in the meantime also known as
               LINE_X_REACHED;
19        // transmit outputs: dist_level                            abstract conflict-driven clause learning (ACDCL) [13].
20        // wait for next iteration                                    In this paper we build on the third implementation of
21      }
22    }                                                              the iSAT algorithm which is called iSAT3 [14], [15]. The
                                                                     search process in iSAT is similar to CDCL: it consists of
                                Listing 5. example 7 ssa.c
                                                                     alternating propagation and decision phases interspersed with
                                                                     the resolution of conflicts – but additionally, the first two
                                                                     phases are extended with ICP and interval splits. Similar
   If it turns out that the SMT formula Φ(k) is unsatisfiable        to CDCL-based SAT solvers which operate on a conjuctive
then the corresponding coverage goal is unreachable within k         normal form (CNF), iSAT operates on a CNF as well – but
iterations. This may be a hint for dead code.                        enriched with so-called primitive constraints (PCs) which are
   BTC Embedded Systems solves the SMT formula Φ(k)                  the result of a Tseitin-like transformation of the original theory
with the BMC solver CBMC in the backend using k-induction            atoms. This transformation ensures that each PC contains
to perform unreachability proofs. In the following section           only one arithmetic operation – which helps to keep the ICP
we will use the interval constraint solver iSAT36 with Craig         contractors for each PC-type simple.
interpolation as backend solver and compare it to CBMC. The             Thus, ICP contractors are responsible for performing the
iSAT algorithm was developed in AVACS project H1/2.                  arithmetic reasoning. Adding support for new operations es-
                III. I SAT3 AS BACKEND SMT-S OLVER                   sentially requires adding further ICP contractors. Recently,
                                                                     we adapted iSAT3 in order to support accurate reasoning
   While SAT solving aims at finding solutions for proposi-          for floating-point arithmetic [16]. Somewhat earlier, another
tional formulas (or proving the absence thereof), SMT aims at        ICP-based decision procedure for floating-point arithmetic was
solving boolean combinations of so-called theory atoms. Such         proposed in [17]. But in contrast to [17], our approach also
atoms may for example contain linear arithmetic over the reals       contains support for bitwise integer operations – as C programs
or constraints involving floating-point arithmetic.                  usually contain a mix of floating-point arithmetic, integer
   In eager SMT the Boolean structure as well as all theory          arithmetic and bitwise integer operations. This allows iSAT3 to
atoms are translated into one big propositional formula. A SAT       be used as the first non-bit-blasting SMT solver for automatic
solver is then called once in order to determine whether there       dead-code detection in floating-point dominated embedded C
is a solution or not. This approach is also called bit-blasting.     programs. Thus, we did a prototypical integration of iSAT3
While it is in general not applicable to all theories, it is up to   into the tool chain explained in Section II.
now the most prominent approach for SMT solvers addressing              Supporting the floating-point domain also affects the com-
floating-point arithmetic – e.g. Z37 , Mathsat8 and CBMC9 rely       pleteness of the ICP-based decision procedure. While ICP
on this technique.                                                   reasoning is in general incomplete on real arithmetics, it gains
   When doing SMT lazily, the formula is split into a set of         completeness in the finite floating-point domain.
theory atoms and a Boolean skeleton which abstracts the truth-          In the following we give a summary of the experiments per-
values of the theory atoms with Boolean literals. The Boolean        formed in [16]. We relied on a set of benchmarks which orig-
skeleton is processed by a SAT-solver in order to search for         inate from TargetLink10 -generated production C code (com-
a satisfying assignment. If such an assignment is found, a           piled from Simulink-Stateflow models) containing floating-
separate theory solver is used to check whether the theory           point arithmetic and integer arithmetic as well as casts between
atoms are consistent under the truth values determined by            these two domains. Each single benchmark represents a goal
     6 https://projects.informatik.uni-freiburg.de/projects/isat3    defined by a structural code coverage metrics like MC/DC. The
     7 https://github.com/Z3Prover/z3/                               industrial-strength embedded test-vector generation toolBTC
     8 http://mathsat.fbk.eu/
     9 http://www.cprover.org/cbmc/                                    10 http://www.dspace.com/en/pub/home/products/sw/pcgs/targetli.cfm
EmbeddedTester R preprocesses and translates the original C                  at the number of solved U∞ instances – which are of special
code (which is annotated with coverage goals) into the inter-                interest in the context of dead-code detection. Here, SMI-
mediate SMI language as explained in Section II.                             iSAT3 proves for 831 instances that the code fragment of
   In our measurements, we used the version of SMI-CBMC                      interest is unreachable, which is an improvement of 30%
which is part of BTC EmbeddedTester R 3.4 (July 2014). SMI-                  compared to the 631 instances found by SMI-CBMC. Sim-
CBMC supports incremental BMC [18] and relies on CBMC                        ilarly, the number of solved U51 instances is higher in SMI-
4.811 as its backend. But due to technical reasons the current               iSAT3: 172 compared to 44. Furthermore, when looking at the
version of SMI-CBMC does not support incremental solving                     accumulated number of solved instances, SMI-iSAT3 solves
in combination with k-induction.                                             331 instances more than SMI-CBMC.
   Similar to SMI-CBMC, SMI-iSAT3 operates on SMI files                         Due to the prototypical Cygwin-based integration, SMI-
as well. It is a prototypical implementation which runs in                   iSAT3 has some sort of initial runtime-offset (as revealed by
a Cygwin environment and uses a bunch of wrapper scripts                     the diagram in Figure 1). Despite this penalty SMI-iSAT3
in order to (1) translate the given SMI file into the iSAT3                  outperforms SMI-CBMC in terms of runtime. E. g. when
input language, and to (2) validate every satisfying assignment              considering the instances which were solved within 3 seconds
found by iSAT3 by an SMI simulator. All experiments were                     then SMI-CBMC solved 5305 such instances, while SMI-
performed on an Intel Core i7-2600 with 3.4 GHz and 8 GB                     iSAT3 finished 6960.
RAM under Windows 7 (64 bit). We set a time limit of 60
seconds per benchmark. Setting a memory limit within the                                      IV. F UTURE CHALLENGES
Cygwin environment did not work properly. Therefore, we                        While our current method is already able to cover one
omitted such a limit.                                                        part of the goals of the transfer project, there are still open
                                                                             challenges to be solved.
                                                                      60s
           SMI-CBMC                                                          A. Counterexample Guided Abstraction Refinement
           SMI-iSAT3
                                                                                CEGAR [5] is found to be useful in model checking while
                                                                             verifying large problems since its abstraction maps complex
                                                                      40s
                                                                             system models with infinite states to simpler ones (e. g. finite
                                                                             Kripke structures); thus it circumvents the state-space explo-
                                                                      Time


                                                                             sion problem. In our proposed approach, CEGAR starts from
                                                                      20s    a simple initial abstract model representing the control flow
                                                                             of the original program which will be continuously refined
                                                                             based on model checking the abstraction and analysing the
                                                                             counterexamples generated by the model checker. In each
                                                                      0s     refinement step, either an erroneous counterexample will be
   0            2 100          4 200          6 300          8 400
                        Number of solved benchmarks                          eliminated by attaching – to the abstract model – sufficient
       Solver              S+U     SAT      U51       U∞      TO             predicates obtained from the stepwise interpolants as proposed
       SMI-CBMC            8099    7424      44       631     679            in lazy abstraction technique [19], or a real counterexample
       SMI-iSAT3           8430    7427     172       831     348            is found. This continuous refinement is feasible despite non-
                                                                             polynomial arithmetic as the floating-point numbers are from
Fig. 1. Comparison between SMI-CBMC and SMI-iSAT3 over a set of 8778
SMI benchmarks.                                                              a large but finite domain.
                                                                                Craig interpolation (CI) thus is our workhorse for abstrac-
   The benchmarks are encoded as BMC problems. For both                      tion refinement in the CEGAR loop. In conjunction with
solvers we allowed up to 51 BMC unwindings per benchmark.                    CEGAR, in software verification with lazy abstraction tech-
The results are shown in Figure 1. The diagram depicts the                   nique [19], stepwise interpolants are used to extract meaning-
number of solved benchmarks together with the run time                       ful predicates from infeasible error paths, where the resultant
needed per benchmark. In the table, the columns S+U, SAT,                    interpolants are used to refine the abstract model. This ensures
U51 and U∞ show the number of solved instances (SAT: sat-                    that the interpolants at the different control locations achieve
isfiable; U51: unsatisfiable until depth 51; U∞: unsatisfiable               the goal of providing a precision that eliminates the infeasible
for all depths proved with k-induction or Craig interpolation;               error path from further explorations.
S+U: SAT + U51 + U∞). The column TO shows the number                            CEGAR has been applied successfully in the context of
of aborted benchmarks due to a timeout.                                      programs [20], real time systems [21] and hybrid systems [22].
   In order to detect unreachability (e. g. dead-code), CBMC                 Moreover, CI-based CEGAR is standard in the domain of
uses k-induction, while iSAT3 uses Craig interpolation. Both                 software model-checking, but current approaches are confined
solvers perform equally well when counting the number of                     only to linear arithmetics or polynomials. However, what sets
solved satisfiable instances. The picture changes when looking               our method aside is that it applies Craig interpolation to
                                                                             learn concise reasons for a counterexample being spurious
  11 http://www.cprover.org/svn/cbmc/releases/cbmc-4.8-incremental/          despite its rich, non-polynomial arithmetic domain as we
use iSAT as a backend solver where the latter handles the             plication and Division). In the industrial context the support
non-polynomiality by integrating CDCL(T) with ICP in one              of more complex arithmetic operations such as exp, sin, cos,
framework as aforementioned.                                          sqrt is mandatory and should be included in the current solver.
                                                                      Implementing these operations in iSAT3 might also have an
B. Handling of Interrupts                                             advantage compared to bit-blasting solutions since in the latter
   As the transfer project aims at covering analysis of hand-         context all those operations have to be translated into one big
written embedded C code also, the ability to deal with the            propositional formula while iSAT3 can handle them on the
implicit transfer of control flow mediated by external interrupts     arithmetic level.
is of importance, as interrupt handlers are a traditional means          Alongside k-induction (CBMC) and Craig interpretation
of ensuring reactivity to asynchronous events. The specific           (iSAT3) there exist other techniques (e. g. IC3 [23], [24]) to
problem here is that in contrast to the control flow mediated by      prove the unreachability of code fragments. These may have
explicit control structures, interrupts can divert the control flow   some advantages over the currently used ones which may
at any time. Analyzing such code by means of symbolic path            result in a larger number of successful unreachability proofs
enumeration in SMT-based bounded model checking would be              – or different instances which may complement and improve
prohibitively expensive due to the enormous branching width           the results of the current techniques.
of the computation tree induced by unconstrained interrupts.             Symbolic Computation methods (such as Gröbner bases
To handle this problem, we are setting out to combine a               [25], virtual substitution [26], [27] or cylindrical algebraic
number of techniques, the two most prominent of which are             decomposition (CAD) [28]) might also be beneficial in the
described in the following.                                           context of floating-point reasoning. Especially for more com-
   Semantics-preserving restriction of interrupt points: As in-       plex arithmetic expressions these techniques might be able to
terrupt handlers in embedded software tend to share only a            (approximately) guide ICP during the search for a solution.
small set of global variables with the main program, extensive
assignment sinking or lifting is possible between statements                                   V. C ONCLUSION
in the main program and the interrupt routine. This implies              The AVACS transfer project T1 aims at a method for accu-
that formally possible interrupt points can be eliminated from        rate dead code detection in embedded C code using arithmetic
the analysis, as the execution sequences are confluent with           constraint solving. While in the target domain of reactive,
those obtained from somewhat earlier or later interrupts. The         control-oriented, and floating-point dominated embedded C
analysis can thus w. l. o. g. confine control flow to interrupts      programs several methods exist to deal with parts of the
happening at a limited subset of the factually possible interrupt     difficulties induced by this domain, there is none which can
points, e. g. just before the main program reads variables            cover all of them. Thus a combination of these techniques as
affected by the interrupt handler. This reduction reduces             a solution seems obvious.
branching in the execution tree and helps SMT-based analysis.            As a first step we integrated iSAT3 (with added support for
   Computation of summaries for interrupt routines: In order          floating-point arithmetic and Craig interpolation) into the tool
to accelerate analysis of the possible effect of interrupts on        chain of BTC Embedded Systems as an alternative backend
the main program’s control flow (and thus, code reachability),        solver. This allows us to perform code coverage analysis on
we will take advantage of summaries for interrupt routines            automatically generated C code of industrial size including
obtained using abstract interpretation and inter-procedural           floating-point operations. The experimental results show, that
analysis. Abstract interpretation here is based on exploiting         iSAT3 is competitive in this environment – it is even able to
iSAT’s floating-point and integer domains as abstract domains         prove more code fragments of interest as unreachable.
and iSAT’s corresponding contractors as abstract transformers            To tackle the remaining challenges of the transfer project
when analyzing straight-line code. Inter-procedural (or inter-        there is work in progress concerning CEGAR and interrupt
interrupt-handler) analysis is then achieved by mutually using        handling. Furthermore there are some ideas to improve the
abstract interpretation on the individual interrupt handlers to       current method.
compute safe approximations of the possible return values
based on (over-approximate) knowledge of the possible entry                                  ACKNOWLEDGMENT
values and on the main program to compute safe approxima-               This work has been supported by the German Research
tions of the possible entry values to interrupt handlers based on     Foundation (DFG) as part of the Transregional Collaborative
(over-approximate) knowledge of the possible return values.           Research Center “Automatic Verification and Analysis of
   The summaries thus obtained will later be exploited in             Complex Systems” (DFG, SFB/TR 14 AVACS).
the exact SMT-based analysis, where they can prune the
search space tremendously before unfolding detailed statement                                    R EFERENCES
sequences across interrupts.                                           [1] P. Cousot and R. Cousot, “Abstract Interpretation: A Unified
                                                                           Lattice Model for Static Analysis of Programs by Construction or
C. Solver Improvements                                                     Approximation of Fixpoints,” in Proceedings of the 4th ACM SIGACT-
                                                                           SIGPLAN Symposium on Principles of Programming Languages, ser.
   Currently iSAT3 only supports floating-point reasoning for              POPL ’77. New York, NY, USA: ACM, 1977, pp. 238–252. [Online].
the basic arithmetic operations (Addition, Subtraction, Multi-             Available: http://doi.acm.org/10.1145/512950.512973
 [2] ——, “Abstract interpretation and application to logic programs,”                    - Oct. 4, 2013. IEEE, 2013, pp. 7:1–7:10. [Online]. Available:
     The Journal of Logic Programming, vol. 13, no. 2, pp. 103–179,                      http://dx.doi.org/10.1109/EMSOFT.2013.6658585
     1992. [Online]. Available: http://www.sciencedirect.com/science/article/       [23] A. R. Bradley, “Sat-based model checking without unrolling,” in
     pii/0743106692900307                                                                Verification, Model Checking, and Abstract Interpretation - 12th
 [3] J. Julien Bertrane, P. Cousot, R. Cousot, J. Feret, L. Mauborgne,                   International Conference, VMCAI 2011, Austin, TX, USA, January
     A. Miné, and X. Rival, “Static Analysis by Abstract Interpretation                 23-25, 2011. Proceedings, ser. Lecture Notes in Computer Science,
     of Embedded Critical Software,” SIGSOFT Softw. Eng. Notes,                          R. Jhala and D. A. Schmidt, Eds., vol. 6538. Springer, 2011, pp. 70–87.
     vol. 36, no. 1, pp. 1–8, Jan. 2011. [Online]. Available: http:                      [Online]. Available: http://dx.doi.org/10.1007/978-3-642-18275-4 7
     //doi.acm.org/10.1145/1921532.1921553                                          [24] M. Brain, V. D’Silva, A. Griggio, L. Haller, and D. Kroening,
 [4] P. Cousot, Formal Verification by Abstract Interpretation.           Berlin,        Interpolation-Based Verification of Floating-Point Programs with
     Heidelberg: Springer Berlin Heidelberg, 2012, pp. 3–7.                              Abstract CDCL. Berlin, Heidelberg: Springer Berlin Heidelberg,
 [5] E. M. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith,                             2013, pp. 412–432. [Online]. Available: http://dx.doi.org/10.1007/
     “Counterexample-guided abstraction refinement,” in Computer Aided                   978-3-642-38856-9 22
     Verification, 12th International Conference, CAV 2000, Chicago,                [25] T. Becker and V. Weispfenning, Gröbner bases: a computational ap-
     IL, USA, July 15-19, 2000, Proceedings, ser. Lecture Notes                          proach to commutative algebra. London, UK: Springer-Verlag, 1993.
     in Computer Science, E. A. Emerson and A. P. Sistla, Eds.,                     [26] V. Weispfenning, “The complexity of linear problems in fields,” J.
     vol. 1855. Springer, 2000, pp. 154–169. [Online]. Available:                        Symb. Comput., vol. 5, no. 1-2, pp. 3–27, Feb. 1988. [Online].
     http://dx.doi.org/10.1007/10722167 15                                               Available: http://dx.doi.org/10.1016/S0747-7171(88)80003-8
 [6] E. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith, “Counterexample-          [27] ——, “Quantifier elimination for real algebra — the quadratic case
     guided Abstraction Refinement for Symbolic Model Checking,” J.                      and beyond,” Applicable Algebra in Engineering, Communication and
     ACM, vol. 50, no. 5, pp. 752–794, Sep. 2003. [Online]. Available:                   Computing, vol. 8, no. 2, pp. 85–101, 1997. [Online]. Available:
     http://doi.acm.org/10.1145/876638.876643                                            http://dx.doi.org/10.1007/s002000050055
 [7] R. Alur, T. Dang, and F. Ivancic, “Counter-Example Guided Predicate            [28] B. F. Caviness and J. R. Johnson, Quantifier Elimination and Cylindrical
     Abstraction for Hybrid Systems,” in Tools and Algorithms for the                    Algebraic Decomposition. Vienna: Springer-Verlag, 1998.
     Construction and Analysis of Systems, ser. LNCS, H. Garavel and
     J. Hatcliff, Eds., vol. 2619. Springer, 2003.
 [8] M. Fränzle, C. Herde, T. Teige, S. Ratschan, and T. Schubert, “Efficient
     solving of large non-linear arithmetic constraint systems with complex
     boolean structure,” Journal on Satisfiability, Boolean Modeling, and
     Computation, vol. 1, no. 3-4, pp. 209–236, 2007.
 [9] J. G. Cleary, “Logical arithmetic,” Future Computing Systems, vol. 2,
     no. 2, pp. 125–149, 1987.
[10] H. Kelly J., V. Dan S., C. John J., and R. Leanna K., “A practical tutorial
     on modified condition/decision coverage,” Tech. Rep., 2001.
[11] C. Herde, “Efficient solving of large arithmetic constraint systems with
     complex boolean structure: proof engines for the analysis of hybrid
     discrete-continuous systems,” Ph.D. dissertation, 2011.
[12] F. Benhamou and L. Granvilliers, “Continuous and Interval Constraints,”
     in Handbook of Constraint Programming, ser. Foundations of Artificial
     Intelligence, 2006, pp. 571–603.
[13] M. Brain, V. D’Silva, L. Haller, A. Griggio, and D. Kroening, “An
     abstract interpretation of DPLL(T),” in VMCAI 2013.
[14] K. Scheibler, S. Kupferschmid, and B. Becker, “Recent improvements
     in the SMT solver iSAT,” in MBMV, 2013, pp. 231–241.
[15] K. Scheibler and B. Becker, “Using interval constraint propagation for
     pseudo-boolean constraint solving,” in FMCAD 2014.
[16] K. Scheibler, F. Neubauer, A. Mahdi, M. Fränzle, T. Teige, T. Bi-
     enmüller, D. Fehrer, and B. Becker, “Accurate ICP-based Floating-Point
     Reasoning,” in Formal Methods in Computer-Aided Design, FMCAD
     2016. IEEE, 2016.
[17] M. Brain, V. D’Silva, A. Griggio, L. Haller, and D. Kroening, “Deciding
     floating-point logic with abstract conflict driven clause learning,” Formal
     Methods in System Design, vol. 45, no. 2, pp. 213–245, 2014.
[18] P. Schrammel, D. Kroening, M. Brain, R. Martins, T. Teige, and
     T. Bienmüller, “Successful use of incremental BMC in the automotive
     industry,” in FMICS 2015.
[19] T. A. Henzinger, R. Jhala, R. Majumdar, and G. Sutre, “Lazy
     abstraction,” in Conference Record of POPL 2002: The 29th SIGPLAN-
     SIGACT Symposium on Principles of Programming Languages,
     Portland, OR, USA, January 16-18, 2002, J. Launchbury and
     J. C. Mitchell, Eds. ACM, 2002, pp. 58–70. [Online]. Available:
     http://doi.acm.org/10.1145/503272.503279
[20] D. Beyer, T. A. Henzinger, R. Jhala, and R. Majumdar, “The software
     model checker blast,” STTT, vol. 9, no. 5-6, pp. 505–525, 2007.
     [Online]. Available: http://dx.doi.org/10.1007/s10009-007-0044-z
[21] T. Nagaoka, K. Okano, and S. Kusumoto, “An abstraction refinement
     technique for timed automata based on counterexample-guided
     abstraction refinement loop,” IEICE Transactions, vol. 93-D, no. 5,
     pp. 994–1005, 2010. [Online]. Available: http://search.ieice.org/bin/
     summary.php?id=e93-d 5 994
[22] P. S. Duggirala and A. Tiwari, “Safety verification for linear
     systems,” in Proceedings of the International Conference on Embedded
     Software, EMSOFT 2013, Montreal, QC, Canada, September 29