<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Employing Run-time Static Analysis to Improve Concolic Execution</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Position Paper</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Coen De Roover Software Languages Lab Vrije Universiteit Brussel Brussels</institution>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Maarten Vandercammen Software Languages Lab Vrije Universiteit Brussel Brussels</institution>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>-Dynamic symbolic execution, or concolic execution, is a program testing technique that systematically executes a program with the aim of exploring all feasible program paths, and locating and reporting all errors encountered in these paths. However, as the complexity of the program grows, the number of program paths explodes, making it infeasible for concolic testers to explore all of them. To reduce the number of paths to explore, several concolic testing tools have recently started employing static analysis to prune paths guaranteed by the static analysis to be safe. The concolic tester must then only focus on those paths that might contain an error, as reported by the analysis. However, due to imprecisions in the analysis' result, the reported errors may just be false positives, and it is up to the tester to verify whether a reported alarm is an actual error or merely a false positive. In this position paper, we propose to increase the precision of these analyses by not only performing an initial static analysis before starting concolic testing of the program, but also by launching incremental static analyses over the program at run time, and incorporating into the analyses run-time information observed by the tester. The increased precision that results from incorporating such run-time information should enable further pruning of the program paths that must be explored by the concolic tester. Index Terms-Concolic Testing, Static Analysis, Blended Analysis</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        In recent years, there has been an interest in blending static
and dynamic analyses to combine the best of both worlds. The
results of a sound static analysis over-approximate all possible
program executions, at the cost of precision. The results of
a dynamic analysis are precise, yet only valid for a subset
of all program executions. In this position paper, we propose
a novel mechanism to blend an abstract interpretation-based
static analysis [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] with a dynamic analysis in the form of
a concolic tester [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The proposed blended technique
employs an initial static analysis, performed before starting
concolic execution of the program, and thereafter launches
static analyses at run time over the program. The run-time
analyses incorporate observed run-time information to increase
the precision of the analysis with respect to the program
execution from which the analysis was launched.
      </p>
      <p>The remainder of this paper is organized as follows.
Section II introduces the concept of run-time static analysis.
Section III gives an introduction to concolic execution and
explains how a concolic tester may benefit from (run-time)
static analysis. Section IV briefly describes an implementation
of a prototype of this technique. Section V concludes.</p>
    </sec>
    <sec id="sec-2">
      <title>II. RUN-TIME STATIC ANALYSIS</title>
      <p>
        Our proposed technique focuses on the abstracting abstract
machines (AAM) technique, developed by Van Horn and
Might [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], where an abstract interpreter is systematically
derived from a concrete interpreter represented as a ‘concrete’
abstract machine written in a small-step style. AAM
systematically threads all unbounded components of the concrete
machine through the machine’s store and limits the amount of
store addresses that can be allocated. As a result, the resulting
‘abstract’ abstract machine, or static analysis, explores a finite
number of program states during its evaluation of a program,
thus ensuring that analysis terminates.
      </p>
      <p>Our run-time static analysis technique builds on the close
relationship between the interpreter and its derived static
analysis. All run-time information on the program’s execution
is encoded in the concrete machine’s program state. This
information is made available to the analysis by abstracting the
program state in a similar manner as the abstraction of the
machine itself. In general, this technique therefore first involves
executing a program and observing the interpreter’s concrete
program state. When an external client decides to commence
static analysis, the concrete state is abstracted, and the abstract
interpreter is initialized with this abstract state and runs until
it reaches a fixpoint. Figure 1 illustrates how the run-time
information enables refining static analysis. As the predicate
depicted in Figure 1a involves a variable with a random value,
the flow graph (1b) computed by a traditional static analysis
must contain both branches. However, should our run-time
static analysis technique be launched after binding the variable
r to its evaluated value, and before evaluating the predicate’s
value, the run-time analysis has the variable’s value available,
and can produce a flow graph (1c) that precisely predicts the
(let ((r (random 10)))
(if (&gt; r 5))
(then)
(else)))
(a)
(let ((r (random 10))) 
(if (&gt; r 5))
(then)
(else)))
(if (&gt; r 5))
(then)
(else))
(then)</p>
      <p>(else)
(b)
(if (&gt; r 5))
(then)
(else))
(then)
(c)
branch to be be taken. Such a run-time analysis only soundly
predicts the future execution given the current execution of the
program. Should the program be restarted, a different value for
r may be generated, thus invalidating the result of the run-time
analysis.</p>
    </sec>
    <sec id="sec-3">
      <title>III. CONCOLIC TESTING</title>
      <p>
        A concolic tester iteratively explores all feasible paths in
a program by manipulating the values of input parameters to
the program, such as program arguments, random numbers,
values that are read from files etc. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>In the first iteration, the tester assigns a random value
to these input parameters while simultaneously collecting all
conditional predicates it encounters into a path constraint that
symbolically represents the conditions that must be true in
order for the program to execute that particular path. After
completing the first run of the program, a part of the path
constraint is negated and the resulting path constraint is given
to a SMT solver. If the solver can find some value for the input
parameters so that this new path constraint is satisfied, concolic
testing continues by restarting the program and assigning these
newly computed values to the input parameters. This process
continues until all feasible program paths have been explored,
or until the tester exceeds some given time constraint.</p>
      <p>We illustrate the working of a concolic tester in more detail
via the example Scheme program depicted in Figure 2. In
this program, the variables a and b are considered input
parameters. Hence, they are symbolically represented as the
input variables a0 and b0. Suppose now that in the first
iteration, the concolic tester randomly assigns 67 to a and
31 to b. These values cause the concolic tester to execute the
program statement on line 8, at which point the tester reports
the error encountered there. Simultaneously with this concrete
execution, the tester also collects the symbolic representation
of the conditional predicates that were encountered at line 5
and 6, into a so-called path constraint, i.e., a0 &gt; 50 ^ b0 &lt;=
a0+b0. After completing this run, the tester attempts to explore
another path, such as the path leading to the program statement
at line 7. To do this, the solver negates the last conditional
predicate in the path constraint, resulting in the constraint
a0 &gt; 50 ^ b0 &gt; a0 + b0 and feeds this resulting path to a SMT
solver, which subsequently attempts to assign values to a0 and
b0 so that the resulting path constraint is true. Note that, as
a0 is guaranteed to be a positive number, the second conjunct
b0 &gt; a0 + b0 can never be true. Hence no values for a0 and b0
can be found so that the constraint is satisfied. The concolic
tester thus concludes that the constraint is unsatisfiable and
that the program statement at line 7 is unreachable. At this
point, the concolic tester backtracks further into the execution
and tries to negate the first conjunct in the path constraint,
resulting in the new constraint a0 &lt;= 50, and attempts to
solve this constraint, for example by assigning the value 0 to
a0. The concolic tester re-executes the program and assigns
the value 0 to a. It completes the program by reaching the
expression at line 9. As no new branches were encountered
by the tester during this last run, the tester concludes that it
has explored all feasible program paths and concolic testing
terminates. The three path constraints that were constructed
can be collected in a symbolic execution tree that represents
all possible executions of the program, as shown in Figure 3.
A. Combining Static Analysis and Concolic Testing</p>
      <p>
        The number of program paths is exponential in the number
of conditional branches of the program. As it is therefore
often infeasible for concolic testers to explore more than a
fraction of all possible program paths in a real-world
application, testers employ sophisticated search strategies to select
a program path that can be explored and that is most likely
to result in a non-trivial error [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Nevertheless, in practical
applications, concolic testers are not guaranteed to identify
all errors located in a program. Recently, some concolic
testers [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] have started employing static analysis
to greatly prune the number of program paths that must be
explored by the analysis. As a sound static analysis safely
proves the absence of errors, any program path not reported
to contain an error can be avoided by the tester. However,
errors that were reported by the static analysis may actually
(f)
(error)
(h)
???
be false positives that are the result of imprecision in the static
analysis. The tester therefore exercises these paths in order to
verify whether the reported error is an actual error. Note that
the tester relies on the static analysis to be sound: as the tester
only explores program paths highlighted by the analysis, a
false negative produced by the analysis would not be caught
by the tester.
      </p>
      <p>Consider the program listed in Figure 4 and its abstract state
graph, produced by performing a simple initial static analysis
over this program before starting concolic testing, in Figure 5.
It can be seen from the state graph that the analysis considers
the path where a is not larger than 5 to be safe, the path where
both a and b are larger than 5 to be unsafe, and the path
where a is larger than 5 but b is not to be potentially unsafe.
The conclusion for this last path is due to imprecision that
arose in the static analysis: if we employ e.g., a very
simplebut-fast constant propagation analysis that does not precisely
track reassignments to variables, such as the reassignment to
the function h at line 7, then the analysis cannot conclude
that calling h at line 10 is safe. As a result, any concolic
tester relying on this analysis can only prune the first path
and must still explore the other two paths.</p>
      <p>B. Combining Run-time Static Analysis and Concolic Testing</p>
      <p>The safe path where h is called can be pruned by employing
a run-time static analysis, in combination with an initial static
analysis performed before starting testing of the program. If
the tester starts exploring the error path that results in the call
to f at line 9, the tester could launch a run-time static analysis
at line 7, after the reassignment to h but before the paths
to the calls of f and h are split. At this point, the run-time
analysis can incorporate the knowledge that h equals the safe
function g, thereby avoiding the imprecision caused by the
reassignment of h at line 7. In this case, the abstract state graph
produced by the run-time analysis, as depicted in Figure 6, is
precise enough to determine that calling h will not result in
an error, hence this path should not be explored by the tester.
In general, a run-time analysis could be launched over the
program when the tester has reached a common ancestor node
to two or more paths that are potentially unsafe, in the hope
that the run-time information available at that point enables
the static analysis to be precise enough to determine whether
or not any of the potentially unsafe paths can be pruned.</p>
    </sec>
    <sec id="sec-4">
      <title>IV. IMPLEMENTATION AND FUTURE WORK</title>
      <p>
        We have implemented a prototype of this technique1 in the
Scala-AM framework [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] to test Scheme programs.We
are currently in the process of extending and improving the
prototype and evaluating its effectiveness. To overcome the
performance overhead incurred by performing (potentially
multiple) static analyses over the program at run time, we
aim to make these analyses incremental with respect to each
other, so that one analysis can reuse part of the work computed
by another. Furthermore, we expect this technique to be most
effective when it employs simple-but-fast analyses that have a
low execution time but a high degree of imprecision, as these
imprecisions can hopefully be resolved by incorporating
runtime information.
      </p>
    </sec>
    <sec id="sec-5">
      <title>V. CONCLUSION</title>
      <p>We have proposed a novel technique for improving precision
of a static analysis at run time, using run-time information on
the program’s execution. We propose to couple this technique
to a concolic tester, to reduce the number of program paths
that the tester must explore.</p>
    </sec>
    <sec id="sec-6">
      <title>1Available</title>
      <p>prototype
at
https://github.com/mvdcamme/scala-am/tree/concolic</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Baldoni</surname>
          </string-name>
          , Emilio Coppa,
          <string-name>
            <surname>Daniele Cono D'Elia</surname>
            ,
            <given-names>Camil</given-names>
          </string-name>
          <string-name>
            <surname>Demetrescu</surname>
            , and
            <given-names>Irene</given-names>
          </string-name>
          <string-name>
            <surname>Finocchi</surname>
          </string-name>
          .
          <article-title>A survey of symbolic execution techniques</article-title>
          .
          <source>arXiv preprint arXiv:1610.00502</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Burnim</surname>
          </string-name>
          and
          <string-name>
            <given-names>Koushik</given-names>
            <surname>Sen</surname>
          </string-name>
          .
          <article-title>Heuristics for scalable dynamic test generation</article-title>
          .
          <source>In Automated Software Engineering</source>
          ,
          <year>2008</year>
          .
          <source>ASE</source>
          <year>2008</year>
          . 23rd IEEE/ACM International Conference on, pages
          <fpage>443</fpage>
          -
          <lpage>446</lpage>
          . IEEE,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Ting</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Xiao-song Zhang, Shi-ze Guo, Hong-yuan Li, and Yue Wu. State of the art: Dynamic symbolic execution for automated test generation</article-title>
          .
          <source>Future Generation Computer Systems</source>
          ,
          <volume>29</volume>
          (
          <issue>7</issue>
          ):
          <fpage>1758</fpage>
          -
          <lpage>1773</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Maria</given-names>
            <surname>Christakis</surname>
          </string-name>
          , Peter Mu¨ller, and
          <article-title>Valentin Wu¨stholz. Guiding dynamic symbolic execution toward unverified program executions</article-title>
          .
          <source>In Proceedings of the 38th International Conference on Software Engineering</source>
          , ICSE '
          <volume>16</volume>
          , pages
          <fpage>144</fpage>
          -
          <lpage>155</lpage>
          , New York, NY, USA,
          <year>2016</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Cousot</surname>
          </string-name>
          and
          <string-name>
            <given-names>Radhia</given-names>
            <surname>Cousot</surname>
          </string-name>
          .
          <article-title>Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints</article-title>
          .
          <source>In Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, POPL '77</source>
          ,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Csallner</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yannis</given-names>
            <surname>Smaragdakis</surname>
          </string-name>
          .
          <article-title>Check 'n' crash: Combining static checking and testing</article-title>
          .
          <source>In Proceedings of the 27th International Conference on Software Engineering</source>
          , ICSE '
          <volume>05</volume>
          , pages
          <fpage>422</fpage>
          -
          <lpage>431</lpage>
          , New York, NY, USA,
          <year>2005</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Kostas</given-names>
            <surname>Ferles</surname>
          </string-name>
          , Valentin Wu¨ stholz, Maria Christakis, and
          <string-name>
            <given-names>Isil</given-names>
            <surname>Dillig</surname>
          </string-name>
          .
          <article-title>Failure-directed program trimming</article-title>
          .
          <source>In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering</source>
          , ESEC/FSE 2017, pages
          <fpage>174</fpage>
          -
          <lpage>185</lpage>
          , New York, NY, USA,
          <year>2017</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Xi</given-names>
            <surname>Ge</surname>
          </string-name>
          , Kunal Taneja, Tao Xie, and
          <string-name>
            <given-names>Nikolai</given-names>
            <surname>Tillmann</surname>
          </string-name>
          . Dyta:
          <article-title>Dynamic symbolic execution guided with static verification results</article-title>
          .
          <source>In Proceedings of the 33rd International Conference on Software Engineering</source>
          , ICSE '
          <volume>11</volume>
          , pages
          <fpage>992</fpage>
          -
          <lpage>994</lpage>
          , New York, NY, USA,
          <year>2011</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Patrice</given-names>
            <surname>Godefroid</surname>
          </string-name>
          , Nils Klarlund, and
          <string-name>
            <given-names>Koushik</given-names>
            <surname>Sen</surname>
          </string-name>
          . Dart:
          <article-title>Directed automated random testing</article-title>
          .
          <source>SIGPLAN Not</source>
          .,
          <volume>40</volume>
          (
          <issue>6</issue>
          ):
          <fpage>213</fpage>
          -
          <lpage>223</lpage>
          ,
          <year>June 2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Koushik</surname>
            <given-names>Sen</given-names>
          </string-name>
          , Darko Marinov, and
          <string-name>
            <given-names>Gul</given-names>
            <surname>Agha</surname>
          </string-name>
          .
          <article-title>Cute: A concolic unit testing engine for c</article-title>
          .
          <source>In Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ESEC/FSE13</source>
          , pages
          <fpage>263</fpage>
          -
          <lpage>272</lpage>
          , New York, NY, USA,
          <year>2005</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11] Quentin Stie´venart, Jens Nicolay, Wolfgang De Meuter, and Coen De Roover.
          <article-title>Building a modular static analysis framework in scala (tool paper)</article-title>
          .
          <source>In Proceedings of the 2016 7th ACM SIGPLAN Symposium on Scala, SCALA</source>
          <year>2016</year>
          , pages
          <fpage>105</fpage>
          -
          <lpage>109</lpage>
          , New York, NY, USA,
          <year>2016</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12] Quentin Stie´venart, Maarten Vandercammen, Wolfgang De Meuter, and Coen De Roover.
          <article-title>Scala-am: A modular static analysis framework</article-title>
          .
          <source>In Source Code Analysis and Manipulation (SCAM)</source>
          ,
          <year>2016</year>
          IEEE 16th International Working Conference on, pages
          <fpage>85</fpage>
          -
          <lpage>90</lpage>
          . IEEE,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>David Van</surname>
            Horn and
            <given-names>Matthew</given-names>
          </string-name>
          <string-name>
            <surname>Might</surname>
          </string-name>
          .
          <article-title>Abstracting abstract machines</article-title>
          .
          <source>SIGPLAN Not</source>
          .,
          <volume>45</volume>
          (
          <issue>9</issue>
          ):
          <fpage>51</fpage>
          -
          <lpage>62</lpage>
          ,
          <year>September 2010</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>