<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Design of a Modified Concolic Testing Algorithm with Smaller Constraints</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yavuz Koroglu</string-name>
          <email>yavuz.koroglu@boun.edu.tr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alper Sen</string-name>
          <email>alper.sen@boun.edu.tr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Engineering, Bogazici University</institution>
          ,
          <country country="TR">Turkey</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Concolic testing is a well-known unit test generation technique. However, bottlenecks such as constraint solving prevent concolic testers from being used in large projects. We propose a modification to a standard concolic tester. Our modification makes more but smaller queries to the constraint solver, i.e., it ignores some path conditions. We show that it is possible to reach the same branch coverage as the standard concolic tester while decreasing the burden on the constraint solver. We support our claims by testing several C programs with our method. Experimental results show that our modification improves the runtime performance of the standard concolic tester in half of the experiments and results in more than 5x speedup when the unit under test has many infeasible paths.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Software testing is an important but resource-consuming part of the software development process.
According to a study in 2005, 79% of Microsoft developers write unit tests [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Therefore, automated
unit testing is an important area of research. Concolic testing is an automated unit testing method first
introduced in 2005 [
        <xref ref-type="bibr" rid="ref13 ref8">8, 13</xref>
        ]. An empirical study on concolic testing shows that scalability is an issue [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Therefore, making concolic testing more scalable is an open research area.
      </p>
      <p>
        We propose a modification to the input generation method of concolic testing, called Incremental
Partial Path Constraints (IPPC). A standard concolic tester instruments the unit under test (UUT) to
collect operations that either affect or depend on symbolic variables during a concrete execution. This
sequence of operations is called an execution trace. A concolic tester symbolically re-executes the
execution trace to generate path conditions that depend solely on the symbolic variables. Then, the concolic
tester negates the last path condition that has not been negated before. At the final step, the constraint solver is
called only once, with all path conditions, to generate a new input vector. As noted in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], solving more
but smaller constraints instead of one large constraint (i.e., a constraint that contains a large number of path
conditions) is more efficient. Towards this goal, we modified the input generation step of a standard
concolic tester such that we call the constraint solver multiple times, but with only a few path conditions
at each invocation. Our experiments show that we can generate inputs that fall into the same
equivalence class (i.e., inputs that force the program into generating the same execution trace) using fewer path
conditions than a standard concolic tester. Although we pay the cost of more constraint solver calls on
average, we show that our modification results in more than 5x speedup when the UUT has many infeasible
paths.
      </p>
      <p>
        This paper makes two contributions. First, we propose an input generation strategy based on partial
path constraints, called IPPC, and incorporate our strategy into a standard concolic testing algorithm.
Partial path constraints contain a subset of the path conditions of an execution trace. Second, we implement
our modification using the CREST tool [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which is a concolic testing framework for C. Our results
indicate that IPPC achieves the same branch coverage as the standard concolic tester while decreasing
the burden on the constraint solver and leading to speedups.
      </p>
      <p>We illustrate our approach on a small example in the next section. We discuss related work in
Section 3, describe our approach in detail in Section 4, present our experimental results in Section 5,
and conclude in Section 7.</p>
    </sec>
    <sec id="sec-2">
      <title>Overview</title>
      <p>In this section, we run both the standard concolic tester and IPPC on a small program unit which checks
whether three numbers are sorted, denoted by isSorted(a, b, c). The control-flow graph of the unit under
test (UUT) is given in Figure 1.</p>
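      <p>Figure 1 itself is not reproduced here. The following C sketch is a reconstruction of a unit consistent with the path conditions in the example trace (a ≤ b, then a ≤ c, then b ≤ c); the paper's actual isSorted may differ. The redundant a ≤ c check is what creates infeasible paths such as a ≤ b, b ≤ c, a &gt; c.</p>

```c
/* Plausible sketch of the isSorted(a, b, c) unit, reconstructed from
   the path conditions in the example trace (a <= b, a <= c, b <= c).
   The a <= c check is logically redundant, which is exactly what
   introduces infeasible paths into the CFG. */
int isSorted(int a, int b, int c) {
    if (a <= b) {
        if (a <= c) {
            if (b <= c)
                return 1;   /* a <= b <= c: sorted */
        }
    }
    return 0;               /* not sorted */
}
```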
      <p>Figure 2 shows an execution of the standard concolic testing algorithm on isSorted(a, b, c). A
standard concolic tester starts from an arbitrary input ([0; 0; 0] in this case) and executes the UUT using the
input values. The concolic tester collects the execution trace of this concrete execution and generates the
path constraint p0 = [a ≤ b; a ≤ c; b ≤ c]. The last condition of p0 is negated and we get p1 = [a ≤ b; a ≤ c; b &gt; c]. We invoke the constraint solver with p1, denoted as CS(3), where 3 represents the number
of path conditions in p1. We get [0; 1; 0] as the new input from the constraint solver. We execute the
program with the new inputs and collect the path constraint p2 = [a ≤ b; a ≤ c; b &gt; c]. Now, we negate the
second condition of p2 and remove all conditions after the second condition, obtaining p3 = [a ≤ b; a &gt; c].
We explain the reasons to exclude path conditions that come after the negated condition in Section 4.2.
We get [0; 1; 1] by invoking CS(2). Concrete execution of the program results in p4 = [a ≤ b; a &gt; c].
Then, we negate the only remaining path condition and get p5 = [a &gt; b] after removing the last condition.
CS(1) generates [2; 1; 1]. The concolic tester stops generating test inputs after executing the program,
since the resulting path constraint p6 = [a &gt; b] has no path condition that has not been negated before. At the
end, we have invoked the constraint solver three times with an average path constraint length of two.</p>
      <p>Figure 3 illustrates the execution of our IPPC algorithm on isSorted(a, b, c). In this case, each time
we generate a new path constraint, we do not immediately call the constraint solver. We call the
constraint p obtained by executing the program a full path constraint. We generate a subset of the full
path constraint, f0, which we call a partial path constraint. The partial path constraint f0 is initialized as
an array which contains only the negated path condition. The partial path constraint generation phase is
called overapproximation. We call the constraint solver with f0. In case the constraint solver does not
generate an input which satisfies p, we add the remaining path conditions of p to f0 and generate larger
path constraints f1, f2, etc., until the constraint solver generates an input that satisfies p. For our
example, we made only three calls to the constraint solver with an average constraint length of one.</p>
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>
        The idea of incremental constraint solving for concolic testing is as old as concolic testing itself; it was
suggested along with CUTE, one of the first concolic testing tools [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. However, their incremental
solving idea should not be confused with the one we present in this paper. In CUTE, since only
one path condition is negated, the variables which are irrelevant to that path condition are kept fixed.
Therefore, CUTE solves only for the variables relevant to the negated path condition, which decreases
the burden on the constraint solver [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In our approach, CREST still performs the incremental solving
optimization of CUTE. On top of that, we also produce path constraints with fewer path conditions than
a standard concolic tester.
      </p>
      <p>
        Solving only a subset of path conditions instead of the whole path constraint is an approach recently
used in finding integer overflows [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. The motivation is supported by the observation that many of the path
conditions of an execution trace are irrelevant with respect to integer overflows. In our work, we make the more
general assumption that some of the path conditions may be irrelevant to the execution trace itself.
Our experimental results support this assumption.
      </p>
      <p>
        All concolic testers require a constraint solver. Z3 and Yices are commonly used constraint solvers [
        <xref ref-type="bibr" rid="ref3 ref6">3, 6</xref>
        ]. Details on the complexity of constraint solving can be found in [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Standard constraint solvers
are not able to solve all types of constraints [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. There are constraint solvers which solve nonlinear
constraints [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and constraint solvers which solve constraints involving high-precision rational numbers
[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Yun et al. modified CREST to use the Z3 constraint solver in order to support nonlinear
arithmetic [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. We could easily implement the IPPC modification on top of CREST-z3, although plain CREST is
sufficient for our experiments.
      </p>
      <p>
        Concolic testing is not restricted to sequential programs. There are concolic testers which test
concurrent software [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Therefore, IPPC can also be used to test concurrent units.
      </p>
      <p>
        Many concolic testers (e.g., CUTE) use bounded depth-first search (bDFS) instead of bounding
the number of iterations as in IPPC and CREST [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ]. bDFS can be viewed as an attempt to keep
the number of path conditions limited. We believe that our approach makes bounding the number of
iterations preferable to bounding the depth of the search as in bDFS, since experiments show that
IPPC already reduces the size of path constraints significantly.
      </p>
      <p>
        Random branch selection (RND) is an approach introduced in CREST [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In random branch
selection, one negates each path condition of a path constraint with probability 0.5. A variant of this strategy
negates only one path condition at random and calls the constraint solver with that path condition.
      </p>
      <p>
        Control-flow directed search (CFG) is a search strategy which guides the search using the static
structure of the UUT. CFG assigns weights to the edges of the control-flow graph of the UUT and calculates
distances to each unvisited branch. Then, CFG solves a path constraint which leads to the unvisited
branch with the least distance to the current execution trace [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We compare our approach with both
CFG and RND.
      </p>
      <p>
        A recent study on CREST proposes a new search strategy, called DYNASTY, which uses the control-flow
graph of the UUT to guide the search as in the CFG technique [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. They modify CREST as in our work
and increase branch coverage with fewer iterations. Their approach is based on avoiding infeasible
paths, whereas our approach decreases the penalty of hitting an infeasible path by using partial path
constraints.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Incremental Partial Path Constraint (IPPC)</title>
      <sec id="sec-4-1">
        <title>Background</title>
        <p>In this section, we describe IPPC in detail. We have previously stated that a concolic tester generates path conditions from an execution trace of the
UUT. We define the following using this fact:</p>
        <p>Definition 1. A path condition is a function from the symbolic variables to {T, F}.</p>
        <p>Definition 2. An input vector is an assignment to all symbolic variables of UUT.</p>
        <p>Definition 3. A full path constraint is an array whose elements form a set of path conditions.</p>
        <p>Definition 4. Any array whose elements form a subset of the elements of the full path constraint is
called a partial path constraint.</p>
        <p>Definition 5. An input vector is said to satisfy a path constraint if and only if the input vector makes
all conditions of the path constraint T.</p>
        <p>The conjunction of all conditions in a partial path constraint is an overapproximation of the conjunction of
all conditions in the full path constraint. Therefore, the process of generating partial path constraints
from full path constraints is called overapproximation.</p>
        <p>Definition 6. An algorithm which takes a path constraint and returns an input vector which satisfies that
constraint is called a constraint solver. If there exists no such input vector, it returns null.</p>
        <p>
          We used the Yices constraint solver, which is the default solver in CREST [
          <xref ref-type="bibr" rid="ref1 ref6">1, 6</xref>
          ].
        </p>
        <p>4.2 Algorithm Design of IPPC</p>
        <p>We implemented IPPC on top of CREST's standard concolic testing algorithm (denoted by DFS), shown
in Algorithm 1. Initially, we set nIterations to 0, testInput to a random input, and testInputs to the empty set.
We can see from line 1 that the algorithm is bounded by the maximum number of iterations. At line 2,
we check the input vector, because the constraint solver is assumed to return null if an input vector cannot be
generated, i.e., the given path constraint is infeasible. We increment the iteration count only if the path constraint is
satisfiable. At line 4, we execute the UUT with the input and save the full path constraint of the execution trace
as p. Lines 8 and 9 ensure that the algorithm performs a depth-first search (DFS). We always negate the
last unvisited condition of a path constraint. If there exists no path condition that has not been negated before,
we stop, because all paths have been executed. At line 11, constraintSolver takes a path constraint and returns a
satisfying input vector. We do not terminate the solver; we use it in its built-in incremental mode. Notice
that we exclude the path conditions which come after the negated path condition. Any input vector
which satisfies the path conditions up to the negated path condition must generate the same execution trace
up to the occurrence of the negated path condition. The negated path condition will force the program into a
different execution trace. So, path conditions coming after the negated condition are not part of the new
execution trace and should therefore be removed. If these unnecessary path conditions are not excluded,
they may cause false path constraint infeasibilities. For instance, negating the second condition of
[a ≤ b; a ≤ c; b ≤ c] while keeping the third yields [a ≤ b; a &gt; c; b ≤ c], which is infeasible even though
the branch a &gt; c is reachable. False infeasibilities force the concolic tester not to
generate any input vector for the path constraint and therefore cause a decrease in test coverage.</p>
        <p>input : uut (unit under test); maxIterations; testInput; testInputs; nIterations
output: testInputs
...
for i ← sizeOf(p) − 1 down to 0 do
  if !isNegatedBefore(p[i]) then
    setNegatedBefore(p[i]);
    p[i] ← ¬p[i];
    testInput ← constraintSolver({p[0], ..., p[i]});
    return crest(uut, maxIterations, testInput, testInputs, nIterations);
  end
end</p>
        <p>Algorithm 1: Concolic Testing Algorithm Implementing DFS with Maximum Iterations (CREST)</p>
        <p>IPPC is exactly the same as Algorithm 1, except that we replace line 11 with testInput ← IPPC_Solve
(uut, {p[0], ..., p[i]}, {p[i]}). We describe IPPC_Solve in Algorithm 2. We keep the solver running
in incremental mode.</p>
        <p>Line 1 of Algorithm 2 invokes the constraint solver with f. If the constraint solver is unable to
generate an input vector, we conclude that the full path constraint is also infeasible and return null.
We prove the infeasibility of a full path constraint given the infeasibility of a partial path constraint in
Theorem 1. The for loop starting at line 5 checks whether the input vector satisfies the full path constraint.
If input satisfies p, we can return input. If not, we learn the unsatisfied path condition by adding it
to f and generate a new input vector. This process may continue until f = p in the worst case,
in which either an input can be generated or p is shown infeasible, given that the constraint solver is sound
(returns a correct test input if one exists) and complete (terminates and is correct for all cases). This worst case
was rare or nonexistent in our experiments.</p>
        <p>4.3 Correctness and Completeness</p>
        <p>To produce readable proofs, we assume all boolean operations on a path constraint are
performed on the conjunction of all conditions of the path constraint.</p>
        <p>input : uut (unit under test); p (full path constraint); f (partial path constraint)
output: testInput
1  testInput ← constraintSolver(f);
2  if testInput = null then
3    return null;
4  end
5  for i ← 0 to sizeOf(p) − 1 do
6    if !sat(testInput, p[i]) then
7      append(f, p[i]);
8      return IPPC_Solve(uut, p, f);
9    end
10 end
11 return testInput;</p>
        <p>Algorithm 2: IPPC Solve</p>
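        <p>The satisfiability check in lines 5-10 of Algorithm 2 can be sketched in C. The encoding of path conditions as predicates and the helper name first_unsat are illustrative assumptions, not CREST internals; IPPC_Solve appends the returned condition to the partial constraint f and then re-invokes the solver.</p>

```c
#include <stddef.h>

/* Hypothetical encoding of a path condition as a predicate over an
   input vector; CREST's internal representation differs. */
typedef int (*PathCond)(const int *in);

/* Example conditions over in = [a, b, c], as in the isSorted trace. */
static int a_le_b(const int *in) { return in[0] <= in[1]; }
static int a_le_c(const int *in) { return in[0] <= in[2]; }
static int b_le_c(const int *in) { return in[1] <= in[2]; }

/* Index of the first condition of p[0..n-1] that the input violates,
   or -1 if the input satisfies the whole constraint. This mirrors
   lines 5-10 of Algorithm 2: the violated condition is exactly the
   one appended to f before the recursive call. */
int first_unsat(const int *in, const PathCond *p, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (!p[i](in))
            return (int)i;
    return -1;
}
```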
        <sec id="sec-4-1-1">
          <title>Theorem 1. If a partial path constraint f is unsatisfiable, then its full path constraint p is also unsatisfiable.</title>
          <p>Proof. Since f is unsatisfiable, ¬f is valid. Since f is an overapproximation of p, p → f by definition.
Therefore, by modus tollens, ¬p is valid. In other words, p is unsatisfiable.</p>
          <p>We now show the correctness of IPPC_Solve. We change only one line of Algorithm 1. Therefore, to
prove the correctness of IPPC, we only need to ensure that, for all path constraints, IPPC_Solve generates
an input vector with the same property as the input vector generated by the constraint solver: if such
an input vector exists, the constraint solver always generates an input vector that satisfies the full path
constraint.</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>Theorem 2. (a) IPPC Solve generates null whenever the constraint solver at line 11 of Algorithm 1 generates null. (b) Otherwise, IPPC Solve always generates an input vector that satisfies the full path constraint.</title>
          <p>Proof. The proof of (a) is trivial due to lines 1-4 of Algorithm 2. We can see from lines 5-10 that Algorithm
2 does not stop until the input vector satisfies the full path constraint. Therefore, the proof of (b) is trivial
as well.</p>
        </sec>
        <sec id="sec-4-1-3">
          <title>Theorem 3. IPPC Solve is correct and eventually terminates, i.e. IPPC Solve is complete.</title>
          <p>Proof. IPPC Solve is correct by Theorem 2. The second part of the proof is as follows. At any time, the
partial path constraint f is either (a) unsatisfiable, or an input vector i is generated. If an input vector i
is generated, either (b) i satisfies p, or (c) we add a new path condition of the full path constraint to f. In case
(a), IPPC Solve returns null and terminates. In case (b), IPPC Solve returns i and terminates. If (c) happens
N − 1 times, where N denotes the number of path conditions in p, then f = p must hold. In other words, we
build the partial path constraint up to the full path constraint. In that case, the constraint solver must
either return null or generate a satisfying input vector; therefore, the algorithm terminates.</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>Experiments</title>
        <p>We compare the following algorithms in our experiments:
1. DFS: Standard concolic testing algorithm.
2. CFG: Control flow directed testing algorithm.
3. RND: Random branch testing heuristic on top of the standard concolic testing algorithm.
4. IPPC: Our Incremental Partial Path Constraint algorithm.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Experimental Results</title>
        <p>Brief explanations of the CFG and RND algorithms are given in Section 3. We carried out the experiments on a virtual Linux guest with 1024 MB of memory and one CPU, hosted by a MacBook Pro with an Intel Core i7 2.9 GHz CPU and 8 GB of memory. We collected the following information for each experiment:</p>
        <p>1. Time elapsed: Time spent to test the UUT.
2. Total CS Calls: Total number of constraint solver calls made by the concolic tester.
3. Avg Const. Size: Average size of path constraints solved by the constraint solver.
4. Branch Coverage: Measurement of total branch coverage of UUT.</p>
        <p>Concolic testing involves a degree of randomness due to the randomness of the initial inputs. Therefore,
we performed 10 executions of each experiment and took the average of each measure. All of our
experimental results can be found in Table 2.</p>
        <p>Table 1 shows the program units we used in our experiments. Column KLOC represents the lines of
code measured in thousands. Column #vars represents the number of symbolic variables.</p>
        <p>
          We implemented five small benchmarks, gcd, bsort, sqrt, prime and factor to conduct our initial
experiments. gcd implements the binary greatest common divisor algorithm. We downloaded and
modified the bubble sort algorithm bsort, which sorts a given array of integers [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. sqrt takes the floor
of the square root of a given integer. prime decides whether a given integer is prime or not. factor is
an integer factorization algorithm. replace and grep are benchmarks that come with the CREST framework
and are used in several research studies on concolic testing [
          <xref ref-type="bibr" rid="ref1 ref14 ref5">1, 5, 14</xref>
          ]. ptokens is the printtokens benchmark
available in the Software-artifact Infrastructure Repository (SIR) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. ptokens tokenizes a given string according
to a grammar. For reasons described in Section 6, our implementations of gcd, prime and factor do not
contain any bitwise masking or modulo operations. Instead, we decide divisibility using only subtraction
and comparison operations. Also, sqrt does not use any floating-point operations, since CREST has no
symbolic equivalent of floating-point variables.
        </p>
        <p>In our experiments, we used programs of different sizes. None of the programs is too large to be tested
in reasonable time, so we avoid scalability issues. For our experimental set, we argue that the performance
of the testers is related to the structure of the UUT rather than to the sizes of the programs. We observed
that small programs such as factor can have very long runtimes (120 sec). We argue that IPPC performs well
not when the program is small or large, but when the program contains many infeasible paths. Although small,
prime and factor contain many infeasible paths; although large, grep and replace do not contain many
infeasible paths.</p>
        <p>We show all experimental results in Table 2. The CFG method is not applicable to grep due to a bug
in CREST; we use N/A to denote the results we could not observe. We used 1000, 5000
and 10000 as the maximum number of iterations (#Itr) in experiments. We decreased #Itr for prime,
factor, bsort and grep, since those units have few distinct execution traces. When there are few distinct
execution traces and #Itr is too high, DFS and IPPC are able to stop before completing #Itr iterations,
since they check if all execution traces are explored or not. However, CFG and RND have no such
stopping condition and iterate #Itr times. So, if we kept #Itr high for prime, factor, bsort and grep, it
would unfairly result in bad runtimes for CFG and RND.</p>
        <p>We show the best algorithms for each benchmark in Table 3, where All Equal means all techniques
perform equally well. The column denoted by Runtime First compares techniques by runtime first; if
techniques have similar runtimes, it compares branch coverages. The last column compares techniques by
coverage first. DFS does not perform well in either column. In terms of runtime, RND is the best and
IPPC is second. In terms of coverage, CFG and IPPC are the best. Hence, IPPC performs well in
both categories.</p>
        <p>We next compare IPPC with DFS in more detail since IPPC is derived from DFS. We observe from
Table 4 that there exists a correlation between Avg Const. Size of DFS / Avg Const. Size of IPPC and
speedup of IPPC over DFS. When the gap between the constraint sizes of DFS and IPPC increases,
the speedup of IPPC over DFS increases with the slight exception of grep. Similarly, Figure 4 shows
that whenever the number of constraint solver calls (Total CS Calls) of DFS is close to the maximum
number of iterations (#Itr), DFS works faster. If DFS makes many more calls than the number of
iterations, IPPC works faster. Normally, DFS is expected to call constraint solver exactly once for
each iteration. DFS makes more than one constraint solver call for an iteration only if the generated
path constraint for that iteration is infeasible. In that case, DFS changes the path constraint and calls
constraint solver repeatedly until a feasible path is found. We believe factor has the largest number of
infeasibilities since DFS makes many constraint solver calls for few iterations. Figure 4 shows that the
speedup of IPPC over DFS is above 5x when DFS generates four or more infeasible path constraints for
each feasible path constraint (i.e., DFS makes more than five constraint solver calls per iteration, so
#Itr / Total CS Calls of DFS &lt; 0.2). Therefore, IPPC finds infeasibilities faster for all benchmarks.</p>
        <p>In all experiments, the largest path constraint produced by IPPC had a length of 157, whereas the
largest path constraints of DFS, CFG and RND had lengths of 2922, 1603 and 1391, respectively. We
conclude that we eliminated the need for solving large path constraints to generate test inputs while
keeping the runtimes fast and coverage high using IPPC.</p>
        <p>We get exactly the same coverage results for both IPPC and DFS. We expect this, since both
algorithms perform a depth-first search. We argue that the identical coverage results indicate that IPPC
correctly performs DFS on the UUT while improving performance.</p>
        <p>IPPC considers constraints that are 10x smaller than those of the other techniques. We believe this is because
IPPC finds infeasibilities early, since most infeasibilities arise from a combination of a few conditions in
the constraint.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Threats to Validity</title>
      <p>
        We assume that the constraint solver is sound, i.e. the constraint solver can find an input vector whenever
there exists an input vector which satisfies the path constraint. In general, this assumption is not valid.
We know that the constraint solver used in CREST, Yices [
        <xref ref-type="bibr" rid="ref11 ref6">6, 11</xref>
        ], does not find solutions for nonlinear
path conditions (e.g. x2x1 &lt; 12). We also know that CREST may not be able to solve conditions
involving modulo operation and bitwise masking [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. All path conditions generated in our experiments
are linear, in other words they can be written as k ./ åiN=1 cixi where ./2 f&lt;; ; =; 6=; &gt;; g, k 2 R,
8i 2 f1; :::; Ng; ci 2 R and N: total number of symbolic variables). All path conditions in our experiments
are also free of modulo and bit masking operations. Hence, we safely assume that the constraint solver
is sound in our environment. We also do not use any floating point arithmetic.
      </p>
      <p>The UUT can have intermediary variables calculated from symbolic variables and can therefore have
path conditions on those intermediary variables. All path conditions that CREST returns are on the initial
symbolic variables. Therefore, even if all branch conditions in the code seem trivial, CREST may
fail to generate correct inputs if the intermediary variables are nonlinear (e.g., the multiplication of two symbolic variables).</p>
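      <p>As a hypothetical illustration of this point (the function below is ours, not one of the benchmarks), the branch condition looks like a simple comparison, but the intermediary variable makes the path condition x1 · x2 &lt; 12, which is nonlinear in the symbolic inputs:</p>

```c
/* The branch condition "m < 12" looks trivial, but because
   m = x1 * x2 is an intermediary variable, the path condition in
   terms of the symbolic inputs is x1 * x2 < 12, which is nonlinear
   and thus beyond the Yices-backed CREST. */
int underTwelve(int x1, int x2) {
    int m = x1 * x2;   /* intermediary variable */
    if (m < 12)
        return 1;
    return 0;
}
```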
      <p>We assume the UUT to be sequential and deterministic, i.e., if an input vector i produces an execution
trace e, then i will always produce e for this UUT. A program which depends on random numbers, for
example, would violate this assumption. We carefully chose the experiments so that we never violate
these basic assumptions.</p>
      <p>We assume the UUT is terminating, since the tester halts only if the UUT halts. The test cases we
chose and the test cases in previous work are all terminating.</p>
      <p>We report runtimes that we acquired from a virtual environment. We obtained similar results on a host
machine as well.</p>
      <p>It is possible that IPPC may grow partial path constraints to the point that they become full path
constraints. So, in the worst case, a standard concolic tester is more efficient than IPPC. However, our
experimental results show that only a small portion of the full path constraint is required to generate
input vectors belonging to the same equivalence class, i.e., input vectors which generate the same execution
trace when given to the UUT.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions and Future Work</title>
      <p>In this paper, we designed a constraint solving strategy, called Incremental Partial Path Constraints
(IPPC), which eliminates the need for solving large constraints to generate unit tests. We compared our
design with other concolic testing algorithms in experiments. We observed that when there are many
infeasible path constraints in a program, IPPC has more than 5x speedup over a standard concolic tester.
We significantly reduce the number of path conditions required to generate an input vector and show
that IPPC dominates other techniques in two of eight cases and has the best coverage levels in three of
eight cases. We show that it is possible to reach high branch coverage while decreasing the burden on
the constraint solver.</p>
      <p>We believe that our constraint solving strategy leaves room for improvement and shows promise
for future work. The performance of IPPC on nonlinear path constraints (i.e., path constraints that
contain at least one nonlinear path condition) or on concurrent software is an important open question. Also,
it is possible to further improve IPPC by input vector caching, i.e., trying previously generated input
vectors first to avoid calling the constraint solver. We believe that our work sufficiently motivates further
research on constraint solving strategies involving partial path constraints.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Burnim</surname>
          </string-name>
          and
          <string-name>
            <given-names>Koushik</given-names>
            <surname>Sen</surname>
          </string-name>
          .
          <article-title>Heuristics for scalable dynamic test generation</article-title>
          .
          <source>In Proceedings of the 2008 23rd IEEE/ACM International Conference on Automated Software Engineering, ASE '08</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] CREST-z3, https://github.com/heechul/crest-z3.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] Leonardo de Moura and Nikolaj Bjørner.
          <article-title>Z3: An efficient SMT solver</article-title>
          .
          <source>In Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS'08/ETAPS'08</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Hyunsook</given-names>
            <surname>Do</surname>
          </string-name>
          , Sebastian G. Elbaum, and
          <string-name>
            <given-names>Gregg</given-names>
            <surname>Rothermel</surname>
          </string-name>
          .
          <article-title>Supporting controlled experimentation with testing techniques: An infrastructure and its potential impact</article-title>
          .
          <source>Empirical Software Engineering: An International Journal</source>
          ,
          <volume>10</volume>
          (
          <issue>4</issue>
          ):
          <fpage>405</fpage>
          -
          <lpage>435</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Yu</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mengxiang</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kai</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yi</given-names>
            <surname>Zhou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Yinli</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <article-title>Achieving high branch coverage with fewer paths</article-title>
          .
          <source>In Proceedings of the 2011 IEEE 35th Annual Computer Software and Applications Conference Workshops, COMPSACW '11</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Bruno</given-names>
            <surname>Dutertre</surname>
          </string-name>
          .
          <article-title>Yices 2.2</article-title>
          . In A. Biere and R. Bloem, editors,
          <source>Computer Aided Verification</source>
          , volume
          <volume>8559</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>737</fpage>
          -
          <lpage>744</lpage>
          . Springer International Publishing,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Azadeh</given-names>
            <surname>Farzan</surname>
          </string-name>
          , Andreas Holzer, Niloofar Razavi, and Helmut Veith.
          <article-title>Con2colic testing</article-title>
          .
          <source>In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, ESEC/FSE</source>
          <year>2013</year>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Patrice</given-names>
            <surname>Godefroid</surname>
          </string-name>
          , Nils Klarlund, and
          <string-name>
            <given-names>Koushik</given-names>
            <surname>Sen</surname>
          </string-name>
          .
          <article-title>DART: Directed automated random testing</article-title>
          .
          <source>SIGPLAN Not</source>
          .,
          <volume>40</volume>
          (
          <issue>6</issue>
          ):
          <fpage>213</fpage>
          -
          <lpage>223</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Pierluigi</given-names>
            <surname>Nuzzo</surname>
          </string-name>
          , Alberto Puggelli,
          <string-name>
            <given-names>Sanjit A.</given-names>
            <surname>Seshia</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Alberto</given-names>
            <surname>Sangiovanni-Vincentelli</surname>
          </string-name>
          .
          <article-title>CalCS: SMT solving for non-linear convex constraints</article-title>
          .
          <source>In Proceedings of the 2010 Conference on Formal Methods in Computer-Aided Design, FMCAD '10</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] http://bitbucket.org/yavuzkoroglu/crest-ppc.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Xiao</given-names>
            <surname>Qu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Brian</given-names>
            <surname>Robinson</surname>
          </string-name>
          .
          <article-title>A case study of concolic testing tools and their limitations</article-title>
          .
          <source>In Proceedings of the 2011 International Symposium on Empirical Software Engineering and Measurement</source>
          ,
          <source>ESEM '11</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Koushik</given-names>
            <surname>Sen</surname>
          </string-name>
          .
          <source>Scalable Automated Methods for Dynamic Program Analysis</source>
          .
          <source>PhD thesis</source>
          , University of Illinois at Urbana-Champaign,
          <year>2006</year>
          . Available at http://srl.cs.berkeley.edu/~ksen/doku.php.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Koushik</given-names>
            <surname>Sen</surname>
          </string-name>
          , Darko Marinov, and
          <string-name>
            <given-names>Gul</given-names>
            <surname>Agha</surname>
          </string-name>
          .
          <article-title>CUTE: A concolic unit testing engine for C</article-title>
          .
          <source>SIGSOFT Softw. Eng. Notes</source>
          ,
          <volume>30</volume>
          (
          <issue>5</issue>
          ):
          <fpage>263</fpage>
          -
          <lpage>272</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Hyunmin</given-names>
            <surname>Seo</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sunghun</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>How we get there: A context-guided search strategy in concolic testing</article-title>
          .
          <source>In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE</source>
          <year>2014</year>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Stelios</given-names>
            <surname>Sidiroglou-Douskos</surname>
          </string-name>
          , Eric Lahtinen, Nathan Rittenhouse, Paolo Piselli,
          <string-name>
            <given-names>Fan</given-names>
            <surname>Long</surname>
          </string-name>
          , Deokhwan Kim, and
          <string-name>
            <given-names>Martin C.</given-names>
            <surname>Rinard</surname>
          </string-name>
          .
          <article-title>Targeted automatic integer overflow discovery using goal-directed conditional branch enforcement</article-title>
          .
          <source>In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '15</source>
          , pages
          <fpage>473</fpage>
          -
          <lpage>486</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Matheus</given-names>
            <surname>Souza</surname>
          </string-name>
          , Mateus Borges, Marcelo d'Amorim, and Corina S. Păsăreanu.
          <article-title>CORAL: Solving complex constraints for symbolic pathfinder</article-title>
          .
          <source>In Proceedings of the Third International Conference on NASA Formal Methods, NFM'11</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17] BubbleSort, http://www.programmingsimplified.com/c/source-code/c-program-bubble-sort.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Edward P. K.</given-names>
            <surname>Tsang</surname>
          </string-name>
          .
          <source>Foundations of Constraint Satisfaction</source>
          . Computation in Cognitive Science. Academic Press,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>