<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Speeding Up SMT-Based Quantitative Program Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel J. Fremont</string-name>
          <email>dfremont@berkeley.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sanjit A. Seshia</string-name>
          <email>sseshia@eecs.berkeley.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of California</institution>
          ,
          <addr-line>Berkeley</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Quantitative program analysis involves computing numerical quantities about individual or collections of program executions. An example of such a computation is quantitative information flow analysis, where one estimates the amount of information leaked about secret data through a program's output channels. Such information can be quantified in several ways, including channel capacity and (Shannon) entropy. In this paper, we formalize a class of quantitative analysis problems defined over a weighted control flow graph of a loop-free program. These problems can be solved using a combination of path enumeration, SMT solving, and model counting. However, existing methods can only handle very small programs, primarily because the number of execution paths can be exponential in the program size. We show how path explosion can be mitigated in some practical cases by taking advantage of special branching structure and by novel algorithm design. We demonstrate our techniques by computing the channel capacities of the timing side-channels of two programs with extremely large numbers of paths.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
<p>Quantitative program analysis involves computing numerical quantities that are functions of individual or collections of program executions. Examples of such problems include computing worst-case or average-case execution time of programs, and quantitative information flow, which seeks to compute the amount of information leaked by a program. Much of the work in this area has focused on extremal quantitative analysis problems, that is, problems of finding worst-case (or best-case) bounds on quantities. However, several problems involve not just finding extremal bounds but computing functions over multiple (or all) executions of a program. One such example, in the general area of quantitative information flow, is to estimate the entropy or channel capacity of a program's output channel. These quantitative analysis problems are computationally more challenging, since the number of executions (for terminating programs) can be very large, possibly exponential in the program size.</p>
<p>In this paper, we present a formalization and satisfiability modulo theories (SMT) based solution to a family of quantitative analysis questions for deterministic, terminating programs. The formalization is covered in detail in Section 2, but we present some basic intuition here. This family of problems can be defined over a weighted graph-based model of the program. More specifically, considering the program's control flow graph, one can ascribe weights to nodes or edges of the graph capturing the quantity of interest (execution time, number of bits leaked, memory used, etc.) for basic blocks. Then, to obtain the quantitative measure for a given program path, one sums up the weights along that path. Furthermore, in order to count the number of program inputs (and thus executions) corresponding to a program path, one can perform model counting on the formula encoding the path condition. Finally, to compute the quantity of interest (such as entropy or channel capacity) for the overall program, one combines the quantities and model counts obtained for all program paths using a prescribed formula.</p>
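<p>As a purely illustrative sketch of this pipeline, the following Python brute-forces a tiny loop-free program: each branch contributes the weight of its "then" block, inputs are grouped by total weight, and the results are combined into a distribution and its support. The program, weights, and bit-width are invented for illustration; the paper replaces input enumeration by SMT-based model counting.</p>

```python
from collections import Counter
from itertools import product
from math import log2

# An invented 3-bit loop-free program; each "basic block" is given a weight
# (e.g. its execution time). Program, weights, and bit-width are illustrative.
def run(x0, x1, x2):
    wt = 1                  # weight of the always-executed entry block
    if x0 and x1:
        wt += 3             # weight of the first conditional's "then" block
    if x2:
        wt += 2             # weight of the second conditional's "then" block
    return wt               # total weight of the executed path

# The basic (pre-optimization) approach: enumerate inputs, sum weights along
# each triggered path, and combine the counts into the distribution of wt(x).
counts = Counter(run(*bits) for bits in product([0, 1], repeat=3))
total = sum(counts.values())                       # 2^3 equally likely inputs
dist = {w: c / total for w, c in counts.items()}   # Problem 1: distribution
support = set(counts)                              # Problem 2: its support
capacity = log2(len(support))                      # channel capacity in bits
```

<p>The enumeration over inputs is what the paper's per-path, and later per-branch, model counting makes tractable for large programs.</p>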
<p>The obvious limitation of the basic approach sketched above is that, for programs with substantial branching structure, the number of program paths (and thus, executions) can be exponential in the program size. We address this problem in the present paper with two ideas. First, we show how a certain type of "confluent" branching structure which often occurs in real programs can be exploited to gain significant performance enhancements. A common example of this branching structure is the presence of a conditional statement inside a for-loop, which leads to 2^N paths for N loop iterations. In this case, if the branches are proved to be "independent" of each other (by invoking an SMT solver), then one can perform model counting of individual branch conditions rather than of entire path conditions, and then cheaply aggregate those model counts. Secondly, to compute a quantity such as channel capacity, it is not necessary to derive the entire distribution of values over all paths. For this case, we give an efficient algorithm to compute all the values attained by a given quantity (e.g. execution time) over all possible paths, i.e. the support of the distribution, which runs in time polynomial in the sizes of the program and the support. Our algorithmic methods are particularly tuned to the analysis of timing side-channels in programs. Specifically, we apply our ideas to computing the channel capacity of timing side-channels for two standard programs which have far too many paths for previous techniques to handle.</p>
<p>Our techniques enable the use of SMT methods in a new application, namely quantitative program analyses such as assessing the feasibility of side-channel attacks. While SMT methods are used in other program verification problems with exponentially-large search spaces, naive attempts to use them to compute statistics like those we consider do not circumvent path explosion. The optimizations that form our primary contributions are essential in making feasible the application of SMT to our domain.</p>
<p>To summarize, the main contributions of this paper include: (i) a method for utilizing special branching structure to reduce the number of model counter invocations needed to compute the distribution of a class of quantitative measures from potentially exponential to linear in the size of the program, and (ii) an algorithm which exploits this structure to compute the support of such distributions in time polynomial in the size of the program and the support.</p>
<p>The rest of the paper is organized as follows. We present background material and problem definitions in Sec. 2. Algorithms and theoretical results are presented in Sec. 3. Experimental results are given in Sec. 4, and we conclude in Sec. 5.</p>
    </sec>
    <sec id="sec-2">
<title>Background and Problem Definition</title>
<p>We present some background material in Sec. 2.1 and the formal problem definitions in Sec. 2.2.</p>
      <sec id="sec-2-1">
        <title>Preliminaries</title>
<p>We assume throughout that we are given a loop-free deterministic program F whose input is a set of bits I. Our running example for F will be the standard algorithm for modular exponentiation by repeated squaring, denoted modexp, where the base and modulus are fixed and the input is the exponent. Usually modexp is written with a loop that iterates once for each bit of the exponent. To make modexp loop-free we unroll its loop, yielding for a 2-bit exponent the program shown on the left of Figure 1. Lines 2-5 and 6-9 correspond to the two iterations of the loop.</p>
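<p>For reference, the looped form of the routine (before unrolling) can be sketched as follows; agreement with Python's built-in pow provides a quick sanity check.</p>

```python
# Square-and-multiply modular exponentiation, as in the paper's running
# example but with the loop left in place: one iteration per exponent bit.
# Unrolling the loop nbits times yields the loop-free program of Figure 1.
def modexp(b, e, m, nbits):
    r = 1
    for _ in range(nbits):
        if e % 2 == 1:          # branch condition: low bit of e is set
            r = (r * b) % m     # multiply step (the conditional's "then" block)
        e = e >> 1              # shift to the next exponent bit
        b = (b * b) % m         # squaring step, executed every iteration
    return r
```

<p>For a 2-bit exponent, the body of the loop corresponds to lines 2-5 and 6-9 of the unrolled program.</p>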
<p>1: r ← 1
2: if (e &amp; 1) = 1 then
3:   r ← r · b (mod m)
4: e ← e &gt;&gt; 1
5: b ← b^2 (mod m)
6: if (e &amp; 1) = 1 then
7:   r ← r · b (mod m)
8: e ← e &gt;&gt; 1
9: b ← b^2 (mod m)
10: return r</p>
<p>[Figure 1: the 2-bit modexp program (left) and its CFG, with edges labeled e1 through e9.]</p>
<p>To describe the execution paths of F we use the formalism introduced by McCabe [6]. Consider the control-flow graph (CFG) of F, where there is a vertex for each basic block, with conditionals having two outgoing edges. For example, since 2-bit modexp has two conditionals, its CFG (shown in Figure 1) has two vertices with outdegree 2. We call such vertices branch points, and denote the set of them by B. Which edge out of a branch point b ∈ B is taken depends on the truth of its branch condition Cb, the condition in the corresponding conditional statement. In Figure 1, the branch condition for the first branch point is (e &amp; 1) = 1: if this holds, then edge e3 is taken, and otherwise edge e2 is taken. We model the finite-precision semantics of programs, with variables represented as bitvectors, so that the branch conditions can be expressed as bitvector SMT formulae. Since these conditions can depend on the result of prior computations (e.g. the second branch condition in Figure 1), the corresponding SMT formulae include constraints encoding how those computations proceed. Then each formula uniquely determines the truth of its branch condition given an assignment to the input bits. When necessary, these formulae can be bit-blasted into propositional SAT formulae for further analysis (e.g. model counting).</p>
<p>For convenience we add a dummy vertex to the CFG which has an incoming edge from all sink vertices. Since F is loop-free the CFG is a DAG, and each execution of F corresponds to a simple path from the source to the (now unique) sink. Given such a path P, we write B(P) for the set of branch points where P takes the right of the two outgoing edges, corresponding to making Cb true. If there are N edges then these paths can be viewed as vectors in {0,1}^N, where each coordinate specifies whether the corresponding edge is taken. For example, in Figure 1 path A corresponds to the vector (1, 0, 1, 1, 1, 1, 0, 0, 1) under the given edge labeling. This representation allows us to speak meaningfully about linear combinations of paths, as long as the result is in {0,1}^N. A basis of the set of paths is defined by analogy to vector spaces to be a minimal set of paths from which all paths can be obtained by taking linear combinations. In Figure 1, the paths A, B, and C form a basis, as the only other path through the CFG can be expressed as A + B − C.</p>
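<p>The path-vector view can be made concrete with a small sketch. The edge ordering below (an entry edge, then left/right edges per diamond-shaped conditional, then a connector edge) is an assumed layout for illustration, not the exact labeling of Figure 1.</p>

```python
# Paths of an "N-diamond" CFG encoded as 0/1 edge vectors, under an assumed
# illustrative edge ordering (entry, then left/right per diamond, then a
# connector edge after each diamond).
def path_vector(branches):
    """branches[i] is True iff the path takes the right edge at diamond i."""
    vec = [1]                                  # entry edge, always taken
    for taken in branches:
        vec += [0, 1] if taken else [1, 0]     # left edge, right edge
        vec.append(1)                          # connector edge after the diamond
    return vec

def combine(*terms):
    """Linear combination of equal-length path vectors: (coeff, vector) pairs."""
    n = len(terms[0][1])
    return [sum(c * v[i] for c, v in terms) for i in range(n)]

# The basis paths for two diamonds, in the paper's notation:
A = path_vector([True, False])    # condition true only at the first diamond
B = path_vector([False, True])    # condition true only at the second
C = path_vector([False, False])   # every condition false
D = combine((1, A), (1, B), (-1, C))   # A + B - C, still a valid 0/1 path
```

<p>The combination A + B − C lands back in {0,1}^N and is exactly the fourth path, mirroring the basis property described above.</p>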
<p>Now suppose we are given an integer weight for each basic block of F, or equivalently for each vertex of its CFG (see footnote 1). We define the total weight wt(P) of an execution path P of F to be the sum of the weights of all basic blocks along P. Note that we get the same value if the weight of each vertex is moved to all of its outgoing edges (obviously excluding the dummy sink), and we sum edge instead of vertex weights; thus wt(·) is a linear function. Since F is deterministic, each input x ∈ {0,1}^I triggers a unique execution path we denote P(x), and so has a well-defined total weight wt(x) = wt(P(x)).</p>
      </sec>
      <sec id="sec-2-2">
<title>Problem Definition</title>
        <sec id="sec-2-2-1">
<p>We consider in this paper the following problems:</p>
<p>Problem 1. Picking x ∈ {0,1}^I uniformly at random, what is the distribution of wt(x)?</p>
<p>and the special case:</p>
<p>Problem 2. What is the support of the distribution of wt(x), i.e. what is the set wt({0,1}^I) = {wt(x) | x ∈ {0,1}^I}?</p>
          <p>One way to think about these problems is to view the weight of a basic block as some
quantity or resource, say execution time or energy, that the block consumes when executed.
Then Problem 1 is to find the distribution of the total execution time or energy consumption
of the program.</p>
<p>Computing or estimating this distribution is useful in a range of applications (see [10]). We consider here a quantitative information flow (QIF) setting, with an adversary who tries to recover x from wt(x). In the example above, this would be a timing side-channel attack scenario where the adversary can only observe the total execution time of the program. Given the distribution of wt(x), we can compute any of the standard QIF metrics, such as channel capacity or Shannon entropy, measuring how much information is leaked about x. For deterministic programs, the channel capacity (see footnote 2) is simply the (base 2) logarithm of the number of possible observed values [11]. Thus to compute the channel capacity we do not need to know the full distribution of wt(x), but only how many distinct values it can take; hence our isolation of Problem 2. As we will see, this special case can sometimes be solved much more rapidly than by computing the full distribution.</p>
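<p>The contrast between the two metrics can be sketched in a few lines; the distribution below is an assumed example, not measured data.</p>

```python
from math import log2

# For a deterministic program, channel capacity needs only the support of the
# observable, while Shannon entropy needs the full distribution. The numbers
# here are an assumed example distribution of wt(x).
dist = {10: 0.5, 12: 0.25, 13: 0.25}        # observed value -> probability

capacity = log2(len(dist))                  # log2(#distinct observable values)
entropy = -sum(p * log2(p) for p in dist.values())
```

<p>Here the capacity (log2 3 ≈ 1.58 bits) exceeds the entropy (1.5 bits), as it must: a distribution over n values has entropy at most log2 n.</p>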
<p>We note that the general problems above can be applied to a variety of different types of resources. On platforms where the execution time of a basic block is constant (i.e. not dependent on the state of the machine), they can be applied to timing analysis. The weights could also represent the size of memory allocations, or the number of writes to a stream or device. For all of these, solving Problems 1 and 2 could be useful for performance characterization and analysis of side-channel attacks.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Algorithms and Theoretical Results</title>
<p>The simplest approach to Problem 1 would be to execute program F on every x ∈ {0,1}^I, computing the total weight of the triggered path and eventually obtaining the entire map x ↦ wt(x). This is obviously impractical when there are more than a few input bits, and is wasteful because often many inputs trigger the same execution path. A more refined approach is to enumerate all execution paths, and for each path compute how many inputs trigger it. This can be done by expressing the branch conditions corresponding to the path as a bitvector or propositional formula and applying a model counter [3] (this idea was used in [1] to count how many inputs led to a given output, although with a linear integer arithmetic model counter). If the number of paths is much less than 2^|I|, as is often the case, this approach can be significantly more efficient than brute-force input enumeration. However, as noted above, the number of paths can be exponential in the size of F, in which case this approach requires exponentially many calls to the model counter and is therefore also impractical.</p>
<p>[Footnote 1: Our formalism and approach can be made to work with rational weights, but we focus here on applications for which integer weights suffice. Footnote 2: The channel capacity is sometimes called the conditional min-entropy of x with respect to wt(x), since for deterministic programs with a uniform input distribution they are the same [11].]</p>
<p>A prototypical example of path explosion is our running example modexp. For an N-bit exponent, there are N conditionals, and all possible combinations of these branches can be taken, so that there are 2^N execution paths. This makes model counting each path infeasible, but observe that the algorithm's branching structure has two special properties. First, the conditionals are unnested: the two paths leading from each conditional always converge prior to the next one. Second, the branch conditions are independent: they depend on different bits of the input. Below we show how we can use these properties to gain greater efficiency, yielding Algorithms 2 and 4 for Problems 1 and 2 respectively.</p>
      <sec id="sec-3-1">
        <title>Unnested Conditionals</title>
<p>If F has no nested conditionals, its CFG has an "N-diamond" form like that shown in Figure 1 (the number of basic blocks within and between the "diamonds" can vary, of course; in particular, we do not assume that the "else" branch of a conditional is empty, as is the case for modexp). This type of structure naturally arises when unrolling a loop with a conditional in the body, as indeed is the case for modexp. Verifying that there are no nested conditionals is a simple matter of traversing the CFG.</p>
<p>With unnested conditionals, there is a one-to-one correspondence between execution paths and subsets of B, given by P ↦ B(P). For any b ∈ B, we write Bb for the path which takes the left edge at every branch point except b (i.e. makes every branch condition false except for that of b; of course it is possible that no input triggers this path). We write Bnone for the path which always takes the left edge at each branch point. For example, in Figure 1 if the conditionals on lines 2 and 6 correspond to branch points a and b respectively, then A = Ba, B = Bb, and C = Bnone. In general, Bnone together with the paths Bb form a basis for the set of all paths. In fact, for any path P it is easy to see that</p>
<p>P = Σ_{c ∈ B(P)} Bc − (|B(P)| − 1) · Bnone .   (1)</p>
        <sec id="sec-3-1-1">
          <title>This representation of paths will be useful momentarily.</title>
          <p>3.2</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Independence</title>
<p>Recall that an input variable of a Boolean function is a support variable if the function actually depends on it, i.e. the two cofactors of the function with respect to the variable are not equivalent. For each branch point b ∈ B, let Sb ⊆ I be the set of input bits which are support variables of Cb. We make the following definition:</p>
<p>Definition 1. Two conditionals b, c ∈ B are independent if Sb ∩ Sc = ∅.</p>
        <p>Independence simply means that there are no common support variables, so that the truth
of one condition can be set independently of the truth of the other.</p>
<p>To compute the supports of the branch conditions and check independence, the simplest method is to iterate through all the input bits, checking for each one whether the cofactors of the branch condition with respect to it are inequivalent, using an SMT query in the usual way. This can be substantially streamlined by doing a simple dependency analysis of the branch condition in the source of F, to determine which input variables are involved in its computation. Then only input bits which are part of those variables need be tested (for example, in Figure 1 both branch conditions depend only on the input variable e, and if there were other input variables the bits making them up could be ignored). This procedure is outlined as Algorithm 1. Note that as indicated in Sec. 2.1, the formula φ computed in line 6 encodes the semantics of F, so that the truth of Cb (equivalently, the satisfiability of φ) is uniquely determined by an assignment to the input bits. For lack of space, the proofs of Lemma 1 and the other lemmas in this section are deferred to the Appendix found in the full version of this paper.</p>
<p>Algorithm 1 FindConditionSupports(F)
1: Compute CFG of F and identify branch points B
2: if there are nested conditionals then
3:   return FAILURE
4: for all b ∈ B do
5:   Sb ← ∅ // these are global variables
6:   φ ← SMT formula representing Cb
7:   V ← input bits appearing in φ
8:   for all v ∈ V do
9:     if the cofactors of Cb w.r.t. v are not equivalent then
10:      Sb ← Sb ∪ {v}
11: return SUCCESS</p>
<p>Lemma 1. Algorithm 1 computes the supports Sb correctly, and given an SMT oracle runs in time polynomial in |F| and |I|.</p>
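<p>A brute-force stand-in for the cofactor test of Algorithm 1 can be sketched as follows. The paper discharges each check with one SMT query per bit; the exhaustive enumeration below only works for tiny examples, and the two conditions are modeled after 2-bit modexp.</p>

```python
from itertools import product

# Input bit v is a support variable of condition C iff the cofactors C[v=0]
# and C[v=1] disagree on some assignment; here we test this exhaustively,
# where an SMT equivalence query would be used in practice.
def support_vars(cond, nbits):
    sup = set()
    for v in range(nbits):
        for bits in product([0, 1], repeat=nbits):
            b0, b1 = list(bits), list(bits)
            b0[v], b1[v] = 0, 1                # the two cofactors w.r.t. v
            if cond(b0) != cond(b1):
                sup.add(v)                     # v is a support variable
                break
    return sup

# Branch conditions of 2-bit modexp on exponent bits (e0, e1): iteration i
# tests bit i of the exponent, so the two supports are disjoint (independent).
C1 = lambda bits: bits[0] == 1
C2 = lambda bits: bits[1] == 1
```

<p>Disjointness of the two computed supports is exactly the independence condition of Definition 1.</p>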
<p>If all of the conditionals of F are pairwise independent, then I can be partitioned into the pairwise disjoint sets Sb together with the set of remaining bits, which we write Snone. For any b ∈ B, the truth of Cb depends only on the variables in Sb, and we denote by Tb the number of assignments to those variables which make Cb true. Then we have the following formula for the probability of a path:</p>
<p>Lemma 2. Picking i ∈ {0,1}^I uniformly at random, for any path P, the probability that the path corresponding to input i is P is given by</p>
<p>( Π_{b ∈ B(P)} Tb ) · ( Π_{b ∈ B \ B(P)} (2^|Sb| − Tb) ) · 2^|Snone| / 2^|I| .</p>
<p>Lemma 2 allows us to compute the probability of any path as a simple product if we know the quantities Tb. Each of these in turn can be computed with a single call to a model counter, as done in Algorithm 2.</p>
<p>Algorithm 2 FindWeightDistribution(F, weights)
1: if FindConditionSupports(F) = FAILURE then
2:   return FAILURE
3: if the sets Sb are not pairwise disjoint then
4:   return FAILURE
5: for all b ∈ B do
6:   Tb ← model count of Cb over the variables in Sb
7: dist ← constant zero function
8: for all execution paths P do
9:   p ← probability of P from Lemma 2
10:   dist ← dist[wt(P) ↦ dist(wt(P)) + p]
11: return dist</p>
<p>Theorem 1. Algorithm 2 correctly solves Problem 1, and given SMT and model counter oracles runs in time polynomial in |F|, |I|, and the number of execution paths of F. The model counter is only queried |B| times.</p>
        <sec id="sec-3-2-1">
          <title>Proof. Follows from Lemmas 1 and 2.</title>
          <p>Algorithm 2 improves on path enumeration by using one invocation of the model counter
per branch point, instead of one invocation per path. In total the algorithm may still take
exponential time, since we need to compute the product of Lemma 2 for each path, but if
model counting is expensive there is a substantial savings.</p>
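<p>A minimal sketch of how Lemma 2 and Algorithm 2 fit together, on an invented toy program with two independent, unnested branches (the conditions, supports, and weights are assumptions for illustration; a model counter would compute each Tb in the real setting):</p>

```python
from itertools import product

# Toy 3-bit program: branch 'a' is the condition "x0 and x1" with support
# {0, 1}; branch 'b' is "x2" with support {2}. All names/weights illustrative.
conds = {'a': lambda v: v[0] and v[1], 'b': lambda v: v[2]}
supports = {'a': [0, 1], 'b': [2]}
branch_wt = {'a': 3, 'b': 2}    # extra weight of each branch's "then" block
base_wt = 1                     # weight of the always-executed blocks

# One "model count" T_b per branch point: assignments to S_b making C_b true.
T = {br: sum(1 for a in product([0, 1], repeat=len(sup))
             if conds[br](dict(zip(sup, a))))
     for br, sup in supports.items()}

# Lemma 2: with pairwise-disjoint supports, each path's probability is a
# simple product over the branch points, with no per-path model counting.
dist = {}
for taken in product([False, True], repeat=len(supports)):
    p, w = 1.0, base_wt
    for (br, sup), t in zip(supports.items(), taken):
        size = 2 ** len(sup)
        p *= (T[br] if t else size - T[br]) / size
        if t:
            w += branch_wt[br]
    dist[w] = dist.get(w, 0) + p
```

<p>Only two model counts are needed, one per branch, instead of one per path, and the resulting distribution matches brute-force input enumeration on this example.</p>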
<p>Further savings are possible if we restrict ourselves to Problem 2. For this, we want to compute the possible values of wt(x) for all inputs x. This is identical to the set of possible values wt(P) for all feasible paths P (the paths that are executed by some input). Thus, we do not need to know the probability associated with each individual path, but only which paths are feasible and which are not. Lemma 2 implies that all paths are feasible (unless some Tb = 0 or Tb = 2^|Sb|, corresponding to a conditional which is identically false or true; but then Sb = ∅, so we can detect and eliminate such trivial conditionals), and this leads to</p>
<p>Lemma 3. Let D be the multiset of differences wt(Bb) − wt(Bnone) for b ∈ B. Then the possible values of wt(i) over all inputs i ∈ {0,1}^I are exactly the values wt(Bnone) + s for s ∈ D+, where D+ is the set of sums of submultisets of D.</p>
<p>To use Lemma 3 to solve Problem 2, we must find the set D+. The brute-force approach of enumerating all submultisets is obviously impractical unless D is very small. We cannot hope to do better than exponential time in the worst case (see footnote 3), since D+ can be exponentially larger than D. However, in many practical situations D+ is not too much larger than D. This is because the paths Bb often have similar weights, so the variation V = max D − min D is small and we can apply the following lemma:</p>
<p>Lemma 4. If V = max D − min D, then |D+| = O(V · |D|^2).</p>
<p>Small differences between weights are exploited by Algorithm 3, which as shown in the Appendix computes D+ in O(|D| · |D+|) time. By Lemma 4, the algorithm's runtime is O(|D| · V · |D|^2) = O(V · |D|^3), so it is very efficient when V is small. The essential idea of the algorithm is to handle one element x ∈ D at a time, keeping a list of possible sums found so far sorted, so that updating it with the new sums possible using x is a linear-time operation. For simplicity we only show how positive x ∈ D are handled; see the analysis in the Appendix for the general case.</p>
<p>[Footnote 3: We note, though, that for channel capacity analysis we only need |D+| and not D+ itself, and there could be a faster (potentially even polynomial-time) algorithm to find this value.]</p>
<p>Algorithm 3 SubmultisetSums(D)
1: sums ← (0)
2: for all x ∈ D do
3:   newSums ← (sums[0])
4:   i ← 1 // index of next element of sums to add to newSums
5:   for all y ∈ sums do
6:     z ← x + y
7:     while i &lt; len(sums) and sums[i] &lt; z do
8:       newSums.append(sums[i])
9:       i ← i + 1
10:    newSums.append(z)
11:    if i &lt; len(sums) and sums[i] = z then
12:      i ← i + 1
13:  sums ← newSums
14: return sums</p>
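<p>The pseudocode above transcribes almost line for line into Python; as in the paper, this sketch handles positive elements of D only.</p>

```python
# Algorithm 3 for positive elements of D: keep the sorted list of all
# submultiset sums found so far, and for each new element x merge in the
# newly reachable sums x + y with a single linear pass.
def submultiset_sums(D):
    sums = [0]                        # sum of the empty submultiset
    for x in D:
        new_sums = [sums[0]]
        i = 1                         # next old sum to merge into new_sums
        for y in sums:
            z = x + y                 # a sum newly reachable using x
            while len(sums) > i and z > sums[i]:
                new_sums.append(sums[i])   # copy over smaller old sums
                i += 1
            new_sums.append(z)
            if len(sums) > i and sums[i] == z:
                i += 1                # skip an old sum that duplicates z
        sums = new_sums
    return sums
```

<p>For D = {2, 2, 3} this yields [0, 2, 3, 4, 5, 7], the sums of all submultisets, at overall cost O(|D| · |D+|).</p>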
<p>Using Algorithm 3 together with Lemma 3 gives an efficient algorithm to solve Problem 2, outlined as Algorithm 4. This algorithm has runtime polynomial in the size of its input and output.</p>
<p>Algorithm 4 FindPossibleWeights(F, weights)
1: if FindConditionSupports(F) = FAILURE then
2:   return FAILURE
3: if the sets Sb are not pairwise disjoint then
4:   return FAILURE
5: Eliminate branch points with Sb = ∅ (trivial conditionals)
6: D ← empty multiset
7: for all b ∈ B do
8:   d ← wt(Bb) − wt(Bnone)
9:   D ← D ∪ {d}
10: D+ ← SubmultisetSums(D)
11: return wt(Bnone) + D+</p>
<p>Theorem 2. Algorithm 4 solves Problem 2 correctly, and given an SMT oracle runs in time polynomial in |F|, |I|, and |wt({0,1}^I)|.</p>
<p>Proof. Clear from Lemmas 1 and 3, and the analysis of Algorithm 3 (see the Appendix).</p>
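<p>End to end, Algorithm 4 can be sketched on an invented toy program (the weights and conditions are illustrative, not the paper's benchmarks); here D+ is computed by brute force for clarity, where Algorithm 3 would be used in practice.</p>

```python
from itertools import combinations, product

# With unnested, independent, nontrivial branches, Lemma 3 says the support
# of wt is wt(B_none) plus the set D+ of submultiset sums of the differences
# d_b = wt(B_b) - wt(B_none).
def run(x0, x1, x2):            # the toy program, used here for cross-checking
    w = 1                       # weight of the always-executed blocks
    if x0 and x1:
        w += 3                  # weight of branch a's "then" block
    if x2:
        w += 2                  # weight of branch b's "then" block
    return w

wt_none = run(0, 0, 0)                   # wt(B_none): every branch false
diffs = [run(1, 1, 0) - wt_none,         # d_a = wt(B_a) - wt(B_none)
         run(0, 0, 1) - wt_none]         # d_b = wt(B_b) - wt(B_none)

# D+ by brute force; Algorithm 3 computes it in O(|D| * |D+|) time instead.
d_plus = {sum(c) for r in range(len(diffs) + 1)
          for c in combinations(diffs, r)}
possible = {wt_none + s for s in d_plus}   # Lemma 3: the support of wt
```

<p>On this example the computed support matches exhaustive input enumeration, while never touching individual paths.</p>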
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>More General Program Structure</title>
<p>As presented above, our algorithms are restricted to loop-free programs which have only unnested, independent conditionals. However, our techniques are still helpful in analyzing a large class of more general programs. Loops with a bounded number of iterations can be unrolled. Unrolling the common program structure consisting of a for-loop with a conditional in the body yields a loop-free program with unnested conditionals. If the conditionals are pairwise independent, as in the modexp example, our methods can be directly applied. If the number of dependent conditionals, say D, is nonzero but relatively small, then each of the 2^D assignments to these conditionals can be checked for feasibility with an SMT query, and the remaining conditionals can be handled using our algorithms. If many conditionals are dependent then checking all possibilities requires an exponential amount of work, but we can efficiently handle a limited failure of independence. An example is the Mersenne Twister program we discuss in Sec. 4, where 2 out of 624 conditionals are dependent. A small level of conditional nesting can be handled in a similar way. In general, when analyzing a program with complex branching structure, our methods can be applied to those regions of the program which satisfy our requirements. Such regions do frequently occur in real-world programs, and thus our techniques are useful in practice.</p>
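<p>The treatment of a small set of dependent conditionals can be sketched as follows; the two conditions are invented for illustration, and the exhaustive loop merely simulates the SMT feasibility query the paper uses.</p>

```python
from itertools import product

# Enumerate all 2^D joint truth assignments of the D dependent conditionals
# and keep the feasible ones; the independent machinery handles the rest.
c1 = lambda bits: bool(bits[0] ^ bits[1])    # both conditions mention x0, x1,
c2 = lambda bits: bool(bits[0] and bits[1])  # so they are not independent

feasible = set()
for bits in product([0, 1], repeat=2):       # an SMT solver replaces this loop
    feasible.add((c1(list(bits)), c2(list(bits))))
# Only the feasible joint assignments need to be combined with the analysis
# of the remaining independent conditionals.
```

<p>Here the assignment making both conditions true is infeasible, so only three of the four cases need further analysis.</p>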
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments</title>
<p>As mentioned in Sec. 2, Problem 2 subsumes the computation of the channel capacity of the timing side-channel on a platform where basic blocks have constant runtimes. To demonstrate the effectiveness of our techniques, we use them to compute the timing channel capacities of two real-world programs on the PTARM simulator [4]. The tool GameTime [9] was used to generate SMT formulae representing the programs, and to interface with the simulator to perform the timing measurements of the basis paths. SMT formulae for testing cofactor equivalence were generated and solved using Z3 [2]. Model counting was done by using Z3 to convert SMT queries to propositional formulae, which were then given to the model counter Cachet [8]. Raw data from our experiments can be obtained at http://math.berkeley.edu/~dfremont/SMT2014Data/.</p>
<p>The first program tested was the modexp program already described above, using a 32-bit exponent. With 2^32 paths, enumerating and model counting all paths is clearly infeasible. Our new approach was quite fast: finding the branch supports, model counting (see footnote 4), and running Algorithm 3 took only a few seconds, yielding a timing channel capacity of just over 8 bits. In fact, although the number of paths is very large, the per-path cost of Algorithm 2 is so low that we were able to compute modexp's entire timing distribution with it in 23 hours (effectively analyzing more than 50,000 paths per second). The distribution is shown in Figure 2.</p>
<p>[Footnote 4: We note that for this program each branch condition had only a single support variable, and thus we have Tb = 1 automatically, without needing to do model counting.]</p>
<p>The second program we tested was the state update function of the widely used pseudorandom number generator Mersenne Twister [5]. We tested an implementation of the most common variant, MT19937, which is available at [7]. On every 624th query to the generator, MT19937 performs a nontrivial update of its internal state, an array of 624 32-bit integers. We analyzed the program to see how much information about this state is leaked by the time needed to do the update. The relevant portion of the code has 2^624 paths and thus would be completely impossible to analyze using path enumeration. With our techniques the analysis became feasible: finding the branch supports took 54 minutes, while Algorithm 3 took only 0.2 seconds because there was a high level of uniformity across the path timings. The channel capacity was computed to be around 9.3 bits. We note that among the 624 branch conditions there are two which are not independent. Thus all four truth assignments to these conditions needed to be checked for feasibility before applying our techniques to the remaining 622 conditionals.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
<p>We presented a formalization of certain quantitative program analysis problems that are defined over a weighted control-flow graph representation. These problems are concerned with understanding how a quantitative property of a program is distributed over the space of program paths, and with computing metrics over this distribution. These computations rely on the ability to solve a set of satisfiability (SAT/SMT) and model counting problems. Previous work along these lines has only been applicable to small programs with very few conditionals, since it typically depends on enumerating all execution paths, and the number of these can be exponential in the size of the program. We investigated how, in certain situations where the number of paths is indeed exponential, special branching structure can be exploited to gain efficiency. When the conditionals are unnested and independent, we showed how the number of expensive model counting calls can be reduced to be linear in the size of the program, leaving only a very fast product computation to be done for each path. Furthermore, a special case of the general problem, which for example is sufficient for the computation of side-channel capacities, can be solved avoiding exponential path enumeration entirely. Finally, we showed the practicality of our methods by using them to compute the timing side-channel capacities of two commonly-used programs with very large numbers of paths.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
<p>Daniel wishes to thank Jon Kotker, Rohit Sinha, and Zach Wasson for providing assistance with some of the tools used in our experiments, and Garvit Juniwal for a useful discussion of Algorithm 3 and Lemma 4. The authors also thank the anonymous reviewers for their helpful comments and suggestions. This work was supported in part by the TerraSwarm Research Center, one of six centers supported by the STARnet phase of the Focus Center Research Program (FCRP), a Semiconductor Research Corporation program sponsored by MARCO and DARPA.</p>
<p>[2] L. de Moura and N. Bjørner. Z3. http://z3.codeplex.com/.
[3] C. P. Gomes, A. Sabharwal, and B. Selman. Model counting. Handbook of Satisfiability, pages 633-654, 2009.
[4] E. A. Lee and S. A. Edwards. PTARM simulator. http://chess.eecs.berkeley.edu/pret/src/ptarm-1.0/ptarm_simulator.html.
[5] M. Matsumoto and T. Nishimura. Mersenne twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation (TOMACS), 8(1):3-30, 1998.
[6] T. J. McCabe. A complexity measure. IEEE Transactions on Software Engineering, (4):308-320, 1976.
[7] T. Nishimura and M. Matsumoto. Implementation of MT19937. http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/CODES/MTARCOK/mt19937ar-cok.c.
[8] T. Sang, F. Bacchus, P. Beame, H. Kautz, and T. Pitassi. Combining component caching and clause learning for effective model counting. In 7th International Conference on Theory and Applications of Satisfiability Testing, 2004.
[9] S. A. Seshia and J. Kotker. GameTime: A toolkit for timing analysis of software. In 17th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2011.
[10] S. A. Seshia and A. Rakhlin. Quantitative analysis of systems using game-theoretic learning. ACM Transactions on Embedded Computing Systems (TECS), 11(S2):55:1-55:27, 2012.
[11] G. Smith. On the foundations of quantitative information flow. In Foundations of Software Science and Computational Structures, pages 288-302. Springer, 2009.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
<mixed-citation>[1] M. Backes, B. Köpf, and A. Rybalchenko. Automatic discovery and quantification of information leaks. In 30th IEEE Symposium on Security and Privacy, pages 141-153. IEEE, 2009.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>