<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Structural Bit-vector Model Counting</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Seonmo Kim</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stephen McCamant</string-name>
          <email>mccamantg@cs.umn.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Minnesota</institution>
          ,
          <addr-line>Minneapolis, Minnesota</addr-line>
          ,
          <country country="US">U.S.A</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <fpage>26</fpage>
      <lpage>36</lpage>
      <abstract>
        <p>Various approximate model counting techniques have been proposed and are used in many applications such as probabilistic inference and quantitative information- ow security. The hashing-based technique is a well-known approach and can be more scalable than exact model counting techniques. However, its performance is highly dependent on the performance of a decision procedure (SAT or SMT solver) and adding numerous hashing constraints to a formula might cause a solver to perform poorly. We propose a model counting technique which computes lower and upper bounds of the number of solutions to an SMT formula by analyzing the structure of the formula, which means this approach does not rely on a decision procedure. Our algorithm runs in polynomial time and gives rm lower and upper bounds unlike other approximate model counters that compute probabilistic bounds. We compare our algorithm with state-of-theart model counters and our experiments show that our approach is faster than others and provides a trade-o between computational e ort and the precision of results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Model counting is a quantitative generalization of the satis ability problem that asks for the
number of satisfying assignments for a formula. Some of the many applications of model
counting include combinatorics, safety analysis, probabilistic inference, and quantitative
informationow analysis of software. Model counting is also closely connected to uniform random sampling
of formula solutions. However, exact model counting even just for quanti er-free Boolean
formulas (#SAT) is complete for the complexity class #P which is believed to be intractable.
Approximations to an exact model count are useful in many applications, so the di culty of
exact model counting motivates exploring approximation algorithms.</p>
      <p>The best-known approaches in recent research use hashing to reduce approximate model
counting to an adaptive sequence of satis ability queries covering subsets of the solution space.
These algorithms provide probabilistic bounds on the accuracy of their approximation, and they
can take advantage of advances in satis ability solving (both SAT and SMT). However these
approaches still have limited scalability in practice: they require a large number of satis ability
queries to achieve tight bounds, and hashing can make individual queries much more expensive,
given the complex interactions between hashing constraints and solver optimizations.</p>
      <p>Also valuable but less developed are approximation algorithms which achieve guaranteed
e ciency and sound bounds, at the expense of not providing an accuracy guarantee. This
trade-o can be achieved using algorithms similar to ones used in static program analysis that
derive lower and/or upper bounds following the syntactic structure of a formula rather than
using semantic decision procedure queries. We refer to these as \structural" approximate model
counting algorithms.</p>
      <p>
        Previous structural model counting algorithms have been specialized for narrow domains,
or have been built into larger systems in ways that are not easily reusable. In this work, we
build a new structural approximate model counting tool for quanti er-free SMT formulas over
the theory of bit vectors (SMT-LIB QF BV), one of the most common theories used to model
bounded arithmetic and software semantics. We also provide a commonly useful generalization
known as projected model counting, in which a user can specify a subset of the variables in a
formula over which a model should be counted. Our tool uses algorithms which build on the
partial description of a previous closed-source tool [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], but we needed to develop new structural
rules for cases that were missing or restricted in previous work. We extend the algorithm to
cover a more complete set of bit-vector operators, and to use both the signed and unsigned
orderings of bit vectors. However our current implementation, like Martin's [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], is limited
to conjunctions and not arbitrary Boolean combinations of bit-vector relations. This matches
the needs, for instance, of typical single-path symbolic execution, where a path condition is a
conjunction of branch conditions, each of which is an equality or inequality over numeric (e.g.
bit-vector) terms.
      </p>
      <p>We have built a standalone tool, SMC, that operates on input in the standard SMT-LIB2
format, and which we have open-source. We have checked the correctness of the propagation
rules for each type of bit-vector expression via bounded exhaustive testing. We empirically
compare SMC's performance with representative state-of-the-art exact and approximate model
counting tools. The results show that SMC's performance scales much better than these other
tools, and that the sound upper and lower model-count bounds that it provides are often usefully
accurate.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>We categorize previous model-counting tools based on exact approaches, hashing-based
approximation, and structural approximation.</p>
      <p>
        Exact model counting Exact model counting for propositional formulas (#SAT) typically
uses techniques related to SAT solving: intuitively, a #SAT algorithm must traverse the entire
search space even for a satis able formula, instead of stopping at the rst satisfying assignment
as a SAT solver does. The counting process is optimized by caching information about the
number of solutions for sub-formulas that appear repeatedly in the total. Relsat [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], Cachet [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ],
and sharpSAT [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] are some examples of tools based on varieties of caching. dSharp [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] builds
on sharpSAT and also computes a conversion of Boolean formula to a restricted normal form
known as d-DNNF (deterministic, decomposable negation normal form) which facilitates other
operations. dSharp p [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] further builds on dSharp to support projected model counting; it
is the exact model counting tool we compare with in our evaluation below. Two other notable
more recent systems are countAntom [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which uses a parallel algorithm, and Ganak [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ],
which uses probabilistic hashing to save space, but still produces an exact model count with a
guaranteed minimum probability.
      </p>
      <p>
        Hashing-based model counting The core principle that enables hashing-based
approximation was rst deployed for theoretical results, notably by Sipser [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], Stockmeyer [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], and
Valiant and Vazirani [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. It was rst applied to build model counting tools by Gomes et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ],
who gave an approach that probablistically checked a hypothesized lower or upper bound. Later
systems achieved complete automation by designing outer-loop algorithms to search over model
count hypotheses. The best known tool of this kind is ApproxMC [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and its successors
including ApproxMC2 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and BIRD/ApproxMC3 [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], and a projecting variant ApproxMC-p [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
SearchMC [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] reduces the number of queries needed by keeping a statistical estimation of the
model count, and also implements a mode that can apply directly wrap an SMT solver (tools
(declare-fun x () (_ BitVec 8))
(declare-fun y () (_ BitVec 8))
(assert (= y (bvadd (bvand x 15) 4)))
designed solely for #SAT require eager bit-blasting). We take SearchMC as the representative
of hashing-based approximate model counters in our evaluation.
      </p>
      <p>
        Structural model counting Structural approaches to approximate model counting that
can provide non-probabilistic bounds (but not guaranteed precision) have seen relatively less
tool development. The most direct predecessor of our work in this paper is the FSCB (Fast
Solution Count Bounder) algorithm of Martin [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], which as far as we are aware was applied
only as part of symbolic execution system to estimate the amount of information revealed by
bug reports [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Martin's implementation was not available, but we used the description in
his technical report as a starting point for the system we developed, as described in more
detail in Section 3. Other previous structural model-counting systems have been specialized
for other domains. Luu et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] build a model counter for string constraints such as arise in
symbolic execution of high-level languages like JavaScript; their approach is based on generating
functions. Aydin et al.'s MT-ABC [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] uses a combination of structural and automaton-based
algorithms for constraints that can include a mix of strings and linear arithmetic. Meng and
Smith's two-bit-pattern technique [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] is a hybrid of structural and exact counting approaches
to #SAT. It structurally over-approximates the model count by determining, for every pair
of bits in a formula, what values they can have in isolation (using many small satis ability
queries). Then the combination of these constraints is a 2CNF formula, which the authors
found could be model-counted e ciently in practice (though general 2CNF model counting is
still #P-hard).
3
      </p>
    </sec>
    <sec id="sec-3">
      <title>Structural Model Counter</title>
      <p>The Structural Model Counter (SMC1) takes as input an SMT-LIB2 formula which consists
of variables and assertions. It outputs lower and upper bounds of satisfying assignments that
make a given formula true. First, we describe how this structural model counting technique
works with a simple example in Fig 1. Since x and y are 8-bit variables, the model counts
of two variables are both 256 when they are declared. Then we parse an assertion to analyze
the structure of the formula. (bvand x 15) has the minimum 0 and the maximum 15. This
generates 16 distinct values. Adding 4 makes the minimum 4, the maximum 19 and still 16
distinct values. This is equal to y thus y has 16 distinct values. The model count of this formula
is 256 since x has 256 distinct values and x determines y. This shows a simple process of the
structure model counting. In this section, we describe this algorithm in detail, its correctness,
some current limitations and di erences from the previous work.</p>
      <p>1The source code and benchmarks are available at: https://github.com/seonmokim/smc
3.1</p>
      <sec id="sec-3-1">
        <title>Algorithm</title>
        <p>
          The main algorithm is mostly inspired by Martin's FSCB (Fast Solution Count Bounder)
algorithm [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. He proposed this idea to be a fast algorithm that can handle complex expressions
such as standard arithmetic or bit shifts. However, FSCB does not consider signed data types
and the source code is not available. We extend FSCB to handle both unsigned and signed data
types and cover more operators including signed division, signed less-than, signed greater-than
and so on. Also, we improve the idea to compute tighter bounds and verify its correctness using
small-sized unit tests exhaustively. In this section, we describe two steps of our algorithm. It
rst computes the bounds for each individual assertion, and then merges them for the model
count of a given formula.
3.1.1
        </p>
        <sec id="sec-3-1-1">
          <title>Per-Assertion Bounds and Analysis</title>
          <p>When a variable is declared or an expression is generated, we create a corresponding node to
represent this variable or expression. Each node contains elements as [ul, uh, sl, sh, lc, hc,
hom, vars]. Since we only consider bit vectors in SMT-LIB2 format, we compute both cases,
unsigned and signed representations, in a node. ul, uh, sl and sh are the unsigned
minimum (low), the unsigned maximum (high), the signed minimum and the signed maximum of
the node, respectively. lc and hc are the low cardinality (lower bound), the high cardinality
(upper bound) of the node, respectively. We select the bounds which have a shorter interval
between the unsigned and signed bounds since both the bounds are sound and tighter bounds
give a better precision. hom is a ag such that a node is homogeneous only if every image has the
same number of preimages. For example, (bvand x 15) is homogeneous since it generates 16
distinct values and each value (image) has the same number of preimages (16 cases if x is 8-bit).
This ag is useful when computing bounds more precisely with a constant value. For example,
let us assume that we have (= (bvand x 15) 0) and x is 8-bit. (bvand x 15) should be zero
to satisfy this assertion and one value of (bvand x 15) maps to 16 preimages of x. Therefore,
this can be computed easily as x has exactly 16 distinct values to satisfy this assertion. vars is
a set of variables which presents in an assertion. As we described above, we represent a signed
and unsigned values in a single node. One reason that we need both representations is that we
want to handle signed operators such as signed division, signed less-than, signed greater-than
and so on. Another reason is the ways of computing signed values and unsigned values are
di erent. Therefore, we maintain both representations in every operation and give a warning
whenever SMC sees a con ict case.</p>
          <p>SMC rst replaces each constant or variable from an expression with the corresponding node.
If it is a constant c, the node can be represented as [c, c, c, c, 1, 1, true, fg]. Note that
if c is a binary or hexadecimal number, we convert the number using the two's complement.
The next step is that SMC breaks down an expression tree and generates a node based on the
expression rules starting from the sub-expressions. SMC determines how to compute the node
values from the expression rules for each supported operation. We can extend SMC to support
additional operations by adding new operation rules. We only show a subset of the operation
rules here.</p>
          <p>Pseudocode for bvadd is shown in Fig 2. When the tool reaches a bvadd operation, it
rst checks whether the variables are both constant values. This checking applies to other
operations as well and we can easily compute all the values if they are both constant values.
Next we check that the answer might be an arithmetic positive or negative over ow. If this
occurs, we set ul and uh as the minimum and the maximum of the variable, respectively, based
on its bit-width unless the variables are both constants. This applies to sl and sh to be the
( bvadd f g )
if i sC on st an t ( f ) and isC on st an t ( g ):
ul = uh = ( f . ul + g . ul ) % 2^ f . width
sl = sh = ( f . sl + g . sl ) % 2^ f . width
return ul , uh , sl , sh , 1 , 1 , true , [ f . vars [ g . vars ]
if f . uh + g . uh &gt; umax :
ul = umin
uh = umax
else :
ul = f . ul + g . ul
uh = f . uh + g . uh
if f . sl + g . sl &lt; smin or f . sh + g . sh &gt; smax :
sl = smin
sh = smax
else :
sl = f . sl + g . sl
sh = f . sh + g . sh
lc = min ( f . lc + g . lc -1 , abs ( uh - ul ) , abs ( sh - sl ))
if isCommon (f , g ):</p>
          <p>lc = 1
hc = min ( f . hc * g . hc , abs ( uh - ul ) , abs ( sh - sl ))
hom = ( f . hom and is Co ns tan t ( g ))
or ( i sC on st an t ( f ) and g . hom )
or ( not isCommon (f , g ) and isPerm ( f ) and isPerm ( g ))
return ul , uh , sl , sh , lc , hc , hom , [ f . vars [ g . vars ]
negative minimum and positive maximum. lc and hc can be computed as the sum of two
variable's minimum cardinality minus 1. In order to generate the smallest set of adding two
sets, one value has to be generated in multiple ways. For example, adding f1, 2, 3g and f2,
3g makes f3, 4, 5, 6g which has 4 elements. If the function has variables in common (for
example, (x&amp;1)+(:x&amp;1)), the minimum cardinality could be 1. The maximum cardinality
can be computed as the multiplication of each variable's maximum cardinality. Note that
the cardinalities must be less than the distance between its high value and low value. The
addition is homogeneous if one variable is homogeneous and another is a constant or they do
not have variables in common and both are permutations. A variable is a permutation if it is
homogeneous and unconstrained.</p>
          <p>Let us go back to the example in Fig 1 and proceed with the SMC algorithm. When
x and y are declared, two corresponding nodes are generated. Initial values of x would
be [0, 255, -128, 127, 256, 256, true, f g</p>
          <p>x ] and initial values of y would be the same
except vars=fyg. If we have the equation (= y (bvadd (bvand x 15) 4)), we parse the
equation and generate a node from the expression. (bvand x 15) returns a node with
(= f g )
leq = max ( f . ul , g . ul )
heq = min ( f . uh , g . uh )
minlhit = f . lc - max ( f . uh - heq , leq - f . ul )
minlhit = min ( minlhit , heq - leq + 1)
minrhit = g . lc - max ( g . uh - heq , leq - g . ul )
minrhit = min ( minrhit , heq - leq + 1)
inter = max (1 , minlhit + minrhit - ( heq - leq + 1))
ic = 2^( f . width + g . width )
if f . hom :
ld_f = 2^ f . width / f . hc
hd_f = 2^ f . width / f . lc
else :
ld_f = 1
hd_f = 2^ f . width - f . lc + 1
if g . hom :
ld_g = 2^ g . width / g . hc
hd_g = 2^ g . width / g . lc
else :
ld_g = 1
hd_g = 2^ g . width - g . lc + 1
lb = max (1 , min ( inter * ld_f * ld_g , ic ))
hb = min ( inter * hd_f * hd_g , ic )
[0, 15, 0, 15, 16, 16, true, fxg] and then (bvadd (bvand x 15) 4) returns a node with
[4, 19, 4, 19, 16, 16, true, fxg]. When there is an equality or inequality check in an
assertion, SMC computes a lower bounds and upper bound of variables in the assertion. We
denote lb and hb which are the lower and upper bounds of a set of variables, respectively. Fig
3 shows the crucial part to compute the bounds.</p>
          <p>This equal rule computes the cardinalities of the left-hand side and the right-hand side
and then the cardinality of the intersection between the left-hand side and the right-hand.
Depending on the homogeneity of the variable, we compute the low and high density of the
variable so we can compute the lower bound and upper bound. We also apply the same
procedure to signed values and select the bounds that has a smaller interval. In this example,
SMC computes the lower bound as 256 and the upper bound as 256, which is an accurate
answer.
3.1.2</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>Combining Bounds</title>
          <p>The second step is to combine per-assertion bounds. For each assertion with comparison
operators such as equal, greater-than or less-than, we compute the lower and upper bounds on
the number of solutions to variables in the assertion. Recall that lb and hb are the bounds
of a set of variables. We use vars to denote the set of variables in an assertion. Given two
m e r g e B o u n d s ( Bounds a , Bounds b ):
vars = a . vars [ b . vars
in te r_ va rs = a . vars \ b . vars
width = in te r_ va rs . width
inter_max = min (2^ width , math . gcd ( a . ha , b . ha ))
inter_min = min (2^ width , math . gcd ( a . la , b . la ))
if a . vars and b . vars are in common :
if set ( a . vars ) == set ( b . vars ):
la = min ( a . la , b . la )
ha = min ( a . ha , b . ha )
elif a . vars b . vars :
la = b . la
ha = b . ha
elif b . vars a . vars :
la = a . la
ha = a . ha
else :
ula = a . ula * b . ula / inter_min
uha = a . uha * b . uha / inter_max
else :
la = a . la * b . la
ha = a . ha * b . ha
return Bounds ( la , ha , vars )
such bounds, we can merge per-assertion bounds into bounds that apply to both equations
together. We recursively merge the bounds pairwise until the bound represent all the variables
in the system. For example, if two bounds are independent, we can simply multiply each lower
bound and upper bound. We present pseudocode for merging bounds in Figure 4. We rst
check whether two bounds are independent or not. If they are independent, we can just simply
multiply two bounds. If they have variables in common, then we consider four cases: they are
identical, one is subset of another (or vice versa) and they are overlapping sets. We compute
the bounds conservatively based on the case. The bounds can be more precise based on the
order of merging such as merging similar variables rst. This needs an extra running time and
we left this for the future work.
3.2</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Di erences between SMC and FSCB</title>
        <p>Here we brie y describe the di erences between SMC and FSCB. First, we cover a more
complete set of bit-vector operators for both the signed and unsigned orderings of bit vectors. Table
1 shows which bit-vector operators are supported by FSCB and SMC.</p>
        <p>Also, we added more edge cases in operators and xed some operators to compute a better
precision. For example, bvadd in FSCB computes the low cardinality as max(f.lc, g.lc) but
we empirically nd out that f.lc+g.lc-1 gives a better result.
FSCB
SMC
bvadd, bvsub, bvmul, bvand, bvor, bvxor, bvshl, bvlshr, =, distinct, ult, ugt
bvadd, bvsub, bvmul, bvand, bvor, bvxor, bvshl, bvlshr, =, distinct, ult, ugt,
ule, uge, slt, sgt, sle, sge, bvudiv, bvsdiv, concat, extract, sign extend
We have tested the correctness of the operation rules of bit-vector expression using bounded
exhaustive checking. We set each variable to be a bit-vector of width 4 and executed each
operator with the power set of each variable. We checked whether ul, uh, sl, sh, lc and hc
from our operation rules are sound. We verify unary operators and binary operators. First, we
compute ul, uh, sl, sh, lc and hc based on our operation rules. We generate the power set of
one variable which has 216 sets and execute an operation which generates 232 sets for binary
operators. The same procedure is applied to unary operators. We verify that (1)the unsigned
smallest value of each set is greater than ul, (2)the unsigned largest value of each set is less
than uh, (3)the signed smallest value of each set is greater than sl, (4)the signed largest value
of each set is less than sh and (5)the number of elements of each set is less than hc and greater
than lc. However, we have not checked the correctness of merging bounds due to the state
explosion problem since we need to test the case where bounds contain multiple variables.
In this section, we discuss the limitations of our current implementation. We have extended
FSCB to support more operators in SMT-LIB2 format. However, some of SMT-LIB2 operators
are not supported in SMC and we are currently working on handling the whole SMT-LIB2
format standard. Also, some operators compute the cardinalities very conservatively due to the
limitation of the node representation, hence those operators lead to less precise results (loose
bounds). This is for the future work to have more coverage of SMT-LIB2 format standard and
design more precise rules.</p>
        <p>Lastly, the order of assertions and merging bounds a ects precision. For example, let say
we have two variable v1 and v2 and one constant c. If we have assertions (= v1 v2) and (=
v2 c), the results are computed di erently depending on which assertion is processed rst. For
example, (= v1 v2) computes the bounds [256; 256] and (= v2 c) computes the bounds [1; 1] if
all the variables are 8-bit. If we merge the two bounds, the nal bound would be [256; 256].
But if we ip the order which (= v2 c) computes the bounds rst, then (= v1 v2) computes
the bounds [1; 1] since we know that v2 is a constant value this time. This can be resolved
by recursively computing the bounds until the bounds do not change but we do not have any
proof about its time complexity. This means the bounds can be more precise by merging the
per-assertion bounds in a better order. We believe nding more e cient way to merge bounds
will improve the performance and this is for the future work.
4</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Results</title>
      <p>
        In this section, we show our experimental results and all our experiments were performed
on a machine with an Intel Core i7 3.40Ghz CPU and 16GB memory. We implement our
algorithm with Python and use our own SMT-LIB2 parser. Our algorithm supports SMT-LIB2
format [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. We compare our algorithm with state-of-the-art model counters: SearchMC [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and
dSharp p [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. SearchMC is an approximate model counter using XOR hashing constraints to
estimate a lower bound and upper bound of the model count. It is a randomized algorithm and
gives a desired level of distance between a lower bound and upper bound with a probability of
at least 0.6. In this experiment, we ran SearchMC 10 times for each benchmark until it gave
the rst bounds with a probability of at least 0.6 and computed the average of the results.
dSharp p is an exact model counter and is implemented on top of dSharp [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] to support
projection. We collected various SMT BV benchmarks from previous works [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ]. Here
we show partial results of some representative benchmarks. Table 2 shows a comparison on
performance and approximation. The second column shows the number of bits we want to
count over. Since dSharp p is an exact model counter, we take log2 of the answer which
shows in the third column and the running times (in seconds) of dSharp p are shown in
the fourth column. We also show the bounds (log base 2) computed from SearchMC and
SMC following with their running times. dSharp p performs well on a small-sized problems
and its performance decreases as the size and the complexity of formula increases. In these
experiments, SearchMC used the state-of-the-art SMT(BV) solver Z3 [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and its performance
was highly dependent on the performance of the solver.
      </p>
      <p>SMC shows a good precision on some benchmarks if the structure of the formula is
wellorganized. However, it gave very loose bounds on some benchmarks which SMC was not able to
analyze the formula well. For example, 5 10 1 and 5 20 1 are the volume computation problems
of convex bodies and consist of a number of inequality constraints. This type of problems show
very loose bounds since SMC computes the bounds very conservatively on inequality constraints.
The main bene t of SMC is the running times. This shows that our approach is faster than
others and it is a trade-o chosen between computational e ort and the precision of results in
some benchmarks.</p>
      <p>Hashing-based model counting techniques like SearchMC rely on prior hypotheses to produce
more useful results and start an initial hypothesis from zero knowledge. The initial hypothesis
for SearchMC is a uniform distribution over 0 to the maximum bit-width of the output
bitvector. If we gather results from SMC and use them as the initial hypothesis for SearchMC,
SearchMC is able to give a desired answer faster. For example, in order to solve coloring 4 the
initial hypothesis for SearchMC is a uniform distribution over 0 to 32. If the initial hypothesis
for SearchMC is a uniform distribution over 29.61 to 32 which is computed by SMC, SearchMC
needs a smaller number of queries to nd a desired result.</p>
      <p>Table 3 shows the performance of the combination of SMC and SearchMC. The results for
plain SearchMC are equivalent to the results in Table 2. We also measured the average number of
iterations (loops) in SearchMC and the results show that the number of iterations was decreased
when we used SMC and SearchMC together. Since SMC already computed tight bounds on
gettoPath1 and calDate 10, we did not run SearchMC on the benchmarks. This experimental
results show that using SMC as a preprocessor of SearchMC gives the performance bene t up
to 1.36x speedup.
5</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We propose a structural approximate model counting algorithm, SMC, to compute the lower
and upper bound of solutions to a given SMT formula. This is a fast polynomial algorithm
compared to other state-of-the-art approximate model counters and it runs in O(n + m) where
n is the number of variables and m is the number of assertions. We extend the FSCB algorithm
to cover a more complete set of SMT-LIB2 standard operators and to use both the signed and
unsigned representations of bit vectors. Our evaluation results illustrate that our technique is</p>
      <p>dSharp p
log2(M C)
30.75
21.96
7.92
6.39
16.95
12.02
8.44</p>
      <p>Benchmarks
coloring 4
FINDpath1
getopPath1
queue
calDate 10
5 10 1
5 20 1
Benchmarks
coloring 4
FINDpath1
getopPath1
queue
calDate 10
5 10 1
5 20 1
most bene cial when time performance is a tight requirement.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>We would like to thank the anonymous reviewers for suggestions which have helped us to
improve our system and the paper's presentation. The research described in this paper has
been supported in part by the National Science Foundation under grant 1526319 and by the
O ce of Naval Research under grant N00014-19-1-2541.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Abdulbaki</given-names>
            <surname>Aydin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>William</given-names>
            <surname>Eiers</surname>
          </string-name>
          , Lucas Bang, Tegan Brennan, Miroslav Gavrilov,
          <article-title>Tev k Bultan,</article-title>
          and
          <string-name>
            <given-names>Fang</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <article-title>Parameterized model counting for string and numeric constraints</article-title>
          .
          <source>In Proceedings of the 2018 ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/SIGSOFT FSE</source>
          <year>2018</year>
          ,
          <article-title>Lake Buena Vista</article-title>
          , FL, USA, November
          <volume>04</volume>
          -
          <issue>09</issue>
          ,
          <year>2018</year>
          , pages
          <fpage>400</fpage>
          {
          <fpage>410</fpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Rehan</given-names>
            <surname>Abdul</surname>
          </string-name>
          <string-name>
            <surname>Aziz</surname>
          </string-name>
          , Geo rey Chu, Christian Muise, and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Stuckey</surname>
          </string-name>
          . #9SAT:
          <article-title>Projected model counting</article-title>
          .
          <source>In Theory and Applications of Satis ability Testing</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Clark</given-names>
            <surname>Barrett</surname>
          </string-name>
          , Aaron Stump, and
          <string-name>
            <given-names>Cesare</given-names>
            <surname>Tinelli</surname>
          </string-name>
          .
          <source>The SMT-LIB standard: Version 2.0. Technical report</source>
          , Department of Computer Science, The University of Iowa,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Roberto</surname>
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Bayardo</surname>
          </string-name>
          , Jr. and
          <string-name>
            <surname>Robert</surname>
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Schrag</surname>
          </string-name>
          .
          <article-title>Using CSP look-back techniques to solve real-world SAT instances</article-title>
          .
          <source>In Proceedings of AAAI</source>
          , pages
          <volume>203</volume>
          {
          <fpage>208</fpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Jan</given-names>
            <surname>Burchard</surname>
          </string-name>
          , Tobias Schubert, and
          <string-name>
            <given-names>Bernd</given-names>
            <surname>Becker</surname>
          </string-name>
          .
          <article-title>Laissez-Faire caching for parallel #SAT solving</article-title>
          .
          <source>In Proceedings of SAT</source>
          , pages
          <volume>46</volume>
          {
          <fpage>61</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Miguel</given-names>
            <surname>Castro</surname>
          </string-name>
          , Manuel Costa, and
          <string-name>
            <surname>Jean-Philippe Martin</surname>
          </string-name>
          .
          <article-title>Better bug reporting with better privacy</article-title>
          .
          <source>In Proceedings of ASPLOS</source>
          , pages
          <volume>319</volume>
          {
          <fpage>328</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Supratik</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          , Kuldeep S. Meel, and
          <string-name>
            <surname>Moshe</surname>
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>A scalable approximate model counter</article-title>
          .
          <source>In Proceedings of CP</source>
          , volume
          <volume>8124</volume>
          , pages
          <fpage>200</fpage>
          {
          <fpage>216</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Supratik</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          , Kuldeep S. Meel, and
          <string-name>
            <surname>Moshe</surname>
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>Improving approximate counting for probabilistic inference: From linear to logarithmic SAT solver calls</article-title>
          .
          <source>In Proceedings of IJCAI</source>
          , pages
          <volume>3569</volume>
          {
          <fpage>3576</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Leonardo</given-names>
            <surname>De Moura</surname>
          </string-name>
          and
          <article-title>Nikolaj Bj rner. Z3: An e cient smt solver</article-title>
          .
          <source>In Proceedings of TACAS</source>
          , pages
          <volume>337</volume>
          {
          <fpage>340</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Wei</surname>
            <given-names>Gao</given-names>
          </string-name>
          , Hengyi Lv, Qiang Zhang, and
          <string-name>
            <given-names>Dunbo</given-names>
            <surname>Cai</surname>
          </string-name>
          .
          <article-title>Estimating the volume of the solution space of SMT(LIA) constraints by a at histogram method</article-title>
          .
          <source>Algorithms</source>
          ,
          <volume>11</volume>
          :
          <fpage>142</fpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Cunjing</surname>
            <given-names>Ge</given-names>
          </string-name>
          , Feifei Ma, Tian Liu, Jian Zhang, and
          <string-name>
            <given-names>Xutong</given-names>
            <surname>Ma</surname>
          </string-name>
          .
          <article-title>A new probabilistic algorithm for approximate model counting</article-title>
          . In Didier Galmiche, Stephan Schulz, and Roberto Sebastiani, editors,
          <source>Automated Reasoning</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Carla</surname>
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Gomes</surname>
            , Ashish Sabharwal, and
            <given-names>Bart</given-names>
          </string-name>
          <string-name>
            <surname>Selman</surname>
          </string-name>
          .
          <article-title>Model counting: A new strategy for obtaining good bounds</article-title>
          .
          <source>In Proceedings, The Twenty-First National Conference on Arti cial Intelligence and the Eighteenth Innovative Applications of Arti cial Intelligence Conference, July 16-20</source>
          ,
          <year>2006</year>
          , Boston, Massachusetts, USA, pages
          <volume>54</volume>
          {
          <fpage>61</fpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Seonmo</given-names>
            <surname>Kim and Stephen McCamant</surname>
          </string-name>
          .
          <article-title>Bit-vector model counting using statistical estimation</article-title>
          .
          <source>In Tools and Algorithms for the Construction and Analysis of Systems</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Vladimir</surname>
            <given-names>Klebanov</given-names>
          </string-name>
          , Alexander Weigl, and Jorg Weisbarth.
          <article-title>Sound Probabilistic #SAT with Projection</article-title>
          . In Workshop on QAPL,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Loi</surname>
            <given-names>Luu</given-names>
          </string-name>
          , Shweta Shinde,
          <string-name>
            <given-names>Prateek</given-names>
            <surname>Saxena</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Brian</given-names>
            <surname>Demsky</surname>
          </string-name>
          .
          <article-title>A model counter for constraints over unbounded strings</article-title>
          .
          <source>In Proceedings of PLDI</source>
          , pages
          <volume>565</volume>
          {
          <fpage>576</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Jean-Philippe Martin</surname>
          </string-name>
          .
          <article-title>Upper and lower bounds on the number of solutions</article-title>
          .
          <source>Technical Report MSR-TR-2007-164</source>
          , Microsoft Research,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Ziyuan</given-names>
            <surname>Meng</surname>
          </string-name>
          and
          <article-title>Geo rey Smith. Calculating bounds on information leakage using two-bit patterns</article-title>
          .
          <source>In Proceedings of PLAS</source>
          , pages
          <volume>1</volume>
          :
          <issue>1</issue>
          {1:
          <fpage>12</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Christian</surname>
            <given-names>Muise</given-names>
          </string-name>
          ,
          <article-title>Sheila A</article-title>
          .
          <string-name>
            <surname>McIlraith</surname>
            ,
            <given-names>J. Christopher</given-names>
          </string-name>
          <string-name>
            <surname>Beck</surname>
            , and
            <given-names>Eric</given-names>
          </string-name>
          <string-name>
            <surname>Hsu</surname>
          </string-name>
          . DSHARP:
          <article-title>Fast d-DNNF Compilation with sharpSAT</article-title>
          .
          <source>In Canadian Conf. on Arti cial Intelligence</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Tian</surname>
            <given-names>Sang</given-names>
          </string-name>
          , Fahiem Bacchus, Paul Beame, Henry A.
          <string-name>
            <surname>Kautz</surname>
            , and
            <given-names>Toniann</given-names>
          </string-name>
          <string-name>
            <surname>Pitassi</surname>
          </string-name>
          .
          <article-title>Combining component caching and clause learning for e ective model counting</article-title>
          .
          <source>In Proceedings of SAT</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Shubham</surname>
            <given-names>Sharma</given-names>
          </string-name>
          , Subhajit Roy, Mate Soos, and
          <string-name>
            <surname>Kuldeep</surname>
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Meel. GANAK</surname>
          </string-name>
          :
          <article-title>A scalable probabilistic exact model counter</article-title>
          . In Sarit Kraus, editor,
          <source>Proceedings of the Twenty-Eighth International Joint Conference on Arti cial Intelligence</source>
          ,
          <source>IJCAI</source>
          <year>2019</year>
          , Macao, China,
          <source>August 10-16</source>
          ,
          <year>2019</year>
          , pages
          <fpage>1169</fpage>
          <lpage>{</lpage>
          1176. ijcai.org,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Sipser</surname>
          </string-name>
          .
          <article-title>A complexity theoretic approach to randomness</article-title>
          .
          <source>In Proceedings of the Fifteenth Annual ACM Symposium on Theory of Computing</source>
          , page
          <volume>330</volume>
          {
          <fpage>335</fpage>
          ,
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Mate</given-names>
            <surname>Soos</surname>
          </string-name>
          and
          <string-name>
            <given-names>Kuldeep</given-names>
            <surname>Meel</surname>
          </string-name>
          . Bird:
          <article-title>Engineering an e cient cnf-xor sat solver and its applications to approximate model counting</article-title>
          .
          <source>Proceedings of the AAAI Conference on Arti cial Intelligence</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Larry</given-names>
            <surname>Stockmeyer</surname>
          </string-name>
          .
          <article-title>The complexity of approximate counting</article-title>
          .
          <source>In Proceedings of the Fifteenth Annual ACM Symposium on Theory of Computing</source>
          , page
          <volume>118</volume>
          {
          <fpage>126</fpage>
          ,
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Marc</given-names>
            <surname>Thurley</surname>
          </string-name>
          . sharpSAT
          <article-title>- counting models with advanced component caching and implicit BCP</article-title>
          .
          <source>In Proceedings of SAT</source>
          , pages
          <volume>424</volume>
          {
          <fpage>429</fpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Leslie</surname>
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Valiant and Vijay</surname>
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Vazirani</surname>
          </string-name>
          .
          <article-title>NP is as easy as detecting unique solutions</article-title>
          .
          <source>Theor. Comput. Sci.</source>
          ,
          <volume>47</volume>
          (
          <issue>3</issue>
          ):
          <volume>85</volume>
          {
          <fpage>93</fpage>
          ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>