=Paper= {{Paper |id=None |storemode=property |title=A Study of Sweeping Algorithms in the Context of Model Checking |pdfUrl=https://ceur-ws.org/Vol-832/paper_7.pdf |volume=Vol-832 |dblpUrl=https://dblp.org/rec/conf/fmcad/HassanZS11 }} ==A Study of Sweeping Algorithms in the Context of Model Checking == https://ceur-ws.org/Vol-832/paper_7.pdf
A Study of Sweeping Algorithms in the Context of
               Model Checking
                                 Zyad Hassan, Yan Zhang, and Fabio Somenzi
                               Dept. of Electrical, Computer, and Energy Engineering
                                         University of Colorado at Boulder
                                                 Boulder, CO 80309



   Abstract—Combinational simplification techniques have      niques benefit interpolation considerably. The recently
proved their usefulness in both industrial and academic       introduced ic3 [1] incrementally discovers invariants
model checkers. Several combinational simplification al-      that hold relative to stepwise reachability information.
gorithms have been proposed in the past that vary in          Designing a model checking flow that involves ic3
efficiency and effectiveness. In this paper, we report our
                                                              requires understanding how combinational simplification
experience with three algorithms that fall in the com-
binational equivalence checking (sweeping) category. We
                                                              algorithms affect it.
propose an improvement to one of these algorithms. We            This paper makes the following contributions:
have conducted an empirical study to identify the strengths      • We carry out a comparative study of the different
and weaknesses of each of the algorithms and how they              sweeping methods.
can be synergistically combined, as well as to understand        • We propose a BDD-based cut sweeping method that
how they interact with ic3 [1].
                                                                   is more effective than the original cut sweeping.
                                                                 • We propose a combined sweeping approach in
                   I. I NTRODUCTION
                                                                   which more than one sweeping method is applied.
   Combinational simplification eliminates redundancies            We show that the combined approach can achieve
and increases sharing of logic in a design. It has been            more simplification than any of the methods can
successfully employed in logic synthesis, equivalence              achieve individually.
checking, and model checking.                                    • We perform an empirical study of the effect of
   In the model checking context, combinational sim-               sweeping on ic3.
plification often dramatically improves the performance
                                                                 The rest of the paper is organized as follows. Section
of the proof engines. This has made it into a primary
                                                              II contains preliminaries. In Section III, we introduce
component in both industrial [2] and academic [3],
                                                              the BDD-based cut sweeping method. In Section IV,
[4] model checkers. Several combinational simplification
                                                              we explain the rationale behind the combined sweeping
algorithms have been proposed in the past, such as
                                                              approach. In Section V we present the experimental
DAG-aware circuit compression [5], [6] and sweeping
                                                              results and in Section VI we conclude.
methods [7]–[9]. Sweeping methods merge functionally
equivalent nodes. They include BDD sweeping [7], SAT                             II. P RELIMINARIES
sweeping [8], [10], and cut sweeping [9].
   When designing a model checker, the strengths and          A. AND-inverter-graph
weaknesses of each of the sweeping methods should                The input and output of our sweeping algorithms
be taken into account. To the best of our knowledge,          are AND-inverter-graphs (AIGs). An AIG is a directed
no studies have been carried out so far to evaluate           acyclic graph that has four types of nodes: source nodes,
and compare the different sweeping methods, with the          sink nodes, internal nodes and the constant TRUE node.
exception of a limited study reported in [9].                 A primary input (PI) is a source node in an AIG. A
   The effect of combinational simplification on several      primary output (PO) is a sink node and has exactly one
model checking engines has been studied in the past.          predecessor. The internal nodes represent 2-input AND
In [8], SAT sweeping has been shown to positively             gates. A node v is a fanin of v0 if there is an edge (v, v0 );
affect bounded model checking (BMC) [11]. In [12], it         it is a fanout of v0 if there is an edge (v0 , v). Left(v)
is shown that combinational redundancy removal tech-          and Right(v) refer to the left and right predecessors of v .
Fanin(v) and Fanout (v) denote the fanins and fanouts          point is that the original BDD is kept for equivalence
of node v . An edge in an AIG may have the INVERTED            checking, but is not used to produce new BDDs. The
attribute to model an inverter. The Boolean function of        algorithm is complete if the threshold is so large that
a PI is a unique Boolean variable. For an edge, it is          no auxiliary variable is introduced. In practice, however,
the function of its source node if the edge does not have      this can be prohibitive.
the INVERTED attribute, and the complement otherwise.
For an internal node, it is the conjunction of the functions   C. SAT Sweeping
of its incoming edges. The Boolean function of a PO is            The advancements in SAT solver technology over the
that of its incoming edge.                                     past two decades has proliferated SAT-based reasoning
   A path from node a to b is a sequence of nodes              methods. SAT sweeping is one such method proposed
hv0 , v1 , v2 , . . . , vn i, such that v0 = a, vn = b and     by Kuehlmann [8] for combinational redundancy identi-
vi ∈ Fanin(vi+1 ), 0 ≤ i < n. The height of a node v ,         fication. SAT sweeping queries the SAT solver to prove
h(v), is the length of the longest path from a PI to v . The   or refute the equivalence of two nodes in the circuit.
fanin (fanout) cone of node v is the set of nodes that have    The basic SAT sweeping algorithm works as follows.
a path to (from) v . Two nodes are functionally equivalent     First, the circuit is simulated with random inputs, and
(complementary) if they represent the same (comple-            candidate equivalence classes are formed, where nodes
mentary) Boolean function(s). Functionally equivalent          having the same simulation values are placed together.
(complementary) nodes can be merged transferring the           Next, for each two nodes belonging to the same class,
fanouts of the redundant nodes to their representative.        the SAT solver is queried for their equivalence. If the
To simplify our presentation, in what follows we delib-        SAT solver reports they are equivalent, one of them is
erately ignore detection of complementary functions.           merged into the other. Otherwise, the counterexample
                                                               provided is simulated to break this equivalence class,
B. BDD Sweeping                                                and possibly rule out other candidate equivalences as
   BDD sweeping identifies two nodes as equivalent if          well. This process is repeated until a resource limit is
they have the same BDD representation. The original            reached, or until all classes become singletons, indicating
algorithm of Kuehlmann and Krohm [7] works in two              the absence of further equivalences.
stages: equivalence checking and false negative detec-            In our implementation of SAT sweeping, several
tion. In the first stage, it builds BDDs for each node         heuristics were applied. We mention each of them briefly.
and merges nodes having the same BDD. The algorithm               1) Simulating Distance-1 Vectors: This heuristic was
introduces an auxiliary BDD variable (cut point) for each      proposed in [13]. Instead of just simulating the coun-
node that has been merged, which potentially leads to          terexample to equivalence derived by the SAT solver,
false negatives. In the second stage, it takes the BDDs        all distance-1 vectors, that have a single bit flipped, are
of the output functions and for each of them, replaces         simulated as well. Only the bits corresponding to the
the auxiliary variables with their driving functions. The      inputs that are in the fanin cone of the two nodes being
algorithm is complete in the sense that it can find all        checked for equivalence are flipped. We have found this
the equivalences in the circuit given a sufficiently large     heuristic to be very effective in practice. In [13], this
limit on the BDD sizes. However, a large limit often hurts     process is repeated for vectors that were successful in
efficiency. In this paper, we intend to use BDD sweeping       breaking up equivalence classes until convergence. In our
in conjunction with SAT sweeping which is complete             implementation, we only simulate the distance-1 vectors
and avoids inherent BDD inefficiencies [8]. For that, we       for the original counterexample: for the benchmark suite
employ a version of BDD sweeping that is incomplete            we experimented with, recurring on successful vectors is
yet faster than the original.                                  too expensive for the number of refinements it achieves.
   The algorithm we adopt iterates the following steps            2) Clustering: Simulating distance-1 vectors often
on each node v of the AIG in some topological order.           results in considerable refinement of the equivalence
It builds a BDD for v and checks if there exists another       classes. This is desirable, since an equivalence class
node that has the same BDD. If so, it merges these two         is often broken up more cheaply by simulation than
nodes and continues. Otherwise, if the BDD size exceeds        by the SAT solver. Moreover, we have observed that
a given threshold, the algorithm introduces an auxiliary       with distance-1 vector simulation, it becomes very likely
BDD variables for v to be used in place of the original        that nodes remaining in an equivalence class are indeed
BDD when dealing with the fanouts of v . An important          equivalent. Therefore, rather than checking the equiva-
lence of two nodes at a time, we check the equivalence         in Φ(v) is associated with a truth table. Next, it searches
of all nodes in an equivalence class using a single SAT        for a node equivalent to v by looking for a cut equivalent
query. If they are all indeed equivalent, we find that using   to some cut in Φ(v). If it succeeds, the two nodes are
a single SAT query rather than n − 1 queries where n is        merged. Otherwise, a heuristic is applied to prune Φ(v)
the number of nodes in the class.                              to at most N cuts. After pruning, the algorithm stores
   3) Height-Based Merging: When two nodes are                 Φ(v) as the cut-set of v and builds a cross-reference
proved equivalent, we merge the node with a larger             between each of its cuts and v .
height into the one with a smaller height, instead of             The heuristic for pruning, which we call the quality
merging based on a topological order as in [13]. The           heuristic, computes a value q for each cut:
intuition being that a node having a larger height often                               X         1
                                                                             q(C) =                       .            (3)
has a larger fanin cone, which suggests that merging it                                    | Fanout (n)|
                                                                                      n∈C
would lead to a larger reduction. Nodes coming later in a
                                                               The cuts with the smallest values of q are kept. The
topological order do not necessarily have a larger height
                                                               intuition of the quality heuristic is two-fold. First, it tries
than nodes coming earlier.
                                                               to localize the equivalence and thus favors smaller cuts.
D. Cut Sweeping                                                Second, it normalizes cuts by attempting to express them
                                                               with the same set of variables. The chosen variables are
   Cut sweeping [9] is a fast yet incomplete approach
                                                               those that have a large fanouts, i.e., that are shared by
for combinational equivalence checking. It iteratively
                                                               many other nodes.
computes cuts for each AIG node and compares the
                                                                  A good truth-table implementation is critical to the
functions associated to the cuts.
                                                               performance of cut sweeping. In [9], truth tables are
   A cut is defined with respect to an AIG node, called
                                                               implemented as bit vectors. An advantage of bit vectors
root. A cut C(v) of root v is a set of nodes, called leaves,
                                                               is the constant-time cost of Boolean operations. On the
such that any path from a PI to v intersects C(v). A cut-
                                                               other hand, bit interleaving is required to extend the bit
set Φ(v) consists of several cuts of v . For cut-sets Φ(v1 )
                                                               vectors to the same length so that the corresponding bits
and Φ(v2 ), the merge operator ⊗ is defined as
                                                               represent the same minterm1 .
 Φ(v1 ) ⊗ Φ(v2 ) = {C1 ∪ C2 | C1 ∈ Φ(v1 ), C2 ∈ Φ(v2 )} .              III. BDD- BASED C UT S WEEPING
                                                        (1)
Assume k ≥ 1. A k-feasible cut is a cut that contains         Representing functions having a small number of
at most k leaves. A k-feasible cut-set is a cut-set that   inputs  using bit vectors is very efficient. However, the
contains only k-feasible cuts. The k-merge operator, ⊗k ,  number    of bits required grows exponentially with the
creates only k-feasible cuts. Cut enumeration recursively number of variables, which can easily lead to memory
computes all k-feasible cuts for an AIG. It computes the blow-up. As an alternative, BDDs, which are more suit-
k-feasible cut-set for a node v as follows:                able for large functions, can also be used to represent cut
        (                                                  functions. Furthermore, the strong canonicity of BDDs
          {{v}}                                 v ∈ PI     makes it trivial to check for equivalence. The use of
Φ(v) =
          {{v}} ∪ Φ (Left (v)) ⊗k Φ (Right (v)) v ∈ AND , BDDs also enables a heuristic which we describe below.
                                                       (2)    The proposed algorithm differs from the original one
where PI and AND refer to the set of PIs and 2-input in two aspects. First, we introduce a new parameter s, the
AND gates respectively. Note that cuts are not built for maximum size of a BDD, to replace k. That is, instead of
POs because they are never merged.                         k-feasible cuts, we keep cuts whose functions contain at
   The function of a cut is the Boolean function of the most s BDD nodes. Node count, as opposed to number
root in terms of the leaves. It can be represented in of inputs, is a more natural characterization of BDD size.
different ways, for instance, using bit vectors or BDDs.      The second difference comes from the pruning heuris-
Two cuts are equivalent if their cut functions are equal. tic. We define the height h of a cut C as the average
Hence, two nodes are functionally equivalent if their cut- height of its nodes:
sets contain equivalent cuts.                                                             X h(v)
   Cut sweeping is parametrized by k and N , the maxi-                         h(C) =                  .               (4)
                                                                                         v∈C
                                                                                                |C|
mum cut size and the maximum cut-set size, respectively.
For each node v in some topological order of the AIG,        1
                                                               A good reference of bit-interleaving can be found at http://
the algorithm builds a k-feasible cut-set Φ(v). Each cut graphics.stanford.edu/∼seander/bithacks.html.
A smaller h indicates that the leaves in the cut are closer     cut sweeping only keeps those below the threshold. The
to the PIs. The height heuristic keeps at most N cuts           quality heuristic tries to attract cuts on certain nodes,
choosing the ones with smallest values of h.                    which is similar to the placement of auxiliary variables in
                                                                BDD sweeping. Nevertheless, the number of “attractors”
     a
                p
                                     a
                                            r
                                                                in the quality heuristic tends to be much larger than in
     b                               c
                                                      g
                                                                BDD sweeping.
                            f
     c                               b
                q                           s
     d                               d                                  IV. C OMBINING S WEEPING M ETHODS
      Fig. 1.       Two implementations of a 4-input AND gate      The idea of combining several simplification algo-
                                                                rithms is not new. Many existing model checkers iterate
    A motivating example for the new heuristic is in            several simplification algorithms before the problem is
Figure 1, which shows two different 4-input AND gates.          passed to the model checking engines. However, we are
Nodes a, b, c, and d are PIs. Nodes p, q , r , and s are        unaware of any studies that have been carried out to
internal nodes. Nodes f and g can only be merged if             identify the best way they could be combined. In this
their cut-sets both contain {a, b, c, d}. However, if the       section we attempt to give general guidelines to which
internal nodes have many more fanouts than the PIs, the         a combined approach should adhere. We support our
quality heuristic may select cuts containing the internal       claims by empirical evidence collected in the experi-
nodes instead, causing the merge to be missed.                  ments reported in Section V.
    As mentioned before, the quality heuristic tries to            The problem we address is as follows: given a time
normalize the cuts on certain “attractors.” This reduces        budget for combinational simplification, how should it
the possibility that equivalent functions are represented       be allotted to the different algorithms? The sweeping
differently. However, this might also lead to the loss          algorithms discussed in previous sections vary in their
of the opportunity to find equivalences that cannot be          completeness and speed, with cut sweeping being the
expressed by those “attractors,” as in Figure 1.                most incomplete method yet the fastest of the three meth-
    On the other hand, the height heuristic tries to push the   ods, SAT sweeping being a complete, yet the slowest,
cut boundary as far as possible. A supporting argument          and BDD sweeping lying in between, both in terms of
is that, if a node is employed in equivalent cuts, then         completeness and speed.
replacing it with its predecessors preserve equivalence.           Possible solutions include allocating the whole time
Furthermore, new merges that are otherwise undiscover-          budget to a single algorithm, or dividing it among two or
able (consider other equivalences that require a and b in       more algorithms. The fact that some methods are better
the above example) may be found. The height heuristic           in approaching certain problems than others, suggests
does not attract cuts to certain nodes, which may result        that more than one method should be applied. If two or
in different cuts for equivalent nodes. As shown in the         more methods are to be applied in sequence, the intuition
experiments, the effectiveness of the height heuristic          suggests that the lower effort methods should precede
reduces as the height of nodes increases.                       the higher effort ones. The advantages of doing so are
    The two heuristics have their own strengths and weak-       two-fold. First, although the higher-effort methods are
nesses. A natural question is whether it is possible to         likely to discover the merges that a lower-effort method
combine them to benefit from their individual strengths.        discovers, in general, it will take them more time to do
We can choose a few cuts with each heuristic. This may          so. Second, preceding higher-effort methods by lower-
lead to more merges but may also worsen the efficiency          effort methods is beneficial in having them present a
if it significantly increases the number of cuts. To prevent    smaller problem that is easier to handle.
such an increase, a combined heuristic only records                Finally, the percentages of total time that should be
height cuts for the lower nodes, while it keeps both types      allotted to each of the methods to yield the maximum
of cuts for the others.                                         possible reduction is studied in Section V.
    There is some connection between cut sweeping with
each of the two heuristics and BDD sweeping. With                                     V. R ESULTS
the height heuristic, cut sweeping tries to build cuts as
large as possible, as BDD sweeping does. However, BDD             The experiments are divided into three parts. The first
sweeping can store cuts that exceed the threshold while         part compares different variations of the cut sweeping
algorithm. The second part shows the results of the com-                proved significantly with s. This means that with s, each
bined sweeping methods, and the third part is concerned                 cut has a larger chance of resulting in a merge.
with the effect of sweeping on ic32 .                                      Second, while “k-quality” and “k-combined-1” have
   We use the benchmark set from HWMCC’10 [14], a                       very close sweeping times, the latter achieves 19.8%
set of 818 benchmarks in total. The experiments are run                 more merges. Furthermore, the decrease in the “Height”
on a machine with a 2.8GHz Quad-Core CPU and 9GB                        column reveals that the height cuts indeed lead to
of memory. We use CUDD [15] as our BDD package                          merges. Although “s-quality” is more effective than the
for all the BDD-related algorithms.                                     two above methods, it is less efficient due to the larger
                                                                        cut sizes.
A. BDD-based Cut Sweeping                                                  For the methods with s (excluding “s-quality”), we
   Variations of cut sweeping are applied to the                        observe that “s-height N = 1” is the fastest and produces
HWMCC’10 benchmark set. The differences between                         a good number of merges. Increasing the number of
variations are two-fold. First, either the number of vari-              height cuts to two triples the run time without gaining
ables, k, or the number of BDD nodes, s, is used to                     many more merges. Comparing it with “s-combined-1”,
drop over-sized cuts. Second, we experiment with several                an improvement on the merges is shown by the latter.
heuristics for pruning cut-sets: the quality heuristic,                 This indicates that maintaining one height cut and one
the height heuristic, and two combined heuristics. The                  quality cut works better than two height cuts. For “s-
naive combined heuristic (“combined-1”) chooses one                     combined-2”, the number of merges is between the two
cut based on the height heuristic and the others based                  above methods, but with lower run time. Furthermore,
on the quality heuristic. The other heuristic (“combined-               the numbers of generated cuts and kept cuts are even
2”) sets a threshold on the node height (350 in our                     comparable to “s-height N = 1”. That is, even though
experiments). For nodes that are below the threshold,                   we keep three cuts for those nodes with height larger
it only keeps a height cut. For higher nodes, it produces               than 350, on average we compute only a few percent
cut-sets consisting of a height cut and two quality cuts.               more cuts than we do in the case of one cut per node.
We denote a method by a k or an s followed by the the                      The “Height” values of the three methods confirm the
heuristic name. All the variations use BDDs to represent                assumption made in Section III: most merges produced
the cut functions.                                                      by the height heuristic come from cuts close to the
   The results are shown in Table I; they are aggregated                PIs. When the two heuristics are combined, a significant
over the 818 benchmarks. Based on experiments, both                     increase on the “Height” value is observed. In Figure
the threshold of BDD sweeping and s in BDD-based                        2, we show the number of merges found by “s-height
cut sweeping are set to 250. The total number of AIG                    N = 1” and “s-combined-2” on nodes within different
nodes before sweeping is 7.22M. “Final” is the size of                  height ranges. The plot is normalized to “k-quality” and
AIGs after sweeping. “Generated” and “Kept” are the                     has bin size of 50, i.e., a point at (2, 1) indicates that the
number of cuts generated and kept by the corresponding                  method finds the same number of merges as “k-quality”
methods. For an individual benchmark, its “height” is the               for nodes with height from 100 to 149. Obviously, the
average height of all merged cuts. The “Height” column                  height heuristic works better on smaller height nodes,
is computed by taking the average of the “height” of                    while the quality heuristic catches more on larger height
all the benchmarks. A smaller value indicates that more                 ones. The combined heuristic takes the advantage of
merges are found by cuts that are close to the PIs. Note                both and produces an even better profile on nodes with
that since we use BDDs, the results in terms of efficiency              larger heights. Note that although the height heuristic
of bit-vectors based methods may not be as good as in                   works worse on the nodes with larger heights, it can
[9]. Therefore, when dealing with them, we just compare                 still get more merges. This may be due to the fact that
the effectiveness.                                                      in this benchmark set, a large percent of equivalences
   Results indicate that the resulting AIGs are con-                    are located at lower heights.
sistently smaller with s than with k. There are                            In our setup, cut sweeping is intended for usage as a
a few interesting observations. First, the ratios                       fast method. Thus we consider “s-height N = 1” and
GeneratedCuts /Merge and KeptCuts/Merge are im-                         “s-combined-2” to be the best variants. Compared to
                                                                        BDD sweeping, those two variants are faster because
  2
    Detailed results for first and second parts can be found at http:   they create fewer BDD nodes than BDD sweeping, but
//eces.colorado.edu/∼hassanz/sweeping-study                             are less effective since BDD sweeping may keep BDDs
                                                     TABLE I
   R ESULTS OF CUT SWEEPING , BDD SWEEPING AND SAT SWEEPING ON THE HWMCC’10 SET. B Y DEFAULT, k = 8 AND s = 250.

                                                       Method              Final (×106 )             Merges (×105 )   Time (s)   Generated (×107 )   Kept (×107 )   Height
                                                  k-quality N = 5              6.82                      2.62          123.26          5.83             1.92        10.20
                                               k-combined-1 N = 5              6.75                      3.14          129.64          5.84             1.90         8.57
                                                  s-quality N = 5              6.71                      3.31          536.75          7.63             2.32        12.11
                                                  s-height N = 1               6.55                      4.07           58.99          1.07             0.54         3.19
                                                  s-height N = 2               6.51                      4.20          181.52          2.18             0.99         2.94
                                               s-combined-1 N = 2              6.48                      4.42          181.21          2.29             1.02        12.86
                                                    s-combined-2               6.52                      4.28           74.64          1.10             0.54        12.72
                                                  BDD Sweeping                 6.34                      5.61          112.74            –                –           –
                                                   SAT Sweeping                6.10                      6.37          2149.4            –                –           –



                                      2
                                     10                                                                               spent in sweeping. Data are aggregated over the 818
                                                                                        Height Heuristic
                                                                                        Combined Heuristic            benchmarks. The base case for our comparisons is the
                                                                                                                      pure SAT sweeping case in which SAT sweeping gets
      Ratio over Quality Heuristic




                                      1
                                                                                                                      the whole budget. The time budget used in our study is
                                     10
                                                                                                                      10 seconds.
                                                                                                                         We consider the following policies: (a) allocating
                                                                                                                      the budget to two methods, (b) allocating it to three
                                      0
                                     10                                                                               methods, and (c) allocating the whole budget to SAT
                                                                                                                      sweeping. For (a) and (b), we consider all the different
                                                                                                                      permutations of assigning integer time values to each
                                      −1
                                                                                                                      method, such that they sum up to 10 seconds. Note that if
                                     10
                                           0        20    40          60
                                                               Node Height (/50)
                                                                                   80          100           120      a sweeping algorithm times out, what it has achieved thus
                                                                                                                      far is used in what follows. In all cases, a set of light-
Fig. 2.                               Number of merges on nodes within different height ranges
                                                                                                                      weight sequential simplification algorithms are applied
                                                                                                                      before sweeping. This set of algorithms includes cone-
                                                                                                                      of-influence reduction, stuck-at-constant latch detection,
that exceed the threshold.                                                                                            and equivalent latch detection. The total number of
                                                                                                                      AIG nodes for all 818 benchmarks measured after the
B. Combined Sweeping Methods                                                                                          sequential simplification step is 6.1M.
   In this section, we show experimental evidence that                                                                   Results are presented in Table II. The first column lists
supports the guidelines of combining sweeping methods                                                                 the methods, where the number before each sweeping
presented in Section IV. In particular, we try several pos-                                                           method indicates the number of seconds given to it.
sibilities of allotting the budget to the different sweeping                                                          The second column shows the number of AIG nodes
algorithms with the purpose of identifying empirically if                                                             removed. The third column shows the total time spent in
they should be combined, and if so, in which way. In                                                                  sweeping. The methods are listed in order of decreasing
what follows we use the “s-height N = 1” variant of                                                                   reduction. The last row is for pure SAT sweeping. We
cut-sweeping since it is the fastest, and we simply refer                                                             only show the best three setups in terms of reduction for
to it as cut sweeping.                                                                                                each of the possible orders of the method sequences.
   In our combined approach, SAT sweeping is always                                                                      Several observations can be made. First, when it
run last since it is the only complete method of the three,                                                           comes to running two methods in sequence, BDD
and should thus be given whatever time is left to find                                                                sweeping combined with SAT sweeping outperforms cut
equivalences not discovered by the other methods. Also,                                                               sweeping combined with SAT sweeping. The method
the time not used by any of the preceding methods is                                                                  that achieves maximum reduction (8 seconds of BDD
passed to SAT sweeping. For instance, if cut sweeping                                                                 sweeping followed by 2 seconds of SAT sweeping)
is given a time budget of 4 seconds and only uses 3 of                                                                removes 56K more nodes than pure SAT sweeping (7.7%
them, SAT sweeping gets to run for one extra second.                                                                  more reduction). Second, more reduction is achievable
   We compare the reduction measured in terms of the                                                                  by running three methods in sequence. As suggested in
number of AIG nodes removed, and the total time                                                                       Section IV, ordering the methods by increasing effort
                        TABLE II                                                        TABLE III
      E FFECT OF BUDGET ALLOCATION ON REDUCTION .                      E FFECT OF SWEEPING ON ic3’ S PERFORMANCE .

           Method         Reduction   Total Sweeping Time (s)                 Solves      Solves     Runtime (s)   Runtime (s)
    4 Cut, 5 BDD, 1 SAT    801,932              518                  Seed       (No        (With        (No          (With
    2 Cut, 5 BDD, 3 SAT    801,137              516                 Index    Sweeping)   Sweeping)   Sweeping)     Sweeping)
    6 Cut, 3 BDD, 1 SAT    801,119              522                    1        693         698        96,297        91,762
    4 BDD, 1 Cut, 5 SAT    794,052              517                    2        689         699        95,629        90,341
    8 BDD, 1 Cut, 1 SAT    793,921              515                    3        691         699        95,050        92,714
    7 BDD, 2 Cut, 1 SAT    793,814              519                    4        696         697        93,691        91,141
       8 BDD, 2 SAT        793,226              500                    5        693         698        95,007        89,656
       7 BDD, 3 SAT        793,068              503                    6        690         695        96,270        91,559
       5 BDD, 5 SAT        792,797              508                    7        693         699        94,784        92,056
        1 Cut, 9 SAT       772,563              512                    8        690         701        94,351        90,837
        6 Cut, 4 SAT       771,070              513                    9        693         693        95,491        92,847
        3 Cut, 7 SAT       769,483              511                   10        690         693        95,124        93,048
           10 SAT          736,594              619                Average     691.8       697.2       95,169        91,596




(or equivalently by increasing degree of completeness)          and after sweeping) in the sweeping case. We compare
achieves more reduction than otherwise. Here, the best          the number of solves, and the aggregate runtime among
method (4 seconds, 5 seconds, and 1 second for cut,             all benchmarks.
BDD and SAT sweeping, respectively), has an edge                   It is important to note that the ic3 algorithm has a
of 65K nodes over pure SAT sweeping (about 8.9%                 random flavor. In particular, the order by which gener-
more reduction). Third, in terms of sweeping time, it           alization (dropping literals) is attempted is randomized.
is clear that a large drop occurs (> 100 seconds) when          Also, since the algorithm is SAT-based, randomization
two or three methods are combined versus pure SAT               occurs in the SAT solver decisions. To have reliable
sweeping, which is due to the often smaller time needed         experimental results, each experiment is repeated with
by BDD and cut sweeping to discover equivalences than           10 different seeds, and the results are averaged over the
SAT sweeping. Given an overall model checking budget,           different seeds.
smaller sweeping time allows more time for the model               Results are shown in Table III. The first column shows
checking engine, which is desirable.                            the seed index, the second and third columns show the
   The question of whether such difference has a consid-        number of solves without and with sweeping, and the
erable effect on the performance of the model checking          fourth and fifth columns show the aggregate runtime
engine is answered in the next section.                         without and with sweeping.
                                                                   The results confirm a positive effect of sweeping on
C. Effect on ic3
                                                                the performance of ic3. On average, five more solves
   The recently developed model checking algorithm,             are achieved with sweeping, and the aggregate runtime
ic3 [1], has been regarded as the best standalone model         drops by 3.8%.
checking algorithm developed up till now [16]. As the              The enhancement in the performance of ic3 in pres-
interaction of combinational simplification methods with        ence of sweeping can be attributed to two factors. First,
different model checking algorithms has been studied in         reduction in the number of gates caused by sweeping can
the past, we here aim to study how they interact with           result in the reduction in the SAT solver time. Second,
ic3. In particular, we would like to empirically find           simplification often results in dropping of latches (e.g.,
out if ic3 benefits from preprocessing the netlist with         if it merges the next-state functions of two latches). For
a simplification algorithm or not, and if it does, how          example, for the benchmark set used in our experiments,
sensitive it is to the magnitude of reductions achieved         sweeping reduces the aggregate number of latches from
through simplification.                                         279,161 to 269,091 (3.6% decrease). This reduces the
   In the first experiment, we compare two runs of ic3,         amount of work done by ic3 in generalization of
one that is preceded by SAT sweeping, and one that              counterexamples to induction.
is not. The experimental setup is as follows. A total              We now repeat the previous experiment, but this time
timeout of 10 minutes is used. The budget for SAT               we compare the number of solves and the aggregate
sweeping is 10 seconds. The light-weight sequential             runtime between pure SAT sweeping and the empirically
simplification algorithms referred to in Section V-B are        optimum combined sweeping scheme of Section V-B.
applied once in the no-sweeping case, and twice (before         The purpose of this experiment is to understand how
                      TABLE IV
     O PTIMUM SWEEPING VERSUS PURE SAT SWEEPING .               reviewers for their insightful comments regarding cut
                                                                sweeping’s pruning heuristics that prompted us to try
              Solves       Solves   Runtime (s)   Runtime (s)   the “combined-2” heuristic.
               (Pure     (Optimum      (Pure       (Optimum
    Seed        SAT      Sweeping       SAT        Sweeping
    Index    Sweeping)    Scheme)   Sweeping)       Scheme)                               R EFERENCES
       1        698         696       91,762         91,567
       2        699         696       90,341         91,113
                                                                 [1] A. R. Bradley, “SAT-based model checking without unrolling,”
       3        699         697       92,714         92,373          in Proc. Int. Conf. on Verification, model checking, and abstract
       4        697         702       91,141         90,478          interpretation, 2011, pp. 70–87.
       5        698         697       89,656         90,327      [2] H. Mony, J. Baumgartner, V. Paruthi, R. Kanzelman, and
       6        695         699       91,559         90,606          A. Kuehlmann, “Scalable automated verification via expert-
       7        699         697       92,056         91,498          system guided transformations,” in Formal Methods in
       8        701         699       90,837         92,228
       9        693         696       92,847         91,252
                                                                     Computer-Aided Design, 2004, pp. 159–173.
      10        693         697       93,048         92,663      [3] R. Brayton and A. Mishchenko, “ABC: An academic industrial-
   Average     697.2       697.6      91,596         91,411          strength verification tool,” in Computer Aided Verification,
                                                                     2010, pp. 24–40.
                                                                 [4] G. Cabodi, S. Nocco, and S. Quer, “Benchmarking a model
                                                                     checker for algorithmic improvements and tuning for perfor-
sensitive ic3 is to the magnitude of reductions.                     mance,” in Proc. Hardware Verification Workshop, 2010.
                                                                 [5] P. Bjesse and A. Boralv, “DAG-aware circuit compression for
   Results are shown in Table IV, where the second and               formal verification,” in Proc. Int. Conf. on Computer-aided
third columns compare the number of solves for pure                  design, 2004, pp. 42–49.
SAT sweeping and the optimum sweeping scheme, and                [6] A. Mishchenko, S. Chatterjee, and R. Brayton, “DAG-aware
                                                                     AIG rewriting a fresh look at combinational logic synthesis,”
the fourth and fifth columns compare the total runtime.
                                                                     in Proc. Design Automation Conference, 2006, pp. 532–535.
   As the results indicate, ic3 does not benefit much            [7] A. Kuehlmann and F. Krohm, “Equivalence checking using cuts
from the better reduction achieved by the combined                   and heaps,” in Proc. Design Automation Conference, 1997, pp.
sweeping scheme. The lack of performance enhancement                 263–268.
                                                                 [8] A. Kuehlmann, “Dynamic transition relation simplification for
on ic3 can be attributed to the small improvement                    bounded property checking,” in Proc. Int. Conf. on Computer-
in reduction the combined sweeping approach achieves                 aided design, 2004, pp. 50–57.
over pure SAT sweeping. In particular, while pure SAT            [9] N. Eén, “Cut sweeping,” Cadence Design Systems, Tech. Rep.,
sweeping removes 737K nodes out of the total 6.1M                    2007.
                                                                [10] A. Mishchenko, S. Chatterjee, and R. Brayton, “FRAIGs: A
nodes in the 818 benchmarks (12.1% reduction), the                   unifying representation for logic synthesis and verification,”
combined approach removes 802K nodes (13.2% reduc-                   EECS Dept., UC Berkeley, ERL, Tech. Rep., Mar 2005.
tion); a mere 1.1% improvement. This suggests that ic3          [11] A. Biere, A. Cimatti, E. M. Clarke, and Y. Zhu, “Symbolic
                                                                     model checking without BDDs,” in TACAS, 1999, pp. 193–207.
has a small sensitivity to the magnitude of reduction.
                                                                [12] G. Cabodi, M. Murciano, S. Nocco, and S. Quer, “Stepping
                    VI. C ONCLUSION                                  forward with interpolants in unbounded model checking,” in
                                                                     Proc. Int. Conf. on Computer-aided design, 2006, pp. 772–778.
  In this paper, we presented an empirical study of             [13] A. Mishchenko, S. Chatterjee, R. Brayton, and N. Een, “Im-
the different sweeping methods proposed in the past.                 provements to combinational equivalence checking,” in Proc.
                                                                     Int. Conference on Computer-aided design, 2006, pp. 836–843.
We have shown that a combined sweeping approach                 [14] A. Biere and K. Claessen, “Hardware model checking compe-
outperforms any of the sweeping methods alone. We                    tition,” in Hardware Verification Workshop, 2010.
have also proposed a BDD-based cut sweeping method              [15] F. Somenzi, CUDD: CU Decision Diagram Package, University
that is more effective than the original cut sweeping.               of Colorado at Boulder, ftp://vlsi.colorado.edu/pub/.
                                                                [16] R. Brayton, N. Een, and A. Mishchenko, “Continued relevance
Finally, we have studied the effect of sweeping on the               of bit-level verification research,” in Proc. Usable Verification,
new model checking algorithm, ic3, and investigated                  Nov. 2010, pp. 15–16.
the causes of the better performance it experiences with
sweeping. The goal of this analysis is to help designers
of model checkers to make decisions regarding the
incorporation of sweeping methods and to provide a
deeper understanding of how sweeping methods interact
with ic3.
                 ACKNOWLEDGEMENTS
  We thank Aaron Bradley who motivated this work
and contributed to many discussions. We also thank the