=Paper= {{Paper |id=None |storemode=property |title=A Study of Sweeping Algorithms in the Context of Model Checking |pdfUrl=https://ceur-ws.org/Vol-832/paper_7.pdf |volume=Vol-832 |dblpUrl=https://dblp.org/rec/conf/fmcad/HassanZS11 }} ==A Study of Sweeping Algorithms in the Context of Model Checking == https://ceur-ws.org/Vol-832/paper_7.pdf

A Study of Sweeping Algorithms in the Context of
Model Checking
Zyad Hassan, Yan Zhang, and Fabio Somenzi
Dept. of Electrical, Computer, and Energy Engineering
University of Colorado at Boulder
Boulder, CO 80309

Abstract—Combinational simplification techniques have niques benefit interpolation considerably. The recently
proved their usefulness in both industrial and academic introduced ic3 [1] incrementally discovers invariants
model checkers. Several combinational simplification al- that hold relative to stepwise reachability information.
gorithms have been proposed in the past that vary in Designing a model checking flow that involves ic3
efficiency and effectiveness. In this paper, we report our
requires understanding how combinational simplification
experience with three algorithms that fall in the com-
binational equivalence checking (sweeping) category. We
algorithms affect it.
propose an improvement to one of these algorithms. We This paper makes the following contributions:
have conducted an empirical study to identify the strengths • We carry out a comparative study of the different
and weaknesses of each of the algorithms and how they sweeping methods.
can be synergistically combined, as well as to understand • We propose a BDD-based cut sweeping method that
how they interact with ic3 [1].
is more effective than the original cut sweeping.
• We propose a combined sweeping approach in
I. I NTRODUCTION
which more than one sweeping method is applied.
Combinational simplification eliminates redundancies We show that the combined approach can achieve
and increases sharing of logic in a design. It has been more simplification than any of the methods can
successfully employed in logic synthesis, equivalence achieve individually.
checking, and model checking. • We perform an empirical study of the effect of
In the model checking context, combinational sim- sweeping on ic3.
plification often dramatically improves the performance
The rest of the paper is organized as follows. Section
of the proof engines. This has made it into a primary
II contains preliminaries. In Section III, we introduce
component in both industrial [2] and academic [3],
the BDD-based cut sweeping method. In Section IV,
[4] model checkers. Several combinational simplification
we explain the rationale behind the combined sweeping
algorithms have been proposed in the past, such as
approach. In Section V we present the experimental
DAG-aware circuit compression [5], [6] and sweeping
results and in Section VI we conclude.
methods [7]–[9]. Sweeping methods merge functionally
equivalent nodes. They include BDD sweeping [7], SAT II. P RELIMINARIES
sweeping [8], [10], and cut sweeping [9].
When designing a model checker, the strengths and A. AND-inverter-graph
weaknesses of each of the sweeping methods should The input and output of our sweeping algorithms
be taken into account. To the best of our knowledge, are AND-inverter-graphs (AIGs). An AIG is a directed
no studies have been carried out so far to evaluate acyclic graph that has four types of nodes: source nodes,
and compare the different sweeping methods, with the sink nodes, internal nodes and the constant TRUE node.
exception of a limited study reported in [9]. A primary input (PI) is a source node in an AIG. A
The effect of combinational simplification on several primary output (PO) is a sink node and has exactly one
model checking engines has been studied in the past. predecessor. The internal nodes represent 2-input AND
In [8], SAT sweeping has been shown to positively gates. A node v is a fanin of v0 if there is an edge (v, v0 );
affect bounded model checking (BMC) [11]. In [12], it it is a fanout of v0 if there is an edge (v0 , v). Left(v)
is shown that combinational redundancy removal tech- and Right(v) refer to the left and right predecessors of v .
Fanin(v) and Fanout (v) denote the fanins and fanouts point is that the original BDD is kept for equivalence
of node v . An edge in an AIG may have the INVERTED checking, but is not used to produce new BDDs. The
attribute to model an inverter. The Boolean function of algorithm is complete if the threshold is so large that
a PI is a unique Boolean variable. For an edge, it is no auxiliary variable is introduced. In practice, however,
the function of its source node if the edge does not have this can be prohibitive.
the INVERTED attribute, and the complement otherwise.
For an internal node, it is the conjunction of the functions C. SAT Sweeping
of its incoming edges. The Boolean function of a PO is The advancements in SAT solver technology over the
that of its incoming edge. past two decades has proliferated SAT-based reasoning
A path from node a to b is a sequence of nodes methods. SAT sweeping is one such method proposed
hv0 , v1 , v2 , . . . , vn i, such that v0 = a, vn = b and by Kuehlmann [8] for combinational redundancy identi-
vi ∈ Fanin(vi+1 ), 0 ≤ i < n. The height of a node v , fication. SAT sweeping queries the SAT solver to prove
h(v), is the length of the longest path from a PI to v . The or refute the equivalence of two nodes in the circuit.
fanin (fanout) cone of node v is the set of nodes that have The basic SAT sweeping algorithm works as follows.
a path to (from) v . Two nodes are functionally equivalent First, the circuit is simulated with random inputs, and
(complementary) if they represent the same (comple- candidate equivalence classes are formed, where nodes
mentary) Boolean function(s). Functionally equivalent having the same simulation values are placed together.
(complementary) nodes can be merged transferring the Next, for each two nodes belonging to the same class,
fanouts of the redundant nodes to their representative. the SAT solver is queried for their equivalence. If the
To simplify our presentation, in what follows we delib- SAT solver reports they are equivalent, one of them is
erately ignore detection of complementary functions. merged into the other. Otherwise, the counterexample
provided is simulated to break this equivalence class,
B. BDD Sweeping and possibly rule out other candidate equivalences as
BDD sweeping identifies two nodes as equivalent if well. This process is repeated until a resource limit is
they have the same BDD representation. The original reached, or until all classes become singletons, indicating
algorithm of Kuehlmann and Krohm [7] works in two the absence of further equivalences.
stages: equivalence checking and false negative detec- In our implementation of SAT sweeping, several
tion. In the first stage, it builds BDDs for each node heuristics were applied. We mention each of them briefly.
and merges nodes having the same BDD. The algorithm 1) Simulating Distance-1 Vectors: This heuristic was
introduces an auxiliary BDD variable (cut point) for each proposed in [13]. Instead of just simulating the coun-
node that has been merged, which potentially leads to terexample to equivalence derived by the SAT solver,
false negatives. In the second stage, it takes the BDDs all distance-1 vectors, that have a single bit flipped, are
of the output functions and for each of them, replaces simulated as well. Only the bits corresponding to the
the auxiliary variables with their driving functions. The inputs that are in the fanin cone of the two nodes being
algorithm is complete in the sense that it can find all checked for equivalence are flipped. We have found this
the equivalences in the circuit given a sufficiently large heuristic to be very effective in practice. In [13], this
limit on the BDD sizes. However, a large limit often hurts process is repeated for vectors that were successful in
efficiency. In this paper, we intend to use BDD sweeping breaking up equivalence classes until convergence. In our
in conjunction with SAT sweeping which is complete implementation, we only simulate the distance-1 vectors
and avoids inherent BDD inefficiencies [8]. For that, we for the original counterexample: for the benchmark suite
employ a version of BDD sweeping that is incomplete we experimented with, recurring on successful vectors is
yet faster than the original. too expensive for the number of refinements it achieves.
The algorithm we adopt iterates the following steps 2) Clustering: Simulating distance-1 vectors often
on each node v of the AIG in some topological order. results in considerable refinement of the equivalence
It builds a BDD for v and checks if there exists another classes. This is desirable, since an equivalence class
node that has the same BDD. If so, it merges these two is often broken up more cheaply by simulation than
nodes and continues. Otherwise, if the BDD size exceeds by the SAT solver. Moreover, we have observed that
a given threshold, the algorithm introduces an auxiliary with distance-1 vector simulation, it becomes very likely
BDD variables for v to be used in place of the original that nodes remaining in an equivalence class are indeed
BDD when dealing with the fanouts of v . An important equivalent. Therefore, rather than checking the equiva-
lence of two nodes at a time, we check the equivalence in Φ(v) is associated with a truth table. Next, it searches
of all nodes in an equivalence class using a single SAT for a node equivalent to v by looking for a cut equivalent
query. If they are all indeed equivalent, we find that using to some cut in Φ(v). If it succeeds, the two nodes are
a single SAT query rather than n − 1 queries where n is merged. Otherwise, a heuristic is applied to prune Φ(v)
the number of nodes in the class. to at most N cuts. After pruning, the algorithm stores
3) Height-Based Merging: When two nodes are Φ(v) as the cut-set of v and builds a cross-reference
proved equivalent, we merge the node with a larger between each of its cuts and v .
height into the one with a smaller height, instead of The heuristic for pruning, which we call the quality
merging based on a topological order as in [13]. The heuristic, computes a value q for each cut:
intuition being that a node having a larger height often X 1
q(C) = . (3)
has a larger fanin cone, which suggests that merging it | Fanout (n)|
n∈C
would lead to a larger reduction. Nodes coming later in a
The cuts with the smallest values of q are kept. The
topological order do not necessarily have a larger height
intuition of the quality heuristic is two-fold. First, it tries
than nodes coming earlier.
to localize the equivalence and thus favors smaller cuts.
D. Cut Sweeping Second, it normalizes cuts by attempting to express them
with the same set of variables. The chosen variables are
Cut sweeping [9] is a fast yet incomplete approach
those that have a large fanouts, i.e., that are shared by
for combinational equivalence checking. It iteratively
many other nodes.
computes cuts for each AIG node and compares the
A good truth-table implementation is critical to the
functions associated to the cuts.
performance of cut sweeping. In [9], truth tables are
A cut is defined with respect to an AIG node, called
implemented as bit vectors. An advantage of bit vectors
root. A cut C(v) of root v is a set of nodes, called leaves,
is the constant-time cost of Boolean operations. On the
such that any path from a PI to v intersects C(v). A cut-
other hand, bit interleaving is required to extend the bit
set Φ(v) consists of several cuts of v . For cut-sets Φ(v1 )
vectors to the same length so that the corresponding bits
and Φ(v2 ), the merge operator ⊗ is defined as
represent the same minterm1 .
Φ(v1 ) ⊗ Φ(v2 ) = {C1 ∪ C2 | C1 ∈ Φ(v1 ), C2 ∈ Φ(v2 )} . III. BDD- BASED C UT S WEEPING
(1)
Assume k ≥ 1. A k-feasible cut is a cut that contains Representing functions having a small number of
at most k leaves. A k-feasible cut-set is a cut-set that inputs using bit vectors is very efficient. However, the
contains only k-feasible cuts. The k-merge operator, ⊗k , number of bits required grows exponentially with the
creates only k-feasible cuts. Cut enumeration recursively number of variables, which can easily lead to memory
computes all k-feasible cuts for an AIG. It computes the blow-up. As an alternative, BDDs, which are more suit-
k-feasible cut-set for a node v as follows: able for large functions, can also be used to represent cut
( functions. Furthermore, the strong canonicity of BDDs
{{v}} v ∈ PI makes it trivial to check for equivalence. The use of
Φ(v) =
{{v}} ∪ Φ (Left (v)) ⊗k Φ (Right (v)) v ∈ AND , BDDs also enables a heuristic which we describe below.
(2) The proposed algorithm differs from the original one
where PI and AND refer to the set of PIs and 2-input in two aspects. First, we introduce a new parameter s, the
AND gates respectively. Note that cuts are not built for maximum size of a BDD, to replace k. That is, instead of
POs because they are never merged. k-feasible cuts, we keep cuts whose functions contain at
The function of a cut is the Boolean function of the most s BDD nodes. Node count, as opposed to number
root in terms of the leaves. It can be represented in of inputs, is a more natural characterization of BDD size.
different ways, for instance, using bit vectors or BDDs. The second difference comes from the pruning heuris-
Two cuts are equivalent if their cut functions are equal. tic. We define the height h of a cut C as the average
Hence, two nodes are functionally equivalent if their cut- height of its nodes:
sets contain equivalent cuts. X h(v)
Cut sweeping is parametrized by k and N , the maxi- h(C) = . (4)
v∈C
|C|
mum cut size and the maximum cut-set size, respectively.
For each node v in some topological order of the AIG, 1
A good reference of bit-interleaving can be found at http://
the algorithm builds a k-feasible cut-set Φ(v). Each cut graphics.stanford.edu/∼seander/bithacks.html.
A smaller h indicates that the leaves in the cut are closer cut sweeping only keeps those below the threshold. The
to the PIs. The height heuristic keeps at most N cuts quality heuristic tries to attract cuts on certain nodes,
choosing the ones with smallest values of h. which is similar to the placement of auxiliary variables in
BDD sweeping. Nevertheless, the number of “attractors”
a
p
a
r
in the quality heuristic tends to be much larger than in
b c
g
BDD sweeping.
f
c b
q s
d d IV. C OMBINING S WEEPING M ETHODS
Fig. 1. Two implementations of a 4-input AND gate The idea of combining several simplification algo-
rithms is not new. Many existing model checkers iterate
A motivating example for the new heuristic is in several simplification algorithms before the problem is
Figure 1, which shows two different 4-input AND gates. passed to the model checking engines. However, we are
Nodes a, b, c, and d are PIs. Nodes p, q , r , and s are unaware of any studies that have been carried out to
internal nodes. Nodes f and g can only be merged if identify the best way they could be combined. In this
their cut-sets both contain {a, b, c, d}. However, if the section we attempt to give general guidelines to which
internal nodes have many more fanouts than the PIs, the a combined approach should adhere. We support our
quality heuristic may select cuts containing the internal claims by empirical evidence collected in the experi-
nodes instead, causing the merge to be missed. ments reported in Section V.
As mentioned before, the quality heuristic tries to The problem we address is as follows: given a time
normalize the cuts on certain “attractors.” This reduces budget for combinational simplification, how should it
the possibility that equivalent functions are represented be allotted to the different algorithms? The sweeping
differently. However, this might also lead to the loss algorithms discussed in previous sections vary in their
of the opportunity to find equivalences that cannot be completeness and speed, with cut sweeping being the
expressed by those “attractors,” as in Figure 1. most incomplete method yet the fastest of the three meth-
On the other hand, the height heuristic tries to push the ods, SAT sweeping being a complete, yet the slowest,
cut boundary as far as possible. A supporting argument and BDD sweeping lying in between, both in terms of
is that, if a node is employed in equivalent cuts, then completeness and speed.
replacing it with its predecessors preserve equivalence. Possible solutions include allocating the whole time
Furthermore, new merges that are otherwise undiscover- budget to a single algorithm, or dividing it among two or
able (consider other equivalences that require a and b in more algorithms. The fact that some methods are better
the above example) may be found. The height heuristic in approaching certain problems than others, suggests
does not attract cuts to certain nodes, which may result that more than one method should be applied. If two or
in different cuts for equivalent nodes. As shown in the more methods are to be applied in sequence, the intuition
experiments, the effectiveness of the height heuristic suggests that the lower effort methods should precede
reduces as the height of nodes increases. the higher effort ones. The advantages of doing so are
The two heuristics have their own strengths and weak- two-fold. First, although the higher-effort methods are
nesses. A natural question is whether it is possible to likely to discover the merges that a lower-effort method
combine them to benefit from their individual strengths. discovers, in general, it will take them more time to do
We can choose a few cuts with each heuristic. This may so. Second, preceding higher-effort methods by lower-
lead to more merges but may also worsen the efficiency effort methods is beneficial in having them present a
if it significantly increases the number of cuts. To prevent smaller problem that is easier to handle.
such an increase, a combined heuristic only records Finally, the percentages of total time that should be
height cuts for the lower nodes, while it keeps both types allotted to each of the methods to yield the maximum
of cuts for the others. possible reduction is studied in Section V.
There is some connection between cut sweeping with
each of the two heuristics and BDD sweeping. With V. R ESULTS
the height heuristic, cut sweeping tries to build cuts as
large as possible, as BDD sweeping does. However, BDD The experiments are divided into three parts. The first
sweeping can store cuts that exceed the threshold while part compares different variations of the cut sweeping
algorithm. The second part shows the results of the com- proved significantly with s. This means that with s, each
bined sweeping methods, and the third part is concerned cut has a larger chance of resulting in a merge.
with the effect of sweeping on ic32 . Second, while “k-quality” and “k-combined-1” have
We use the benchmark set from HWMCC’10 [14], a very close sweeping times, the latter achieves 19.8%
set of 818 benchmarks in total. The experiments are run more merges. Furthermore, the decrease in the “Height”
on a machine with a 2.8GHz Quad-Core CPU and 9GB column reveals that the height cuts indeed lead to
of memory. We use CUDD [15] as our BDD package merges. Although “s-quality” is more effective than the
for all the BDD-related algorithms. two above methods, it is less efficient due to the larger
cut sizes.
A. BDD-based Cut Sweeping For the methods with s (excluding “s-quality”), we
Variations of cut sweeping are applied to the observe that “s-height N = 1” is the fastest and produces
HWMCC’10 benchmark set. The differences between a good number of merges. Increasing the number of
variations are two-fold. First, either the number of vari- height cuts to two triples the run time without gaining
ables, k, or the number of BDD nodes, s, is used to many more merges. Comparing it with “s-combined-1”,
drop over-sized cuts. Second, we experiment with several an improvement on the merges is shown by the latter.
heuristics for pruning cut-sets: the quality heuristic, This indicates that maintaining one height cut and one
the height heuristic, and two combined heuristics. The quality cut works better than two height cuts. For “s-
naive combined heuristic (“combined-1”) chooses one combined-2”, the number of merges is between the two
cut based on the height heuristic and the others based above methods, but with lower run time. Furthermore,
on the quality heuristic. The other heuristic (“combined- the numbers of generated cuts and kept cuts are even
2”) sets a threshold on the node height (350 in our comparable to “s-height N = 1”. That is, even though
experiments). For nodes that are below the threshold, we keep three cuts for those nodes with height larger
it only keeps a height cut. For higher nodes, it produces than 350, on average we compute only a few percent
cut-sets consisting of a height cut and two quality cuts. more cuts than we do in the case of one cut per node.
We denote a method by a k or an s followed by the the The “Height” values of the three methods confirm the
heuristic name. All the variations use BDDs to represent assumption made in Section III: most merges produced
the cut functions. by the height heuristic come from cuts close to the
The results are shown in Table I; they are aggregated PIs. When the two heuristics are combined, a significant
over the 818 benchmarks. Based on experiments, both increase on the “Height” value is observed. In Figure
the threshold of BDD sweeping and s in BDD-based 2, we show the number of merges found by “s-height
cut sweeping are set to 250. The total number of AIG N = 1” and “s-combined-2” on nodes within different
nodes before sweeping is 7.22M. “Final” is the size of height ranges. The plot is normalized to “k-quality” and
AIGs after sweeping. “Generated” and “Kept” are the has bin size of 50, i.e., a point at (2, 1) indicates that the
number of cuts generated and kept by the corresponding method finds the same number of merges as “k-quality”
methods. For an individual benchmark, its “height” is the for nodes with height from 100 to 149. Obviously, the
average height of all merged cuts. The “Height” column height heuristic works better on smaller height nodes,
is computed by taking the average of the “height” of while the quality heuristic catches more on larger height
all the benchmarks. A smaller value indicates that more ones. The combined heuristic takes the advantage of
merges are found by cuts that are close to the PIs. Note both and produces an even better profile on nodes with
that since we use BDDs, the results in terms of efficiency larger heights. Note that although the height heuristic
of bit-vectors based methods may not be as good as in works worse on the nodes with larger heights, it can
[9]. Therefore, when dealing with them, we just compare still get more merges. This may be due to the fact that
the effectiveness. in this benchmark set, a large percent of equivalences
Results indicate that the resulting AIGs are con- are located at lower heights.
sistently smaller with s than with k. There are In our setup, cut sweeping is intended for usage as a
a few interesting observations. First, the ratios fast method. Thus we consider “s-height N = 1” and
GeneratedCuts /Merge and KeptCuts/Merge are im- “s-combined-2” to be the best variants. Compared to
BDD sweeping, those two variants are faster because
2
Detailed results for first and second parts can be found at http: they create fewer BDD nodes than BDD sweeping, but
//eces.colorado.edu/∼hassanz/sweeping-study are less effective since BDD sweeping may keep BDDs
TABLE I
R ESULTS OF CUT SWEEPING , BDD SWEEPING AND SAT SWEEPING ON THE HWMCC’10 SET. B Y DEFAULT, k = 8 AND s = 250.

Method Final (×106 ) Merges (×105 ) Time (s) Generated (×107 ) Kept (×107 ) Height
k-quality N = 5 6.82 2.62 123.26 5.83 1.92 10.20
k-combined-1 N = 5 6.75 3.14 129.64 5.84 1.90 8.57
s-quality N = 5 6.71 3.31 536.75 7.63 2.32 12.11
s-height N = 1 6.55 4.07 58.99 1.07 0.54 3.19
s-height N = 2 6.51 4.20 181.52 2.18 0.99 2.94
s-combined-1 N = 2 6.48 4.42 181.21 2.29 1.02 12.86
s-combined-2 6.52 4.28 74.64 1.10 0.54 12.72
BDD Sweeping 6.34 5.61 112.74 – – –
SAT Sweeping 6.10 6.37 2149.4 – – –

2
10 spent in sweeping. Data are aggregated over the 818
Height Heuristic
Combined Heuristic benchmarks. The base case for our comparisons is the
pure SAT sweeping case in which SAT sweeping gets
Ratio over Quality Heuristic

1
the whole budget. The time budget used in our study is
10
10 seconds.
We consider the following policies: (a) allocating
the budget to two methods, (b) allocating it to three
0
10 methods, and (c) allocating the whole budget to SAT
sweeping. For (a) and (b), we consider all the different
permutations of assigning integer time values to each
−1
method, such that they sum up to 10 seconds. Note that if
10
0 20 40 60
Node Height (/50)
80 100 120 a sweeping algorithm times out, what it has achieved thus
far is used in what follows. In all cases, a set of light-
Fig. 2. Number of merges on nodes within different height ranges
weight sequential simplification algorithms are applied
before sweeping. This set of algorithms includes cone-
of-influence reduction, stuck-at-constant latch detection,
that exceed the threshold. and equivalent latch detection. The total number of
AIG nodes for all 818 benchmarks measured after the
B. Combined Sweeping Methods sequential simplification step is 6.1M.
In this section, we show experimental evidence that Results are presented in Table II. The first column lists
supports the guidelines of combining sweeping methods the methods, where the number before each sweeping
presented in Section IV. In particular, we try several pos- method indicates the number of seconds given to it.
sibilities of allotting the budget to the different sweeping The second column shows the number of AIG nodes
algorithms with the purpose of identifying empirically if removed. The third column shows the total time spent in
they should be combined, and if so, in which way. In sweeping. The methods are listed in order of decreasing
what follows we use the “s-height N = 1” variant of reduction. The last row is for pure SAT sweeping. We
cut-sweeping since it is the fastest, and we simply refer only show the best three setups in terms of reduction for
to it as cut sweeping. each of the possible orders of the method sequences.
In our combined approach, SAT sweeping is always Several observations can be made. First, when it
run last since it is the only complete method of the three, comes to running two methods in sequence, BDD
and should thus be given whatever time is left to find sweeping combined with SAT sweeping outperforms cut
equivalences not discovered by the other methods. Also, sweeping combined with SAT sweeping. The method
the time not used by any of the preceding methods is that achieves maximum reduction (8 seconds of BDD
passed to SAT sweeping. For instance, if cut sweeping sweeping followed by 2 seconds of SAT sweeping)
is given a time budget of 4 seconds and only uses 3 of removes 56K more nodes than pure SAT sweeping (7.7%
them, SAT sweeping gets to run for one extra second. more reduction). Second, more reduction is achievable
We compare the reduction measured in terms of the by running three methods in sequence. As suggested in
number of AIG nodes removed, and the total time Section IV, ordering the methods by increasing effort
TABLE II TABLE III
E FFECT OF BUDGET ALLOCATION ON REDUCTION . E FFECT OF SWEEPING ON ic3’ S PERFORMANCE .

Method Reduction Total Sweeping Time (s) Solves Solves Runtime (s) Runtime (s)
4 Cut, 5 BDD, 1 SAT 801,932 518 Seed (No (With (No (With
2 Cut, 5 BDD, 3 SAT 801,137 516 Index Sweeping) Sweeping) Sweeping) Sweeping)
6 Cut, 3 BDD, 1 SAT 801,119 522 1 693 698 96,297 91,762
4 BDD, 1 Cut, 5 SAT 794,052 517 2 689 699 95,629 90,341
8 BDD, 1 Cut, 1 SAT 793,921 515 3 691 699 95,050 92,714
7 BDD, 2 Cut, 1 SAT 793,814 519 4 696 697 93,691 91,141
8 BDD, 2 SAT 793,226 500 5 693 698 95,007 89,656
7 BDD, 3 SAT 793,068 503 6 690 695 96,270 91,559
5 BDD, 5 SAT 792,797 508 7 693 699 94,784 92,056
1 Cut, 9 SAT 772,563 512 8 690 701 94,351 90,837
6 Cut, 4 SAT 771,070 513 9 693 693 95,491 92,847
3 Cut, 7 SAT 769,483 511 10 690 693 95,124 93,048
10 SAT 736,594 619 Average 691.8 697.2 95,169 91,596

(or equivalently by increasing degree of completeness) and after sweeping) in the sweeping case. We compare
achieves more reduction than otherwise. Here, the best the number of solves, and the aggregate runtime among
method (4 seconds, 5 seconds, and 1 second for cut, all benchmarks.
BDD and SAT sweeping, respectively), has an edge It is important to note that the ic3 algorithm has a
of 65K nodes over pure SAT sweeping (about 8.9% random flavor. In particular, the order by which gener-
more reduction). Third, in terms of sweeping time, it alization (dropping literals) is attempted is randomized.
is clear that a large drop occurs (> 100 seconds) when Also, since the algorithm is SAT-based, randomization
two or three methods are combined versus pure SAT occurs in the SAT solver decisions. To have reliable
sweeping, which is due to the often smaller time needed experimental results, each experiment is repeated with
by BDD and cut sweeping to discover equivalences than 10 different seeds, and the results are averaged over the
SAT sweeping. Given an overall model checking budget, different seeds.
smaller sweeping time allows more time for the model Results are shown in Table III. The first column shows
checking engine, which is desirable. the seed index, the second and third columns show the
The question of whether such difference has a consid- number of solves without and with sweeping, and the
erable effect on the performance of the model checking fourth and fifth columns show the aggregate runtime
engine is answered in the next section. without and with sweeping.
The results confirm a positive effect of sweeping on
C. Effect on ic3
the performance of ic3. On average, five more solves
The recently developed model checking algorithm, are achieved with sweeping, and the aggregate runtime
ic3 [1], has been regarded as the best standalone model drops by 3.8%.
checking algorithm developed up till now [16]. As the The enhancement in the performance of ic3 in pres-
interaction of combinational simplification methods with ence of sweeping can be attributed to two factors. First,
different model checking algorithms has been studied in reduction in the number of gates caused by sweeping can
the past, we here aim to study how they interact with result in the reduction in the SAT solver time. Second,
ic3. In particular, we would like to empirically find simplification often results in dropping of latches (e.g.,
out if ic3 benefits from preprocessing the netlist with if it merges the next-state functions of two latches). For
a simplification algorithm or not, and if it does, how example, for the benchmark set used in our experiments,
sensitive it is to the magnitude of reductions achieved sweeping reduces the aggregate number of latches from
through simplification. 279,161 to 269,091 (3.6% decrease). This reduces the
In the first experiment, we compare two runs of ic3, amount of work done by ic3 in generalization of
one that is preceded by SAT sweeping, and one that counterexamples to induction.
is not. The experimental setup is as follows. A total We now repeat the previous experiment, but this time
timeout of 10 minutes is used. The budget for SAT we compare the number of solves and the aggregate
sweeping is 10 seconds. The light-weight sequential runtime between pure SAT sweeping and the empirically
simplification algorithms referred to in Section V-B are optimum combined sweeping scheme of Section V-B.
applied once in the no-sweeping case, and twice (before The purpose of this experiment is to understand how
TABLE IV
O PTIMUM SWEEPING VERSUS PURE SAT SWEEPING . reviewers for their insightful comments regarding cut
sweeping’s pruning heuristics that prompted us to try
Solves Solves Runtime (s) Runtime (s) the “combined-2” heuristic.
(Pure (Optimum (Pure (Optimum
Seed SAT Sweeping SAT Sweeping
Index Sweeping) Scheme) Sweeping) Scheme) R EFERENCES
1 698 696 91,762 91,567
2 699 696 90,341 91,113
[1] A. R. Bradley, “SAT-based model checking without unrolling,”
3 699 697 92,714 92,373 in Proc. Int. Conf. on Verification, model checking, and abstract
4 697 702 91,141 90,478 interpretation, 2011, pp. 70–87.
5 698 697 89,656 90,327 [2] H. Mony, J. Baumgartner, V. Paruthi, R. Kanzelman, and
6 695 699 91,559 90,606 A. Kuehlmann, “Scalable automated verification via expert-
7 699 697 92,056 91,498 system guided transformations,” in Formal Methods in
8 701 699 90,837 92,228
9 693 696 92,847 91,252
Computer-Aided Design, 2004, pp. 159–173.
10 693 697 93,048 92,663 [3] R. Brayton and A. Mishchenko, “ABC: An academic industrial-
Average 697.2 697.6 91,596 91,411 strength verification tool,” in Computer Aided Verification,
2010, pp. 24–40.
[4] G. Cabodi, S. Nocco, and S. Quer, “Benchmarking a model
checker for algorithmic improvements and tuning for perfor-
sensitive ic3 is to the magnitude of reductions. mance,” in Proc. Hardware Verification Workshop, 2010.
[5] P. Bjesse and A. Boralv, “DAG-aware circuit compression for
Results are shown in Table IV, where the second and formal verification,” in Proc. Int. Conf. on Computer-aided
third columns compare the number of solves for pure design, 2004, pp. 42–49.
SAT sweeping and the optimum sweeping scheme, and [6] A. Mishchenko, S. Chatterjee, and R. Brayton, “DAG-aware
AIG rewriting a fresh look at combinational logic synthesis,”
the fourth and fifth columns compare the total runtime.
in Proc. Design Automation Conference, 2006, pp. 532–535.
As the results indicate, ic3 does not benefit much [7] A. Kuehlmann and F. Krohm, “Equivalence checking using cuts
from the better reduction achieved by the combined and heaps,” in Proc. Design Automation Conference, 1997, pp.
sweeping scheme. The lack of performance enhancement 263–268.
[8] A. Kuehlmann, “Dynamic transition relation simplification for
on ic3 can be attributed to the small improvement bounded property checking,” in Proc. Int. Conf. on Computer-
in reduction the combined sweeping approach achieves aided design, 2004, pp. 50–57.
over pure SAT sweeping. In particular, while pure SAT [9] N. Eén, “Cut sweeping,” Cadence Design Systems, Tech. Rep.,
sweeping removes 737K nodes out of the total 6.1M 2007.
[10] A. Mishchenko, S. Chatterjee, and R. Brayton, “FRAIGs: A
nodes in the 818 benchmarks (12.1% reduction), the unifying representation for logic synthesis and verification,”
combined approach removes 802K nodes (13.2% reduc- EECS Dept., UC Berkeley, ERL, Tech. Rep., Mar 2005.
tion); a mere 1.1% improvement. This suggests that ic3 [11] A. Biere, A. Cimatti, E. M. Clarke, and Y. Zhu, “Symbolic
model checking without BDDs,” in TACAS, 1999, pp. 193–207.
has a small sensitivity to the magnitude of reduction.
[12] G. Cabodi, M. Murciano, S. Nocco, and S. Quer, “Stepping
VI. C ONCLUSION forward with interpolants in unbounded model checking,” in
Proc. Int. Conf. on Computer-aided design, 2006, pp. 772–778.
In this paper, we presented an empirical study of [13] A. Mishchenko, S. Chatterjee, R. Brayton, and N. Een, “Im-
the different sweeping methods proposed in the past. provements to combinational equivalence checking,” in Proc.
Int. Conference on Computer-aided design, 2006, pp. 836–843.
We have shown that a combined sweeping approach [14] A. Biere and K. Claessen, “Hardware model checking compe-
outperforms any of the sweeping methods alone. We tition,” in Hardware Verification Workshop, 2010.
have also proposed a BDD-based cut sweeping method [15] F. Somenzi, CUDD: CU Decision Diagram Package, University
that is more effective than the original cut sweeping. of Colorado at Boulder, ftp://vlsi.colorado.edu/pub/.
[16] R. Brayton, N. Een, and A. Mishchenko, “Continued relevance
Finally, we have studied the effect of sweeping on the of bit-level verification research,” in Proc. Usable Verification,
new model checking algorithm, ic3, and investigated Nov. 2010, pp. 15–16.
the causes of the better performance it experiences with
sweeping. The goal of this analysis is to help designers
of model checkers to make decisions regarding the
incorporation of sweeping methods and to provide a
deeper understanding of how sweeping methods interact
with ic3.
ACKNOWLEDGEMENTS
We thank Aaron Bradley who motivated this work
and contributed to many discussions. We also thank the