=Paper=
{{Paper
|id=None
|storemode=property
|title=A Study of Sweeping Algorithms in the Context of Model Checking
|pdfUrl=https://ceur-ws.org/Vol-832/paper_7.pdf
|volume=Vol-832
|dblpUrl=https://dblp.org/rec/conf/fmcad/HassanZS11
}}
==A Study of Sweeping Algorithms in the Context of Model Checking==
A Study of Sweeping Algorithms in the Context of Model Checking

Zyad Hassan, Yan Zhang, and Fabio Somenzi
Dept. of Electrical, Computer, and Energy Engineering
University of Colorado at Boulder, Boulder, CO 80309

Abstract—Combinational simplification techniques have proved their usefulness in both industrial and academic model checkers. Several combinational simplification algorithms have been proposed in the past that vary in efficiency and effectiveness. In this paper, we report our experience with three algorithms that fall in the combinational equivalence checking (sweeping) category. We propose an improvement to one of these algorithms. We have conducted an empirical study to identify the strengths and weaknesses of each of the algorithms and how they can be synergistically combined, as well as to understand how they interact with ic3 [1].

I. INTRODUCTION

Combinational simplification eliminates redundancies and increases sharing of logic in a design. It has been successfully employed in logic synthesis, equivalence checking, and model checking.

In the model checking context, combinational simplification often dramatically improves the performance of the proof engines. This has made it a primary component in both industrial [2] and academic [3], [4] model checkers. Several combinational simplification algorithms have been proposed in the past, such as DAG-aware circuit compression [5], [6] and sweeping methods [7]–[9]. Sweeping methods merge functionally equivalent nodes. They include BDD sweeping [7], SAT sweeping [8], [10], and cut sweeping [9].

When designing a model checker, the strengths and weaknesses of each of the sweeping methods should be taken into account. To the best of our knowledge, no studies have been carried out so far to evaluate and compare the different sweeping methods, with the exception of a limited study reported in [9].

The effect of combinational simplification on several model checking engines has been studied in the past. In [8], SAT sweeping has been shown to positively affect bounded model checking (BMC) [11]. In [12], it is shown that combinational redundancy removal techniques benefit interpolation considerably. The recently introduced ic3 [1] incrementally discovers invariants that hold relative to stepwise reachability information. Designing a model checking flow that involves ic3 requires understanding how combinational simplification algorithms affect it.

This paper makes the following contributions:
• We carry out a comparative study of the different sweeping methods.
• We propose a BDD-based cut sweeping method that is more effective than the original cut sweeping.
• We propose a combined sweeping approach in which more than one sweeping method is applied. We show that the combined approach can achieve more simplification than any of the methods can achieve individually.
• We perform an empirical study of the effect of sweeping on ic3.

The rest of the paper is organized as follows. Section II contains preliminaries. In Section III, we introduce the BDD-based cut sweeping method. In Section IV, we explain the rationale behind the combined sweeping approach. In Section V we present the experimental results, and in Section VI we conclude.
II. PRELIMINARIES

A. AND-Inverter Graphs

The input and output of our sweeping algorithms are AND-inverter graphs (AIGs). An AIG is a directed acyclic graph that has four types of nodes: source nodes, sink nodes, internal nodes, and the constant TRUE node. A primary input (PI) is a source node in an AIG. A primary output (PO) is a sink node and has exactly one predecessor. The internal nodes represent 2-input AND gates. A node v is a fanin of v′ if there is an edge (v, v′); it is a fanout of v′ if there is an edge (v′, v). Left(v) and Right(v) refer to the left and right predecessors of v. Fanin(v) and Fanout(v) denote the fanins and fanouts of node v. An edge in an AIG may have the INVERTED attribute to model an inverter. The Boolean function of a PI is a unique Boolean variable. For an edge, it is the function of its source node if the edge does not have the INVERTED attribute, and the complement otherwise. For an internal node, it is the conjunction of the functions of its incoming edges. The Boolean function of a PO is that of its incoming edge.

A path from node a to b is a sequence of nodes ⟨v0, v1, v2, ..., vn⟩ such that v0 = a, vn = b, and vi ∈ Fanin(vi+1) for 0 ≤ i < n. The height of a node v, h(v), is the length of the longest path from a PI to v. The fanin (fanout) cone of node v is the set of nodes that have a path to (from) v. Two nodes are functionally equivalent (complementary) if they represent the same (complementary) Boolean function(s). Functionally equivalent (complementary) nodes can be merged by transferring the fanouts of the redundant nodes to their representative. To simplify our presentation, in what follows we deliberately ignore detection of complementary functions.
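As an illustration of these definitions, the following minimal Python sketch (not taken from the paper; all names are illustrative) represents AIG nodes with optionally inverted edges, computes the node height h(v), and merges a redundant node into its representative by transferring fanouts.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)                     # identity-based hashing for graph nodes
class AigNode:
    """Illustrative AIG node: a PI has no predecessors; an AND node has two."""
    nid: int
    left: Optional["AigNode"] = None     # Left(v); None for a PI
    right: Optional["AigNode"] = None    # Right(v); None for a PI
    left_inv: bool = False               # INVERTED attribute of the left edge
    right_inv: bool = False              # INVERTED attribute of the right edge
    fanouts: List["AigNode"] = field(default_factory=list)

    def is_pi(self) -> bool:
        return self.left is None and self.right is None

def height(v: AigNode, memo: Optional[dict] = None) -> int:
    """h(v): length of the longest path from a PI to v (0 for a PI)."""
    if memo is None:
        memo = {}
    if v.nid not in memo:
        memo[v.nid] = 0 if v.is_pi() else 1 + max(height(v.left, memo),
                                                  height(v.right, memo))
    return memo[v.nid]

def merge(redundant: AigNode, representative: AigNode) -> None:
    """Merge functionally equivalent nodes: transfer fanouts to the representative."""
    for f in redundant.fanouts:
        if f.left is redundant:
            f.left = representative
        if f.right is redundant:
            f.right = representative
        representative.fanouts.append(f)
    redundant.fanouts.clear()
```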
B. BDD Sweeping

BDD sweeping identifies two nodes as equivalent if they have the same BDD representation. The original algorithm of Kuehlmann and Krohm [7] works in two stages: equivalence checking and false negative detection. In the first stage, it builds BDDs for each node and merges nodes having the same BDD. The algorithm introduces an auxiliary BDD variable (cut point) for each node that has been merged, which potentially leads to false negatives. In the second stage, it takes the BDDs of the output functions and, for each of them, replaces the auxiliary variables with their driving functions. The algorithm is complete in the sense that it can find all the equivalences in the circuit given a sufficiently large limit on the BDD sizes. However, a large limit often hurts efficiency. In this paper, we intend to use BDD sweeping in conjunction with SAT sweeping, which is complete and avoids inherent BDD inefficiencies [8]. For that, we employ a version of BDD sweeping that is incomplete yet faster than the original.

The algorithm we adopt iterates the following steps on each node v of the AIG in some topological order. It builds a BDD for v and checks if there exists another node that has the same BDD. If so, it merges these two nodes and continues. Otherwise, if the BDD size exceeds a given threshold, the algorithm introduces an auxiliary BDD variable for v to be used in place of the original BDD when dealing with the fanouts of v. An important point is that the original BDD is kept for equivalence checking, but is not used to produce new BDDs. The algorithm is complete if the threshold is so large that no auxiliary variable is introduced. In practice, however, this can be prohibitive.
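The incomplete, one-pass variant described above can be summarized by the following sketch. It is an illustration under assumed interfaces, not the paper's implementation: mgr stands for a hypothetical BDD manager with canonical, hashable BDD handles (the actual experiments use CUDD), and merge is a fanout-transfer helper such as the one in the AIG sketch above.

```python
def bdd_sweep(nodes_in_topo_order, mgr, merge, size_threshold=250):
    """Incomplete one-pass BDD sweeping (sketch).

    mgr is an assumed BDD manager exposing new_var(), negate(f), conjoin(f, g)
    and size(f); merge(a, b) transfers the fanouts of a to b.  The threshold of
    250 nodes matches the value used in the paper's experiments.
    """
    representative = {}   # BDD handle -> AIG node that introduced it
    used_bdd = {}         # node id -> BDD (or auxiliary variable) seen by fanouts
    for v in nodes_in_topo_order:
        if v.is_pi():
            f = mgr.new_var()
        else:
            l, r = used_bdd[v.left.nid], used_bdd[v.right.nid]
            if v.left_inv:
                l = mgr.negate(l)
            if v.right_inv:
                r = mgr.negate(r)
            f = mgr.conjoin(l, r)
        rep = representative.get(f)
        if rep is not None:
            merge(v, rep)                        # same canonical BDD => equivalent
            used_bdd[v.nid] = used_bdd[rep.nid]  # fanouts of v follow the representative
            continue
        representative[f] = v
        if mgr.size(f) > size_threshold:
            # Keep f for equivalence checking, but introduce a cut point: the
            # fanouts of v see a fresh auxiliary variable instead of f.
            used_bdd[v.nid] = mgr.new_var()
        else:
            used_bdd[v.nid] = f
```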
C. SAT Sweeping

Advancements in SAT solver technology over the past two decades have led to a proliferation of SAT-based reasoning methods. SAT sweeping is one such method, proposed by Kuehlmann [8] for combinational redundancy identification. SAT sweeping queries the SAT solver to prove or refute the equivalence of two nodes in the circuit.

The basic SAT sweeping algorithm works as follows. First, the circuit is simulated with random inputs, and candidate equivalence classes are formed, where nodes having the same simulation values are placed together. Next, for each two nodes belonging to the same class, the SAT solver is queried for their equivalence. If the SAT solver reports that they are equivalent, one of them is merged into the other. Otherwise, the counterexample provided is simulated to break this equivalence class, and possibly rule out other candidate equivalences as well. This process is repeated until a resource limit is reached, or until all classes become singletons, indicating the absence of further equivalences.

In our implementation of SAT sweeping, several heuristics were applied. We mention each of them briefly.

1) Simulating Distance-1 Vectors: This heuristic was proposed in [13]. Instead of just simulating the counterexample to equivalence derived by the SAT solver, all distance-1 vectors, which have a single bit flipped, are simulated as well. Only the bits corresponding to the inputs that are in the fanin cone of the two nodes being checked for equivalence are flipped. We have found this heuristic to be very effective in practice. In [13], this process is repeated for vectors that were successful in breaking up equivalence classes until convergence. In our implementation, we only simulate the distance-1 vectors for the original counterexample: for the benchmark suite we experimented with, recurring on successful vectors is too expensive for the number of refinements it achieves.

2) Clustering: Simulating distance-1 vectors often results in considerable refinement of the equivalence classes. This is desirable, since an equivalence class is often broken up more cheaply by simulation than by the SAT solver. Moreover, we have observed that with distance-1 vector simulation, it becomes very likely that nodes remaining in an equivalence class are indeed equivalent. Therefore, rather than checking the equivalence of two nodes at a time, we check the equivalence of all nodes in an equivalence class using a single SAT query. If they are all indeed equivalent, we establish that with a single SAT query rather than n − 1 queries, where n is the number of nodes in the class.

3) Height-Based Merging: When two nodes are proved equivalent, we merge the node with a larger height into the one with a smaller height, instead of merging based on a topological order as in [13]. The intuition is that a node having a larger height often has a larger fanin cone, which suggests that merging it would lead to a larger reduction. Nodes coming later in a topological order do not necessarily have a larger height than nodes coming earlier.
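The following sketch illustrates the basic SAT sweeping loop with distance-1 vector refinement. The interfaces are assumptions for illustration rather than the authors' code: simulate(vector) returns a map from node id to the simulated value, prove_equivalent(a, b) wraps the SAT query and returns a counterexample vector on failure, merge transfers fanouts, and resource_ok models the resource limit. The clustering and height-based merging refinements are omitted for brevity, and flips are applied to all inputs rather than only those in the relevant fanin cones.

```python
import random
from collections import defaultdict

def distance1_vectors(cex, flippable_bits):
    """Yield the counterexample itself and every vector with one bit flipped."""
    yield list(cex)
    for i in flippable_bits:
        flipped = list(cex)
        flipped[i] ^= 1
        yield flipped

def sat_sweep(nodes, num_inputs, simulate, prove_equivalent, merge, resource_ok):
    """Basic SAT sweeping (sketch): random simulation proposes equivalence
    classes, SAT queries confirm or refute them, and counterexamples (plus
    their distance-1 neighbours) refine the remaining classes."""
    # 1. Candidate classes: nodes with identical simulation signatures.
    signature = defaultdict(int)
    for _ in range(64):
        values = simulate([random.getrandbits(1) for _ in range(num_inputs)])
        for nid, bit in values.items():
            signature[nid] = (signature[nid] << 1) | bit
    classes = defaultdict(list)
    for v in nodes:
        classes[signature[v.nid]].append(v)
    worklist = [cls for cls in classes.values() if len(cls) > 1]

    # 2. Resolve one candidate pair per iteration with the SAT solver.
    while worklist and resource_ok():
        cls = worklist.pop()
        if len(cls) < 2:
            continue
        rep, cand = cls[0], cls[1]
        equal, cex = prove_equivalent(rep, cand)
        if equal:
            merge(cand, rep)                        # cand disappears from the AIG
            worklist.append([rep] + cls[2:])
            continue
        # Refutation: simulate the counterexample and its distance-1 neighbours
        # (the paper flips only inputs in the fanin cones of rep and cand).
        for vec in distance1_vectors(cex, range(num_inputs)):
            values = simulate(vec)
            same = [v for v in cls if values[v.nid] == values[rep.nid]]
            diff = [v for v in cls if values[v.nid] != values[rep.nid]]
            if diff:
                cls = same
                if len(diff) > 1:
                    worklist.append(diff)           # refined class, re-examined later
        if len(cls) > 1:
            worklist.append(cls)
```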
D. Cut Sweeping

Cut sweeping [9] is a fast yet incomplete approach to combinational equivalence checking. It iteratively computes cuts for each AIG node and compares the functions associated with the cuts.

A cut is defined with respect to an AIG node, called the root. A cut C(v) of root v is a set of nodes, called leaves, such that any path from a PI to v intersects C(v). A cut-set Φ(v) consists of several cuts of v. For cut-sets Φ(v1) and Φ(v2), the merge operator ⊗ is defined as

Φ(v1) ⊗ Φ(v2) = {C1 ∪ C2 | C1 ∈ Φ(v1), C2 ∈ Φ(v2)}.   (1)

Assume k ≥ 1. A k-feasible cut is a cut that contains at most k leaves. A k-feasible cut-set is a cut-set that contains only k-feasible cuts. The k-merge operator, ⊗k, creates only k-feasible cuts. Cut enumeration recursively computes all k-feasible cuts for an AIG. It computes the k-feasible cut-set for a node v as follows:

Φ(v) = {{v}}                                       if v ∈ PI,
Φ(v) = {{v}} ∪ (Φ(Left(v)) ⊗k Φ(Right(v)))         if v ∈ AND,   (2)

where PI and AND refer to the sets of PIs and 2-input AND gates, respectively. Note that cuts are not built for POs because they are never merged.

The function of a cut is the Boolean function of the root in terms of the leaves. It can be represented in different ways, for instance, using bit vectors or BDDs. Two cuts are equivalent if their cut functions are equal. Hence, two nodes are functionally equivalent if their cut-sets contain equivalent cuts.

Cut sweeping is parametrized by k and N, the maximum cut size and the maximum cut-set size, respectively. For each node v in some topological order of the AIG, the algorithm builds a k-feasible cut-set Φ(v). Each cut in Φ(v) is associated with a truth table. Next, it searches for a node equivalent to v by looking for a cut equivalent to some cut in Φ(v). If it succeeds, the two nodes are merged. Otherwise, a heuristic is applied to prune Φ(v) to at most N cuts. After pruning, the algorithm stores Φ(v) as the cut-set of v and builds a cross-reference between each of its cuts and v.

The heuristic for pruning, which we call the quality heuristic, computes a value q for each cut:

q(C) = Σ_{n∈C} 1 / |Fanout(n)|.   (3)

The cuts with the smallest values of q are kept. The intuition of the quality heuristic is two-fold. First, it tries to localize the equivalence and thus favors smaller cuts. Second, it normalizes cuts by attempting to express them with the same set of variables. The chosen variables are those that have large fanouts, i.e., that are shared by many other nodes.

A good truth-table implementation is critical to the performance of cut sweeping. In [9], truth tables are implemented as bit vectors. An advantage of bit vectors is the constant-time cost of Boolean operations. On the other hand, bit interleaving is required to extend the bit vectors to the same length so that the corresponding bits represent the same minterm¹.

¹A good reference on bit interleaving can be found at http://graphics.stanford.edu/~seander/bithacks.html.
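A compact sketch of cut enumeration with the k-merge operator and quality-based pruning (Eqs. (1)–(3)) is shown below. It is illustrative rather than the algorithm of [9]: cut_function(root, cut) is assumed to return a canonical representation of the root's function in terms of the cut's leaves (e.g., a bit-interleaved truth table or a BDD), and fanout_count maps a node id to |Fanout(n)|.

```python
from itertools import product

def quality(cut, fanout_count):
    """Quality heuristic of Eq. (3): q(C) = sum over leaves n of 1/|Fanout(n)|."""
    return sum(1.0 / fanout_count[n] for n in cut)

def cut_sweep(nodes_in_topo_order, k, N, cut_function, fanout_count):
    """Cut sweeping (sketch): enumerate k-feasible cuts bottom-up, detect
    equivalent nodes through equivalent cuts, prune each cut-set to N cuts."""
    cut_sets = {}        # node id -> list of cuts (frozensets of leaf ids)
    owner = {}           # canonical cut function -> node that stored it
    merges = []          # (redundant node, representative) pairs found
    for v in nodes_in_topo_order:
        if v.is_pi():
            cuts = [frozenset([v.nid])]
        else:
            # Eq. (2): trivial cut plus the k-merge of the predecessors' cut-sets.
            merged = {c1 | c2
                      for c1, c2 in product(cut_sets[v.left.nid],
                                            cut_sets[v.right.nid])
                      if len(c1 | c2) <= k}
            cuts = [frozenset([v.nid])] + sorted(merged, key=len)
        # Cross-reference lookup: a stored cut with the same function means
        # that some earlier node is functionally equivalent to v.
        found = None
        for c in cuts:
            f = cut_function(v, c)
            if f in owner:
                found = owner[f]
                break
        if found is not None:
            merges.append((v, found))
            cut_sets[v.nid] = cut_sets[found.nid]   # stands in for the actual merge
            continue
        # Prune to the N cuts with the smallest quality value and store them.
        cuts = sorted(cuts, key=lambda c: quality(c, fanout_count))[:N]
        cut_sets[v.nid] = cuts
        for c in cuts:
            owner.setdefault(cut_function(v, c), v)
    return merges
```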
III. BDD-BASED CUT SWEEPING

Representing functions having a small number of inputs using bit vectors is very efficient. However, the number of bits required grows exponentially with the number of variables, which can easily lead to memory blow-up. As an alternative, BDDs, which are more suitable for large functions, can also be used to represent cut functions. Furthermore, the strong canonicity of BDDs makes it trivial to check for equivalence. The use of BDDs also enables a heuristic which we describe below.

The proposed algorithm differs from the original one in two aspects. First, we introduce a new parameter s, the maximum size of a BDD, to replace k. That is, instead of k-feasible cuts, we keep cuts whose functions contain at most s BDD nodes. Node count, as opposed to number of inputs, is a more natural characterization of BDD size.

The second difference comes from the pruning heuristic. We define the height h of a cut C as the average height of its nodes:

h(C) = (Σ_{v∈C} h(v)) / |C|.   (4)

A smaller h indicates that the leaves in the cut are closer to the PIs. The height heuristic keeps at most N cuts, choosing the ones with the smallest values of h.

Fig. 1. Two implementations of a 4-input AND gate.

A motivating example for the new heuristic is in Figure 1, which shows two different 4-input AND gates. Nodes a, b, c, and d are PIs. Nodes p, q, r, and s are internal nodes. Nodes f and g can only be merged if their cut-sets both contain {a, b, c, d}. However, if the internal nodes have many more fanouts than the PIs, the quality heuristic may select cuts containing the internal nodes instead, causing the merge to be missed.

As mentioned before, the quality heuristic tries to normalize the cuts on certain "attractors." This reduces the possibility that equivalent functions are represented differently. However, this might also lead to the loss of the opportunity to find equivalences that cannot be expressed by those "attractors," as in Figure 1.

On the other hand, the height heuristic tries to push the cut boundary as far as possible. A supporting argument is that, if a node is employed in equivalent cuts, then replacing it with its predecessors preserves equivalence. Furthermore, new merges that are otherwise undiscoverable (consider other equivalences that require a and b in the above example) may be found. The height heuristic does not attract cuts to certain nodes, which may result in different cuts for equivalent nodes. As shown in the experiments, the effectiveness of the height heuristic decreases as the height of nodes increases.

The two heuristics have their own strengths and weaknesses. A natural question is whether it is possible to combine them to benefit from their individual strengths. We can choose a few cuts with each heuristic. This may lead to more merges but may also worsen the efficiency if it significantly increases the number of cuts. To prevent such an increase, a combined heuristic only records height cuts for the lower nodes, while it keeps both types of cuts for the others.

There is some connection between cut sweeping with each of the two heuristics and BDD sweeping. With the height heuristic, cut sweeping tries to build cuts as large as possible, as BDD sweeping does. However, BDD sweeping can store cuts that exceed the threshold, while cut sweeping only keeps those below the threshold. The quality heuristic tries to attract cuts on certain nodes, which is similar to the placement of auxiliary variables in BDD sweeping. Nevertheless, the number of "attractors" in the quality heuristic tends to be much larger than in BDD sweeping.
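The two pruning heuristics of the BDD-based variant can be sketched as follows. The cuts passed in are assumed to have already been filtered by the BDD-size limit s; the "combined-2" parameterization (height threshold of 350, one height cut below it, one height cut plus two quality cuts above it) anticipates the description in Section V-A and is otherwise an assumption made for illustration. quality_fn stands for Eq. (3).

```python
def cut_height(cut, node_height):
    """Height heuristic of Eq. (4): h(C) = average height of the leaves of C."""
    return sum(node_height[n] for n in cut) / len(cut)

def prune_combined(cuts, root_height, node_height, quality_fn, height_threshold=350):
    """'combined-2'-style pruning (sketch).  Below the node-height threshold a
    single height cut is kept; above it, one height cut plus two quality cuts.
    The threshold value of 350 is the one reported in the paper's experiments."""
    by_height = sorted(cuts, key=lambda c: cut_height(c, node_height))
    keep = by_height[:1]                       # one height cut in every case
    if root_height >= height_threshold:
        by_quality = sorted(cuts, key=quality_fn)
        keep += [c for c in by_quality if c not in keep][:2]
    return keep
```

Using the height heuristic alone corresponds to keeping only by_height[:N]; the pure quality heuristic keeps the N cuts with the smallest quality_fn value instead.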
IV. COMBINING SWEEPING METHODS

The idea of combining several simplification algorithms is not new. Many existing model checkers iterate several simplification algorithms before the problem is passed to the model checking engines. However, we are unaware of any studies that have been carried out to identify the best way they could be combined. In this section we attempt to give general guidelines to which a combined approach should adhere. We support our claims with empirical evidence collected in the experiments reported in Section V.

The problem we address is as follows: given a time budget for combinational simplification, how should it be allotted to the different algorithms? The sweeping algorithms discussed in previous sections vary in their completeness and speed, with cut sweeping being the most incomplete yet the fastest of the three methods, SAT sweeping being complete yet the slowest, and BDD sweeping lying in between, both in terms of completeness and speed.

Possible solutions include allocating the whole time budget to a single algorithm, or dividing it among two or more algorithms. The fact that some methods are better at approaching certain problems than others suggests that more than one method should be applied. If two or more methods are to be applied in sequence, intuition suggests that the lower-effort methods should precede the higher-effort ones. The advantages of doing so are two-fold. First, although the higher-effort methods are likely to discover the merges that a lower-effort method discovers, in general it will take them more time to do so. Second, preceding higher-effort methods by lower-effort methods is beneficial in having them present a smaller problem that is easier to handle.

Finally, the percentages of the total time that should be allotted to each of the methods to yield the maximum possible reduction are studied in Section V.
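The scheduling policy argued for here, and evaluated in Section V-B, can be summarized by a small driver such as the following sketch (the sweep-function signatures and names are assumptions): methods run in order of increasing effort, and time left unused by an earlier method is carried forward so that the final, complete SAT sweeping pass receives whatever remains.

```python
import time

def combined_sweep(aig, budgets, sweeps):
    """Run sweeping methods in order of increasing effort within a total budget.

    budgets maps a method name to its allotment in seconds, e.g.
    {"cut": 4, "bdd": 5, "sat": 1}; sweeps maps the same names to functions
    sweep(aig, seconds) that simplify the AIG in place and return once done or
    once their allotment is exhausted.  Unused time is carried forward, so the
    final (complete) SAT sweeping pass receives whatever remains.
    """
    order = ["cut", "bdd", "sat"]          # increasing effort and completeness
    carry = 0.0
    for name in order:
        allotted = budgets.get(name, 0) + carry
        if allotted <= 0:
            carry = allotted
            continue
        start = time.monotonic()
        sweeps[name](aig, allotted)
        carry = allotted - (time.monotonic() - start)
    return aig
```

Under these assumed interfaces, calling the driver with budgets {"cut": 4, "bdd": 5, "sat": 1} corresponds to the best-performing schedule reported in Table II below.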
V. RESULTS

The experiments are divided into three parts. The first part compares different variations of the cut sweeping algorithm. The second part shows the results of the combined sweeping methods, and the third part is concerned with the effect of sweeping on ic3².

²Detailed results for the first and second parts can be found at http://eces.colorado.edu/~hassanz/sweeping-study.

We use the benchmark set from HWMCC'10 [14], a set of 818 benchmarks in total. The experiments are run on a machine with a 2.8GHz Quad-Core CPU and 9GB of memory. We use CUDD [15] as our BDD package for all the BDD-related algorithms.

A. BDD-based Cut Sweeping

Variations of cut sweeping are applied to the HWMCC'10 benchmark set. The differences between variations are two-fold. First, either the number of variables, k, or the number of BDD nodes, s, is used to drop over-sized cuts. Second, we experiment with several heuristics for pruning cut-sets: the quality heuristic, the height heuristic, and two combined heuristics. The naive combined heuristic ("combined-1") chooses one cut based on the height heuristic and the others based on the quality heuristic. The other heuristic ("combined-2") sets a threshold on the node height (350 in our experiments). For nodes that are below the threshold, it only keeps a height cut. For higher nodes, it produces cut-sets consisting of a height cut and two quality cuts. We denote a method by a k or an s followed by the heuristic name. All the variations use BDDs to represent the cut functions.

The results are shown in Table I; they are aggregated over the 818 benchmarks. Based on experiments, both the threshold of BDD sweeping and s in BDD-based cut sweeping are set to 250. The total number of AIG nodes before sweeping is 7.22M. "Final" is the size of the AIGs after sweeping. "Generated" and "Kept" are the numbers of cuts generated and kept by the corresponding methods. For an individual benchmark, its "height" is the average height of all merged cuts. The "Height" column is computed by taking the average of the "height" of all the benchmarks. A smaller value indicates that more merges are found by cuts that are close to the PIs. Note that since we use BDDs, our efficiency results may not be as good as those of the bit-vector-based methods in [9]; for those methods, we therefore compare only effectiveness.

TABLE I
RESULTS OF CUT SWEEPING, BDD SWEEPING, AND SAT SWEEPING ON THE HWMCC'10 SET. BY DEFAULT, k = 8 AND s = 250.

Method             | Final (×10⁶) | Merges (×10⁵) | Time (s) | Generated (×10⁷) | Kept (×10⁷) | Height
k-quality N = 5    | 6.82         | 2.62          | 123.26   | 5.83             | 1.92        | 10.20
k-combined-1 N = 5 | 6.75         | 3.14          | 129.64   | 5.84             | 1.90        | 8.57
s-quality N = 5    | 6.71         | 3.31          | 536.75   | 7.63             | 2.32        | 12.11
s-height N = 1     | 6.55         | 4.07          | 58.99    | 1.07             | 0.54        | 3.19
s-height N = 2     | 6.51         | 4.20          | 181.52   | 2.18             | 0.99        | 2.94
s-combined-1 N = 2 | 6.48         | 4.42          | 181.21   | 2.29             | 1.02        | 12.86
s-combined-2       | 6.52         | 4.28          | 74.64    | 1.10             | 0.54        | 12.72
BDD Sweeping       | 6.34         | 5.61          | 112.74   | –                | –           | –
SAT Sweeping       | 6.10         | 6.37          | 2149.4   | –                | –           | –

Results indicate that the resulting AIGs are consistently smaller with s than with k. There are a few interesting observations. First, the ratios GeneratedCuts/Merges and KeptCuts/Merges are improved significantly with s. This means that with s, each cut has a larger chance of resulting in a merge. Second, while "k-quality" and "k-combined-1" have very close sweeping times, the latter achieves 19.8% more merges. Furthermore, the decrease in the "Height" column reveals that the height cuts indeed lead to merges. Although "s-quality" is more effective than the two above methods, it is less efficient due to the larger cut sizes.

For the methods with s (excluding "s-quality"), we observe that "s-height N = 1" is the fastest and produces a good number of merges. Increasing the number of height cuts to two triples the run time without gaining many more merges. Comparing "s-height N = 2" with "s-combined-1", the latter achieves more merges. This indicates that maintaining one height cut and one quality cut works better than two height cuts. For "s-combined-2", the number of merges falls between the two methods above, but with a lower run time. Furthermore, its numbers of generated and kept cuts are even comparable to "s-height N = 1". That is, even though we keep three cuts for those nodes with height larger than 350, on average we compute only a few percent more cuts than we do in the case of one cut per node.

The "Height" values of the three methods confirm the assumption made in Section III: most merges produced by the height heuristic come from cuts close to the PIs. When the two heuristics are combined, a significant increase in the "Height" value is observed. In Figure 2, we show the number of merges found by "s-height N = 1" and "s-combined-2" on nodes within different height ranges. The plot is normalized to "k-quality" and has a bin size of 50, i.e., a point at (2, 1) indicates that the method finds the same number of merges as "k-quality" for nodes with height from 100 to 149. The height heuristic clearly works better on smaller-height nodes, while the quality heuristic catches more on larger-height ones. The combined heuristic takes advantage of both and produces an even better profile on nodes with larger heights. Note that although the height heuristic works worse on the nodes with larger heights, it can still get more merges. This may be due to the fact that in this benchmark set, a large percentage of equivalences are located at lower heights.

Fig. 2. Number of merges on nodes within different height ranges (log-scale ratio over the quality heuristic versus node height/50; curves: height heuristic and combined heuristic).

In our setup, cut sweeping is intended for use as a fast method. Thus we consider "s-height N = 1" and "s-combined-2" to be the best variants. Compared to BDD sweeping, those two variants are faster because they create fewer BDD nodes than BDD sweeping, but are less effective since BDD sweeping may keep BDDs that exceed the threshold.
B. Combined Sweeping Methods

In this section, we show experimental evidence that supports the guidelines for combining sweeping methods presented in Section IV. In particular, we try several possibilities of allotting the budget to the different sweeping algorithms with the purpose of identifying empirically whether they should be combined, and if so, in which way. In what follows we use the "s-height N = 1" variant of cut sweeping since it is the fastest, and we simply refer to it as cut sweeping.

In our combined approach, SAT sweeping is always run last since it is the only complete method of the three, and should thus be given whatever time is left to find equivalences not discovered by the other methods. Also, the time not used by any of the preceding methods is passed to SAT sweeping. For instance, if cut sweeping is given a time budget of 4 seconds and only uses 3 of them, SAT sweeping gets to run for one extra second.

We compare the reduction measured in terms of the number of AIG nodes removed, and the total time spent in sweeping. Data are aggregated over the 818 benchmarks. The base case for our comparisons is the pure SAT sweeping case in which SAT sweeping gets the whole budget. The time budget used in our study is 10 seconds. We consider the following policies: (a) allocating the budget to two methods, (b) allocating it to three methods, and (c) allocating the whole budget to SAT sweeping. For (a) and (b), we consider all the different permutations of assigning integer time values to each method, such that they sum up to 10 seconds. Note that if a sweeping algorithm times out, what it has achieved thus far is used in what follows. In all cases, a set of light-weight sequential simplification algorithms is applied before sweeping. This set of algorithms includes cone-of-influence reduction, stuck-at-constant latch detection, and equivalent latch detection. The total number of AIG nodes for all 818 benchmarks measured after the sequential simplification step is 6.1M.

Results are presented in Table II. The first column lists the methods, where the number before each sweeping method indicates the number of seconds given to it. The second column shows the number of AIG nodes removed. The third column shows the total time spent in sweeping. The methods are listed in order of decreasing reduction. The last row is for pure SAT sweeping. We only show the best three setups in terms of reduction for each of the possible orders of the method sequences.

TABLE II
EFFECT OF BUDGET ALLOCATION ON REDUCTION.

Method              | Reduction | Total Sweeping Time (s)
4 Cut, 5 BDD, 1 SAT | 801,932   | 518
2 Cut, 5 BDD, 3 SAT | 801,137   | 516
6 Cut, 3 BDD, 1 SAT | 801,119   | 522
4 BDD, 1 Cut, 5 SAT | 794,052   | 517
8 BDD, 1 Cut, 1 SAT | 793,921   | 515
7 BDD, 2 Cut, 1 SAT | 793,814   | 519
8 BDD, 2 SAT        | 793,226   | 500
7 BDD, 3 SAT        | 793,068   | 503
5 BDD, 5 SAT        | 792,797   | 508
1 Cut, 9 SAT        | 772,563   | 512
6 Cut, 4 SAT        | 771,070   | 513
3 Cut, 7 SAT        | 769,483   | 511
10 SAT              | 736,594   | 619

Several observations can be made. First, when it comes to running two methods in sequence, BDD sweeping combined with SAT sweeping outperforms cut sweeping combined with SAT sweeping. The method that achieves maximum reduction (8 seconds of BDD sweeping followed by 2 seconds of SAT sweeping) removes 56K more nodes than pure SAT sweeping (7.7% more reduction). Second, more reduction is achievable by running three methods in sequence. As suggested in Section IV, ordering the methods by increasing effort (or equivalently by increasing degree of completeness) achieves more reduction than otherwise. Here, the best method (4 seconds, 5 seconds, and 1 second for cut, BDD, and SAT sweeping, respectively) has an edge of 65K nodes over pure SAT sweeping (about 8.9% more reduction). Third, in terms of sweeping time, it is clear that a large drop occurs (> 100 seconds) when two or three methods are combined versus pure SAT sweeping, which is due to the often smaller time needed by BDD and cut sweeping to discover equivalences than SAT sweeping. Given an overall model checking budget, smaller sweeping time allows more time for the model checking engine, which is desirable.

The question of whether such a difference has a considerable effect on the performance of the model checking engine is answered in the next section.
C. Effect on ic3

The recently developed model checking algorithm ic3 [1] has been regarded as the best standalone model checking algorithm developed up till now [16]. As the interaction of combinational simplification methods with different model checking algorithms has been studied in the past, we here aim to study how they interact with ic3. In particular, we would like to empirically find out whether ic3 benefits from preprocessing the netlist with a simplification algorithm, and if it does, how sensitive it is to the magnitude of the reductions achieved through simplification.

In the first experiment, we compare two runs of ic3, one that is preceded by SAT sweeping, and one that is not. The experimental setup is as follows. A total timeout of 10 minutes is used. The budget for SAT sweeping is 10 seconds. The light-weight sequential simplification algorithms referred to in Section V-B are applied once in the no-sweeping case, and twice (before and after sweeping) in the sweeping case. We compare the number of solves and the aggregate runtime among all benchmarks.

It is important to note that the ic3 algorithm has a random flavor. In particular, the order in which generalization (dropping literals) is attempted is randomized. Also, since the algorithm is SAT-based, randomization occurs in the SAT solver decisions. To have reliable experimental results, each experiment is repeated with 10 different seeds, and the results are averaged over the different seeds.

Results are shown in Table III. The first column shows the seed index, the second and third columns show the number of solves without and with sweeping, and the fourth and fifth columns show the aggregate runtime without and with sweeping.

TABLE III
EFFECT OF SWEEPING ON ic3'S PERFORMANCE.

Seed Index | Solves (No Sweeping) | Solves (With Sweeping) | Runtime (s) (No Sweeping) | Runtime (s) (With Sweeping)
1          | 693   | 698   | 96,297 | 91,762
2          | 689   | 699   | 95,629 | 90,341
3          | 691   | 699   | 95,050 | 92,714
4          | 696   | 697   | 93,691 | 91,141
5          | 693   | 698   | 95,007 | 89,656
6          | 690   | 695   | 96,270 | 91,559
7          | 693   | 699   | 94,784 | 92,056
8          | 690   | 701   | 94,351 | 90,837
9          | 693   | 693   | 95,491 | 92,847
10         | 690   | 693   | 95,124 | 93,048
Average    | 691.8 | 697.2 | 95,169 | 91,596

The results confirm a positive effect of sweeping on the performance of ic3. On average, five more solves are achieved with sweeping, and the aggregate runtime drops by 3.8%.

The enhancement in the performance of ic3 in the presence of sweeping can be attributed to two factors. First, the reduction in the number of gates caused by sweeping can result in a reduction in the SAT solver time. Second, simplification often results in the dropping of latches (e.g., if it merges the next-state functions of two latches). For example, for the benchmark set used in our experiments, sweeping reduces the aggregate number of latches from 279,161 to 269,091 (a 3.6% decrease). This reduces the amount of work done by ic3 in generalization of counterexamples to induction.

We now repeat the previous experiment, but this time we compare the number of solves and the aggregate runtime between pure SAT sweeping and the empirically optimum combined sweeping scheme of Section V-B. The purpose of this experiment is to understand how sensitive ic3 is to the magnitude of reductions.

Results are shown in Table IV, where the second and third columns compare the number of solves for pure SAT sweeping and the optimum sweeping scheme, and the fourth and fifth columns compare the total runtime.

TABLE IV
OPTIMUM SWEEPING VERSUS PURE SAT SWEEPING.

Seed Index | Solves (Pure SAT Sweeping) | Solves (Optimum Sweeping Scheme) | Runtime (s) (Pure SAT Sweeping) | Runtime (s) (Optimum Sweeping Scheme)
1          | 698   | 696   | 91,762 | 91,567
2          | 699   | 696   | 90,341 | 91,113
3          | 699   | 697   | 92,714 | 92,373
4          | 697   | 702   | 91,141 | 90,478
5          | 698   | 697   | 89,656 | 90,327
6          | 695   | 699   | 91,559 | 90,606
7          | 699   | 697   | 92,056 | 91,498
8          | 701   | 699   | 90,837 | 92,228
9          | 693   | 696   | 92,847 | 91,252
10         | 693   | 697   | 93,048 | 92,663
Average    | 697.2 | 697.6 | 91,596 | 91,411

As the results indicate, ic3 does not benefit much from the better reduction achieved by the combined sweeping scheme. The lack of performance enhancement on ic3 can be attributed to the small improvement in reduction that the combined sweeping approach achieves over pure SAT sweeping. In particular, while pure SAT sweeping removes 737K nodes out of the total 6.1M nodes in the 818 benchmarks (12.1% reduction), the combined approach removes 802K nodes (13.2% reduction); a mere 1.1% improvement. This suggests that ic3 has a small sensitivity to the magnitude of reduction.
VI. CONCLUSION

In this paper, we presented an empirical study of the different sweeping methods proposed in the past. We have shown that a combined sweeping approach outperforms any of the sweeping methods alone. We have also proposed a BDD-based cut sweeping method that is more effective than the original cut sweeping. Finally, we have studied the effect of sweeping on the new model checking algorithm, ic3, and investigated the causes of the better performance it experiences with sweeping. The goal of this analysis is to help designers of model checkers to make decisions regarding the incorporation of sweeping methods and to provide a deeper understanding of how sweeping methods interact with ic3.

ACKNOWLEDGEMENTS

We thank Aaron Bradley who motivated this work and contributed to many discussions. We also thank the reviewers for their insightful comments regarding cut sweeping's pruning heuristics that prompted us to try the "combined-2" heuristic.

REFERENCES

[1] A. R. Bradley, "SAT-based model checking without unrolling," in Proc. Int. Conf. on Verification, Model Checking, and Abstract Interpretation, 2011, pp. 70–87.
[2] H. Mony, J. Baumgartner, V. Paruthi, R. Kanzelman, and A. Kuehlmann, "Scalable automated verification via expert-system guided transformations," in Formal Methods in Computer-Aided Design, 2004, pp. 159–173.
[3] R. Brayton and A. Mishchenko, "ABC: An academic industrial-strength verification tool," in Computer Aided Verification, 2010, pp. 24–40.
[4] G. Cabodi, S. Nocco, and S. Quer, "Benchmarking a model checker for algorithmic improvements and tuning for performance," in Proc. Hardware Verification Workshop, 2010.
[5] P. Bjesse and A. Boralv, "DAG-aware circuit compression for formal verification," in Proc. Int. Conf. on Computer-Aided Design, 2004, pp. 42–49.
[6] A. Mishchenko, S. Chatterjee, and R. Brayton, "DAG-aware AIG rewriting: a fresh look at combinational logic synthesis," in Proc. Design Automation Conference, 2006, pp. 532–535.
[7] A. Kuehlmann and F. Krohm, "Equivalence checking using cuts and heaps," in Proc. Design Automation Conference, 1997, pp. 263–268.
[8] A. Kuehlmann, "Dynamic transition relation simplification for bounded property checking," in Proc. Int. Conf. on Computer-Aided Design, 2004, pp. 50–57.
[9] N. Eén, "Cut sweeping," Cadence Design Systems, Tech. Rep., 2007.
[10] A. Mishchenko, S. Chatterjee, and R. Brayton, "FRAIGs: A unifying representation for logic synthesis and verification," EECS Dept., UC Berkeley, ERL, Tech. Rep., Mar. 2005.
[11] A. Biere, A. Cimatti, E. M. Clarke, and Y. Zhu, "Symbolic model checking without BDDs," in TACAS, 1999, pp. 193–207.
[12] G. Cabodi, M. Murciano, S. Nocco, and S. Quer, "Stepping forward with interpolants in unbounded model checking," in Proc. Int. Conf. on Computer-Aided Design, 2006, pp. 772–778.
[13] A. Mishchenko, S. Chatterjee, R. Brayton, and N. Een, "Improvements to combinational equivalence checking," in Proc. Int. Conf. on Computer-Aided Design, 2006, pp. 836–843.
[14] A. Biere and K. Claessen, "Hardware model checking competition," in Hardware Verification Workshop, 2010.
[15] F. Somenzi, CUDD: CU Decision Diagram Package, University of Colorado at Boulder, ftp://vlsi.colorado.edu/pub/.
[16] R. Brayton, N. Een, and A. Mishchenko, "Continued relevance of bit-level verification research," in Proc. Usable Verification, Nov. 2010, pp. 15–16.