LEC: Learning-Driven Data-path Equivalence Checking Jiang Long∗ , Robert K. Brayton∗ , Michael Case† ∗ EECS Department, UC-Berkeley {jlong, brayton}@eecs.berkeley.edu † Calypto Design Systems {mcase}@calypto.com Abstract— equivalence problem is NP-complete. However, on the extreme In the LEC system, we employ a learning-driven approach for end, the complexity becomes O(1) of the size of the network solving combinational data-path equivalence checking problems. if the two designs are structurally the same. An NP-complete The data-path logic is specified using Boolean and word-level operators in VHDL/Verilog. The targeted application area are C- problem can be tackled by using SAT-solvers as a general to-RTL equivalence checking problems found in an industrial set- procedure. To counter the capacity limitation of SAT-solving, ting. These are difficult because of the algebraic transformations it is crucial to reduce the complexity by identifying internal done on the data-path logic for highly optimized implementations. match points and by conducting transformations to bring in Without high level knowledge, existing techniques in bit-level more structural similarity between the two sides of the miter equivalence checking and QF BV SMT solving are unable to solve these problems effectively. It is crucial to reverse engineer logic. such transformations to bring more similarity between the two sides of the logic. However, it is difficult to extract algebraic Verilog operators SMT QF BV operators logic embedded in a cloud of Boolean and word-level arithmetic Boolean &&, k, !, ⊕, mux and, or, not, xor, ite operators. To address this, LEC uses a compositional proof bit-wise &, |, ∼, ⊕, mux bvand, bvor, bvnot, bvxor, bvite methodology and analysis beyond the bit and word level by incor- arithmetic +, −, ∗, /, % bvadd, bvsub, bvmul, bvdiv, bvmod porating algebraic reasoning through polynomial reconstruction. extract [] extract LEC’s open architecture allows new solver techniques to be concat {} concat integrated progressively. It builds sub-model trees, recursively comparator <, >, ≤, ≥ bvugt, bvult, bvuge, bvule transformating the sub-problems to simplify and expose the shifter ,  bvshl, bvshr actual bottleneck arithmetic logic. In addition to rewriting TABLE I rules that normalize the arithmetic operators, LEC supports S UPPORTED OPERATORS ( UNSIGNED ) conditional rewriting, where the application of a rule is dependent on the existence of invariants in the design itself. LEC utilizes A. Motivation both functional and structural information of the data-path logic to recognize and reconstruct algebraic transformations. A The differences between the two data-path logics under case-study illustrates the steps used to extract the arithmetic equivalence checking are introduced by various arithmetic embedded in a data-path design as a linear sum of signed transformations for timing, area and power optimizations. integers, and shows the procedures that collaboratively led to a successful compositional proof. These optimizations are domain specific and can be very spe- cialized towards a particular data-path design and underlying technology. They have the following characteristics: I. I NTRODUCTION • The two sides of the miter logic are architecturally With the increasing popularity of high-level design method- different and have no internal match points. ologies there is renewed interest in data-path equivalence • Many expensive operators such as adders and multipliers checking [3][13][18][20]. In such an application, a design are converted to cheaper but more complex implemen- prototype is first implemented and validated in C/C++, and tations and the order of computations are changed. It is then used as the golden specification. A corresponding Ver- not a scalable solution to rely on SAT solving on the ilog/VHDL design is implemented either manually or au- bit-blasted model. tomatically through high-level synthesis tool [2][4][15]. In • The parts of the transformed portion are embedded in both cases, a miter logic for equivalence checking is formed a cloud of bit and word level operators. Algebraic ex- to prove the correctness of the generated RTL model by traction [8][26] of arithmetic logic based on structural comparing it against the original C/C++ implementation. patterns is generally too restrictive to handle real-world The data-path logic targeted in this paper is specified post-optimization data-path logic. using Verilog/VHDL. The bit and word-level operators in • Word-level rewriting uses local transformation. Without Verilog/VHDL have the same semantic expressiveness as SMT high-level information, local rewriting is not able to make QF BV theory[5]. Table I gives a one-to-one correspondence the two sides of the miter logic structurally more similar. between Verilog and QF BV unsigned operators. Signed arith- Lacking high-level knowledge of the data-path logic, the metic operators are also supported. The complexity of such an equivalence problems can be very difficult for gate-level equiv- alence checking and general QF BV SMT solvers. Strategi- Original miter cally, LEC views the bottleneck of such problems as having new miter target been introduced by high-level optimizations and employs a Verific Parser Frontend collaborative approach to isolate, recognize and reconstruct the high-level transformations to simplify the miter model by VeriABC bringing in more structural similarities. SSA Network B. Contributions Word−level Bit −level Learning−based The LEC system incorporates compositional proof strate- Simulator AIG Transformations gies, uses rewriting to normalize arithmetic operators, and ABC Transformed conducts analysis beyond bit and word level. The collaborating solvers RTL procedures help to expose the actual bottleneck in a proof of equivalence. The novel aspects of this system are: SAT UNSAT 1) It uses global algebraic reasoning through polynomial Fig. 1. Overall tool flow reconstruction. In the case-study, it uses the functional information of the design to reverse engineer the arith- metic expression as a linear sum and also uses a struc- LEC tries to solve the miter directly using random simula- tural skeleton of the original data-path to achieve the tion on the word-level simulator or by ABC[1]’s equivalence equivalence proof. checking procedure dcec, which is a re-implementation of 2) It supports conditional rewriting and proves required iprove[24]. If unresolved, LEC applies transformations to the invariants as pre-conditions. SSA and produces sub-models in Verilog miter format from 3) It uses recursive transformations that target making both which LEC can be recursively applied. The overall system sides of the miter logic structurally more similar and integration is described in Section IV. hence more amenable to bit-level SAT sweeping. 4) It has an open architecture, allowing new solver tech- III. L EARNING TECHNIQUES niques to be integrated progressively. In this section, we present the techniques implemented in Through a case study, we demonstrate the steps that were LEC. Even though some are simple and intuitive, they are used to reconstruct the arithmetic embedded in a data-path powerful when integrated together as demonstrated in the design as a linear sum of signed integers, as well as all the pro- experimental results. All techniques are essential bdecause cedures that compositionally led to a successful equivalence LEC may not achieve a final proof if any one is omitted. proof. The experimental results demonstrate the effectiveness Their interactions are illustrated in the case-study in Section of these collaborating procedures. V. miter miter miter C. Overview The overall tool flow is described in Section II. Learning F=G F=G F=G techniques and system integration are presented in Section III F (̄x ) G(̄x ) F (̄x ) G(̄x ) F (̄x ) G(̄x ) and IV. A case study is presented in Section V. Experimental results is presented in Section VI followed by a comparison red blue red blue with related work and conclusion. purple kept II. T OOL FLOW input : ̄x abstract input x̄' LEC takes Verilog/VHDL as the input language for the data- input : ̄x redundant path logic under comparison. Internally, a miter network, as in Figure 2(a), is constructed comparing combinational logic input : ̄x (a)miter network (b )structurally hashed functions F and G. Figure 1 illustrates the overall tool flow. (c )abstraction First, the Verific RTL parser front-end[6] is used to compile Fig. 2. Miter network input RTL into the Verific Netlist Database. VeriABC[23] processes the Verific netlist, flattens the hierarchy and produces an intermediate DAG representation in static single assignment A. Structural information (SSA) form, consisting of Boolean and word-level operators An SSA netlist is a DAG of bit and word-level operators as shown in Table I. Except for the hierarchical information, annotated with bit-width and sign information. In the tool flow, the SSA is a close-to-verbatim representation of the original both Verific and VeriABC perform simple structural hashing RTL description. From SSA, a bit-blasting procedure generates at the SSA level, merging common sub-expressions. After a corresponding bit-level network as an AIG (And-inverter merging, the miter logic is divided into three colored regions graph). Word-level simulation models can be created at the using cone of influence (COI) relations, as in Figure 2 (b). SSA level. ABC[1] equivalence checking solvers are integrated • Red: if the node is in the COI of F only as external solvers. • Blue: if the node is in the COI of G only removes the logic between the cut and the inputs. An abstract • Purple: the node is in the COI of both sides of the miter model is formed by replacing the cut signals with free inputs i.e. common logic x̄0 . If this abstracted miter is UNSAT, then the original miter The purple region is the portion of the miter logic that is UNSAT. In our current implementation, LEC traverses the has been proved equivalent already, while the red and blue SSA network in topological order from the inputs. As each regions are the unresolved ones. LEC makes progress by node is tentatively replaced with new PIs, simulation is used reducing the red/blue regions and increasing the purple region. to validate the replacement. If successful, the node is omitted The common logic constrains the logic for the red and blue and replaced with the new PIs and the next node is processed regions, which may be abstracted (see Section III-E) to reduce similarly. redundancy and possibly expose the real bottleneck in a proof. A successful abstraction step removes irrelevant logic and exposes a smaller unresolved region of the miter logic, al- lowing LEC to continue using other procedures. In addition, B. Simulation model as seen from experimental results, the reduction of common Two word-level simulators are generated from the SSA logic can reduce significantly the amount of complexity for network. One is a native interpreted model. The other uses downstream SAT-solving, e.g. when common multipliers being the open-source Verilator[29] for compiled simulation. From removed from the miter logic. An unsuccessful abstraction the SSA network, LEC automatically generates C++ code for when the abstract miter becomes SAT, indicates the existence pseudo-random input drivers and for monitoring design be- of a rare event not being captured during random simulations. havior. Verilator compiles the Verilog miter logic, links in the Often, this gives hints for selecting case-splitting candidates. generated C++ code and produces a simulator as a standalone executable. Efficient and effective simulation is crucial in our current flow in capturing potential constants and potential F. Rewriting internal equivalent points at the SSA level. Simulation is also Similar to [20], word-level rewriting transforms an SSA used to reduce common logic in the abstraction computation network into a structurally different but functionally equivalent procedure. one. Through rewriting, certain equivalence checking problems can become much simpler. In our experience, a multiplier is C. Bit-level model often a source of difficulty in data-path equivalence checking. If two multipliers from opposite sides of the miter are matched As shown in Figure 1, an AIG is created from the SSA exactly, LEC can simplify the miter through structural hashing network by bit-blasting. LEC calls ABC[1]’s SAT sweeping and treat them as common logic. This is most effective when procedure dcec to perform direct solving at the bit level. Using combined with the abstraction procedure as the common the AIG model, the native SAT solver embedded in LEC can multiplier can now be totally removed. be used to obtain a formal proof for a particular query. Typical In LEC, a few rules are hard-coded through pattern match- queries are for extracting constant nodes, proving suspected ing applied to the SSA network. The goal is to process equivalent pairs of points or conducting particular learning for multiplications so that they can be matched exactly. This rewriting. Book-keeping information between the SSA nodes rewriting is implementation specific; for illustration purposes, and the AIG nodes allows queries to be constructed at the we list a few rewriting rules in Table II using Verilog notation word-level and verified at the bit-level. The result is then used and the semantics of the operators. to simplify the SSA network. The first rule is the normalization of multiplier operands. If a multiplier uses a partial product generator and a compressor D. Constants and Potential Equivalent Points (PEPs) tree, switching the operands of the multiplication becomes a At the word-level, candidates for constants and PEPs are very hard SAT problem because at the bit level the imple- identified through simulation and SAT queries are posed. Each mentation is not symmetrical. It is almost imperative to apply such SAT query is constructed and checked at the bit-level. this rule whenever possible. The second and third rules use SAT-solving is configured at a low-effort level (run for a few the distributive laws of multiplication and multiplexing. Rules seconds) for these types of queries. Proven constants and PEPs 4 and 5 remove the shift operator  when it is used with are used immediately to simplify the SSA network, leading extract and concat because it is hard for multiplication to to a new sub-model of less complexity. LEC then continues be restructured through the  operator. Rule 6 distributes to process the sub-model. In the presence of unproven PEPs, multiplication through the concat of two bit vectors using LEC can choose one as the next miter target, normally the +. It uses the fact that the concatenation {a, b[n − 1 : 0]} is smallest in terms of the number of nodes in its COI. The proof equivalent to a ∗ 2n + b[n − 1 : 0]. progresses as constants and PEPs are identified and used to The following is a more complex rule that distributes + over simplify the miter model. the extract operator. The right hand side is corrected with a third term, which is the carry bit from adding the lower n bits of a and b. E. Abstraction (a + b)[m : n] = As illustrated in Figure 2 (c), in computing an abstraction, (1) LEC computes a cut in the purple region (common logic), and a[m : n] + b[m : n] + (a[n − 1 : 0] + b[n − 1 : 0])[n] Before After implemented to trace back from the sel port of a mux node 1 a∗b b∗a 2 mux(cond, d0, d1) ∗ c mux(cond, d0 ∗ c, d1 ∗ c) through its Boolean fanins and choose the candidates that have 3 mux(cond, d0, d1)[m : n] mux(cond, d0[m : n], d1[m : n]) the highest controllability. 4 a[m : 0]  n { (m-n)’b0, a[m:n] } Another advantage of case-splitting is that the co-factored 5 (a[m : 0]  n)[m − n : 0] a[m : n] sub-models contain new candidates for constants and PEPs, 6 {a, b[n − 1 : 0]} ∗ c a ∗ c  n + b[n − 1 : 0] ∗ c which lead to other down-stream transformations not possible TABLE II before. Case-splitting also reduces the amount of Boolean R EWRITING RULES logic in the SSA network and exposes the data-path logic to high-level learning such as polynomial construction. Repeatedly applying the above rules, LEC transforms the SSA network and keeps only the ∗ and + operators, enhancing H. Polynomial construction the possibility of multipliers to be matched. Note that the above rule (1) and Rule 4-6 in Table II are correct for Reasoning at the word-level, rewriting rules are based on unsigned operators. Currently, for signed operators, due to the arithmetic properties of the corresponding operators such sign extension and the two’s complement representation of the as the commutative law of integer multiplication. However, operands, we have not implemented a good set of rewriting rewriting applies only local transformations and does not have rules. a global view. In situations when the miter logic is constructed 1) Conditional rewriting: The following equation from arithmetic optimization at the polynomial level, local rewriting is not able to bring similarity into the miter for (a  c) ∗ b = (a ∗ b)  c (2) further simplification. In such a situation, LEC tries to recon- struct the polynomial of the whole miter model to establish reduces the bit-width of a multiplier on the left hand side to equivalence through arithmetic or algebraic equivalences and a smaller one on the right. It is correct if a, b, c are integers then use high level transformations to prove the equivalence but incorrect in Verilog semantics, which uses modulo integer of the original miter. arithmetic. However, if the following is true within the miter As a generic procedure, LEC follows four steps to prove a model in modulo integer semantics miter network F (x̄) = G(x̄) where F and G are the top level ( (a  c)  c) == a (3) signals being compared, and x̄ is the vector of input variables (bit-vectors): then equation (2) is valid. In such a situation, LEC identifies the pattern on the left hand side of (2) in the SSA network 1) Conjecture (possibly by design knowledge) about the and executes a SAT query concerning (3) using the AIG model algebraic domain of the polynomial, e.g. signed vs. through bit-level solvers. The transformation to the left hand unsigned integer, modulo integer arithmetic, the order side of (2) is carried out only if the query is proven to be an of the polynomial etc. These conjectures set up the invariant. Such a transformation may produce an exact match framework and semantics for polynomial reconstruction of a∗b afterwards, which can be crucial for achieving the final as illustrated in the case-study of Section V. proof. 2) Determine a polynomial f and create a logic network F 0 such that the following can be proved formally. G. Case-split F 0 implements f (5) Case-splitting on a binary signal, cofactors the original miter F0 = F (6) model into two sub-models. The miter is proven if both sub- models are proven, or falsified if any sub-model is falsified. Al- How f is constructed is domain and test-case dependent. though exponential in nature, if many signals are chosen, case- In the case-study of Section V, we use simulation splitting can simplify the underlying bit-level SAT solving patterns to probe for the coefficients of a linear function. significantly. For example, it is difficult to prove the following 3) Determine a polynomial g and create a logic network miter structure directly through bit-blasting and SAT solving G0 such that the following can be proved formally. at the AIG level G0 implements g (7) (x + y) ∗ (x + y) == x ∗ x + 2 ∗ x ∗ y + y ∗ y (4) miter 0 G =G (8) where x is a 32-bit integer and y a single binary signal. 4) Establish the following equivalence formally at the al- However, it can be proven easily if case-splitting is done on gebraic level. y = 0 and y = 1. After constant propagation, the bit-level solver can prove both sub-models easily. f =g (9) The current case-splitting mechanism supports cofactoring on an input bit or input bit-vector. In verifying the test cases The combination of Items 2, 3, and 4 establishes the equiva- experienced so far, the case splits are conducted on a bit, a lence proof of the original miter model F = G. In constructing bit-vector equal to zero or not, or on the lsb or msb of a F 0 and G0 , we try to make them as structurally similar to F bit-vector equals to zero or not. A heuristic procedure can be and G as possible. Details are given in Section V. Sub-model Return IV. S YSTEM I NTEGRATION simulator SAT The above learning techniques are integrated in LEC as a set AIG SAT/UNSAT of logically independent procedures. Each procedure produces simplified SAT/UNSAT one or more sub-models, illustrated as a tree in Figure 3. abstraction UNSAT The root node is the current targeted Verilog miter model. rewrite SAT/UNSAT case-split SAT/UNSAT It has eight children. The simulator and AIG models are linear construction SAT/UNSAT the ones described in Figure 1. The simplif ied sub-model is PEP SIMPLIFY generated by constant propagation and merging proven PEPs. TABLE III The abstraction and rewrite sub-models are created by the S UB MODEL RETURN VALUE abstraction and rewrite procedures in the previous section. The case-split sub-model consists of a set of sub-models, corresponding to the cofactoring variables selected. In the is disjunctive, which includes the root and all the leaf nodes current implementation, the user needs to input the set of in bold font. The case-split and linear construction nodes signals to case-split on; eventually they will be selected by are conjunctive; a P EP node is disjunctive. The semantics, heuristics. The linear-construction node has two sub-models shown in the following tables, are used to resolve the proof which will be explained in detail in Section V. When PEPs result of the parent model from its immediate sub-models. To are identified through simulation, a P EP node is create with complete the calculus, we introduced two values: CON and the set of unproven-PEPs as sub-models. BOT, where CON stands for an internal conflict indicating a potential LEC software bug and BOT is the bottom of the Verilog Miter Model simulator value lattice and acts like an uninitialized value. AIG simplified k SAT UNS UNK SMP CON BOT abstraction SAT SAT CON SAT SAT CON SAT rewrite UNS CON UNS UNS UNS CON UNS case-split case 0 UNK SAT UNS UNK SMP CON UNK case ... 1 SMP SAT UNS SMP SMP CON SMP case n CON CON CON CON CON CON CON linear-construction case F BOT SAT UNS UNK SMP CON BOT case G PEP pep 0 TABLE IV pep 1 D ISJUNCTIONS OF MODELS ... pep m & SAT UNS UNK SMP CON BOT SAT SAT SAT SAT n/a CON SAT LEC UNS SAT UNS UNK n/a CON UNS Fig. 3. Branching sub-model tree UNK SAT UNK UNK n/a CON UNK SMP n/a n/a n/a n/a n/a n/a Two nodes in the sub-model tree are terminal. One is the CON CON CON CON n/a CON CON simulator model which can falsify the miter through random BOT SAT UNS UNK n/a CON BOT simulation. The other is the AIG model where ABC’s bit-level TABLE V dcec procedure is applied. The rest of the leaf models (in bold C ONJUNCTION OF MODELS font) are generated as Verilog miter models, which have the same format as the root node. LEC procedures can be applied Tables IV and V are the truth tables for the disjunction and recursively to these leaf nodes to extend the sub-model trees to conjunction semantics of the return values, in which UNS, simpler ones. The LEC proof process progresses by expanding UNK, SMP stand for UNSAT, UNKNOWN, and SIMPLIFY. the sub-model tree. A sub-model is resolved as SAT or UNSAT Assuming a bug free situation, at a disjunctive node, if either from its sub-models’ proof results. SAT or UNSAT is returned from a sub-model, this is the final Since there are no logical dependencies between sibling proof result for the parent. In conjunction, the parent must wait sub-models, any branch can be chosen to continue the proof until all sub-models are resolved as UNSAT before deciding process. Sibling sub-models can be forked in parallel from that its result is UNSAT, while any SAT sub-model implies a parent process. A node in the sub-model tree determines the current model is SAT. A P EP node returns SIMPLIFY its proof result from its children. Table III gives the possible to its parent if one of its sub-models, say pepi , is proven return values from the first level sub-models. SIMPLIFY is UNSAT. In this case, the parent model can apply another round returned by a P EP node to its parent model when at least of simplification to obtain a new simplif ied sub-model by one of its sub-models, pepi , is proven UNSAT, notifying the merging the node pair in the just-proved pepi . The proof log in parent node to simplify further with the newly proved pepi . Figure VI is a sample sub-model tree where only the branches Depending on the logical relationships between a parent and that contributed to the final proof are shown. Indentation its immediate sub-models, a node is either disjunctive or con- indicates the parent-child relationship. Recursively, the proof junctive in semantics. In Figure 3, a Verilog miter model node result of the top level target is evaluated as UNSAT. { "case split": { three numbers on the right are the node counts in the red, "case_0": "UNSAT by AIG" blue and purple regions (common logic) of the SSA network "case_1": { as distinguished in Figure 2(a). Only those sub-models that "simplified": { contributed to the final proof are shown in the figure. Others "abstraction": { are ignored. As seen in Figure 5, the case-split procedure is "case split": { "case_00": "UNSAT by AIG", 1.original model : 366 332 776 "case_01": "UNSAT by AIG", 2. case-split "case_10": "UNSAT by AIG", 3. case_0 : 366 331 844 "case_11": "UNSAT by AIG" 4. AIG : UNSAT }, 6. case_1 : 366 332 776 }, 7. simplified : 344 289 675 }, 8. abstraction : 344 289 29 }, 9. case-split }, 10. case_0 : 344 289 31 } 11. AIG : UNSAT ------------------------------------- 12. case_1 : 344 289 31 Miter proof result: [Resolved: UNSAT] 13. simplified : 343 288 27 ------------------------------------- 14. PEP Fig. 4. Illustration of proof log 15. pep_0 : 335 280 27 16. linear construction 17. case_F Using this sub-model tree infrastructure, any new proce- 18. AIG : UNSAT dures discovered in the future can be plugged into the system 19. case_G easily. Also, the system is fully parallelizable in that siblings 20. AIG : UNSAT 21. simplified : 10 10 305 can be executed at the same time. The proof process can be 22. AIG : UNSAT retrieved from the expanded sub-model tree. Fig. 5. Sub-model proof tree V. C ASE S TUDY The design in this case-study is an industrial example taken applied twice, at lines 2 and 9. Both models have a single- from the image processing domain. We verify specification bit input port, which was selected for cofactoring. ABC[1] = implementation where the “specification” is a manually- immediately proved the first cofactored case, case 0 (3 and specified high-level description of the design. “Implemen- 10) , using the AIG model at 4 and 11. The time-out for the tation” is a machine-generated and highly optimized RTL dcec run was set to two seconds. Abstraction was applied at 8, implementation of the same design using[2]. The miter logic significantly reducing the common logic from 675 to 29 SSA is obtained through SLEC[3]. Therefore, the miter problem nodes, and effectively removing all the comparator logic. We is verifying that the high-level synthesis (HLS) tool did not tried abstraction on the original model without the case-split modify the design behavior. procedure and it failed to produce any result. The case-split at This miter is sequential in nature, but here we examine a 2 removed enough Boolean logic and eliminated some corner bounded model checking (BMC) problem which checks the cases such that the abstraction procedure was able to produce correctness of the implementation at cycle N. This renders the an abstract model successfully. problem combinational. This is industrially relevant because Model 15 is the smallest unproved PEP from model 13. the sequential problem is too hard to solve in general, and It is proved using the linear construction procedure at 16, even the BMC problem at cycle N becomes too difficult for which we shall describe in detail in Section V-A. Model industrial tools. 21 is the simplified model of model 13 after merging the The original design (specification) consists of 150 lines just-proved pep0 . After simplification, most of the logic in of C++. It went through the Calypto frontend[3] and was model 21 became common logic through structural hashing, synthesized into a word-level netlist in Verilog. The generated leaving only 10 nodes in each of the blue and red regions. miter model has 1090 lines of structural Verilog code with 36 Model 21 was proved quickly by ABC which concludes the input ports: 29 of which are 7 bits wide, 2 are 9 bits, 4 are proof of the original miter. In this case, the linear-construction 28 bits and one is a single-bit wire. The miter is comparing procedure was crucial in attaining the proof. However, the two 28-bit values. We do not have knowledge about what the case-split, simplification, abstraction, and PEP models also are design does except through structural statistics: no multipliers, very important because they collaborate in removing Boolean, many adders, subtractors, comparators, shifters etc., together mux and comparator logic etc, but keeping only the part of the with Boolean logic. From a schematic produced from the original miter logic which constitutes a linear function. Only Verilog, there seems to be a sorting network implemented at this point, can a proof by the linear construction procedure using comparators, but we can not tell anything further. succeed. Figure 5 illustrates the compositional proof produced by the LEC system by showing the sub-model tree created during A. Linear construction the proof process. Indentations indicate parent and sub-model For model 15 in Figure 5, the SSA network contains many relations and are listed in the order they were created. The +,−,  and  operators along with extract and concat oper- ators, but contains no Boolean operators or muxes. The input input variables to compute the coefficients: ports consist of twenty-five 7-bit or 12-bit wide ports. The b = F (0, 0, ..., 0) miter is comparing two 15-bit wide output ports. At this point, simplification and abstraction can not simplify the model a0 = F (1, 0, ..., 0) − b further. Also, there are no good candidates for case-splitting. a1 = F (0, 1, ..., 0) − b The local rewriting rules can not be applied effectively without ... having some global information to help converge the two sides an−1 = F (0, 0, ..., 1) − b of the miter logic. High-level information must be extracted and applied to prove this miter model. Another round of random simulation on both the logic and After the linear construction procedure through LEC, the the polynomial can be done to increase the likelihood of the miter logic is found to be implementing the following linear conjecture. The same is repeated for G(x̄) to obtain g(x̄). sum in the signed integer domain using two’s complement In integer arithmetic, f is equal to g if and only if the representation: coefficients match exactly for each term: f = g <=> ∀ i ai = a0i and b = b0 (12) −16 ∗ x0 + 2 ∗ x1 + 2 ∗ x2 + 2 ∗ x3 + 2 ∗ x4 + 2 ∗ x5 So checking of f = g is trivial in this case. In other algebraic +2 ∗ x6 + 2 ∗ x7 + 2 ∗ x7 + 2 ∗ x8 + 2 ∗ x9 domains, domain specific reasoning may have to be applied +2 ∗ x10 + x11 + x12 + 2 ∗ x13 + 2 ∗ x14 + 2 ∗ x15 to derive algebraic equivalence e.g. in [28]. +2 ∗ x16 + 2 ∗ x17 + 2 ∗ x18 + 2 ∗ x19 − 2 ∗ x20 + 2 ∗ x21 3) Synthesizing implementations F 0 /G0 for f (f =g), struc- +2 ∗ x22 + 2 ∗ x23 + 2 ∗ x24 + 14 turally similar to F/G: We want to find a Verilog implemen- tation F 0 (x̄) of f such that 1) F 0 implements f 2) F 0 is structurally similar to F One side of the miter implements the above sum as a plain To do this, all nodes in the SSA network with arithmetic linear adder chain (Figure 6(a)), the other side is a highly op- operators +, − are marked, and edges connecting single bits timized implementation using a balanced binary tree structure are removed. A reduced graph is then created from the marked (Figure 6(b)) and optimization tricks, which we don’t fully un- nodes in the remaining graph maintaining the input/output derstand. This is a hard problem for bit-level engines because relations between marked nodes. This graph is a skeleton of the implementation structure of F . For each of its nodes, we annotate it with a conjectured linear sum computed in the same way as in the above steps. The root node F is annotated with f and internal nodes annotated with linear sums fs , ft , etc. For illustration purposes, Figure 7(a) shows such an ... annotated reduced graph for node w. For an arbitrary node w ... (a)linear adder chain (b) balanced adder tree u=... u=... + + Fig. 6. Addition implementation v=... w=c s⋅s+c t⋅t + f st ( ̄x ) w=f w ( ̄x ) v=... + + + + there are no internal match points to utilize. Therefore, LEC s=f s ( ̄x ) s=... t =... resorts to trying a high-level method to establish equivalence + + t=f ( ̄ + + t x) at the polynomial level. The following are the detailed steps ̄x ̄x for this specific case. (a)annotated reduced graph (b) substituted annotation 1) The conjecture: Assume the miter logic is F (x̄) = G(x̄) Fig. 7. Annotated reduced graph as in Figure 2(a). LEC conjectures the following for the arithmetic domain. in the reduced graph with inputs from nodes s and t, from the • Signed integer arithmetic. The numbers are in 2’s com- annotation we have the following: plement representation. s = fs (x̄) • Assume F (x̄) and G(x̄) are implementing the linear sums t = ft (x̄) f and g of the forms w = fw (x̄) X f (x̄) = ai · xi + b (10) We would like to substitute fw with variable s and t, such X that w is a function of s and t in order to follow the structure g(x̄) = a0i · xi + b0 (11) of the skeleton reduced graph. Because all the functions are linear sums, we can compute, using algebraic division, two 2) Determining the coefficients of f , g and proving f = g constants cs and ct such that the following holds: algebraically: Given the data-path logic F (x̄) and the linear sum formula (10), it takes n + 1 simulation patterns on the n w = cs · s + ct · t + fst (x̄) cs is the quotient of fw /fs and ct = (fw − cs · fs )/ft , while c[m : 0] ={a[n : 0]}[m : 0]; fst is the remainder of the previous division. The substitution assert a[n : 0] == {(n − m) ∗ {c[m − 1]}, c[m : 0]} is conducted in one topological traversal from the inputs to the miter output. After substitution, the annotated reduced graph The first two sets of assertions ensure there is no overflow of is essentially a multi-level implementation of the above linear signed integer add and no non-zero remainder of division. sum. Because the linear sum annotated at each node in the The third one ensures that extraction does not change the reduced graph is only a conjecture, it may not be exactly the value in two’s complement representation. The SVA checkers same function as in the original miter logic. However, in re- are formally verified separately using the VeriABC[23] flow, implementing f using this structure, certain similarities are which in turn uses ABCs model checker. still captured through the construction process. From the above procedures, we established the following: This multi-level circuit is implemented by traversing the f (x̄) = g(x̄) (13) substituted reduced graph, and creating a corresponding Ver- 0 ilog file for F 0 . It is generated by allocating a bit-vector at F (x̄) implements f (x̄) (14) 0 each internal node with its bit-width equivalent to output port G (x̄) implements g(x̄) (15) of F . The same can be done for G to obtain G0 . The reason F = F0 (16) why LEC goes through so much trouble to obtain separate 0 G=G (17) RTL implementations for F 0 and G0 (even though they are both implementing f ), is that we need to prove F = F 0 and Altogether they establish the proof for F = G. G = G0 separately next. Without the structural similarities Combining the above procedures into a single run, LEC took created through this procedure, proving equivalence with an about 10 minutes on an i7 processor to complete the full proof. arbitrary F 0 and G0 would be as difficult as proving the Roughly 80% of the time is spent on compiling and running original F = G. Generally, only with these extra similarities random simulation, the rest are used for SAT solving. In table injected, can the miter model be simplified enough to allow VI, we also compare this run-time against Boolector[11], z3 SAT sweeping to succeed for F = F 0 and G = G0 . [16] and ABC’ iprove [24] solver, all run on the same server. 4) Proving F = F 0 and G = G0 : We construct two miter It is clear that LEC expedites the proof significantly by using models F = F 0 and G = G0 as caseF and caseG in Figure the knowledge of the linear sum formulation inside the miter 3, and apply LEC separately to each. By construction, each logic. miter should be simpler than the original F = G because of the increased structural similarity between the two sides of the Miter Boolector Z3 iprove LEC miter. Another round of LEC might reduce this miter logic, if model 1 time-out time-out time-out 10min not prove it directly through bit-level solvers. In the present TABLE VI case, F = F 0 was proven instantly because F is a simple C OMPARISON WITH OTHER SOLVES ( TIME - OUT IN 24 HOURS ) linear adder chain, and so is F 0 . Proving G = G0 takes more time using ABCs dcec because G is a highly optimized imple- In summary, polynomial reconstruction was a key tech- mentation of f and the multi-level implementation from the nique to prove the underlying miter problem. The case- annotated reduced graph only captures part of the similarity. study illustrates the major steps and the proof obligations But, the injected similarity was sufficient enough to reduce the encountered during the process. The actual techniques used complexity to be within dcec’s capacity. for different domains of the underlying arithmetic would 5) Proving that F 0 /G0 implements f /g : To complete the differ. Each algebraic domain would require special purpose proof, we still have the proof obligation that F 0 and G0 heuristics and automated proof procedures to guarantee the implement f and g respectively. By construction from the correctness of the reconstructions and transformations used reduced graph, the generated Verilog is a verbatim translation in the method. The goal of the linear construction procedure from the multi-level form of f . However, we need to bridge was to inject increased structural similarity by using global the gap between Verilog’s bit-vector arithmetic vs. the integer algebraic transformations. arithmetic of the linear sum. To do so, we created SVA assertions to check that every Verilog statement captures the integer value in full without losing precision due to underflows VI. E XPERIMENTAL RESULTS or overflows. Table VI shows the experimental results comparing Boolector[11], Z3[16] and iprove[24] using a 24-hour time- c[n : 0] =a[n − 1 : 0] + b[n − 1 : 0]; out limit on an 2.6Ghz Intel Xeon processor. These models assert (a[n − 1] & b[n − 1] =⇒ c[n]) are generated directly using SLEC[3] for checking C-to-RTL assert (!a[n − 1] & !b[n − 1] =⇒ !c[n]) equivalence or extracted as a sub-target from PEPs. The first column is the miter design name. The second column is the c[n : 0] =a[n : 0]/b[m : 0]; number of lines of Verilog for the miter model specification. assert (a[n : 0] == (c[n : 0] ∗ b[m : 0])[n : 0]) Run-time or time-out results are reported for each solver in columns 3 to 6. Although the miter models are not big in terms of lines of Verilog, they are quite challenging for Boolector, Z3 and iprove. The run-time of LEC is the total CPU time including Verilog compilation. It was expected that iprove it in 9 hours while both iprove and Boolector time out. would not prove any of them because it works on the bit- This shows that LEC’s transformations through collaborating blasted model without any high-level information that the other procedures successfully reduce the surrounding logic in the solvers have. original model, which was preventing Z3 to prove it in 24 hours. Design Lines Boolector z3 iprove LEC mul 64 64 125 20 sec 200 sec timeout 10 sec The above experiments demonstrate the effectiveness of d1 24 time-out time-out time-out 15 sec LEC’s collaborating procedures of simplification, rewriting, d2 507 time-out time-out time-out 2 min case-splitting and abstraction computations. The LEC architec- d3 191 time-out time-out time-out 15 min d4 473 time-out time-out time-out 60 sec ture allows these procedures to be applied recursively through d5 pep 0 674 time-out 9 hour time-out 4 min a sub-model tree: the model obtained by one procedure TABLE VII introduces new opportunities for applying other procedures B ENCHMARK COMPARISON (T IMEOUT 24 HOURS ) in the next iteration. As exemplified in miter d4, the initial case-split gives rise to new opportunities for simplification The miter, mul 64 64, is comparing a 64x64 multiplier as new constants are introduced by cofactoring. Then a new with an implementation using four 32x32 multipliers as the round of abstraction is able to remove enough common logic following: and expose three one-bit inputs as case-split candidates in {aH , aL } ∗ {bH , bL } = (aH ∗ bH ) << 64 + the reduced model, which in turn gives rise to another case- split transformation that leads to the final proof. None of (aH ∗ bL + aL ∗ bH ) << 32 + aL ∗ bL this is possible without the transformations being applied in where aH , aL , bH , bL are the 32-bit slices of the original sequence. 64-bit bit-vectors. Both Boolector and Z3 are able to prove it. LEC proves it by first utilizing rewriting rules to transform VII. C OMPARISON WITH RELATED WORK the 64x64 multiplier into four 32x32 multipliers, matching the other four in the RHS of the miter. As they are matched In bit-level equivalence-checking procedures [24][25], sim- exactly, they become common logic in the miter model. LEC ulation, SAT-sweeping, AIG rewriting and internal equiva- then produces an abstraction and obtains a reduced model lence identification are all relevant to data-path equivalence- with all the multipliers removed: the outputs of the multipliers checking. In LEC, these types of procedures are conducted become free inputs in the abstract model. The abstract model at the word-level. Word-level rewriting is difficult if only a is then proven instantly by ABC’s dcec on the AIG model. bit-level model is available. For example, with no knowledge The miter d1, extracted from a PEP sub-model, is a demon- of the boundary of a multiplier, normalizing its operands is stration of rewrite rule 6 in Table II using 32-bit multiplication. impractical at the bit-level . Although abstraction and case- As both Boolector and Z3 fail to prove equivalence within split techniques in LEC can be applied at the bit-level in the time-limit, they likely do not have this rewriting rule theory, these are not used due to the difficulty of comput- implemented. ing an abstraction boundary or of finding good cofactoring To prove d2, LEC conducts conditional rewriting using rule candidates. (2) by first statically proving an invariant in the form of (3). SMT solving is relevant because a data-path is a subset After the transformation, the multipliers are matched exactly of QF BV theory. Methods such as [7][11][16][14][17][19], on both sides of the miter and removed in the subsequent are state-of-art QF BV solvers. These employ different imple- abstract model. The final miter model is proved instantly by mentations of word-level techniques in rewriting, abstraction, ABC on the bit level AIG. case-splitting, and simplification, and interleave Boolean and The miter model d3 has part of its logic similar to word-level reasoning via a generalized DPLL framework or mul 64 64 embedded inside. LEC proves d3 by first applying through abstraction refinements of various forms. Hector[20] is rewriting rules repeatedly until no more rewriting is possible. closest to LEC in terms of technology and targeted application Then, LEC computes a reduced model through abstraction. In domains, and has a rich set of word-level rewriting rules along the reduced model, LEC conducts a case-split on a one-bit with some theorem prover [7] procedures to validate every input. The case-0 AIG model is proven instantly, while case-1 rewriting applied. Hector also has an orchestration of a set of is proven in about 10 minutes by ABC. bit-level solvers using SAT and BDD engines to employ once The miter d4 is proven by first conducting a case-split of two the bit-level miter model is constructed. Strategically, LEC bit-vector inputs: cofactoring on whether the bit-vector equals relies less on the capacity of SAT solver; instead it builds zero or not. Three of the four cofactored cases are proven a compositional proof infrastructure and employs iterative instantly. The one unresolved goes through a round of simpli- transformations to finally obtain a proof through sub-model fication and abstraction. On the then obtained sub-model, three trees. The goal of these LEC learning procedures is to reverse one-bit inputs are identified and cofactored through case-split engineer the embedded high-level algebraic transformations procedures. LEC prove all eight cases quickly within a few and bring more similarity between both sides of the miter seconds. model. Miter d5 is extracted from model 15 in Figure 5 which The techniques in [26] [31][33] also try to reconstruct an contains the purely linear sum miter described in the case- algebraic model from the underlying logic, but they employ a study section. For this simpler miter, Z3 is able to prove bottom up approach and their primitive element is a half-adder. The method in [8] simplifies the algebraic construction by [11] R. Brummayer and A. Biere. Boolector: An efficient smt solver for bit- solving an integer linear programming problem. The limitation vectors and arrays. In Tools and Algorithms for the Construction and Analysis of Systems, pages 174–177. Springer, 2009. of these approaches is that they rely on the structural pattern [12] M. L. Case, A. Mishchenko, and R. K. Brayton. Automated extraction of the underlying logic to reconstruct the algebraic model. of inductive invariants to aid model checking. In Formal Methods On the other hand, the linear construction case-study in in Computer Aided Design, 2007. FMCAD’07, pages 165–172. IEEE, 2007. Section V-A constructs the polynomial through probing with [13] P. Chauhan, D. Goyal, G. Hasteer, A. Mathur, and N. Sharma. Non- simulation patterns. This is more general as it uses only the cycle-accurate sequential equivalence checking. In Proceedings of the functional information of the data-path logic. For different 46th Annual Design Automation Conference, pages 460–465. ACM, 2009. domains, other techniques may well be more applicable such [14] A. Cimatti, A. Griggio, B. J. Schaafsma, and R. Sebastiani. The mathsat5 as the bottom-up approach. The use of vanishing polynomials smt solver. In Tools and Algorithms for the Construction and Analysis and Grobner bases in [27][28] to prove equivalence between of Systems, pages 93–107. Springer, 2013. [15] J. Cong, B. Liu, S. Neuendorffer, J. Noguera, K. Vissers, and Z. Zhang. polynomials in the modulo integer domain can be utilized once High-level synthesis for fpgas: From prototyping to deployment. a polynomial form is reconstructed in LEC. In many data-path Computer-Aided Design of Integrated Circuits and Systems, IEEE Trans- miter models, such a polynomial in a certain domain or theory actions on, 30(4):473–491, 2011. [16] L. De Moura and N. Bjørner. Z3: An efficient smt solver. In Tools is likely embedded in other control and data-path logic. Direct and Algorithms for the Construction and Analysis of Systems, pages application of algebraic techniques is often not practical. Thus 337–340. Springer, 2008. the collaborating procedures in LEC are designed to bridge [17] B. Dutertre and L. De Moura. The yices smt solver. Tool paper at http://yices. csl. sri. com/tool-paper. pdf, 2:2, 2006. this gap and isolate such polynomials so that these high level [18] M. Fujita. Equivalence checking between behavioral and rtl descriptions theories can then be applied. with virtual controllers and datapaths. ACM Transactions on Design In conducting consistency checking between C and Verilog Automation of Electronic Systems (TODAES), 10(4):610–626, 2005. [19] S. Jha, R. Limaye, and S. A. Seshia. Beaver: Engineering an efficient RTL, the work [21] focuses on how to process a C program smt solver for bit-vector arithmetic. In Computer Aided Verification, to generate formal models. The tool relies on SMT solvers pages 668–674. Springer, 2009. [11][16][14] as the back-end solving engines. [20] A. Koelbl, R. Jacoby, H. Jain, and C. Pixley. Solver technology for system-level to rtl equivalence checking. In Design, Automation & Test In terms of tool architecture, [9] [10] [22], all employ in Europe Conference & Exhibition, 2009. DATE’09., pages 196–201. a sophisticated set of transformations to simplify the target IEEE, 2009. model during verification. These are done at the bit-level. The [21] D. Kroening, E. Clarke, and K. Yorav. Behavioral consistency of C and Verilog programs using bounded model checking. In Proceedings of LEC infrastructure allows future extension to take advantage DAC 2003, pages 368–371. ACM Press, 2003. of multi-core parallelization as demonstrated in [30]. [12] [22] A. Kuehlmann and J. Baumgartner. Transformation-based verification [32], use a dedicated data-structures to represent the proof- using generalized retiming. In Computer Aided Verification, pages 104– 117. Springer, 2001. obligations, while LEC relies on the sub-model tree to track [23] J. Long, S. Ray, B. Sterin, A. Mishchenko, and R. Brayton. Enhanc- the compositional proof strategy used at each node. ing abc for ltl stabilization verification of systemverilog/vhdl models. Ganesh Gopalakrishnan University of Utah USA, page 38, 2011. [24] A. Mishchenko, S. Chatterjee, R. Brayton, and N. Een. Improvements to VIII. C ONCLUSION combinational equivalence checking. In Computer-Aided Design, 2006. In LEC, we build a system of collaborating procedures ICCAD ’06. IEEE/ACM International Conference on, pages 836–843, 2006. for data-path equivalence-checking problems found from an [25] V. Paruthi and A. Kuehlmann. Equivalence checking combining a industrial setting. The strategy is to utilize Boolean level structural sat-solver, bdds, and simulation. In Computer Design, 2000. solvers, conduct the transformations at the word-level and Proceedings. 2000 International Conference on, pages 459–464, 2000. [26] E. Pavlenko, M. Wedler, D. Stoffel, W. Kunz, A. Dreyer, F. Seelisch, to synthesize internal similarities by lifting the reasoning to and G. Greuel. Stable: A new qf-bv smt solver for hard verification the algebraic level . Using a real industrial case-study, we problems combining boolean reasoning with computer algebra. In demonstrated the applicability of the sub-tree infrastructure for Design, Automation & Test in Europe Conference & Exhibition (DATE), 2011, pages 1–6. IEEE, 2011. integrating a compositional proof methodology using LEC. [27] N. Shekhar, P. Kalla, and F. Enescu. Equivalence verification of polynomial datapaths using ideal membership testing. Computer-Aided R EFERENCES Design of Integrated Circuits and Systems, IEEE Transactions on, [1] ABC - a system for sequential synthesis and verification. Berkeley 26(7):1320–1330, 2007. Verification and Synthesis Research Center, http://www.bvsrc.org. [28] N. Shekhar, P. Kalla, F. Enescu, and S. Gopalakrishnan. Equivalence [2] Calypto R Catapult Design Product. http://www.calypto.com. verification of polynomial datapaths with fixed-size bit-vectors using [3] Calypto R SLEC. http://www.calypto.com. finite ring algebra. In Computer-Aided Design, 2005. ICCAD-2005. [4] Forte design systems. http://www.forteds.com. IEEE/ACM International Conference on, pages 291–296, 2005. [5] smtlib. http://www.smt-lib.org. [29] W. Snyder, P. Wasson, and D. Galbi. Verilator: Convert verilog code to [6] Verific Design Automation: http://www.verific.com. c++/systemc, 2012. [7] C. Barrett, C. L. Conway, M. Deters, L. Hadarean, D. Jovanović, T. King, [30] B. Sterin, N. Een, A. Mishchenko, and R. Brayton. The benefit of A. Reynolds, and C. Tinelli. Cvc4. In Computer Aided Verification, concurrency in model checking. IWLS, 2011. pages 171–177. Springer, 2011. [31] D. Stoffel and W. Kunz. Equivalence checking of arithmetic circuits on [8] M. A. Basith, T. Ahmad, A. Rossi, and M. Ciesielski. Algebraic the arithmetic bit level. Computer-Aided Design of Integrated Circuits approach to arithmetic design verification. In Formal Methods in and Systems, IEEE Transactions on, 23(5):586–597, 2004. Computer-Aided Design (FMCAD), 2011, pages 67–71. IEEE, 2011. [32] D. Wang and J. Levitt. Automatic assume guarantee analysis for [9] J. Baumgartner, H. Mony, V. Paruthi, R. Kanzelman, and G. Janssen. assertion-based formal verification. In Proceedings of the 2005 Asia Scalable sequential equivalence checking across arbitrary design trans- and South Pacific Design Automation Conference, pages 561–566. ACM, formations. In Computer Design, 2006. ICCD 2006. International 2005. Conference on, pages 259–266. IEEE, 2007. [33] M. Wedler, D. Stoffel, R. Brinkmann, and W. Kunz. A normalization [10] R. Brayton and A. Mishchenko. Abc: An academic industrial-strength method for arithmetic data-path verification. Computer-Aided Design of verification tool. In Computer Aided Verification, pages 24–40. Springer, Integrated Circuits and Systems, IEEE Transactions on, 26(11):1909– 2010. 1922, 2007.