Abstraction-Based Livelock/Deadlock Checking for
                       Hardware Verification

                                              In-Ho Moon and Kevin Harer
                                                          Synopsys Inc.
                                            {mooni, kevinh}@synopsys.com


ABSTRACT                                                           has any incoming edges to the SCC in the state transi-
Livelock/deadlock is a well known and important problem            tion graph representing a hardware design. However, even
in both hardware and software systems. In hardware veriﬁ-          though the method in [27] is an improved method from its
cation, a livelock is a situation where the state of a design      previous work [13] in symbolic approaches, it is still infea-
changes within only a smaller subset of the states reachable       sible to apply the method to the industrial designs, simply
from the initial states of the design. Deadlock is a special       due to the capacity problem of symbolic methods. In gen-
case in which there is only one state in a livelock. However,      eral, any BDD-based method can handle only up to several
livelock/deadlock checking has never been actively used in         hundred latches without any abstraction or approximation
hardware veriﬁcation in practice, mainly due to the com-           techniques, whereas there can be millions of latches in in-
plexity of the computation which involves ﬁnding strongly          dustrial designs.
connected components.                                                 In this paper, we ﬁrst present an improved BDD-based
   This paper presents a practical abstraction-based live-         algorithm ﬁnding TSCCs from[27] in the following aspects.
lock/deadlock checking algorithm for hardware veriﬁcation.         First, initial state is taken into account in ﬁnding TSCCs.
The proposed livelock/deadlock checking works on FSMs              Especially, the improved algorithm handles multiple initial
rather than the whole design. For each FSM, we make an             states eﬃciently. Secondly, unreachable TSCCs are distin-
abstract machine of manageable size from the cone of inﬂu-         guished from reachable TSCCs which are more interesting to
ence of the FSM. Once a livelock is found on an abstract           designers. Thirdly, we provide more intuitive state classiﬁ-
machine, the livelock is justiﬁed on the concrete machine          cation as main, transient, and livelock groups as opposed to
with trace concretization. Experimental results shows that         transient and recurrence classes in [27]. In our classiﬁcation,
the proposed abstraction-based livelock checking ﬁnds real         a set of transient states is further classiﬁed into main and
livelock errors in industrial designs.                             transient groups. Recurrence class in [27] is mapped into
                                                                   livelock group in our classiﬁcation. Main group is an SCC
1. INTRODUCTION                                                    that contains the initial state. Transient group is a set of
   Livelock/deadlock is a well known and important problem         states that belong to neither main nor livelock group. There
in both hardware and software systems. In hardware veriﬁ-          is one or zero main group in a design per one initial state.
cation, a livelock is a situation where the state of a design         This paper also presents a practical approach for checking
changes within only a subset of the states reachable from          livelock using abstraction techniques. The proposed live-
the initial states of the design. In a state transition graph, a   lock checking works on FSMs(Finite State Machines)2 rather
livelock is a set of states from which there is no path going      than the whole design. For each FSM, we make an abstract
to any other states that are reachable from the initial state.     machine (by localization reduction [15]) of manageable size
Since deadlock is a special case in which there is only one        by the improved BDD method from the COI(Cone of Inﬂu-
state in a livelock, deadlock checking can be done by livelock     ence) of the FSM. Once a livelock is found on an abstract
checking. Thus, livelock implies both livelock and deadlock        machine, the livelock is justiﬁed on the concrete machine
in this paper. However, livelock checking1 has never been          with trace concretization using SAT (Satisﬁability[9]) and
actively used in hardware veriﬁcation in practice, mainly due      simulation. When there is no livelock on the abstract ma-
to the complexity of the computation which involves ﬁnd-           chine, there is no guarantee for no livelock on the concrete
ing SCCs (Strongly Connected Components). Thus, livelock           machine. However, the bigger the abstract size is, the more
checking has been on hardware designer’s wish list to verify       conﬁdence we have that no livelock exists on the concrete
their designs.                                                     machine. The key beneﬁt of this abstraction-based livelock
   There have been many approaches on ﬁnding SCCs [13,             checking is that it enables ﬁnding real livelock groups that
27, 28, 3, 20, 12]. Among these work, Xie and Beeral pro-          cannot be found by tackling whole design directly.
posed a symbolic method ﬁnding terminal SCCs (in short,               Once an FSM is given, its COI is ﬁrst computed. Then,
TSCCs) using BDDs (Binary Decision Diagrams [4]) in [27].          an abstract machine is computed by ﬁnding N influential
In a state transition graph, a TSCC is an SCC that does            latches from the COI. Inﬂuential latches are the latches that
not have any outgoing edges to any state outside the SCC.          are likely related with the FSM. N is either pre-deﬁned or a
Thus, a TSCC becomes a livelock group when the TSCC                user-deﬁned number of latches in the abstract machine, or
1                                                                  2
  Livelock checking is diﬀerent from liveness checking and           FSMs are either automatically extracted [26] or any sets of
the diﬀerence will be explained in Section 2.3.                    sequential elements that are user-speciﬁed.
gradually increased. In general, N is up to a few hundred          algorithms. Ravi et al. [20] provided a taxonomy of those
latches. Inﬂuential latches are computed mainly by approx-         symbolic algorithms. One is SCC-hull algorithms (without
imate state decomposition [6]. However, in many cases, the         enumerating SCCs) [11, 14, 24], and the other is SCC enu-
size of COIs is too big for even approximate state decom-          meration algorithms [28, 3, 12, 20]. The details are in [20].
position. Thus, a structural abstraction is applied by using
connectivity and sequential depth before approximate state         2.2    Finding TSCCs
decomposition. This structural abstraction reduces the COI            Even though TSCCs are a subset of SCCs in the states of a
to a manageable size by approximate state decomposition.           design, the algorithms for ﬁnding TSCCs can be signiﬁcantly
   There is another important hardware property called tog-        optimized since not all SCCs are of interest.
gle deadlock. A state variable has a toggle deadlock if the           This section recapitulates the work on ﬁnding TSCCs by
state variable initially toggles and the state variable becomes    Xie and Beeral [27]. This algorithm classiﬁes all states into
a constant after a certain number of transitions. However,         either recurrence or transient class. Recurrence class is a set
notice that this is not a constant variable since it initially     of TSCCs and the rest belongs to transient class. Let S be
toggles. Toggle deadlock may or may not happen depending           the set of states. With i, j ∈ S, i → j denotes that there is
on input stimuli in simulation. Therefore, toggle deadlock is      at least one path from i to j. Deﬁnition 1 deﬁnes forward
also an important property to check with formal approaches.        set and backward set of a state.
   Experimental results shows that the proposed abstraction-          Definition 1. The forward set of state i ∈ S, denoted by
based approach ﬁnds real livelock and toggle deadlock errors       F (i), is the set of states that have a path from i. That is,
from industrial designs.                                           F (i) = {j ∈ S | i → j}. Similarly, the backward set of state
   The contributions of this paper are in the three aspects.       i, denoted by B(i), is the set of states that have a path to i.
    • Improved algorithm for livelock checking                     That is, B(i) = {j ∈ S | j → i}.
      The proposed algorithm improved the existing algo-             Lemma 1. Let i, j ∈ S. If j ∈ F (i), then F (j) ⊆ F (i).
      rithm [27] in many aspects, such as providing new            Similarly, if j ∈ B(i), then B(j) ⊆ B(i).
      state classiﬁcation with initial state, handling multi-
      ple initial states, reﬁning the search space eﬃciently          Theorem 1. A state i ∈ S is recurrent if and only if
      with care states, early termination, and trimming out        F (i) ⊆ B(i). In other words, i is transient if and only if
      transient states.                                            F (i) * B(i).
    • Abstraction-based livelock checking                             Theorem 2. If state i ∈ S is transient, then states in
      This paper presents theories and an implementation           B(i) are all transient. If state i is recurrent, on the other
      on abstraction-based livelock checking to handle large       hand, states in F (i) are all recurrent. In the latter case, set
      designs in practice.                                         F (i) is a recurrence class, and set B(i)\F (i) (if not empty)
    • Toggle deadlock checking                                     contains only transient states.
      To the best of our knowledge, this paper presents the          Lemma 1, Theorem 1 and 2 are from [27]. Lemma 1 shows
      ﬁrst method to solve toggle deadlock problem.                a subset relation between two forward sets as well as two
   The remainder of this paper is organized as follows. Sec-       backward sets when j is in either F (i) or B(i). Theorem 1
tion 2 brieﬂy recapitulates ﬁnding SCCs and TSCCs, and             and 2 show how a state is determined whether the state
describes related work. Section 3 describes our improved al-       belongs to either recurrence or transient class. Based on
gorithm for ﬁnding TSCCs. Section 4 explains how livelock          Theorem 1 and 2, all TSCCs can be found by performing
is checked on FSM using abstraction. Section 5 describes           forward and backward reachability iteratively. The detailed
how to check toggle deadlocks. Experimental results are            algorithm can be found in [27] and our improved algorithm is
presented and discussed in Section 6. We conclude with             described in Section 3.2 with the comparisons to the original
Section 7.                                                         algorithm.
2. PRELIMINARIES                                                   2.3    Related work
2.1 Finding SCCs                                                      There are two types of properties in model checking; safety
   Given a graph, G = (V, E) where G is an inﬁnite transition      and liveness properties [16]. A safety property represents
system of the Kripke structure [8], V is a ﬁnite set of states     ’something bad never happens’, whereas a liveness property
and E ⊆ V × V is the set of edges, a strongly connected            represents ’something good eventually happens’. Liveness
component (SCC) is a maximal set of state U ⊆ V such               checking with a liveness property can be performed by ﬁnd-
that for every pair (u, v) ∈ U , u and v are reachable from        ing SCCs [20]. Liveness checking can also be performed by
each other, that is, u is reachable from v and v is reachable      safety checking with proper transformations [1].
from u [27, 28].                                                      Livelock checking is diﬀerent from liveness checking in the
   Finding SCCs has a variety of applications in formal ver-       sense that liveness checking requires a liveness property to
iﬁcation such as Buchi emptiness [11], LTL model check-            work on a design, whereas livelock checking does not require
ing [25], CTL model checking with fairness constraints [10],       any property and works on a design directly.
Liveness checking [16, 1], and so on.                                 There have been many publications on ﬁnding all SCCs [24,
   The traditional approach to ﬁnd SCCs is to use Tarjan’s         28, 3, 12]. Even though all TSCCs can be found by any of
method [23]. Since this method manipulates the states of the       these approaches on ﬁnding all SCCs, it is not necessary to
graph explicitly, even though it runs in linear time in the size   ﬁnd all SCCs for ﬁnding all TSCCs since we are interested
of the graph, the size of the graph grows exponentially as         in ﬁnding only all TSCCs for livelock checking.
the number of state variables grows.                                  Hachtel et al. [13] proposed a symbolic approach to ﬁnd
   To overcome this state explosion problem in explicit al-        all recurrence classes concurrently identifying all TSCCs by
gorithms, there have been many publications on symbolic            computing transitive closure [17] on the transition graph
with the Markov chain. Due to the complexity of transi-            states, the reachable states are a through i inside the dot-
tive closure, this approach takes signiﬁcantly more time and       ted rectangle. The unreachable states are j through o out-
memory than a reachability-based approach does.                    side the dotted rectangle. There are ﬁve SCCs that are
   Qadeer et al. [19] proposed an algorithm to ﬁnd single          {a, b, c}, {e, f, g}, {h, i}, {j, k, l}, and{m, n, o}. Since a is the
TSCC in the context of safe replacement in sequential equiv-       initial state, {a, b, c} becomes the main group. {h, i} and
alence checking [21, 22]. In this approach, multiple TSCCs         {m, n, o} are TSCCs and only {h, i} is a livelock group (reach-
are not considered.                                                able STSCC) since it is reachable from a. {m, n, o} is called
   Xie and Beeral proposed a reachability-based algorithm          an unreachable TSCC. The rest states, {d, e, f, g, j, k, l}, be-
to ﬁnd all TSCCs iteratively [27]. This is also a symbolic         long to the transient group in which the states are contained
approach that outperforms the method in [13]. However,             in neither the main group nor the TSCCs.
this approach does not consider initial states.
   None of the above previous work on ﬁnding TSCCs was
                                                                   3.2      Finding Livelock
used in real designs in practice, due to the design sizes. Our        We ﬁrst deﬁne transition relation in Deﬁnition 3 to explain
abstraction-based approach is the ﬁrst in publication to han-      our algorithms to check livelock.
dle large designs in practice.                                        Definition 3. Let x = {x1 , . . . , xn }, y = {y1 , . . . , yn },
   Case et al. [5] proposed a method ﬁnding transient sig-         and w = {w1 , . . . , wp } be sets of variables ranging over B =
nals using ternary simulation. A transient signal is a toggle      {0, 1}. A (finite state) machine is a pair of boolean functions
deadlock on over-approximate reachable states. The toggle          ⟨Q(x, w, y), I(x)⟩, where Q : B 2n+p → B is 1 if and only if
deadlock checking in this paper ﬁnds transients signals in         there is a transition from the state encoded by x to the state
exact reachable states.                                            encoded by y under the input encoded by w. I : B n → B is
                                                                   1 if the state encoded by x is an initial state. Q(x, w, y) is
3. IMPROVED LIVELOCK CHECKING                                      called transition relation. The sets x, y, and w are called the
3.1 State Classification                                           present state, next state, and input variables, respectively.
   The state classiﬁcation in [27] consists of one transient          The procedure ComputeF orwardSet in Figure 2 is a mod-
class and one or more recurrence classes. However, in hard-        iﬁed version of the procedure f orward set in [27] in order
ware veriﬁcation, initial states are given to verify the hard-     to compute forward set of a given state s only within the
ware behavior only in reachable state space. One problem of        given care states careSet in the procedure and to perform
the state classiﬁcation in [27] is that there is no distinction    early termination when stop is not ZERO (empty BDD). ⇓
between reachable TSCCs and unreachable TSCCs from the             represents a restrict operator [7] that is used to minimize
initial states. Also, the reachable TSCCs may vary depend-         the transition relation with respect to careSet in Line 2.
ing on initial states.                                             The minimized transition relation is denoted by Q̃. In Line
   We propose a new state classiﬁcation that is shown in Fig-      7, y ← x represents that y variables are replaced by x vari-
ure 1, assuming that there is one single initial state. Han-       ables by BDD substitution. Early termination is another
dling multiple initial states is explained in Section 3.3.         big diﬀerence from f orward set in [27] and is used in Fig-
   Definition 2. STSCC is a sink TSCC that has incoming            ure 3. This is to bail out computing forward set as soon as
edges from any states outside the TSCC.                            any newly reached state intersects with the states in stop
   We ﬁrst deﬁne sink TSCC (in short, STSCC) in Deﬁni-             as in Line 11. BddIteConstant is a BDD ITE(if-then-else)
tion 2. The new state classiﬁcation consists of main group,        operation without creating a new BDD node. O is an array
transient group, livelock groups (reachable STSCCs) and            of states to store newly reached states at each iteration and
unreachable TSCCs for a given initial state. The transient         O is called onion rings. These onion rings are used later in
class in [27] is further classiﬁed into main group or tran-        Section 3.3. ComputeF orwardSet returns the forward set
sient group. Main group is an SCC containing the initial           F (s) and the onion rings O.
state and there exists either one or no main group. The                 ComputeForwardSet(Q, careSet, s, stop) {
recurrence classes in [27] are further classiﬁed into livelock     1      F (s) = ZERO;
groups (reachable STSCCs) and unreachable TSCCs. When              2      Q̃(x, w, y) = Q(x, w, y) ⇓ careSet;
there is no livelock, there exists only one SCC which is the       3      f rontier(x) = s;
                                                                   4      Put s in O;
main group.                                                        5      while (f rontier(x) ̸= ZERO) {
                                                                   6          image(y) = ∃x, w. Q̃(x, w, y) ∧ f rontier(x);
                                        Unreachable TSCC           7          image(x) = image(y)|y←x ;
                                                                   8          F (s) = F (s) ∨ image(x);
                                                                   9          f rontier = image(x) ∧ ¬F (s);
                           k        l        m       n             10         Put f rontier in O;
                                                                   11         if (BddIteConstant(f rontier, stop, ZERO) != ZERO)
                                j                o                 12             break;
        Reachable States                                           13     }
                                                                   14     return (F (s), O);
                 c                           g           i              }
                                                                               Figure 2: Computing forward set.
             a       b         d         e       f       h
                                                 Livelock Group      ComputeBackwardSet is a dual procedure to Compute
           Main Group          Transient Group (Reachable STSCC)   F orwardSet, except not using stop and not computing the
                                                                   onion rings O.
                                                                     Figure 3 is a procedure for ﬁnding TSCCs from the given
              Figure 1: State classification.
                                                                   set of states S. The procedure F indT SCCs is a modiﬁed
   In Figure 1, there are states a through o and a is the          version of the procedure State classif ication in [27]. The
initial state that is marked with thick circle. Among all          modiﬁed procedure utilizes care states careSet, assuming S
is not necessarily all state space. T is a set of transient states   Line 11. TR represents the set of transient states that are
in S, and R is an array of TSCCs in S. P ickOneState in              reachable from s. In Line 12, R and TR are computed by
Line 5 picks a random state from careSet as a seed state to          calling F indT SCCs with careSet. If s ∈    / M (means that
ﬁnd a TSCC. In Line 7, early termination is used in comput-          the main group is empty), s is added to TR in Line 13-14.
ing the forward set F (s), by setting stop in Figure 2 as the        TU represents the set of transient states that are unreachable
negation of B(s). This is because while we compute F (s)             from s and TU is computed in Line 15. T is computed by
within B(s) for the state s, once any state outside B(s)             union of TR and TU in Line 16.
is reachable from s, all states in B(s) are transient. An-
other big diﬀerence is trimming transient states in Line 12          3.3    Multiple Initial States
and 16. T rimT ransient(Q, careSet, T, dir) trims out the               It is possible for a design to have multiple initial states
transient states from the current care states by the given           when some of the state variables do not have concrete initial
direction (dir) that is either PREFIX, SUFFIX, or BOTH.              values. In the presence of multiple initial states, ﬁnding
PREFIX(SUFFIX) means to trim out the lasso preﬁx(suﬃx)               livelock groups has to be devised correctly to avoid false
states. This is the same technique used in ﬁnding SCCs [20].         positives and redundant computations.
Finally, F indT SCCs returns R (a set of TSCCs) and T (a                Figure 5 shows an example with multiple initial states.
set of transient states).                                            In this example, there are six states, S = {a, b, c, d, e, f }.
     FindTSCCs(Q, S) {                                               There are two SCCs, {a, b, c} and {d, e, f }. We can see
1       R = { };                                                     that {d, e, f } is a TSCC. a and d are initial states, I =
2       T = ZERO;                                                    {a, d}, as shown with thick circles. Suppose that we com-
3       careSet = S;
4       while (careSet ̸= ZERO) {                                    pute livelock by calling F indLivelock(Q, S, I). Then, we
5          s = PickOneState(careSet);                                get F (I) = B(I) = M = {a, b, c, d, e, f } and R = {} which
6          B(s) = ComputeBackwardSet(Q, careSet, s);                 is not correct since there is a reachable TSCC. Now, let
7          F (s) = ComputeForwardSet(Q, careSet, s, ¬B(s));
8          if (F (s) ⊆ B(s)) {                                       us try to call F indLivelock for each single initial state.
9              R = R ∪ F (s);                                        First for the initial state a, we get F (a) = {a, b, c, d, e, f }
10             T = T ∨ (B(s) ∧ ¬F (s));                              and B(a) = {a, b, c}. This gives us Ma = {a, b, c} and
11             careSet = careSet ∧ ¬B(s);
12             TrimTransient(Q, careSet, T , PREFIX);                Ra = {d, e, f }. There is a livelock group Ra for the initial
13         } else {                                                  state a. Now, for the initial state d, F (d) = {d, e, f } and
14             T = T ∨ (s ∨ B(s));                                   B(d) = {a, b, c, d, e, f }. This gives us Md = {d, e, f } and
15             careSet = careSet ∧ ¬(s ∨ B(s));
16             TrimTransient(Q, careSet, T , BOTH);
                                                                     Rd = {} and TU = {a, b, c}. There is no livelock group for
17         }                                                         the initial state d. Therefore, we can see that livelock check-
18      }                                                            ing has to be applied for each single initial state separately
19      return (R, T );
     }
                                                                     in the presence of multiple initial states.
                 Figure 3: Finding TSCCs.
   FindLivelock(Q, S, s) {
                                                                                             c                 f
1     (F (s), O) = ComputeForwardSet(Q, S, s, ZERO);
2     B(s) = ComputeBackwardSet(Q, S, s);
3     reached = F (s) ∨ s;                                                              a        b        d        e
4     if (F (s) ⊆ B(s)) {
5         M = F (s);
6         R = { };                                                               Figure 5: Multiple initial states.
7         T = ZERO;
8     } else {
9         M = F (s) ∧ B(s);
                                                                         Theorem 3. When there are two initial states (i0 and
10        careSet = F (s) ∧ ¬(M ∨ s);                                i1 ), if i1 is included in the reached states from i0 , the livelock
11        TrimTransient(Q, careSet, T , PREFIX);                     groups from i1 are a subset of the livelock groups from i0 .
12        (R, TR ) = FindTSCCs(Q, careSet);
13        if (s ∈
                / M)                                                     Proof. Since i1 is included in the reached states from
14            TR = TR ∨ s;                                           i0 , i1 is in either main, transient, or livelock groups from
15        TU = B(s) ∧ ¬M ;
16        T = TR ∨ TU ;
                                                                     i0 . When i1 is in the main group, the same livelock groups
17    }                                                              from i1 are obtained. When i1 is in the transient group, all
18    return (M, R, T, reached, O);                                  or a subset of the livelock groups i1 is obtained. When i1
   }
                                                                     is in one of the livelock groups, the livelock group including
                 Figure 4: Finding livelock.                         i1 becomes the main group from i1 , and no livelock group
   Figure 4 shows the procedure to perform our new state             exists from i1 since the other livelock groups from i0 become
classiﬁcation. As explained in Section 3.1, we ﬁnd main              unreachable TSCCs from i1 . From the above three cases, no
group (M ), transient group (T ), and livelock groups (R)            new livelock group is obtained from i1 compared to the ones
from the given initial state (s) within the given care states        from i0 . Therefore, the livelock groups from i1 are a subset
(S). F indLivelock starts computing forward set F (s) and            of the livelock groups from i0 .
backward set B(s) in Line 1 and 2. In Line 3, reached is the             Theorem 3 says that when there is large number of initial
reached states from s in S. If F (s) ⊆ B(s) in Line 4, there         states, we can skip livelock checking for any initial states
is no livelock in S. In this case, F (s) becomes the main            that are already in the forward sets of other initial states.
group and both R and T are set to empty in Line 5-7. If              In Figure 5, livelock checking for the initial state d can be
F (s) * B(s) in Line 8, there must exist at least one livelock       skipped because of d ∈ F (a), assuming that a is used ﬁrst.
group. In this case, M is computed by intersecting F (s) and         However, there is an order dependency on which initial state
B(s) in Line 9. careSet is set to a subset of F (s) in Line          is used ﬁrst. If d is used ﬁrst, we still need to run live-
10. The lasso preﬁx states in careSet are trimmed out in             lock checking with a. In practice, the number of calls to
F indLivelock is greatly reduced because of Theorem 3 in            propose a framework for abstraction-based livelock checking
the presence of multiple initial states.                            on an abstracted COI of the FSM. Once we ﬁnd a livelock on
   Figure 6 is the top-level procedure that checks livelock         the abstract machine, we justify whether the livelock exists
with multiple initial states. CheckLivelock takes transition        on the concrete machine. Notice that a livelock on the ab-
relation(Q), a set of states(S), a set of initial states(I), and    stract machine can be mapped into more than one livelock
a concrete machine(C) as procedure inputs. The use of C             on the concrete machine.
is explained in Section 4. CheckLivelock ﬁrst ﬁnds live-               Figure 7 shows how an abstract machine is obtained from
lock groups in the reachable states in Line 1-17 and then it        the COI of an FSM. Suppose an FSM that has two state
ﬁnds TSCCs in the unreachable states in Line 18-23. The             variables f and g. Then, we compute the COI of the FSM.
while loop (Line 6-17) performs livelock checking for a cur-        Suppose that there are state variables {a, b, c, d, e} in the
rent initial state s until all initial states are covered with      COI of the FSM. The size of the abstract machine is pre-
iteration index k. For this, remaining is initially set to          deﬁned and let us suppose that the size is N . Then, a set
I in Line 3 and updated by eliminating the newly reached            of inﬂuential latches from the COI is computed from the
states reachedk from remaining in Line 13. reached is the           FSM variables. The minimum abstract machine is the FSM
reached states from all initial states. reached is initially set    itself and the maximum abstract machine is the concrete
to ZERO in Line 1 and updated by adding reachedk that is            machine. In this example, N =4 and we get the abstract
the reached states from s in Line 12. Then, the next initial        machine {f, g, d, e}.
state is chosen from remaining in Line 15. TU is the union
of unreachable transient states from each initial state. TU is                                          Abstract Machine
initially set to ZERO in Line 2 and updated by adding the                             COI of FSM
                                                                                                                   FSM
unreachable states of Tk in Line 14. For the current initial                              a
                                                                                                           d        f
state s, F indLivelock is called in Line 7. |Rk | represents                              b
                                                                                                           e        g
the number of livelock groups in Rk in Line 8. For each                                   c

Rkj , a trace tracejk is generated in Line 9 and the livelock
                                                                                              Concrete Machine
is reported with the trace in Line 10. Generating trace is
explained in Section 4.2 and reporting livelock is explained                      Figure 7: Abstract machine.
in Section 4.3.                                                        Theorem 4. If any state in a livelock group on an ab-
   CheckLivelock(Q, S, I, C) {                                      stract machine is reachable from the initial state on the con-
1    reached = ZERO;
2    TU = ZERO;                                                     crete machine, the livelock exists on the concrete machine.
3    remaining = I;                                                    Proof. Since the abstraction is an over-approximation,
4    k = 0;                                                         the set of all transitions on the abstract machine is a superset
5    s = PickOneState(I);
6    while (s ̸= ZERO) {
                                                                    of the set of all transitions on the concrete machine. Since
7        (Mk , Rk , Tk , reachedk , Ok ) = FindLivelock(Q, S, s);   there is no path from any state in the livelock group to any
8        for (j = 0; j < |Rk |; j++) {                              state in the main group on the abstract machine, there is
9            tracejk = GenerateTrace(C, Rk      j
                                                  , s, Ok );        still no path from any projected states of the livelock group
10           ReportLivelock(s, Mk , Rk , Tk , tracejk );
                                         j

11       }
                                                                    on the concrete machine to any projected states of the main
12       reached = reached ∨ reachedk ;                             group on the concrete machine. Now, suppose that the live-
13       remaining = remaining ∧ ¬reachedk ;                        lock does not exist on the concrete machine. In order for the
14       TU = TU ∨ (Tk ∧ ¬reachedk );
15       s = PickOneState(remaining);
                                                                    livelock not to exist on the concrete machine, the only con-
16       k++;                                                       dition is that there is no path from the projected main group
17   }                                                              to the projected livelock group on the concrete machine. In
18   careSet = ¬(reached ∨ TU );                                    other words, the projected livelock group has to be unreach-
19   if (careSet ̸= ZERO) {
20       Rk = FindTSCCs(Q, careSet);                                able from the initial state. However, this contradicts the
21       for (j = 0; j < |Rk |; j++)                                assumption that any state of the livelock group is reachable
                                           j
22           ReportUnreachLivelock(Rk        );                     from the initial state on the concrete machine. Therefore,
23   }
   }
                                                                    the livelock group still exists on the concrete machine.
                Figure 6: Checking livelock.                           Thanks to Theorem 4, this abstraction-based livelock ﬁnds
   Once all reachable livelock groups are found, we next ﬁnd        a livelock on small abstract machine using BDD-based sym-
unreachable TSCCs. We set the care states careSet to the            bolic method, then justiﬁes the existence of the livelock on
negation of all visited states so far in Line 18, then call         the concrete machine by trace concretization in Section 4.2,
F indT SCCs with careSet in Line 20. If there is any un-            by using SAT techniques that can handle large designs. The
reachable TSCC, the TSCC is reported in Line 22.                    abstraction-based livelock checking is an incomplete method
                                                                    in the sense that it does not provide the proof of no livelock
4. LIVELOCK CHECKING ON FSM                                         unless the checking is performed on a concrete machine. No
   To check whether a livelock exists in a design or not, the       livelock on an abstract machine does not guarantee no live-
checking should be done on the whole design. However, this          lock on the concrete machine. However, the abstraction-
is infeasible due to the size of the design in practice. Thus,      based livelock checking enables ﬁnding real livelock errors
we propose a practical method for checking livelock on FSMs         on industrial large designs.
on the design.
   Even when we check livelock on an FSM, the entire COI            4.1    Causality Checking
logic of the FSM must be considered in order to get an ex-            Let V be the set of state variables in an abstract machine
act result on livelock. However, this is still computationally      for livelock checking. Suppose that R(V ) is the reached
very expensive or not feasible, in most real designs. Thus, we      states in the abstract machine and L(V ) is a livelock group
containing a TSCC. Also, suppose that v is a state variable         using the onion rings Ok in Figure 6. The second step is to
in V . We are interested in whether v contributes to the            generate an abstract trace. Starting from the target state,
livelock as in Deﬁnition 4. This is called variable causality.      an abstract trace can be computed by applying BDD-based
   Definition 4. When a livelock exists in the abstract ma-         pre-image computation iteratively until the initial state is
chine, a variable v in V contributes to the livelock if the live-   reached. The third step is to generate a concrete trace by
lock disappears by eliminating v from the abstract machine.         making a BMC (Bounded Model Checking [2]) problem from
In other words, there is no livelock in another abstract ma-        the abstract trace, in order to see whether the livelock group
chine that is composed of the variables, V \v.                      is reachable on the concrete machine. An eﬃcient approach
                                                                    for concretization was proposed in [18]
  Equation 1 shows a condition for existence of livelock.
                         L(V ) ⊂ R(V )                        (1)   4.3    Reporting Livelock
  Now, let R̃ be the quantiﬁed reached states and L̃ be the            Once a concrete trace is generated for a livelock group,
quantiﬁed livelock states with respect to a state variable v,       the livelock is real on the concrete machine. We report the
as shown in Equation 2 and 3.                                       livelock group with the state classiﬁcation mentioned in Sec-
                                                                    tion 3.1. A livelock group is reported with its initial state,
                   R̃(V \v) = ∃v . R(V )                      (2)
                                                                    the main group, transient group, and the unreachable states
                   L̃(V \v) = ∃v . L(V )                      (3)   in terms of the number of states and the percentage in each
  Then, it is determined by Equation 4 to check whether             group on the abstract machine.
the variable v contributes to the livelock. Theorem 5 says             By looking at the transient and livelock groups, we can
that if Equation 4 holds, v contributes to the livelock.            see what fraction of the state space is in problematic zone.
                                                                    A good design is expected to have only one main group per
                       L̃(V \v) ⊂ R̃(V \v)                    (4)   one initial state without any transient and livelock groups,
   Theorem 5. When a livelock group is found on an ab-              unless the design has an intended reset sequence to a normal
stract machine (L(V ) ⊂ R(V )), if L̃(V \v) ⊂ R̃(V \v) holds        mode.
for a variable v, the variable v contributes to the livelock.
   Proof. Let M1 be the machine consisting of V and sup-            5.    TOGGLE DEADLOCK CHECKING
pose that a livelock group exists in M1 . Let M2 be the ma-           There is another important design property, called toggle
chine consisting of (V \v) by eliminating v from M1 . Also,         deadlock that is related to livelock. A livelock may occur for
let T1 (T2 ) be the set of transitions in M1 (M2 ), respectively.   multiple state variables of a design, whereas a toggle dead-
Since M2 is an over-approximated machine from M1 , M2 has           lock may occur on a single state variable. A state variable
more transitions than M1 (T1 ⊂ T2 ). Let Td be the diﬀer-           has a toggle deadlock if the variable initially toggles, but
ence between T1 and T2 . If there is any transition (in Td )        the variable gets stuck at a constant value after a certain
that makes a path from any state in the livelock to any state       number of cycles.
in the main group in M2 , the livelock group merges into the          Figure 8 shows an example of toggle deadlock. There are
main group and both groups become a single SCC, yielding            two state variables {a, b} and four states {s0 , s1 , s2 , s3 } as in
L̃(V \v) = R̃(V \v). Thus, M2 becomes a machine without             the example. Provided that s0 is the initial state, the main
the livelock. This means that v is a necessary variable to          group is {s0 , s1 } and the livelock group is {s2 , s3 }. Once the
have the livelock in M1 . Therefore, if L̃(V \v) = R̃(V \v), v      state transition reaches to s2 that is a state in the livelock
contributes to the livelock.                                        group, the value of b gets stuck at 1, whereas a still toggles.
   This causality checking can also be applied to a set of          Thus, we say that b has a toggle deadlock.
variables, especially with FSM variables, in order to report                                 a=0, b=0    a=0, b=1
whether the livelocks are related with the FSM. Let F be the                                   s0          s3
set of variables in an FSM and C be the set of variables in the
COI of the FSM. Suppose that R(F, C) is the reached states
                                                                                               s1          s2
in the abstract machine and L(F, C) is a livelock group con-
                                                                                             a=1, b=0    a=1, b=1
taining a TSCC. The quantiﬁed reached states and the quan-
tiﬁed livelock states are computed in Equation 5 and 6 with                         Figure 8: Toggle deadlock.
respect to the FSM variables, respectively.
                   R̃(C) = ∃F . R(F, C)                       (5)     Theorem 6. If there is no STSCC in a design, there is
                                                                    no toggle deadlock on any variable.
                   L̃(C) = ∃F . L(F, C)                       (6)
                                                                      Proof. To be a toggle deadlock, a variable is supposed to
   Then, Equation 7 shows the causality checking with the           toggle at a cycle and to hold the value forever from the cycle.
FSM variables to check whether the FSM variables con-               No STSCC implies that there is only main group in the
tribute to the livelock.                                            design. If a variable appears as constant in the main group,
                         L̃(C) = R̃(C)                        (7)   the variable is a constant. However, the main group does not
                                                                    have any preﬁx behavior. This means it is not possible for
4.2 Trace concretization                                            the variable to get toggled before the main group. Therefore,
  Once a livelock group is found on an abstract machine, we         no STSCC implies no toggle deadlock.
need to justify whether the livelock group is reachable on the
concrete machine. This can be done by the following three             Theorem 6 shows that toggle deadlock occurs in the pres-
steps. The ﬁrst step is to pick a target state in the livelock      ence of a livelock. It is also possible that there is no toggle
group. The target state is chosen randomly from the livelock        deadlock on a design that has a livelock. Thus, toggle dead-
group, but is one of the closest states to the initial states by    lock on a state variable can be computed by two steps. First,
                               Statistics                                                      Results
     Design       L        I         F    T   COI      N         Llk   Dlk             New1                       New2              TraceGen
                                                                               Time    Mem        Ops      Time    Mem      Ops    Time   Len
     D1          1163    1330        7    -     632    30        0       -     16:20   107.1       44      15:31  107.9       41       -     -
                                                       60        0       -     41:11   136.7       66      41:31  137.7       66       -     -
                                                       90        0       -   2:30:33   224.1       81    2:24:50  224.3       81       -     -
                                                      120        -       -       time-out (> 24h)           time-out (> 24h)           -     -
     D2           385     352      25     -     68     30        0       -      0:01    21.0       12       0:01   21.1       12       -     -
                                                       60        0       -      0:40    38.7       28       0:40   38.7       28       -     -
                                                       68    12032       -   4:47:52    95.7  386439     0:53:58   96.5   70329        -     -
     D3-F1      32541     912        2   -    28941    30        1       -      0:49   140.4       54       0:49  140.5       55    6:37   66
     D3-F2      32541     912        4   -    28930     4        1       -      0:09   129.1        9       0:09  129.1       12    1:54   14
                                     -   4    28930    30        -       1      1:31   132.8      168       1:31  132.8      172    2:11   14

                                                        Table 1: Experimental results.


we ﬁnd STSCCs on an abstract machine from the state vari-                      image/pre-image computations(Ops). N ew1 is the proposed
able. The abstract machine is made in the same way as in                       method without the trimming technique, whereas N ew2 is
livelock checking on FSM. Secondly, we evaluate the value                      the proposed method with the trimming technique. The
of the state variable in the livelock if the livelock exists.                  times are in the form of hh:mm:ss and the memory con-
   Figure 9 shows the procedure that checks toggle deadlock                    sumptions are in M-byte. The ﬁnal two columns(T raceGen)
on a given state variable t. CheckT oggleDeadlock takes                        show the results on trace generation on concrete machine for
transition relation (Q), a set of states (S), a set of initial                 the livelock or toggle deadlock found by N ew2, and T ime
states (I), a concrete machine (C), and the state variable                     shows the time spent for trace generation and Len shows
(t) as procedure inputs. CheckT oggleDeadlock is similar                       the trace length.
to CheckLivelock in Figure 6. For each reachable livelock                         We have chosen 3 industrial designs (D1, D2, and D3).
group Rkj in Line 6, T estT oggleDeadlock checks whether the                   For each design, we have run livelock or toggle deadlock
value of t toggles or not in the livelock group and returns dlk                checking on several sizes of abstract machines with the mul-
and c in Line 7. dlk represents whether the state variable is                  tiples of 30 latches. We have set the maximum run time to
in toggle deadlock or not, and c is the constant value (0 or                   24 CPU hours.
1) in the case of toggle deadlock.                                                In D1, there is one FSM automatically extracted. The
     CheckToggleDeadlock(Q, S, I, C, t) {                                      FSM consists of 7 latches and contains 632 latches in its COI.
1      remaining = I;                                                          We can see that the run time is exponentially increased,
2      k = 0;
3      s = PickOneState(I);                                                    depending on the size of the abstract machine. On this
4      while (s ̸= ZERO) {                                                     design, the livelock checking became infeasible when N =120.
5         (Rk , reachedk , Ok ) = FindLivelock(Q, S, s);                          In D2, there is also one FSM that was user-speciﬁed. The
6         for (j = 0; j < |Rk |; j++) {
7             (dlk, c) = TestToggleDeadlock(Rk   j
                                                   , t);
                                                                               FSM consists of 25 latches and contains only 68 latches in
8             if (dlk) {                                                       its COI. This design has a livelock group. However, the
                        j                        j
9                 tracek = GenerateTrace(C, Rk , s, Ok );                      livelock was not detected when N=30 and N=60. The live-
                                             j
10                ReportToggleDeadlock(s, Rk   , tracejk , c);                 lock was detected only when all the latches in the COI were
11            }                                                                included in the abstract machine. In other words, the ab-
12        }
13        remaining = remaining ∧ ¬reachedk ;                                  stract machine is the concrete machine at the FSM point of
14        s = PickOneState(remaining);                                         view. Since the livelock was found on the concrete machine,
15        k++;                                                                 trace concretization is not required since the abstract trace
16     }
     }                                                                         in Section 4.2 is already a concrete trace.
              Figure 9: Checking toggle deadlock.                                 D2 is the only design showing a signiﬁcant performance
                                                                               diﬀerence between N ew1 and N ew2 in the table. This is be-
6. EXPERIMENTAL RESULTS                                                        cause this design has many transient states as well as many
   We have implemented the proposed livelock checking and                      livelock groups. In this case, the trimming technique signif-
toggle deadlock checking algorithms. Table 1 shows our ex-                     icantly reduced the number of image/pre-image operations
perimental results on livelock and toggle deadlock checking,                   from 386K to 70K (5.5X reduction) that gave big speed-up
generated on a 1.4 GHz Intel processor machine with 4 GB                       from 5 hours to 1 hour (5X speed-up). This shows that the
memory running Red Hat Linux.                                                  trimming technique helps the performance when there are
   The ﬁrst column lists the design names. The next ﬁve                        many transient states. When there is no transient states,
columns present the statistics on the designs, in terms of the                 the trimming technique becomes a pure overhead as shown
number of latches (L), the number of inputs (I), the number                    in D3. However, the overhead is almost negligible from the
of latches in FSM (F ), the number of toggle signals to check                  experiment.
(T ), and the number of latches in the COI of either FSM                          In D3, there are two FSMs (F 1 and F 2). F 1 is composed
and a toggle signal (COI). The next three columns show the                     of 2 latches and a livelock was found with N =30 within 49
results on livelock and toggle deadlock checking. The col-                     seconds. The livelock was justiﬁed by trace concretization
umn with N shows how many latches were in the abstract                         that took 397 seconds, and the trace length was 66. F 2 is
machine. The column with Llk shows how many livelock                           composed of 4 latches and a livelock was found with N =4
groups are found and the column with Dlk shows how many                        (the FSM itself) in 9 seconds. The livelock was also justi-
toggle deadlock are found. The next six columns compare                        ﬁed by trace concretization that took 114 seconds, and the
the performance between two methods (N ew1 and N ew2),                         trace length was 14. We have also tried the toggle dead-
in terms of time(T ime), memory(M em), and the number of
lock checking on F 2 separately from the livelock checking.           [5] M. Case, H. Mony, J. Baumgartner, and R. Kanzelman.
A toggle deadlock was found in 90 seconds and the concrete                Enhanced veriﬁcation by temporal decomposition. In Formal
                                                                          Methods in Computer Aided Design, pages 37–54, 2009.
trace was generated in 131 seconds. D3 shows the value of             [6] H. Cho, G. D. Hachtel, E. Macii, M. Poncino, and F. Somenzi.
abstraction-based livelock and toggle deadlock checking.                  Automatic state space decomposition for approximate fsm
   Table 2 shows a comparison on ﬁnding all SCCs with four                traversal based on circuit analysis. IEEE Transactions on
algorithms (XB [28], Lockstep [3], Skeleton [12], IXB [20])               Computer-Aided Design, 15(12):1451–1464, Dec. 1996.
                                                                      [7] O. Coudert and J. C. Madre. A uniﬁed framework for the
on the design D2 from Table 1. In this design, the number of              formal veriﬁcation of sequential circuits. In Proceedings of the
recurrent states is 2.07e8 and the number of transient states             International Conference on Computer-Aided Design, pages
is 1.2e6 that is only 0.6% of all states. However, it turned out          126–129, Nov. 1990.
                                                                      [8] O. G. E. M. Clarke and D. Peled. Model Checking. The MIT
that how to handle these transient states eﬃciently is the key            Press, 1999.
factor in the performance. One main diﬀerence between XB              [9] N. Een and N. Sorensson. MiniSat.
and IXB is that IXB trims out those transient states as much              http://www.cs.chalmers.se/Cs/Research/FormalMethods/
as possible. This trimming technique makes the IXB method                 MiniSat.
                                                                     [10] E. A. Emerson and C. Lei. Modalities for model checking:
outperform on this design: faster in time (more than 15X)                 Branching time logic strikes back. Science of Computer
and fewer number of image operations (more than 10X) than                 Programming, 8:275–306, 1987.
the other methods. This explains why N ew2 outperformed              [11] E. A. Emerson and C.-L. Lei. Eﬃcient model checking in
on D2 in Table 1. Table 2 also shows why livelock checking is             fragments of the propositional mu-calculus. In Proceedings of
                                                                          the First Annual Symposium of Logic in Computer Science,
done by ﬁnding TSCCs instead of SCCs. Finding all livelock                pages 267–278, June 1986.
groups took 54 minutes, whereas ﬁnding all SCCs took 100             [12] R. Gentilini, C. Piazza, and A. Policriti. Computing strongly
minutes (2X) even with IXB.                                               connected components in a linear number of symbolic steps. In
                                                                          SODA ’03: Proceedings of the fourteenth annual ACM-SIAM
   Method         Time    Memory         Ops    SCCs     States           symposium on Discrete algorithms, pages 573–582, 2003.
   XB          84:07:56      98.2    1013333                         [13] G. D. Hachtel, E. Macii, A. Pardo, and F. Somenzi. Markovian
   Lockstep    45:55:53     237.3    2590724    19458    2.08e8           analysis of large ﬁnite state machines. IEEE Transactions on
   Skeleton    26:01:54     266.5    2609008                              Computer-Aided Design, 15(12):1479–1493, Dec. 1996.
   IXB          1:39:47      92.5     102990                         [14] R. Hojati, H. Touati, R. P. Kurshan, and R. K. Brayton.
                                                                          Eﬃcient ω-regular language containment. In Computer Aided
           Table 2: Finding all SCCs in D2.                               Verification, pages 371–382, Montréal, Canada, June 1992.
                                                                     [15] R. P. Kurshan. Computer-Aided Verification of Coordinating
7. CONCLUSIONS                                                            Processes. Princeton University Press, Princeton, NJ, 1994.
   We have presented a framework for abstraction-based live-         [16] L. Lamport. Proving the correctness of multiprocess programs.
lock and toggle deadlock checking, in order to handle large               IEEE Transactions on Software Engineering,
                                                                          SE-3(2):125–143, Mar. 1977.
designs in practice. Since exact livelock and toggle deadlock        [17] Y. Matsunaga, P. C. McGeer, and R. K. Brayton. On
checking is infeasible on real designs directly, our approach             computing the transitive closure of a state transition relation.
is to check livelock and toggle deadlock on abstract machine              In Proceedings of the Design Automation Conference, pages
of either an FSM or a toggle signal. Once we ﬁnd a livelock               260–265, June 1993.
                                                                     [18] K. Nanshi and F. Somenzi. Constraints in one-to-many
or toggle deadlock, we justify the livelock or toggle deadlock            concretization for abstraction reﬁnement. In Proceedings of the
on the concrete machine by concretizing the abstract trace                Design Automation Conference, pages 569–574, 2009.
on the concrete machine.                                             [19] S. Qadeer, R. K. Brayton, V. Singhal, and C. Pixley. Latch
   Even though the proposed approach does not prove the                   redundancy removal without global reset. In Proceedings of the
                                                                          International Conference on Computer Design, pages
non-existence of livelock or toggle deadlock on a design un-              432–439, 1996.
less the design is small enough to handle, this approach ﬁnds        [20] K. Ravi, R. Bloem, and F. Somenzi. A comparative study of
livelocks or toggle deadlocks on the design if there exists.              symbolic algorithms for the computation of fair cycles. In
                                                                          W. A. Hunt, Jr. and S. D. Johnson, editors, Formal Methods
   To the best of our knowledge, it is the ﬁrst approach to               in Computer Aided Design, pages 143–160. Springer-Verlag,
use the abstraction-based livelock checking and also the ﬁrst             Nov. 2000. LNCS 1954.
approach for checking toggle deadlock. The experimental re-          [21] V. Singhal. Design replacements for sequential circuits. Ph.D.
sults showed that the abstraction-based approach ﬁnds live-               dissertation, University of California at Berkeley, 1996.
                                                                     [22] V. Singhal, C. Pixley, A. Aziz, and R. K. Brayton. Theory of
lock errors on the real designs.                                          safe replacements for sequential circuits. IEEE Transactions
   As future work, we are interested in improving the con-                on Computer-Aided Design, 20(2):249–265, Feb. 2001.
cretization, ﬁnding more accurate inﬂuential latches, and            [23] R. Tarjan. Depth ﬁrst search and linear graph algorithms.
optimizing the computations with multiple FSMs or toggle                  SIAM Journal of Computing, 1:146–160, 1972.
                                                                     [24] H. J. Touati, R. K. Brayton, and R. P. Kurshan. Testing
signals by considering the overlaps in their COIs.                        language containment for ω-automata using BDD’s.
                                                                          Information and Computation, 118(1):101–109, Apr. 1995.
8. REFERENCES                                                        [25] M. Y. Vardi and P. Wolper. An automata-theoretic approach to
 [1] A. Biere, C. Artho, and V. Schuppan. Liveness checking as            automatic program veriﬁcation. In Proceedings of the First
     safety checking. In International Workshop in Formal                 Symposium on Logic in Computer Science, pages 322–331,
     Methods for Industrial Critical Systems, pages 160–177, 2002.        Cambridge, UK, June 1986.
 [2] A. Biere, A. Cimatti, E. Clarke, and Y. Zhu. Symbolic model     [26] T.-H. Wang and T. Edsall. Practical FSM analysis for verilog.
     checking without BDDs. In Fifth International Conference on          In IVC-VIUF ’98: Proceedings of the International Verilog
     Tools and Algorithms for Construction and Analysis of                HDL Conference and VHDL International Users Forum,
     Systems (TACAS’99), pages 193–207, Amsterdam, The                    pages 52–58, 1998.
     Netherlands, Mar. 1999. LNCS 1579.                              [27] A. Xie and P. A. Beeral. Eﬃcient state classiﬁcation of
 [3] R. Bloem, H. Gabow, and F. Somenzi. An algorithm for                 ﬁnite-state markov chains. IEEE Transactions on
     strongly connected component analysis in n log n symbolic            Computer-Aided Design, 17(12):1334–1339, Dec. 1998.
     steps. In Formal Methods in Computer Aided Design, pages        [28] A. Xie and P. A. Beeral. Implicit enumeration of strongly
     37–54, 2000.                                                         connected components and an application to formal
 [4] R. Bryant. Graph-based algorithms for boolean function               veriﬁcation. IEEE Transactions on Computer-Aided Design,
     manipulation. IEEE Transactions on Computers,                        19(10):1225–1230, Oct. 2000.
     C-35(8):677–691, Aug. 1986.