=Paper= {{Paper |id=Vol-1566/paper3 |storemode=property |title=NBTI Lifetime Evaluation and Extension in Instruction Caches |pdfUrl=https://ceur-ws.org/Vol-1566/Paper3.pdf |volume=Vol-1566 |authors=Shengyu Duan,Basel Halak,Rick Wong,Mark Zwolinski |dblpUrl=https://dblp.org/rec/conf/date/DuanHWZ16 }} ==NBTI Lifetime Evaluation and Extension in Instruction Caches== https://ceur-ws.org/Vol-1566/Paper3.pdf

NBTI Lifetime Evaluation and Extension in Instruction Caches

Shengyu Duan*, Basel Halak*, Rick Wong†, Mark Zwolinski*
* School of Electronics and Computer Science, University of Southampton, UK
Email: {sd5g13, bh9, mz}@ecs.soton.ac.uk
† Cisco Systems, Inc



Abstract—CMOS devices suffer from wearout mechanisms that cause reliability issues. Negative bias temperature instability (NBTI) is one of the dominant ageing effects; it shifts the threshold voltage of PMOS devices and subsequently degrades circuit performance. The static noise margin (SNM) of an SRAM cell may be sharply reduced by unbalanced NBTI stress, which impairs SRAM read stability. From our observations of instruction caches, the NBTI stress duty cycles of each cache line generally have similar but unbalanced patterns even when running very different programs. Based on these patterns, we propose an algorithm to evaluate the lifetime of instruction caches using SPICE simulation. The results predict NBTI lifetimes of 6 and 7 years for instruction caches in ARM and MIPS architectures respectively. One practical solution is to flip each cell periodically to balance the degradation rate. However, the benefit of this technique in terms of lifetime has not previously been proven. Using the stress patterns and the lifetime evaluation algorithm, our work proves for the first time that this technique can extend the lifetime of the cache by two orders of magnitude.

I. INTRODUCTION

As transistor dimensions continue to shrink, reliability is one of the most significant remaining concerns for CMOS technology [1]. Negative bias temperature instability (NBTI) is one of the dominant ageing mechanisms, in which the threshold voltage (Vth) of a PMOS transistor increases over time [2]–[4].

The NBTI effect on CMOS memory devices such as SRAM caches has received much attention [5]–[7]. NBTI degrades the SRAM static noise margin (SNM) because of time-dependent mismatches [8], [9]. One practical solution is to flip each cell periodically to balance the degradation rate [7], [10]. However, since the stored value is considered unpredictable in these works, the performance benefits of this technique have not actually been proven.

Our work presents a method to evaluate NBTI lifetime in instruction caches. The contributions are as follows: 1) a novel analysis of the instruction cache showing that the NBTI stress duty cycles of each cache line generally have similar patterns even when running very different programs; 2) an algorithm that uses SPICE simulation to predict the NBTI lifetime of the instruction cache based on this observation; 3) a demonstration, using the stress patterns and the lifetime evaluation algorithm, that cell flipping extends the lifetime of instruction caches.

This paper is organized as follows. Section II presents the theory and simulation results of NBTI on both a single transistor and an SRAM cell. In Section III, we demonstrate that the pattern of NBTI stress locality does not vary much between different programs, and from this cell lifetimes are calculated. The lifetime evaluation algorithm and the simulation results for instruction caches in ARM and MIPS architectures are presented in Section IV, while Section V describes the lifetime extension by cell flipping. Finally, the paper is concluded in Section VI.

II. NBTI EFFECT AND SRAM CELL DEGRADATION

A. Impact of NBTI on Single PMOS Transistor

NBTI increases Vth over time. A PMOS transistor switches between the NBTI stress phase and the recovery phase. Si-H bonds are dissociated under the negative bias condition (Vgs = −VDD), and hydrogen species and traps are produced at the oxide interface. These hydrogen species then diffuse away. Once the stress is removed (Vgs = 0), some bonds recover through recombination with hydrogen. Some traps remain, so the recovery is only partial.

Thus, the Vth shift is proportional to the density of traps at the oxide interface [4], [11]. The traps are produced during the stress phase and some are neutralized in the recovery phase. Therefore, Vth degradation is highly dependent on the stress duty cycle, which is the probability of a logic zero at the gate of a PMOS transistor in a digital circuit.

In [12], the authors propose a long-term NBTI model to quantify Vth degradation after a given operation time t:

$$\Delta V_{th}(t) = \left( \frac{\sqrt{K_v^2 \, \alpha \, T_{clk}}}{1 - \beta_t^{1/2n}} \right)^{2n} \qquad (1)$$

where

$$\beta_t = 1 - \frac{2 \xi_1 t_e + \sqrt{\xi_2 C (1 - \alpha) T_{clk}}}{2 t_{ox} + \sqrt{C t}}$$

where α is the key parameter, the stress duty cycle. Kv is a function of supply voltage, temperature and technology, while Tclk is the equivalent stress-recovery period. n is either 1/4 or 1/6 depending on the diffusing species (H or H2). C is the diffusion speed in the gate material. ξ1 and ξ2 represent the annealing probabilities in the oxide and the gate respectively. Finally, te is the effective oxide thickness, indicating the diffusion distance in the oxide, and is less than or equal to the oxide thickness, tox.
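As a rough illustration of Equation (1), the following Python sketch evaluates the long-term model for a given stress duty cycle and operating time. It is only a sketch under stated assumptions: every numeric parameter value is a placeholder chosen so that the expression evaluates, not a value taken from this paper or from [12], and the names `NbtiParams`, `beta_t` and `delta_vth` are hypothetical.

```python
import math
from dataclasses import dataclass

SECONDS_PER_YEAR = 365.0 * 24.0 * 3600.0


@dataclass(frozen=True)
class NbtiParams:
    """Placeholder parameters (assumptions, NOT values from the paper or [12])."""
    k_v: float = 1.0e-6    # K_v: depends on supply voltage, temperature, technology
    t_clk: float = 1.0e-2  # T_clk: equivalent stress-recovery period, in seconds
    n: float = 1.0 / 6.0   # time exponent: 1/6 or 1/4 depending on diffusing species
    xi1: float = 0.9       # annealing probability in the oxide
    xi2: float = 0.5       # annealing probability in the gate
    t_ox: float = 1.5      # oxide thickness, in nm
    t_e: float = 1.5       # effective oxide thickness, in nm (t_e <= t_ox)
    c: float = 0.5         # diffusion speed in the gate material, in nm^2/s


def beta_t(alpha: float, t: float, p: NbtiParams) -> float:
    """Recovery fraction beta_t from the second expression of the model."""
    num = 2.0 * p.xi1 * p.t_e + math.sqrt(p.xi2 * p.c * (1.0 - alpha) * p.t_clk)
    den = 2.0 * p.t_ox + math.sqrt(p.c * t)
    return 1.0 - num / den


def delta_vth(alpha: float, t: float, p: NbtiParams = NbtiParams()) -> float:
    """Threshold voltage shift of Equation (1) for duty cycle alpha after t seconds."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("duty cycle alpha must lie in [0, 1]")
    if t <= 0.0:
        raise ValueError("operating time t must be positive")
    b = beta_t(alpha, t, p)
    if not 0.0 < b < 1.0:
        raise ValueError("beta_t outside (0, 1): parameters outside the model's range")
    numerator = math.sqrt(p.k_v ** 2 * alpha * p.t_clk)
    return (numerator / (1.0 - b ** (1.0 / (2.0 * p.n)))) ** (2.0 * p.n)


if __name__ == "__main__":
    for alpha in (0.1, 0.5, 0.9):
        shift = delta_vth(alpha, 10.0 * SECONDS_PER_YEAR)
        print(f"alpha={alpha:.1f}: shift after 10 years = {shift:.4f} (placeholder units)")
```

With these placeholder values the computed shift grows with both the duty cycle and the operating time, which is the dependence on α that the model is used for in the rest of the paper.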

Workshop on Early Reliability Modeling for Aging and Variability in Silicon Systems – March 18th 2016 – Co-Located with DATE
2016 - Dresden, Germany
Copyright © 2016 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This
volume is published and copyrighted by its editors.
Fig. 1. NBTI simulations on PMOS transistor for different duty cycles (T=300K, Vdd=1.2V, Vtp=-0.276V)

Based on the long-term NBTI model above and data from [12], Vth shifts are simulated using MATLAB, as shown in Figure 1. The technology parameters are from the Synopsys 90-nm SPICE model. Although 90-nm technology is outdated, the NBTI models already proposed [2], [3] lead us to believe that the trends also apply to smaller technologies.

B. Impact of NBTI on 6-T SRAM Cell

Fig. 2. Six transistors SRAM cell circuit
[Circuit diagram labels: transistors MP1, MP2, MN1, MN2, MN3, MN4; storage node Q; word line WL; bit line BL; supply VDD]

Figure 2 shows a basic 6-T SRAM cell, in which only the pull-up transistors, MP1 and MP2, suffer from NBTI [8]. Since MP1 and MP2 are part of cross-coupled inverters, only one of them is under NBTI stress at any time. This can result in unbalanced Vth degradation of the two transistors and thereby lead to a mismatch.

The static noise margin (SNM) is the largest noise voltage the SRAM cell can tolerate. A Vth mismatch in an SRAM cell results in an asymmetric transfer characteristic and thereby reduces the SNM, as shown in Figure 3.

Using the long-term NBTI model in Equation (1) to modify the SPICE model, the degradation of the SRAM cell can be simulated, as shown in Figure 4. A cell with a 50% stress duty cycle ages most slowly because MP1 and MP2 remain matched. Uneven stress accelerates ageing.

Fig. 4. SNM degradation simulations for different duty cycles (T=300K, Vdd=1.2V, Vtp=-0.276V)

III. STRESS LOCALITY OF INSTRUCTION CACHE

We observe that the data stored in a cache shows very similar patterns when executing different benchmark programs. We ran a test on the instruction caches of ARM and MIPS architectures using GEM5. Sixteen benchmark programs, all with more than ten thousand instructions, were chosen. The signal probability of each bit of each cache word is shown in Figure 5. It can be seen that some bits hold the same value in most locations, which leads to NBTI stress locality.

This phenomenon can be explained as follows. For any program, we can expect some types of instruction to be used more frequently than others. Take the ARM processor results in Figure 5a as an example. The most significant four bits are the condition field, and "1110" is used for unconditional instructions. The number of unconditional instructions is much larger than that of conditional ones in any program. As a result, the most significant four bits have a high probability of being "1110", as seen in Figure 5a.
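The per-bit signal probability discussed above can be sketched in a few lines of Python. This is a minimal illustration and not the GEM5-based flow used for Figure 5: the function names are hypothetical, and the four sample words are illustrative 32-bit values, three with the unconditional condition field "1110" and one with a different condition field.

```python
from typing import List, Sequence

WORD_BITS = 32  # instruction width of the 32-bit ARM and MIPS models


def bit_signal_probabilities(words: Sequence[int], width: int = WORD_BITS) -> List[float]:
    """Return P(bit = 1) per bit position; index 0 is the most significant bit."""
    if not words:
        raise ValueError("need at least one instruction word")
    ones = [0] * width
    for word in words:
        for pos in range(width):
            shift = width - 1 - pos          # bit position counted from the MSB
            ones[pos] += (word >> shift) & 1
    return [count / len(words) for count in ones]


def most_skewed_bit(probs: Sequence[float]) -> int:
    """Index of the bit whose signal probability is furthest from 50%."""
    return max(range(len(probs)), key=lambda k: abs(probs[k] - 0.5))


if __name__ == "__main__":
    # Illustrative sample words (assumption): three start with 1110, one with 0000.
    sample = [0xE3A00001, 0xE2811001, 0xE1500001, 0x0A000002]
    probs = bit_signal_probabilities(sample)
    print(probs[:4])  # prints [0.75, 0.75, 0.75, 0.0]
    print(most_skewed_bit(probs))
```

In this toy sample the first three bits are 1 in three words out of four and the fourth bit is always 0, which mirrors, on a very small scale, the bias of the condition field toward "1110".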
If the SNM degrades to a value smaller than the expected noise, the stored data might be flipped, which causes a failure when the data is read out. This defines the NBTI lifetime of the instruction cache. A 50% signal probability gives the longest lifetime because both inverters in the SRAM cell age at the same rate and so are not mismatched. The bit whose probability is furthest from 50% fails first, and this determines the lifetime of the whole SRAM array.

Fig. 3. SRAM cell SNM degradation
[Plot: axes labelled VQ, ranging from 0 to VDD; curves: SNM for fresh SRAM, SNM after ageing; arrow labelled Ageing]

IV. NBTI LIFETIME EVALUATION OF INSTRUCTION CACHE

According to the stress locality described in the last section, the NBTI lifetime of an instruction cache is predictable. We propose Algorithm 1 to evaluate the lifetime. This algorithm uses Monte

Carlo simulations to detect the moment when stored data is corrupted and also calculates the flipped bit rate over the whole instruction cache.

Fig. 5. Probability mean values and standard deviations of one cache word in instruction cache when running 16 benchmark programs. (a) CPU model: 32-bit ARM. (b) CPU model: 32-bit MIPS.
[Plot annotations: mean value µ26, standard deviation σ26]

Algorithm 1 SRAM cache lifetime evaluation
procedure LIFEEVA
    i ← 0                              ▷ i indicates current bit location
    j ← 1                              ▷ j indicates current iteration times
    t ← 0                              ▷ t indicates current year
Monte Carlo:
    α1 ~ N(µi, σi²)                    ▷ Normal distribution
    α2 ← 1 − α1
    ΔVth1 ← F_NBTI(α1, t)              ▷ Implement NBTI model
    ΔVth2 ← F_NBTI(α2, t)
    run SPICE simulation
    if Data flipping error occurs then
        lifetime ← t
    else if j < total iteration times then
        j ← j + 1
    else if i < instruction length − 1 then
        i ← i + 1
    else
        t ← t + 1
    update flipped bit rate
    goto Monte Carlo

To model typical operation, the stored value in each cell is set to both 1 and 0 but with different probabilities: we assume NBTI stress duty cycles follow normal distributions with the means and standard deviations shown in Figure 5. For each bit, we run 1600 simulations to guarantee a 95% confidence level within 1% probability error. The simulation results predict NBTI lifetimes of 6 and 7 years for ARM and MIPS architectures respectively, at which point the stored values in some SRAM cells start to be corrupted, as shown in Figure 6.

Fig. 6. SRAM cache NBTI lifetimes and failure rates simulations based on Algorithm 1 (T=300K, Vdd=1.2V, Vnoise=+/-0.32V). (a) CPU model: 32-bit ARM. (b) CPU model: 32-bit MIPS.
[Plot annotation in both panels: Data flipping error starts to occur]

V. LIFETIME EXTENSION BY PERIODIC CELL FLIPPING

The motivation for previous cell flipping work [7], [10] is to avoid a cell holding the same value for a long time. However, since the stored value is considered unpredictable in that work, the performance benefits of periodic cell flipping have not actually been proven. In our work, on the other hand, we note that the NBTI stress in the instruction cache stays constant over time.

Figure 7 shows the new probability mean values and standard deviations if cell flipping is applied. By definition, the mean values of the probabilities are at 50%. From this, the new predicted lifetimes can be calculated, as shown in Figure 8. As can be seen, for the same operating conditions, data failures start to occur after more than 300 years in both ARM and MIPS processors. While the exact figure is, of course, dependent on the modelling, it is clear that a significant extension to the lifetime of an SRAM instruction cache is achievable by simply flipping cell values periodically.

VI. CONCLUSION

Rapid shrinkage of CMOS transistors has led to concerns about reliability risks such as ageing. The effect of NBTI on the static noise margin of SRAM-based instruction caches is discussed in this paper. NBTI directly affects the

threshold voltage of PMOS devices and thereby impacts performance. In an SRAM cell, an unbalanced NBTI stress duty cycle can reduce the SNM and affect read stability. From our observations, the NBTI stress duty cycles of an instruction cache generally have similar patterns even when running very different programs. Therefore the NBTI lifetime is predictable, and our results suggest lifetimes of 6 and 7 years for instruction caches in ARM and MIPS processors respectively, using the proposed lifetime evaluation method. Additionally, the benefit of lifetime extension by periodically flipping each SRAM cell is demonstrated using our proposed stress patterns and lifetime evaluation algorithm. It has been shown that instruction cache lifetimes can be extended by two orders of magnitude by this technique.

Fig. 7. Probability mean values and standard deviations of cell flipping instruction cache. (a) CPU model: 32-bit ARM. (b) CPU model: 32-bit MIPS.

Fig. 8. NBTI lifetimes and failure rates simulations for both non-flipping cache and cell flipping one (T=300K, Vdd=1.2V, Vnoise=+/-0.32V). (a) CPU model: 32-bit ARM. (b) CPU model: 32-bit MIPS.
[Plot annotation in both panels: Data flipping error starts to occur]

REFERENCES

[1] G. Ribes, M. Rafik, and D. Roy, "Reliability issues for nano-scale CMOS dielectrics," Microelectronic Engineering, vol. 84, no. 9, pp. 1910–1916, 2007.
[2] W. Wang, S. Yang, S. Bhardwaj, S. Vrudhula, F. Liu, and Y. Cao, "The impact of NBTI effect on combinational circuit: modeling, simulation, and analysis," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. 18, no. 2, pp. 173–183, 2010.
[3] K. K. Saluja, S. Vijayakumar, W. Sootkaneung, and X. Yang, "NBTI degradation: A problem or a scare?" in VLSI Design, 2008. VLSID 2008. 21st International Conference on. IEEE, 2008, pp. 137–142.
[4] S. V. Kumar, C. H. Kim, and S. S. Sapatnekar, "An analytical model for negative bias temperature instability," in Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design. ACM, 2006, pp. 493–496.
[5] V. Huard, C. Parthasarathy, C. Guerin, T. Valentin, E. Pion, M. Mammasse, N. Planes, and L. Camus, "NBTI degradation: From transistor to SRAM arrays," in Reliability Physics Symposium, 2008. IRPS 2008. IEEE International. IEEE, 2008, pp. 289–300.
[6] A. Calimera, M. Loghi, E. Macii, and M. Poncino, "Dynamic indexing: concurrent leakage and aging optimization for caches," in Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design. ACM, 2010, pp. 343–348.
[7] S. V. Kumar, C. H. Kim, and S. S. Sapatnekar, "Impact of NBTI on SRAM read stability and design for reliability," in Quality Electronic Design, 2006. ISQED'06. 7th International Symposium on. IEEE, 2006, pp. 6–pp.
[8] J. Qin, X. Li, and J. B. Bernstein, "SRAM stability analysis considering gate oxide SBD, NBTI and HCI," in Integrated Reliability Workshop Final Report, 2007. IRW 2007. IEEE International. IEEE, 2007, pp. 33–37.
[9] X. Li, J. Qin, B. Huang, X. Zhang, and J. B. Bernstein, "SRAM circuit-failure modeling and reliability simulation with SPICE," Device and Materials Reliability, IEEE Transactions on, vol. 6, no. 2, pp. 235–246, 2006.
[10] A. Gebregiorgis, M. Ebrahimi, S. Kiamehr, F. Oboril, S. Hamdioui, and M. B. Tahoori, "Aging mitigation in memory arrays using self-controlled bit-flipping technique," in Design Automation Conference (ASP-DAC), 2015 20th Asia and South Pacific. IEEE, 2015, pp. 231–236.
[11] W. Wang, V. Reddy, A. T. Krishnan, R. Vattikonda, S. Krishnan, and Y. Cao, "Compact modeling and simulation of circuit reliability for 65-nm CMOS technology," Device and Materials Reliability, IEEE Transactions on, vol. 7, no. 4, pp. 509–517, 2007.
[12] S. Bhardwaj, W. Wang, R. Vattikonda, Y. Cao, and S. Vrudhula, "Predictive modeling of the NBTI effect for reliable design," in Custom Integrated Circuits Conference, 2006. CICC'06. IEEE. IEEE, 2006, pp. 189–192.
