NBTI Lifetime Evaluation and Extension in Instruction Caches

Shengyu Duan*, Basel Halak*, Rick Wong†, Mark Zwolinski*
* School of Electronics and Computer Science, University of Southampton, UK
Email: {sd5g13, bh9, mz}@ecs.soton.ac.uk
† Cisco Systems, Inc

Workshop on Early Reliability Modeling for Aging and Variability in Silicon Systems – March 18th 2016 – Co-Located with DATE 2016 - Dresden, Germany. Copyright © 2016 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.

Abstract—CMOS devices suffer from wearout mechanisms that result in reliability issues. Negative bias temperature instability (NBTI) is one of the dominant ageing effects; it causes a threshold voltage shift in PMOS devices and subsequently degrades circuit performance. The static noise margin (SNM) of an SRAM cell may be sharply reduced by unbalanced NBTI stress, which impacts SRAM read stability. From our observations of instruction caches, the NBTI stress duty cycles for each cache line generally have similar but unbalanced patterns even when running very different programs. Based on these patterns, we propose an algorithm to evaluate the lifetime of instruction caches by running SPICE simulation. The results predict NBTI lifetimes of 6 and 7 years for instruction caches in ARM and MIPS architectures respectively. One practical solution is to flip each cell periodically to balance the degradation rate. However, the lifetime benefits of this technique have not previously been demonstrated. Using the stress patterns and the lifetime evaluation algorithm, our work shows for the first time that this technique can extend the lifetime of the cache by two orders of magnitude.

I. INTRODUCTION

As transistor dimensions continue to shrink, reliability is one of the most significant remaining concerns for CMOS technology [1]. Negative bias temperature instability (NBTI) is one of the dominant ageing mechanisms, in which the threshold voltage (Vth) of a PMOS transistor increases over time [2]–[4].

The NBTI effect on CMOS memory devices such as SRAM caches has received much attention [5]–[7]. NBTI leads to degradation of the SRAM static noise margin (SNM) due to time-dependent mismatches [8], [9]. One practical solution is to flip each cell periodically to balance the degradation rate [7], [10]. However, since the stored value is considered unpredictable in these works, the performance benefits of this technique are not actually proven.

Our work presents a method to evaluate NBTI lifetime in instruction caches. The contributions are as follows: 1) a novel analysis of the instruction cache showing that the NBTI stress duty cycles for each cache line generally have similar patterns even when running very different programs; 2) an algorithm that runs SPICE simulation to predict the NBTI lifetime of the instruction cache based on this observation; 3) a demonstration of the lifetime extension achieved by cell flipping in instruction caches, using the stress patterns and the lifetime evaluation algorithm.

This paper is organized as follows. Section II presents the theory and simulation results of NBTI on both a single transistor and an SRAM cell. In Section III, we demonstrate that the pattern of NBTI stress locality does not vary much between different programs, and from this cell lifetimes are calculated. The lifetime evaluation algorithm and the simulation results for instruction caches in ARM and MIPS architectures are presented in Section IV, while Section V describes the lifetime extension by cell flipping. Finally, the paper is concluded in Section VI.

II. NBTI EFFECT AND SRAM CELL DEGRADATION

A. Impact of NBTI on Single PMOS Transistor

NBTI results in an increased Vth over time. A PMOS transistor switches between the NBTI stress phase and the recovery phase. Si-H bonds are dissociated under the negative bias condition (Vgs = −VDD), and hydrogen species and traps are produced at the oxide interface. These hydrogen species then diffuse away. Once the stress is removed (Vgs = 0), some bonds recover because of recombination with hydrogen. Some traps still remain and therefore the recovery is only partial.

Thus, the Vth shift is proportional to the density of traps at the oxide interface [4], [11]. The traps are produced during the stress phase and some are neutralized in the recovery phase. Therefore, Vth degradation is highly dependent on the stress duty cycle, which is the probability of a logic zero at the gate of a PMOS transistor in a digital circuit.

In [12], the authors propose a long-term NBTI model to quantify the Vth degradation after a given operation time t:

$$\Delta V_{th}(t) = \left( \frac{\sqrt{K_v^2 \, \alpha \, T_{clk}}}{1 - \beta_t^{1/2n}} \right)^{2n} \qquad (1)$$

where

$$\beta_t = 1 - \frac{2 \xi_1 t_e + \sqrt{\xi_2 \, C \, (1 - \alpha) \, T_{clk}}}{2 t_{ox} + \sqrt{C t}}$$

where α, the stress duty cycle, is the key parameter. Kv is a function of supply voltage, temperature and technology, while Tclk is the equivalent stress-recovery period. n is either 1/4 or 1/6 depending on the diffusion species (H or H2). C is the diffusion speed in the gate material. ξ1 and ξ2 represent the annealing probabilities in the oxide and the gate respectively. Finally, te is the effective oxide thickness, indicating the diffusion distance in the oxide, and is less than or equal to the oxide thickness, tox.

Fig. 1. NBTI simulations on PMOS transistor for different duty cycles (T=300K, Vdd=1.2V, Vtp=-0.276V)

Based on the long-term NBTI model above and data from [12], Vth shifts are simulated using MATLAB, as shown in Figure 1. The technology parameters are from the Synopsys 90-nm SPICE model. The 90-nm technology is admittedly dated. However, according to the NBTI models already proposed [2], [3], we believe the trends also apply to smaller technologies.

B. Impact of NBTI on 6-T SRAM Cell

Fig. 2. Six-transistor SRAM cell circuit (labels: WL, VDD, MP1, MP2, MN1 to MN4, storage node Q and its complement, bit line BL and its complement)

Figure 2 shows a basic 6-T SRAM cell, in which only the pull-up transistors, MP1 and MP2, suffer from NBTI [8]. Since MP1 and MP2 are part of cross-coupled inverters, only one of them is under NBTI stress at any time. This may result in unbalanced Vth degradation of the two transistors and thereby lead to a mismatch.

Fig. 3. SRAM cell SNM degradation (axes: storage-node voltages VQ, from 0 to VDD; annotations: SNM for fresh SRAM, SNM after ageing)

The static noise margin (SNM) is the largest noise voltage the SRAM cell can tolerate. Vth mismatch in an SRAM cell can result in an asymmetric transfer characteristic and thereby reduce the SNM (Figure 3). Using the long-term NBTI model in Equation (1) to modify the SPICE model, the degradation in the SRAM cell can be simulated (Figure 4). A cell with a 50% stress duty cycle ages most slowly because MP1 and MP2 remain matched. Uneven stress accelerates ageing.

Fig. 4. SNM degradation simulations for different duty cycles (T=300K, Vdd=1.2V, Vtp=-0.276V)

III. STRESS LOCALITY OF INSTRUCTION CACHE

We observe that the data stored in an instruction cache show very similar patterns when executing different benchmark programs. We ran a test on the instruction caches of ARM and MIPS architectures, using GEM5. 16 benchmark programs, all with more than ten thousand instructions, were chosen. The signal probability of each bit of each cache word is shown in Figure 5. It can be seen that some bits preserve the same values in most locations, consequently leading to NBTI stress locality.

Fig. 5. Probability mean values and standard deviations of one cache word in instruction cache when running 16 benchmark programs. (a) CPU model: 32-bit ARM (annotated example: mean value µ26, standard deviation σ26). (b) CPU model: 32-bit MIPS.

This phenomenon can be explained as follows. For any program, we can expect some types of instruction to be used more frequently than others. Take the ARM processor results in Figure 5a as an example. The most significant four bits are the condition field, and 1110 is used for unconditional instructions. The number of unconditional instructions is much larger than that of conditional ones in any program. As a result, the most significant four bits have a high probability of being 1110, as seen in Figure 5a.

If the SNM degrades to a value smaller than the expected noise, the stored data might be flipped, which causes a failure when the data is read out. This determines the NBTI lifetime of the instruction cache. A 50% signal probability gives the longest lifetime because both inverters in the SRAM cell age at the same rate and so are not mismatched. The bit with the probability furthest from 50% fails first, which determines the lifetime of the whole SRAM array.

IV. NBTI LIFETIME EVALUATION OF INSTRUCTION CACHE

Given the stress locality described in the last section, the NBTI lifetime of an instruction cache is predictable. We propose Algorithm 1 to evaluate the lifetime. This algorithm uses Monte Carlo simulations to detect the moment when stored data is corrupted and also calculates the flipped bit rate over the whole instruction cache.

Algorithm 1 SRAM cache lifetime evaluation
procedure LIFEEVA
    i ← 0                          ▷ i indicates current bit location
    j ← 1                          ▷ j indicates current iteration times
    t ← 0                          ▷ t indicates current year
  Monte Carlo:
    α1 ∼ N(µi, σi²)                ▷ Normal distribution
    α2 ← 1 − α1
    Vth1 ← F_NBTI(α1, t)           ▷ Implement NBTI model
    Vth2 ← F_NBTI(α2, t)
    run SPICE simulation
    if Data flipping error occurs then
        lifetime ← t
    else if j < total iteration times then
        j ← j + 1
    else if i < instruction length − 1 then
        i ← i + 1
    else
        t ← t + 1
        update flipped bit rate
    goto Monte Carlo

To model typical operation, the stored value in each cell is set to both 1 and 0 but with different probabilities: we assume the NBTI stress duty cycles follow normal distributions with the means and standard deviations shown in Figure 5. For each bit, we run 1600 simulations to guarantee a 95% confidence level within 1% probability error. The simulation results predict NBTI lifetimes of 6 and 7 years for ARM and MIPS architectures respectively, at which point the stored values in some SRAM cells start to be corrupted (Figure 6).

Fig. 6. SRAM cache NBTI lifetimes and failure rates simulations based on Algorithm 1 (T=300K, Vdd=1.2V, Vnoise=+/-0.32V). (a) CPU model: 32-bit ARM. (b) CPU model: 32-bit MIPS. Each panel marks the point at which data flipping errors start to occur.

V. LIFETIME EXTENSION BY PERIODIC CELL FLIPPING

The motivation for previous cell flipping work [7], [10] is to avoid a cell holding the same value for a long time. However, since the stored value is considered unpredictable in that work, the performance benefits of periodic cell flipping are not actually proven. In our work, on the other hand, we note that the NBTI stress in the instruction cache stays constant over time.

Figure 7 shows the new probability mean values and standard deviations if cell flipping is applied. By definition, the mean values of the probabilities are at 50%. From this, the new predicted lifetimes can be calculated, as shown in Figure 8. As can be seen, for the same operating conditions, data failures start to occur after more than 300 years in both ARM and MIPS processors. While the exact figure is, of course, dependent on the modelling, it is clear that a significant extension to the lifetime of an SRAM instruction cache is achievable by simply flipping cell values periodically.

Fig. 7. Probability mean values and standard deviations of cell flipping instruction cache. (a) CPU model: 32-bit ARM. (b) CPU model: 32-bit MIPS.

Fig. 8. NBTI lifetimes and failure rates simulations for both non-flipping cache and cell flipping one (T=300K, Vdd=1.2V, Vnoise=+/-0.32V). (a) CPU model: 32-bit ARM. (b) CPU model: 32-bit MIPS. Each panel marks the point at which data flipping errors start to occur.

VI. CONCLUSION

Rapid shrinkage of CMOS transistors has led to concerns about reliability risks such as ageing. The effect of NBTI on the static noise margin of SRAM-based instruction caches is discussed in this paper. NBTI directly affects the threshold voltage of PMOS devices and thereby impacts performance. In an SRAM cell, an unbalanced NBTI stress duty cycle can reduce the SNM and affect the read stability. From our observations, the NBTI stress duty cycles for an instruction cache generally have similar patterns even when running very different programs. Therefore the NBTI lifetime is predictable, and our results suggest lifetimes of 6 and 7 years for instruction caches in ARM and MIPS processors respectively, using the proposed lifetime evaluation method. Additionally, the benefit of lifetime extension by periodically flipping each SRAM cell is presented using our proposed stress patterns and lifetime evaluation algorithm. It has been shown that the instruction cache lifetimes can be extended by two orders of magnitude by this technique.

REFERENCES

[1] G. Ribes, M. Rafik, and D. Roy, “Reliability issues for nano-scale CMOS dielectrics,” Microelectronic Engineering, vol. 84, no. 9, pp. 1910–1916, 2007.
[2] W. Wang, S. Yang, S. Bhardwaj, S. Vrudhula, F. Liu, and Y. Cao, “The impact of NBTI effect on combinational circuit: modeling, simulation, and analysis,” Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. 18, no. 2, pp. 173–183, 2010.
[3] K. K. Saluja, S. Vijayakumar, W. Sootkaneung, and X. Yang, “NBTI degradation: A problem or a scare?” in VLSI Design, 2008. VLSID 2008. 21st International Conference on. IEEE, 2008, pp. 137–142.
[4] S. V. Kumar, C. H. Kim, and S. S. Sapatnekar, “An analytical model for negative bias temperature instability,” in Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design. ACM, 2006, pp. 493–496.
[5] V. Huard, C. Parthasarathy, C. Guerin, T. Valentin, E. Pion, M. Mammasse, N. Planes, and L. Camus, “NBTI degradation: From transistor to SRAM arrays,” in Reliability Physics Symposium, 2008. IRPS 2008. IEEE International. IEEE, 2008, pp. 289–300.
[6] A. Calimera, M. Loghi, E. Macii, and M. Poncino, “Dynamic indexing: concurrent leakage and aging optimization for caches,” in Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design. ACM, 2010, pp. 343–348.
[7] S. V. Kumar, C. H. Kim, and S. S. Sapatnekar, “Impact of NBTI on SRAM read stability and design for reliability,” in Quality Electronic Design, 2006. ISQED’06. 7th International Symposium on. IEEE, 2006, pp. 6–pp.
[8] J. Qin, X. Li, and J. B. Bernstein, “SRAM stability analysis considering gate oxide SBD, NBTI and HCI,” in Integrated Reliability Workshop Final Report, 2007. IRW 2007. IEEE International. IEEE, 2007, pp. 33–37.
[9] X. Li, J. Qin, B. Huang, X. Zhang, and J. B. Bernstein, “SRAM circuit-failure modeling and reliability simulation with SPICE,” Device and Materials Reliability, IEEE Transactions on, vol. 6, no. 2, pp. 235–246, 2006.
[10] A. Gebregiorgis, M. Ebrahimi, S. Kiamehr, F. Oboril, S. Hamdioui, and M. B. Tahoori, “Aging mitigation in memory arrays using self-controlled bit-flipping technique,” in Design Automation Conference (ASP-DAC), 2015 20th Asia and South Pacific. IEEE, 2015, pp. 231–236.
[11] W. Wang, V. Reddy, A. T. Krishnan, R. Vattikonda, S. Krishnan, and Y. Cao, “Compact modeling and simulation of circuit reliability for 65-nm CMOS technology,” Device and Materials Reliability, IEEE Transactions on, vol. 7, no. 4, pp. 509–517, 2007.
[12] S. Bhardwaj, W. Wang, R. Vattikonda, Y. Cao, and S. Vrudhula, “Predictive modeling of the NBTI effect for reliable design,” in Custom Integrated Circuits Conference, 2006. CICC’06. IEEE. IEEE, 2006, pp. 189–192.
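As a supplementary illustration of the long-term model in Equation (1) (Section II-A), the following Python sketch evaluates the threshold-voltage shift for a given stress duty cycle and the resulting mismatch between the two pull-up transistors of a cell. All numeric parameter values are illustrative placeholders chosen by us so that the code runs; they are not the fitted 90-nm values used in the paper, and the helper names (`delta_vth`, `mismatch`, `PARAMS`) are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Placeholder parameters (assumptions for illustration only, NOT the
# fitted Synopsys 90-nm values behind the paper's figures).
PARAMS = dict(
    kv=1e-7,     # prefactor K_v (depends on VDD, temperature, technology)
    t_clk=1.0,   # equivalent stress-recovery period T_clk, in seconds
    n=1 / 6,     # time exponent: 1/6 for H2 diffusion, 1/4 for H
    xi1=0.9,     # annealing probability in the oxide
    xi2=0.5,     # annealing probability in the gate
    c=0.6,       # diffusion speed in the gate material, nm^2/s
    te=1.4,      # effective oxide thickness, nm (te <= tox)
    tox=1.4,     # oxide thickness, nm
)


def delta_vth(alpha, t, p=PARAMS):
    """Equation (1): Vth shift for stress duty cycle alpha after t seconds."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("duty cycle must lie in [0, 1]")
    recovery = math.sqrt(p["xi2"] * p["c"] * (1.0 - alpha) * p["t_clk"])
    beta = 1.0 - (2.0 * p["xi1"] * p["te"] + recovery) / (
        2.0 * p["tox"] + math.sqrt(p["c"] * t)
    )
    if not 0.0 < beta < 1.0:
        raise ValueError("t is too small for the long-term model")
    two_n = 2.0 * p["n"]
    stress = math.sqrt(p["kv"] ** 2 * alpha * p["t_clk"])
    return (stress / (1.0 - beta ** (1.0 / two_n))) ** two_n


def mismatch(alpha, t, p=PARAMS):
    """Vth mismatch of a cell whose pull-ups see alpha and 1 - alpha."""
    return abs(delta_vth(alpha, t, p) - delta_vth(1.0 - alpha, t, p))


if __name__ == "__main__":
    t = 5 * SECONDS_PER_YEAR
    for alpha in (0.1, 0.5, 0.9):
        print(alpha, delta_vth(alpha, t), mismatch(alpha, t))
```

With any parameter set of this form the shift grows with both duty cycle and time, and the mismatch vanishes at a 50% duty cycle, which is the behaviour Section II-B relies on.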
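The control flow of Algorithm 1 (Section IV) can also be sketched in Python. This is one possible reading of the algorithm, restructured from the goto form into nested loops, and not the authors' implementation. The SPICE run is replaced by a caller-supplied predicate `cell_fails`, the NBTI model by a caller-supplied function `f_nbti`, and sampled duty cycles are clipped to [0, 1]; all three choices, and the function name `evaluate_lifetime`, are our assumptions.

```python
import random


def evaluate_lifetime(mu, sigma, f_nbti, cell_fails,
                      iterations=1600, max_years=50, seed=0):
    """Return (first failing year or None, {year: flipped-bit rate}).

    mu, sigma  : per-bit mean and standard deviation of the duty cycle
    f_nbti     : f_nbti(alpha, t) -> Vth shift after t years
    cell_fails : stand-in for the SPICE check; True if the cell flips
    """
    rng = random.Random(seed)
    lifetime = None
    rates = {}
    for t in range(max_years + 1):           # t: current year
        flipped = 0
        for i in range(len(mu)):             # i: current bit location
            for _ in range(iterations):      # Monte Carlo iterations
                # Sample the duty cycle; clipping is our assumption.
                alpha1 = min(1.0, max(0.0, rng.gauss(mu[i], sigma[i])))
                alpha2 = 1.0 - alpha1
                if cell_fails(f_nbti(alpha1, t), f_nbti(alpha2, t)):
                    flipped += 1
        rates[t] = flipped / (len(mu) * iterations)
        if flipped and lifetime is None:
            lifetime = t                     # first year with any failure
    return lifetime, rates


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without SPICE.
    life, rates = evaluate_lifetime(
        mu=[0.5, 0.9], sigma=[0.0, 0.0],
        f_nbti=lambda alpha, t: alpha * t,
        cell_fails=lambda d1, d2: abs(d1 - d2) > 2.0,
        iterations=10, max_years=5,
    )
    print(life, rates)
```

In a real flow, `f_nbti` would be the long-term model of Equation (1) and `cell_fails` would wrap a SPICE read-stability simulation against the expected noise level.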
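The statement in Section V that the mean probabilities sit at 50% under cell flipping can be checked in one line. Assume (our simplification, not stated in the paper) that over a long interval each cell holds its true value for half the time and its inverted value for the other half. If the unflipped stress duty cycles of the two pull-up transistors are α and 1 − α, then both see the same effective duty cycle:

```latex
\alpha_{\mathrm{flip}}
  = \tfrac{1}{2}\,\alpha + \tfrac{1}{2}\,(1 - \alpha)
  = \tfrac{1}{2}
```

The result is independent of α, so under this assumption MP1 and MP2 age at the same rate whatever the stored instruction bits are; only the spread around 50% remains.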