=Paper= {{Paper |id=Vol-1566/paper1 |storemode=property |title=Design–Reliability Flow and Advanced Models Address IC-Reliability Issues |pdfUrl=https://ceur-ws.org/Vol-1566/Paper1.pdf |volume=Vol-1566 |authors=Mohamed Selim,Eric Jeandeau,Cyril Descleves |dblpUrl=https://dblp.org/rec/conf/date/SelimJD16 }} ==Design–Reliability Flow and Advanced Models Address IC-Reliability Issues== https://ceur-ws.org/Vol-1566/Paper1.pdf


  Design–Reliability Flow and Advanced Models Address IC-
                       Reliability Issues
                            Mohamed Selim, Eric Jeandeau, Cyril Descleves
                                         Mentor Graphics
Abstract

Reliability effects are real threats with advanced process nodes. This paper describes the industrial challenges associated with accurate modeling of aging problems. Joint design–reliability flows can mitigate their effects, and allow designers to take those effects into account as early as possible.

  I.    INTRODUCTION

Advanced short-geometry CMOS processes are subject to aging that causes major reliability issues, degrading the performance of integrated circuits over time. The degradation effects that cause aging are hot carrier injection (HCI) and negative bias temperature instability (NBTI), in addition to positive bias temperature instability (PBTI) and time-dependent dielectric breakdown (TDDB). Below 90 nm, consideration of these effects is becoming mandatory for design flows targeting quality and reliability. This paper describes a state-of-the-art simulation flow that can help designers address these issues and create more reliable designs.

These particular reliability effects modify the fundamental behavior of the transistors, such as threshold voltage (Vth) and the mobility factor [1]. No application that makes full use of the process performance is really safe. These changes will affect timing delays, drive currents, leakage, linearity, and every possible specification that may appear in IC design, be it for automotive, biomedical, military-aerospace, wireless communications, or video. Basically, all industry sectors are potentially affected.

Device failure phases are described as:

a) Infant Mortality (early rate)
b) Normal Operating Life (constant random failures, intrinsic rate)
c) Wear Out

For new technology nodes the wear-out failure curve shifts to the left, which means earlier device degradation, as shown in Figure 1 [2]. The actual stress effect depends on the design itself. The manufacturing process can be refined so that these effects are globally minimized. However, degradation is tightly linked to the design itself: the architecture, the chosen device geometries and, most importantly, the actual stimuli applied to the circuit during operation will strongly determine the magnitude and speed of degradation. Fortunately, tools are now available that help designers to understand exactly when and how much devices are stressed and damaged, and how this aging impacts the device or circuit performance after a certain time, allowing designers to take corrective actions if necessary. The necessity of linking classical design flows with the capability to predict reliability has been recognized by the industry [3, 4, 5].

Figure 1: Reliability bathtub and the effect of new node introduction.

  II.   RELIABILITY SIMULATION

Each device experiences a different stress that depends on its exact individual bias conditions, in addition to global conditions such as temperature. This stress is computed individually and integrated over a particular time window [6]. This time window is chosen to be compatible with the CPU time constraints for circuit simulation. By extrapolation, an estimate of the stress seen by each device during a much longer periodic operation (maybe weeks, months, or years) is computed.

This stress quantity is then used to compute degraded values of the model parameters (such as the threshold voltage and the mobility). Using these degraded model parameters, a new simulation is run which represents what would happen after N years of operation. In this aged simulation, each device uses its own individual degraded model with updated Vth, mobility, etc., because the stress is device-specific and




Workshop on Early Reliability Modeling for Aging and Variability in Silicon Systems – March 18th 2016 – Co-Located with DATE
2016 - Dresden, Germany
Copyright © 2016 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This
volume is published and copyrighted by its editors.



so is the update. Fresh and aged simulation results can be overlaid and compared.

Models can take into account either DC stress or dynamic stress, including the recovery effect [4]. Including the recovery effect leads to less pessimistic estimates of the degradation than a static DC stress approach, hence avoiding over-design to compensate for the degradation. Although rather sophisticated, this entire process is now completely integrated in regular SPICE flows (Figure 2). From the designer’s perspective, performing an aging analysis of a cell or circuit is just a new command in the simulator, and just as simple to use.

Figure 2: A joint design and reliability simulation flow.

  III.  AGING AND RELIABILITY SIMULATION IN ELDO

The computation of the stress and the update of the electrical model parameters are done with user-defined equations. These two sets of equations are called a stress model and an update model. They can have virtually any complexity, from the simplest two-lines-of-code model to a pages-and-pages model. Examples of realistic models can be found in [7, 8, 9]. The models are described using a C API that communicates with the simulator kernel, allowing maximum computing efficiency. As explained, a typical aging analysis requires two simulation runs. The first run uses the nominal (fresh) models to produce the waveforms, but it is also in charge of computing the stress quantities. The second run uses the individually degraded models, and its CPU cost is the same as that of a nominal run. Thus, the total aging analysis is at least 2x more costly than a nominal simulation. Depending on the complexity of the stress model, the total combined CPU time is usually between 2x and 2.5x the time of a nominal simulation. Therefore, the cost in terms of CPU time is very modest given the importance of the provided insight, especially compared to Monte Carlo analysis or simple corner case analysis, where the ratio is easily in the range of 100x.

To further complicate an already difficult subject, aging appears to be a statistical process unto itself. If you consider two identical devices drawn side-by-side, they will have slightly different (fresh) threshold voltages, drive currents, and leakage currents. The distribution of these parameters is monitored by the foundries, allowing statistical simulations by designers. But, unfortunately, the same is true for their aging characteristics. Two identical devices with identical bias conditions do not age in exactly the same way. The Vth shift, for example, is a statistical variable with its own mean, variance, etc. This spread of the aged transistor parameters will also be reflected in the measured performances, such as a propagation time (see Figure 3) and distortion level. However, the measurements needed to capture the statistical nature of the aging process are much more complicated, lengthy, and costly than regular process monitoring.

Solid information about the statistical properties of the aging process is rarely available from the foundries; however, Eldo supports statistical analysis on top of the aging flow. For example, it is possible to assign a certain variance to any of the parameters that define the aging models and run a Monte Carlo simulation to get a statistical view of the aging trajectories.

Reliability models are implemented in the Eldo UDRM (User Defined Reliability Model) interface [10]. The modeled damage can follow one or more of several damage mechanisms that show gradual degradation in device performance. Other mechanisms, which cause sudden and complete damage of the device (such as abrupt oxide breakdown), are not targeted in this scope. Reliability changes can be analyzed for any degradation effect: HCI, NBTI, PBTI and TDDB.

For long-term reliability simulation, there are two possible schemes: the two-simulations scheme and the repetitive scheme.

In the two-simulations scheme the steps are as follows:
1. A transient simulation with the “fresh” device is done.
2. At the end of the fresh simulation, the amount of damage each device has been subjected to, caused by the stress applied on the device, is calculated.
3. The transistor models are updated accordingly using the equations specified in the user-defined reliability functions.
4. A new transient simulation is run with the aged device.

In the repetitive scheme, the long period Tage is divided into smaller time intervals Ti (where Tage = ΣTi). The same steps as in the two-simulations scheme are followed, except that steps 2 to 4 are repeated NBRUN times, where NBRUN is the number of time intervals. The calculation of the stress is updated at the end of every time interval. This process is repeated until t = Tage. This approach can account for the gradually changing bias conditions that result from device degradation. If the number of time intervals is chosen to be equal to one, the repetitive scheme is identical to the two-simulations scheme. The more general case, the repetitive scheme, is implemented in Eldo to ensure a high level of accuracy, and account for the




gradually changing bias conditions as a result of device degradation, which could significantly affect the results. This flow is described in Figure 4.

Figure 4: Reliability simulation repetitive scheme flowchart.

Until recently, aging analysis has mainly focused on active devices. However, new challenges have appeared in industrial and automotive power applications, where resistor and capacitor degradation is becoming a critical issue. For example, accurate modeling of resistor instability has become mandatory in the latest high-voltage processes. Eldo addresses these new challenges by targeting any kind of design region, not only blocks that contain transistors. Furthermore, because Eldo can run electro-thermal simulations using an electrical netlist connected to a thermal netlist (built using an RC network), these two capabilities can be combined to apply aging to the thermal RC components that are part of the simulated thermal network. Without this capability, an electro-thermal simulation assumes that the thermal RC network does not degrade over time. Eldo helps remove this limitation by applying degradation models to both the electrical and thermal netlists.

  IV.   JOINT DESIGN–RELIABILITY FLOW

Using a joint design-reliability flow allows the designer to predict the behavior of the circuit versus “wall-clock” time. Important metrics can be traced versus time and verified against the specifications. For example, Figure 5 shows how the operating frequency of a CMOS oscillator degrades over time. The absolute period and the relative degradation (in %) of the frequency are shown in the upper and lower plots, respectively. The x-axis is the time in years on a logarithmic scale. The frequency is degraded by nearly 5% after only one year of operation. The rest of the circuit may or may not be able to accommodate such degradation.

Figure 3: The spread of aged transistor parameters is reflected in measured performances such as a propagation time.

Predicting the degradation of a key metric (here it is an oscillation frequency, but it could be power consumption, a distortion level, or a settling time) is interesting. However, it is a compound result that depends on many variables. The next thing a designer wants to know is which device is primarily responsible for the degradation of the observed metric. Joint design-reliability flows allow identification of the devices that are subject to the largest degradation. The information is typically presented in tabular format with sorting criteria. For example, a table may show the relative degradation of the drive current, the linear current, and also the transconductance (gm), Vth, or generally any quantity of interest. The designer can choose the sorting criterion. In the example shown in Figure 6, the results are sorted by decreasing “delta-Vth,” so the devices whose threshold voltage is degraded most severely are presented first. This allows the designer to immediately identify the areas in the circuit that require extra attention.

IP protection is one of the main benefits of this flow. For many companies or foundries, the details of the equations and models used to predict degradation are not considered public information. Rather, they consider it to be sensitive proprietary information that they are not keen to disclose in any way. For this purpose, Eldo provides encryption mechanisms that allow full protection of the information. Users can run the simulation using only binary, non-human-readable model files. In addition, security encryption keys can be used to restrict access and control the execution of the models by the simulators. Once the libraries are encrypted and protected, only licensed partners, customers or sub-contractors, such as design houses, can make use of them.
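
To make the idea of tracing a metric against wall-clock time more concrete, here is a minimal Python sketch; it is not Eldo's implementation and does not use the UDRM C API. It ages a single toy device in NBRUN log-spaced intervals, recomputes the stress rate from the already-aged bias point at each interval (the repetitive scheme of Section III), and reports the relative frequency change against a limit. The power-law stress model, the alpha-power frequency expression, the 5% limit, and every constant and name (A_STRESS, N_TIME, ALPHA, frequency, age_repetitive) are assumptions chosen purely for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# All values below are hypothetical, chosen only to give readable numbers.
VDD = 1.0          # supply voltage (V)
VTH_FRESH = 0.30   # fresh threshold voltage (V)
ALPHA = 1.3        # exponent of the assumed alpha-power frequency law
A_STRESS = 1.0e-3  # assumed stress prefactor
N_TIME = 0.2       # assumed power-law time exponent


def frequency(vth):
    """Toy oscillator frequency (arbitrary units); only ratios are used."""
    return (VDD - vth) ** ALPHA / VDD


def age_repetitive(t_age_s, nbrun):
    """Age one toy device up to t_age_s seconds in nbrun intervals.

    Returns a list of (time_s, vth, freq_change_percent), one per interval.
    With nbrun == 1 this reduces to the two-simulations scheme.
    """
    if nbrun == 1:
        ends = [t_age_s]
    else:
        # Log-spaced interval end points between t_age/1e4 and t_age.
        t_first = t_age_s / 1.0e4
        ratio = t_age_s / t_first
        ends = [t_first * ratio ** (i / (nbrun - 1)) for i in range(nbrun)]

    f_fresh = frequency(VTH_FRESH)
    vth, damage, t_prev = VTH_FRESH, 0.0, 0.0
    trace = []
    for t_end in ends:
        # Stress model: the rate depends on the current (aged) gate overdrive.
        rate = A_STRESS * (VDD - vth)
        # Accumulate damage so that constant stress gives dVth = rate * t**n.
        damage += rate ** (1.0 / N_TIME) * (t_end - t_prev)
        # Update model: degraded threshold voltage used for the next interval.
        vth = VTH_FRESH + damage ** N_TIME
        trace.append((t_end, vth, 100.0 * (frequency(vth) / f_fresh - 1.0)))
        t_prev = t_end
    return trace


if __name__ == "__main__":
    spec_percent = -5.0  # hypothetical limit on frequency loss
    for t, vth, dfreq in age_repetitive(10 * SECONDS_PER_YEAR, nbrun=8):
        status = "ok" if dfreq >= spec_percent else "VIOLATION"
        years = t / SECONDS_PER_YEAR
        print(f"{years:9.4f} y  Vth={vth:.4f} V  df={dfreq:6.2f} %  {status}")
```

In this toy model the stress rate falls as Vth rises, so the multi-interval result is slightly less pessimistic than the single-interval one. A real stress model may behave differently; the sketch only shows why updating the bias conditions between intervals can change the answer.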

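
The sorted degradation table described in Section IV can be sketched in a few lines of Python. This is only an illustration of the ranking idea: the record layout (DeviceAging), the function names, and the device values are all hypothetical and are not taken from Figure 6 or from Eldo's output format. The sketch sorts by the magnitude of the change, which is one possible reading of "decreasing delta-Vth" when NMOS and PMOS shifts have opposite signs.

```python
from dataclasses import dataclass


@dataclass
class DeviceAging:
    """Fresh and aged values for one device (hypothetical record layout)."""
    name: str
    vth_fresh: float    # V
    vth_aged: float     # V
    idsat_fresh: float  # A
    idsat_aged: float   # A

    @property
    def delta_vth(self):
        """Absolute threshold-voltage shift in volts."""
        return self.vth_aged - self.vth_fresh

    @property
    def idsat_change_percent(self):
        """Relative drive-current change in percent."""
        return 100.0 * (self.idsat_aged / self.idsat_fresh - 1.0)


def rank_devices(devices, criterion="delta_vth"):
    """Sort devices by decreasing magnitude of the chosen criterion."""
    return sorted(devices, key=lambda d: abs(getattr(d, criterion)), reverse=True)


def format_table(devices):
    """Render the ranked devices as a plain-text table."""
    lines = [f"{'device':<8}{'dVth (mV)':>12}{'dIdsat (%)':>12}"]
    for d in devices:
        lines.append(
            f"{d.name:<8}{1e3 * d.delta_vth:>12.1f}{d.idsat_change_percent:>12.2f}"
        )
    return "\n".join(lines)


# Made-up example data; not taken from Figure 6.
DEVICES = [
    DeviceAging("MN1", 0.300, 0.312, 1.00e-4, 0.92e-4),
    DeviceAging("MP1", -0.320, -0.361, 0.80e-4, 0.71e-4),
    DeviceAging("MP2", -0.320, -0.338, 0.80e-4, 0.76e-4),
]

if __name__ == "__main__":
    print(format_table(rank_devices(DEVICES)))
```

Passing `criterion="idsat_change_percent"` to `rank_devices` reorders the same records by drive-current loss, mirroring the designer's choice of sorting criterion.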



Figure 5: The operating frequency of a CMOS oscillator degrades over time.

Figure 6: The results are sorted by decreasing “delta-Vth.” The devices that have their threshold voltage degraded the most severely are presented first.

  V.    REAL DESIGN AGING USING ELDO UDRM

A complete use model of the flow is described in [7], starting from stress and update model development, through implementation in the Eldo UDRM interface, to aging analysis of a design, and showing how crucial the stress effects were for that design. The authors worked on a zero crossing detector comparator (ZCDC), which is an essential part of Smart Power ICs driving inductive loads. Its function is to avoid the inversion of the output current. The design included two MOS devices that operated under different stress conditions. The switching of both transistors is monitored for the operation of the ZCDC; hence accurately modeling the Vth of both transistors, and the shift in the behavior of both M1 and M2 due to stress, is mandatory to obtain accurate simulation results.

The M1 drain voltage switches from VDD to 0 following a 1 MHz square waveform, while M2 is stressed in DC conditions. M1 is stressed for half a period, while in the other half it is off and can partially recover from the degradation. Therefore, after reliability simulation, the Vth of M2 becomes more negative than that of M1, and the comparator switches at a positive drain voltage given by the difference between the Vth shifts in the input-stage MOS devices. This effect, which cannot be predicted without modeling NBTI recovery, can generate unacceptable reverse currents that cause the circuit to fail, as observed in a real case. This has been illustrated in [7] using aging simulations; the results are given in Figure 7, which the designer should use to mitigate the aging risk to the design’s behavior.

Figure 7: Simulated current waveforms before (black) and after (blue and red) reliability trial (1000 h, 175 °C). Due to the mismatch in the Vth degradation of M1 and M2, an offset in the ZCDC is generated, resulting in a too large reverse current (red curve). The failure condition cannot be predicted without recovery modeling (blue curve). [7]

  VI.   CONCLUSIONS

Reliability effects are real threats with advanced process nodes. Joint design–reliability flows, however, can mitigate their effects and allow designers to take those effects into account as early as possible.

Eldo provides a fully customizable and robust aging simulation interface that allows IP protection: the details of the equations and models used to predict degradation can be encrypted to secure the IP.

This solution can be used with any type of analysis: AC, DC, transient, RF, statistical, sensitivity and mixed-signal simulations.

REFERENCES

[1] Xiaojun Li, Jin Qin, and Joseph B. Bernstein, “Compact Modeling of MOSFET Wearout Mechanisms for Circuit-Reliability Simulation,” IEEE Transactions on Device and Materials Reliability, Vol. 8, No. 1, March 2008.
[2] Alain Braviax, IIRW tutorial 2010.
[3] Wenping Wang, Vijay Reddy, Anand T. Krishnan, Rakesh Vattikonda, Srikanth Krishnan, and Yu Cao, “An Integrated Modeling Paradigm of Circuit Reliability for 65nm CMOS Technology,” IEEE 2007 Custom Integrated Circuits Conference (CICC).
[4] S. Bhardwaj, W. Wang, R. Vattikonda, Y. Cao, and S. Vrudhula, “Predictive Modeling of the NBTI Effect for Reliable Design,” Proceedings of the IEEE Custom Integrated Circuits Conference, pp. 189–192, September 2006.
[5] Mridul Agarwal, Varsha Balakrishnan, Anshuman Bhuyan, Kyunglok Kim, Bipul C. Paul, Wenping Wang, Bo Yang, Yu Cao, and Subhasish Mitra, “Optimized Circuit Failure Prediction for Aging: Practicality and Promise,” IEEE 2008 International Test Conference.
[6] Mikido Sode, et al., “Reliability Simulation Environment Tackles LSI Design,” Chip Design, June 2007.
[7] Filippo Alagi, et al., “Compact model for parametric instability under arbitrary stress waveform,” 2014 44th European Solid State Device Research Conference (ESSDERC), 2014.
[8] Vincent Huard, et al., “CMOS device design-in reliability approach in advanced nodes,” IEEE International Reliability Physics Symposium, 2009.
[9] Filippo Alagi, Roberto Stella, and Emanuele Viganó, “Aging model for a 40 V Nch MOS, based on an innovative approach,” IETE Journal of Research 58.3 (2012): 191-196.
[10] Eldo UDRM manual.

