=Paper=
{{Paper
|id=Vol-1566/paper1
|storemode=property
|title=Design–Reliability Flow and Advanced Models Address IC-Reliability Issues
|pdfUrl=https://ceur-ws.org/Vol-1566/Paper1.pdf
|volume=Vol-1566
|authors=Mohamed Selim,Eric Jeandeau,Cyril Descleves
|dblpUrl=https://dblp.org/rec/conf/date/SelimJD16
}}
==Design–Reliability Flow and Advanced Models Address IC-Reliability Issues==
Mohamed Selim, Eric Jeandeau, Cyril Descleves
Mentor Graphics

Workshop on Early Reliability Modeling for Aging and Variability in Silicon Systems – March 18th 2016 – Co-Located with DATE 2016 - Dresden, Germany
Copyright © 2016 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.

Abstract: Reliability effects are real threats with advanced process nodes. This paper describes the industrial challenges associated with accurate modeling of aging problems. Joint design–reliability flows can mitigate their effects, and allow designers to take those effects into account as early as possible.

I. INTRODUCTION

Advanced short-geometry CMOS processes are subject to aging that causes major reliability issues, degrading the performance of integrated circuits over time. The degradation effects that cause aging are hot carrier injection (HCI) and negative bias temperature instability (NBTI), in addition to positive bias temperature instability (PBTI) and time-dependent dielectric breakdown (TDDB). Below 90 nm, consideration of these effects is becoming mandatory for design flows targeting quality and reliability. This paper describes a state-of-the-art simulation flow that can help designers address these issues and create more reliable designs.

These particular reliability effects modify the fundamental parameters of the transistors, such as the threshold voltage (Vth) and the mobility factor [1]. These changes affect timing delays, drive currents, leakage, linearity, and every possible specification that may appear in IC design, be it for automotive, biomedical, military-aerospace, wireless communications, or video. No application that makes full use of the process performance is really safe. Basically, all industry sectors are potentially affected.

Device failure phases are described as:
a) Infant Mortality (early rate)
b) Normal Operating Life (constant random failures, intrinsic rate)
c) Wear Out

For new technology nodes, the wear-out failure curve shifts to the left, which means earlier device degradation, as shown in Figure 1 [2]. The actual stress effect depends on the design itself. The manufacturing process can be refined so that these effects are globally minimized. However, degradation is tightly linked to the design itself: the architecture, the chosen device geometries and, most importantly, the actual stimuli applied to the circuit during operation strongly determine the magnitude and speed of degradation. Fortunately, tools are now available that help designers understand exactly when and how much devices are stressed and damaged, and how this aging impacts the device or circuit performance after a certain time, allowing designers to take corrective action if necessary. The necessity of linking classical design flows with the capability to predict reliability has been recognized by the industry [3, 4, 5].

Figure 1: Reliability bathtub and the effect of new node introduction.

II. RELIABILITY SIMULATION

Each device experiences a different stress that depends on its exact individual bias conditions, in addition to global conditions such as temperature. This stress is computed individually and integrated over a particular time window [6]. This time window is chosen to be compatible with the CPU time constraints for circuit simulation. By extrapolation, an estimate of the stress seen by each device during a much longer periodic operation (maybe weeks, months, or years) is computed.

This stress quantity is then used to compute degraded values of the model parameters (such as the threshold voltage, mobility, etc.). Using these degraded model parameters, a new simulation is run which represents what would happen after N years of operation. In this aged simulation, each device uses its own individual degraded model with updated Vth, mobility, etc., because the stress is device-specific and
so is the update. Fresh and aged simulation results can be overlaid and compared.

Models can take into account either DC stress or dynamic stress, including the recovery effect [4]. Including the recovery effect leads to less pessimistic estimates of the degradation than a static DC stress approach, hence avoiding over-design to compensate for the degradation.

Although rather sophisticated, this entire process is now completely integrated in regular SPICE flows (Figure 2). From the designer’s perspective, performing an aging analysis of a cell or circuit is just a new command in the simulator, and just as simple to use.

Figure 2: A joint design and reliability simulation flow

III. AGING AND RELIABILITY SIMULATION IN ELDO

The computation of the stress and the update of the electrical model parameters are done with user-defined equations. These two sets of equations are called a stress model and an update model. They can have virtually any complexity, from the simplest two-lines-of-code model to a pages-and-pages model. Examples of realistic models can be found in [7, 8, 9]. The models are described using a C API that communicates with the simulator kernel, allowing maximum computing efficiency. As explained, a typical aging analysis requires two simulation runs. The first run uses the nominal (fresh) models to produce the waveforms, but it is also in charge of computing the stress quantities. The second run uses the individually degraded models, and its CPU cost is exactly the same as that of a nominal run. Thus, the total aging analysis costs at least 2x as much as a nominal simulation. Depending on the complexity of the stress model, the total combined CPU time is usually between 2x and 2.5x the time of a nominal simulation. Therefore, the cost in terms of CPU time is very modest given the importance of the provided insight, especially compared to Monte Carlo analysis or simple corner case analysis, where the ratio is easily in the range of 100x.

To further complicate an already difficult subject, aging appears to be a statistical process unto itself. If you consider two identical devices drawn side-by-side, they will have slightly different (fresh) threshold voltages, drive currents, and leakage currents. The distribution of these parameters is monitored by the foundries, allowing statistical simulations by designers. But, unfortunately, the same is true for their aging characteristics. Two identical devices with identical bias conditions do not age in exactly the same way. The Vth shift, for example, is a statistical variable with its own mean, variance, etc. This spread of the aged transistor parameters is also reflected in the measured performances, such as propagation time (see Figure 3) and distortion level. However, the measurements needed to capture the statistical nature of the aging process are much more complicated, lengthy, and costly than regular process monitoring.

Solid information about the statistical properties of the aging process is rarely available from the foundries; however, Eldo supports statistical analysis on top of the aging flow. For example, it is possible to assign a certain variance to any of the parameters that define the aging models and run a Monte Carlo simulation to get a statistical view of the aging trajectories.

Reliability models are implemented in the Eldo UDRM (User Defined Reliability Model) interface [10]. The modeled damage can follow one or more of several damage mechanisms that cause gradual degradation in device performance. Other mechanisms, which cause sudden and complete damage to the device (such as abrupt oxide breakdown), are not targeted in this scope. Reliability changes can be analyzed for any degradation effect: HCI, NBTI, PBTI and TDDB.

In a long-term reliability simulation, there are two possible methods: the two-simulations scheme and the repetitive scheme.

In the two-simulations scheme, the steps are as follows:
1. A transient simulation with the “fresh” device is done.
2. At the end of the fresh simulation, the amount of damage each device has been subjected to, caused by the stress applied on the device, is calculated.
3. The transistor models are updated accordingly using the equations specified in the user-defined reliability functions.
4. A new transient simulation is run with the aged device.

In the repetitive scheme, the long period Tage is divided into smaller time intervals Ti (where Tage=ΣTi). The same steps as in the two-simulations scheme are followed, except that steps 2 to 4 are repeated NBRUN times, where NBRUN is the number of time intervals. The calculation of the stress is updated at the end of every time interval. This process is repeated until t=Tage. This approach can account for the gradually changing bias conditions that result from device degradation. If the number of time divisions is chosen to be one, the repetitive scheme is identical to the two-simulations scheme. The more general case, the repetitive scheme, is implemented in Eldo to ensure a high level of accuracy and to account for the
gradually changing bias conditions that result from device degradation, which could significantly affect the results. This flow is described in Figure 4.

Until recently, aging analysis has mainly focused on active devices. However, new challenges have appeared in industrial and automotive power applications, where resistor and capacitor degradation is becoming a critical issue. For example, accurate modeling of resistor instability has become mandatory in the latest high-voltage processes. Eldo addresses these new challenges by targeting any kind of design region, not only blocks that contain transistors. Furthermore, because Eldo can run electro-thermal simulations using an electrical netlist connected to a thermal netlist (built as an RC network), the two capabilities can be combined to apply aging to the thermal RC components that are part of the simulated thermal network. Without this capability, an electro-thermal simulation assumes that the thermal RC network does not degrade over time. Eldo can help remove this limitation by applying degradation models to both the electrical and thermal netlists.

Figure 3: The spread of aged transistor parameters is reflected in measured performances such as propagation time.

Figure 4: Reliability simulation repetitive scheme flowchart.

IV. JOINT DESIGN–RELIABILITY FLOW

Using a joint design–reliability flow allows the designer to predict the behavior of the circuit versus “wall-clock” time. Important metrics can be traced versus time and verified against the specifications. For example, Figure 5 shows how the operating frequency of a CMOS oscillator degrades over time. The absolute period and the relative degradation (in %) of the frequency are shown in the upper and lower plots, respectively. The x-axis is the time in years on a logarithmic scale. The frequency is degraded by nearly 5% after only one year of operation. The rest of the circuit may or may not be able to accommodate such degradation.

Predicting the degradation of a key metric (here an oscillation frequency, but it could be power consumption, a distortion level, or a settling time) is interesting. However, it is a compound result that depends on many variables. The next thing a designer wants to know is which device is primarily responsible for the degradation of the observed metric. Joint design–reliability flows allow identification of the devices that are subject to the largest degradation. The information is typically presented in tabular format with sorting criteria. For example, a table may show the relative degradation of the drive current, the linear current, the transconductance (gm), Vth, or generally any quantity of interest. The designer can choose the sorting criterion. In the example shown in Figure 6, the results are sorted by decreasing “delta-Vth,” so the devices whose threshold voltage has degraded most severely are presented first. This allows the designer to immediately identify the areas in the circuit that require extra attention.

IP protection is one of the main benefits of this flow. For many companies or foundries, the details of the equations and models used to predict degradation are not considered public information. Rather, they are considered sensitive proprietary information that the owners are not keen to disclose in any way. For this purpose, Eldo provides encryption mechanisms that allow full protection of the information. Users can run the simulation using only binary, non-human-readable model files. In addition, security encryption keys can be used to restrict access and control the execution of the models by the simulators. Once the libraries are encrypted and protected, only licensed partners, customers, or sub-contractors, such as design houses, can make use of them.
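
The per-device degradation table described earlier in this section (results sorted by decreasing “delta-Vth”) can be illustrated with a minimal Python sketch. This is not Eldo output and does not use any Eldo API: the class `DeviceAging`, the functions `rank_by_delta_vth` and `format_table`, the device names, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeviceAging:
    """Fresh and aged values for one device (all values are made up)."""
    name: str
    vth_fresh: float    # threshold voltage before aging, in volts
    vth_aged: float     # threshold voltage after aging, in volts
    idsat_fresh: float  # drive-current magnitude before aging, in amperes
    idsat_aged: float   # drive-current magnitude after aging, in amperes

    @property
    def delta_vth(self) -> float:
        """Magnitude of the threshold-voltage shift, in volts."""
        return abs(self.vth_aged - self.vth_fresh)

    @property
    def idsat_change_pct(self) -> float:
        """Relative change of the drive current, in percent."""
        return 100.0 * (self.idsat_aged - self.idsat_fresh) / self.idsat_fresh


def rank_by_delta_vth(devices):
    """Return the devices sorted by decreasing |delta-Vth|."""
    return sorted(devices, key=lambda d: d.delta_vth, reverse=True)


def format_table(devices):
    """Render the ranking as plain-text rows, most degraded device first."""
    rows = [f"{'device':<8}{'dVth (mV)':>12}{'dIdsat (%)':>12}"]
    for d in rank_by_delta_vth(devices):
        rows.append(
            f"{d.name:<8}{1e3 * d.delta_vth:>12.1f}{d.idsat_change_pct:>12.1f}"
        )
    return "\n".join(rows)


# Hypothetical fresh/aged values for three devices of a small circuit.
EXAMPLE_DEVICES = [
    DeviceAging("M1", 0.40, 0.43, 1.00e-4, 0.95e-4),
    DeviceAging("M2", -0.42, -0.47, 1.00e-4, 0.92e-4),
    DeviceAging("M3", 0.40, 0.41, 1.00e-4, 0.99e-4),
]

if __name__ == "__main__":
    print(format_table(EXAMPLE_DEVICES))
```

Sorting on the magnitude of the shift, rather than on its signed value, is a design choice of this sketch: it lets devices whose Vth moves in opposite directions be ranked on the same scale. Another sorting criterion, such as the relative drive-current change, could be substituted by changing the `key` function.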
Figure 5: The operating frequency of a CMOS oscillator degrades over time.

Figure 6: The results are sorted by decreasing “delta-Vth.” The devices that have their threshold voltage degraded the most severely are presented first.

V. REAL DESIGN AGING USING ELDO UDRM

A complete use model of the flow is described in [7]. It starts from stress and update model development, continues with the implementation in the Eldo UDRM interface, and ends with an aging analysis of a design that shows how crucial the stress effects were for that design. The authors worked on a zero crossing detector comparator (ZCDC), which is an essential part of smart power ICs driving inductive loads. Its function is to avoid the inversion of the output current. The design included two MOS devices, M1 and M2, that operated under different stress conditions. The switching of both transistors is monitored for the operation of the ZCDC; hence, accurately modeling the Vth of both transistors, and the shift that stress causes in the behavior of both M1 and M2, is mandatory for accurate simulation results.

The M1 drain voltage switches from VDD to 0 following a 1 MHz square waveform, while M2 is stressed in DC conditions. M1 is stressed for half a period, while in the other half it is off and can partially recover from the degradation. Therefore, after reliability simulation, the Vth of M2 becomes more negative than that of M1, and the comparator switches at a positive drain voltage given by the difference between the Vth shifts in the input-stage MOS devices. This effect, which cannot be predicted without modeling NBTI recovery, can generate unacceptable reverse currents that cause the circuit to fail, as observed in a real case. This has been illustrated in [7] using aging simulations, with the results given in Figure 7, which the designer should use to mitigate the aging risk to the design’s behavior.

Figure 7: Simulated current waveforms before (black) and after (blue and red) reliability trial (1000 h, 175°C). Due to the mismatch in Vth degradation of M1 and M2, an offset in the ZCDC is generated, resulting in a too-large reverse current (red curve). The failure condition cannot be predicted without recovery modeling (blue curve). [7]

VI. CONCLUSIONS

Reliability effects are real threats with advanced process nodes. Joint design–reliability flows, however, can mitigate their effects and allow designers to take those effects into account as early as possible.

Eldo provides a fully customizable and robust aging simulation interface that allows IP protection: the details of the equations and models used to predict degradation can be encrypted to secure the IP.

This solution can be used with any type of analysis: AC, DC, transient, RF, statistical, sensitivity, and mixed-signal simulations.

REFERENCES

[1] Xiaojun Li, Jin Qin, and Joseph B. Bernstein, "Compact Modeling of MOSFET Wearout Mechanisms for Circuit-Reliability Simulation," IEEE Transactions on Device and Materials Reliability, Vol. 8, No. 1, March 2008.
[2] Alain Braviax, IIRW tutorial 2010.
[3] Wenping Wang, Vijay Reddy, Anand T. Krishnan, Rakesh Vattikonda, Srikanth Krishnan, and Yu Cao, "An Integrated Modeling Paradigm of Circuit Reliability for 65nm CMOS Technology," IEEE 2007 Custom Integrated Circuits Conference (CICC).
[4] S. Bhardwaj, W. Wang, R. Vattikonda, Y. Cao, and S. Vrudhula, "Predictive Modeling of the NBTI Effect for Reliable Design," Proceedings of the IEEE Custom Integrated Circuits Conference, pp. 189–192, September 2006.
[5] Mridul Agarwal, Varsha Balakrishnan, Anshuman Bhuyan, Kyunglok Kim, Bipul C. Paul, Wenping Wang, Bo Yang, Yu Cao, and Subhasish Mitra, "Optimized Circuit Failure Prediction for Aging: Practicality and Promise," IEEE 2008 International Test Conference.
[6] Mikido Sode, et al., "Reliability Simulation Environment Tackles LSI Design," Chip Design, June 2007.
[7] Alagi, Filippo, et al., "Compact model for parametric instability under arbitrary stress waveform," 2014 44th European Solid State Device Research Conference (ESSDERC), 2014.
[8] Huard, Vincent, et al., "CMOS device design-in reliability approach in advanced nodes," IEEE International Reliability Physics Symposium, 2009.
[9] Alagi, Filippo, Roberto Stella, and Emanuele Viganó, "Aging model for a 40 V Nch MOS, based on an innovative approach," IETE Journal of Research 58.3 (2012): 191-196.
[10] Eldo UDRM manual.