=Paper= {{Paper |id=Vol-1566/paper12 |storemode=property |title=Workload Impact on BTI HCI Induced Aging of Digital Circuits: A System level Analysis |pdfUrl=https://ceur-ws.org/Vol-1566/Paper10.pdf |volume=Vol-1566 |authors=Ajith Sivadasan,Florian Cacho,Sidi Ahmed Benhassain,Vincent Huard,Lorena Anghel |dblpUrl=https://dblp.org/rec/conf/date/SivadasanCBHA16a }} ==Workload Impact on BTI HCI Induced Aging of Digital Circuits: A System level Analysis== https://ceur-ws.org/Vol-1566/Paper10.pdf

Workload Impact on BTI HCI Induced Aging of Digital Circuits: A System level Analysis

Ajith Sivadasan (1,2), Florian Cacho (1), Sidi Ahmed Benhassain (1,2), Vincent Huard (1), Lorena Anghel (2)
(1) STMicroelectronics, 850 rue Jean Monnet, 38926 Crolles, France
(2) TIMA, 46, avenue Félix Viallet, 38031 Grenoble, France
Contact: ajith.sivadasan@st.com

Abstract: Workload characterization of digital circuits using industry standard benchmarks gives insight into the performance and energy characteristics of processor designs. Aging studies of digital circuits under BTI and HCI are gaining importance, since the impact on circuit performance grows as gate dimensions are scaled down. For embedded system applications, the workload may very well dictate the lifetime of a system. This article studies the influence of different workloads on the degradation of the critical path, which determines the reliability of a system. A top-down circuit activity and probability analysis is carried out, leading to an accurate estimation of the aging due to HCI and BTI of critical path elements at the design stage. A dedicated simulation flow has been set up, from RTL simulation down to gate level cell timing analysis, mapped onto 28nm FDSOI technology from STMicroelectronics. The objective is to correlate path delay timing with the aging of critical path cells. Simulation results indicate that a higher complexity of the executed program does not necessarily lead to a higher rate of degradation of the critical path, since aging is primarily driven by the workload dependent activity and probability of the critical path combinational logic elements.

Keywords: Workload, Aging, Critical Path, Reliability

I. INTRODUCTION

Computers were designed to perform tasks faster. Until now, the major concerns of microprocessor and microcontroller designers have been speed of operation, power consumption, processor micro-architecture, memory hierarchy, system architecture [6] and the task scheduling of the application specific software that drives the microprocessor, all without comprehensive knowledge of the end user's specific requirements [8]. Benchmarks and their associated performance metrics have been developed to simulate this wide range of possible end user applications [10]. Benchmarks are designed to represent emerging workloads. EEMBC (Embedded Microprocessor Benchmark Consortium) deals with benchmarks for embedded systems [11]. Benchmark scores give a relative measure of the performance of different processors. Using these benchmarks for system or hardware performance analysis tells designers whether design changes are needed for the workload that a given application will run [9]. Although workload characterization has so far been used only for micro-architecture analysis, in the present aging study the same workloads are considered to influence the aging of the microcontroller circuit under study, each in its own way. As observed in our simulations and in references [11], [8], the simulation times of different applications vary considerably. The number of cycles of the application run by the end user will therefore influence the lifetime of the circuit. The automotive sector is one of the markets of interest for STMicroelectronics. Matrix multiplication and FIR filter benchmarks [11] are used to characterize embedded system applications for the automotive sector.

Silicon on Insulator (SOI) technology involves the fabrication of a sandwich structure in which a 25nm buried oxide layer lies between a thin undoped silicon layer and the substrate. This undoped layer is important for device matching [13]. The final SOI thickness of 7nm provides excellent Short Channel Effect (SCE) control without any change in the leakage current for gate sizes up to 24 nm. FDSOI technology is aimed at high speed, low voltage circuit applications, providing a 32% and 84% speed increase at 1V and 0.6V respectively with very little modification to the prevalent fabrication flow at STMicroelectronics [12]. Incidentally, the manufacturing process is simplified because the well and field implantation steps are eliminated. Memory access times can be significantly reduced thanks to high Iread values at manageable leakage. The HCI and BTI effects observed are comparable to those of libraries based on bulk technology.

This paper makes a direct link between the workload at system level and the device level degradation due to HCI and BTI. A considerable amount of research has been done on how HCI and BTI affect the aging of transistors [14] [15]. Both HCI and BTI increase the threshold voltage [4], resulting in slower transistors. HCI degradation is observed predominantly when transistors switch at high frequency, while a transistor held at a constant potential degrades gradually due to BTI. Efforts have already been made to consider HCI and BTI effects in the design process [16]. How workload can accelerate aging has been explored in [2], [4]. This paper investigates an industrial design flow to identify the critical paths and critical path elements

Workshop on Early Reliability Modeling for Aging and Variability in Silicon Systems – March 18th 2016 – Dresden, Germany

Copyright © 2016 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This
volume is published and copyrighted by its editors.
of a design that are most susceptible to HCI and BTI under a given workload.

II. DESIGN AND FLOW METHODOLOGY

A. openMSP430 Architecture

The design under use is an open-source, synthesizable 16 bit microcontroller core from the TI MSP430 family, based on a Von Neumann architecture and written in Verilog HDL. The Frontend Unit, Execution Unit and Memory Backbone modules in Fig. 1 showed the highest activity in RTL simulations, so they are the modules of interest in this paper. The openMSP430 was then synthesized onto 28nm FDSOI technology.

[Figure: block diagram of the openMSP430. Blocks shown: Frontend Unit, Execution Unit (Register File, ALU), Memory Backbone, Program Memory Interface, Data Memory Interface, RAM, ROM, Serial Debug Interface, HW Break Unit, UART or I2C, Watchdog, SFR, Basic Clock Module, 16x16 Multiplier, Peripheral bus, Peripherals, DMA Controller, Bootloader, Memory-BIST.]
Fig. 1. openMSP430 Design Structure

B. Workload dependent Aging analysis Flow

Activity is defined as the average number of transitions of a net per clock period over the entire simulation. Activity in this paper refers to transition density as mentioned in [20]. Probability in this paper refers to the signal probability of a net, that is, the probability of observing a logic 1 at that net in a clock cycle. A signal probability of 1 means the signal is always at 1, and a signal probability of 0 means it is always at 0.

Activity and probability information is obtained for all nets in the design, for a full simulation run of each of 10 different benchmark programs.

[Figure: flow diagram. Blocks shown: C kernel programs (FIR filter, matrix multiplication, floating point math, …), compiler, Hex files, Simulation, VCD files, Activity & power analysis, Aging analysis; RTL openMSP430, synthesis, Gate netlist, Timing analysis, Aged Timing analysis; Hardware code hardening, Timing constraint change, Timing monitors insertions, ….]
Fig. 2. Workload dependent Aging analysis Flow

III. HIERARCHICAL APPROACH

A. Path Level Analysis

The activity value varies from element to element along a given critical path. The activity of a net depends on the cells connected to it and on the workload. The activity of the endpoint nets of all chosen critical paths is obtained, and these activity values are plotted against the respective path delay values. Critical paths whose delay lies within 10% of the maximum delay have also been referred to as PCP (Potential Critical Paths) [1]. The activity of the endpoint is taken here as the activity of the critical path, which provides a means to compare path aging due to HCI and BTI.

The critical paths with the highest activity are highlighted in Fig. 3 (bottom). The cell from which a net originates is expected to see the maximum HCI related degradation. Nets without activity stay at a logic level of 1 or 0, which indicates a possible worst case degradation due to BTI, as in Fig. 3 (top).

[Figure: two panels plotted against normalized path delay for the FIR, multiplication, floating math, 32bit math and 8bit math workloads. Top: probability, with annotations "Worst CP with '1' probability", "Worst CP with '0' probability" and "Sub-CP with potential hazard". Bottom: activity, with annotations "Worst CP with high activity", "Sub-CP with potential hazard" and "Sub-CP without hazard", plus an inset of the number of paths versus path slack.]
Fig. 3. Path level activity and probability plots

[Figure: two panels plotted against path depth (1 to 34) for the FIR, Matrix_Multiplication, Floating point Math, 32 bit Math and 8 bit Math workloads. Top: probability. Bottom: activity.]
Fig. 4. Critical Path Cell level activity and probability plots
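The activity and probability definitions of Section II.B can be illustrated with a minimal sketch. It assumes each net has already been reduced to one 0/1 sample per clock cycle (for instance, extracted from the VCD files); the net names and sample values below are hypothetical and are not taken from the paper's simulations. Sampling once per cycle ignores glitches inside a cycle, so this only approximates a true transition density.

```python
from typing import Dict, List, Tuple


def activity_and_probability(samples: List[int]) -> Tuple[float, float]:
    """Return (activity, probability) for one net.

    samples holds one logic value (0 or 1) per clock cycle.
    activity    = transitions between consecutive cycles / number of cycles
    probability = fraction of cycles in which the net is at logic 1
    """
    if not samples:
        raise ValueError("need at least one sample")
    cycles = len(samples)
    transitions = sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)
    return transitions / cycles, sum(samples) / cycles


# Hypothetical per-cycle traces for three nets (illustrative values only).
traces: Dict[str, List[int]] = {
    "net_a": [0, 1, 0, 1, 0, 1, 0, 1],  # toggles every cycle: HCI-prone
    "net_b": [1, 1, 1, 1, 1, 1, 1, 1],  # stuck at 1: BTI-prone
    "net_c": [0, 0, 0, 0, 1, 1, 1, 1],  # a single transition
}

stats = {net: activity_and_probability(values) for net, values in traces.items()}
for net, (activity, probability) in stats.items():
    print(f"{net}: activity={activity:.3f} probability={probability:.3f}")
```

In this toy data, net_a is the kind of net that would stand out in an activity plot, while net_b has no activity and a probability of 1, the situation the text associates with worst case BTI stress.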


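The path level selection of Section III.A can be sketched in the same spirit. The sketch assumes that "potential critical paths" means paths whose delay is within 10% of the maximum delay; that reading, the `Path` record, and all delay, activity and probability values are illustrative assumptions, not data from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Path:
    name: str
    delay: float        # normalized path delay
    activity: float     # activity of the endpoint net
    probability: float  # probability of logic 1 at the endpoint net


def potential_critical_paths(paths: List[Path], margin: float = 0.10) -> List[Path]:
    """Keep paths whose delay is within `margin` of the maximum delay (assumed PCP reading)."""
    max_delay = max(p.delay for p in paths)
    return [p for p in paths if p.delay >= (1.0 - margin) * max_delay]


def hci_ranking(paths: List[Path]) -> List[Path]:
    """Order paths by endpoint activity, highest first (most HCI-exposed first)."""
    return sorted(paths, key=lambda p: p.activity, reverse=True)


def bti_candidates(paths: List[Path], eps: float = 1e-9) -> List[Path]:
    """Paths whose endpoint never toggles and sits at a constant 0 or 1."""
    return [
        p for p in paths
        if p.activity <= eps and (p.probability <= eps or p.probability >= 1.0 - eps)
    ]


# Hypothetical paths (illustrative values only).
paths = [
    Path("p0", 1.00, 0.05, 0.5),
    Path("p1", 0.95, 0.60, 0.4),
    Path("p2", 0.92, 0.00, 1.0),
    Path("p3", 0.70, 0.90, 0.5),
]

pcp = potential_critical_paths(paths)
print([p.name for p in hci_ranking(pcp)])
print([p.name for p in bti_candidates(pcp)])
```

Path p3 has the highest activity but is excluded because its delay is far from the maximum, which mirrors the point that activity matters for aging only on paths that can become timing critical.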
B. Cell Level Analysis

The cell level analysis of the critical path provides complete data on how the individual elements in the critical path age. In Fig. 4 (bottom), the cells most affected by HCI are identified. In Fig. 4 (top), for the different workloads, the probability of the nets along the critical path being either 1 or 0 provides information on BTI degradation.

A complex workload can be described as a workload that leads to high activity nodes. The greater the number of such high activity nodes, the higher the complexity of the workload. A higher workload complexity does not necessarily lead to a higher activity on all the critical paths, but it does have an impact on certain potential critical paths [2].

C. Timing Arc degradation

The impact of activity and probability is now reviewed at cell level. Design in Reliability models [17] [18] are models of the HCI and NBTI degradation of cells. Using Design in Reliability models, it is possible to evaluate the degradation of a cell for a given stimulus and mission profile. Assuming 2 years of operation at Vmax, the degradation of a NAND3A cell is depicted in Fig. 5. The delays of the rising and falling arcs are simulated at a given input slope and load. Degradation is clearly a function of both activity and probability, driven respectively by the HCI and BTI mechanisms. However, the two mechanisms are coupled, as explained in [3]: their effects are not additive but interact with each other. For this particular arc, an input that is always '0' leads to drastic cell degradation because of PMOS degradation. This is exacerbated at high activity.

[Figure: delay cell degradation (%), from 0 to 2, plotted against probability of '1' (%), from 0 to 100, with four curves: falling, high activity; falling, low activity; rising, low activity; rising, high activity.]
Fig. 5. Timing arcs degradation for NAND3A cell for different probability and activity

CONCLUSION

This paper discusses a flow that provides workload dependent HCI and BTI related degradation information to the designer at the very beginning of the design stage. Workload dependent activity and probability information is gathered for the critical path nets of a digital circuit. The higher the activity and probability on a net, the higher the degradation due to HCI and BTI respectively of the cell from which the net originates. A designer can thus improve the reliability of the hardware by taking into account the HCI and BTI aging for a given application during the design stage.

REFERENCES

[1] Wang et al. (2007). An efficient method to identify critical gates under circuit aging. IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD, 735-740.
[2] Mintarno et al. (2013). Workload dependent NBTI and PBTI analysis for a sub-45nm commercial microprocessor. IEEE International Reliability Physics Symposium Proceedings.
[3] Cacho, F. et al. (2009). HCI/BTI Coupled Model: the Path for Accurate and Predictive Reliability Simulations.
[4] Ebrahimi et al. (2013). Aging-aware logic synthesis. IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD, 61-68.
[5] O. Girard. (2010). [Online]
[6] John, L. K. et al. (1998). Workload Characterization: Methodology and Case Studies, 1999, 3-14, 10.1109/WWC.1998.809354.
[7] Maxiaguine, A. et al. (2004). Design, Automation and Test in Europe Conference and Exhibition (DATE).
[8] Li, Y. et al. (1997). IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 16, Issue 12.
[9] Hoste et al. (2007). Microarchitecture-Independent Workload Characterization. IEEE Micro.
[10] Weicker, P. et al. (1990). An Overview of Common Benchmarks. Computer.
[11] Poovey, J. A. et al. (2009). A Benchmark Characterization of the EEMBC Benchmark Suite. IEEE Micro.
[12] Planes, N. et al. (2012). 28nm FDSOI technology platform for high-speed low-voltage digital applications. Symposium on VLSI Technology (VLSIT).
[13] Federspiel, X. et al. (2011). Experimental characterization of the interactions between HCI, off-state and BTI degradation modes. IEEE International Integrated Reliability Workshop Final Report (IRW).
[14] Huard, V. et al. (2008). NBTI degradation: From transistor to SRAM arrays. IEEE International Reliability Physics Symposium (IRPS).
[15] Huard, V. et al. (2013). Advances in industrial practices for optimal performance/reliability/power trade-off in commercial high-performance microprocessors for wireless applications. IEEE International Reliability Physics Symposium (IRPS).
[16] Cacho, F. et al. (2011). Hot Carrier Injection degradation induced dispersion: Model and circuit-level measurement. IEEE International Integrated Reliability Workshop Final Report (IRW).
[17] Huard, V. et al. (2007). Design-in-Reliability Approach for NBTI and Hot-Carrier degradations in Advanced Nodes. IEEE Transactions on Device and Materials Reliability.
[18] Huard, V. et al. (2009). CMOS device design-in reliability approach in advanced nodes. IEEE International Reliability Physics Symposium, 2009.
[19] Najm, F. N. (1993). Transition density: a new measure of activity in digital circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[20] Kleeberger, V. B. (2014). Workload and instruction-aware timing analysis - The missing link between technology and system-level resilience. 51st ACM/EDAC/IEEE Design Automation Conference (DAC).
[21] Lorenz, D. et al. (2014). Monitoring of aging in integrated circuits by identifying possible critical paths. Microelectronics Reliability, vol. 54, p. 1075, June-July 2014.
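As a companion to the cell level analysis (Section III.B), the notion of workload complexity as the number of high activity nodes can be sketched as follows. The activity threshold, the workload names and the per-net activity values are all hypothetical; the paper does not state what counts as "high" activity.

```python
from typing import Dict


def workload_complexity(net_activity: Dict[str, float], threshold: float = 0.5) -> int:
    """Count the nets whose activity reaches the (assumed) high-activity threshold."""
    return sum(1 for activity in net_activity.values() if activity >= threshold)


# Hypothetical per-net activities for two workloads (illustrative values only).
workloads: Dict[str, Dict[str, float]] = {
    "fir":   {"n1": 0.8, "n2": 0.1, "n3": 0.6},
    "math8": {"n1": 0.2, "n2": 0.7, "n3": 0.1},
}

complexity = {name: workload_complexity(acts) for name, acts in workloads.items()}
print(complexity)

# A more complex workload need not stress every net harder: here net n2
# is more active under the less complex workload.
n2_more_active_under_math8 = workloads["math8"]["n2"] > workloads["fir"]["n2"]
print(n2_more_active_under_math8)
```

If n2 happened to be a critical path net, the less complex workload would be the more damaging one for it, which is the situation the analysis warns about.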


