<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Early failure prediction by using in-situ monitors: Implementation and application results</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">A</forename><surname>Benhassain</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">STMicroelectronics</orgName>
								<orgName type="institution" key="instit2">Technology R&amp;D</orgName>
								<address>
									<settlement>Crolles</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">TIMA</orgName>
								<address>
									<addrLine>46, avenue Félix Viallet</addrLine>
									<postCode>38031</postCode>
									<settlement>Grenoble</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">F</forename><surname>Cacho</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">STMicroelectronics</orgName>
								<orgName type="institution" key="instit2">Technology R&amp;D</orgName>
								<address>
									<settlement>Crolles</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">V</forename><surname>Huard</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">STMicroelectronics</orgName>
								<orgName type="institution" key="instit2">Technology R&amp;D</orgName>
								<address>
									<settlement>Crolles</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">L</forename><surname>Anghel</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">TIMA</orgName>
								<address>
									<addrLine>46, avenue Félix Viallet</addrLine>
									<postCode>38031</postCode>
									<settlement>Grenoble</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Early failure prediction by using in-situ monitors: Implementation and application results</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">4329C820B552F6A0A31A68080AA20705</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T00:53+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In-situ monitoring is a promising strategy to measure timing slack and to provide a pre-error warning before any timing violation occurs. In this work, we demonstrate that in-situ monitors combined with a voltage-regulation feedback loop are suitable for process and temperature compensation. Index Terms - in-situ timing monitors, CMOS reliability, timing margin.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>With CMOS technology scaling, it becomes increasingly difficult to guarantee circuit functionality across all process, voltage, and temperature (PVT) corners. Moreover, circuit wearout degradation leads to additional temporal variation. This results in an increase of the design margin required for reliable systems <ref type="bibr" target="#b0">[1]</ref>. Adding a pessimistic timing margin to guarantee all operating conditions under worst-case conditions is no longer acceptable because of its large impact on design costs.</p><p>Ageing monitoring techniques fall into two categories. First, standalone sensors use various configurations of ring oscillators <ref type="bibr" target="#b1">[2]</ref> and delay chains. Replica paths <ref type="bibr" target="#b2">[3]</ref> are a solution to mimic the timing behavior of the original path in combinational logic. Second, in-situ delay monitors directly measure the delay degradation of a specific path within the target circuit; this approach is very promising for providing reliable timing information <ref type="bibr" target="#b3">[4]</ref>. Delay monitors such as "Razor I" <ref type="bibr" target="#b4">[5]</ref> and "Razor II" <ref type="bibr" target="#b5">[6]</ref> detect timing errors in actual paths. A local micro-rollback execution procedure ensures error correction. However, these methods require a large hardware architecture for error recovery. The Adaptive Voltage Scaling (AVS) approach in <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8]</ref> proposes error correction using in-situ monitors that detect timing errors, followed by a global system action once an error is detected.</p><p>Another approach consists of detecting timing pre-errors instead of timing errors by detecting critical transitions <ref type="bibr" target="#b7">[8]</ref>. 
In this case, in-situ delay monitors can be used as a reliability technique that raises an alert before a setup violation occurs. This technique can be further combined with global system actions such as AVS or dynamic voltage and frequency scaling (DVFS). In this paper, an innovative monitor insertion flow is presented. Two in-situ monitor (ISM) solutions are discussed and compared. The first one is built with standard cells available in the technology design platform library and is named here flow-based (built-in) ISM. The second one uses a dedicated custom design and is named cell-based ISM. In Section III, benchmarks of the insertion strategy are reported and discussed. Finally, several applications of ISM usage for compensation are presented.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. ISM INSERTION FLOW</head><p>The advantage of an ISM located inside a digital block is its ability to accurately capture all sources of local physical, environmental and temporal variation. The ISMs under investigation are presented in Fig. <ref type="figure" target="#fig_0">1</ref>. The basic idea is to delay the data of a critical path arriving at D in the shadow FF, and to compare it with the regular FF. When the Flag signal rises, a violation of the setup time has occurred in the shadow FF, which means that the remaining slack of the data path is close to the delay of the delay element, as defined in the schematic. In this work, three time windows (TW1=60ps, TW2=100ps, TW3=130ps) have been evaluated. The schematic can be implemented in two different ways: semi-custom (flow-based ISM) or full-custom (cell-based ISM) design. In the first one, all schematic elements come from the standard cell design platform. Placement and connectivity are handled by scripts during the flow execution. The second one is a new cell dedicated to this usage. For that approach, all CAD views of the new cell (functional, physical, timing, etc.) need to be developed to be compliant with the standard digital flow. In addition to the choice of the monitor, the insertion flow is of crucial importance. To obtain quantitative results, an industrial framework compliant with the STMicroelectronics digital design flow is used. The methodology is portable to any other standard or in-house digital flow. The generic approach is illustrated in Fig. <ref type="figure" target="#fig_1">2</ref>. The classical Front-end steps, synthesis and floorplanning, are executed first. At the end, a gate netlist is provided as input to the place-and-route tool. After placement, at the pre-clock-tree-synthesis (pre-CTS) step, a timing analysis (TA) is performed. 
For the setup functional corner, a decision is made to insert monitors (FF cell sweep for cell-based ISM) and to regenerate connectivity on a subset of critical paths. This results in a new gate netlist and new timing and power figures, and the flow then proceeds as usual: post-CTS (hold and setup optimization), route and optimization until the design reaches timing, power and reliability closure. Several iterations are required to fully satisfy the initial design specification, as shown in Fig. <ref type="figure" target="#fig_1">2</ref>.</p><p>For illustration, some timing analyses are presented in Fig. <ref type="figure" target="#fig_2">3</ref>. Based on an initial selection of the 5% worst slacks, ISMs are inserted in a subset of paths. At step 3 (Fig. <ref type="figure" target="#fig_1">2</ref>), path histograms are reported for implementations with and without ISM. In the following analysis, delayed paths are not reported. Particular attention is paid to ensuring that, for flow-based ISM, the inserted cells are physically as close as possible to the monitored FF. To achieve this objective, timing constraints are adapted to minimize the skew between the shadow and regular FFs. Moreover, the delayed data arriving at the shadow FF is not considered a real path when the place-and-route tool optimizes the design to fulfill the timing constraints. This means that there is theoretically no timing penalty after ISM insertion except the one induced by the slight additional routing resources.</p><p>It is important to note that the monitored delay is of the same order of magnitude as the degradation measured on test chips during ageing experiments. It is therefore mandatory to have the highest level of timing accuracy during monitor insertion. Although insertion would be possible at synthesis during Front-end, the physical synthesis (Back-end) is able to account for parasitic effects. 
Thus, timing analysis at this level (post-route) is suitable and relevant for discussing the efficiency of the insertion flow. This methodology is now benchmarked on different digital blocks to determine how well the insertion flow covers digital path ageing and to establish the performance penalty.</p></div>
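<div xmlns="http://www.tei-c.org/ns/1.0"><p>The pre-error principle and the worst-slack selection described in this section can be sketched as follows. This is a minimal illustrative model and not the authors' tooling: the function names `flag_raised` and `select_worst_slack`, the path names and the slack values are assumptions introduced here; only the three time windows (60, 100 and 130 ps) come from the text. The Flag is idealized as rising whenever the remaining slack of a path is smaller than the delay-element window.</p>

```python
import math

# Time windows evaluated in the text, in picoseconds.
TIME_WINDOWS_PS = {"TW1": 60, "TW2": 100, "TW3": 130}


def flag_raised(slack_ps, window_ps):
    """Idealized ISM: the shadow FF sees the data delayed by window_ps,
    so its setup time is violated (Flag rises) once slack drops below it."""
    return window_ps > slack_ps


def select_worst_slack(slacks_ps, fraction):
    """Return the names of the `fraction` of paths with the smallest slack."""
    count = math.ceil(len(slacks_ps) * fraction)
    ranked = sorted(slacks_ps, key=slacks_ps.get)
    return ranked[:count]


# Hypothetical pre-CTS slacks in picoseconds (illustrative, not measured data).
slacks = {"p0": 40, "p1": 75, "p2": 110, "p3": 150, "p4": 300,
          "p5": 420, "p6": 510, "p7": 640, "p8": 700, "p9": 900}

# Paths that would be equipped with a monitor for a 10% selection.
monitored = select_worst_slack(slacks, 0.10)

# Paths whose Flag would rise for each time window.
flags = {name: [p for p in slacks if flag_raised(slacks[p], window)]
         for name, window in TIME_WINDOWS_PS.items()}

print(monitored)
print(flags)
```

<p>In this sketch a wider time window raises the Flag on more paths, which reflects the trade-off between an earlier warning and the number of paths reported as critical.</p></div>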
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. BENCHMARK RESULTS</head><p>The ISM insertion flow is now applied to different circuits, and performance versus ISM coverage efficiency is reviewed. The designs are synthesized, placed and routed in 28nm FDSOI technology with Low-Vt devices. Several circuits are taken from the ITC99 benchmark, whose characteristics are typical of synthesized circuits. The b19, b15 and b14 circuits have respectively 17, 6 and 10 Kgate after physical implementation. They are respectively composed of 2, 0.8 and 0.4 kFF. In addition, an industrial customer-related digital block is investigated. This block is a Bose, Ray-Chaudhuri and Hocquenghem (BCH) error correcting code IP consisting of encoder and decoder modules. The IP contains an output signal, Autotest, indicating whether error correction is performed correctly. More details about this circuit can be found in <ref type="bibr" target="#b0">[1]</ref>.</p><p>Several implementation trials are performed for the same target performance with ample available area. The place-and-route tool optimizes performance first, then area and power. The worst negative slack at the post-route step is discussed for all circuits. As depicted in Fig. <ref type="figure" target="#fig_3">4</ref>, the performance impact of ISM insertion may depend on the circuit. Compared to the reference circuit (fresh library without monitors), the implementation with an aged library (which depends on the mission profile: consumer, networking or automotive) leads to a minor penalty. However, this guard band enables the circuit to fulfill the timing requirement at the end of life. Concerning ISM insertion, we chose to equip the subset of 10% worst slacks (at step 1 of Fig. <ref type="figure" target="#fig_1">2</ref>) with monitors. The BCH result shows a 90ps slack degradation for cell-based ISM and less than 20ps for flow-based ISM. 
The penalty of the cell-based ISM is explained by the area constraint of the large custom cell. For the sake of clarity, the delayed data arriving at the shadow FF for cell-based ISM is not reported in the TA of Fig. <ref type="figure" target="#fig_3">4</ref> for the ITC99 benchmark. The number of inserted ISMs and its performance impact are shown in Fig. <ref type="figure" target="#fig_4">5</ref>. For a selection of the 5% most critical data paths, the performance penalty is only 15ps, and 40% of the initial selection is covered by monitors. A classic approach would select the critical paths (CP) according to an absolute delay window rather than a number of CPs. The distribution of the CP and sub-CP histogram is therefore important to analyze when using that approach. The risk that ageing-induced failures cause a violation on a path is a function of its remaining slack. However, for the ISM flow, we use the number of inserted ISMs as the metric under investigation.</p></div>
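<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two selection criteria and the coverage metric discussed above can be contrasted with a short sketch. All names (`select_by_count`, `select_by_window`, `coverage`) and all slack values are hypothetical, and the delay window is assumed here to be measured from the worst slack; the paper does not specify its implementation.</p>

```python
def select_by_count(slacks_ps, n_paths):
    """Pick the n_paths most critical paths (smallest slack first)."""
    return sorted(slacks_ps, key=slacks_ps.get)[:n_paths]


def select_by_window(slacks_ps, window_ps):
    """Pick every path whose slack lies within window_ps of the worst slack
    (assumed interpretation of an absolute delay window)."""
    worst = min(slacks_ps.values())
    return [p for p in sorted(slacks_ps, key=slacks_ps.get)
            if worst + window_ps >= slacks_ps[p]]


def coverage(targeted, monitored):
    """Fraction of the targeted critical paths that received a monitor."""
    if not targeted:
        return 0.0
    return len(set(targeted).intersection(monitored)) / len(targeted)


# Hypothetical slacks in picoseconds (illustrative, not measured data).
slacks = {"a": 10, "b": 12, "c": 15, "d": 60, "e": 200}

by_count = select_by_count(slacks, 3)
by_window = select_by_window(slacks, 5)

# Suppose only two of five targeted paths could actually be equipped.
ratio = coverage(["a", "b", "c", "d", "e"], ["a", "c"])

print(by_count, by_window, ratio)
```

<p>With a window criterion the number of selected paths follows the shape of the slack histogram, whereas a count criterion fixes the number of monitors in advance, which is the metric used in this section.</p></div>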
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. EXAMPLE OF APPLICATIONS</head><p>After discussing the strategy and the benchmark of the ISM insertion flow, some experimental results are now reviewed. Dedicated digital blocks are developed in which custom cell-based ISMs have been inserted on 10% of the critical paths. We have investigated designs in 28nm Low Power (LP) and Fully Depleted SOI (FDSOI) technologies developed at STMicroelectronics. The digital block studied is the BCH block described in the previous section. In the first application, ISMs are inserted in the architecture to manage variability under an optimum power budget. A major challenge in multicore architectures is to cope with inter-core dispersion. Indeed, local process dispersion leads to variation of speed, and thus of power consumption, across cores. To tackle this dispersion, an additional margin in the voltage stack needs to be used. Establishing this margin is not a trivial task because it is deeply influenced by the process centering and the manufacturing dispersion. An alternative approach is to insert ISMs and to use their Flags as warnings that serve as inputs for margin management. As depicted in Fig. <ref type="figure" target="#fig_5">6</ref>, 18 BCH cores are implemented in LP technology with ISM without any feedback loop. Under a constant 1GHz clock frequency, when the supply voltage is decreased, a first monitor Flag occurs at 0.99V, corresponding to a 1% voltage decrease. At that point, the circuit still operates correctly. As the supply voltage continues to decrease, more and more Flags occur on different cores, and a first failure (setup violation) is reported at 0.85V. Interestingly, the V_MIN (minimal voltage required to maintain functionality at a given PLL clock) distribution for all 18 cores depends on the application executed on each core and on their ageing history. 
To optimize the choice of voltage stack in a multicore architecture, the strategy would be to monitor the first Flag of each core instead of using a conservative extra margin covering intra-core dispersion.</p><p>The second application focuses on monitoring the Flag number. The Flag number is an indicator of circuit speed and is used for voltage adaptation that is aware of both local variations and ageing. A large measurement campaign is performed on 300 dies using 3 time windows (TW1, TW2, TW3) and 5 workloads (Workload A, B, C, D, E). These workloads are patterns containing 0, 1, 2, 3, 4 errors respectively. Fig. <ref type="figure" target="#fig_7">7</ref> shows the Flag count when decreasing the voltage until the Autotest signal fails, using TW1 for different workloads. Modifying the pattern modifies the activity, which strongly changes the CP ranking. A direct consequence is that V_MIN_AUTOTEST (the supply voltage before the Autotest signal fails) and V_MIN_Flag (the supply voltage at which the Flag count starts) vary strongly with the workload (V_MIN_AUTOTEST ~ 0.8V and V_MIN_Flag ~ 0.84V for Workload A, V_MIN_AUTOTEST ~ 0.825V and V_MIN_Flag ~ 0.825V for Workload D). In order to demonstrate the robustness of ISM, it is important to test them under various conditions. For that purpose, various temperatures have been applied to the BCH IP. Figure <ref type="figure" target="#fig_8">8</ref> shows the V_MIN variation between 30°C and 125°C for both Autotest and Flag signals. As depicted, a degradation of V_MIN_AUTOTEST by 250 mV is observed when 125°C is applied, confirming the ability of ISM to capture local variations induced by temperature change.</p><p>A flow for in-situ monitor insertion is developed and applied to different circuits. Two types of monitors are compared and discussed: cell-based and flow-based. The performance penalty and area overhead of ISM are small. The margin provided by the Flag signal is more accurate than an additional voltage stack margin in accounting for ageing degradation. 
The path coverage, i.e. the number of critical paths monitored relative to the number of critical paths targeted for monitoring, is around 40%. This approach is suitable for dynamic management of ageing because, in the long term, the probability of activating one path from the critical path selection is high. Some applications of adaptive regulation are illustrated; this scheme is promising for process and temperature compensation.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Schematic and layout of the in-situ monitors under investigation. Data arriving at Q is delayed in the shadow FF and compared to the regular one. Flow-based ISM is composed of standard cells available in the design platform. Cell-based ISM is a fully customized design.</figDesc><graphic coords="1,306.60,522.60,246.48,170.16" type="vector_box" /></figure>
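<div xmlns="http://www.tei-c.org/ns/1.0"><p>The voltage-sweep experiments of Section IV, in which the supply is lowered until Flags appear and then until Autotest fails, can be post-processed along the following lines. This is a hedged sketch under stated assumptions: the sweep values are invented for illustration, the helper names `v_min_flag` and `v_min_autotest` are hypothetical, and monotonic behaviour with decreasing voltage is assumed.</p>

```python
def v_min_flag(sweep):
    """Highest supply voltage at which at least one Flag is counted."""
    flagged = [s["vdd"] for s in sweep if s["flags"] > 0]
    return max(flagged) if flagged else None


def v_min_autotest(sweep):
    """Lowest supply voltage at which the Autotest signal still passes."""
    passing = [s["vdd"] for s in sweep if s["autotest_ok"]]
    return min(passing) if passing else None


# Hypothetical sweep at fixed clock frequency (illustrative, not measured data).
sweep = [
    {"vdd": 1.00, "flags": 0, "autotest_ok": True},
    {"vdd": 0.95, "flags": 0, "autotest_ok": True},
    {"vdd": 0.90, "flags": 3, "autotest_ok": True},
    {"vdd": 0.85, "flags": 12, "autotest_ok": True},
    {"vdd": 0.80, "flags": 40, "autotest_ok": False},
]

flag_v = v_min_flag(sweep)
autotest_v = v_min_autotest(sweep)

# Warning margin offered by the Flag before functional failure, in millivolts.
margin_mv = round((flag_v - autotest_v) * 1000)

print(flag_v, autotest_v, margin_mv)
```

<p>A positive margin means the Flag warns before the functional limit is reached; a margin near zero, as reported in the text for Workload D, leaves little room for a regulation loop to react.</p></div>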
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Insertion flow of in-situ built-in monitors. During the Front-End flow, a preliminary Timing Analysis is performed after the pre-CTS step. In-situ monitors are inserted in a subset of critical paths and a new gate netlist is generated. Then the Back-end flow is executed as usual with the new gate netlist.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Timing analysis results for BCH at different steps of the flow. A preliminary TA at post-CTS is calculated (step 1). Based on this ranking, the 5% worst slacks are selected, and ISMs are inserted. In the final TA (step 3), the slacks of monitored and non-monitored paths are presented.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. Relative worst slack penalty for the reference implementation, aged library implementation (for different mission profiles), 10% flow-based ISM and 10% cell-based ISM.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Fig. 5 .</head><label>5</label><figDesc>Fig. 5. Slack penalty for different implementations of the BCH circuit. Increasing the number of ISMs degrades the slack. The level of coverage (number of critical paths monitored relative to the initial critical paths targeted for monitoring) remains close to 40%.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Fig. 6 .</head><label>6</label><figDesc>Fig. 6. Management of a multicore architecture using ISM. At a fixed 1GHz clock, when decreasing the supply voltage, a warning Flag appears before the IP fails. The safety margins of the 18 cores lie within a 100mV supply voltage range.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head></head><label></label><figDesc>Plot legend and axis labels: 5% ISM, 10% ISM, 30% ISM; slack penalty (ns); % coverage of targeted set of CP. Measurement campaign: 300 dies using 3 time windows (TW1, TW2, TW3) and 5 workloads (Workload A, B, C, D, E).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Fig. 7 :</head><label>7</label><figDesc>Fig. 7: Evolution of Flag number with VDD using time window 1 (TW1) for different workloads: A, B, C, D, E (0, 1, 2, 3, 4 errors injected).</figDesc><graphic coords="4,43.80,233.88,252.60,162.96" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Fig. 8 :</head><label>8</label><figDesc>Fig. 8: Evolution of Flag number with VDD using TW1 and workload B under 30°C (magenta) and 125°C (blue) temperatures</figDesc><graphic coords="4,43.80,538.44,255.96,142.56" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">Copyright © 2016 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Adaptative wear out management with in-situ management</title>
		<author>
			<persName><forename type="first">V</forename><surname>Huard</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IRPS</title>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Path-RO: a novel on-chip critical path delay measurement under process variation</title>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE ACM</title>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Representative Critical Reliability Paths for low-cost and accurate on-chip aging evaluation</title>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE/ICCAD</title>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Digital circuits reliability with in-situ monitors in 28nm fully depleted SOI</title>
		<author>
			<persName><surname>Saliva</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE/DATE</title>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A Self-Tuning DVS Processor Using Delay-Error Detection and Correction</title>
		<author>
			<persName><forename type="first">S</forename><surname>Das</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE J. Solid-State Circuits</title>
		<imprint>
			<date type="published" when="2006-04">Apr. 2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">RazorII: In Situ Error Detection and Correction for PVT and SER Tolerance</title>
		<author>
			<persName><forename type="first">D</forename><surname>Blaauw</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE J. Solid-State Circuits</title>
		<imprint>
			<date type="published" when="2009-01">Jan. 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Energy-efficient and Metastability-Immune Resilient Circuits for Dynamic Variation Tolerance</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">A</forename><surname>Bowman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Journal of Solid-State Circuits</title>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A Variation-Aware Adaptive Voltage Scaling Technique Based on In-situ Delay monitoring</title>
		<author>
			<persName><forename type="first">M</forename><surname>Wirnshofer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE/DDECS</title>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
