Modeling and Analyzing NBTI in the Presence of Process Variation

Taniya Siddiqua, Sudhanva Gurumurthi, Mircea R. Stan†
Dept. of Computer Science, †Dept. of Electrical and Computer Engg., University of Virginia
{taniya,gurumurthi,mircea}@virginia.edu

Abstract

With continuous scaling of transistors in each technology generation, NBTI and Process Variation (PV) have become very important silicon reliability problems for the microprocessor industry. In this paper, we develop an analytical model to capture the impact of NBTI in the presence of PV for use in architecture simulations. We capture the following aspects in the model: i) variation in NBTI related to stress and recovery due to workloads, ii) temporal variation in NBTI due to Random Charge Fluctuation (RCF) and iii) Random Dopant Fluctuation (RDF) due to process variation. We use this model to analyze the combined impact of NBTI and PV on a memory structure (register file) and a logic structure (Kogge-Stone adder). We show that the impact of the threshold voltage variations due to NBTI and PV over the nominal degradation can hurt the yield of the structures. Due to the combined effect of NBTI and PV across different benchmarks, 26 to 117 bits fail in an 8Kb register file and the execution delay increases by 18% to 28% in a Kogge-Stone adder. We then discuss the implications of these results for architecture-level reliability techniques.

1 Introduction

We are in the era of multicore processors and it is expected that the number of processing cores on a chip will steadily increase over the next decade, driven by Moore’s law. While technology scaling paves the way for high performance multicore processors, the scaling has a dark side too: silicon reliability. Processors are becoming increasingly susceptible to a variety of silicon reliability problems, from soft errors and process variation to several hard error phenomena, which can cause permanent damage to the processor.
One important hard error phenomenon is Negative Bias Temperature Instability (NBTI), which affects the lifetime of PMOS transistors. NBTI occurs when a negative bias is applied at the gate of a PMOS transistor and causes an increase in the threshold voltage of the device. In terms of its impact on microprocessor circuits, this increase in the threshold voltage degrades the speed of the transistors and therefore degrades the speed of the circuit in which they are used, eventually causing the circuit to violate timing constraints [16, 11]. Such a timing violation will cause the circuit to behave incorrectly and cause the processor itself to fail. Moreover, the impact of NBTI is exacerbated by Process Variation (PV). PV is the variation in the transistor attributes (length, width, oxide thickness) caused during the fabrication of the integrated circuits. It manifests itself as threshold voltage variations, which result in variability in circuit performance and power. Processors have to be designed to provide adequate protection against both of these problems.

Both NBTI and PV have received attention in the architecture community in recent years and several mitigation techniques have been proposed for each [1, 21, 18, 19, 20, 7]. Since both NBTI and PV affect the threshold voltage of devices, these two problems should not be addressed in isolation. To come up with the appropriate mitigation techniques, it is important to accurately gauge the impact of both NBTI and PV and to factor in the impact of the workloads that run on the processor as well. For this purpose, an analytical model is required which captures the impact of both NBTI and PV in a coherent way and which is suitable for use in architecture-level analyses.

There have been several efforts in developing analytical models for NBTI and PV at the circuit level.
However, these models are suitable only for analyzing NBTI and PV effects over a very short time span and are not readily usable for architecture simulations. Architects, on the other hand, study microprocessor reliability by executing different program benchmarks and extrapolating the collected statistics over a much longer timescale (typically, 7-10 years). Throughout the benchmark execution, utilizations of the microarchitectural structures vary. Also, the interactions among the structures, the inputs to each structure, and the bits stored within them change over the course of execution of a benchmark. The analytical model for NBTI and PV should be able to factor in all these “variations” to be usable in architecture simulations and to give correct and holistic insight into these inter-related silicon reliability phenomena.

In this paper, we leverage the prior research on NBTI and PV modeling from the circuits community to develop a model that captures the interactions between these two reliability phenomena and which is usable at the architecture level.