An accurate model for soft error rate estimation considering dynamic voltage and frequency scaling effects

Farshad Firouzi a,⇑, Mostafa E. Salehi a, Fan Wang b, Sied Mehdi Fakhraie a

a Nano Electronics Center of Excellence, School of Electrical and Computer Engineering, University of Tehran, North Kargar Ave., Tehran 14395-515, Iran
b Juniper Networks, Inc., Sunnyvale, CA, USA

Article info
Article history:
Received 30 May 2010
Received in revised form 20 August 2010
Accepted 24 August 2010
Available online 24 September 2010

Abstract
Due to shrinking feature sizes and higher transistor counts per chip in modern fabrication technologies, power consumption and soft error reliability have become two critical challenges that chip designers face in new silicon integrated circuits. Recent studies have shown that these two issues compromise each other. Moreover, power consumption and reliability vary significantly across workloads and among the phases of a single application, which can be exploited to design adaptive runtime fault-tolerant and low-power systems. This variation has been exploited in prior studies to design online reconfigurable fault-tolerant systems with power management schemes. However, those attempts are driven by complicated simulations and give designers little sense of direction. To achieve maximum efficiency in terms of power, performance, and reliability under dynamic scaling of voltage and frequency, it is critical to have a simple and accurate reliability model that estimates the fault rate as a function of the supply voltage and operating frequency of a circuit. In this paper, we propose an accurate formula for analytic modeling of the soft error rate of a system, which can be used to precisely track the reliability of the system under dynamic voltage and frequency adjustments.
The experimental results of this paper show that our proposed model offers precise estimates of reliability, in agreement with the results of an accurate soft error rate (SER) estimation algorithm on the ISCAS85 benchmark circuits.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Because of the increasing complexity, high transistor count, smaller feature size, reduced noise margin, and higher parasitic capacitance of modern integrated circuits, many intrinsic effects that were unimportant in previous technologies are becoming noticeable. Each of these effects can individually cause unreliability and make future generations of digital systems prone to transient faults. Besides, the feature-size scaling trend, which implies higher power consumption per unit area, is becoming another vital concern.

Transient faults can arise from multiple sources. However, high-energy particles, such as alpha particles produced by impurities in packaging and high-energy neutrons induced by cosmic radiation, are the most important sources of soft errors. It is predicted that the failure rate due to radiation-induced soft errors will dominate all other reliability issues [1]. Thus, the increasing rate of soft errors forces designers to make special arrangements to include more reliability and fault-tolerance features in future deep-submicron ICs.

Whenever a particle strikes a junction area, it creates electron–hole pairs along its path. Collection of these carriers by the junction through drift and diffusion mechanisms might result in a transition to a new logical value. This phenomenon is called a transient fault. If this faulty value is captured by a sequential element, the new false value remains in effect until it is overwritten by a new one. Nevertheless, in static logic the value returns to its initial state after a few picoseconds.
However, it creates a glitch that can propagate through several gates and might change the state of the system, which is held by sequential elements.

Another important issue facing new integrated circuits is the limit on power consumption. Digital designers face increasing challenges in reducing power consumption and avoiding thermal hotspots. Researchers have proposed many static and dynamic techniques to address power and performance issues at different levels [2,3]. Because power consumption depends quadratically on the supply voltage, dynamic voltage and frequency scaling (DVFS) is an effective technique for controlling both energy and performance. DVFS techniques reduce the supply voltage to the minimum value at which the timing constraints are still marginally satisfied. However, reliability is affected as a direct consequence of employing DVFS. The rate of transient faults or soft errors caused by cosmic-ray radiation is also strongly affected by the system operating frequency and supply voltage. This further complicates finding a trade-off between system reliability

0026-2714/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.microrel.2010.08.016

⇑ Corresponding author.
E-mail addresses: farshadfirouzi@gmail.com (F. Firouzi), mersali@ut.ac.ir (M.E. Salehi), fanw@juniper.net (F. Wang), fakhraie@ut.ac.ir (S.M. Fakhraie).

Microelectronics Reliability 51 (2011) 460–467
Contents lists available at ScienceDirect
Microelectronics Reliability
journal homepage: www.elsevier.com/locate/microrel