Full Two-Dimensional Markov Chain Analysis of Thermal Soft Errors in Subthreshold Nanoscale CMOS Devices

Pooya Jannaty, Florian C. Sabou, R. Iris Bahar, Joseph Mundy, William Patterson, Alexander Zaslavsky

Abstract—Thermally induced fluctuations in the logic state of a simple flip-flop occur on a time scale that renders them impossible to simulate through Monte Carlo methods. In previous work, an analytical framework based on Markov chains and queue theory was introduced along with a symbolic solution for a truncated one-dimensional queue, diagonally connecting the two stable logic states in a two-dimensional (2D) queue. In this paper, a complete solution for a full 2D queue is presented, which maps all the possible thermal noise fluctuations of electron populations in flip-flop inverters. The results for the mean time to thermally-induced error confirm the estimates given by truncated approximations. This formalism is also capable of computing arbitrary probability moments as well as steady-state distributions and transient behavior of the system. The full 2D queue can also capture the statistics of other noise sources, such as radiation-induced charge generation, where the flip-flop can transiently reside in a queue state far from the diagonal connecting its two stable logic states.

Index Terms—CMOSFET logic devices, Markov processes, Monte Carlo methods, Numerical analysis, Reliability.

I. INTRODUCTION

RELIABILITY analysis for subthreshold and low-voltage regimes of operation is motivated by the growing field of ultra-low-power digital circuits, which furnish one possible avenue for reducing overall energy consumption. The reduction in the operating voltage $V_{dd}$ reduces the error margin with respect to different noise sources and raises the need for probabilistic frameworks capable of analyzing the effect of various noise sources on low-power devices.
The error rate estimates arising from such models can serve as a guideline for designing logic circuits operated at ultra-low $V_{dd}$ [1]-[6]. For devices or systems with sufficiently high error rates, Monte Carlo techniques can be effectively employed to estimate the time to error. However, Monte Carlo approaches become computationally prohibitive when the error rates are low: for the simple flip-flop circuit we will use to demonstrate our approach, the Monte Carlo computation time increases exponentially with the number of electrons stored on the node capacitances, as detailed in [7]. For this reason, in our previous work [7][8], a probabilistic framework for the analysis of thermal-noise-induced variations in the logic stability of memory circuits (flip-flops) has been developed.

(Footnote: The authors are with the School of Engineering and the Department of Physics, Brown University, Providence, Rhode Island 02912, USA. Copyright © 2010 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained by sending a request to pubs-permissions@ieee.org.)

Assuming Poisson processes for the arrival and departure of carriers at the source and drain of a transistor operated in the subthreshold regime [9], the charging and discharging rates for the nodal capacitors $C_1$ and $C_2$ shown in Fig. 1 can be expressed in terms of the corresponding Poisson rates $\mu$ and $\lambda$ as

$$C_i \text{ charging rate} = \lambda_{ni} + \lambda_{pi}, \qquad C_i \text{ discharging rate} = \mu_{ni} + \mu_{pi}. \tag{1}$$

The charge on capacitors $C_1$ and $C_2$ can be mapped onto a two-dimensional (2D) state queue, where each state corresponds to a unique combination of charges $k_1$ and $k_2$ on capacitors $C_1$ and $C_2$. The two valid logic states, ‘0’ and ‘1’, correspond to states in opposite corners of the 2D queue.
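As a minimal illustration of how Eq. (1) populates the 2D queue, the following Python sketch assembles the transition-rate (generator) matrix for a small grid of states $(k_1, k_2)$. The helper names `total`, `const`, and `build_generator`, the constant placeholder rates, the truncation of the grid at its edges, and the convention that a charging event increments $k_i$ are all assumptions made for illustration only; the state dependence of the actual subthreshold rates is not reproduced here.

```python
from typing import Callable, List

# A rate is a function of the queue state (k1, k2).
Rate = Callable[[int, int], float]


def const(value: float) -> Rate:
    """Placeholder rate that ignores the state (illustrative assumption)."""
    return lambda k1, k2: value


def total(rate_n: Rate, rate_p: Rate) -> Rate:
    """Eq. (1): a node's rate is the sum of the nMOS and pMOS Poisson rates."""
    return lambda k1, k2: rate_n(k1, k2) + rate_p(k1, k2)


def build_generator(n1: int, n2: int,
                    charge1: Rate, discharge1: Rate,
                    charge2: Rate, discharge2: Rate) -> List[List[float]]:
    """Generator matrix of a 2D queue with 0 <= k1 < n1 and 0 <= k2 < n2."""
    def index(k1: int, k2: int) -> int:
        return k1 * n2 + k2  # row-major flattening of (k1, k2)

    size = n1 * n2
    q = [[0.0] * size for _ in range(size)]
    for k1 in range(n1):
        for k2 in range(n2):
            i = index(k1, k2)
            # Assumed convention: charging raises k_i, discharging lowers it.
            moves = [
                (k1 + 1, k2, charge1(k1, k2)),
                (k1 - 1, k2, discharge1(k1, k2)),
                (k1, k2 + 1, charge2(k1, k2)),
                (k1, k2 - 1, discharge2(k1, k2)),
            ]
            for m1, m2, rate in moves:
                # Transitions that would leave the grid are dropped.
                if 0 <= m1 < n1 and 0 <= m2 < n2:
                    q[i][index(m1, m2)] += rate
            # The diagonal entry makes each row sum to zero.
            q[i][i] = -sum(q[i])
    return q


# Hypothetical constant rates: lambda_n = lambda_p = 0.5, mu_n = mu_p = 1.0.
charge = total(const(0.5), const(0.5))     # lambda_ni + lambda_pi = 1.0
discharge = total(const(1.0), const(1.0))  # mu_ni + mu_pi = 2.0
Q = build_generator(3, 3, charge, discharge, charge, discharge)
```

Each row of `Q` holds the outgoing rates of one state, so mean first-passage times or stationary distributions could in principle be obtained from it with a standard linear solver. The grid size of 3 by 3 is chosen only to keep the example small.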
If the noise source is purely thermal, it was shown [7][8] that thermally-induced transitions between the valid states are exponentially rare and occur predominantly on a diagonal path of the 2D queue. However, other mechanisms of upset, such as radiation events, could create a wide range of charge amounts at either electronic node. This sudden change in the carrier population on the two inverters also changes the state of the system as represented in the 2D queue, often moving the system far from the stable corner states. Thermal broadening at higher temperatures also necessitates taking into account the states farther from the diagonal. To study the stability of the system in these situations and also in order to confirm the diagonal approximation [7] in estimating the thermally-induced soft error rates, a complete solution of the full 2D queue is necessary.

Fig. 1. The modeled flip-flop circuit. Capacitors $C_1$ and $C_2$ represent the node capacitances associated with each inverter. For each transistor, the charging and discharging rates $\lambda$ and $\mu$ determine the electron populations $k_1$ and $k_2$ on the node capacitances.

A numerical solution to the full 2D queue