IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 24, NO. 12, DECEMBER 2005 1859
Modeling of Failure Probability and Statistical
Design of SRAM Array for Yield Enhancement
in Nanoscaled CMOS
Saibal Mukhopadhyay, Student Member, IEEE, Hamid Mahmoodi, Student Member, IEEE, and
Kaushik Roy, Fellow, IEEE
Abstract—In this paper, we have analyzed and modeled failure
probabilities (access-time failure, read/write failure, and hold failure)
of static random-access memory (SRAM) cells due to
process-parameter variations. A method to predict the yield of a
memory chip based on the cell-failure probability is proposed. A
methodology to statistically design the SRAM cell and the memory
organization is proposed using the failure-probability and the
yield-prediction models. The developed design strategy statisti-
cally sizes different transistors of the SRAM cell and optimizes
the number of redundant columns to be used in the SRAM array,
to minimize the failure probability of a memory chip under area
and leakage constraints. The developed method can be used in an
early stage of a design cycle to enhance memory yield in the nanometer
regime.
Index Terms—Leakage, performance, random dopant fluctuation
(RDF), robustness, static random-access memory
(SRAM), yield.
I. INTRODUCTION
The random variations in process parameters have emerged
as a major design challenge in circuit design in the
nanometer regime [1]–[3]. The sources of the inter-die and the
intra-die variations in process parameters include variations in
channel length, channel width, oxide thickness, threshold
voltage, line-edge roughness, and random dopant fluctuations (RDF) [the
random variations in the number and location of dopant atoms
in the channel region of the device, which result in random
variations in the transistor threshold voltage] [1]–[5]. These
different sources of variations result in significant variation in
the delay and the leakage of digital circuits [1]–[5]. The inter-die
variation in a parameter [say, threshold voltage ($V_t$)] modifies
the value of that parameter for all transistors in a die in
the same direction (i.e., the threshold voltages of all the transistors
either increase or decrease). This principally results in a spread
in the delay and the leakage, but does not cause a mismatch
between different transistors in a die. On the other hand, the
intra-die variations shift the process parameters of different
transistors in a die in different directions (e.g., the $V_t$ of some
transistors increases whereas that of others decreases). The
intra-die (or on-die) variations can be systematic (i.e., the shift
in a parameter of one transistor depends on the shift of that
parameter in a neighboring transistor) or random (i.e., the shifts
in a parameter of two neighboring transistors are completely
independent). An example of systematic intra-die variation
is a spatially correlated change in the channel length of different
transistors in a die. The RDF-induced $V_t$
variation is a classic example of random intra-die variation.
Systematic variation does not result in large differences
between two transistors that are in close spatial proximity.
The random component of the intra-die variation, in contrast, can result in
a significant mismatch between neighboring transistors in a
die [1]–[5].

Manuscript received September 14, 2003; revised December 2, 2004. This
work was supported in part by the Semiconductor Research Corporation, the
Defense Advanced Research Projects Agency Power Aware Computing and
Communication (DARPA PACC) Program, Intel, and IBM Corporation. This
paper was recommended by Associate Editor S. Sapatnekar.
The authors are with the Department of Electrical and Computer
Engineering, Purdue University, West Lafayette, IN 47907 USA (e-mail:
sm@ecn.purdue.edu; mahmoodi@ecn.purdue.edu; kaushik@ecn.purdue.edu).
Digital Object Identifier 10.1109/TCAD.2005.852295

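
The distinction between inter-die and random intra-die variation drawn in the Introduction can be illustrated with a small Monte Carlo sketch. It is not taken from the paper's analysis. It assumes Gaussian threshold-voltage shifts, and the nominal $V_t$, the sigma values, and the function name `simulate` are hypothetical choices made only for illustration. The systematic intra-die component is omitted, on the assumption that it is nearly identical for two neighboring transistors.

```python
import random
import statistics

VT_NOMINAL = 0.30  # V; hypothetical nominal threshold voltage


def simulate(sigma_inter, sigma_random, n_dies=20000, seed=1):
    """Return (spread of Vt across dies, mismatch between two neighbors).

    Both values are standard deviations in volts. All distributions and
    parameter values are illustrative assumptions, not data from the paper.
    """
    rng = random.Random(seed)
    vt_first = []  # Vt of the first transistor on each sampled die
    vt_diff = []   # Vt difference between the two neighboring transistors
    for _ in range(n_dies):
        # Inter-die shift: shared by every transistor on the same die.
        die_shift = rng.gauss(0.0, sigma_inter)
        # Random intra-die shift (e.g., RDF): independent per transistor.
        vt_a = VT_NOMINAL + die_shift + rng.gauss(0.0, sigma_random)
        vt_b = VT_NOMINAL + die_shift + rng.gauss(0.0, sigma_random)
        vt_first.append(vt_a)
        vt_diff.append(vt_a - vt_b)
    return statistics.pstdev(vt_first), statistics.pstdev(vt_diff)


if __name__ == "__main__":
    for sigma_inter in (0.0, 0.05):
        spread, mismatch = simulate(sigma_inter, sigma_random=0.03)
        print(f"sigma_inter={sigma_inter:.2f} V: "
              f"spread={spread:.4f} V, mismatch={mismatch:.4f} V")
```

In this sketch the shared die shift cancels in the difference between the two neighbors. Increasing `sigma_inter` therefore widens the spread of $V_t$ across dies but leaves the neighbor mismatch governed by `sigma_random` alone.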
In a static random-access memory (SRAM) cell, a mis-
match in the strength between the neighboring transistors,
caused by intra-die variations, can result in the failure of the
cell [7]–[9]. For example, a cell failure can occur due to: 1) an
increase in the cell access time (access time failure); 2) unstable
read (flipping of the cell data while reading) and/or write
(inability to successfully write to a cell) operations (read/write
failure); or 3) a loss of the data-holding capability of the
cell in the standby mode, i.e., flipping of the cell data when the
supply voltage is lowered below its nominal value (hold
failure). Since these failures are caused
by variations in the device parameters, they are known
as parametric failures [8], [9]. There can also be hard
failures (caused by opens or shorts) and soft failures (caused by soft
errors). In this paper, we concentrate only on the parametric
failures; hereafter, the word “failure” refers
to a parametric failure. A failure in any of the cells in a
column of the memory will make that column faulty. In a
memory, redundant columns are used to improve fault
tolerance: when a column is detected as
faulty, it is replaced by an available redundant column.
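
The link between the cell-failure probability and the chip-level outcome under column redundancy can be sketched numerically. The following is a minimal illustration and not the yield model developed in this paper. It assumes that cell failures are independent and identically distributed, that redundant columns can fail in the same way as regular columns, and that a chip is repairable as long as its faulty columns do not outnumber the redundant ones. The function and parameter names are hypothetical.

```python
from math import comb


def column_failure_prob(p_cell, n_rows):
    """Probability that a column has at least one failing cell.

    Assumes independent cell failures with probability p_cell each.
    """
    return 1.0 - (1.0 - p_cell) ** n_rows


def chip_failure_prob(p_cell, n_rows, n_data_cols, n_redundant_cols):
    """Probability that faulty columns outnumber the redundant columns.

    Assumption: all n_data_cols + n_redundant_cols physical columns fail
    independently with the same probability, so the number of faulty
    columns follows a binomial distribution.
    """
    p_col = column_failure_prob(p_cell, n_rows)
    n_total = n_data_cols + n_redundant_cols
    # Chip is repairable when at most n_redundant_cols columns are faulty.
    p_repairable = sum(
        comb(n_total, k) * p_col ** k * (1.0 - p_col) ** (n_total - k)
        for k in range(n_redundant_cols + 1)
    )
    return 1.0 - p_repairable


if __name__ == "__main__":
    # Hypothetical array: 256 rows, 64 data columns, cell failure prob. 1e-5.
    for n_red in (0, 2, 4):
        p_fail = chip_failure_prob(1e-5, 256, 64, n_red)
        print(f"redundant columns={n_red}: chip failure probability={p_fail:.3e}")
```

Under these assumptions, one minus the returned value gives a rough estimate of the parametric yield, and adding redundant columns lowers the chip failure probability for a fixed cell-failure probability.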
Thus, if the number of faulty columns in a memory chip is
larger than the number of available redundant columns, then
the chip is considered to be faulty (a similar argument holds
for a memory designed with row redundancy). Hence, the
probability of failure of a cell is directly related to the yield
of a memory chip. Thus, the intra-die-variation-induced device
mismatch can significantly reduce the yield of a memory. As the
effect of the intra-die variations increases with the technology