2742 IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 60, NO. 4, AUGUST 2013
Radiation and Fault Injection Testing of a
Fine-Grained Error Detection Technique for FPGAs
Gabriel L. Nazar, Paolo Rech, Christopher Frost, and Luigi Carro
Abstract—We present the experimental evaluation of a
fine-grained hardening approach that exploits underused and
abundant resources found in state-of-the-art SRAM-based FPGAs
to detect radiation-induced errors on configuration memories. The
technique’s main goal is to provide the benefits of fine-grained re-
dundancy, namely improved diagnosis and reduced error latency,
with a reduced area overhead. Neutron experiments, validated
with fault injection campaigns, demonstrate the proposed tech-
nique’s efficiency when compared to the traditional dual modular
redundancy.
Index Terms—Fault tolerance, field-programmable gate arrays
(FPGAs), neutron radiation effects.
I. INTRODUCTION
Field-Programmable Gate Arrays (FPGAs) have seen
great success over the past years due to their high perfor-
mance, flexibility and fast time-to-market. Moreover, the pos-
sibility of reprogramming the device after deployment allows
the addition of new functionalities or the correction of design
bugs, extending the system’s lifetime. Despite these advantages,
FPGA utilization in critical systems has been limited due to re-
liability issues. With the aggressive scaling of transistor feature
sizes, radiation-induced Single Event Effects (SEEs) have become a
major threat to the reliability of electronic devices. While this
concern was originally more prominent in radiation-harsh environments,
such as space, recent technologies may suffer from SEEs
even in terrestrial applications [1]. It is then crucial to exper-
imentally characterize the susceptibility of a device to these
effects.
As SRAM-based FPGAs have their functionality stored in
large memory arrays, which represent the vast majority of the
storage cells in the device, Single Event Upsets (SEUs) or
Multiple Bit Upsets (MBUs) affecting configuration cells are
a major concern for the overall system reliability. Evaluating
the effects of such faults in FPGAs is, hence, crucial to enable
their use in critical systems. The two main means to do so are
through fault injection and accelerated radiation experiments,
which are often complementary approaches. The first is able
to inject a much larger number of faults in a short period of
time, while the second allows a more accurate evaluation of the
effects of radiation on the device.
Manuscript received September 28, 2012; revised February 04, 2013; accepted
April 30, 2013. Date of publication May 31, 2013; date of current version
August 14, 2013. This work was supported by the CAPES foundation of the
Ministry of Education, the CNPq research council of the Ministry of Science and
Technology, and the FAPERGS research agency of the State of Rio Grande do
Sul, Brazil. Experiments were performed in ISIS, Rutherford Appleton
Laboratories, Didcot, U.K., and funded by the Science and Technology Facilities Council.
G. L. Nazar, P. Rech, and L. Carro are with the Instituto de Informática,
Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, RS 91509-900,
Brazil (e-mail: glnazar@inf.ufrgs.br; prech@inf.ufrgs.br; carro@inf.ufrgs.br).
C. Frost is with the ISIS, Rutherford Appleton Laboratories, Didcot OX11
0QX, U.K. (e-mail: christopher.frost@stfc.ac.uk).
Digital Object Identifier 10.1109/TNS.2013.2261319
Fault injection can be performed at various abstraction
levels, from high-level models down to an actual silicon device.
Specifically for FPGAs, due to the complex and frequently
unpredictable effects of configurations that were not foreseen
by manufacturers, the use of high-level models or simulation
software can become too complex or inaccurate. Furthermore,
detailed low-level schematics of the device are usually not
available to users, increasing the complexity of assessing
the purpose of each configuration bit. Thus, fault injection is
usually performed directly in an actual FPGA, with different
approaches [2]–[7]. Injecting faults directly into the FPGA has
the additional benefit of greatly reducing the total experiment
time, as the circuit under test runs at full speed.
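The idea of a configuration-memory fault injection campaign can be illustrated with a minimal sketch. This is not the injection platform used in this work or in [2]–[7]. It assumes a hypothetical toy model in which a 4-input LUT is a 16-bit truth table, the design uses only two of its inputs, and a fault is a single flipped configuration bit. All names (`lut_eval`, `make_and2_on_lut4`, `circuit`, `inject_all_single_bit_faults`) are illustrative.

```python
from itertools import product


def lut_eval(config_bits, inputs):
    """Evaluate a LUT: the input values select one bit of the truth table."""
    index = 0
    for i, bit in enumerate(inputs):
        index |= (bit & 1) << i
    return config_bits[index]


def make_and2_on_lut4():
    """Hypothetical design: a 2-input AND mapped onto a 4-input LUT.

    The truth table is filled for all 16 entries, but the circuit below
    ties inputs 2 and 3 to 0, so only entries 0..3 are ever addressed.
    """
    return [1 if (idx & 0b11) == 0b11 else 0 for idx in range(16)]


def circuit(config_bits, a, b):
    """Circuit under test: unused LUT inputs are tied to 0 (assumption)."""
    return lut_eval(config_bits, (a, b, 0, 0))


def inject_all_single_bit_faults(golden):
    """Flip each configuration bit in turn and report the bits whose flip
    changes the circuit output for at least one input vector."""
    critical = []
    for bit in range(len(golden)):
        faulty = list(golden)
        faulty[bit] ^= 1  # emulate an SEU in one configuration cell
        if any(circuit(faulty, a, b) != circuit(golden, a, b)
               for a, b in product((0, 1), repeat=2)):
            critical.append(bit)
    return critical


critical_bits = inject_all_single_bit_faults(make_and2_on_lut4())
print(critical_bits)
```

In this toy model only the four truth-table entries reachable by the used inputs are critical, which shows why the fraction of configuration bits that can cause an error depends on the implemented design.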
However, although fault injection experiments are suitable
to quickly determine relevant metrics such as fault coverage,
they are unable to directly measure properties such as cross-section
or failure rate, as no physical disturbance is suffered by the de-
vice. Thus, radiation experiments are relevant to measure the
actual susceptibility of a device to such effects. In this work,
we report the results of neutron experiments conducted to esti-
mate the cross-section and failure rates attainable with a fine-
grained error detection technique, relative to those of a tradi-
tional coarse-grained approach. These experiments are valuable
to validate the fault injection campaigns conducted on circuits
with the technique [8], [9].
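For reference, the quantities mentioned above are commonly defined as follows. The symbols are assumptions introduced here for illustration and may differ from the notation used elsewhere in this paper: $N_{\mathrm{err}}$ is the number of observed errors, $\Phi$ the particle fluence received by the device (in neutrons/cm$^2$), and $\phi_{\mathrm{env}}$ the neutron flux of the target environment (in neutrons/(cm$^2\cdot$h)).

```latex
% Cross-section (cm^2): observed errors per unit fluence
\sigma = \frac{N_{\mathrm{err}}}{\Phi}

% Failure rate in FIT (failures per 10^9 device-hours)
\mathrm{FIT} = \sigma \cdot \phi_{\mathrm{env}} \cdot 10^{9}

% Relative comparison of two hardening schemes tested under the same beam
\frac{\sigma_{\mathrm{fine}}}{\sigma_{\mathrm{DMR}}}
  = \frac{N_{\mathrm{err,fine}} / \Phi_{\mathrm{fine}}}
         {N_{\mathrm{err,DMR}} / \Phi_{\mathrm{DMR}}}
```

Because fault injection produces no fluence, it can estimate the fraction of faults that lead to errors but not $\sigma$ itself, which is why beam experiments are needed for the absolute figures.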
Fine-grained redundancy techniques as a means to mitigate
transient faults in FPGAs have been proposed in several works
[8]–[14]. Among the main advantages of such techniques are the
ability to quickly detect faults and the improved diagnosis infor-
mation provided by fine-grained comparators, as the output of
each Lookup Table (LUT) can be compared to a replica. These
features have a great potential to reduce the repair time of faults
affecting configuration bits. As such faults are usually removed
by means of scrubbing [15], the time required to traverse the
configuration memory limits the attainable repair time, which
is usually on the order of several milliseconds and may be too
long for critical real-time applications. Furthermore, once the
error has propagated to sensitive parts of the user circuit, such
as feedback structures, even its removal from the configuration
may not restore the circuit functionality [11].
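The difference between fine-grained and coarse-grained comparison can be sketched with a toy example. This is not the comparator structure of [8], [9]. It assumes a hypothetical two-LUT circuit that is fully duplicated, with one comparator per LUT output in the fine-grained case and a single comparator on the final output in the coarse-grained case. All names are illustrative.

```python
def lut_eval(table, inputs):
    """Evaluate a LUT: the input values select one truth-table entry."""
    index = sum((bit & 1) << i for i, bit in enumerate(inputs))
    return table[index]


XOR2 = [0, 1, 1, 0]  # truth table of a 2-input XOR
AND2 = [0, 0, 0, 1]  # truth table of a 2-input AND


def run(tables, a, b, c):
    """Toy circuit: s0 = LUT0(a, b); s1 = LUT1(s0, c). Returns [s0, s1]."""
    s0 = lut_eval(tables[0], (a, b))
    s1 = lut_eval(tables[1], (s0, c))
    return [s0, s1]


def fine_grained_check(main, replica, a, b, c):
    """Compare every LUT output with its replica; return mismatching LUT indices."""
    pairs = zip(run(main, a, b, c), run(replica, a, b, c))
    return [i for i, (x, y) in enumerate(pairs) if x != y]


def coarse_grained_check(main, replica, a, b, c):
    """Compare only the final circuit outputs, as a module-level check would."""
    return run(main, a, b, c)[-1] != run(replica, a, b, c)[-1]


golden = [XOR2, AND2]
faulty_xor = list(XOR2)
faulty_xor[1] ^= 1            # emulate an upset in one bit of LUT0
faulty = [faulty_xor, AND2]

# With c = 0 the AND stage masks the wrong value produced by LUT0.
print(fine_grained_check(faulty, golden, 1, 0, 0))    # [0]
print(coarse_grained_check(faulty, golden, 1, 0, 0))  # False
```

For this input vector the fine-grained check already flags LUT 0, while the output-level check sees no difference and would only detect the fault once a later input propagates it. This illustrates both the lower error latency and the finer diagnosis discussed above.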
The main disadvantage of fine-grained techniques is the area
cost of the additional voters or comparators. In this work we use
a technique that exploits a resource that is typically abundant and
underused in FPGAs, namely the carry propagation chain, to
implement fine-grained comparators [8], [9]. This circuit is al-
ready included in the basic configurable blocks of most state-of-
0018-9499 © 2013 IEEE