Hindawi Publishing Corporation
VLSI Design
Volume 2013, Article ID 785281, 12 pages
http://dx.doi.org/10.1155/2013/785281
Research Article
A General Design Methodology for
Synchronous Early-Completion-Prediction Adders in
Nano-CMOS DSP Architectures
Mauro Olivieri and Antonio Mastrandrea
Department of Information Engineering, Electronics and Telecommunications, Sapienza University of Rome,
Via Eudossiana 18, 00184 Rome, Italy
Correspondence should be addressed to Mauro Olivieri; olivieri@diet.uniroma1.it
Received 12 September 2012; Revised 2 December 2012; Accepted 5 December 2012
Academic Editor: Meng-Hsueh Chiang
Copyright © 2013 M. Olivieri and A. Mastrandrea. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
Synchronous early-completion-prediction adders (ECPAs) are used for high clock rate and high-precision DSP datapaths, as they
allow the large majority of operations to complete in a single cycle even if the worst-case carry propagation delay is longer than the clock period.
Previous works have also demonstrated ECPA advantages for average leakage reduction and NBTI effects reduction in nanoscale
CMOS technologies. This paper illustrates a general systematic methodology to design ECPA units, targeting nanoscale CMOS
technologies, which is not available in the current literature yet. The method is fully compatible with standard VLSI macrocell
design tools and standard adder structures and includes automatic definition of critical test patterns for postlayout verification. A
design example is included, reporting speed and power data superior to previous works.
1. Introduction
Fast integer adders are an essential component of most DSP
datapaths. Synchronous early-completion-prediction adders
(ECPAs) [1], also known as variable-latency adders [2],
have been introduced for high clock rate and high-precision
datapaths, as they allow single-cycle operations even if the
clock period is shorter than the worst-case carry propagation
delay. Thanks to the data dependency of actual carry chain
propagation, the occurrence of multicycle operations can be
kept statistically rare, thus allowing an overall speed
improvement. The industrial effectiveness of the idea was first
proven by the design of a full-custom ECPA unit for a DSP
datapath at Toshiba Labs [1]. The logic foundation of that
adder is shown in [3]. An extension to multiply unit design
has been shown in [4]. The works in [2] and [5] have recently
pointed out the potential of variable-latency adders
in nano-CMOS addition units, for reducing average leakage
power consumption and improving robustness to NBTI faults
occurring in nano-scale technologies.
An ECPA consists of a conventional adder plus a
completion-prediction logic unit (Figure 1). The prediction
unit estimates the actual critical path length in the adder
depending on the operand values and hence the cycle count
of the operation for the target cycle time. This approach
differs from asynchronous completion detection units [6–8],
as it is based on a totally synchronous scheme. From
the design point of view, the logic specification of the
prediction function depends on the target cycle time and on
the estimation of the variable completion time of the adder, in
order to define the cycle count output. Moreover, the speed of
the prediction unit is critical, since the prediction must always
be completed in a single cycle in order to be effective.
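
The role of the prediction unit can be illustrated with a minimal behavioral sketch in Python. It is not the prediction function derived in this paper. It assumes a plain ripple-carry model with one unit of delay per bit position, a two-cycle worst case, and a hypothetical parameter max_chain standing for the longest carry chain that fits in one clock period. The function names are illustrative. The predictor inspects only the propagate bits, so it is fast but conservative: it may request two cycles when one would suffice, but it never predicts one cycle for an addition whose carry chain exceeds max_chain.

```python
def longest_carry_chain(a: int, b: int, width: int) -> int:
    """Longest carry chain (in bit positions) when adding a + b, carry-in 0.

    Assumed model: a chain starts at a 'generate' bit (a_i AND b_i) and is
    extended by each following 'propagate' bit (a_i XOR b_i).
    """
    longest = 0
    run = 0  # length of the carry chain currently rippling upward
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        if ai & bi:            # generate: a new carry starts here
            run = 1
        elif (ai ^ bi) and run > 0:
            run += 1           # propagate: the incoming carry ripples through
        else:
            run = 0            # kill, or no incoming carry to propagate
        longest = max(longest, run)
    return longest


def predict_cycles(a: int, b: int, width: int, max_chain: int) -> int:
    """Conservative cycle-count prediction (hypothetical rule, for illustration).

    Returns 2 if the operands contain a run of at least max_chain consecutive
    propagate bits, else 1. A carry chain longer than max_chain needs at least
    max_chain consecutive propagate bits, so a result of 1 is always safe.
    """
    propagate = (a ^ b) & ((1 << width) - 1)
    run = 0
    for i in range(width):
        if (propagate >> i) & 1:
            run += 1
            if run >= max_chain:
                return 2
        else:
            run = 0
    return 1
```

For instance, with width=8 and max_chain=4, adding 0b01111111 and 1 produces a carry chain of length 7 under this model and is flagged as a two-cycle operation, whereas adding 0b00010001 to itself produces only isolated carries and is predicted to complete in a single cycle.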
No general design methodology for ECPA VLSI cores
has been proposed yet. In [3], Lee and Asada analyzed the
design problem on the basis of 2-input-gate unit delay within
a ripple carry adder structure. In [1], Kondo et al. address
the full-custom design case of a fast carry-select structure.
In [9], Nowick et al. deal with the design of speculative-
completion adders, similar in principle to ECPA but again