172 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, VOL. 56, NO. 2, FEBRUARY 2009

Designing High-Speed Adders in Power-Constrained Environments

Fabio Frustaci, Marco Lanuzza, Paolo Zicari, Stefania Perri, Member, IEEE, and Pasquale Corsonello, Member, IEEE

Abstract—Data-driven dynamic logic (D3L) is very efficient when low-power constraints are mandatory. Unfortunately, this advantage is typically obtained at the expense of speed performance. This paper presents a novel technique to realize D3L parallel prefix tree adders without significantly compromising speed performance. When applied to a 64-bit Kogge–Stone adder realized with 90-nm complementary metal–oxide–semiconductor (CMOS) technology, the proposed technique leads to an energy-delay product that is 29% and 21% lower than its standard domino logic and conventional D3L counterparts, respectively. It also shows a worst-case delay that is 10% lower than that of the D3L approach and only 5% higher than that of the conventional domino logic.

Index Terms—Clock-precharged dynamic logic, data-driven dynamic logic (D3L), data-precharged dynamic logic, parallel prefix adder.

I. INTRODUCTION

Addition is a fundamental operation in any digital system and can significantly influence the overall achievable performance [1]. For this reason, novel high-speed adders are highly desirable.

The speed of addition circuits can be improved by optimizing both the top-level structure and the circuit implementation [2]–[4]. However, it is worth noting that the adder topology, together with the logic style and transistor-sizing criterion used, also significantly affects energy dissipation [5]–[7]. For example, a very high speed is reached when parallel prefix adders are implemented with dynamic domino logic. In this case, the advantages offered by the logarithmic-depth tree structure are emphasized through fast dynamic logic, which, as a drawback, requires a clock distribution system to run correctly.
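The logarithmic-depth structure of a parallel prefix adder can be illustrated at the bit level with a short Python sketch. It models only the Kogge-Stone generate/propagate recurrence in software, not the dynamic circuit itself; the function name `kogge_stone_add`, its parameters, and the zero carry-in are assumptions made for illustration and do not come from the paper.

```python
from typing import Tuple


def kogge_stone_add(a: int, b: int, width: int = 64) -> Tuple[int, int, int]:
    """Add two unsigned `width`-bit integers with a Kogge-Stone prefix network.

    Returns (sum modulo 2**width, carry_out, number_of_prefix_levels).
    Assumption: carry-in is fixed to 0.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    g = a & b        # bitwise generate:  g[i] = a[i] AND b[i]
    p = a ^ b        # bitwise propagate: p[i] = a[i] XOR b[i]
    half_sum = p     # kept aside for the final sum stage

    levels = 0
    dist = 1
    while dist < width:
        # Each bit position merges its group with the group `dist` bits below.
        g = (g | (p & (g << dist))) & mask
        p = (p & (p << dist)) & mask
        dist <<= 1
        levels += 1

    # g[i] is now the carry out of bits i..0, so the carry into bit i is g[i-1].
    carries = (g << 1) & mask
    total = half_sum ^ carries
    carry_out = (g >> (width - 1)) & 1
    return total, carry_out, levels
```

For 64-bit operands the loop runs six times (log2 of 64), which corresponds to the depth of the prefix tree that the dynamic gates would implement.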
As demonstrated in [8], the power dissipated by the clock distribution network in a dynamic system can range from 20% to 45% of the total power consumed. This is crucial information for the design of efficient digital circuits, in which low power dissipation is also an important goal.

To limit the power consumption of the clock distribution system, the data-driven dynamic logic (D3L) [9] and the clock- and data-precharged dynamic logic (CDPDL) [10] have recently been proposed. These techniques completely or partially remove the clock distribution system required within conventional dynamic circuits, thus leading to significantly lower energy consumption. Unfortunately, this advantage is obtained at the expense of a nonnegligible penalty in speed performance.

Manuscript received June 23, 2008; revised September 30, 2008. First published February 10, 2009; current version published February 25, 2009. This paper was recommended by Associate Editor A. Brambilla.
The authors are with the Department of Electronics, Computer Science and Systems (DEIS), University of Calabria, 87036 Arcavacata di Rende, Italy (e-mail: p.corsonello@unical.it).
Digital Object Identifier 10.1109/TCSII.2008.2010187

Fig. 1. Generic n-type domino gate.

This paper presents a new technique for exploiting the energy-saving advantages offered by D3L without paying a significant performance penalty with respect to the conventional domino logic. As a sample application, a new data-precharged dynamic structure is presented for a 64-bit Kogge–Stone parallel prefix adder. When implemented with STMicroelectronics 90-nm 1-V complementary metal–oxide–semiconductor (CMOS) technology, the novel adder exhibits an energy-delay product that is 29% and 21% lower than those of the standard clock-precharged domino logic and the conventional D3L implementations, respectively.

This paper is organized as follows: In Section II, a brief background is given.
In Section III, a comparison between the traditional clock-precharged domino logic and the D3L described in [9] is performed. Finally, Section IV describes the new data-precharged domino adder and presents comparison results.

II. BACKGROUND

Conventional dynamic CMOS circuits operate using a sequence of precharge and evaluation phases based on a clock input. During the precharge phase, the output signal is forced to a predefined value, independently of the input data. In contrast, during the evaluation phase, the output signal depends on the received inputs.

Among the dynamic logics known in the literature, domino logic is the most widely used in high-performance microprocessors due to its speed and area characteristics [11]. An n-type (p-type) domino circuit executes the precharge phase when the clock signal is low (high) and the evaluation phase when the clock is high (low). Fig. 1 shows the generic n-type domino gate.

1549-7747/$25.00 © 2009 IEEE
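The precharge and evaluation phases described in Section II can be mimicked with a small behavioral model in Python. This is only a sketch under stated assumptions: the class `NTypeDominoGate`, its `step` method, and the modeling choices (a dynamic node precharged high and followed by a static inverter, so that the output is low during precharge) reflect the usual textbook n-type domino gate and are not taken from Fig. 1 of the paper.

```python
from typing import Callable, Sequence


class NTypeDominoGate:
    """Behavioral model of an n-type domino gate (illustrative assumption).

    `pull_down` is a hypothetical callable that returns a truthy value when
    the n-type pull-down network conducts for the given inputs.
    """

    def __init__(self, pull_down: Callable[[Sequence[int]], object]) -> None:
        self.pull_down = pull_down
        self.dynamic_node = 1  # assumed to start precharged high

    def step(self, clk: int, inputs: Sequence[int]) -> int:
        if clk == 0:
            # Precharge phase (clock low): node forced high, inputs ignored.
            self.dynamic_node = 1
        elif self.pull_down(inputs):
            # Evaluation phase (clock high): node discharges if the network
            # conducts, and then stays low until the next precharge.
            self.dynamic_node = 0
        # A static output inverter drives the gate output.
        return 1 - self.dynamic_node


# Usage: a two-input AND function realized by series pull-down transistors.
and_gate = NTypeDominoGate(lambda x: x[0] and x[1])
precharged = and_gate.step(0, (1, 1))   # output forced to 0, inputs ignored
evaluated = and_gate.step(1, (1, 1))    # output depends on the inputs
```

In this model the output can only rise during evaluation and is reset only by the next precharge phase, which is why the clock must reach every gate in a conventional domino design.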