IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 54, NO. 6, JUNE 2007 1279

Novel Power-Delay-Area-Efficient Approach to Generic Modular Addition

Riyaz A. Patel, Mohammed Benaissa, Senior Member, IEEE, Neil Powell, and Said Boussakta, Senior Member, IEEE

Abstract—Modular adders are fundamental arithmetic components typically employed in residue number system (RNS)-based digital signal processing (DSP) systems. They are widely used in modular multipliers and residue-to-binary converters and in implementing other residue arithmetic operations such as scaling. In this paper, a methodology for designing power-delay-area-efficient modular adders based on carry propagate addition is presented. The binary representational characteristics of the modulus are exploited to allow the sharing of hardware in a fast modular adder topology. VLSI implementation results using 0.13-μm standard-cell technology, together with a theoretical analysis, show that this approach produces adders that offer efficient tradeoffs when compared with the fastest through to the smallest generic modular adders in the literature.

Index Terms—Carry propagate adder, computer arithmetic, ELM adder, modular adder, parallel-prefix adder, residue number system (RNS), very large-scale integration (VLSI) design.

I. INTRODUCTION

MODULAR addition for arbitrary moduli plays an important role in the implementation of residue number systems (RNSs) that provide balanced arithmetic operations as well as large dynamic ranges [1], [2]. Various approaches to the implementation of modular addition have been proposed in the literature, and, in general, these are confined to memory lookup, combinational logic, or an amalgamation of both [3], [4].

The RNS is a nonweighted integer number system that is defined by an $N$-tuple base of pairwise relatively prime positive integers $\{m_1, m_2, \ldots, m_N\}$, which are collectively known as the moduli of the system [5].
For a given base, an integer $X$ is represented by an $N$-tuple word, $(x_1, x_2, \ldots, x_N)$, where $x_i = |X|_{m_i}$, i.e., it is the nonnegative remainder when dividing $X$ by the modulus $m_i$. Addition, subtraction, and multiplication operations are all closed in RNS. Let $\circ$ denote the binary operation of addition, subtraction, or multiplication. It then follows that $Z = X \circ Y$ is isomorphic to $(z_1, z_2, \ldots, z_N)$, where $z_i = |x_i \circ y_i|_{m_i}$, $i = 1, 2, \ldots, N$. Note that $z_i$ is solely dependent upon $x_i$ and $y_i$, and, hence, the arithmetic operation is performed in parallel with no interaction between the RNS channels. As a direct consequence of this property, RNS-based systems are capable of performing high-speed addition and multiplication, usually in a fraction of the time taken in traditional two’s complement systems. In addition to high-speed implementations, RNS circuits have also demonstrated power efficiency when implementing bespoke digital signal processing (DSP) functionality [6]–[9].

Manuscript received July 29, 2004; revised May 5, 2005 and March 17, 2006. This work was supported by a White Rose Studentship, offered in collaboration with the Universities of Sheffield and Leeds, U.K. This work was presented in part at the IEEE International Workshop on Signal Processing Systems (SiPS), Austin, TX, October 2004. This paper was recommended by Associate Editor S.-G. Chen.
R. A. Patel was with the Department of Electronic and Electrical Engineering, University of Sheffield, Sheffield S1 3JD, U.K. He is now with Detica Ltd., Surrey Research Park, Guildford, Surrey GU2 7YP, U.K.
M. Benaissa and N. Powell are with the Department of Electronic and Electrical Engineering, University of Sheffield, Sheffield S1 3JD, U.K.
S. Boussakta was with the Department of Electronic and Electrical Engineering, University of Leeds, Leeds, U.K. He is now with the School of Electrical, Electronic, and Computer Engineering, University of Newcastle upon Tyne, Newcastle upon Tyne NE1 7RU, U.K.
Digital Object Identifier 10.1109/TCSI.2007.895369
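The channel-wise property described in the Introduction can be illustrated with a minimal Python sketch. The base (7, 11, 13), the operand values, and the helper names `to_rns` and `rns_op` are assumptions chosen for illustration; none of them come from the paper.

```python
import operator
from itertools import combinations
from math import gcd, prod


def to_rns(x, moduli):
    """Return the residue word (|x|_{m_1}, ..., |x|_{m_N}) of integer x."""
    return tuple(x % m for m in moduli)


def rns_op(op, xs, ys, moduli):
    """Apply op independently per channel: z_i = |x_i op y_i|_{m_i}."""
    return tuple(op(x, y) % m for x, y, m in zip(xs, ys, moduli))


# Hypothetical base; the moduli must be pairwise relatively prime.
moduli = (7, 11, 13)
assert all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))
M = prod(moduli)  # dynamic range of this base

X, Y = 123, 45
x, y = to_rns(X, moduli), to_rns(Y, moduli)

# Each channel result depends only on that channel's operands, and the
# result word matches the residues of the integer result reduced modulo M.
for op in (operator.add, operator.sub, operator.mul):
    z = rns_op(op, x, y, moduli)
    assert z == to_rns(op(X, Y) % M, moduli)
```

No carry or borrow crosses between channels in `rns_op`, which is why each channel can be implemented as an independent, narrow modular arithmetic unit.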
The minimization of power dissipation has become an important performance objective as the trend increases towards portability, denser circuits, and high performance. One of the main sources of power dissipation in CMOS circuits is dynamic power dissipation, which is given by [10]

$$P_{\mathrm{dyn}} = \alpha C_L V_{DD}^2 f \tag{1}$$

where $\alpha$ is the activity factor, $C_L$ is the load capacitance, $V_{DD}$ is the supply voltage, and $f$ is the clock frequency. A variety of techniques exist to reduce the parameters of (1) at all levels of design abstraction [10], including the use of alternate number representations such as RNS [11]. As an example, Freking and Parhi [8] showed that, as well as offering a reduction in switching activity, the speed advantage offered by RNS can be traded for a reduction in supply voltage to obtain a quadratic reduction in power; this provides a power-delay advantage when compared with two’s complement implementations. Amongst the plethora of DSP applications, RNS has shown power efficiency in implementing FIR filters [8], [12], [13], frequency synthesizers [14], programmable DSPs [15], and secure image coding schemes [6].

In general, the design approach for modular adders falls into two distinct categories. On the one hand, they can be designed for flexibility, in which case the methodology allows the design of adders for any moduli. On the other hand, where architectural simplicity and the best performance are desired, adders designed for a specific set of efficient moduli (e.g., $2^n \pm 1$) have been proposed [16], [17]. Our contribution is focused on modular adders belonging to the former class, and we will henceforth limit all discussion to this class of modular addition. Note that, although the adders for moduli of the form $2^n \pm 1$ are more efficient than generic modular adders, RNSs that restrict the moduli to this form suffer from either large wordlength RNS channels, imbalanced arithmetic, or both when large dynamic ranges are required [1], [2].

1549-8328/$25.00 © 2007 IEEE
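As a quick numerical check of the quadratic supply-voltage dependence in (1), the following sketch evaluates the dynamic power at two supply voltages with every other parameter held fixed. The function name `dynamic_power` and all numerical values are illustrative assumptions; they are not results or operating points from the paper.

```python
def dynamic_power(alpha, c_load, v_dd, f_clk):
    """Dynamic power as in (1): activity factor * load capacitance * V_DD^2 * f."""
    return alpha * c_load * v_dd ** 2 * f_clk


# Hypothetical operating points: only the supply voltage differs.
baseline = dynamic_power(alpha=0.2, c_load=1e-12, v_dd=1.2, f_clk=100e6)
scaled = dynamic_power(alpha=0.2, c_load=1e-12, v_dd=0.9, f_clk=100e6)

# Scaling V_DD by 0.75 scales the dynamic power by 0.75**2 = 0.5625.
ratio = scaled / baseline
print(f"power ratio after voltage scaling: {ratio:.4f}")
```

In practice, lowering the supply voltage also slows the circuit, so the saving is available only where the design has speed to spare, which is the trade described for RNS in [8].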