706 IEICE TRANS. ELECTRON., VOL.E95–C, NO.4 APRIL 2012

BRIEF PAPER Special Section on Solid-State Circuit Design — Architecture, Circuit, Device and Design Methodology

Ultra High Speed Modified Booth Encoding Architecture for High Speed Parallel Accumulations

Amir FATHI a), Sarkis AZIZIAN, Student Members, Khayrollah HADIDI, and Abdollah KHOEI, Members

SUMMARY This paper presents the design of a novel high-speed Booth encoder-decoder in a 0.35 μm CMOS technology. By focusing on the transistor-level implementation of the new architecture and employing a newly designed truth table, the gate-level delay of the whole system is reduced to one logic gate plus one transistor delay, which is the main advantage of the proposed circuit. Simulation results indicate the high-speed performance and low power dissipation of the implemented architecture, which make this work suitable for extensive use in high-speed arithmetic blocks.
key words: Modified Booth Algorithm, high speed

1. Introduction

Because of their high performance, parallel multipliers are widely used in many high-speed systems such as DSPs, CPUs and multimedia applications. In most of these systems, the multiplier lies in the critical delay path and has a direct effect on the speed of the whole structure [1]. Many high-performance algorithms and architectures, such as the Booth multiplication algorithm, the Wallace tree and the Dadda tree, have therefore been proposed to accelerate multiplication, each with its own benefits and drawbacks. A comparison between the different algorithms has revealed that, for multiplications of 16 bits or more, the Modified Booth Encoding (MBE) architecture provides higher performance at lower power dissipation than the others [1]. Therefore, most of the reported works in the multiplier design area pertain to the MBE architecture, especially the radix-4 modified Booth algorithm [2]–[5].
In a parallel multiplier, the Booth encoder-decoder is responsible for generating partial products for the subsequent accumulation stages [5]. Although the radix-4 Booth algorithm halves the number of partial products, it also increases the compression time [4], so improving the performance of this algorithm and its circuitry can directly affect the speed of multiplication.

2. General Architecture

To express the radix-4 modified Booth algorithm mathematically for the multiplication of two numbers A and B, where B is the multiplier and A is the multiplicand, the multiplier must be defined as a signed n-bit number according to Eq. (1):

$$B = -b_{n-1}\,2^{n-1} + \sum_{i=0}^{n-2} b_i\,2^{i} \qquad (1)$$

in which $b_{n-1}$ and $n$ are the sign bit and the number of bits, respectively. Rearranging Eq. (1) gives:

$$B = \sum_{i=0}^{\frac{n}{2}-1} \left(b_{2i-1} + b_{2i} - 2b_{2i+1}\right) 2^{2i} \qquad (2)$$

where we define:

$$d_i = b_{2i-1} + b_{2i} - 2b_{2i+1}, \quad b_{-1} = 0 \qquad (3)$$

Equation (2) shows that with this encoded version of the multiplier, the number of partial products is halved, since the number of terms in the summation is halved as well. Using this concept, the multiplier (B) must be encoded into its scale factors (-2, -1, 0, 1, 2), which are represented as (-2X, -X, 0, X, 2X) and are determined by $d_i$ in Eq. (3). To generate these factors, the multiplier is divided into 3-bit groups and encoded with the help of a truth table. The partial products are then generated by the Booth decoder. To summarize the partial product generation procedure, four types of operations, selected according to the scale factors, must be applied to the multiplicand to obtain the outputs, as follows:

1. The scale factor 0 indicates whether the multiplicand is zeroed before being used as a partial product.
2. The scale factor 1 indicates that the multiplicand can be sent directly to the output.
3. The scale factor 2 indicates whether the partial product bits are shifted left by one position.
4. The sign extension scale factor indicates whether or not to invert all of the bits to create a negative product (which must be corrected by adding "1" at some later stage).

Most CMOS approaches to the radix-4 MBE architecture generate the X and 2X scale factors directly to obtain the partial products [2]–[4]. In other words, the outputs are obtained by means of logic functions containing the X and 2X factors, so these factors are generated directly in the body of the Booth encoder. In most cases, generating the X factor is simpler than generating the 2X factor, according to the truth table designated in each work. Hence, more hardware is needed to generate 2X and the gate level

Manuscript received August 19, 2011.
Manuscript revised November 4, 2011.
The authors are with Microelectronic Research Laboratory of Urmia University, Urmia, 57159, Iran.
a) E-mail: st a.fathi@urmia.ac.ir
DOI: 10.1587/transele.E95.C.706
Copyright © 2012 The Institute of Electronics, Information and Communication Engineers
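As a behavioural illustration only (not the transistor-level circuit proposed in this paper), the recoding of Eq. (3) and the four partial product operations listed in Sect. 2 can be sketched in Python. The function names `booth_digits`, `partial_product` and `booth_multiply`, the requirement that n be even, and the use of a 2n-bit accumulator are assumptions made for this sketch; they do not come from the paper.

```python
def booth_digits(b: int, n: int) -> list[int]:
    """Radix-4 Booth digits d_i (Eq. (3)) of an n-bit two's-complement multiplier b."""
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even number of bits n")
    # b_0 .. b_{n-1}; Python's >> and & give two's-complement bits for negative ints
    bits = [(b >> k) & 1 for k in range(n)]

    def bit(k: int) -> int:
        return 0 if k < 0 else bits[k]  # b_{-1} = 0

    return [bit(2 * i - 1) + bit(2 * i) - 2 * bit(2 * i + 1) for i in range(n // 2)]


def partial_product(a: int, d: int, width: int) -> int:
    """Apply the zero / pass / shift / invert operations for scale factor d."""
    mask = (1 << width) - 1
    if d == 0:
        pp = 0                      # scale factor 0: multiplicand is zeroed
    elif abs(d) == 1:
        pp = a & mask               # scale factor 1: multiplicand passed through
    else:
        pp = (a << 1) & mask        # scale factor 2: shifted left by one position
    if d < 0:
        pp = (~pp + 1) & mask       # negative factor: invert all bits, then add "1"
    return pp


def booth_multiply(a: int, b: int, n: int) -> int:
    """Signed n-bit by n-bit product accumulated from n/2 partial products."""
    width = 2 * n                   # assumed accumulator width
    mask = (1 << width) - 1
    total = 0
    for i, d in enumerate(booth_digits(b, n)):
        # each partial product carries the weight 2^(2i) from Eq. (2)
        total = (total + (partial_product(a, d, width) << (2 * i))) & mask
    if total >> (width - 1):        # reinterpret the accumulator as a signed value
        total -= 1 << width
    return total


print(booth_digits(6, 4))           # [-2, 2], since 6 = -2 + 2*4
print(booth_digits(-3, 4))          # [1, -1], since -3 = 1 - 1*4
print(booth_multiply(-7, 5, 4))     # -35
```

For a 4-bit multiplier only two partial products are accumulated instead of four, which is the halving stated after Eq. (2). The sketch adds the correcting "1" immediately, whereas the text notes that a hardware implementation defers it to a later stage.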