Optimal Final Carry Propagate Adder Design for Parallel Multipliers

Ramkumar B and Harish M Kittur

Abstract- Based on ASIC layout-level simulation of 7 types of adder structures, each in four different sizes (a total of 28 adders), we propose expressions for the width of each of the three regions of the final Carry Propagate Adder (CPA) used in parallel multipliers. We also propose the type of adder to be used in each region to obtain the optimal performance of the hybrid final adder. This work evaluates the performance of the analyzed designs in terms of delay, area and power through custom design and layout in a 0.18 um CMOS process technology.

Index terms – ASIC (Application Specific Integrated Circuit), optimal hybrid CPA, parallel multiplier, low power, area efficient.

I. INTRODUCTION

The critical signal path in a parallel multiplier is divided into three domains: the AND gate array, the PPST (Partial Product Summation Tree) and the final CPA. The delay introduced by the AND gate array is small compared to the other two components, especially for large multipliers, and is relatively independent of the multiplier size. The delay of the PPST and the final CPA therefore dominates the delay of the multiplier [1].

Hybrid CPAs have been proposed earlier, with detailed investigations of the final addition in parallel multipliers [1]-[3]. It is well known that the signals applied to the inputs of the CPA arrive first at the ends of the CPA and last in the middle. Determining the exact arrival times at the final adder is therefore of prime importance in the design of the optimal final adder. We have analyzed the arrival times from the PPST through layout implementation and, based on those arrival times, applied the inputs to 7 types of adders in 4 different bit sizes (a total of 28 adders).
The analysis is done using an industry-standard tool, and the optimal final adder structure is designed based on the post-layout simulation results. The investigation covers 8 by 8, 16 by 16, 32 by 32 and 64 by 64 Dadda multipliers, with the final adders being 16, 32, 64 and 128-bit Ripple Carry Adder (RCA), Carry Save Adder (CSA), Carry Select Adder (CSLA), Carry Look Ahead Adder (CLA) and the BEC (Binary to Excess-1 Converter) based adders, referred to here as BEC Carry Select Adder (BCSLA), BEC Carry Save Adder (BCSA) and BEC Carry Look Ahead Adder (BCLA) [4]-[10].

This paper is structured as follows. Section II deals with the design of the PPST based on the Dadda algorithm and the analysis of the signal arrival profile from the PPST. The performance of the various adders in terms of area, delay and power is analyzed in Section III. The equations for efficient partitioning of the multiplier region are developed in Section IV. The final adder design and the ASIC implementation details are provided in Sections V and VI respectively. Finally, the work is concluded in Section VII.

II. ANALYSIS OF PPST SIGNAL ARRIVAL PROFILE

The basic top-level implementation of an N by N unsigned parallel multiplier without the CPA is shown in Fig. 1. The multiplier is implemented without the CPA so that the exact arrival times from the PPST can be analyzed.

Signal Buffering

To determine a typical signal arrival profile and typical drive strengths, D flip-flops are used on the primary inputs and outputs. The input D flip-flops drive multiple buffers, which distribute the input signals to the N^2 AND gates. Delay simulations were performed for each cell library to determine:
a) The maximum number of buffers that a single D flip-flop can drive.
b) The maximum number of AND gate inputs that a single buffer can drive.
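The role of the two fan-out limits above can be illustrated with a minimal sketch. It assumes a single level of buffers between each input flip-flop and the AND gate array, and it uses the fact that each input bit of an N by N multiplier feeds N AND gates. The function names and the fan-out values in the usage note are hypothetical and are not taken from the cell library characterization in this work.

```python
import math


def total_and_gates(n):
    """Number of AND gates in the partial product array of an N by N multiplier."""
    return n * n


def buffers_per_input_bit(n, max_and_inputs_per_buffer):
    """Buffers needed so that one input bit can reach its N AND gate inputs.

    max_and_inputs_per_buffer is limit (b): a hypothetical value that would
    come from delay simulation of the cell library.
    """
    return math.ceil(n / max_and_inputs_per_buffer)


def single_buffer_level_suffices(n, max_buffers_per_ff, max_and_inputs_per_buffer):
    """True if one D flip-flop can drive all buffers needed for one input bit.

    max_buffers_per_ff is limit (a), also a hypothetical value here.
    """
    needed = buffers_per_input_bit(n, max_and_inputs_per_buffer)
    return needed <= max_buffers_per_ff
```

For example, with an assumed limit of 8 AND gate inputs per buffer, a 64 by 64 multiplier would need 8 buffers per input bit; if a flip-flop could drive only 4 buffers, a second buffer level or a stronger driver would be required.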
This work was supported in part by the Integrated Circuit Design Laboratories, VIT University, Vellore, India. B. Ramkumar and Harish Kittur are with the VLSI division, School of Electronics Engineering, VIT University, Vellore, India (email: ramkumar.b@vit.ac.in; kittur@vit.ac.in).

Fig. 1. Top-level implementation of N by N multiplier without CPA. (Block diagram: N-bit Multiplicand (A) and Multiplier (B) inputs, input D flip-flops, Buffers, AND Gate Array, Partial Product Reduction, output D flip-flops with load caps, CLK input, outputs labeled P0, P1 ... Ps.)
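As a first-order illustration of why the middle inputs of the CPA tend to arrive last, the sketch below counts the partial product bits in each column of an N by N array and the number of Dadda reduction stages. It is not the layout-based timing analysis used in this work. The function names are hypothetical, and column height is only a rough proxy for arrival time, since it ignores carries from neighboring columns and wire delay.

```python
def column_heights(n):
    """Partial product bits in each column (LSB first) of an N by N array."""
    return [min(k + 1, n, 2 * n - 1 - k) for k in range(2 * n - 1)]


def dadda_stage_count(n):
    """Number of Dadda reduction stages for an N by N multiplier.

    Uses the standard Dadda height sequence d_1 = 2, d_(j+1) = floor(1.5 * d_j)
    and counts the terms that are smaller than N.
    """
    stages = 0
    d = 2
    while d < n:
        stages += 1
        d = d * 3 // 2
    return stages


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        heights = column_heights(n)
        print(n, "by", n, "max column height:", max(heights),
              "reduction stages:", dadda_stage_count(n))
```

For a 4 by 4 array the column heights are 1, 2, 3, 4, 3, 2, 1: the end columns hold few bits and need little or no reduction, while the middle columns pass through every stage. The 8, 16, 32 and 64-bit multipliers considered here need 4, 6, 8 and 10 reduction stages respectively.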