Power Optimization in a Parallel Multiplier using Voltage Islands

Seok Won Heo
Computer Science Department, University of California at Los Angeles, CA, USA 90095
Email: comace@cs.ucla.edu

Suk Joong Huh
Samsung Electronics, Suwon, Korea
Email: sukjoong.huh@samsung.com

Miloš D. Ercegovac
Computer Science Department, University of California at Los Angeles, CA, USA 90095
Email: milos@cs.ucla.edu

Abstract: Minimizing the power dissipation of parallel multipliers is important for mobile digital signal processing. In this paper, we present an approach to reducing power dissipation in the design of parallel multipliers by utilizing voltage islands to exploit the non-uniform arrival of inputs to the carry-propagate adder. Our approach reduces dynamic power dissipation by up to approximately 20% with little delay penalty in a tree-type parallel multiplier, and uses a fast, simple adder instead of a hybrid adder.

I. INTRODUCTION

The multiplier is an expensive core component of Digital Signal Processors (DSPs) and Graphics Processing Units (GPUs): studies on power dissipation in DSPs and GPUs indicate that the multiplier is one of the most power-hungry components on these chips. Therefore, research on low-power multipliers remains critical.

With the increasing complexity of VLSI systems and a growing number of mobile applications, minimizing power consumption has been growing in importance. Dynamic power dissipation is the dominant factor in the total power consumption of a CMOS circuit and typically contributes over 60% of the total system power dissipation. Although the contribution of static power dissipation is increasing significantly, dynamic power dissipation will still dominate as VLSI manufacturing technology shrinks [1]. Dynamic power dissipation can be described by

$$P_{\text{dynamic}} = 0.5 \times C_L \times V_{DD}^{2} \times f_p \times N$$

where $C_L$ is the load capacitance, $V_{DD}$ is the power supply voltage, $f_p$ is the clock frequency, and $N$ is the switching activity.
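As a numerical illustration of the dynamic power equation, the following minimal Python sketch evaluates it at two supply voltages. The function name `dynamic_power` and all operating-point values (capacitance, frequency, switching activity, and the two voltages) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def dynamic_power(c_load, v_dd, f_clk, activity):
    """Dynamic power in watts: 0.5 * C_L * V_DD^2 * f_p * N."""
    return 0.5 * c_load * v_dd ** 2 * f_clk * activity


# Hypothetical operating point (illustrative values, not from the paper).
C_LOAD = 1e-12    # load capacitance C_L, farads
F_CLK = 500e6     # clock frequency f_p, hertz
ACTIVITY = 0.2    # switching activity N

p_high = dynamic_power(C_LOAD, 1.2, F_CLK, ACTIVITY)  # nominal supply
p_low = dynamic_power(C_LOAD, 1.0, F_CLK, ACTIVITY)   # scaled supply

# With C_L, f_p and N held fixed, the ratio depends only on (V_low / V_high)^2.
ratio = p_low / p_high
print(f"P(1.2 V) = {p_high:.3e} W")
print(f"P(1.0 V) = {p_low:.3e} W")
print(f"ratio    = {ratio:.4f}")
```

With the other factors held fixed, the ratio equals (1.0/1.2)^2, or about 0.69, so in this hypothetical case scaling the supply from 1.2 V to 1.0 V removes roughly 31% of the dynamic power.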
The equation indicates that the power supply voltage has the largest impact on dynamic power dissipation because it appears as a squared term. Unfortunately, lowering the power supply voltage causes speed penalties. A great deal of effort has been expended in recent years on developing techniques that use a low power supply voltage while minimizing the performance degradation. Using voltage islands is one way to mitigate such performance degradation through architectural changes to the circuit [2]. This paper proposes a scheme to achieve power savings in a tree-type parallel multiplier by utilizing voltage islands.

This paper is organized as follows. Section II reviews recent research in the design of low-power multipliers. Section III addresses the problem of parallel multipliers. In Section IV, the paper focuses on power savings utilizing voltage islands. Section V analyzes how to reduce power, and Section VI discusses current problems. Finally, a summary is given in Section VII.

II. RELATED WORK

To exploit parallelism with a scaled power supply voltage, the clustering and partitioning technique was proposed in [3]. The cluster width is defined as the distance between the first and the last nonzero bits. Ignoring the positions outside the cluster and performing multiplication with a collection of smaller multipliers in parallel, with scaled supply voltages and while maintaining a given throughput, can achieve significant power savings. Another approach to power savings uses pipelining [4]. Compared to non-pipelined schemes, the pipelined technique can achieve a higher operating frequency at a given supply voltage or, alternatively, a lower supply voltage for a desired throughput. These power-efficient schemes for parallel multipliers, however, have larger areas.

To disable the operations in some rows (or columns), bypassing techniques were proposed in [5]–[7].
If the bits of a multiplier (or multiplicand) are zero, the corresponding partial products are also zero. As a result, the multipliers need not perform summation of zero partial products. These multipliers bypass inputs to outputs when the corresponding partial products are zero, and therefore eliminate unnecessary transitions. These architectures can significantly reduce dynamic power dissipation with little area penalty.

Exploiting the typically large fraction of zero and small-valued inputs, signal gating approaches achieve power savings by deactivating slices [8]–[10]. The multiplier, which is divided into several slices, detects the parts of the operands that have zero values.

The approaches mentioned above mainly reduce 1) the power supply voltage with large area overhead or 2) the switching activity with small area overhead. The proposed approach reduces the power supply voltages with little delay and area penalties. Therefore, the proposed approach is more efficient and economical than previous approaches.

978-1-4673-5762-3/13/$31.00 ©2013 IEEE
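The row-bypassing idea described in Section II can be illustrated with a small behavioral model. The sketch below is a simplified software analogy under stated assumptions (unsigned operands, one row per multiplier bit); it is not the circuit of [5]–[7], and the function name `multiply_with_row_bypass` and the default `width` are hypothetical.

```python
def multiply_with_row_bypass(multiplicand, multiplier, width=8):
    """Unsigned shift-and-add multiply that skips rows whose multiplier bit is 0.

    Returns (product, rows_added, rows_bypassed).
    """
    limit = 1 << width
    if not (0 <= multiplicand < limit and 0 <= multiplier < limit):
        raise ValueError("operands must fit in `width` unsigned bits")

    product = 0
    rows_added = 0
    for i in range(width):
        if (multiplier >> i) & 1:
            # Nonzero partial product: this row performs an addition.
            product += multiplicand << i
            rows_added += 1
        # Otherwise the partial product is zero and the running sum
        # passes through this row unchanged (the row is bypassed).
    return product, rows_added, width - rows_added


product, added, bypassed = multiply_with_row_bypass(13, 0b00010010)
print(product, added, bypassed)
```

For the multiplier 0b00010010 only two of the eight rows perform an addition, which is the source of the reduced switching activity that bypassing designs aim for.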