REDUCING MULTIPLIER ENERGY BY DATA-DRIVEN VOLTAGE VARIATION

Tomoyuki Yamanaka and Vasily G. Moshnyaga
Department of Electronics and Computer Science, Fukuoka University
8-19-1 Nanakuma, Jonan-ku, Fukuoka 814-0180, JAPAN

Abstract

Design of portable, battery-operated multimedia devices requires energy-efficient multiplication circuits. This paper proposes a new technique to reduce the power consumption of digital multipliers. In contrast to related methods, which concentrate on reducing transition activity, we focus on dynamic reduction of the supply voltage. Two implementation schemes capable of dynamically adjusting a dual voltage supply to input data variation are presented. Simulations show that with these schemes the energy consumption of a 16x16-bit multiplier in DCT computation can be reduced by 33.4% and 25.2% on average, without any speed degradation and with as little as 4.7% area overhead.

1. Introduction

1.1. Motivation

Digital array multipliers are essential arithmetic blocks for many DSP applications: convolution, filtering, discrete cosine transform (DCT), vector quantization, etc. Because of their high capacitive load and large bit-width, these structures are the most energy-consuming units in modern DSP circuits. In NEC’s 16-bit SPX processor, for example, two multiplying units dissipate almost half of the total power [1]. As a result, optimizing the multipliers for energy is important.

In digital CMOS circuits, charging and discharging of capacitors dominates the total energy dissipation. Given the average load capacitance ($C$), the supply voltage ($V_{dd}$), and the number ($a$) of energy-consuming signal transitions per operation, the average energy dissipation of a CMOS multiplier can be expressed as $E = a \cdot C \cdot V_{dd}^2$. Although all three factors can be used to lower this energy, voltage reduction offers the most drastic means of minimizing energy consumption. Unfortunately, the price to be paid is a higher delay: $D = V_{dd} / (V_{dd} - V_T)^2$, where $V_T$ is the threshold voltage.
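
The energy and delay expressions above can be made concrete with a short numeric sketch. The Python code below evaluates both for a high and a low supply voltage. All numeric values (transition count, capacitance, threshold and supply voltages) and the function names are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
def energy(a, c, vdd):
    """Switching energy per operation, E = a * C * Vdd^2 (joules)."""
    return a * c * vdd ** 2


def delay(vdd, vt):
    """Delay as given in the text, D = Vdd / (Vdd - Vt)^2 (arbitrary units)."""
    if vdd <= vt:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return vdd / (vdd - vt) ** 2


# Hypothetical values for illustration only (not from the paper).
A = 1000        # energy-consuming transitions per operation
C = 1e-14       # average load capacitance, farads
VT = 0.5        # threshold voltage, volts
V_HIGH = 2.5    # nominal supply voltage, volts
V_LOW = 1.5     # reduced supply voltage, volts

# Ratios of the low-voltage case to the high-voltage case.
energy_ratio = energy(A, C, V_LOW) / energy(A, C, V_HIGH)
delay_ratio = delay(V_LOW, VT) / delay(V_HIGH, VT)

print(f"energy at low voltage / energy at high voltage: {energy_ratio:.2f}")
print(f"delay at low voltage / delay at high voltage:   {delay_ratio:.2f}")
```

Under these assumed values, lowering the supply from 2.5 V to 1.5 V cuts the energy per operation to 36% of its original value while the delay grows by a factor of 2.4, which illustrates why a reduced voltage is only usable when there is slack in the time budget.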
If, however, both the supply voltage and the delay are varied dynamically in response to the computational load, the energy consumed per task can be reduced during periods of low load, while peak throughput is retained when required.

1.2. Related Research

There has been extensive research on energy reduction in digital multipliers, with most efforts directed at reducing transition activity in the adder array. The proposed methods include sign-magnitude representation, algebraic operation reordering, and self-timing [2-3]; replacing the carry-save array with a tree-based structure [4]; inserting extra hardware into the array to stop spurious transitions [5-8]; delay balancing [9-10]; using adding compressors, modified sign extension, and coding [11]; truncating the operands [12]; applying a mixed number representation with canonical signed-digit numbers [13]; optimizing adder cells [14]; interchanging the multiplicands [15]; activating the adder cells as the evaluation wave moves through the array [16]; multiplicand reordering and optimization [17]; etc.

Despite their differences, all these approaches have one feature in common: they focus on reducing transition activity, assuming that the supply voltage is fixed and independent of the workload. To our knowledge, architecture-driven voltage scaling [17-18], which has proved efficient in a variety of designs, has not yet been applied to multipliers. There have been attempts to change the supply voltage adiabatically, i.e. by the system clock [19]. Because the clock frequency is fixed, this voltage alteration is regular and does not depend on the input workload. Moreover, it severely degrades speed, making adiabatic multipliers almost impractical.

1.3. Contribution

This paper presents a new approach to reducing the energy consumption of digital multipliers. Unlike existing research, the approach targets a previously unexplored degree of freedom in multiplier energy optimization, namely, the supply voltage per operation.
Because the full multiplication bit-width is rarely required in real applications, part of the time budget allotted to a multiplication frequently goes unused. We propose to trade this unused time for a lower supply voltage in order to save energy. Experiments show that this formulation can save up to 45% of energy, with an average of 33.4%, in comparison with the traditional multiplier design, at a very small overhead.

2. The Proposed Approach

2.1. Main Idea

The main idea of our approach is based on three key features of digital multipliers employed in media processors and DSPs: (1) fixed multiplication latency and bit-width, (2) unevenness of the delays corresponding to each bit, and (3) unevenness of bit utilization during operation.

A typical multiplier traditionally computes the product using the whole bit-width (16 bits or more). The multiplier is driven by a single supply voltage, whose level is high enough to charge (discharge) the circuit within the time interval, T, to satisfy the performance requirement. The circuit capacitance that has to be charged (discharged) to produce a given bit of the product depends on the bit position; the most significant bits (MSB) have larger capacitance than the least significant bits (LSB). Consequently, the actual delay

II - 285 0-7803-8251-X/04/$17.00 ©2004 IEEE ISCAS 2004