IJSRD - International Journal for Scientific Research & Development | Vol. 1, Issue 5, 2013 | ISSN (online): 2321-0613
All rights reserved by www.ijsrd.com

Abstract: Truncated multiplication reduces part of the power required by multipliers by computing only the most-significant bits of the product. The most common approach to truncation physically removes part of the partial product matrix and compensates for the removed bits with dedicated hardware compensation subcircuits. However, this results in fixed systems optimized for a given application at design time. A novel approach to truncation is proposed, in which a full-precision multiplier is implemented but the active section of the partial product matrix is selected dynamically at run time. This allows power reduction to be traded against signal degradation, and the trade-off can be modified at run time. Such an architecture brings together the power reduction benefits of truncated multipliers and the flexibility of reconfigurable and general-purpose devices. An efficient implementation of such a multiplier is presented in a custom digital signal processor, where the concept of software compensation is introduced and analysed for different applications. Experimental results are studied, including power measurements from both post-synthesis simulations and a fabricated IC implementation. This is the first system-level DSP core using a fine-grain truncated multiplier. Results demonstrate the effectiveness of the programmable truncated MAC (PTMAC) in achieving power reduction with minimum impact on functionality for a number of applications. Software compensation is also shown to be effective when deploying truncated multipliers in a system.

I. INTRODUCTION

The rapid growth of portable communication and computing devices and the advance of mobile multimedia systems have made power consumption a critical optimization target in the design of digital signal processing architectures.
Advances in very large scale integrated-circuit (VLSI) technology have been one of the major driving forces for the growth of digital signal processors (DSPs), enabling the implementation of complex algorithms in programmable DSP architectures and in fixed application-specific hardware. Within DSP systems, multipliers are among the most fundamental building blocks. Parallel implementations, required for high-speed computation, are the main contributor to the power consumption of any hardware dedicated to complex mathematical functions such as filtering, compression, or classification. The relationship between the physical characteristics of the multiplier and its resolution determines the efficiency and accuracy of the system. The optimization of multipliers in terms of area, power and timing has been extensively studied in the past.

A full (direct) implementation of an N×N-bit multiplication yields a 2N-bit product. To keep the full accuracy of the system, a DSP architecture would need an ever-growing bit width that would be impossible or impractical to implement. To avoid this, results are usually truncated or rounded down to keep them within the limits of the architecture bit width.

To reduce the parallel multiplier requirements in systems where an exact result is not required, several techniques that reduce power consumption at the expense of lower precision have been presented in the literature. Such techniques skip the implementation of, and/or disable, parts of the partial product matrix to trade the energy spent by the computation for degradation of the output signal. These techniques fall into two categories, namely word-length reduction and truncated multipliers. Word-length reduction techniques minimize switching activity at the expense of data precision by shifting or truncating the inputs, but since the reduction is applied to the input operands, the power reductions are obtained at the cost of high levels of output noise.
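As a rough illustration of why reducing the word length at the inputs is noisy, the following minimal Python sketch compares an exact product with a product computed from operands whose low bits have been cleared, and with an exact product whose low half is discarded afterwards. It assumes unsigned 8-bit operands for simplicity, and the function names (`full_product`, `word_length_reduced_product`, `truncate_result`) are hypothetical helpers, not taken from the paper.

```python
def full_product(a: int, b: int) -> int:
    """Exact product of two unsigned N-bit operands (fits in 2N bits)."""
    return a * b


def word_length_reduced_product(a: int, b: int, k: int) -> int:
    """Clear the k least-significant bits of BOTH operands, then multiply.

    This models word-length reduction by input truncation.
    """
    mask = ~((1 << k) - 1)
    return (a & mask) * (b & mask)


def truncate_result(p: int, n: int) -> int:
    """Keep only the upper N bits of a 2N-bit product (round down)."""
    return (p >> n) << n


N = 8
a, b = 183, 109  # arbitrary example operands, both below 2**N

exact = full_product(a, b)
reduced = word_length_reduced_product(a, b, 4)
rounded = truncate_result(exact, N)

print("exact product:           ", exact)
print("input word-length error: ", exact - reduced)
print("output truncation error: ", exact - rounded)
```

For this operand pair the error introduced by clearing four bits of each input is more than ten times the error of discarding the lower half of the exact product, which matches the qualitative point made above. A single operand pair is only an example; a proper comparison would average over the input distribution of the target application.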
Truncated multiplication structures based on the Baugh-Wooley [2] or Booth [3] algorithms do not (by design) compute the lowest sections of the partial product matrix. By doing so, a certain error is introduced in the output while savings in power, area, complexity, and timing are achieved. The majority of previous publications focused on fixed-width multipliers [4] that produce a fixed N-bit output, and whose gain-vs.-error ratio is set by hardware. Configurable techniques result in flexible architectures which, while giving up the area optimization and static power reduction of fixed designs, make the multiplier adaptable. This is desirable in systems with a certain degree of programmability.

The architecture presented in this paper, a programmable truncated multiplier (PTM), is a full-precision multiplier in which the elements of the partial product matrix can be disabled column by column through an external control word. Disabled columns cease to contribute to the dynamic power consumption, which reduces the power drawn by the multiplier. Benefits of PTMs include:
1. Real-time control of the power-SNR trade-off in applications where power modes are switchable at run time.
2. Flexibility in accuracy selection for programmable devices, such as DSP structures, that would benefit from different truncation levels for different applications and at different operating points of an individual algorithm.

The paper is organized as follows. Section II presents a brief background on multipliers and two's complement multiplication. The proposed work is presented in Section III. Section IV gives the simulation results of the implemented multiplier.

Design and Implementation of a Programmable Truncated Multiplier
Sethu Merin George¹
¹M.Tech, VLSI & ES, MLMCE, Kottayam
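The column-wise disabling of the partial product matrix described in the introduction can be sketched with a small behavioural model. This is a simplified illustration only: it assumes unsigned operands (the paper itself works with two's complement multiplication), and it assumes the control word simply gives the number of least-significant columns to disable. The names `ptm_multiply` and `active_cells` are hypothetical.

```python
def ptm_multiply(a: int, b: int, n: int, disabled_columns: int) -> int:
    """Behavioural model of an n x n unsigned array multiplier in which the
    partial-product bits of the lowest `disabled_columns` columns are gated off.

    The bit a_i AND b_j sits in column i + j of the partial product matrix.
    """
    result = 0
    for i in range(n):
        for j in range(n):
            column = i + j
            if column < disabled_columns:
                continue  # this cell is disabled and contributes nothing
            bit = ((a >> i) & 1) & ((b >> j) & 1)
            result += bit << column
    return result


def active_cells(n: int, disabled_columns: int) -> int:
    """Number of partial-product cells still enabled.

    Only a crude proxy for switching activity, not a power figure.
    """
    return sum(1 for i in range(n) for j in range(n)
               if i + j >= disabled_columns)


n = 4
for t in (0, 2, 4):  # assumed control-word values: columns disabled
    p = ptm_multiply(15, 15, n, t)
    print(f"disabled={t}  product={p}  error={15 * 15 - p}  "
          f"active cells={active_cells(n, t)}/{n * n}")
```

With no columns disabled the model reproduces the exact product; as more columns are disabled the result can only shrink, since only non-negative terms are dropped. Changing `disabled_columns` between calls mimics switching the power/accuracy operating point at run time, without any change to the multiplier structure itself.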