K. Yamuna, C. Chandrasekhar / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 3, Issue 4, Jul-Aug 2013, pp.1772-1777
Design And Implementation Of Efficient Lifting Based DWT Architecture Using Wallace Tree Multiplier For Compression
K. Yamuna*, C. Chandrasekhar**
*(M. Tech (VLSI), SVCET, CHITTOOR, A.P., INDIA)
**(HOD, Dept. of ECE, SVCET, CHITTOOR, A.P., INDIA)
ABSTRACT
Demand for high-speed, low-power architectures for DWT computation has led to the design of novel algorithms and architectures. In this paper we design, model and implement a hardware-efficient, high-speed and power-efficient DWT architecture based on a modified lifting scheme algorithm. The design is interfaced with serial-in parallel-out (SIPO) and parallel-in serial-out (PISO) logic to reduce the number of I/O lines on the FPGA. The design is implemented on a Spartan III device and is compared with lifting scheme logic. The proposed design operates at a frequency of 520.738 MHz and consumes less than 0.103 W of power. The pre-synthesis and post-synthesis results are verified, and suitable test vectors are used to verify the functionality of the design. The design is suitable for real-time data processing.
Keywords - Lifting scheme, low power, high speed, FPGA implementation
I. INTRODUCTION
The discrete wavelet transform (DWT) decomposes an image into multiple sub bands of low and high frequency components. Encoding these sub band components compresses the image: the DWT together with an encoding technique represents the image information with fewer bits. Image compression finds application in many disciplines, including the entertainment, medical, defense, commercial and industrial domains. The DWT is the core of the image compression unit. Other image processing techniques such as image enhancement, image restoration and image filtering also require the DWT and the inverse DWT (IDWT) for transformations.
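As a minimal illustration of the sub band decomposition described above, the following Python sketch splits a row of samples into one low frequency and one high frequency sub band and then inverts the split. It assumes an unnormalized Haar-style average/difference filter pair, chosen here only for brevity; it is not the filter of the proposed design. The function names `haar_dwt_1d` and `haar_idwt_1d` and the sample values are hypothetical.

```python
def haar_dwt_1d(signal):
    """One-level analysis: return (low, high) sub bands of an even-length list.

    Assumption: unnormalized Haar-style filters (pairwise average and
    pairwise half-difference), used only as an illustration.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    low = [(a + b) / 2 for a, b in pairs]    # local averages (low frequency)
    high = [(a - b) / 2 for a, b in pairs]   # local differences (high frequency)
    return low, high


def haar_idwt_1d(low, high):
    """One-level synthesis: rebuild the samples from the two sub bands."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])           # a = l + h, b = l - h
    return out


# Hypothetical row of pixel values, smooth within each pair.
row = [100, 102, 104, 104, 50, 52, 200, 198]
low, high = haar_dwt_1d(row)
restored = haar_idwt_1d(low, high)
```

For locally smooth data such as `row`, the high sub band holds values near zero while the low sub band keeps the coarse shape of the signal, which is what allows a subsequent encoder to spend fewer bits on it.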
DWT-IDWT is one of the prominent transformation techniques widely used in signal processing and communication applications. It transforms a signal into multiple resolution sub bands. The DWT is computationally intensive and consumes considerable power because of its large number of mathematical operations. Latency and throughput are other major limitations of the DWT, as there are multiple levels of hierarchy. The DWT has traditionally been implemented by convolution, and the digit-serial or parallel representation of the input data further determines the architecture complexity. Such an implementation demands a large number of computations and a large amount of storage, neither of which is desirable for high-speed or low-power applications. Recently, a lifting-based scheme that often requires far fewer computations has been proposed for the DWT. The main feature of the lifting-based DWT scheme is that it breaks up the high pass and low pass filters into a sequence of upper and lower triangular matrices and converts the filter implementation into banded matrix multiplications.
Since the DWT requires intensive computation, several architectural solutions using special-purpose parallel processors have been proposed to meet the real-time requirements of many applications. The solutions include parallel filter architecture, SIMD linear array architecture, SIMD multigrain architecture, 2-D block based architecture, and AWARE's wavelet transform processor (WTP).
Several versions of lifting scheme architecture have been compared and reported in the literature. In terms of hardware complexity, the folded architecture is the simplest and the DSP-based architecture is the most complex. All other architectures have comparable hardware complexity and differ primarily in the number of registers and the multiplexor circuitry. The control complexity of the architecture is very simple. In contrast, the number of switches, multiplexors and control signals used in the architectures of is quite large.
The control complexity of the remaining architectures is moderate. In terms of timing performance, the architectures are all pipelined, with the architectures having the highest throughput (1/Tm). The architecture has fewer cycles since it is RPA (recursive pyramid algorithm) based, but its clock period is higher. The architecture in has the lowest computation delay.
The DWT is recommended by the JPEG2000 standard because it supports features such as progressive transmission, higher compression and region-of-interest encoding. Convolution-based or FIR filter bank based DWT architectures occupy a large area because they require a larger number of multipliers and adders, which makes the computation complex and time consuming. Mobile phones and similar handheld devices that support image/video applications demand high-speed, low-power architectures with reduced memory size for DWT processing. Several architectures that perform the lifting-based DWT are discussed in the literature. The approach for the 2-D DWT is to apply the 1-D DWT row-wise, which produces the L and H sub bands, and then to process these sub bands column-wise to obtain the LL, LH, HL and HH coefficients.
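The row-column procedure can be sketched in Python as follows. This is only an illustrative software model, not the proposed hardware architecture. It assumes a reversible integer 5/3 lifting filter of the kind used in JPEG2000, mirrored samples at the boundaries, and even image dimensions. All function names are hypothetical, and the sub band naming (first letter for the row filter, second for the column filter) is an assumed convention.

```python
def lift53_forward(x):
    """One-level 1-D integer 5/3 lifting on an even-length list of ints."""
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("length must be even and at least 2")
    even, odd = x[0::2], x[1::2]
    half = n // 2
    # Predict step: detail = odd sample minus the floor-mean of its two even
    # neighbours; the right neighbour is mirrored at the last position.
    high = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
            for i in range(half)]
    # Update step: approximation = even sample plus a rounded quarter of the
    # two neighbouring details; the left neighbour is mirrored at position 0.
    low = [even[i] + ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
           for i in range(half)]
    return low, high


def lift53_inverse(low, high):
    """Undo lift53_forward by running the lifting steps in reverse order."""
    half = len(low)
    even = [low[i] - ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
            for i in range(half)]
    odd = [high[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def column_pass(band):
    """Apply the 1-D transform to every column of a band."""
    cols = [lift53_forward(c) for c in transpose(band)]
    low = transpose([lo for lo, _ in cols])
    high = transpose([hi for _, hi in cols])
    return low, high


def dwt2d_one_level(image):
    """Row-wise 1-D DWT into L and H, then column-wise into LL, LH, HL, HH."""
    rows = [lift53_forward(r) for r in image]
    L = [lo for lo, _ in rows]
    H = [hi for _, hi in rows]
    LL, LH = column_pass(L)
    HL, HH = column_pass(H)
    return LL, LH, HL, HH


# Hypothetical test data: a 1-D round trip and a constant 4x4 image.
sample = [3, 7, 1, 8, 2, 9, 4, 6]
restored = lift53_inverse(*lift53_forward(sample))

flat = [[8] * 4 for _ in range(4)]
LL, LH, HL, HH = dwt2d_one_level(flat)
```

Each lifting step uses only additions and shifts on integers, and the inverse recomputes exactly the same predict and update terms, so the 1-D round trip returns the original samples. For the constant image, the three detail sub bands are all zero and the LL sub band keeps the constant value.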