International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015
ISSN 2250-3153
www.ijsrp.org

Adaptive FIR Filter Using Distributed Arithmetic for Area Efficient Design

Manish Kumar *, Dr. R. Ramesh **
* Dept. of Electronics and Communication, Saveetha Engineering College, Chennai, India.
** Dept. of Electronics and Communication, Saveetha Engineering College, Chennai, India.

Abstract- In this paper we propose an efficient pipelined architecture for a low-power, high-throughput, low-area adaptive FIR filter based on distributed arithmetic. The throughput rate is significantly increased by parallel look-up table (LUT) update. Reduction in power consumption is achieved by using a fast bit clock for carry-save accumulation. We show that the sampling period can be substantially reduced by using carry-save accumulation for the DA-based inner product. The design requires half the number of registers used by the existing DA-based design to store sums of input samples. The system is implemented on an FPGA, which enables rapid prototyping of digital circuits.

Index Terms- Finite Impulse Response (FIR), Look-Up Table (LUT), Distributed Arithmetic (DA), Field Programmable Gate Array (FPGA).

I. INTRODUCTION

In recent years, there has been a growing trend to implement digital signal processing functions on Field Programmable Gate Arrays (FPGAs). This calls for efficient architectures for digital signal processing functions such as FIR filters, which are widely used in video and audio signal processing, telecommunications, and other fields. Many digital signal processing (DSP) applications require linear filters that can adapt to changes in the signals they process. Adaptive filters find extensive use in several DSP applications, including acoustic echo cancellation, signal de-noising, sonar signal processing, clutter rejection in radars, and channel equalization for communications and networking systems [1], [2].
In many cases, the sampling frequencies for digital processing of these signals are close to the system clock frequencies. Thus, it is important for the implemented adaptive filters to have a high throughput.

Traditionally, a direct implementation of a K-tap FIR filter requires K multiply-and-accumulate (MAC) blocks. Implementing these multipliers in the logic fabric of the FPGA is costly in logic complexity and area, especially when the filter is large. Modern FPGAs have dedicated DSP blocks that alleviate this problem; however, for very large filters the challenge of reducing area and complexity remains. To resolve this issue, we first present DA, which is a multiplier-less architecture.

Very efficient methods have been developed for the parallel implementation of static digital filters in field programmable gate arrays (FPGAs) or custom ICs [4]. Distributed arithmetic (DA) [5] is one method often preferred, since it eliminates the need for hardware multipliers and can implement large filters with very high throughput. Instead of computing the multiplications directly, the MAC operations are decomposed into a series of lookup table (LUT) accesses and summations. DA is thus a bit-serial method of computing the inner product of two vectors in a fixed number of cycles. The original DA architecture stores all possible binary combinations of the coefficients w[k] of equation (1) in a memory or lookup table. For large values of L, the memory holding the precomputed terms grows exponentially and becomes too large to be practical. The memory size can be reduced by dividing the single large memory ($2^L$ words) into m smaller memories, each of size $2^k$ words, where L = m × k.
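The LUT decomposition and memory partitioning described above can be illustrated with a small software model. This is only a minimal sketch under stated assumptions, not the hardware architecture proposed in this paper: the function names `build_lut` and `da_inner_product`, the parameter `group` (taps per LUT, playing the role of k), and the use of integer weights with two's-complement samples of `bits` bits are all choices made here for illustration.

```python
def build_lut(weights):
    """Precompute the sum of every subset of `weights` (2**len(weights) words).

    Bit i of the address selects whether weights[i] is included in the sum.
    """
    n = len(weights)
    return [sum(w for i, w in enumerate(weights) if (addr >> i) & 1)
            for addr in range(1 << n)]


def da_inner_product(weights, samples, bits, group=None):
    """Bit-serial DA model of sum(w * x) for two's-complement samples.

    `bits` is the sample word length. `group` is the number of taps served
    by each LUT (hypothetical parameter; None means one LUT over all taps).
    Returns (inner_product, total_lut_words).
    """
    n = len(weights)
    group = n if group is None else group
    if n % group != 0:
        raise ValueError("number of taps must be a multiple of group")

    # One small LUT per group of taps instead of a single 2**n-word memory.
    luts = [build_lut(weights[s:s + group]) for s in range(0, n, group)]

    # Two's-complement bit patterns of the samples.
    mask = (1 << bits) - 1
    words = [x & mask for x in samples]

    acc = 0
    for j in range(bits):                      # one bit plane per cycle
        partial = 0
        for g, lut in enumerate(luts):
            addr = 0
            for i in range(group):             # gather bit j of each sample
                addr |= ((words[g * group + i] >> j) & 1) << i
            partial += lut[addr]               # LUT access replaces multiplies
        if j == bits - 1:
            acc -= partial << j                # sign bit has weight -2**(bits-1)
        else:
            acc += partial << j
    return acc, sum(len(lut) for lut in luts)
```

For example, with weights `[3, -1, 4, 2]`, samples `[5, -7, 2, -3]` and `bits=4`, a single LUT gives `(24, 16)`, while `group=2` gives the same inner product with only 8 stored words, `(24, 8)`. This matches the reduction from $2^L$ words to m memories of $2^k$ words each.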
The memory size can be further reduced to $2^{L-1}$ and $2^{L-2}$ words by applying offset binary coding and exploiting the resulting symmetries in the memory contents. This technique is based on a 2's complement binary representation of the data, and the data can be precomputed and stored in the LUT.

As DA is a very efficient solution especially suited to LUT-based FPGA architectures, many researchers have put great effort into using DA to implement FIR filters on FPGAs. Patrick Longa introduced the structure of the FIR filter using the DA algorithm and the functions of each part. Sangyun Hwang analyzed the power consumption of the filter using the DA algorithm. Heejong Yoo proposed a modified DA architecture that gradually replaces LUT requirements with multiplexer/adder pairs. The main problem of DA, however, is that the required LUT capacity increases exponentially with the order of the filter, since DA implementations need $2^K$ words (K is the number of taps of the filter). If K is prime, the hardware resource consumption is even higher.

To overcome these problems, this paper presents a hardware-efficient DA architecture. We propose an efficient LUT-less architecture for a high-speed DA-based adaptive filter with very low adaptation delay. We show that the minimum sampling period can be substantially reduced by using carry-save accumulation for DA-based inner-product computation. The proposed design requires less than half the number of registers of the existing design for storing the necessary sums of input samples, and it involves a simple structure for weight updating. This method not only reduces the LUT size but also modifies the structure of the filter to achieve high-speed performance. The proposed filter has been designed and synthesized with ISE 14.1, and implemented with a FPGA I