This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS–I: REGULAR PAPERS

Efficient Shift-Add Implementation of FIR Filters Using Variable Partition Hybrid Form Structures

Dwaipayan Ray, Nithin V. George, Member, IEEE, and Pramod Kumar Meher, Senior Member, IEEE

Abstract: Single constant multiplication (SCM) and multiple constant multiplications (MCM) are among the most popular schemes used for low-complexity shift-add implementation of finite impulse response (FIR) filters. While SCM is used in the direct form realization of FIR filters, MCM is used in the transposed direct form structures. Very often, hybrid form FIR filters, in which the sub-sections are implemented by fixed-size MCM blocks, provide better area, time, and power efficiency than traditional MCM and SCM based implementations. To obtain an efficient hybrid form filter, we perform in this paper a detailed complexity analysis of the hybrid form structures in terms of the hardware and time they consume. We find that the existing hybrid form structures lead to an undesirable increase of complexity in the structural-adder block. Therefore, to achieve a more efficient implementation, a variable size partitioning approach is proposed. It is shown that the proposed approach consumes less area and provides, on average, nearly 11% reduction of critical path delay, 40% reduction of power consumption, 15% reduction of area-delay product, 52% reduction of energy-delay product, and 42% reduction of power-area product over the state-of-the-art methods.

Index Terms: Finite impulse response (FIR) filter, hybrid form FIR filters, constant multiplication schemes, coefficient partitioning approach, and low power designs.

Manuscript received November 29, 2017; revised February 21, 2018, March 21, 2018, and May 6, 2018; accepted May 14, 2018. This work was supported by the Department of Science and Technology, Government of India through the INSPIRE Faculty Award Scheme under Grant IFA-13 ENG-45. This paper was recommended by Associate Editor Y. Yu. (Corresponding author: Dwaipayan Ray.)
D. Ray and N. V. George are with the Department of Electrical Engineering, IIT Gandhinagar, Gandhinagar 382355, India (e-mail: dwaipayan.ray@iitgn.ac.in; nithin@iitgn.ac.in).
P. K. Meher is with the C. V. Raman College of Engineering, Bhubaneswar 752054, India (e-mail: pkmeher@gmail.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCSI.2018.2838666

I. INTRODUCTION

Fig. 1. Realization of an N-tap FIR filter in (a) direct form (DF), (b) transposed direct form (TDF), and (c) generalized hybrid form structure of [5]. PSA: pre-structural adder. AT: adder tree. L: partitioning parameter.

The guaranteed stability and linear phase response of finite impulse response (FIR) filters have made them popular candidates for several digital signal processing (DSP) applications [1]. The area, time, and power consumption of an FIR filter are largely dominated by the complexity of multiplications. Several attempts have therefore been made to reduce this complexity by multiplier-less FIR filter implementation, where the multiplication operations are realized by optimized shift-and-add based networks [2]. The two most commonly used forms of FIR filter implementation are the direct form (DF) (Fig. 1(a)) and the transposed direct form (TDF) structures (Fig. 1(b)) [3]. In a shift-and-add based DF filter, each multiplication is realized by the single constant multiplication (SCM) scheme and the partial products are added together by an adder-tree to obtain the final output.
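As a rough illustration of the DF data path just described, the following Python sketch realizes each constant multiplication with shifts and adds only, and then sums the partial products in a pairwise adder tree. The function names (`scm_multiply`, `adder_tree`, `fir_direct_form`), the integer coefficients, and the plain binary expansion of each coefficient are assumptions made for illustration. An optimized SCM network would typically need fewer adders, and this sketch is not the implementation evaluated in this paper.

```python
# Illustrative sketch only: names and coefficient values are hypothetical,
# and the binary expansion below is NOT an optimized SCM design.

def scm_multiply(x, coeff):
    """Multiply integer x by the constant integer coeff using shifts and adds."""
    acc = 0
    magnitude = abs(coeff)
    shift = 0
    while magnitude:
        if magnitude & 1:
            acc += x << shift      # one shifted copy of x per nonzero bit
        magnitude >>= 1
        shift += 1
    return -acc if coeff < 0 else acc


def adder_tree(terms):
    """Sum the partial products pairwise, level by level, like an adder tree."""
    terms = list(terms)
    while len(terms) > 1:
        next_level = [terms[i] + terms[i + 1] for i in range(0, len(terms) - 1, 2)]
        if len(terms) % 2:         # odd element passes through to the next level
            next_level.append(terms[-1])
        terms = next_level
    return terms[0] if terms else 0


def fir_direct_form(samples, coeffs):
    """Direct form FIR: delay line on the input, one SCM per tap, adder tree."""
    delay_line = [0] * len(coeffs)
    outputs = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]   # shift the new sample in
        products = [scm_multiply(d, h) for d, h in zip(delay_line, coeffs)]
        outputs.append(adder_tree(products))
    return outputs


if __name__ == "__main__":
    print(fir_direct_form([1, 2, -1, 4], [3, -5, 7]))  # [3, 1, -6, 31]
```

Each tap here has its own independent shift-add network, which is the sense in which the DF structure relies on SCM rather than on sharing partial terms across coefficients.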
On the other hand, in a TDF filter, the current input sample is multiplied by all the coefficients, and the products are then passed through a unit delay and added in the structural-adder block (SAB) to generate the filter output. The multiplications in this case are realized by the multiple constant multiplication (MCM) approach [4].

1549-8328 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

To reduce the hardware complexity of an FIR filter, several optimization techniques have been proposed in the recent past for the MCM block [6]–[15] as well as the structural-adder block [16]–[19], where the methods of [6]–[11] perform the block-level (word-level adder) optimization of the MCM module while the bit-level (full adder) optimization of