IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 21, NO. 1, JANUARY 2013 187

A High-Speed Low-Complexity Modified FFT Processor for High Rate WPAN Applications

Taesang Cho and Hanho Lee

Abstract—This paper presents a high-speed low-complexity modified 512-point fast Fourier transform (FFT) processor using an eight data-path pipelined approach for high rate wireless personal area network applications.
A novel modified FFT algorithm that reduces the hardware complexity is proposed. This method reduces the number of complex multiplications and the size of the twiddle factor memory. It also uses a complex constant multiplier instead of a complex Booth multiplier. The proposed FFT processor achieves a signal-to-quantization noise ratio (SQNR) of 35 dB at a 12-bit internal word length. The proposed processor has been designed and implemented using 90-nm CMOS technology with a supply voltage of 1.2 V. The results demonstrate that the total gate count of the proposed FFT processor is 290 K. Furthermore, the highest throughput rate is up to 2.5 GS/s at 310 MHz while requiring much less hardware complexity.

Index Terms—Fast Fourier transform (FFT), modified radix-$2^5$, orthogonal frequency-division multiplexing (OFDM), wireless personal area network (WPAN).

Manuscript received July 18, 2011; revised November 11, 2011; accepted December 07, 2011. Date of publication February 03, 2012; date of current version December 19, 2012. This work was supported by Inha University.
The authors are with the Department of Information and Communication Engineering, Inha University, Incheon 402-751, Korea (e-mail: hhlee@inha.ac.kr).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2011.2182068

I. INTRODUCTION

With the ever-increasing demand for multimedia applications using wireless transmission over short distances, the millimeter wave (mmWave) 60 GHz wireless personal area network (WPAN) has been intensively researched for many years. Currently, the IEEE 802.11 Task Group ad (IEEE 802.11ad) is developing a standard for mmWave wireless local area network (WLAN) and WPAN systems.¹ High rate WPAN systems will serve various high-speed multimedia applications, such as home network systems and real-time video streaming services, in short-range indoor environments.
One key advantage of IEEE 802.11ad over the other standardization activities in the 60 GHz arena is that it builds on the existing strong market presence of Wi-Fi in the 2.4/5 GHz bands.

In the PHY layer design of high rate WPANs, orthogonal frequency division multiplexing (OFDM) modulation has been adopted, and the fast Fourier transform (FFT) processor is a key component. The FFT/IFFT processor accounts for a high hardware complexity in the OFDM modulation of high rate WPAN systems. One OFDM symbol in the IEEE 802.11ad standard consists of 512 subcarriers. Therefore, the FFT processor performs a 512-point FFT computation and should provide a throughput rate of at least 2.115 GS/s.

In recent years, there has been some research on the design of multi-path pipelined FFT processors that provide a high throughput [1]–[7]. Many FFT processor architectures have been introduced for OFDM transmission, such as the single-path delay commutator (SDC), multi-path delay commutator (MDC), single-path delay feedback (SDF), and multi-path delay feedback (MDF). Among the various FFT architectures, the MDF architecture is frequently used as a solution to provide a throughput rate of more than 1 GS/s [3]–[5]. However, for applications that require a throughput rate of over 2 GS/s, the number of data paths can be increased to 8 or 16, which increases the hardware cost. The area becomes even larger because the memory modules are duplicated for the 16 data-path approach.

In order to reduce the area and power consumption, several FFT algorithms and dynamic scaling schemes have been proposed [2]–[6]. The radix of the algorithm greatly influences the architecture of the FFT processor and the complexity of the implementation. A small radix is desirable because it results in a simple butterfly. Nevertheless, a high radix reduces the number of twiddle factor multiplications.
The radix-$2^k$ algorithms simultaneously achieve a simple butterfly and a reduced number of twiddle factor multiplications [8]. The radix-2 algorithm is a well-known simple algorithm for FFT processors, but it requires many complex multipliers. The radix-4 algorithm is primarily used for high data throughput FFT architectures, but it requires a 4-point butterfly unit with high complexity. Recently, FFT algorithms and architectures have been studied in order to reduce the number of complex multipliers [2], [4].

In this brief, a novel modified FFT algorithm and a 512-point FFT/IFFT processor architecture, which can provide a high throughput of 2.5 GS/s and an SQNR of 35 dB for 16-QAM applications, are proposed. The key concepts for achieving a high data throughput, reduced hardware complexity, and higher SQNR performance are described.

The organization of this brief is as follows. Section II describes the proposed modified FFT algorithm, and Section III describes the proposed 512-point FFT architecture. In Section IV, the implementation and comparison are presented. Finally, conclusions are provided in Section V.

¹ [Online]. Available: http://www.ieee802.org/11/Reports/tgad_update.htm

1063-8210/$31.00 © 2012 IEEE

II. MODIFIED FFT ALGORITHM

A discrete Fourier transform (DFT) of length $N$ is defined as follows:

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, \ldots, N-1 \tag{1}$$
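As a rough illustration of the DFT definition in (1), the following Python sketch evaluates the sum directly and compares it with a textbook recursive radix-2 decimation-in-time FFT. It assumes the usual twiddle factor convention $W_N = e^{-j2\pi/N}$, which is not stated in the text above. The function names (`twiddle`, `dft`, `fft_radix2`, `max_abs_diff`) are hypothetical helpers chosen for this example, and the radix-2 routine is the standard algorithm, not the modified algorithm proposed in this brief.

```python
import cmath


def twiddle(N, e):
    """Return W_N^e, assuming the convention W_N = exp(-j*2*pi/N)."""
    return cmath.exp(-2j * cmath.pi * e / N)


def dft(x):
    """Direct O(N^2) evaluation of X(k) = sum_n x(n) * W_N^(n*k)."""
    N = len(x)
    return [sum(x[n] * twiddle(N, n * k) for n in range(N)) for k in range(N)]


def fft_radix2(x):
    """Textbook recursive radix-2 decimation-in-time FFT (N must be a power of two)."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # N/2-point transform of even-indexed samples
    odd = fft_radix2(x[1::2])   # N/2-point transform of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        t = twiddle(N, k) * odd[k]   # one twiddle factor multiplication per butterfly
        out[k] = even[k] + t         # radix-2 butterfly, upper output
        out[k + N // 2] = even[k] - t  # radix-2 butterfly, lower output
    return out


def max_abs_diff(a, b):
    """Largest element-wise magnitude difference between two sequences."""
    return max(abs(u - v) for u, v in zip(a, b))


if __name__ == "__main__":
    # Arbitrary 16-point complex test vector (illustrative data only).
    x = [complex(n % 7, -(n % 3)) for n in range(16)]
    print(max_abs_diff(dft(x), fft_radix2(x)))
```

The direct sum needs on the order of $N^2$ complex multiplications, whereas the radix-2 recursion needs one twiddle multiplication per butterfly. This is the cost that higher-radix formulations aim to reduce further.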