Journal of VLSI Signal Processing, 4, 165-176 (1992)
© 1992 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.

A Radix-8 Wafer Scale FFT Processor

EARL E. SWARTZLANDER, JR.
Department of Electrical and Computer Engineering, University of Texas, Austin, TX 78712

VIJAY K. JAIN AND HIROOMI HIKAWA
Department of Electrical Engineering, University of South Florida, Tampa, FL 33620

Received January 16, 1991; Revised November 6, 1991.

Abstract. Wafer Scale Integration promises radical improvements in the performance of digital signal processing systems. This paper describes the design of a radix-8 systolic (pipeline) fast Fourier transform processor for implementation with wafer scale integration. By the use of the radix-8 FFT butterfly wafer that is currently under development, continuous data rates of 160 MSPS are anticipated for FFTs of up to 4096 points with 16-bit fixed point data.

1. Introduction

Current signal processor performance is severely constrained by the available technology. Silicon CMOS VLSI circuits placed on printed circuit boards are limited by their interfaces to clock rates of under 40 MHz. Silicon ECL and the various forms of GaAs do not achieve the high levels of integration (i.e., over 100,000 gates/chip) that are required for many signal processing applications. A solution is available in the form of CMOS Wafer Scale Integration, which offers both high speed (due to much lower interconnection parasitics than printed circuit board mounted CMOS VLSI) and extremely high levels of integration.

Wafer scale technology avoids the interchip buffering and interconnect delays of VLSI, permitting significantly higher data rates and lower power consumption. The major challenges are to (1) minimize the number of cell types, (2) employ a regular (and short) interconnection architecture, and (3) use redundancy to circumvent the defects implicit in wafer scale implementation.

Signal processing systems require many diverse functions: time-frequency transformation, time and frequency domain processing, and general purpose computation. Since the FFT is the cornerstone of modern digital signal processing, this paper examines the projected implementation of an FFT processor that is based on a design for a radix-8 FFT wafer that is currently in development at the University of South Florida. The next section describes the systolic FFT architecture. Section 3 then examines the wafer scale implementation of a radix-8 systolic FFT processor that is realized with two types of wafers.

2. Systolic (Pipeline) FFT Architecture

The systolic (pipeline) FFT algorithm was initially developed and implemented at Raytheon [1]. Later, an arbitrary-radix systolic (pipeline) FFT was developed at MIT Lincoln Laboratories [2],[3], where a radix-4 implementation was realized [4]. Subsequently, a VLSI floating point arithmetic implementation of a radix-4 pipeline FFT processor produced at TRW achieved data rates of 40 MSPS for transforms of lengths up to 4096 points [5].

With the radix-n pipeline FFT algorithm shown in figure 1, n complex data samples flow in parallel through a pipeline network composed of computational elements and delay commutators. The parallel data transfers facilitate achieving high throughput with modest clock rates. The computational element realizes an n-point discrete Fourier transform, while the delay commutator performs the data reordering that is required for the FFT algorithm.

The same architecture is used with minor changes to implement forward and inverse transforms of lengths that are integer powers of the radix. The changes involve varying the number of stages connected in series, changing the multiplicative weights