LOW-DIMENSIONAL AUDIO-RATE CONTROL OF FFT-BASED PROCESSING

Cort Lippe & Zack Settel

Cort Lippe
University at Buffalo
Department of Music
Hiller Computer Music Studios
222 Baird Hall
Buffalo, NY, USA 14260
lippe@buffalo.edu

Zack Settel
McGill University
Music Faculty
555 rue Sherbrooke Ouest
Montreal, Quebec H3A 1E3
CANADA
zack@music.mcgill.ca

ABSTRACT

While the use of the Fast Fourier Transform (FFT) for signal processing in music applications has been widespread, applications in real-time systems for dynamic spectral transformation have been quite limited. The limitations have been largely due to the amount of computation required for the operations. With faster machines, and with a suitable implementation for frequency-domain processing, real-time dynamic control of high-quality spectral processing can be accomplished efficiently using a simple approach. This paper will describe some recent work in dynamic real-time control of frequency-domain-based signal processing.

Since the implementation of the FFT/IFFT is central to the approach and methods discussed below, the authors will provide a brief description of this implementation, as well as of the development environment used in our work.

1. INTRODUCTION

We employ the standard procedures commonly used when processing audio signals via the FFT, including: (1) windowing of the time-domain input signal [1], (2) transformation of the input signal into a frequency-domain signal (spectrum) using the FFT, (3) various frequency-domain operations such as complex multiplication for convolution, (4) transformation of the frequency-domain signals back into the time domain using the IFFT, and (5) windowing of the time-domain output signal.

As seen in Figure 1, the index values provide a synchronization phasor, making it possible to identify bins within a frame and to recognize frame boundaries. The index values can be used to access bin-specific data for various operations, such as attenuation or spatialization, and to read lookup tables for windowing.

Figure 1. Sample-by-sample output of the FFT object

1.2 Audio-Rate Control of FFT-Based Processing

The Max/Msp environment has two run-time schedulers: the Max “control” scheduler, which is timed on the basis of milliseconds, and the Msp “signal” scheduler, which is timed at the audio sampling rate [5]. In FFT-based processing applications, where changes to the resulting spectrum are infrequent, Msp’s control objects may be used to provide control parameters for the processing. This is both precise and economical, but has bandwidth limitations. Significant and continuous modification of a spectrum, as in the case of a sweeping band-pass filter, is not possible using Msp’s control objects, since they cannot keep up with the task of providing 1024 parameter changes at the FFT frame rate of 43 times a second (using FFT buffers of size 1024 at the audio sampling rate of 44,100 samples per second), that is, about 44,100 parameter updates per second, or one per audio sample. A more dynamic approach to filtering is to update lookup tables containing filter functions at the signal rate (the audio sampling rate).

The term “Spectral Processing Function” (SPF) will be used frequently in this text and refers to a lookup table-based function (actually a signal), whose length is that of the FFT. For each

1.1 Development and Implementation

The development environment used by the authors, Max and Max Signal Processing (Msp) [2], has evolved from the Max software developed by Miller Puckette for the Ircam Signal Processing Workstation (ISPW) [3].
This environment facilitates the development of real-time, general-purpose audio applications. The FFT object provided in Msp is based on Miller Puckette’s ISPW implementation [4] and stores time-domain signals as buffers of samples upon which the FFT analysis is done. The FFT object outputs each frame, bin by bin, using three sample streams running at the sampling rate. Thus, each bin is represented by three samples: a “real” value, an “imaginary” value, and the bin number (index). The IFFT is the complement of the FFT and expects as input real and imaginary values in the same format as the FFT output.
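
The three-stream format described above can be illustrated with a small Python sketch. This is not the Msp implementation: it substitutes a naive DFT for the FFT, omits windowing and overlap, and all names (`fft_streams`, `apply_bin_gains`, `ifft_from_streams`) are hypothetical, chosen only for this example. It shows each bin emitted as a (real, imaginary, index) triple, one triple per sample period, with the index stream serving both to detect frame boundaries and to address a per-bin lookup table.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) DFT, standing in for the FFT (assumption for brevity)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [
        (sum(v * cmath.exp(2j * math.pi * k * t / n)
             for k, v in enumerate(spectrum)) / n).real
        for t in range(n)
    ]


def fft_streams(signal, frame_size):
    """Yield one (real, imag, index) triple per sample, frame after frame."""
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        spectrum = dft(signal[start:start + frame_size])
        for index, value in enumerate(spectrum):
            yield value.real, value.imag, index


def apply_bin_gains(triples, gains):
    """Scale each bin by a lookup-table value addressed by the index stream."""
    for re, im, index in triples:
        yield re * gains[index], im * gains[index], index


def ifft_from_streams(triples, frame_size):
    """Rebuild time-domain samples from triples in the same three-stream format."""
    out = []
    frame = [0j] * frame_size
    for re, im, index in triples:
        frame[index] = complex(re, im)
        if index == frame_size - 1:  # last bin of the frame: frame boundary
            out.extend(idft(frame))
    return out


N = 8  # small frame size, for illustration only
signal = [math.sin(2 * math.pi * t / N) for t in range(2 * N)]  # two frames
triples = list(fft_streams(signal, N))
indices = [index for _, _, index in triples]  # ramps 0..7, then 0..7 again
resynth = ifft_from_streams(triples, N)       # matches signal up to rounding
```

Because the index stream accompanies every sample, per-bin operations need no separate frame clock: `apply_bin_gains` reads its table using the index alone, and `ifft_from_streams` recognizes the end of a frame when the index reaches `frame_size - 1`.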