Published in Visual Representations of Speech Signals, Martin Cooke, Steve Beet, and Malcolm Crawford (eds.), pp. 95-116. 1993 by John Wiley & Sons Ltd

5 ON THE IMPORTANCE OF TIME—A TEMPORAL REPRESENTATION OF SOUND

Malcolm Slaney and Richard F. Lyon
Advanced Technology Group
Apple Computer, Inc.
Cupertino, CA 95014 USA

1 INTRODUCTION

The human auditory system has an amazing ability to separate and understand sounds. We believe that temporal information plays a key role in this ability, more important than the spectral information that is traditionally emphasized in hearing science. In many hearing tasks, such as describing or classifying single sound sources, the underlying mathematical equivalence makes the temporal versus spectral argument moot. We show how the nonlinearity of the auditory system breaks this equivalence, and is especially important in analyzing complex sounds from multiple sources with different characteristics.

The auditory system is inherently nonlinear. In a linear system, the component frequencies of a signal are unchanged, and it is easy to characterize the amplitude and phase changes caused by the system. The cochlea and the neural processing that follows it are more interesting. The bandwidth of a cochlear “filter” changes at different sound levels, and neurons change their sensitivity as they adapt to sounds. Inner hair cells (IHCs) produce nonlinear rectified versions of the sound, generating new frequencies such as envelope components. All of these changes make it difficult to describe auditory perception in terms of the spectrum or Fourier transform of a sound.

One characteristic of an auditory signal that is undisturbed by most nonlinear transformations is the periodicity information in the signal. Even if the bandwidth, amplitude, and phase characteristics of a signal are changing, the repetitive characteristics do not. In addition, it is very unlikely that a periodic signal could come from more than one source. Thus the auditory system can safely assume that sound fragments with a consistent periodicity can be combined and assigned to a single source. Consider, for example, a sound formed by opening and closing the glottis four times and filtering the resulting puffs of air with the vocal resonances. After nonlinear processing, the lower auditory nervous system will still detect four similar events, which will be heard and integrated as coming from a voice.

The duplex theory of pitch perception, proposed by Licklider in 1951 [11] as a unifying model of pitch perception, is even more useful as a model for the extraction and representation of temporal structure for both periodic and non-periodic signals. This theory produces a movie-like image of sound which is called a correlogram. We believe that the correlogram, like other representations that summarize the temporal information in a signal, is an important tool for understanding the auditory system.

The correlogram represents sound as a three-dimensional function of time, frequency, and periodicity. A cochlear model serves to transform a one-dimensional acoustic pressure into a two-dimensional map of neural firing rate as a function of time and place along the cochlea. A third dimension is added to the representation by measuring the periodicities in the output from the cochlear model. These three dimensions are shown in Fig. 1.

While most of our own work has concentrated on the correlogram, the important message in this chapter is that time and periodicity cues should be an important part of an auditory representation. This chapter describes two cochlear models and explores a structure which we believe can be used to represent and interpret the temporal information in an acoustic signal.

Section 2 of this chapter describes two nonlinear models of the cochlea we use in our work. These two models differ in their computational approach and are used to illustrate the robustness of the
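
The three-dimensional representation described in the introduction (time, frequency, and periodicity) can be illustrated with a minimal Python sketch. It is not either of the cochlear models used in this chapter. The Gaussian FFT-domain filter bank, the half-wave rectifier standing in for the inner hair cells, and every name and parameter value below (`bandpass_bank`, `correlogram`, `q`, the frame and lag sizes) are assumptions chosen only to keep the example short. The sketch filters a pulse train into channels, rectifies each channel, and measures a short-time autocorrelation per channel to add the periodicity axis.

```python
import numpy as np


def bandpass_bank(signal, fs, center_freqs, q=4.0):
    """Crude stand-in for a cochlear model: Gaussian band-pass filters
    applied in the FFT domain (an assumption, not the chapter's models)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    channels = []
    for cf in center_freqs:
        bandwidth = cf / q  # hypothetical constant-Q bandwidth
        gain = np.exp(-0.5 * ((freqs - cf) / bandwidth) ** 2)
        channels.append(np.fft.irfft(spectrum * gain, n=len(signal)))
    return np.array(channels)  # shape: (n_channels, n_samples)


def half_wave_rectify(x):
    """Memoryless nonlinearity standing in for the inner hair cells."""
    return np.maximum(x, 0.0)


def correlogram(signal, fs, center_freqs, frame_len, hop, max_lag,
                nonlinearity=half_wave_rectify):
    """Return an array indexed by (time frame, channel, lag)."""
    rates = nonlinearity(bandpass_bank(signal, fs, center_freqs))
    n_samples = rates.shape[1]
    frames = []
    for start in range(0, n_samples - frame_len - max_lag + 2, hop):
        segment = rates[:, start:start + frame_len]
        # Short-time autocorrelation of every channel at each lag.
        acf = np.stack(
            [np.sum(segment * rates[:, start + lag:start + lag + frame_len],
                    axis=1)
             for lag in range(max_lag)],
            axis=1,
        )
        frames.append(acf)
    return np.array(frames)


# Illustrative input: a pulse train repeating every 80 samples (10 ms at 8 kHz).
fs = 8000
signal = np.zeros(4000)
signal[::80] = 1.0
center_freqs = np.geomspace(200.0, 3000.0, 16)

cg = correlogram(signal, fs, center_freqs, frame_len=320, hop=160, max_lag=121)

# Average over time and channel, then find the strongest lag above 20 samples.
summary = cg.mean(axis=(0, 1))
period_lag = 20 + int(np.argmax(summary[20:]))
print(cg.shape, period_lag, 1000.0 * period_lag / fs, "ms")
```

Passing a different memoryless `nonlinearity` (for example, rectification followed by square-root compression) changes the values in each channel but should leave the dominant lag where it is, which is the sense in which periodicity survives the nonlinear processing described above.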