Rockefeller University Technical Report; May 2000
ftp://venezia.rockefeller.edu/pubs/PenevPS-NIPS2000-AuditorySymbols.ps

Timescales for Sparseness of Natural Sound: Implications for Auditory-Symbols Processing

Liubomire G. Iordanov
Department of Computer Science
University at Albany, State University of New York
1400 Washington Avenue, Albany, NY 12222
lou@cs.albany.edu

Ofer Tchernichovski
Field Research Center
The Rockefeller University
Tyrrel Road, Millbrook, NY 12545
tcherno@rockvax.rockefeller.edu

Penio S. Penev
Laboratory of Computational Neuroscience
The Rockefeller University
1230 York Avenue, New York, NY 10021
penev@rockefeller.edu
http://venezia.rockefeller.edu/

Abstract

The statistical structure of the natural visual environment influences the strategies for sensory processing in the retina and the thalamus of primates, where second-order redundancy is reduced, and in the primary visual areas of cortex, where filters with high output kurtosis, which expose the “sparseness” of the visual environment, are believed to facilitate scene parsing and object segmentation. Here we examine the hypothesis of a similar link between the sparseness of natural auditory ensembles and their respective symbolic structures. We find that second-order redundancy-reducing filters, which presumably operate at the cochlear level, have output kurtoses much larger than those for images. Moreover, the kurtosis depends strongly on the length of the filter time-window. We find that the difference between the characteristic timescales at which such sparseness is maximal across species-specific ensembles (human speech and bird song) correlates with the known differences in the duration of their respective symbols and inter-symbol transitions. We discuss the implications for the design of sensory systems that would efficiently code such natural auditory stimuli.
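As an illustration of the measurement summarized in the abstract, the following minimal sketch estimates the output kurtosis of second-order redundancy-reducing filters as a function of the time-window length. It rests on assumptions that are not taken from the report: the filters are obtained by PCA whitening of non-overlapping windows, the kurtosis is the excess kurtosis averaged over all filter outputs, and the input is a synthetic gated-noise signal standing in for a natural sound recording. The names `excess_kurtosis` and `whitened_kurtosis` are hypothetical.

```python
import numpy as np


def excess_kurtosis(x):
    """Excess kurtosis E[(x - mu)^4] / sigma^4 - 3, which is zero for a Gaussian."""
    x = np.asarray(x, dtype=float).ravel()
    x = x - x.mean()
    var = np.mean(x ** 2)
    return np.mean(x ** 4) / var ** 2 - 3.0


def whitened_kurtosis(signal, window, eps=1e-8):
    """Mean excess kurtosis of PCA-whitened outputs (assumed filter model).

    The signal is cut into non-overlapping frames of `window` samples; the
    frame covariance is diagonalized and each principal component is scaled
    to unit variance, which removes second-order correlations.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal) // window
    frames = signal[: n * window].reshape(n, window)
    frames = frames - frames.mean(axis=0)
    cov = frames.T @ frames / n
    evals, evecs = np.linalg.eigh(cov)
    keep = evals > eps * evals.max()  # drop numerically empty directions
    outputs = frames @ evecs[:, keep] / np.sqrt(evals[keep])
    return float(np.mean([excess_kurtosis(col) for col in outputs.T]))


# Synthetic stand-in for a sound ensemble (an assumption, not the report's data):
# Gaussian noise gated on and off in bursts of 64 samples.
rng = np.random.default_rng(0)
noise = rng.standard_normal(40000)
gate = np.repeat(rng.random(40000 // 64) < 0.1, 64)
bursty = noise * gate

# Scan the window length, as the abstract describes for the filter time-window.
kurtosis_by_window = {w: whitened_kurtosis(bursty, w) for w in (8, 16, 32, 64)}
```

With a real recording in place of `bursty`, the window length that maximizes the entries of `kurtosis_by_window` would give a rough estimate of the characteristic timescale of sparseness under these assumptions.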
1 Introduction

Any well-engineered sensory system must take advantage of the statistical structure of its inputs in order to both suppress noise and build efficient representations of seemingly complex data. The second-order structure of the statistics of natural scenes, both static (Field, 1987; Ruderman and Bialek, 1994) and dynamic (Dong and Atick, 1995a), has been