Hindawi Publishing Corporation
EURASIP Journal on Audio, Speech, and Music Processing
Volume 2010, Article ID 137878, 11 pages
doi:10.1155/2010/137878
Research Article
Physically Motivated Environmental Sound Synthesis for
Virtual Worlds
Dylan Menzies
Department of Media Technology, De Montfort University, Leicester LE1 9BH, UK
Correspondence should be addressed to Dylan Menzies, rdmg@dmu.ac.uk
Received 3 May 2010; Accepted 10 December 2010
Academic Editor: Andrea Valle
Copyright © 2010 Dylan Menzies. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
A system is described for simulating environmental sound in interactive virtual worlds, using the physical state of objects as
control parameters. It provides a unified framework for integration with physics simulation engines, together with synthesis
algorithms tailored to work within the framework. A range of behaviours can be simulated, including diffuse and nonlinear
resonators and loose surfaces. The overall aim has been to produce a flexible and practical system with intuitive controls that
will appeal to sound design professionals. Such a system could be valuable in computer game design and in other areas where
realistic environmental audio is required. A review of previous work and a discussion of the issues that influence the overall
design of the system are included.
1. Introduction
In everyday life, we experience a range of complex sounds,
many of which are generated by our direct interaction
with the environment or are strongly correlated with visual
events. For example, we push a pen across the table; it
slides, then falls off the table, hits a teacup, and rattles
inside. To generate even this simple example convincingly
in an interactive virtual world is challenging. The approach
commonly used is simply to match each physical event to a
sound taken from a collection of prerecorded or generated
sample sounds. Even with plentiful use of memory, this
approach produces poor results in many cases, particularly
where the sound evolves continuously, because the possible
range of sounds is so great and our ability to correlate
subtle visual cues with sound is acute.
Foley producers have known this for many years. When the
audio-visual correlation is good, the sense of realism and
immersion can be much greater than with either audio or visuals
alone; conversely, poor audio-visual correlation can worsen
the experience. In the interactive case, where we can control
the sounds objects make, this correlation becomes more
critical, as our attention is more acute.
The phrase physically motivated audio is used here as
shorthand for the use of the macrophysical state of
the virtual world to provide the controlling information
for the underlying audio processes. The audio processes
model microphysical behaviour: the audio vibrations
and physical detail too fine to be captured
by the macrophysical system. The macrophysical interactions that
can occur in virtual worlds can be managed by integration
under constraints, for which there exists a large literature
and a range of dedicated physics engine software libraries,
both commercial and open source. These implement a wide
range of techniques, but appear broadly similar to the
application developer, with some differences of interface and
data organization.
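The division of labour described above, in which the macrophysical state reported by a physics engine drives microphysical audio processes, can be illustrated with a minimal sketch. The event fields and the mapping function below are hypothetical and do not correspond to any particular physics engine's API; they show only the general pattern of converting contact data, such as normal impulse and sliding speed, into synthesis control parameters.

```python
from dataclasses import dataclass

@dataclass
class ContactEvent:
    """Macrophysical contact state, as a physics engine might report it
    (hypothetical field names)."""
    impulse: float        # normal impulse of the contact, kg*m/s
    sliding_speed: float  # relative tangential speed at the contact, m/s

def synthesis_controls(event: ContactEvent,
                       impact_gain: float = 1.0,
                       friction_gain: float = 0.5) -> dict:
    """Map macrophysical contact state to microphysical synthesis parameters.

    Impact excitation energy scales with the square of the impulse;
    the friction noise level scales with sliding speed. The gains are
    sound-design parameters, tuned by ear per material.
    """
    return {
        "impact_energy": impact_gain * event.impulse ** 2,
        "friction_level": friction_gain * event.sliding_speed,
    }
```

In a real integration, a callback registered with the physics engine would construct such an event for each contact and feed the resulting parameters to the resonator and surface models described later in this paper.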
In the context of virtual environments, the terms procedural
sound and generative sound refer to algorithmic sound synthesis
in general. This includes synthesis that is not visually or
haptically correlated but can be parameterized and coded
compactly. Weather sounds, for example, require constant
variation and controls for selecting the current prevailing
conditions. These advantages must be weighed against the
sound quality achievable compared with sample-based sound.
If there is no audio-visual correlation, procedural sound
may not be preferable to sampled sound. In the following,