Hindawi Publishing Corporation
EURASIP Journal on Audio, Speech, and Music Processing
Volume 2010, Article ID 137878, 11 pages
doi:10.1155/2010/137878
Research Article
Physically Motivated Environmental Sound Synthesis for
Virtual Worlds
Dylan Menzies
Department of Media Technology, De Montfort University, Leicester LE1 9BH, UK
Correspondence should be addressed to Dylan Menzies, rdmg@dmu.ac.uk
Received 3 May 2010; Accepted 10 December 2010
Academic Editor: Andrea Valle
Copyright © 2010 Dylan Menzies. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
A system is described for simulating environmental sound in interactive virtual worlds, using the physical state of objects as
control parameters. It provides a unified framework for integration with physics simulation engines, together with synthesis
algorithms tailored to work within the framework. A range of behaviours can be simulated, including diffuse and nonlinear
resonators and loose surfaces. The overall aim has been to produce a flexible and practical system with intuitive controls that
will appeal to sound design professionals. Such a system could be valuable in computer game design and in other areas where
realistic environmental audio is required. A review of previous work and a discussion of the issues that influence the overall
design of the system are included.
1. Introduction
In everyday life, we experience a range of complex sounds,
many of which are generated by our direct interaction
with the environment or are strongly correlated with visual
events. For example, we push a pen across the table; it
slides, then falls off the table, hits a teacup, and rattles
inside. To generate even this simple example convincingly
in an interactive virtual world is challenging. The approach
commonly used is simply to match each physical event to a
sound taken from a collection of prerecorded or generated
sample sounds. Even with plentiful use of memory, this
approach produces poor results in many cases, particularly
where the sound evolves continuously, because the possible
range of sounds is so great and our ability to correlate
subtle visual cues with sound is acute.
Foley producers have known this for many years. When the
audio-visual correlation is good, the sense of realism and
immersion can be much greater than with either audio or visuals
alone; conversely, poor audio-visual correlation can worsen
the experience. In the interactive case, where we can control
the sounds objects make, this correlation becomes more
critical, as our attention is more acute.
The phrase physically motivated audio is used here as
shorthand for the use of the macrophysical state of
the virtual world to provide the controlling information
for the underlying audio processes. The audio processes
model microphysical behaviour: the audio vibrations
and physical detail too fine to be captured
by the macrophysical system. The macrophysical interactions that
can occur in virtual worlds can be managed by integration
under constraints, for which there exists a large literature
and a range of dedicated physics engine software libraries,
both commercial and open source. These implement a wide
range of techniques, but appear broadly similar to the
application developer, with some differences of interface and
data organization.
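The division of labour described above, in which the macrophysical state reported by a physics engine drives microphysical audio processes, can be illustrated with a minimal sketch. The event fields and the mapping function below are hypothetical and do not correspond to any particular physics engine's API; they show only the general pattern of converting contact data, such as normal impulse and sliding speed, into synthesis control parameters.

```python
from dataclasses import dataclass

@dataclass
class ContactEvent:
    """Macrophysical contact state, as a physics engine might report it
    (hypothetical field names)."""
    impulse: float        # normal impulse of the contact, kg*m/s
    sliding_speed: float  # relative tangential speed at the contact, m/s

def synthesis_controls(event: ContactEvent,
                       impact_gain: float = 1.0,
                       friction_gain: float = 0.5) -> dict:
    """Map macrophysical contact state to microphysical synthesis parameters.

    Impact excitation energy scales with the square of the impulse;
    the friction noise level scales with sliding speed. The gains are
    sound-design parameters, tuned by ear per material.
    """
    return {
        "impact_energy": impact_gain * event.impulse ** 2,
        "friction_level": friction_gain * event.sliding_speed,
    }
```

In a real integration, a callback registered with the physics engine would construct such an event for each contact and feed the resulting parameters to the resonator and surface models described later in this paper.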
In the context of virtual environments, the terms procedural
sound and generative sound refer to algorithmic sound synthesis
in general. This includes synthesis that is not visually or
haptically correlated but can be parameterized and coded
compactly. Weather sounds, for example, require constant
variation and controls for selecting the current prevailing
conditions. These advantages must be weighed against the
sound quality achievable compared with sample-based sound.
If there is no audio-visual correlation, procedural sound
may not be preferable to sampled sound. In the following,