Leveraging Behavioral Models of Sounding Objects for Gesture-Controlled Sound Design

Kristian Gohlke, David Black
Hochschule Bremen (University of Applied Sciences)
kgohlke@acm.org, dblack@mevis.fraunhofer.de

Jörn Loviscach
Fachhochschule Bielefeld (University of Applied Sciences)
jL@j3L7h.de

ABSTRACT
Sound designers and Foley artists have long struggled to create expressive soundscapes using standard editing software, devoting much time to calibrating multiple sound samples and adjusting parameters. We present an intuitive approach that exploits the capabilities of off-the-shelf motion-sensing input devices to enable quick and fluid interaction with sound, triggering and modulating digital sound generators based on adaptable behavioral models of familiar physical sounding objects. Rather than requiring profound technical knowledge of sound design, the system leverages the user’s motor memory and motion skills to mimic generic and familiar interactions with everyday sounding objects. This allows the user to fully focus on the expressive act of sound creation while enjoying a fluent workflow and a satisfying user experience.

Author Keywords
Foley Art, Motion Gestures, Sound Effects, Audio Editing, HCI.

ACM Classification Keywords
H.5.2.: Information interfaces and presentation: User Interfaces – Input devices and strategies

INTRODUCTION

Professional Foley Artistry
Over the history of the profession, Foley artists – sound effect artists for movie, television, radio, and theater – have developed workflows that exploit the physical properties of various everyday sounding objects for creating a wide spectrum of sounds [18]. For instance, the sound of footsteps in snow may be reproduced by pressing a fist into a tray of corn starch, galloping horses may be mimicked by beating coconut shells on a hard surface, and crumpling aluminum foil may substitute for frying eggs.
Foley artists use microphones to record these physical sounding objects, which are manipulated in many ways to produce the desired sound effect. To create the appropriate soundscape for a production, the sound is often recorded while the Foley artist watches movie or theater scenes to synchronize sound and motion. This workflow allows the Foley artist to instantly react to the images with fine expressiveness, as the artist directly interacts with a selection of tangible sound-producing artifacts.

However, the approach is not without its drawbacks. The flexibility is highly dependent on the available “library” of these physical objects, many of which are bound to a narrow range of applications. A versatile collection of such objects can take up a considerable amount of space and thus often limits the artist to purpose-built studios. Once a specific sound has been recorded, expressive parameters are only digitally editable to a certain degree, and often, content must be re-recorded to achieve the desired results. Because of these factors, the classic Foley workflow has been restricted to the domain of specialists.

The widespread availability of digital production tools has enticed a growing number of casual users to shoot, edit, and publish their own video content. Professional Foley gadgets and the appropriate recording environment are often not available in such contexts.

Digital Foley Work
Much Foley-style work has been transferred to the digital realm. Instead of recording physical sounding objects themselves, many sound designers now rely on large databases of sound recordings for their work. Finding and arranging these snippets on a timeline is a time-consuming process that often involves much manual setup: to achieve the desired results, the user must locate a suitable audio file and then edit low-level audio parameters such as reverb decay times, filter frequencies, sample loop starting times, and start and end points.
Values are entered into the digital audio workstation (DAW) with keyboard and mouse or manipulated on-screen by tweaking parameter envelope graphs.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
TEI’11, January 22–26, 2011, Funchal, Portugal.
Copyright 2011 ACM 978-1-4503-0478-8/11/01...$10.00.