Ringomatic: A Real-Time Interactive Drummer Using Constraint-Satisfaction and Drum Sound Descriptors

Jean-Julien Aucouturier
SONY CSL Paris
6, rue Amyot
75005 Paris, France
jj@csl.sony.fr

François Pachet
SONY CSL Paris
6, rue Amyot
75005 Paris, France
pachet@csl.sony.fr

ABSTRACT
We describe a real-time musical agent that generates an audio drum-track by concatenating audio segments automatically extracted from pre-existing musical files. The drum-track can be controlled in real-time by specifying high-level properties (or constraints) holding on metadata automatically extracted from the audio segments. A constraint-satisfaction mechanism, based on local search, selects the audio segments that best match those constraints at any time. We report on several drum track audio descriptors designed for the system. We also describe a basic mechanism for controlling the tradeoff between the agent’s autonomy and reactivity, which we illustrate with experiments made in the context of a virtual duet between the system and a human pianist.

Keywords: interaction, drumtrack, metadata, constraint satisfaction, concatenative synthesis

1 INTRODUCTION
State-of-the-art sample-based drum machines (or virtual drumkits) such as FXpansion’s BFD (FXpansion, 2003) or Toontrack’s Drumkit From Hell (Toontrack, 2003) offer drum programmers almost total control over the sampled sounds that are played, the microphones used, the drumkit manufacturer, and even the individual drums and cymbals being used. Like other sampled instruments, they benefit from improvements in digital storage, often offering tens of thousands of sounds from tens of different drumkits, recorded by tens of different drummers, each using several velocities for each stroke. They also ship with large libraries of MIDI-like drum patterns (or presets, or “grooves”), which can be associated with any of the many sets of sounds to give an instant, realistic drumtrack.
While the expressive power of such machines for drum programmers is unprecedented, they offer very few possibilities for interactive music systems, as proposed e.g. in Rowe (1993). First, the sounds and patterns are mostly undescribed, carrying only arbitrary editorial metadata (e.g. what are the perceptual qualities of “retrobreaks fill A”? Is it energetic? Syncopated? How does “kick-Leedy” sound compared to “kick-PearlB”?). This makes high-level mappings between a real-time musical input and the virtual drummer difficult. Second, it is difficult to add new sounds to the system in order to adapt to specific musical contexts. This typically involves buying expensive, pre-built extension packs.

[Figure 1 labels: a database of bars and a timeline t; the selected bars are marked “high energy tom-toms”, “low energy no tom-toms”, “low energy some cymbals”, “high energy tom-toms and cymbals”.]
Figure 1: The drumtrack is produced by concatenating drumbars selected in a database according to their metadata.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page.
© 2005 Queen Mary, University of London

In this work, we propose a sample-based drum machine whose output can be controlled in real-time by high-level properties such as energy, density, saliency of drums or cymbals, etc. We use audio analysis techniques inspired by MIR both to gather the sampled material, which is automatically extracted from pre-existing musical files (e.g. drum solo parts in a jazz mp3), and to index each sample with acoustic metadata automatically extracted from the signal. The typical sample used in the system is an audio extract of a few beats from a drum part, which corresponds to a musical bar and can therefore be looped while preserving a feeling of steady beat and meter.
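The idea of selecting bars by their metadata can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system's implementation: the descriptor names (`energy`, `toms`, `cymbals`), their scaling to [0, 1], the four-bar database and the squared-error mismatch are all hypothetical, and a plain best-match lookup stands in for the local-search constraint solver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bar:
    """One bar of drumming with hypothetical descriptors scaled to [0, 1]."""
    name: str
    energy: float
    toms: float
    cymbals: float


def mismatch(bar, targets):
    """Sum of squared differences between a bar and the requested properties."""
    return sum((getattr(bar, key) - value) ** 2 for key, value in targets.items())


def select_bar(database, targets):
    """Return the bar whose descriptors are closest to the targets."""
    return min(database, key=lambda bar: mismatch(bar, targets))


def make_track(database, target_sequence):
    """Pick one bar per time step and return the names in playing order."""
    return [select_bar(database, targets).name for targets in target_sequence]


# Illustrative database; the values are invented for this example.
DATABASE = [
    Bar("bar_a", energy=0.9, toms=0.8, cymbals=0.1),
    Bar("bar_b", energy=0.2, toms=0.0, cymbals=0.1),
    Bar("bar_c", energy=0.2, toms=0.1, cymbals=0.6),
    Bar("bar_d", energy=0.9, toms=0.8, cymbals=0.8),
]

# One set of requested properties per bar of output.
REQUESTS = [
    {"energy": 1.0, "toms": 1.0, "cymbals": 0.0},
    {"energy": 0.0, "toms": 0.0},
    {"energy": 0.1, "cymbals": 0.7},
    {"energy": 1.0, "toms": 1.0, "cymbals": 1.0},
]

if __name__ == "__main__":
    print(make_track(DATABASE, REQUESTS))
```

A request may name only some descriptors, in which case the others are left unconstrained. The sketch treats each bar independently; constraints that relate successive bars would need a search over sequences rather than a per-bar lookup.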
As seen in Figure 1, the drumtrack produced by the drummer is a continuous concatenation of such bars of drumming, which we call here