Automatic Long-Term Loudness and Dynamics Matching

Earl Vickers
Creative Advanced Technology Center
Scotts Valley, CA, USA
earlv@atc.creative.com

ABSTRACT

Traditional audio level control devices, such as automatic gain controls (AGCs) and compressors, generally have little or no advance knowledge of the dynamic characteristics of the remainder of the current audio program. If such advance knowledge is available (i.e., if audio files can be pre-analyzed), it becomes possible to match desired values of overall loudness and dynamics. We introduce two new measures, “long-term loudness matching level” and “dynamic spread,” and present new methods for long-term loudness and dynamics matching.

0 INTRODUCTION

Loudness is a subjective measure relating to the physical sound pressure level (SPL) as perceived by the human ear. A number of devices have been created for controlling audio levels to modify either a signal’s loudness or its dynamic change in loudness.

Automatic Gain Controls (AGCs) are typically used to minimize loudness differences between audio programs (for example, between one song and the next). Compressors are similar to AGCs but operate on a faster time scale; they are primarily intended to minimize the loudness changes within a single song or audio program [1, 2]. Compressors have a number of uses, including increasing the loudness of the softer parts of an audio program so they can be heard above the noise floor (e.g., for automotive listening), decreasing the loudness of the loudest segments (for example, to avoid disturbing neighbors during late-night listening), and keeping signal levels within technical limits required for radio broadcast.

Compressors and AGCs typically operate in real-time with little or no advance knowledge of the contents of the remainder of the current audio program.
It seems likely that if we had additional information about the dynamic characteristics of the audio program as a whole, we could do a better job of matching a desired loudness or dynamic behavior. Since music data is often stored in sound files on computer hard drives, we are in a position to generate and use loudness metadata in order to improve performance and reduce artifacts.

In this paper, we present a method for matching the loudness of an entire song or sound file to a desired level using a novel measure, “long-term loudness matching level.” In addition, we present a compressor that analyzes the dynamic characteristics of a sound file and matches the output to a desired statistical behavior, using a new measure called “dynamic spread.” Because the compressor adapts to the measured dynamics of each file, it avoids over-compressing audio that already has limited dynamics.
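
To make the idea of pre-analysis concrete, the following minimal sketch computes a single static gain for a whole file from a level measured over all of its samples. It is an illustration only: plain whole-file RMS in dB relative to full scale is used as a stand-in measure and is not the “long-term loudness matching level” introduced in this paper. The function names (`rms_dbfs`, `static_gain_db`, `apply_gain`) and the example target of -20 dB are assumptions made for the example.

```python
import math


def rms_dbfs(samples):
    """Whole-file RMS level in dB relative to full scale (1.0).

    Stand-in measure for illustration; not the paper's loudness measure.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 10.0 * math.log10(mean_square)


def static_gain_db(samples, target_dbfs):
    """Single gain (in dB) that moves the whole-file level to target_dbfs."""
    level = rms_dbfs(samples)
    if math.isinf(level):
        return 0.0  # silent file: leave it unchanged
    return target_dbfs - level


def apply_gain(samples, gain_db):
    """Scale every sample by the same linear factor."""
    scale = 10.0 ** (gain_db / 20.0)
    return [x * scale for x in samples]


# Hypothetical usage: a short square-like test signal at amplitude 0.5.
signal = [0.5, -0.5, 0.5, -0.5]
gain = static_gain_db(signal, target_dbfs=-20.0)
matched = apply_gain(signal, gain)
```

Because the gain is derived from the entire file before playback, it stays constant for the duration of the program, unlike a real-time AGC that must adjust its gain as the signal arrives.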