Digital Signal Processing 20 (2010) 123–132

Wavelet-based approach for transient modeling with application to parametric audio coding

N. Ruiz Reyes a,*, P. Vera Candeas a, F. López Ferreras b

a Telecommunication Engineering Department, University of Jaén, Polytechnic School, Linares, Jaén, Spain
b Signal Theory and Communication Department, University of Alcalá, Polytechnic School, Alcalá de Henares, Madrid, Spain

Article history: Available online 5 May 2009

Keywords: Transient modeling; Matching pursuit; Wavelets; Complex exponentials; Overcomplete dictionary; Parametric audio coding

Abstract

This paper optimizes a wavelet-based dictionary for transient modeling, with application to parametric audio coding. Transient modeling is performed by matching pursuit with an overcomplete dictionary composed of orthonormal wavelet functions that implement a wavelet-packet filter bank. We seek the prototype filter length, the decomposition depth and the orthogonal wavelet family that give the best trade-off between mean squared error and computational cost. We also examine the structure of the wavelet decomposition tree by comparing the wavelet transform with the full wavelet-packet transform. Finally, the experimental results compare wavelets with exponentially damped sinusoids. The proposed transient modeling method is suitable for integration into a parametric audio coder based on the three-part model of sines, transients and noise (STN model).

© 2009 Elsevier Inc. All rights reserved.

1. Introduction

Parametric audio coding is a promising technique for the characterization, compression and modification of audio signals.
Parametric audio coders combine a signal model with a perceptual model and can achieve high audio quality at bit rates below 40 kbit/s. The parametric representation allows the decoder to perform independent pitch and time-scale modification in a straightforward manner. In contrast to waveform coders, parametric audio coders do not necessarily strive to code the waveform of the audio signal, and very little, if any, reduction in bandwidth is applied.

HILN (harmonic individual line and noise) was the first parametric audio coder accepted within the MPEG-4 standard. It operates at bit rates from 6 to 16 kbit/s mono [1]. More recently, the parametric audio coder developed by Philips Research in Eindhoven [2] was submitted in response to the MPEG call for proposals issued in 2001 [3]. This coder, referred to as PPC (Philips Parametric Coder), operates at 24 kbit/s mono and produces higher audio quality than AAC at the same bit rate [2]. In December 2003, an improved version of PPC, operating in the range of 16 to 24 kbit/s mono, was included in MPEG-4 Extension 2 [4]. It achieves high-quality audio coding at low bit rates and allows real-time pitch and time-scale modification [5,6]. This result illustrates the potential of parametric audio coding. The international standard for parametric coding of high-quality audio was finally published in July 2004 [7]. A notable drawback of most parametric audio coders is that an increase in the bit rate does not lead to a proportional increase in audio quality.

* Corresponding author.
E-mail addresses: nicolas@ujaen.es (N. Ruiz Reyes), pvera@ujaen.es (P. Vera Candeas), francisco.lopez@uah.es (F. López Ferreras).
doi:10.1016/j.dsp.2009.04.011

Fully parametric audio coders usually decompose the audio signal into three components: sines, transients and noise (STN model-based audio coders) [2,8–11]. The sinusoidal component models the tonal, quasi-stationary elements, also called