Wavelets and audio data compression

Tibor Asztalos 1, Isar Alexandru 2

1,2 “Politehnica” University of Timisoara, Faculty of Electronics and Telecommunications, Bd. V. Parvan no. 2, 1900, Timisoara, Romania, tel./fax. 056-190608, e-mail: 1 asztalos@ee.utt.ro, 2 isar@ee.utt.ro

Abstract - In this work we present an audio data compression software package based on a Discrete Wavelet Transform compression procedure. The wavelet function can be chosen from the well-known Daubechies class of compactly supported wavelets. Our implementation computes the wavelet-domain threshold value adaptively. The package has two major components: the compression software, which reads a standard audio data file, and the reconstruction software, which reads a compressed file and generates a standard audio file. Experimental results are also presented.

I. DATA COMPRESSION WITH WAVELETS

Data compression is very widely used for data storage. There is a large variety of compression algorithms, many of them standardised, each with its own advantages and drawbacks. They offer speed, high compression ratio, portability etc., but no single algorithm best fits all kinds of applications. They are classified in two major categories: the first contains the algorithms for compression without loss, and the second contains those for compression with loss. All the algorithms have the same purpose, to reduce the information redundancy of the considered data set, but those from the first class allow a perfect reconstruction while those from the second class do not. The procedures which allow small information losses can achieve higher compression ratios, but they cannot be used in applications where the integrity of the data set is essential (file compression, for example). This second class includes various voice and image compression algorithms.
In this work we present a complete audio compression/decompression software package, which implements an algorithm from this second class and exploits a very popular orthogonal transform, the Discrete Wavelet Transform (DWT), [1], [2]. Wavelet transforms and their properties have been studied extensively by the authors in the context of various data compression and signal-to-noise ratio (SNR) enhancement scenarios, [3], [4], [5], [6], [7]. Our software compresses the Nyquist-sampled music or speech signal by using this orthogonal transform, which has a strong decorrelation effect on the signal samples. As a result, the transform domain contains many zeroes or very small values, which can be neglected, achieving the desired compression. The DWT was chosen for its speed (the fast DWT needs O(N) operations for N samples, against O(N log N) for the FFT) and for its whitening properties, which are superior to the decorrelation properties of other well-known transforms, [1], [2] (for example the Discrete Cosine Transform - DCT, which was included in a series of compression standards).

II. THE DATA COMPRESSION SYSTEM

The principle of our implementation is presented in figure 1.

Figure 1. The wavelet compression system

The input sequence consists of a series of PCM coded speech signal samples. These samples are encoded in either an 8 or a 16 bit/sample format. In practice our software accepts as input a standard audio data file in the well-known wave format (*.wav files), which contains either mono or stereo sound in the above mentioned PCM format. The first block in figure 1 performs a dynamic range compression of the speech signal.
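The transform-and-neglect idea from Section I can be illustrated with a minimal sketch. It assumes the Haar wavelet (the shortest member of the Daubechies family), a signal whose length is a power of two, and a fixed hard threshold; it does not reproduce the adaptive threshold rule of the package. The function names `haar_dwt`, `haar_idwt` and `hard_threshold` are hypothetical and chosen only for this illustration.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # normalisation that keeps the transform orthonormal


def haar_dwt(signal):
    """Full-depth orthonormal Haar DWT (illustrative, not the paper's code).

    Output layout: [final approximation, coarsest detail, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    coeffs = [float(v) for v in signal]
    length = n
    while length > 1:
        half = length // 2
        # Pairwise sums give the approximation, pairwise differences the detail.
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) * _S for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) * _S for i in range(half)]
        coeffs[:length] = approx + detail
        length = half
    return coeffs


def haar_idwt(coeffs):
    """Inverse of haar_dwt: rebuilds the samples level by level."""
    out = list(coeffs)
    n = len(out)
    length = 2
    while length <= n:
        half = length // 2
        rebuilt = []
        for a, d in zip(out[:half], out[half:length]):
            rebuilt.append((a + d) * _S)
            rebuilt.append((a - d) * _S)
        out[:length] = rebuilt
        length *= 2
    return out


def hard_threshold(coeffs, threshold):
    """Zero every coefficient whose magnitude is below a fixed threshold.

    Assumption: the fixed threshold is a stand-in for the adaptive rule.
    """
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


if __name__ == "__main__":
    samples = [4.1, 3.9, 4.0, 4.0, 8.0, 8.1, 7.9, 8.0]
    kept = hard_threshold(haar_dwt(samples), 0.5)
    print("non-zero coefficients:", sum(1 for c in kept if c != 0.0))
    print("reconstruction:", [round(v, 3) for v in haar_idwt(kept)])
```

Because the transform is orthonormal, the energy of the reconstruction error equals the energy of the discarded coefficients, so the threshold directly trades compression ratio against distortion.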
[Figure 1 signal and block labels: λ, x[n], y[n], x̂[n], ŷ[n], u[n]; blocks log2(·) and DWT]

The input-output relation of the dynamic compression block is:

$$\hat{x}[n] = 2^{15} \cdot \left( \log_2 x[n] - 15 \right) \qquad (1)$$

where we exploit the facts that the samples x[n] are in PCM format, with the first bit acting as the sign bit (set to 1 if the sample value is positive and to 0 if it is not), and that these values are distributed around the value zero, which is represented in this format by the hexadecimal number