MPEG-4 Scalable to Lossless Audio Coding

Rongshan Yu¹, Ralf Geiger², Susanto Rahardja¹, Juergen Herre³, Xiao Lin¹, and Haibin Huang¹

¹ Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore, {rsyu,rsusanto,linxiao,hhuang}@i2r.a-star.edu.sg
² Fraunhofer IDMT, Ilmenau, Germany, ggr@idmt.fraunhofer.de
³ Fraunhofer IIS, Erlangen, Germany, hrr@iis.fraunhofer.de

ABSTRACT

As the latest extension of MPEG-4 Audio coding, MPEG-4 Lossless Audio Coding includes a scalable audio coding solution (SLS) that integrates the functionalities of lossless audio coding, perceptual audio coding, and fine granular scalable audio coding into a single coder framework while providing backward compatibility with MPEG Advanced Audio Coding (AAC) at the bit-stream level. Despite its abundant functionalities, SLS still achieves compression performance comparable to that of state-of-the-art non-scalable lossless audio coding algorithms. As a result, SLS provides a universal digital audio format for a variety of application domains, including professional audio, Internet music, consumer electronics, broadcasting, and others. This paper presents the structure of SLS and its latest developments during the MPEG standardization process.

1. INTRODUCTION

The digital audio format has now essentially superseded its analog counterpart for audio applications due to its unprecedented quality and flexibility. However, the large bit-rates of digital audio signals, e.g., 705.6 kbps per channel for a CD-quality signal sampled at 44.1 kHz with a 16-bit word length, can be a heavy burden for many applications with constrained bandwidth or storage resources. For this reason, considerable effort has been devoted to the development of audio compression algorithms. Most audio compression algorithms, such as MPEG-1 Layer III (mp3) [1] and MPEG-2/4 AAC [2], belong to the category of lossy compression, where the original audio signal is altered by the compression process.
However, perceptual coding techniques [3] are usually employed in these lossy audio compression algorithms to minimize the perceptual effects of the introduced distortion and, possibly, to achieve “transparent” audio quality, such that the distortion introduced during compression is inaudible to the human auditory system. Many perceptual audio compression algorithms can now achieve excellent compression ratios, e.g., 7:1 to 14:1, while