Image Compression - the Mechanics of the JPEG 2000

Jin Li
Microsoft Research, Signal Processing, One Microsoft Way, Bld. 113/3374, Redmond, WA 98052
Email: jinl@microsoft.com

ABSTRACT

We briefly review the mechanics of the coding engine of JPEG 2000, a state-of-the-art image compression system. The transform, entropy coding, and bitstream assembler modules are examined in detail. Our goal is to give the reader a good understanding of modern scalable media compression technologies without overwhelming them with details.

Keywords: Image compression, JPEG 2000, transform, wavelet, entropy coder, sub-bitplane entropy coder, bitstream assembler.

1. INTRODUCTION

Compression is a process that creates a compact data representation for storage and transmission purposes. Media compression usually involves special compression tools because media differs from generic data. A generic data file, such as a computer executable or a Word document, must be compressed losslessly: even a single bit error may render the data useless. In media compression, on the other hand, distortion is tolerable, because it is the content of the media that is of paramount importance, rather than its exact bits. Since the original media, whether an image, a sound clip, or a movie clip, is usually very large, it is essential to compress it at a very high compression ratio. Such a high ratio is usually achieved through two mechanisms: a) ignoring the media components that are less perceptible, and b) using entropy coding to exploit the information redundancy in the source data. Different applications have different requirements for the compression ratio and different tolerances for compression distortion. A publishing application may require a compression scheme with very little distortion, while a web application may tolerate relatively large distortion in exchange for a smaller compressed file.
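To make the entropy-coding mechanism concrete, the following sketch (our illustration, not part of the JPEG 2000 standard) uses Python's zlib, a lossless compressor, to show that repetitive data compresses far better than statistically random data:

```python
import os
import zlib

# Highly redundant source: the same 4-byte pattern repeated 1024 times.
redundant = b"ABCD" * 1024            # 4096 bytes
# Source with little statistical redundancy: random bytes.
random_like = os.urandom(4096)        # 4096 bytes

compressed_redundant = zlib.compress(redundant, 9)
compressed_random = zlib.compress(random_like, 9)

# The redundant data shrinks dramatically; the random data barely shrinks,
# since entropy coding can only remove redundancy that actually exists.
print(len(redundant), "->", len(compressed_redundant))
print(len(random_like), "->", len(compressed_random))
```

This is why lossy media coders first transform the data to expose (and discard) less perceptible components before the entropy coder is applied.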
Recently, a category of media compression algorithms termed scalable compression has emerged, offering the ability to trade compression ratio against distortion even after the compressed bitstream has been generated. In scalable compression, the media is first compressed into a master bitstream, from which a subset may be extracted to form an application bitstream with a higher compression ratio. With scalable compression, a compressed media file can be quickly tailored to applications with vastly different compression ratio and quality requirements, which is especially useful in media storage and transmission. In the remainder of the paper, we use image compression, and in particular the JPEG 2000 image compression standard, to illustrate the important mechanics of a modern scalable media compression algorithm.

The paper is organized as follows. The basic concepts of scalable image compression and its applications are discussed in Section 2. The JPEG 2000 standard and its development history are briefly reviewed in Section 3. The transform, quantization, entropy coding, and bitstream assembler modules are examined in detail in Sections 4-7. Our goal is to describe the key mechanics of the JPEG 2000 coding engine so that the reader may gain a good understanding of the standard without being overwhelmed by the details. Interested readers may refer to [1][2][3] for further details.

2. IMAGE COMPRESSION

Digital images are used every day. A digital image is essentially a 2D data array x(i,j), where i and j index the row and column of the array, and each data point x(i,j) is referred to as a pixel. For a grayscale image, each pixel holds an intensity value G. For a color image, each pixel consists of a color vector (R, G, B), representing the intensities of the red, green, and blue components, respectively.
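As a concrete sketch (ours, not from the paper), such 2D data arrays map naturally onto NumPy arrays: a grayscale image is a (rows, cols) array of intensities, while a color image carries an (R, G, B) vector per pixel:

```python
import numpy as np

# Grayscale image: a 2D array x(i, j) holding one intensity value G per pixel.
gray = np.zeros((512, 512), dtype=np.uint8)
gray[10, 20] = 128                # intensity of the pixel at row 10, column 20

# Color image: each pixel x(i, j) is a color vector (R, G, B).
color = np.zeros((512, 512, 3), dtype=np.uint8)
color[10, 20] = (255, 0, 0)       # a pure-red pixel

print(gray.shape)    # (512, 512)
print(color.shape)   # (512, 512, 3)
```

The 8-bit unsigned type reflects the common case of 256 intensity levels per component; other bit depths are possible.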
Because it is the content of the digital image that matters most, the underlying 2D data array may undergo significant changes and still convey that content to the user. An example is shown in Figure 1, where the original image Lena is shown on the left as a 512x512 2D array. Operations may be performed on the original image so that it suits a particular application. For example, when display space is tight, we may subsample the original image to a smaller image of size 128x128, as shown at the upper-right of Figure 1. Another possible operation is to extract a rectangular region of the image, from coordinate (256,256) to (384,384), as shown at the middle-right. The entire image may also be compressed into a compact bitstream representation, e.g., by JPEG, as shown at the bottom-right. In each case, the