106 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 12, NO. 2, FEBRUARY 2002

Region-of-Interest Coding Based on Set Partitioning in Hierarchical Trees

Keun-hyeong Park and HyunWook Park, Senior Member, IEEE

Abstract—In many image-coding applications such as web browsing, image databases, and telemedicine, it is useful to reconstruct only a region of interest (ROI) before the rest of the image is reconstructed. In this paper, an ROI coding functionality is incorporated into the set partitioning in hierarchical trees (SPIHT) algorithm for wavelet-based image coding. By placing a higher emphasis on the transform coefficients pertaining to the ROI, the ROI is coded with higher fidelity than the rest of the image in the earlier stages of progressive coding. The general thrust of this research is to identify the coefficients in the wavelet-transform domain that the decoder needs to reconstruct the desired region. This new method provides better performance than the previously presented methods.

Index Terms—Parent of ROI (PROI), progressive transmission, region of interest (ROI) coding, set partitioning in hierarchical trees (SPIHT), wavelet transform.

I. INTRODUCTION

The set partitioning in hierarchical trees (SPIHT) algorithm [1] achieves excellent rate-distortion performance and retains an attractive embedded-code property that is useful for progressive transmission. Good rate-distortion performance is essential for a coder. In addition, many other requirements have become important in still-image compression. Examples of such requirements are the ability to provide lossy and lossless compression within a single encoding system, the ability to provide scalability of fidelity and resolution, and the ability to give higher priority to a region of interest (ROI) [2].
It would be desirable to incorporate the above features into an image coding system without incurring heavy costs such as increased computational complexity or reduced rate-distortion performance.

Like the embedded zero-tree wavelet (EZW) coder [3], SPIHT generally operates on an entire image at once. To improve rate-distortion performance, SPIHT captures large sets of insignificant coefficients within data structures called spatial orientation trees [1]. Coding these sets achieves greater rate-distortion efficiency than coding individual coefficients. However, the spatial orientation trees make it difficult to incorporate ROI coding functionality into the SPIHT algorithm. Reference [4] modified SPIHT to quantize and code only the wavelet coefficients in an arbitrary ROI. That method is optimized for wavelet coding of an arbitrarily shaped region, not for coding the entire image with higher fidelity for the ROI. Reference [5] proposed a method that codes the entire image with an emphasis on the ROI, but its rate-distortion performance is not competitive with the original SPIHT algorithm.

Manuscript received February 27, 2001; revised November 26, 2001. This paper was recommended by Associate Editor Z. Xiong.
The authors are with the Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 305-701, Korea (e-mail: hwpark@athena.kaist.ac.kr).
Publisher Item Identifier S 1051-8215(02)02016-5.

Fig. 1. Generation of ROI mask from an ROI.

In this work, we incorporate an ROI coding functionality into the SPIHT algorithm without compromising other desirable features such as rate-distortion performance and computation time. The proposed ROI coding method identifies the data that the decoder needs to reconstruct the desired region without any bit-stream overhead. This paper is organized as follows. Section II presents a new ROI coding method.
Section III shows experimental results, comparing the proposed method with the previous work and the original SPIHT. Finally, conclusions are given in Section IV.

1051–8215/02$17.00 © 2002 IEEE

II. ROI CODING

A. Generation of ROI Mask

When an image is coded with an emphasis on the ROI, it is necessary to identify the wavelet coefficients needed to reconstruct the ROI. Thus, the ROI mask is introduced to indicate which wavelet coefficients have to be transmitted exactly in order for the receiver to reconstruct the ROI [6].

Once an arbitrarily shaped ROI is defined by the user, the ROI mask is generated for the rows and columns at each decomposition level. The process is then repeated for the remaining levels until the entire wavelet tree is processed, as shown in Fig. 1. The wavelet coefficients required to reconstruct a pixel are selected depending on the length of the wavelet filter. For example, let the original samples be denoted and the samples belonging to the low- and high-frequency subbands be denoted and , respectively. Then, for the