106 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 12, NO. 2, FEBRUARY 2002
Region-of-Interest Coding Based on Set Partitioning
in Hierarchical Trees
Keun-hyeong Park and HyunWook Park, Senior Member, IEEE
Abstract—In many image-coding applications such as web
browsing, image databases, and telemedicine, it is useful to
reconstruct only a region of interest (ROI) before the rest of the
image is reconstructed. In this paper, an ROI coding functionality
is incorporated into the set partitioning in hierarchical trees
(SPIHT) algorithm for wavelet-based image coding. By placing
a higher emphasis on the transform coefficients pertaining to
the ROI, the ROI is coded with higher fidelity than the rest of
the image in earlier stages of progressive coding. The general
thrust of this research is to identify the coefficients in the
wavelet-transform domain that the decoder needs to reconstruct
the desired region. This new method provides better performance
than the previously presented methods.
Index Terms—Parent of ROI (PROI), progressive transmission,
region of interest (ROI) coding, set partitioning in hierarchical
trees (SPIHT), wavelet transform.
I. INTRODUCTION
The set partitioning in hierarchical trees (SPIHT) algorithm
[1] achieves excellent rate-distortion performance and
retains an attractive embedded-code property that is useful for
progressive transmission. Good rate-distortion performance is
essential for a coder, but several other requirements have also
become important in still-image compression. Examples are the
ability to provide lossy and lossless compression within a single
encoding system, the ability to provide scalability of fidelity
and resolution, and the ability to give higher priority to a
region of interest (ROI) [2]. It would be desirable to
incorporate these features into an image coding system
without incurring heavy costs such as increased computational
complexity or reduced rate-distortion performance.
Like the embedded zero-tree wavelet (EZW) coder [3], SPIHT
generally operates on an entire image at once. SPIHT captures
large sets of insignificant coefficients within data structures
called spatial orientation trees [1]. Coding these sets achieves
greater rate-distortion efficiency than coding individual
coefficients. However, the spatial orientation trees make it
difficult to incorporate ROI coding functionality into the SPIHT
algorithm. Reference [4] modified SPIHT to quantize and
code only the wavelet coefficients in an arbitrary ROI. This
method is optimized for wavelet coding of an arbitrarily shaped
region, not for coding the entire image with higher fidelity for the
ROI. Reference [5] proposed a method that codes the entire
image with an emphasis on the ROI, but its rate-distortion
performance is not competitive with the original SPIHT
algorithm.
Manuscript received February 27, 2001; revised November 26, 2001. This
paper was recommended by Associate Editor Z. Xiong.
The authors are with the Department of Electrical Engineering, Korea
Advanced Institute of Science and Technology (KAIST), Daejeon 305-701, Korea
(e-mail: hwpark@athena.kaist.ac.kr).
Publisher Item Identifier S 1051-8215(02)02016-5.
Fig. 1. Generation of ROI mask from an ROI.
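As a rough illustration of the spatial orientation trees referred to in this section, the sketch below lists the children and descendants of a coefficient under the usual dyadic parent-child rule, in which (i, j) has children (2i, 2j), (2i, 2j+1), (2i+1, 2j), and (2i+1, 2j+1). The function names, the square `size` x `size` coefficient array, and the `ll_size` parameter are assumptions made for this example. The special rule that SPIHT applies to coefficients in the coarsest LL band is deliberately not modelled.

```python
def offspring(i, j, size, ll_size):
    """Children of coefficient (i, j) in a size x size wavelet pyramid.

    Assumes (i, j) lies outside the coarsest ll_size x ll_size LL band;
    the special root rule of SPIHT is not modelled in this sketch.
    """
    if i < ll_size and j < ll_size:
        raise ValueError("root (LL) coefficients follow a different rule")
    if 2 * i >= size or 2 * j >= size:
        return []  # finest-level coefficient: no children
    return [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]


def descendants(i, j, size, ll_size):
    """All descendants of (i, j): the set of coefficients that a tree-based
    coder can declare insignificant together."""
    out = []
    stack = offspring(i, j, size, ll_size)
    while stack:
        node = stack.pop()
        out.append(node)
        stack.extend(offspring(node[0], node[1], size, ll_size))
    return out
```

For an 8 x 8 array with a 1 x 1 LL band (three decomposition levels), the coefficient (0, 1) has four children and sixteen grandchildren. A single tree therefore spans coefficients from many spatial positions, which hints at why singling out ROI coefficients inside such trees is not straightforward.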
In this work, we incorporate an ROI coding functionality into
the SPIHT algorithm without compromising other desirable
features such as rate-distortion performance and computation time.
The data that the decoder needs to reconstruct the desired region
are identified by the proposed ROI coding method without
any bit-stream overhead.
This paper is organized as follows. Section II presents a new
ROI coding method. Section III shows experimental results,
comparing the proposed method with the previous work and the
original SPIHT. Finally, conclusions are given in Section IV.
II. ROI CODING
A. Generation of ROI Mask
When an image is coded with an emphasis on the ROI, it is
necessary to identify the wavelet coefficients needed for the
reconstruction of the ROI. Thus, the ROI mask is introduced to
indicate which wavelet coefficients have to be transmitted exactly
in order for the receiver to reconstruct the ROI [6].
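The idea of an ROI mask can be sketched in one dimension as follows. This is a minimal illustration and not the authors' implementation. It assumes the reversible 5/3 lifting filter, which this section does not name, an even signal length, and notation chosen here: X(k) for the original samples, L(n) and H(n) for the low- and high-frequency subband samples. Under that assumed filter, X(2n) depends on L(n), H(n-1), and H(n), while X(2n+1) depends on L(n), L(n+1), H(n-1), H(n), and H(n+1). Out-of-range indices are simply skipped, which simplifies boundary handling.

```python
# Offsets (relative to n) of the subband samples needed to rebuild
# X(2n) and X(2n+1) under the ASSUMED 5/3 lifting filter.
DEPS = {
    0: {"L": (0,), "H": (-1, 0)},        # even sample X(2n)
    1: {"L": (0, 1), "H": (-1, 0, 1)},   # odd sample X(2n+1)
}


def roi_mask_1d(roi):
    """Map a boolean ROI mask over 2N samples to (low, high) masks of length N."""
    half = len(roi) // 2
    low = [False] * half
    high = [False] * half
    for k, inside in enumerate(roi):
        if not inside:
            continue
        n, parity = divmod(k, 2)
        for band, mask in (("L", low), ("H", high)):
            for d in DEPS[parity][band]:
                if 0 <= n + d < half:  # skip out-of-range indices
                    mask[n + d] = True
    return low, high


def roi_mask_levels(roi, levels):
    """Repeat the one-level step on the low band for each decomposition level.

    Returns the final low-band mask and the high-band masks, finest first.
    """
    low = list(roi)
    highs = []
    for _ in range(levels):
        low, high = roi_mask_1d(low)
        highs.append(high)
    return low, highs
```

For an image, the same step would be applied along rows and then along columns at each level. A longer filter enlarges the dependency offsets and therefore the mask.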
Once an arbitrarily shaped ROI is defined by the user, the ROI
mask is generated for rows and columns at
each decomposition level. The process is then repeated for the
remaining levels until the entire wavelet tree is processed, as
shown in Fig. 1. The wavelet coefficients that are required to
reconstruct a pixel are selected according to the wavelet
length. For example, let the original samples be denoted
and the samples belonging to the low- and high-frequency sub-
bands be denoted and , respectively. Then, for the
1051–8215/02$17.00 © 2002 IEEE