IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 16, NO. 5, MAY 2007 1395
Pointwise Shape-Adaptive DCT for High-Quality
Denoising and Deblocking of Grayscale
and Color Images
Alessandro Foi, Vladimir Katkovnik, and Karen Egiazarian, Senior Member, IEEE
Abstract—The shape-adaptive discrete cosine transform
(SA-DCT) can be computed on a support of arbitrary
shape, but retains a computational complexity comparable to
that of the usual separable block-DCT (B-DCT). Despite the
near-optimal decorrelation and energy compaction properties,
application of the SA-DCT has been rather limited, targeted
nearly exclusively to video compression. In this paper, we present
a novel approach to image filtering based on the SA-DCT. We
use the SA-DCT in conjunction with the Anisotropic Local Poly-
nomial Approximation—Intersection of Confidence Intervals
technique, which defines the shape of the transform’s support
in a pointwise adaptive manner. The thresholded or attenuated
SA-DCT coefficients are used to reconstruct a local estimate of
the signal within the adaptive-shape support. Since supports
corresponding to different points are in general overlapping, the
local estimates are averaged together using adaptive weights that
depend on the region’s statistics. This approach can be used for
various image-processing tasks. In this paper, we consider, in
particular, image denoising and image deblocking and deringing
from block-DCT compression. A special structural constraint
in luminance-chrominance space is also proposed to enable an
accurate filtering of color images. Simulation experiments show
a state-of-the-art quality of the final estimate, both in terms of
objective criteria and visual appearance. Thanks to the adaptive
support, reconstructed edges are clean, and no unpleasant ringing
artifacts are introduced by the fitted transform.
Index Terms—Anisotropic, deblocking, denoising, deringing,
discrete cosine transform (DCT), shape adaptive.
I. INTRODUCTION
The 2-D separable block discrete cosine transform
(B-DCT), computed on a square or rectangular support,
is a well-established and very efficient transform for
achieving a sparse representation of image blocks. For natural
images, its decorrelating performance is close to that of the
optimum Karhunen–Loève transform. Thus, the B-DCT has
been successfully used as the key element in many compression
and denoising applications. However, in the presence of
singularities or edges, such near-optimality fails. Because of the lack
of sparsity, edges cannot be coded or restored effectively, and
ringing artifacts arising from the Gibbs phenomenon become
visible.
Manuscript received April 24, 2006; revised November 27, 2006. This work
was supported in part by the Academy of Finland, Project 213462 (Finnish
Centre of Excellence program 2006–2011). This paper is based on and extends
the authors’ preliminary conference publications [12]–[15]. The associate
editor coordinating the review of this manuscript and approving it for publication
was Onur G. Guleryuz.
The authors are with the Institute of Signal Processing, Tampere University
of Technology, 33101 Tampere, Finland (e-mail: alessandro.foi@tut.fi).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIP.2007.891788
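As a minimal numerical illustration of the loss of sparsity in the presence of an edge, the following sketch counts how many B-DCT coefficients are needed to capture a given fraction of the energy of an 8 × 8 block. The helper names `dct_matrix` and `coeffs_for_energy`, the orthonormal DCT-II normalization, the 99% energy threshold, and the two synthetic test blocks are all assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th 1-D basis vector."""
    k = np.arange(n)[:, None]          # frequency index
    i = np.arange(n)[None, :]          # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)         # the DC row has its own normalization
    return c

def coeffs_for_energy(block, fraction=0.99):
    """Smallest number of B-DCT coefficients holding `fraction` of the energy."""
    block = np.asarray(block, dtype=float)
    # Separable 2-D DCT: transform the columns, then the rows.
    coef = dct_matrix(block.shape[0]) @ block @ dct_matrix(block.shape[1]).T
    energy = np.sort(coef.ravel() ** 2)[::-1]        # largest first
    cumulative = np.cumsum(energy) / energy.sum()
    return int(np.searchsorted(cumulative, fraction)) + 1

# Two hypothetical 8x8 test blocks.
i, j = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
smooth = (i + j).astype(float)         # linear ramp, no discontinuity
edge = (j > i).astype(float)           # diagonal step edge

print("smooth block:", coeffs_for_energy(smooth))
print("edge block:  ", coeffs_for_energy(edge))
```

The block containing the diagonal edge requires more coefficients than the smooth ramp to reach the same energy fraction, which is the lack of sparsity referred to in the text.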
In the last decade, significant research has been devoted to
the development of region-oriented, or shape-adaptive, trans-
forms. The main intention is to construct a system (frame, basis,
etc.) that can efficiently be used for the analysis and synthesis of
arbitrarily shaped image segments, where the data exhibit some
uniform behavior.
Initially, Gilge [19], [20] considered the orthonormalization
of a (fixed) set of generators restricted to the arbitrarily shaped
region of interest. These generators could be a basis of polyno-
mials or, for example, a B-DCT basis, thus yielding a “shape-
adapted” DCT transform. Orthonormalization can be performed
by the standard Gram–Schmidt procedure, and the resulting
orthonormal basis is supported on the region. Because the
region-adapted basis must be recalculated for each differently
shaped region and because the basis elements are typically
nonseparable, the overall method has a rather high computational
cost. Although it is still considered one of the best
solutions to the region-oriented transform problem, Gilge’s
approach is clearly unsuitable for real-time applications, and faster
transforms were sought.
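The orthonormalization of generators restricted to a region can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of [19], [20]: the generators are taken to be the 2-D B-DCT basis functions, they are visited in order of increasing total frequency, and a modified Gram–Schmidt procedure with a second projection pass is used in place of the standard one for numerical safety. The names `dct_matrix` and `region_basis` and the tolerance `tol` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th 1-D basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def region_basis(mask, tol=1e-10):
    """Orthonormal basis on the pixels of `mask`, one basis vector per row."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = mask.shape
    cr, cc = dct_matrix(rows), dct_matrix(cols)
    n_pixels = int(mask.sum())
    # Assumed generator order: increasing total frequency u + v.
    order = sorted(((u, v) for u in range(rows) for v in range(cols)),
                   key=lambda p: (p[0] + p[1], p[0]))
    basis = []
    for u, v in order:
        if len(basis) == n_pixels:
            break                              # the region is fully spanned
        g = np.outer(cr[u], cc[v])[mask]       # restrict generator to region
        for _ in range(2):                     # second pass re-orthogonalizes
            for b in basis:
                g = g - (b @ g) * b            # remove component along b
        norm = np.linalg.norm(g)
        if norm > tol:                         # skip dependent generators
            basis.append(g / norm)
    return np.array(basis)

# Hypothetical usage on a triangular region of an 8x8 block (36 pixels).
mask = np.tril(np.ones((8, 8), dtype=bool))
basis = region_basis(mask)
x = np.random.default_rng(0).standard_normal(int(mask.sum()))
coef = basis @ x          # analysis
x_rec = basis.T @ coef    # synthesis
```

Every generator is projected against all basis vectors found so far, and the whole basis has to be rebuilt whenever the mask changes. This is the source of the high computational cost noted in the text.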
A more computationally attractive approach, namely the
shape-adaptive DCT (SA-DCT), has been proposed by Sikora
et al. [47], [49]. The SA-DCT is computed by cascaded ap-
plication of 1-D varying-length DCT transforms first on the
columns and then on the rows that constitute the considered
region, as shown in Fig. 1. Thus, the SA-DCT does not require
costly matrix inversions or iterative orthogonalizations and can
be interpreted as a direct generalization of the classical 2-D
B-DCT transform. In particular, the SA-DCT and the B-DCT
(which is separable) have the same computational complexity
and in the special case of a square the two transforms exactly
coincide. Therefore, the SA-DCT has received considerable
interest from the MPEG community, eventually becoming part
of the MPEG-4 standard [32], [36]. The recent availability of
low-power SA-DCT hardware platforms (e.g., [5], [30], [31])
makes this transform an appealing choice for many image- and
video-processing tasks.
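The cascaded column-then-row structure described above can be sketched as follows. This is a minimal forward transform written under assumptions: an orthonormal DCT-II is used for every 1-D transform (the normalization used in [47], [49] and in the MPEG-4 standard may differ), the samples are shifted to the top and then to the left before each pass, and the names `dct_matrix` and `sa_dct` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th 1-D basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def sa_dct(block, mask):
    """Forward SA-DCT sketch; returns the coefficients and their support."""
    block = np.asarray(block, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    rows, cols = mask.shape

    # Pass 1: shift each column's region pixels to the top and apply a
    # 1-D DCT whose length equals the number of pixels in that column.
    tmp = np.zeros((rows, cols))
    tmp_mask = np.zeros((rows, cols), dtype=bool)
    for j in range(cols):
        v = block[mask[:, j], j]
        n = v.size
        if n:
            tmp[:n, j] = dct_matrix(n) @ v
            tmp_mask[:n, j] = True

    # Pass 2: shift each row's intermediate coefficients to the left and
    # apply a 1-D DCT of the corresponding length.
    coef = np.zeros((rows, cols))
    coef_mask = np.zeros((rows, cols), dtype=bool)
    for i in range(rows):
        v = tmp[i, tmp_mask[i, :]]
        n = v.size
        if n:
            coef[i, :n] = dct_matrix(n) @ v
            coef_mask[i, :n] = True
    return coef, coef_mask

# Hypothetical usage on a triangular region of a random 8x8 block.
rng = np.random.default_rng(0)
block = rng.standard_normal((8, 8))
mask = np.tril(np.ones((8, 8), dtype=bool))
coef, coef_mask = sa_dct(block, mask)
```

Only 1-D transforms of varying length are involved, so no matrix inversion or iterative orthogonalization is needed. With a full square mask, both passes reduce to the ordinary separable B-DCT.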
The SA-DCT has been shown [4], [27], [47], [48] to provide
a compression efficiency comparable to that of more
computationally complex transforms, such as [20]. The good
decorrelation and energy compaction properties on which this
efficiency depends are also the primary characteristics sought
for any transform-domain denoising algorithm. In this sense,
the SA-DCT features a remarkable potential not only for
1057-7149/$25.00 © 2007 IEEE