A Hardware Implementation of Real-Time Video
Deblocking Using Shifted Thresholding
Martin Hansen
Department of Electrical and
Computer Engineering
University of Waterloo
Waterloo, Ontario, Canada
Email: mdbhanse@uwaterloo.ca
Alexander Wong
Department of Electrical and
Computer Engineering
University of Waterloo
Waterloo, Ontario, Canada
Email: a28wong@uwaterloo.ca
William Bishop
Department of Electrical and
Computer Engineering
University of Waterloo
Waterloo, Ontario, Canada
Email: wdbishop@uwaterloo.ca
Abstract— Video compression has become very important as
demand has increased for the storage and transmission of
digital video content. Popular video compression schemes like
MPEG encoding make use of block-transform coding techniques
which are susceptible to blocking artifacts. Recently, an efficient
deblocking algorithm based on the concept of shifted thresholding
has been proposed. This algorithm uses only integer arithmetic
and replaces division operations with bit shifting. This paper
proposes a new hardware architecture for the implementation of
video deblocking using shifted thresholding. A prototype system
for high-performance video deblocking on a field-programmable
gate array (FPGA) board is described. The prototype
system leverages the reduced hardware complexity of the shifted
thresholding algorithm to implement video deblocking
cost-effectively on an FPGA board.
I. INTRODUCTION
With the ever-increasing need for efficient storage and
transmission of digital video content, video compression has
become an active area of research. Video compression is
essential for applications ranging from high definition video
broadcasting to the wireless transmission of video content to
portable entertainment systems. Popular video compression
schemes such as MPEG [1] and recent video compression
schemes such as H.264/AVC [2] make use of block-transform
coding, where blocks of pixels are processed independently to
reduce computational and storage requirements.
A significant drawback of block-transform video coding
is that blocking artifacts are introduced at block boundaries.
These artifacts noticeably degrade video quality, particularly if
the video content is compressed at a high compression ratio. To
improve video quality, a process known as video deblocking
is used to reduce the impact of blocking artifacts.
A large number of video and image deblocking methods
have been introduced. These methods have been categorized
[3] as follows:
1) Projections onto convex sets (POCS) methods,
2) Spatial block boundary filtering methods,
3) Wavelet filtering methods,
4) Statistical modeling methods,
5) Constrained optimization methods, and
6) Shifted transform methods.
Traditionally, methods based on spatial block boundary
filtering have been used in real-time video decoding due to
their low computational complexity. However, interest in
shifted transform methods has grown recently because
such methods typically deliver improved deblocking quality.
The computational complexity of shifted transform
methods has traditionally been very high, however, due to the need for
a large number of floating-point calculations.
Recently, an efficient deblocking algorithm based on the
concept of shifted thresholding was proposed [3]. This algorithm
uses only a fraction of the computations required by
traditional shifted transform methods. Furthermore, it requires
only integer computations and uses bit shifting to replace
division operations. Despite these simplifications, the algorithm
still achieves image quality that is competitive with other
methods in its class. The algorithm is also well suited to
implementation in inexpensive hardware.
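As a rough illustration of the division-free integer arithmetic described above, the following Python sketch averages a power-of-two number of integer samples with a right shift in place of a division. The function name average_pow2 and the choice of an averaging step are assumptions made for illustration only; the algorithm in [3] may combine its values differently.

```python
def average_pow2(values):
    """Integer mean of a sequence whose length is a power of two.

    Hypothetical helper: the division by len(values) is replaced
    by a right shift, which is what makes the operation cheap in hardware.
    """
    n = len(values)
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    shift = n.bit_length() - 1  # log2(n) for a power of two
    # Shifting right by log2(n) is the same as floor division by n.
    return sum(values) >> shift


if __name__ == "__main__":
    samples = [10, 20, 30, 40]
    print(average_pow2(samples))
```

In hardware, a shift by a constant amount is just a rewiring of bits, whereas a general divider needs considerably more logic. That difference is the motivation for restricting divisors to powers of two.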
This paper presents an efficient hardware architecture for
video deblocking using shifted thresholding. The proposed
architecture is described in detail in Section
II. The hardware complexity of the proposed architecture is
analyzed in Section III. A prototype design is presented in
Section IV along with experimental results. Conclusions are
drawn in Section V.
II. PROPOSED ARCHITECTURE
The proposed hardware architecture implements a shifted
thresholding algorithm for video deblocking [3] on an Altera
DE2 board [4]. The shifted thresholding algorithm transforms
an initial decompressed image into a deblocked, decompressed
image. The algorithm performs six distinct operations, as illustrated
in Fig. 1. The proposed hardware architecture assumes
greyscale bitmap images of dimensions 640×480, with each
greyscale pixel represented by 8 bits. However, the architecture
could be easily modified to support larger images and larger
colour representations.
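The assumed image format fixes the frame storage requirement. Together with the 8×8 block size used by the first pipeline stage, it also fixes the number of blocks per frame. The Python sketch below works through that arithmetic. The helper names and the raster-order block numbering are assumptions for illustration, not details taken from the architecture.

```python
WIDTH, HEIGHT = 640, 480   # image dimensions assumed by the architecture
BITS_PER_PIXEL = 8         # 8-bit greyscale
BLOCK = 8                  # 8x8 pixel blocks loaded by the first stage


def frame_bytes(width=WIDTH, height=HEIGHT, bpp=BITS_PER_PIXEL):
    """Bytes needed to store one frame."""
    return width * height * bpp // 8


def block_grid(width=WIDTH, height=HEIGHT, block=BLOCK):
    """Number of blocks per row and per column."""
    if width % block or height % block:
        raise ValueError("image dimensions must be multiples of the block size")
    return width // block, height // block


def block_origin(index, width=WIDTH, block=BLOCK):
    """Top-left pixel (x, y) of a block, assuming raster-order numbering."""
    cols = width // block
    return (index % cols) * block, (index // cols) * block


if __name__ == "__main__":
    cols, rows = block_grid()
    print(frame_bytes(), cols, rows, cols * rows)
```

With these figures, one frame occupies 307,200 bytes and divides evenly into 80 × 60 = 4,800 blocks of 64 bytes each.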
A six-stage pipeline architecture was chosen for the hardware
design. In the interest of hardware
optimization, the pipeline stages deviate slightly from the
six distinct operations described previously. The first pipeline
stage loads image data from memory in blocks of 8×8 pixels,
0840-7789/07/$25.00 ©2007 IEEE