1240 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 13, NO. 6, DECEMBER 2011
On Complexity Modeling of H.264/AVC
Video Decoding and Its Application
for Energy Efficient Decoding
Zhan Ma, Student Member, IEEE, Hao Hu, Student Member, IEEE, and Yao Wang, Fellow, IEEE
Abstract—This paper proposes a new complexity model for
H.264/AVC video decoding. The model is derived by decomposing
the entire decoder into several decoding modules (DM), and
identifying the fundamental operation unit (termed complexity
unit or CU) in each DM. The complexity of each DM is modeled by
the product of the average complexity of one CU and the number
of CUs required. The model is shown to be highly accurate for
software video decoding on both Intel Pentium mobile 1.6-GHz
and ARM Cortex-A8 600-MHz processors, over a variety of
video contents at different spatial and temporal resolutions and
bit rates. We further show how to use this model to predict the
required clock frequency and hence perform dynamic voltage
and frequency scaling (DVFS) for energy efficient video decoding.
We evaluate achievable power savings on both the Intel and
ARM platforms, by using analytical power models for these two
platforms as well as real experiments with the ARM-based TI
OMAP35x EVM board. Our study shows that for the Intel
platform, where the dynamic power dominates, a power saving factor
of 3.7 is possible. For the ARM processor where the static leakage
power is not negligible, a saving factor of 2.22 is still achievable.
Index Terms—Complexity modeling and prediction, dynamic
voltage and frequency scaling (DVFS), H.264/AVC video decoding.
I. INTRODUCTION
THE smartphone market has expanded rapidly in recent
years. People desire a multi-purpose
handheld device that not only supports voice communication
and text messaging, but also provides video streaming,
multimedia entertainment, etc. A crucial problem with a handheld
device that enables video playback is how to provide a
sufficiently long battery life, given the large amount of energy
required by video decoding and rendering. It is thus very
useful to have an in-depth understanding of the power
consumed by video decoding, which can be used to make
decisions in advance based on the remaining battery
Manuscript received April 11, 2011; revised June 21, 2011; accepted
August 04, 2011. Date of publication August 15, 2011; date of current version
November 18, 2011. The associate editor coordinating the review of this
manuscript and approving it for publication was Dr. Yen-Kuang Chen.
Z. Ma was with the Polytechnic Institute of New York University, Brooklyn,
NY 11201 USA, and is now with the Dallas Technology Lab, Samsung
Telecommunications America, Richardson, TX 75082 USA (e-mail:
zhan.ma@ieee.org; zhan.ma@gmail.com).
H. Hu and Y. Wang are with the Department of Electrical and Computer
Engineering, Polytechnic Institute of New York University, Brooklyn,
NY 11201 USA (e-mail: hhu01@students.poly.edu; hoohawk@gmail.com;
yao@poly.edu).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMM.2011.2165056
capacity, e.g., discarding unnecessary video packets without
decoding, or decoding at appropriate spatial, temporal, and
amplitude resolutions to yield the best perceptual quality. In
devices using dynamic voltage and frequency scaling (DVFS),
being able to accurately predict the complexity of successive
decoding intervals is critical to reduce the power consumption
[1].
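To make the DVFS connection concrete: if the complexity (in clock cycles) of the next decoding interval can be predicted, the minimum sufficient clock frequency is simply that cycle count divided by the interval duration. A minimal sketch of this idea (the function name and the numbers below are illustrative assumptions, not values from [1]):

```python
def required_frequency_hz(predicted_cycles, interval_s):
    """Lowest clock frequency at which the predicted decoding workload
    still finishes within the interval; a DVFS scheme would scale down
    to (roughly) this frequency instead of running at the maximum clock."""
    return predicted_cycles / interval_s

# e.g., 1.2e7 cycles predicted for one frame interval at 30 fps:
f_min = required_frequency_hz(1.2e7, 1 / 30)  # 3.6e8 Hz, i.e., 360 MHz
```

Running below the maximum frequency also permits a lower supply voltage, which is where the bulk of the dynamic power saving comes from.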
Generally, there are two sources of energy dissipation during
video decoding [2]: memory access and CPU computation. In
this paper, we
will focus on computational complexity modeling of
H.264/AVC video decoding and defer the investigation of
off-chip memory access complexity to a future study.¹ Specifically,
we extend our prior work [3] beyond the entropy decoding
complexity and consider all modules involved in H.264/AVC
video decoding, including entropy decoding, side information
preparation, dequantization and inverse transform, intra
prediction, motion compensation, and deblocking. First of all, we
define each module as a decoding module (DM), and denote
its complexity (in terms of clock cycles) over a chosen time
interval by $C_{DM}$. The proposed model is applicable to any time
interval, but the following discussion will assume the interval is
one video frame. Furthermore, we abstract the basic, common
operations needed by each DM as its complexity unit (CU), so
that $C_{DM}$ is the product of the average complexity of one CU
over one frame (i.e., $\bar{c}_{CU}$) and the number of CUs required by
this DM over this frame (i.e., $N_{CU}$). For example, the CU for
the entropy decoding DM is the operation of decoding one bit,
and the complexity of this DM, $C_{ed}$, is the average complexity
of decoding one bit, $\bar{c}_{bit}$, times the number of bits in a frame,
$N_{bit}$; that is, $C_{ed} = \bar{c}_{bit} \cdot N_{bit}$. Among several possible
ways to define the CU for a DM, we choose the definition that
makes the defined CU either fairly constant for a given decoder
implementation, or accurately predictable by a simple linear
predictor. Note that the CU complexity may vary from frame
to frame because the corresponding CU operations change due
to the adaptive coding tools employed in H.264/AVC. For
example, in H.264/AVC, an adaptive in-loop deblocking filter is
used to remove blocking artifacts, applying different filters
according to the information of adjacent blocks; thus, the
average cycles required to deblock one block, $\bar{c}_{dblk}$, can
vary considerably from frame to frame. Therefore, we also
explore how to predict the average complexity of a CU for a
new frame from the measured CU complexities in the previous
frames. Meanwhile, we assume that the number of CUs, $N_{CU}$,
¹Since the on-chip memory, such as the cache, is part of the CPU, our power
measurements and savings do include the on-chip memory energy consumption.
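The frame-level model and the CU-complexity prediction described above can be sketched as follows; the module names, cycle counts, and the moving-average predictor (standing in here for a simple linear predictor) are illustrative assumptions, not values from the paper:

```python
def frame_complexity(dms):
    """Total decoding complexity (in cycles) for one frame: the sum over
    decoding modules (DMs) of (avg. CU complexity) * (number of CUs)."""
    return sum(c_avg * n_cu for c_avg, n_cu in dms.values())

def predict_cu_complexity(history, window=3):
    """Predict the next frame's average CU complexity from measurements
    in previous frames; a moving average is used here for illustration,
    and the window length is an assumption."""
    recent = history[-window:]
    return sum(recent) / len(recent)

# Hypothetical per-frame numbers for a CIF frame (396 macroblocks):
dms = {
    "entropy_decoding": (12.0, 48_000),  # (cycles per bit, bits in frame)
    "motion_comp": (950.0, 396),         # (cycles per MB, MBs in frame)
    "deblocking": (310.0, 396),          # (cycles per MB, MBs in frame)
}
cycles = frame_complexity(dms)

# Deblocking cost varies frame to frame, so predict it from history:
c_dblk_next = predict_cu_complexity([290.0, 305.0, 310.0])
```

Summing the predicted per-DM products over all DMs then yields the predicted cycle budget for the frame, which is what a DVFS scheme would use to set the clock frequency.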
1520-9210/$26.00 © 2011 IEEE