1240 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 13, NO. 6, DECEMBER 2011
On Complexity Modeling of H.264/AVC
Video Decoding and Its Application
for Energy Efficient Decoding
Zhan Ma, Student Member, IEEE, Hao Hu, Student Member, IEEE, and Yao Wang, Fellow, IEEE
Abstract—This paper proposes a new complexity model for
H.264/AVC video decoding. The model is derived by decomposing
the entire decoder into several decoding modules (DM), and
identifying the fundamental operation unit (termed complexity
unit or CU) in each DM. The complexity of each DM is modeled by
the product of the average complexity of one CU and the number
of CUs required. The model is shown to be highly accurate for
software video decoding on both Intel Pentium mobile 1.6-GHz
and ARM Cortex-A8 600-MHz processors, over a variety of
video contents at different spatial and temporal resolutions and
bit rates. We further show how to use this model to predict the
required clock frequency and hence perform dynamic voltage
and frequency scaling (DVFS) for energy efficient video decoding.
We evaluate achievable power savings on both the Intel and
ARM platforms, by using analytical power models for these two
platforms as well as real experiments with the ARM-based TI
OMAP35x EVM board. Our study shows that for the Intel
platform, where the dynamic power dominates, a power saving factor
of 3.7 is possible. For the ARM processor where the static leakage
power is not negligible, a saving factor of 2.22 is still achievable.
Index Terms—Complexity modeling and prediction, dynamic
voltage and frequency scaling (DVFS), H.264/AVC video decoding.
I. INTRODUCTION
THE smartphone market has expanded rapidly in recent
years. People desire a multi-purpose
handheld device that not only supports voice communication
and text messaging, but also provides video streaming,
multimedia entertainment, etc. A crucial problem with a handheld
device that enables video playback is how to provide a
sufficiently long battery life, given the large amount of energy
required by video decoding and rendering. It is thus very
useful to have an in-depth understanding of the power
consumed by video decoding, which can be used to make
decisions in advance based on the remaining battery
Manuscript received April 11, 2011; revised June 21, 2011; accepted
August 04, 2011. Date of publication August 15, 2011; date of current version
November 18, 2011. The associate editor coordinating the review of this
manuscript and approving it for publication was Dr. Yen-Kuang Chen.
Z. Ma was with the Polytechnic Institute of New York University, Brooklyn,
NY 11201 USA, and is now with the Dallas Technology Lab, Samsung
Telecommunications America, Richardson, TX 75082 USA (e-mail:
zhan.ma@ieee.org; zhan.ma@gmail.com).
H. Hu and Y. Wang are with the Department of Electrical and Computer
Engineering, Polytechnic Institute of New York University, Brooklyn,
NY 11201 USA (e-mail: hhu01@students.poly.edu; hoohawk@gmail.com;
yao@poly.edu).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMM.2011.2165056
capacity, e.g., discarding unnecessary video packets without
decoding, or decoding at appropriate spatial, temporal, and
amplitude resolutions to yield the best perceptual quality. In
devices using dynamic voltage and frequency scaling (DVFS),
being able to accurately predict the complexity of successive
decoding intervals is critical to reduce the power consumption
[1].
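To make the DVFS connection concrete: if the complexity (in clock cycles) of the next decoding interval can be predicted, the minimum sufficient clock frequency is simply that cycle count divided by the interval duration. A minimal sketch of this idea (the function name and the numbers below are illustrative assumptions, not values from [1]):

```python
def required_frequency_hz(predicted_cycles, interval_s):
    """Lowest clock frequency at which the predicted decoding workload
    still finishes within the interval; a DVFS scheme would scale down
    to (roughly) this frequency instead of running at the maximum clock."""
    return predicted_cycles / interval_s

# e.g., 1.2e7 cycles predicted for one frame interval at 30 fps:
f_min = required_frequency_hz(1.2e7, 1 / 30)  # 3.6e8 Hz, i.e., 360 MHz
```

Running below the maximum frequency also permits a lower supply voltage, which is where the bulk of the dynamic power saving comes from.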
Generally, there are two sources of energy dissipation during
video decoding [2]: memory access and CPU computation. In
this paper, we
will focus on computational complexity modeling of
H.264/AVC video decoding and defer the investigation of
off-chip memory access complexity to a future study.¹ Specifically,
we extend our prior work [3] beyond the entropy decoding
complexity and consider all modules involved in H.264/AVC
video decoding, including entropy decoding, side information
preparation, dequantization and inverse transform, intra
prediction, motion compensation, and deblocking. First of all, we
define each module as a decoding module (DM), and denote
its complexity (in terms of clock cycles) over a chosen time
interval by $C_{DM}$. The proposed model is applicable to any time
interval, but the following discussion will assume the interval is
one video frame. Furthermore, we abstract the basic, common
operations needed by each DM as its complexity unit (CU), so
that $C_{DM}$ is the product of the average complexity of one CU
over one frame (i.e., $\bar{c}_{CU}$) and the number of CUs required by
this DM over this frame (i.e., $N_{CU}$). For example, the CU for
the entropy decoding DM is the operation of decoding one bit,
and the complexity of this DM, $C_{ed}$, is the average complexity
of decoding one bit, $\bar{c}_{bit}$, times the number of bits in a frame,
$N_{bit}$; that is, $C_{ed} = \bar{c}_{bit} \cdot N_{bit}$. Among several possible
ways to define the CU for a DM, we choose the definition that
makes the defined CU either fairly constant for a given decoder
implementation, or accurately predictable by a simple linear
predictor. Note that the CU complexity may vary from frame
to frame because the corresponding CU operations change due
to the adaptive coding tools employed in H.264/AVC. For
example, in H.264/AVC, an adaptive in-loop deblocking filter is
used to remove blocking artifacts, applying different filters
according to the information of adjacent blocks; thus, the
average cycles required to deblock one block, $\bar{c}_{dblk}$, can
vary considerably from frame to frame. Therefore, we also
explore how to predict the average complexity of a CU for a
new frame from the measured CU complexities in the previous
frames. Meanwhile, we assume that the number of CUs, $N_{CU}$,
¹Since the on-chip memory, such as the cache, is part of the CPU, our power
measurements and savings do include the on-chip memory energy consumption.
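The frame-level model and the CU-complexity prediction described above can be sketched as follows; the module names, cycle counts, and the moving-average predictor (standing in here for a simple linear predictor) are illustrative assumptions, not values from the paper:

```python
def frame_complexity(dms):
    """Total decoding complexity (in cycles) for one frame: the sum over
    decoding modules (DMs) of (avg. CU complexity) * (number of CUs)."""
    return sum(c_avg * n_cu for c_avg, n_cu in dms.values())

def predict_cu_complexity(history, window=3):
    """Predict the next frame's average CU complexity from measurements
    in previous frames; a moving average is used here for illustration,
    and the window length is an assumption."""
    recent = history[-window:]
    return sum(recent) / len(recent)

# Hypothetical per-frame numbers for a CIF frame (396 macroblocks):
dms = {
    "entropy_decoding": (12.0, 48_000),  # (cycles per bit, bits in frame)
    "motion_comp": (950.0, 396),         # (cycles per MB, MBs in frame)
    "deblocking": (310.0, 396),          # (cycles per MB, MBs in frame)
}
cycles = frame_complexity(dms)

# Deblocking cost varies frame to frame, so predict it from history:
c_dblk_next = predict_cu_complexity([290.0, 305.0, 310.0])
```

Summing the predicted per-DM products over all DMs then yields the predicted cycle budget for the frame, which is what a DVFS scheme would use to set the clock frequency.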
1520-9210/$26.00 © 2011 IEEE