COMPLEXITY MODELING OF SCALABLE VIDEO DECODING

Zhan Ma, Yao Wang

Dept. of ECE, Polytechnic University, Brooklyn, NY 11201
Emails: zma03@vision.poly.edu, yao@poly.edu

ABSTRACT

This paper addresses the computational complexity of scalable video decoding using a decoder compliant with the emerging scalable extension of the H.264/AVC standard (SVC). The scalable functionalities provided by the SVC standard encompass temporal, spatial, and quality enhancements and their combinations. Complexity models for decoding a bit stream with only temporal, spatial, or quality scalability are developed first. We then extend them to a more general model for decoding a bit stream with arbitrarily combined scalability. Comparison with the number of clock cycles used in SVC decoding on a PC shows that the proposed model is very accurate.

Index Terms: Computational complexity, complexity modeling, scalable video decoding

1. INTRODUCTION

Advances in network bandwidth and wireless access techniques, together with pervasive multimedia content services delivered over heterogeneous network infrastructures to diverse users (terminals), have created demand for multimedia streams with scalable features that can be adapted to satisfy different requirements [1]. Because of its network-friendly interface and high coding efficiency, H.264/AVC [2] is expected to dominate the video service industry in the coming decades. The JVT (Joint Video Team) has therefore decided to extend H.264/AVC with scalable functionalities, so that diverse requirements can be met by a single bitstream generated following the syntax, semantics and operations defined in the scalable video coding extension of H.264/AVC [3].

With extensive research on mobile computing and the wide deployment of wireless video services, computational complexity has become a concern for video processing on power-limited devices, since a large amount of data transformation is involved.
How to predict or model the computational complexity of video processing is attracting increasing attention from industry and academia. In [4], the authors analyze the computational complexity of a software-based H.264/AVC [2] baseline profile decoder in terms of its decoding subfunctions, and estimate the time complexity on a DSP and a general-purpose computer from the frequency of use of those subfunctions. He et al. [5] propose a power-rate-distortion (P-R-D) framework for a typical video encoder with a fully scalable coding scheme.

Because the SVC standard has a coding efficiency comparable to that of the non-scalable H.264/AVC standard, it is expected to be widely adopted by mobile multimedia applications, where battery energy consumption is of paramount concern. In this paper, we model the computational complexity of the SVC decoder. Given a particular implementation platform for the decoder, the complexity derived from the model can be translated into power consumption. One important motivation for developing such a model is to enable a receiver of an SVC bit stream to determine which spatial, temporal and quality layers to decode in order to achieve a desired tradeoff between decoded video quality and decoding power consumption.

The paper is organized as follows. The complexity model of SVC is described in Section 2, and the analytical model is verified experimentally in Section 3. Section 4 concludes the paper and gives future directions for this work.

This material is based upon work supported by the National Science Foundation under Grant No. 0430145.

2. COMPLEXITY MODEL OF SVC

An SVC bitstream can be produced either with an individual scalable tool or with a combination of the supported scalable functionalities.
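As an illustration of how a receiver might use a decoding complexity model to choose which layers to decode, the following Python sketch enumerates the layer combinations of a bit stream and picks the one with the highest quality score whose estimated complexity fits a budget. It is only a sketch under stated assumptions, not the model developed in this paper: the function names, the toy complexity and quality functions, and the budget are all hypothetical placeholders.

```python
from itertools import product
from typing import Callable, List, Optional, Tuple

# (spatial, temporal, quality) numbers of decoded layers; counting from 1
# (base layer only) is an assumption made for this illustration.
OperatingPoint = Tuple[int, int, int]


def operating_points(num_spatial: int, num_temporal: int,
                     num_quality: int) -> List[OperatingPoint]:
    """Enumerate every combination of decoded layer counts."""
    return list(product(range(1, num_spatial + 1),
                        range(1, num_temporal + 1),
                        range(1, num_quality + 1)))


def select_operating_point(
    points: List[OperatingPoint],
    complexity: Callable[[OperatingPoint], float],
    quality: Callable[[OperatingPoint], float],
    budget: float,
) -> Optional[OperatingPoint]:
    """Return the feasible point with the highest quality score, or None."""
    feasible = [p for p in points if complexity(p) <= budget]
    if not feasible:
        return None
    return max(feasible, key=quality)


# Hypothetical stand-ins: a real receiver would plug in a fitted complexity
# model and a measured quality metric instead of these toy functions.
def toy_complexity(p: OperatingPoint) -> float:
    s, t, q = p
    return float(s * t * q)


def toy_quality(p: OperatingPoint) -> float:
    s, t, q = p
    return float(s + t + q)


if __name__ == "__main__":
    points = operating_points(num_spatial=2, num_temporal=3, num_quality=2)
    print(select_operating_point(points, toy_complexity, toy_quality, budget=12.0))
```

The complexity and quality functions are passed in as callables so that any model expressed in terms of the numbers of decoded spatial, temporal and quality layers can be substituted without changing the selection logic.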
To give insight into the decoding time complexity, we first analyze each individual scalable tool, i.e., temporal, spatial, and quality scalability, and then combine them to obtain a general decoding complexity function in terms of the numbers of decoded spatial, temporal and quality layers. Please refer to [3] for details of SVC. The common symbols used in deriving our model are presented in Table 1.

2.1. Temporal Scalability

Temporal scalability can be provided efficiently via hierarchical B pictures. We take the popular dyadic prediction structure with one picture reference (depicted in Fig. 1) to consider the time complexity for decoding the temporal scalable