Reduced Computation Mode Decision Using Error Domain Heuristics for the H.264 Standard

M. Y. Yang, Christos Grecos, Lihui Chen
Department of Electronic/Electrical Engineering
Loughborough University
Loughborough, UK
M.Y.Yang@lboro.ac.uk, C.Grecos@lboro.ac.uk, L.Chen@lboro.ac.uk

1-4244-0157-7/06/$20.00 ©2006 IEEE 117

Abstract- A new algorithm for fast mode decision in the H.264 video coding standard is presented in this paper. The algorithm exploits mode grouping through moving averages of error cost functions to provide significant computational savings, with Rate Distortion performance similar to that of accepted standard contributions [3].

I. INTRODUCTION

H.264 [1] is the newest international video coding standard. It offers superior objective and subjective image quality compared to other standards (MPEG-2, MPEG-4, H.263 and derivatives [7-9], etc.) at the same bit rates. This is due to a variety of features exploited in its design, such as variable block sizes, quarter-sample accuracy and multiple reference picture selection for motion estimation/compensation, improved "skipped" and "direct" mode inference, directional edge extrapolation in intra coded areas, 4×4 integer transforms for machine-independent decoding, and advanced entropy coding techniques. However, the improvements in statistical/visual quality at the same bit rates as previous standards come with significant increases in complexity.

In Section II, two recent fast mode decision schemes [2, 3] are discussed, which use simple texture analysis to achieve significant complexity reduction with Rate Distortion (RD) performance similar to that of the standard. These schemes are in stark contrast to more accurate but computationally intensive approaches [4, 5]. Section III presents the proposed algorithm, which further reduces the complexity of the work in [3] (which in turn is based on [2]) while still achieving similar RD performance. The paper concludes in Section IV by benchmarking the proposed scheme against the H.264 standard and the algorithm in [3].

II. PREVIOUS WORK

In [2, 3], the intra/inter mode decision for the macroblock to be encoded is based on fixed thresholds, homogeneity and predicted motion characteristics. For the intra mode decision, an edge map is initially computed using Sobel operators in the vertical and horizontal directions. Specifically, for every macroblock pixel $p_{i,j}$, an edge vector $E_{i,j} = \{dx_{i,j}, dy_{i,j}\}$ is computed, where the components $dx_{i,j}$ and $dy_{i,j}$ are local gradients:

$$dx_{i,j} = p_{i-1,j+1} + 2p_{i,j+1} + p_{i+1,j+1} - p_{i-1,j-1} - 2p_{i,j-1} - p_{i+1,j-1} \tag{1}$$

$$dy_{i,j} = p_{i+1,j-1} + 2p_{i+1,j} + p_{i+1,j+1} - p_{i-1,j-1} - 2p_{i-1,j} - p_{i-1,j+1} \tag{2}$$

The edge vector amplitude is

$$\mathrm{Amp}(E_{i,j}) = |dx_{i,j}| + |dy_{i,j}|$$

and the direction is estimated by

$$\mathrm{Ang}(E_{i,j}) = \frac{180^\circ}{\pi} \arctan\!\left(\frac{dy_{i,j}}{dx_{i,j}}\right), \qquad |\mathrm{Ang}(E_{i,j})| < 90^\circ$$

A histogram is then created by summing the amplitudes of the edge vectors that belong to the same plane segment. Two such vectors $E_{i,j}$ and $E_{k,l}$ belong to the same segment if $A < \mathrm{Ang}(E_{i,j}) < B$ and $A < \mathrm{Ang}(E_{k,l}) < B$, where $A$ and $B$ are the segment limits. For the 4×4 intra mode case, the plane is split into eight segments (signifying eight modes), while for the 8×8 chroma and the 16×16 luma intra modes the plane is split into three segments, corresponding to the horizontal, vertical and plane modes respectively. In all cases, the total number of candidate modes is one more than the number of segments, due to the inclusion of the DC mode.

To reduce the computational cost, [2] checks only a subset of these intra modes. In the 4×4 case, only the mode of the histogram cell with the highest amplitude, the modes of its two adjacent cells and the DC mode are considered, for a total of four (as opposed to nine) modes. In the 16×16 luma case, the mode of the cell with the highest amplitude and the DC mode are checked, for a total of two (as opposed to four) modes. Finally, if the two histograms of the U and V chroma