Digital Signal Processing 17 (2007) 652–664
www.elsevier.com/locate/dsp

A framework for fast mode decision in the H264 video coding standard

Christos Grecos ∗, Mingyuan Yang

Department of Electronic/Electrical Engineering, Loughborough University, Leicestershire, LE11 3TU, UK

Available online 9 December 2005

Abstract

We propose a novel framework for fast mode decision in the simple and main profiles of the H264 video coding standard. Our framework consists of a specific combination of algorithms, each achieving computational savings while retaining rate distortion (RD) performance very similar to that of the standard. In particular, we utilise a set of skip mode conditions for P and B slices, two heuristics that reduce the cardinality of the inter mode set to be examined, inter/intra mode prediction, and the monotonicity property of the rate distortion cost functions. We achieve content-dependent savings in run time of between 5.8 and 90.1% as compared to H264. Compared to other work that was used as input to the standard, our scheme is faster by 9–23% for very similar RD performance. The proposed framework can be used wholly or partially for computational speed-ups; it is independent of the motion search method used and is applicable in both the rate-controlled and non-rate-controlled cases.

© 2005 Elsevier Inc. All rights reserved.

Keywords: Video coding standards; H264; Mode prediction; Complexity

1. Introduction

The H264 video coding standard [1] is the newest standard from the ITU-T Video Coding Experts Group and the ISO/IEC Moving Pictures Experts Group. The main features that make it stand out from previous standards (MPEG-2, MPEG-4, H263 [2–4], derivatives, etc.) are the great variety of applications in which it can be used and its versatile design.
This standard has shown significant rate distortion improvements as compared to other standards for video compression. In this section we highlight some of the design features that enable such performance, as well as other features that enable the coding process in a more general sense. H263 and MPEG-2 use fixed block size motion compensation with half pixel accurate motion vectors and just one previous and one next reference picture. To improve the RD performance, H264 introduces variable block size motion compensation using quarter sample accurate motion vectors, the allowance of motion vectors to point even outside picture boundaries, multiple reference pictures, and even the allowance of bi-predictive pictures to be used as references. The decoupling of referencing from display order adds flexibility to the standard and enables complete removal of the extra delay associated with bi-predictive coding. H264 also utilises weighted offsetting of the prediction signal to improve the coding efficiency in scenes including fades, etc.

* Corresponding author. Fax: +44 (0) 1509 227014.
E-mail address: c.grecos@lboro.ac.uk (C. Grecos).
1051-2004/$ – see front matter © 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.dsp.2005.11.005

For a better complexity/RD trade-off in video sequences