946 IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, NO. 6, DECEMBER 2013

Motion Compensated Prediction and Interpolation Filter Design in H.265/HEVC

Kemal Ugur, Alexander Alshin, Elena Alshina, Frank Bossen, Woo-Jin Han, Jeong-Hoon Park, and Jani Lainema

Abstract—Coding efficiency gains in the new High Efficiency Video Coding (H.265/HEVC) video coding standard are achieved by improving many aspects of the traditional hybrid coding framework. Motion compensated prediction, and in particular the interpolation filter, is one area that was improved significantly over H.264/AVC. This paper presents the details of the interpolation filter design of the H.265/HEVC standard. First, the improvements of H.265/HEVC interpolation filtering over H.264/AVC are presented. These improvements include novel filter coefficient design with an increased number of taps and utilizing higher precision operations in interpolation filter computations. Then, the computational complexity is analyzed, both from theoretical and practical perspectives. Theoretical complexity analysis is done by studying the worst-case complexity analytically, whereas practical analysis is done by profiling an optimized decoder implementation. Coding efficiency improvements over the H.264/AVC interpolation filter are studied and experimental results are presented. They show a 4.0% average bitrate reduction for the luma component and 11.3% average bitrate reduction for the chroma components. The coding efficiency gains are significant for some video sequences and can reach up to 21.7%.

Index Terms—Video coding, standards, HEVC, interpolation filter, H.265.

I. INTRODUCTION

Motion compensated prediction (MCP) is a technique used by video coders to reduce the amount of information transmitted to a decoder by exploiting the temporal redundancy present in the video signal [1]–[4].
Manuscript received January 30, 2013; revised May 10, 2013; accepted June 19, 2013. Date of publication July 11, 2013; date of current version November 18, 2013. This work was supported by the Gachon University research fund of 2013 GCU-2013-R185. The guest editor coordinating the review of this manuscript and approving it for publication was Prof. Joern Ostermann. (Corresponding author: W. J. Han).
K. Ugur and J. Lainema are with Nokia Corporation, 33720 Tampere, Finland (e-mail: kemal.ugur@nokia.com; jani.lainema@nokia.com).
A. Alshin, E. Alshina, and J. H. Park are with the Digital Media and Communication Research and Development Center, Samsung Electronics, Suwon 443-742, Korea (e-mail: alexander_b.alshin@samsung.com; elena_a.alshina@samsung.com; jeonghoon@samsung.com).
W. J. Han is with Gachon University, Seongnam 461-701, Korea (e-mail: hurumi@gmail.com).
F. Bossen is with DOCOMO Innovations, Palo Alto, CA 94304 USA (e-mail: bossen@docomoinnovations.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSTSP.2013.2272771

In MCP, the picture to be coded is first divided into blocks, and for each block, an encoder searches reference pictures to find a best matching block. The best matching block is called the prediction of the corresponding block and the difference between the original and the prediction signal is coded by various means, such as transform coding, and transmitted to a decoder. The relative position of the prediction with respect to the original block is called a motion vector and it is transmitted to the decoder along with the residual signal. The true displacements of moving objects between pictures are continuous and do not follow the sampling grid of the digitized video sequence.
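When the true displacement falls between sampling positions, an integer-only search cannot fully match the block. The following is a minimal one-dimensional sketch of that effect, not the H.265/HEVC interpolation process. The sinusoidal test signal, the block size, the search range, the SAD matching cost, and the 2-tap averaging (bilinear) half-sample filter are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import math

def sample_scene(shift, n=64):
    """Sample a smooth 1-D 'scene' displaced by `shift` samples (may be fractional)."""
    return [math.sin(2 * math.pi * (i - shift) / 16.0) for i in range(n)]

def half_pel_upsample(ref):
    """Put the reference on a half-sample grid.

    Even indices are the original samples; odd indices are a 2-tap average
    (bilinear), an assumption that stands in for a real codec filter.
    """
    up = []
    for i in range(len(ref) - 1):
        up.append(ref[i])
        up.append(0.5 * (ref[i] + ref[i + 1]))
    up.append(ref[-1])
    return up

def sad(block, ref_up, pos2):
    """Sum of absolute differences; pos2 is the start position in half-sample units."""
    return sum(abs(b - ref_up[pos2 + 2 * k]) for k, b in enumerate(block))

def best_match(block, ref_up, start, search, step):
    """Exhaustive search around `start` (given in full samples).

    step=2 visits integer positions only; step=1 also visits half-sample
    positions. Returns (best SAD, motion vector in half-sample units).
    """
    candidates = range(-2 * search, 2 * search + 1, step)
    return min((sad(block, ref_up, 2 * start + mv2), mv2) for mv2 in candidates)

reference = sample_scene(0.0)   # reference picture
current = sample_scene(2.5)     # same scene, moved by 2.5 samples
start, size = 24, 8             # block to be predicted
block = current[start:start + size]
ref_up = half_pel_upsample(reference)

int_sad, int_mv2 = best_match(block, ref_up, start, search=4, step=2)
half_sad, half_mv2 = best_match(block, ref_up, start, search=4, step=1)

print(f"integer-pel: mv = {int_mv2 / 2:+.1f} samples, SAD = {int_sad:.4f}")
print(f"half-pel:    mv = {half_mv2 / 2:+.1f} samples, SAD = {half_sad:.4f}")
```

In this sketch the half-sample search recovers the 2.5-sample displacement and leaves a smaller SAD than the integer-only search, which can only pick one of the two neighbouring integer positions.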
Hence, by utilizing fractional accuracy for motion vectors instead of integer accuracy, the residual error is decreased and coding efficiency of video coders is increased [2]. If a motion vector has a fractional value, the reference block needs to be interpolated accordingly. The interpolation filters used in video coding standards are carefully designed, taking into account many factors, such as coding efficiency, implementation complexity and visual quality [5].

As in H.264/AVC [6], the High Efficiency Video Coding (HEVC) standard supports motion vectors with quarter-pel accuracy. Compared to H.264/AVC, H.265/HEVC includes various modifications to the interpolation filter design. During the development of the H.265/HEVC standard, several techniques were considered, including switched interpolation filter with offset (SIFO) [7], maximum order of interpolation with minimal support (MOMS) [8], one-dimensional directional interpolation filter (DIF) [9], and DCT-based interpolation filter (DCT-IF) [10]. The latest design of the H.265/HEVC interpolation filter is based on the simplified form of the DCT-IF with the addition of the high-accuracy motion compensation processing. These modifications yield an average 4.0% bitrate reduction over the H.264/AVC interpolation filter for luma and 11.3% bitrate reduction for chroma components. The coding efficiency gains become very significant for some sequences and can reach a measured maximum of 21.7%.

In this paper, the details of the motion-compensated prediction, and in particular the interpolation filtering, of H.265/HEVC are presented. The paper is organized as follows. Section II describes the interpolation filtering process of the H.264/AVC video coding standard. Section III discusses the details of the interpolation filter design of H.265/HEVC and describes the differences compared to H.264/AVC. Section IV presents a detailed complexity analysis of the interpolation filter in H.265/HEVC.
Section V presents experimental results and shows the coding efficiency gains, and Section VI concludes the paper.

1932-4553 © 2013 IEEE

II. BACKGROUND

A. Brief Summary of H.264/AVC Interpolation Process

H.264/AVC supports motion vectors with quarter-pel accuracy for the luma component and one-eighth pel accuracy for chroma components for video in the 4:2:0 color format [11]. Although some video sequences may benefit from higher motion vector accuracy, it was found that quarter-pel accuracy pro-