946 IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 7, NO. 6, DECEMBER 2013

Motion Compensated Prediction and Interpolation Filter Design in H.265/HEVC

Kemal Ugur, Alexander Alshin, Elena Alshina, Frank Bossen, Woo-Jin Han, Jeong-Hoon Park, and Jani Lainema

Abstract—Coding efficiency gains in the new High Efficiency Video Coding (H.265/HEVC) video coding standard are achieved by improving many aspects of the traditional hybrid coding framework. Motion compensated prediction, and in particular the interpolation filter, is one area that was improved significantly over H.264/AVC. This paper presents the details of the interpolation filter design of the H.265/HEVC standard. First, the improvements of H.265/HEVC interpolation filtering over H.264/AVC are presented. These improvements include novel filter coefficient design with an increased number of taps and utilizing higher precision operations in interpolation filter computations. Then, the computational complexity is analyzed, both from theoretical and practical perspectives. Theoretical complexity analysis is done by studying the worst-case complexity analytically, whereas practical analysis is done by profiling an optimized decoder implementation. Coding efficiency improvements over the H.264/AVC interpolation filter are studied and experimental results are presented. They show a 4.0% average bitrate reduction for the luma component and 11.3% average bitrate reduction for the chroma components. The coding efficiency gains are significant for some video sequences and can reach up to 21.7%.

Index Terms—Video coding, standards, HEVC, interpolation filter, H.265.

I. INTRODUCTION

Motion compensated prediction (MCP) is a technique used by video coders to reduce the amount of information transmitted to a decoder by exploiting the temporal redundancy present in the video signal [1]–[4].
In MCP, the picture to be coded is first divided into blocks, and for each block, an encoder searches reference pictures to find a best matching block. The best matching block is called the prediction of the corresponding block and the difference between the original and the prediction signal is coded by various means, such as transform coding, and transmitted to a decoder. The relative position of the prediction with respect to the original block is called a motion vector and it is transmitted to the decoder along with the residual signal.

Manuscript received January 30, 2013; revised May 10, 2013; accepted June 19, 2013. Date of publication July 11, 2013; date of current version November 18, 2013. This work was supported by the Gachon University research fund of 2013 GCU-2013-R185. The guest editor coordinating the review of this manuscript and approving it for publication was Prof. Joern Ostermann. (Corresponding author: W. J. Han).
K. Ugur and J. Lainema are with Nokia Corporation, 33720 Tampere, Finland (e-mail: kemal.ugur@nokia.com; jani.lainema@nokia.com).
A. Alshin, E. Alshina, and J. H. Park are with the Digital Media and Communication Research and Development Center, Samsung Electronics, Suwon 443-742, Korea (e-mail: alexander_b.alshin@samsung.com; elena_a.alshina@samsung.com; jeonghoon@samsung.com).
W. J. Han is with Gachon University, Seongnam 461-701, Korea (e-mail: hurumi@gmail.com).
F. Bossen is with DOCOMO Innovations, Palo Alto, CA 94304 USA (e-mail: bossen@docomoinnovations.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSTSP.2013.2272771

The true displacements of moving objects between pictures are continuous and do not follow the sampling grid of the digitized video sequence.
Hence, by utilizing fractional accuracy for motion vectors instead of integer accuracy, the residual error is decreased and coding efficiency of video coders is increased [2]. If a motion vector has a fractional value, the reference block needs to be interpolated accordingly. The interpolation filters used in video coding standards are carefully designed taking into account many factors, such as coding efficiency, implementation complexity and visual quality [5].

As in H.264/AVC [6], the High Efficiency Video Coding (HEVC) standard supports motion vectors with quarter-pel accuracy. Compared to H.264/AVC, H.265/HEVC includes various modifications to the interpolation filter design. During the development of the H.265/HEVC standard, several techniques were considered, including switched interpolation filter with offset (SIFO) [7], maximum order of interpolation with minimal support (MOMS) [8], one-dimensional directional interpolation filter (DIF) [9], and DCT-based interpolation filter (DCT-IF) [10]. The latest design of the H.265/HEVC interpolation filter is based on a simplified form of the DCT-IF, with the addition of high-accuracy motion compensation processing. These modifications yield an average 4.0% bitrate reduction over the H.264/AVC interpolation filter for luma and an average 11.3% bitrate reduction for chroma components. The coding efficiency gains become very significant for some sequences and can reach a measured maximum of 21.7%.

In this paper, the details of the motion-compensated prediction, and in particular the interpolation filtering, of H.265/HEVC are presented. The paper is organized as follows. Section II describes the interpolation filtering process of the H.264/AVC video coding standard. Section III discusses the details of the interpolation filter design of H.265/HEVC and describes the differences compared to H.264/AVC. Section IV presents a detailed complexity analysis of the interpolation filter in H.265/HEVC.
Section V presents experimental results and shows the coding efficiency gains, and Section VI concludes the paper.

II. BACKGROUND

A. Brief Summary of H.264/AVC Interpolation Process

H.264/AVC supports motion vectors with quarter-pel accuracy for the luma component and one-eighth pel accuracy for chroma components for video in the 4:2:0 color format [11]. Although some video sequences may benefit from higher motion vector accuracy, it was found that quarter-pel accuracy pro- 1932-4553 © 2013 IEEE