IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 3, MARCH 2010 691

Video Coding Focusing on Block Partitioning and Occlusion

Manoranjan Paul, Member, IEEE, and Manzur Murshed, Member, IEEE

Abstract—Among the existing block partitioning schemes, the pattern-based video coding (PVC) has already established its superiority at low bit-rate. Its innovative segmentation process with regular-shaped pattern templates is very fast as it avoids handling the exact shape of the moving objects. It also judiciously encodes the pattern-uncovered background segments capturing high level of interblock temporal redundancy without any motion compensation, which is favoured by the rate-distortion optimizer at low bit-rates. The existing PVC technique, however, uses a number of content-sensitive thresholds and thus setting them to any predefined values risks ignoring some of the macroblocks that would otherwise be encoded with patterns. Furthermore, occluded background can potentially degrade the performance of this technique. In this paper, a robust PVC scheme is proposed by removing all the content-sensitive thresholds, introducing a new similarity metric, considering multiple top-ranked patterns by the rate-distortion optimizer, and refining the Lagrangian multiplier of the H.264 standard for efficient embedding. A novel pattern-based residual encoding approach is also integrated to address the occlusion issue. Once embedded into the H.264 Baseline profile, the proposed PVC scheme improves the image quality perceptually significantly by at least 0.5 dB in low bit-rate video coding applications. A similar trend is observed for moderate to high bit-rate applications when the proposed scheme replaces the bi-directional predictive mode in the H.264 High profile.

Index Terms—Block partitioning, H.264, motion estimation, occlusion, pattern-based coding, rate-distortion optimisation, sub-blocking, uncovered background coding, video coding.

I.
INTRODUCTION

Video compression standards such as H.263 [1] and MPEG-2 [2] are inefficient when coding at low bit-rate because they cannot exploit intrablock temporal redundancy (ITR). Fig. 1 shows that objects can partly cover a block, leaving highly redundant information in successive frames as the background is almost static in co-located blocks. The inability to exploit ITR results in the entire 16×16-pixel macroblock (MB) being coded with motion estimation (ME) and motion compensation (MC) regardless of whether there are moving objects in the MB.

Fig. 1. Example of how pattern-based coding can exploit the intrablock temporal redundancy (ITR) to improve coding efficiency at low bit-rate.

Manuscript received August 10, 2008; revised August 28, 2009. First published September 29, 2009; current version published February 18, 2010. This work was supported in part by the Australian Research Council under Discovery Projects Grant DP0666456. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Antonio Ortega.
The authors are with the Gippsland School of Information Technology, Monash University, Churchill, Vic 3842, Australia (e-mail: manoranjan.paul@infotech.monash.edu.au; manzur@infotech.monash.edu.au).
Digital Object Identifier 10.1109/TIP.2009.2033406

The recent H.264/AVC [3] video coding standard has extended the block-based coding paradigm by introducing tree-structured variable block-size (TVBS) ME&MC to approximate the various motions within the MB more accurately by partitioning the 16×16-pixel MB progressively into rectangular/square sub-blocks as small as 4×4 pixels. We empirically observed in [4] that, while coding head-and-shoulder type video sequences at low bit-rate, more than 70% of the MBs were never partitioned by H.264, although they would have been partitioned at a very high bit-rate. It can be easily observed that the possibility of choosing smaller block sizes diminishes as the target bit-rate is lowered.
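The observation that smaller block sizes become less likely as the target bit-rate falls can be illustrated with a minimal sketch. It assumes that the encoder selects the partition mode minimising a Lagrangian cost J = D + λR, where D is the distortion, R the bit cost, and λ grows with the quantisation parameter (QP). The distortion and rate figures are invented purely for illustration, and all function and variable names are hypothetical. The QP-to-λ mapping is the one commonly used in H.264 reference encoders; it is an assumption here, not something stated in this section.

```python
# Illustrative (made-up) figures for a single macroblock.
# Each entry maps a partition mode to (distortion as SSD, rate in bits).
# Finer partitions lower the distortion but cost more bits for
# extra motion vectors and headers.
CANDIDATE_MODES = {
    "16x16": (5200.0, 40),
    "8x8": (3600.0, 130),
    "4x4": (3000.0, 360),
}


def lambda_mode(qp):
    """QP-to-lambda mapping commonly used in H.264 reference encoders.

    Assumption: not taken from this section.
    """
    return 0.85 * 2 ** ((qp - 12) / 3)


def lagrangian_cost(distortion, rate_bits, lam):
    """Rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def best_mode(lam, modes=CANDIDATE_MODES):
    """Return the name of the mode with the smallest Lagrangian cost."""
    return min(modes, key=lambda name: lagrangian_cost(*modes[name], lam))


if __name__ == "__main__":
    # A higher QP means a lower target bit-rate and a larger lambda.
    for qp in (14, 24, 36):
        lam = lambda_mode(qp)
        print(f"QP={qp:2d}  lambda={lam:8.2f}  chosen mode={best_mode(lam)}")
```

With these invented figures, the chosen mode moves from 4x4 to 8x8 to 16x16 as QP rises: once λ is large, the extra bits needed by finer partitions outweigh their distortion saving.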
Consequently, the coding efficiency improvement due to TVBS can no longer be realized for a low bit-rate target, as larger blocks have to be chosen in most cases to keep the bit-rate in check, at the expense of inferior shape approximation.

Recently, many researchers [5]–[11] successfully introduced other forms of block partitioning to approximate the shape of a moving region (MR) even more closely to improve the compression efficiency (see Section II for details). But none of these techniques, including the H.264 standard, allows for encoding a block-partitioned segment by skipping ME&MC. Consequently, they use unnecessary bits to encode an almost zero-length motion vector with perceptually insignificant residual errors for the background segment. These bits are valuable at low bit-rate and could otherwise be spent on encoding residual errors in perceptually significant segments.

These block partitioning techniques effectively divide an MB into two disjoint segments that are encoded with independent ME&MC. This is a significant improvement compared to the TVBS in the H.264 standard, which could use as many as 16 disjoint segments, each with independent ME&MC. However, they are not suitable for low bit-rate video coding for the following reasons: (i) the penalty of extra bits to encode additional motion vectors and corresponding residual errors outweighs the marginal picture quality benefit at low bit-rate coding, especially when only one of the segments covers part of a moving object and the other segment covers almost static background with high ITR; and (ii) the computational complexity overhead for segmentation is also unjustified for low bit-rate video coding