Sub-Sequence Video Coding For Improved Temporal Scalability

Dong Tian
Tampere International Center for Signal Processing
Tampere, Finland
dong.tian@tut.fi

Miska M. Hannuksela
Nokia Research Center
Tampere, Finland
miska.hannuksela@nokia.com

Moncef Gabbouj
Tampere University of Technology
Tampere, Finland
moncef.gabbouj@tut.fi

Abstract—Compression efficiency and bitrate scalability are among the key factors in video coding. The paper introduces novel sub-sequence coding techniques for temporal scalability. The presented coding schemes provide a wider range for bitrate scaling than conventional temporal scalability methods while maintaining high coding efficiency. The proposed sub-sequence techniques have been adopted into the latest video coding standard, H.264, making it easy to identify sub-sequences and possible to discard them intentionally. As shown by extensive simulations, a wide range of applications, from mobile messaging to consumer electronics such as digital TV, can benefit from sub-sequences.

I. INTRODUCTION

In recent years, scalable video coding has been one of the key challenges in the field of video coding. Scalable bitstreams can be used for various purposes, such as adjusting the transmitted bitrate to the prevailing network throughput in streaming applications and scaling the complexity of the decoding process to the available computational resources. Scalable coding also partitions the coded bitstream into sections that have different impacts on decoded video quality. These sections can be used in the transport layer to implement unequal error protection.

Scalable video coding methods can be classified into temporal, spatial, and SNR techniques, as well as any combination of them. Two general categories exist for interframe coding in temporally scalable video coding algorithms: predictive coding and subband coding [1].
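As a rough illustration of the bitrate adjustment mentioned above, the following Python sketch thins a coded stream until its average bitrate fits a target. It is not the method proposed in this paper. It assumes that each picture carries a flag telling whether other pictures are predicted from it, so that unflagged pictures can be dropped without affecting the rest. The names `Picture`, `is_reference`, and `thin_stream` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Picture:
    index: int          # position in decoding order (assumed unique)
    size_bits: int      # coded size of the picture in bits
    is_reference: bool  # True if other pictures are predicted from it


def thin_stream(pictures: List[Picture], frame_rate: float,
                target_bps: float) -> Tuple[List[Picture], float]:
    """Drop non-reference pictures until the average bitrate fits target_bps.

    Simplification: pictures are dropped in decoding order, whereas a real
    server would likely spread the drops evenly over time.
    """
    duration_s = len(pictures) / frame_rate
    budget_bits = target_bps * duration_s
    total_bits = sum(p.size_bits for p in pictures)

    dropped = set()
    for p in pictures:
        if total_bits <= budget_bits:
            break
        if not p.is_reference:
            # Nothing is predicted from this picture, so removing it
            # leaves every remaining picture decodable.
            dropped.add(p.index)
            total_bits -= p.size_bits

    kept = [p for p in pictures if p.index not in dropped]
    return kept, total_bits / duration_s


# Hypothetical six-picture stream at 30 frames per second.
stream = [
    Picture(0, 40000, True),
    Picture(1, 5000, False),
    Picture(2, 5000, False),
    Picture(3, 20000, True),
    Picture(4, 5000, False),
    Picture(5, 5000, False),
]
kept, bitrate = thin_stream(stream, frame_rate=30.0, target_bps=360000.0)
print([p.index for p in kept], bitrate)
```

Once every non-reference picture has been dropped, this kind of scaling cannot reduce the bitrate any further, since reference pictures must stay.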
All prevailing video coding standards, such as H.263, H.264 (also known as MPEG-4 AVC), MPEG-2 Visual, and MPEG-4 Visual, deploy motion-compensated predictive techniques, and hence this paper focuses on temporal scalability for predictive coding.

The paper introduces a novel sub-sequence coding technique, which is an enhancement of the known temporal scalability methods. It is shown that the range for bitrate scaling is wider and the compression efficiency is the same or better compared to earlier methods. Thus, the proposed method gives more flexibility in applications utilizing bitrate scalability, such as rate scaling in streaming servers.

Modern video coding techniques often utilize multiple reference pictures for motion compensation to improve compression efficiency and error resilience. The sub-sequence technique also makes use of multiple reference pictures. A typical operation mode for the reference picture buffer is the "sliding window", in which the oldest reference frame is removed from the buffer when a new reference frame is decoded and the buffer is full.

This paper is organized as follows. Section II reviews conventional temporal scalable coding. The proposed sub-sequence technique and coding schemes for improved temporal scalability are given in Section III. Section IV discusses the simulation results. Finally, we conclude the work in Section V.

II. CONVENTIONAL TEMPORAL SCALABILITY

A. Individually Disposable Pictures

In video coding standards other than H.264, bi-predictive (B) pictures are not used as prediction references. Consequently, they provide a way to achieve temporal scalability. The enhanced reference picture selection mode (Annex U) of H.263 allows signaling whether a particular picture is a reference picture for the inter prediction of any other picture. Thus, a picture not used for prediction (a non-reference picture) can be safely discarded. The H.264 syntax

6074 0-7803-8834-8/05/$20.00 ©2005 IEEE.