STREAM MORPHING APPROACHES TO TEMPORAL SCALABLE VIDEO CODING

James Macnicol, Michael Frater, John Arnold

School of Electrical Engineering
University College, the University of New South Wales
Australian Defence Force Academy, Canberra

ABSTRACT

Stream morphing has been previously introduced as an efficient way of simulcasting a hierarchy of single-layer descriptions of a single video sequence that differ in quality but share the same frame size and rate. One application for this process was SNR scalable video coding. This paper presents two different adaptations of the stream morphing process to temporal scalability, one with bi-directional motion-compensated prediction and one without. Both methods retain the attractive features of the original SNR scalable structure, notably the use of a single quantization cycle and a single decoder motion-compensated prediction loop irrespective of the number of layers present, and the ability to convert between single-layer and scalable forms without transcoding.

1. INTRODUCTION

For operation over a wide range of bit rates it is undesirable that a scalable video system be constrained to operate using the same frame size and rate in all layers. This would require that the lower layers be operated at very poor quality and/or the higher layers at extremely high quality, in cases where we might prefer to operate at a lower SNR but with a higher frame size and/or rate. In this paper a number of approaches to temporal scalability are investigated that complement the existing SNR scalable stream morphing technique.

The paper is structured as follows: Section 2 reviews the stream morphing process, Section 3 describes the existing approaches to temporal scalability used in the MPEG-2 and MPEG-4 standards before describing two modifications to stream morphing that provide similar functionality while retaining the attractive features of the original technique.
Section 4 presents experimental results for the two techniques, and Section 5 draws conclusions.

2. STREAM MORPHING OVERVIEW

Extensions to the existing single-layer MPEG techniques to support scalable coding typically involve the summation at the decoder of a coarsely quantized base layer service and a number of additional components that have been processed using progressively finer levels of quantization. For example, MPEG-2 SNR scalability [1] defines a decoder that adds together texture after inverse quantization from multiple standard bitstreams (the only exception being that motion vectors are coded only in the base layer) and then passes these DCT coefficients through a single IDCT and motion-compensated prediction (MCP) loop. This structure is attractive because of its low decoder complexity and high performance in error-free environments.

Fig. 1. General encoder configuration for stream morphing. (Diagram labels: Stream Morphing Encoder; O1, O2, O3, On; D1, D2, Dn-1.)

Systems of this type with many layers suffer from poor subjective image quality, however, because quantization is applied many times to the video signal [2]. DCT coefficients with large values are coarsely quantized in one of the lower layers, but their values are unlikely to be refined in any of the higher layers, as the residual signal is not large enough to create another non-zero coefficient in one of those layers. This is especially evident in flat, static areas, where blocking artifacts are still visible even where the quantizer step size in the top layer is relatively small.

Fine Granularity Scalability (FGS) in the MPEG-4 standard [3] takes a similar approach, whereby enhancement layer data is added to the base layer outside the MCP loop. FGS does not suffer from loss of subjective quality with many layers, since it uses an embedded (bit-plane) quantization scheme for the enhancement layer data.
The lack of motion-compensated prediction in the enhancement layer offers immediate recovery from errors in the upper layers at the expense of decreased coding efficiency.

0-7803-7750-8/03/$17.00 ©2003 IEEE. ICIP 2003
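
As a rough numerical illustration of the layered quantization discussed in Section 2, the following Python sketch quantizes a single DCT coefficient in a base layer and then quantizes the remaining residual in each higher layer, with the decoder summing the dequantized contributions of the layers it receives. It is a toy model under stated assumptions and not the MPEG-2 or MPEG-4 quantizers: the truncating dead-zone rule, the step sizes 16, 8 and 4, the coefficient values, and all function names are hypothetical choices made for illustration. The bit-plane helpers show, for contrast, how an embedded representation of the same base-layer residual could be truncated after any plane.

```python
# Toy model only: quantizer rule, step sizes and names are assumptions,
# not taken from the MPEG-2 / MPEG-4 standards or from the paper.

def quantize(value, step):
    """Dead-zone quantizer: truncate toward zero (assumed, simplified rule)."""
    return int(value / step)  # int() truncates toward zero


def dequantize(level, step):
    """Uniform reconstruction at integer multiples of the step size."""
    return level * step


def encode_layers(coeff, steps):
    """Quantize coeff in the base layer, then the leftover residual per layer."""
    levels = []
    recon = 0
    for step in steps:
        level = quantize(coeff - recon, step)
        levels.append(level)
        recon += dequantize(level, step)
    return levels


def decode_layers(levels, steps):
    """Sum the dequantized contributions of the received layers."""
    return sum(dequantize(l, s) for l, s in zip(levels, steps))


def bit_planes(magnitude, num_planes):
    """Bits of a non-negative integer, most significant plane first."""
    return [(magnitude >> p) & 1 for p in range(num_planes - 1, -1, -1)]


def from_bit_planes(planes, num_planes):
    """Rebuild a magnitude from the first len(planes) planes; the rest are zero."""
    value = 0
    for i, bit in enumerate(planes):
        value |= bit << (num_planes - 1 - i)
    return value


if __name__ == "__main__":
    steps = [16, 8, 4]  # hypothetical step sizes, coarse to fine
    for coeff in (35, 45):
        levels = encode_layers(coeff, steps)
        print(coeff, levels, decode_layers(levels, steps))

    # Embedded alternative: bit planes of the base-layer residual of 35.
    residual = 35 - dequantize(quantize(35, 16), 16)
    planes = bit_planes(residual, 4)
    print(planes, [from_bit_planes(planes[:k], 4) for k in range(5)])
```

With these assumed values, the coefficient 35 leaves a residual of 3 after the base layer. That residual is smaller than both enhancement step sizes, so it produces zero levels in those layers, whereas its bit-plane representation still carries it in the lower planes.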