STREAM MORPHING APPROACHES TO TEMPORAL SCALABLE VIDEO CODING
James Macnicol, Michael Frater, John Arnold
School of Electrical Engineering
University College, the University of New South Wales
Australian Defence Force Academy, Canberra
ABSTRACT
Stream morphing has been previously introduced as an ef-
ficient way of simulcasting a hierarchy of single-layer de-
scriptions of a single video sequence with different levels
of quality while sharing the same frame size and rate. One
application for this process was SNR scalable video cod-
ing. This paper presents two different adaptations of the
stream morphing process to temporal scalability, one with
bi-directional motion-compensated prediction and one with-
out. Both methods retain the attractive features of the origi-
nal SNR scalable structure, notably the use of a single quan-
tization cycle and single decoder motion-compensated pre-
diction loop irrespective of the number of layers present
and the ability to convert between single-layer and scalable
forms without transcoding.
1. INTRODUCTION
For operation over a wide range of bit rates it is undesirable
that a scalable video system be constrained to operate us-
ing the same frame size and rate in all layers. This would
require that the lower layers be operated at very poor qual-
ity and/or the higher layers at extremely high quality where
we might prefer to operate at a lower SNR but with a higher
frame size and/or rate. In this paper a number of approaches
to temporal scalability are investigated that complement the
existing SNR scalable stream morphing technique.
The paper is structured as follows. Section 2 reviews the
stream morphing process. Section 3 describes the existing
approaches to temporal scalability used in the MPEG-2 and
MPEG-4 standards and then presents two modifications to
stream morphing that provide similar functionality while
retaining the attractive features of the original technique.
Section 4 shows experimental results for the two techniques,
and Section 5 draws conclusions.
2. STREAM MORPHING OVERVIEW
Fig. 1. General encoder configuration for stream morphing
(diagram omitted: a stream morphing encoder block with
labels O_1, O_2, O_3, ..., O_n and D_1, D_2, ..., D_{n-1})
Extensions to the existing single-layer MPEG techniques to
support scalable coding typically involve the summation at
the decoder of a coarsely-quantized base layer service and a
number of additional components that have been processed
using progressively finer levels of quantization. For exam-
ple, MPEG-2 SNR scalability [1] defines a decoder that
adds together texture after inverse quantization from mul-
tiple standard bitstreams (the only exception being that mo-
tion vectors are only coded in the base layer) and then passes
these DCT coefficients through a single IDCT and motion-
compensated prediction (MCP) loop. This structure is at-
tractive because of its low decoder complexity and high per-
formance in error-free environments. Systems of this type
with many layers suffer from poor subjective image quality,
however, due to quantization being applied many times to
the video signal [2]. DCT coefficients with large values are
coarsely quantized in one of the lower layers; however, the
values of these coefficients are unlikely to be refined in any
of the higher layers as the residual signal is not large enough
to create another non-zero coefficient in one of those layers.
This is especially evident in flat, static areas where blocking
artifacts are still visible even where the quantizer step size in
the top layer is relatively small. Fine Granularity Scalability
(FGS) in the MPEG-4 standard [3] takes a similar approach
whereby enhancement layer data is added to the base layer
outside the MCP loop. FGS does not suffer from loss of
subjective quality with many layers since it uses an embed-
ded (bit-plane) quantization scheme for the enhancement
layer data. The lack of motion-compensated prediction in
the enhancement layer offers immediate recovery from er-
rors in the upper layers at the expense of decreased coding
efficiency.
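The repeated-quantization effect described above can be illustrated with a small numerical sketch. This is not the MPEG-2 or MPEG-4 quantizer: it assumes a simple dead-zone quantizer with midpoint reconstruction, hypothetical step sizes of 16, 12 and 8 for three layers, and a residual rounded to an integer for the bit-plane case. All function names are illustrative only. A large coefficient is quantized in the base layer, the remaining residual is smaller than the step sizes of the higher layers and so yields zero levels there, whereas an embedded (bit-plane) representation of the same residual is refined by every additional plane received.

```python
def quantize(value, step):
    """Dead-zone quantizer (assumed): truncate |value| / step toward zero."""
    level = int(abs(value) // step)
    return level if value >= 0 else -level


def dequantize(level, step):
    """Reconstruct a non-zero level at the midpoint of its bin (assumed)."""
    if level == 0:
        return 0.0
    sign = 1 if level > 0 else -1
    return sign * (abs(level) + 0.5) * step


def layered_reconstruction(coeff, steps):
    """Sum the per-layer reconstructions, as an SNR scalable decoder would.

    Each layer quantizes the residual left by the layers below it;
    `steps` runs from the coarsest (base) layer to the finest.
    """
    recon = 0.0
    levels = []
    for step in steps:
        level = quantize(coeff - recon, step)
        levels.append(level)
        recon += dequantize(level, step)
    return recon, levels


def bitplane_reconstruction(residual, planes_total, planes_received):
    """Embedded refinement: keep only the top bit-planes of |residual|."""
    shift = planes_total - planes_received
    magnitude = (abs(residual) >> shift) << shift
    return magnitude if residual >= 0 else -magnitude


if __name__ == "__main__":
    coeff = 110.0            # hypothetical large DCT coefficient
    steps = [16, 12, 8]      # hypothetical per-layer step sizes

    recon, levels = layered_reconstruction(coeff, steps)
    single = dequantize(quantize(coeff, steps[-1]), steps[-1])
    print("levels per layer:      ", levels)
    print("three-layer recon:     ", recon)
    print("single-layer recon (8):", single)

    # Bit-plane refinement of the base-layer residual.
    base = dequantize(quantize(coeff, steps[0]), steps[0])
    residual = round(coeff - base)
    for received in range(1, 5):
        refined = base + bitplane_reconstruction(residual, 4, received)
        print("planes received:", received, "recon:", refined)
```

Under these assumptions the two upper layers contribute nothing to the coefficient, so the three-layer reconstruction is no closer to the original than the base layer alone, while a single-layer encoding at the finest step size is closer. The bit-plane loop shows the reconstruction approaching the original value as more planes arrive.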
0-7803-7750-8/03/$17.00 ©2003 IEEE. ICIP 2003