IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 11, NO. 2, FEBRUARY 2001 199
Encoding Stored Video for Streaming Applications
I.-Ming Pao, Member, IEEE, and Ming-Ting Sun, Fellow, IEEE
Abstract—In streaming video applications, video sequences
are encoded off-line and stored in a server. Users may access the
server over a constant bitrate channel. Examples of streaming
video applications are video-on-demand, archived video news,
and noninteractive distance learning. Before the playback, part of
the video bitstream is pre-loaded in the decoder buffer to ensure
that every frame can be decoded at the scheduled time. For these
streaming video applications, since the video is encoded off-line
and the future video frames are available to the encoder, a more
sophisticated bit-allocation scheme can be used to achieve better
video quality. During the encoding process for streaming video,
two requirements need to be considered: the pre-loading time that
the video viewers have to wait and the physical buffer-size at the
receiver (decoder) side. In this paper, we propose a sliding-window
rate-control scheme that uses statistical information about future
video frames as guidance to achieve better video quality for
video streaming over constant bitrate channels. A quantized
discrete cosine transform coefficient selection scheme based on
the rate-distortion measurement is also used to improve the video
quality. Simulation results show video quality improvements over
the regular H.263 TMN8 encoder.
Index Terms—Bit-allocation, buffer, rate control, streaming
video.
I. INTRODUCTION
Digital video applications have become increasingly
popular in our everyday life. Currently, there are several
video standards established for different purposes, for example,
MPEG-1 [1] and MPEG-2 [2] for multimedia applications,
and H.263 [3], [4] for video-conferencing applications. All
these standards use discrete cosine transform (DCT), motion
compensation (MC) (which involves motion estimation and
motion-compensated prediction), quantization, and variable
length coding (VLC) as the building blocks. A rate-control
scheme, which decides the quantization step-size and monitors
the buffer fullness, is another important part of the video
encoder and can greatly affect the video quality; it is not specified
in the standards and is left open for application-dependent
implementation.
With the knowledge of the channel model, the rate-control
scheme in the video encoder produces video bitstreams which
can be supported by the channels. Rate-control schemes in current
standard test models (e.g., MPEG-2 TM5 [5] and H.263
TMN8 [6]) are usually used for real-time visual communications.
Video encoders receive the video (image frames) from
the video capture device and generate the compressed bitstream.
Manuscript received June 3, 1999; revised April 28, 2000. This paper was
recommended by Associate Editor S. Panchanathan.
The authors are with the Information Processing Laboratory, Department
of Electrical Engineering, University of Washington, Seattle, WA 98195-2500
USA.
Publisher Item Identifier S 1051-8215(01)01240-X.
Fig. 1. Video users can access video database over networks for streaming
video applications.
The bitstream is then sent to the decoder over the channel.
Delay is an important issue in real-time communication. For example,
a delay of a few seconds is not acceptable for video conferencing
applications. The whole process of capturing video,
encoding, transmission, and decoding needs to be done within
the delay constraint in real-time communication applications.
In this paper, our focus is on non-real-time visual communications,
such as video-on-demand, digital libraries, and noninteractive
distance learning. For these applications, video sequences
are encoded in advance and stored in the server. Users
may access the server over a constant bitrate channel, such as
the Public Switched Telephone Network (PSTN) or Integrated
Services Digital Network (ISDN) (Fig. 1). Before the playback,
part of the video bitstream is pre-loaded in the decoder buffer to
ensure that every frame can be decoded at the scheduled time.
The pre-loading time (or how many bits need to be pre-loaded)
depends on several factors, such as how much network delay
jitter needs to be smoothed out, the physical buffer-size at the
decoder, and the waiting time a viewer is willing to accept. The
distribution of the bit counts for each frame may also affect the
pre-loading time. For example, if the encoder uses the same
number of bits (channel-rate/frame-rate) to encode every frame,
the pre-loading of only the first frame is needed. However, due
to the differing complexity of video frames, every frame usually
needs a different number of bits for encoding in order to produce
good-quality video. Pre-loading is necessary for a video
sequence containing active frames that use more bits than the
channel can deliver in one frame interval. Without proper pre-loading,
these active frames will not be decoded at the right time because their bits
may not have arrived at the decoder buffer due to the channel
bandwidth limitation (in this paper we assume that there is no
feedback from the client to the server).
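The relationship between per-frame bit counts and pre-loading described above can be illustrated with a small sketch. It is not the scheme proposed in this paper; it assumes a simplified model in which the channel delivers exactly channel-rate/frame-rate bits per frame interval, there is no network jitter, and each frame is removed from the decoder buffer instantaneously at its decode time. The function names and the example bit counts are hypothetical.

```python
from itertools import accumulate


def min_preload_bits(frame_bits, channel_rate, frame_rate):
    """Smallest number of pre-loaded bits that avoids decoder underflow.

    Assumed model: frame k (0-based) is decoded k frame intervals after
    playback starts, and the channel keeps delivering
    channel_rate / frame_rate bits during every frame interval.
    """
    per_frame = channel_rate / frame_rate
    # Frame k is decodable only if all bits of frames 0..k have arrived,
    # i.e. preload + k * per_frame >= cumulative bits through frame k.
    return max(total - k * per_frame
               for k, total in enumerate(accumulate(frame_bits)))


def peak_buffer_bits(frame_bits, channel_rate, frame_rate, preload_bits):
    """Largest decoder-buffer occupancy seen just before any frame is decoded."""
    per_frame = channel_rate / frame_rate
    stream_total = sum(frame_bits)
    consumed = 0
    peak = 0
    for k, bits in enumerate(frame_bits):
        # The server cannot deliver more bits than the stream contains.
        delivered = min(preload_bits + k * per_frame, stream_total)
        peak = max(peak, delivered - consumed)
        consumed += bits
    return peak


if __name__ == "__main__":
    rate, fps = 64000, 10              # hypothetical 64 kbit/s channel, 10 frames/s
    constant = [6400] * 5              # every frame uses channel-rate/frame-rate bits
    active = [6400, 6400, 20000, 6400, 6400]   # one active frame

    for name, frames in (("constant", constant), ("active", active)):
        preload = min_preload_bits(frames, rate, fps)
        print(name, "preload bits:", preload,
              "preload time (s):", preload / rate,
              "peak buffer bits:", peak_buffer_bits(frames, rate, fps, preload))
```

Under this model, the constant-allocation case needs only the first frame (6400 bits) to be pre-loaded, while the sequence with one active frame needs 20000 bits, or about 0.31 s of waiting on the assumed 64 kbit/s channel. The second function gives a rough idea of the physical buffer size the decoder would need for a given pre-load.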