IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 11, NO. 2, FEBRUARY 2001 199

Encoding Stored Video for Streaming Applications

I.-Ming Pao, Member, IEEE, and Ming-Ting Sun, Fellow, IEEE

Abstract—In streaming video applications, video sequences are encoded off-line and stored in a server. Users may access the server over a constant bitrate channel. Examples of streaming video applications are video-on-demand, archived video news, and noninteractive distance learning. Before the playback, part of the video bitstream is pre-loaded in the decoder buffer to ensure that every frame can be decoded at the scheduled time. For these streaming video applications, since the video is encoded off-line and the future video frames are available to the encoder, a more sophisticated bit-allocation scheme can be used to achieve better video quality. During the encoding process for streaming video, two requirements need to be considered: the pre-loading time that the video viewers have to wait and the physical buffer-size at the receiver (decoder) side. In this paper, we propose a sliding-window rate-control scheme that uses statistical information of the future video frames as guidance to generate better video quality for video streaming involving constant bitrate channels. A quantized discrete cosine transform coefficient selection scheme based on the rate-distortion measurement is also used to improve the video quality. Simulation results show video quality improvements over the regular H.263 TMN8 encoder.

Index Terms—Bit-allocation, buffer, rate control, streaming video.

I. INTRODUCTION

DIGITAL video applications have become increasingly popular in our everyday life. Currently, there are several video standards established for different purposes, for example, MPEG-1 [1] and MPEG-2 [2] for multimedia applications, and H.263 [3], [4] for video-conferencing applications.
All these standards use discrete cosine transform (DCT), motion compensation (MC) (which involves motion estimation and motion-compensated prediction), quantization, and variable length coding (VLC) as the building blocks. A rate-control scheme, which decides the quantization step-size and monitors the buffer fullness, is another important part of the video encoder and can greatly affect the video quality; it is not specified in the standards and is left open for application-dependent implementation.

With the knowledge of the channel model, the rate-control scheme in the video encoder produces video bitstreams which can be supported by the channels. Rate-control schemes in current standard test models (e.g., MPEG-2 TM5 [5] and H.263 TMN8 [6]) are usually used for real-time visual communications. Video encoders receive the video (image frames) from the video capture device and generate the compressed bitstream. The bitstream is then sent to the decoder over the channel. Delay is an important issue in real-time communication. For example, a delay of a few seconds is not acceptable for video conferencing applications. The whole process of capturing video, encoding, transmission, and decoding needs to be done within the delay constraint in real-time communication applications.

Manuscript received June 3, 1999; revised April 28, 2000. This paper was recommended by Associate Editor S. Panchanathan. The authors are with the Information Processing Laboratory, Department of Electrical Engineering, University of Washington, Seattle, WA 98195-2500 USA. Publisher Item Identifier S 1051-8215(01)01240-X.

Fig. 1. Video users can access video database over networks for streaming video applications.

In this paper, our focus is on non-real-time visual communications, such as video-on-demand, digital library, and noninteractive distance learning. For these applications, video sequences are encoded in advance and stored in the server.
Users may access the server over a constant bitrate channel, such as the Public Switched Telephone Network (PSTN) or Integrated Services Digital Network (ISDN) (Fig. 1). Before the playback, part of the video bitstream is pre-loaded in the decoder buffer to ensure that every frame can be decoded at the scheduled time. The pre-loading time (or how many bits need to be pre-loaded) depends on several factors, such as how much network delay jitter needs to be smoothed out, the physical buffer-size at the decoder, and the waiting time a viewer is willing to accept. The distribution of the bit counts for each frame may also affect the pre-loading time. For example, if the encoder uses the same number of bits (channel-rate/frame-rate) to encode every frame, only the first frame needs to be pre-loaded. However, because video frames differ in complexity, each frame usually needs a different number of bits in order to produce good quality video. Pre-loading is necessary for video sequences containing active frames whose bit counts exceed what the channel can deliver in one frame interval. Without proper pre-loading, these active frames will not be decoded at the right time, because their bits may not have arrived at the decoder buffer due to the channel bandwidth limitation (in this paper we assume that there is no feedback from the client to the server).
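The dependence of the pre-loading time on the per-frame bit counts can be illustrated with a minimal sketch. It assumes an idealized model that is not spelled out in the text: bits arrive at exactly the channel rate with no delay jitter, the decoder buffer is large enough never to overflow, and frame i (counted from 0) is removed from the buffer at time delay + i/frame-rate, by which point all of its bits must have arrived. The function name `min_preload_seconds` and its parameter names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from itertools import accumulate


def min_preload_seconds(frame_bits, channel_rate, frame_rate):
    """Smallest start-up delay (seconds) that avoids decoder-buffer underflow.

    Assumed model (illustrative only): constant channel_rate in bits/s,
    no jitter, unlimited decoder buffer, frame i decoded at delay + i/frame_rate.
    """
    rate = Fraction(channel_rate)
    fps = Fraction(frame_rate)
    worst = Fraction(0)
    # Frame i needs cum_i bits delivered by its decode time, so the delay
    # must satisfy delay >= cum_i / rate - i / fps for every frame i.
    for i, cum in enumerate(accumulate(frame_bits)):
        needed = Fraction(cum) / rate - Fraction(i) / fps
        worst = max(worst, needed)
    return worst


# 64 kbit/s channel at 10 frames/s, i.e. 6400 bits per frame interval.
constant = [6400] * 5                          # every frame uses channel-rate/frame-rate bits
active = [6400, 6400, 19200, 6400, 6400]       # third frame is an "active" frame

print(min_preload_seconds(constant, 64000, 10))  # 1/10 s: only the first frame is pre-loaded
print(min_preload_seconds(active, 64000, 10))    # 3/10 s: the active frame forces a longer wait
```

In the constant case the result equals one frame interval, which matches the observation that only the first frame has to be pre-loaded. Multiplying the returned delay by the channel rate gives the corresponding number of pre-loaded bits under the same assumptions.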