Statistical Models for MPEG Video Standard

J. MATA, S. SALLENT, J. BALSELLS, J. ZAMORA and *A. van der KOLK

Department of Applied Mathematics and Telematics, Polytechnic University of Catalonia, C/ Gran Capità, S/N, Módulo C-3, Campus Nord, 08034 Barcelona, SPAIN. Tel/Fax +34-[3] 401 6014 / 401 5981, E-Mail: jmata@mat.upc.es

* Tele-Informatics and Open Systems, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands.

Abstract. This paper introduces a statistical model for a variable bit rate (VBR) MPEG output stream. The appropriate choice of the parameters (q, M, N) that define a VBR MPEG codec is studied. The traffic is assumed to be generated on an ATM network. A VBR MPEG codec is simulated in an object-oriented programming environment written in C++. A video luminance sequence of 3600 frames is used in this work. The codec frame stream is separated according to the coding algorithm: Intra, Predicted and Bidirectionally-Predicted. A statistical analysis of each stream is presented. It can be concluded that, in all cases, the statistical behaviour approaches that of a binomial random variable. A composed multistage Markov chain is proposed to model the VBR MPEG output stream. Histograms comparing the empirical data with the model results are also shown. To generate the video traffic stream, an event-driven simulator was developed in C++. The frame bit rate is controlled by a semi-Markov chain model. Within each state of the chain, cells are delivered at equally spaced intervals.

1. Introduction

Asynchronous Transfer Mode (ATM) broadband networks will support traffic coming from variable bit rate (VBR) video codecs [1], which are capable of maintaining a constant picture quality in the reconstructed image. The modeling of VBR video sources therefore becomes important in the analysis and design of Broadband Integrated Services Digital Networks (B-ISDN).
The network architecture and its characteristics, such as cell-loss probability, transmission delay, high-speed statistical multiplexing gain and buffering, are strongly related to the statistical properties of the sources and the coding schemes involved.

Several video coding schemes have been proposed for VBR services. MPEG is a video coding standard [2] which can be used for transmitting real-time variable bit rate broadcast video. MPEG has two main coding modes: intraframe (I) and interframe. The interframe mode in turn distinguishes two types of frames: predicted (P) and bidirectionally-predicted (B). Four levels of coding can be considered: picture, slice, macroblock and block. A picture (or frame) is the basic unit of display. The frame size in pixels depends on the application. A slice is a horizontal strip within a frame. A macroblock consists of four 8x8 blocks of luminance pixels and two 8x8 chrominance blocks. The smallest unit is a block, which is an 8x8 matrix of pixels. Discrete cosine transform (DCT) and motion compensation techniques are used in the MPEG algorithm. A video sequence of pictures (SOP) is divided into groups of N pictures (GOP). A GOP consists of subgroups of M pictures, where the first is a reference picture, intra or predicted, and the rest are bidirectionally-predicted. The image quality depends on M, N and the selected quantizer step size (q). An MPEG codec can be set in an open-loop mode to maintain the quality with a fixed q, and the coded variable bit rate (VBR) output stream is delivered to the network.

A characterization of the traffic generated by a VBR source is necessary to allocate resources in ATM networks, as well as to keep a satisfactory quality of service (QoS). ATM is a connection-oriented transfer mode. In the call establishment phase, service requirements are negotiated between the user and the network.
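The GOP structure described above, with groups of N pictures split into subgroups of M pictures, can be illustrated with a minimal C++ sketch. The function names `frameType` and `gopPattern` are hypothetical and are not taken from the simulator used in this work. The sketch assumes display order, that N is a multiple of M, and that every GOP starts with an I frame.

```cpp
#include <stdexcept>
#include <string>

// Frame type at 0-based display position `index` of a sequence coded with
// GOP length N and subgroup length M (illustrative names, not the paper's).
// Assumption: N is a multiple of M and every GOP starts with an I frame.
char frameType(int index, int M, int N) {
    if (M <= 0 || N <= 0 || N % M != 0 || index < 0) {
        throw std::invalid_argument("need M > 0, N a positive multiple of M, index >= 0");
    }
    const int pos = index % N;          // position inside the current GOP
    if (pos == 0) return 'I';           // first picture of the GOP: intra
    return (pos % M == 0) ? 'P' : 'B';  // first of each later subgroup: predicted
}

// Frame types of one complete GOP, in display order.
// Returns an empty string when N is not positive.
std::string gopPattern(int M, int N) {
    std::string pattern;
    for (int i = 0; i < N; ++i) {
        pattern.push_back(frameType(i, M, N));
    }
    return pattern;
}
```

Under these assumptions, `gopPattern(3, 12)` yields `IBBPBBPBBPBB`: one I frame, three P frames and eight B frames per GOP.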
Peak and mean rate are the main descriptors used to specify the statistical properties of the traffic. Source models that adequately characterize the traffic behaviour are useful in the analysis and dimensioning of network components [3]. Several source models for video traffic have been developed [4, 5, 6] and compared [7]. In this work, separate statistical analyses of the traffic generated by I, P and B coded frames are introduced. A composed multistage Markov chain for the three activity levels (I, P, B) is developed as a statistical model to characterize the above-mentioned video standard source.

The rest of the paper is organized as follows. In Section 2 we discuss the influence of the parameters q, M, N on the picture quality. The I, P and B flows are analyzed statistically in Section 3, and Markovian models are proposed in Section 4. Finally, some conclusions are presented in Section 5.

2. MPEG video coding analysis

The MPEG coding algorithm was developed to achieve a high compression ratio with good picture quality. A suitable choice of the parameters q, M, N is important to minimize the traffic bit rate for a fixed subjective quality or for a constant signal-to-noise ratio (SNR). In this paper, a documentary called "Geografia de Catalunya", consisting of 3600 luminance frames (352 x 288 pels per frame), is used to study subjective quality and video source traffic. The documentary contains both high and low activity movement, as well as scene changes. Several video sequences with low activity are also tested (Flower-garden, Miss America,...); however, the results obtained for these cases are not representative of broadcast video, since these sequences have slow movement or follow a head-and-shoulders scheme. Nine sets of parameters q, M, N are chosen to code the video sequence. Figure 1 shows the I, P and B interpolated curves for q=8, M=3, N=12. The rate is specified in cells per frame, assuming a cell payload of 48 bytes.
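The conversion from coded frame size to cells per frame, together with the mean and peak descriptors mentioned earlier, can be sketched as follows. This is only an illustration: the names `cellsPerFrame`, `RateDescriptors` and `describe` are hypothetical, and the sketch assumes that a partially filled last cell is padded and counted as a full cell, which is not stated in the text.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Payload of one ATM cell in bits (48 bytes, as used in the text).
constexpr std::uint64_t kPayloadBits = 48 * 8;

// Number of cells needed to carry one coded frame.
// Assumption: a partially filled last cell counts as a whole cell.
std::uint64_t cellsPerFrame(std::uint64_t frameBits) {
    return (frameBits + kPayloadBits - 1) / kPayloadBits;  // integer ceiling
}

// Mean and peak rate of a frame stream, both in cells per frame.
struct RateDescriptors {
    double meanCells;
    std::uint64_t peakCells;
};

// Computes the two descriptors from a list of coded frame sizes in bits.
// An empty input yields a mean and a peak of zero.
RateDescriptors describe(const std::vector<std::uint64_t>& frameBits) {
    RateDescriptors d{0.0, 0};
    if (frameBits.empty()) return d;
    std::uint64_t total = 0;
    for (std::uint64_t bits : frameBits) {
        const std::uint64_t cells = cellsPerFrame(bits);
        total += cells;
        d.peakCells = std::max(d.peakCells, cells);
    }
    d.meanCells = static_cast<double>(total) / static_cast<double>(frameBits.size());
    return d;
}
```

Applying `describe` separately to the I, P and B frame sizes would give one pair of descriptors per stream, in line with the separate analysis of the three flows.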
The slope of the I curve readily identifies scene changes and camera movement. The height of the I curve is related to the scene complexity. Notice that the P curve is close to the I curve when camera movement or scene changes occur, because forward prediction is inefficient and, consequently, the P macroblocks are forced to be coded in intra mode. In this case, the B curve shows a proportional