Improved ρ-Domain Rate Control and Perceived Quality Optimizations for MPEG-4 Real-Time Video Applications

Michael Militzer, Maciej Suchomski, Klaus Meyer-Wegener
Database Systems Chair, Friedrich-Alexander University of Erlangen-Nuremberg
Martensstr. 3, 91058 Erlangen, Germany
mm@xvid.org, {ms, kmw}@informatik.uni-erlangen.de

ABSTRACT
This paper describes a bit rate control method for one-pass MPEG-4 video encoding that makes the encoder suitable for real-time applications. The proposed control method has low computational complexity and is more accurate than previous approaches. As a result, the rate-control buffer size, which strongly influences the latency between a video sender and receiver, can be decreased significantly. Additionally, a solution is proposed for increasing the perceived quality by introducing an advanced bit allocation scheme and by exploiting activity masking. The proposed algorithm has been implemented in the XVID codec, a representative implementation of the MPEG-4 standard. Experiments show that the proposed algorithm is highly accurate and provides improved perceived visual quality. Moreover, the implementation outperforms existing bit rate control algorithms.

Categories and Subject Descriptors
H.4.3 [Information Systems Applications]: Communications Applications – computer conferencing, teleconferencing, and videoconferencing.
H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems – evaluation/methodology, video.

General Terms
Algorithms, Measurement, Performance, Design, Experimentation, Verification.

Keywords
"bit rate control", "real-time", "video encoding", "MPEG-4", "ρ-domain", "quality optimization", "live streaming"

1 Introduction
Today, effective video compression has become an essential requirement for data transmission over digital networks.
In general, two kinds of video transmission are distinguished: (1) complete transmission of stored video from server to client before playback starts, and (2) on-time transmission under quality of service (QoS) constraints (typical of real-time applications). Transmission channels are limited and have a constant bandwidth, owing to the nature of digital networks. This, together with the growing demand for transmission of visual information, has stimulated the development of video compression standards such as MPEG-2 [11], H.263 [6] and MPEG-4 [12]. However, the coding bit rate varies with the temporal and spatial activity of the video being encoded. Therefore, to allow consistent video transmission over limited channels, a bit rate control algorithm has to be employed during the video compression process.

In real-time multimedia applications, additional requirements apply beyond the limitations mentioned above, most notably very low latency in end-to-end video communication. Delays in such applications must not exceed 200 ms. Consequently, the design of a real-time rate control algorithm must account for the processing time of continuously arriving input data, i.e. the algorithm has to adjust quickly to drastic and unpredictable changes in the behavior of the system. Supporting only intra-coded (I-) or forward-predicted (P-) frames therefore seems to be the only viable solution, because the use of bidirectionally predicted (B-) frames requires a rearrangement of the transmitted stream structure and results in increased latency. The proposed algorithm can easily be applied to B-frames as well, but any use of B-frames in real-time applications should be carefully considered.

Real-time compression in video conferencing has yet another important characteristic: the algorithm has no advance information about the content of the video it needs to process.
Consequently, algorithms that first analyze an entire video sequence for compressibility and then, in a second step, encode the sequence using the analysis results to improve presentation quality (as done by the two-pass or three-pass algorithms commonly used in offline compression) cannot be employed. Alternative ways must be found to reliably control the bit rate while simultaneously raising the perceived quality of the resulting coded video.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
MM’03, November 2-8, 2003, Berkeley, California, USA.
Copyright 2003 ACM 1-58113-722-2/03/0011…$5.00.

The basic idea is to combine a novel and accurate bit rate control algo-