TOWARDS PRACTICAL WYNER-ZIV CODING OF VIDEO
Anne Aaron, Eric Setton, and Bernd Girod
Information Systems Laboratory, Department of Electrical Engineering
Stanford University, Stanford, CA 94305
{amaaron,esetton,bgirod}@stanford.edu
ABSTRACT
In current interframe video compression systems, the encoder performs predictive coding to exploit the similarities of successive frames. The Wyner-Ziv Theorem on source coding with side information available only at the decoder suggests that an asymmetric video codec, where individual frames are encoded separately but decoded conditionally (given temporally adjacent frames), could achieve similar efficiency. We report results on a Wyner-Ziv coding scheme for motion video that uses intraframe encoding, but interframe decoding. In the proposed system, key frames are compressed by a conventional intraframe codec and in-between frames are encoded using a Wyner-Ziv intraframe coder. The decoder uses previously reconstructed frames to generate side information for interframe decoding of the Wyner-Ziv frames.
1. INTRODUCTION
Current video compression standards perform interframe predictive coding to exploit the similarities among successive frames. Since predictive coding makes use of motion estimation, the video encoder is typically 5 to 10 times more complex than the decoder. This asymmetry in complexity is desirable for broadcasting or for streaming video-on-demand systems where video is compressed once and decoded many times. However, some future systems may require the dual scenario. For example, we may be interested in compression for mobile wireless cameras uploading video to a fixed base station. Compression must be implemented at the camera, where memory and computation are scarce. For this type of system, what we desire is a low-complexity encoder, possibly at the expense of a high-complexity decoder, that nevertheless compresses efficiently.
To achieve low-complexity encoding, we propose an asymmetric video compression scheme where individual frames are encoded independently (intraframe encoding) but decoded conditionally (interframe decoding). Two results from information theory suggest that an intraframe-encoder, interframe-decoder system can come close to the efficiency of an interframe encoder-decoder system. Consider two statistically dependent discrete signals, X and Y, which are compressed using two independent encoders but are decoded by a joint decoder. The Slepian-Wolf Theorem on distributed source coding states that even if the encoders are independent, the achievable rate region for the probability of decoding error to approach zero is R_X ≥ H(X|Y), R_Y ≥ H(Y|X), and R_X + R_Y ≥ H(X,Y) [1]. The counterpart of this theorem for lossy source coding is Wyner and Ziv's work on source coding with side information [2]. Let X and Y be statistically dependent Gaussian random processes, and let Y be known as side information for encoding X. Wyner and Ziv showed that the conditional rate-distortion function for X under the mean squared error distortion measure is the same whether the side information Y is available only at the decoder, or at both the encoder and the decoder. We refer to lossless distributed source coding as Slepian-Wolf coding and to lossy source coding with side information at the decoder as Wyner-Ziv coding.
Although these information-theoretic results offer significant insights into compression, there are few examples where they have been considered for practical compression applications. Pradhan and Ramchandran applied distributed source coding to a system where a digital stream enhances the quality of a noisy analog image transmission [3]. Similarly, Liveris et al. used turbo codes to encode the pixels of an image with a noisy version of the image available at the decoder [4].
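As a numerical illustration of the Slepian-Wolf bounds stated above, the following minimal Python sketch computes H(X|Y), H(Y|X), and H(X,Y) from a joint probability table and tests whether a rate pair lies in the achievable region. The toy source (a fair bit X and a copy Y flipped with probability 0.1) and all function names are assumptions introduced here for illustration only; they are not part of the proposed codec.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a list of probabilities (zeros are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def slepian_wolf_bounds(joint):
    """Return (H(X|Y), H(Y|X), H(X,Y)) in bits, where joint[x][y] = P(X=x, Y=y)."""
    px = [sum(row) for row in joint]        # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]  # marginal distribution of Y
    h_xy = entropy([p for row in joint for p in row])
    # Chain rule: H(X|Y) = H(X,Y) - H(Y) and H(Y|X) = H(X,Y) - H(X)
    return h_xy - entropy(py), h_xy - entropy(px), h_xy


def in_rate_region(rx, ry, joint):
    """True if the rate pair (rx, ry) satisfies all three Slepian-Wolf bounds."""
    h_x_given_y, h_y_given_x, h_xy = slepian_wolf_bounds(joint)
    return rx >= h_x_given_y and ry >= h_y_given_x and rx + ry >= h_xy


# Hypothetical toy source: X is a fair bit, Y equals X flipped with probability p.
p = 0.1
joint = [[0.5 * (1 - p), 0.5 * p],
         [0.5 * p, 0.5 * (1 - p)]]

h_x_given_y, h_y_given_x, h_xy = slepian_wolf_bounds(joint)
```

For this toy source, H(X|Y) equals the binary entropy of the flip probability, so X can in principle be sent at well under 1 bit per sample when Y is sent at its full rate of 1 bit, even though the encoder of X never sees Y. A call such as `in_rate_region(0.5, 1.0, joint)` checks one candidate rate pair against all three bounds.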
In these systems [3, 4], Wyner-Ziv coding is applied to natural images, and the side information is defined to be a version of the image corrupted by additive Gaussian noise. In [5], Jagmohan et al. discuss how a predictive coding scheme with multiple predictors can be seen as a Wyner-Ziv problem and can thus be solved using coset codes. Specifically, they suggest that Wyner-Ziv codes could be used to prevent prediction mismatch or drift in video systems. In [6], we apply Wyner-Ziv coding to a real-world video signal. We take X as the even frames and Y as the odd frames of the video sequence. X is compressed by an intraframe encoder that does not know Y. The compressed stream is sent to a decoder which uses Y as side information to conditionally decode X. A similar video compression system using distributed source coding principles was proposed independently by Puri et al. in [7].
In this work, we extend the Wyner-Ziv video codec, first outlined in our paper [6], to a more general and practical framework. The key frames of the video sequence are compressed using a conventional intraframe codec. The remaining frames, the Wyner-Ziv frames, are intraframe encoded using a Wyner-Ziv encoder. To decode a Wyner-Ziv frame, previously decoded frames (both key frames and Wyner-Ziv frames) are used to generate side information. Interframe decoding of the Wyner-Ziv frames is performed by exploiting the inherent similarities between the Wyner-Ziv frame and the side information.
In Section 2, we describe the proposed Wyner-Ziv video codec. In Section 3, we present different frame dependency arrangements and discuss the decoder flexibility in generating side information. Finally, in Section 4, we compare the performance of the proposed coder to conventional intraframe coding, using a standard H.263+ video coder.
2. WYNER-ZIV VIDEO CODEC
We propose an intraframe encoder and interframe decoder system for video compression as shown in Fig. 1. A subset of frames from