EXTRAPOLATING SIDE INFORMATION FOR LOW-DELAY PIXEL-DOMAIN DISTRIBUTED VIDEO CODING

Luís Natário 1, Catarina Brites 2, João Ascenso 3, Fernando Pereira 4

1,2,4 Instituto Superior Técnico – Instituto de Telecomunicações
3 Instituto Superior de Engenharia de Lisboa – Instituto de Telecomunicações
{luis.natario; catarina.brites; joao.ascenso; fernando.pereira}@lx.it.pt

ABSTRACT

Distributed Video Coding (DVC) is a new video coding approach based on the Wyner-Ziv theorem. Unlike most existing video codecs, each frame is encoded separately (either as a key-frame or as a Wyner-Ziv frame), which results in a simpler and lighter encoder since complex operations such as motion estimation are not performed. The previously decoded frames are used at the decoder to estimate the Wyner-Ziv frames: the frames are coded independently but decoded jointly. To obtain a low-delay codec, the side information frames (the estimates of the Wyner-Ziv frames to be decoded) must be extrapolated from past frames. This paper proposes a robust extrapolation module that generates the side information based on motion field smoothing, providing improved performance in the context of a low-delay pixel-domain DVC codec.

Keywords: distributed video coding, side information, motion extrapolation, low-delay

I. INTRODUCTION

Most existing coding schemes, namely the popular MPEG standards, are based on an architecture in which the encoder is typically much more complex than the decoder, mainly because of the computationally intensive motion estimation performed at the encoder. The Distributed Video Coding (DVC) approach, based on the Wyner-Ziv (WZ) theorem [3] (the extension of the Slepian-Wolf theorem [5] to the lossy case with side information available at the decoder), allows this scenario to be reversed by shifting the motion estimation complexity from the encoder to the decoder, enabling applications where low encoder complexity is a requirement.
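For reference, the two results cited above can be written compactly. The notation is introduced here for illustration and is not taken from the paper: X and Y are two statistically dependent sources (in the DVC setting, X can be read as the WZ frame and Y as its side information), R denotes a coding rate, H an entropy, and D a distortion level.

```latex
% Slepian-Wolf (lossless): rates achievable with separate encoding
% and joint decoding of X and Y
\[
R_X \ge H(X \mid Y), \qquad
R_Y \ge H(Y \mid X), \qquad
R_X + R_Y \ge H(X, Y)
\]
% Wyner-Ziv (lossy): X is coded with Y available at the decoder only;
% the rate cannot be lower than when Y is also known at the encoder
\[
R^{\mathrm{WZ}}_{X \mid Y}(D) \;\ge\; R_{X \mid Y}(D)
\]
```

The second inequality holds with equality for jointly Gaussian sources under a mean squared error distortion measure, which is the case usually invoked to motivate practical WZ video codecs.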
The Slepian-Wolf theorem states that it is possible to compress two statistically dependent signals in a distributed way (separate encoding and joint decoding) at a rate similar to that obtained with a system in which the signals are encoded and decoded jointly (as in traditional video coding schemes). In DVC schemes, each frame is encoded independently of previous and subsequent frames, which decreases the typical encoding complexity. To obtain a low-delay codec, the frames must be decoded without reference to future frames, i.e. the side information must be created by extrapolation (as opposed to interpolation, which also uses future frames).

The main novelty of this paper is the side information extrapolation module, which generates accurate side information by employing an extrapolation model that uses overlapped motion estimation, motion field smoothing and spatial interpolation for uncovered areas. This approach enables a low-delay DVC architecture that is particularly well suited to emerging applications where the encoder complexity must be as low as possible and low delay is a 'must have', such as wireless low-power surveillance and mobile camera phones, among others.

II. PIXEL-DOMAIN WYNER-ZIV CODEC ARCHITECTURE

The IST-Wyner-Ziv (IST-WZ) codec developed at IST [4] is based on the pixel-domain coding architecture proposed in [1]. The scheme, modified to support the low-delay extrapolation module, is depicted in Figure 1.
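To make the idea of extrapolating side information from past frames concrete, the following is a minimal sketch of block-based linear motion extrapolation. It is not the extrapolation module proposed in this paper: it assumes plain non-overlapped full-search block matching with a sum-of-absolute-differences criterion, omits motion field smoothing, and fills uncovered pixels by copying the co-located pixel of the last frame rather than by spatial interpolation. All function names, the block size and the search range are hypothetical choices made for illustration.

```python
def sad(a, b, ya, xa, yb, xb, n):
    """Sum of absolute differences between two n-by-n blocks."""
    return sum(abs(a[ya + i][xa + j] - b[yb + i][xb + j])
               for i in range(n) for j in range(n))


def estimate_motion(prev, curr, n=4, search=2):
    """For each n-by-n block of curr, find the displacement (vy, vx) that
    its content travelled from prev to curr (full search, SAD criterion)."""
    h, w = len(curr), len(curr[0])
    field = {}
    for y in range(0, h - n + 1, n):
        for x in range(0, w - n + 1, n):
            # Start from zero motion so that ties favour a static block.
            best = (sad(curr, prev, y, x, y, x, n), 0, 0)
            for vy in range(-search, search + 1):
                for vx in range(-search, search + 1):
                    py, px = y - vy, x - vx  # block position in prev
                    if 0 <= py <= h - n and 0 <= px <= w - n:
                        cost = sad(curr, prev, y, x, py, px, n)
                        if cost < best[0]:
                            best = (cost, vy, vx)
            field[(y, x)] = (best[1], best[2])
    return field


def extrapolate(prev, curr, n=4, search=2):
    """Predict the next frame by moving each block of curr one more step
    along its estimated motion vector (linear motion assumption)."""
    h, w = len(curr), len(curr[0])
    side = [[None] * w for _ in range(h)]
    for (y, x), (vy, vx) in estimate_motion(prev, curr, n, search).items():
        for i in range(n):
            for j in range(n):
                ty, tx = y + i + vy, x + j + vx
                if 0 <= ty < h and 0 <= tx < w:
                    side[ty][tx] = curr[y + i][x + j]  # last write wins
    # Uncovered pixels: copy the co-located pixel of curr (a simplification).
    return [[curr[r][c] if side[r][c] is None else side[r][c]
             for c in range(w)] for r in range(h)]
```

Frames are plain lists of rows of integer luminance values. Calling `extrapolate(prev, curr)` with the two most recent decoded frames returns an estimate of the following frame that could serve as side information; a practical module would also have to resolve overlapping projections and smooth the motion field, which this sketch does not attempt.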
[Figure 1: block diagram. Blocks: Conventional Intra frame encoder, Conventional Intra frame decoder, Quantization, Reconstruction, Slepian-Wolf encoder, Turbo encoder, Buffer, Slepian-Wolf decoder, Turbo decoder, Extrapolation, Feedback channel. Signals: Wyner-Ziv frames [X], Decoded WZ frames [X'], Decoded key frames [K'], Side information.]

Fig. 1 – Wyner-Ziv codec architecture with side information extrapolation

The work presented was developed within VISNET, a European Network of Excellence (http://www.visnet-noe.org).

In the IST-WZ codec, frames are encoded as one of two types: key-frames and WZ-frames. The key-frames are intra coded using, for example, H.263+. The WZ-frames are uniformly quantized and then encoded using a turbo-based Slepian-Wolf encoder. The key-frames (and, optionally, previously decoded WZ-frames) are used by the decoder to generate, by extrapolation, the side information which, together with the received WZ bits, is used to decode the WZ-frames. The Slepian-Wolf encoder generates sequences of parity bits for each bitplane output by the quantizer. These bits (which depend on the turbo encoder rate) are punctured