FOUR-BAND LINEAR-PHASE ORTHOGONAL SPATIAL FILTER BANK FOR SUBBAND VIDEO CODING

Grégoire Pau and Béatrice Pesquet-Popescu
GET-ENST, Signal and Image Proc. Dept.
46 Rue Barrault, 75634 Paris, France
e-mail: {gpau,pesquet}@tsi.enst.fr

ABSTRACT

In wavelet-based scalable video coding schemes, temporal interframe redundancy is exploited by applying a temporal wavelet transform along the motion trajectories. A spatial decomposition of the temporal subbands is then performed to take advantage of the spatial redundancy of the filtered frames. However, most t+2D video codecs do not take into account the very different spatial characteristics of the temporal subband frames and indiscriminately use the same spatial 9/7 biorthogonal transform to decompose them. In this paper, we present a spatial transform based on a four-band filter bank, whose frequency selectivity properties are shown to be better suited to representing detail frames. We give the analytical form of a linear-phase, orthogonal and regular four-band filter bank, and we show by experimental results on video sequences that significant PSNR improvements can be obtained by using the proposed filter bank to decompose the detail frames.

1. INTRODUCTION

Subband motion-compensated temporal filtering (MCTF) video codecs have recently attracted a lot of attention [1, 2], owing to their scalability features and to a compression performance comparable to that of state-of-the-art hybrid codecs. The spatio-temporal subband scheme (t+2D) exploits the temporal interframe redundancy by applying a temporal wavelet transform along the motion trajectories of the frames of a video sequence. A spatial decomposition of the temporal subbands is then performed to take advantage of the spatial redundancy of the filtered frames, and the resulting wavelet coefficients can be encoded by different algorithms such as 3D-SPIHT [3], 3D-ESCOT [4] or MC-EZBC [1].
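As a rough illustration of the temporal stage of a t+2D scheme, the sketch below applies one level of an orthonormal Haar transform to a pair of frames. It is a simplification under stated assumptions: motion compensation is omitted (the filtering runs along fixed pixel positions rather than along motion trajectories), the Haar filter is chosen only for brevity, and the function names `temporal_haar` and `inverse_temporal_haar` are hypothetical.

```python
import math

def temporal_haar(frame_a, frame_b):
    """One level of orthonormal temporal Haar filtering on two frames.

    Assumption: no motion compensation, so each pixel is filtered
    with the pixel at the same position in the other frame.
    Returns (approximation frame, detail frame).
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [[s * (a + b) for a, b in zip(ra, rb)]
              for ra, rb in zip(frame_a, frame_b)]
    detail = [[s * (b - a) for a, b in zip(ra, rb)]
              for ra, rb in zip(frame_a, frame_b)]
    return approx, detail

def inverse_temporal_haar(approx, detail):
    """Invert temporal_haar and return the two original frames."""
    s = 1.0 / math.sqrt(2.0)
    frame_a = [[s * (l - h) for l, h in zip(rl, rh)]
               for rl, rh in zip(approx, detail)]
    frame_b = [[s * (l + h) for l, h in zip(rl, rh)]
               for rl, rh in zip(approx, detail)]
    return frame_a, frame_b

# Toy 4x4 frames: flat background of 10 with one bright pixel that
# moves one position to the right between the two frames.
frame_a = [[10.0] * 4 for _ in range(4)]
frame_b = [[10.0] * 4 for _ in range(4)]
frame_a[1][1] = 50.0
frame_b[1][2] = 50.0

approx, detail = temporal_haar(frame_a, frame_b)
rec_a, rec_b = inverse_temporal_haar(approx, detail)
```

In this toy case the detail frame is zero over the static background and carries a negative and a positive spike at the old and new positions of the moving pixel, which is the kind of isolated sharp structure that uncompensated motion leaves in a high-pass temporal subband.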
Approximation frames result from the low-pass temporal filtering of video frames and look very similar to natural images, with large piecewise smooth areas. Detail frames result from the high-pass temporal filtering and do not behave like natural images: they are dominated by high-frequency textures and by sharp edges caused by the temporal misprediction of moving areas.

Most MCTF-based video codecs do not take into account the very different spatial characteristics of the temporal approximation and detail frames, and use the popular 9/7 biorthogonal wavelet transform to decompose them spatially, regardless of their type. The spatial 9/7 transform has been shown to perform well on natural images [5] and should therefore be appropriate for approximation frames. However, it may not be as well suited to representing detail frames, which contain a significant amount of intermediate and high frequencies. Since detail frames usually constitute a major portion of the temporal subband frames, an effective and parsimonious spatial representation of these frames is highly desirable.

Previous work [6] considered the use of wavelet packets to decompose displaced frame differences (DFD), which are close to detail frames. However, the computational complexity of wavelet packet best-basis algorithms is high, the corresponding filter banks are not always very selective, and the overall results were not very satisfactory.

We present in this paper a spatial transform based on M-band filter banks, whose frequency selectivity properties are better suited to representing detail frames. From the general design framework proposed by Alkin and Caglar [7], we analytically derive a linear-phase, orthogonal and regular 4-band filter bank. We show by experimental results on video sequences that significant PSNR improvements can be obtained by using the proposed 4-band filter bank to decompose the detail frames.
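To make the notion of a 4-band, orthogonal, linear-phase filter bank concrete, the sketch below uses the rows of the normalized 4x4 Hadamard matrix as four length-4 analysis filters applied to non-overlapping blocks. This is not the filter bank designed in this paper, which is derived from the framework of [7]; the Hadamard transform is swapped in only because it is about the simplest system with these three properties. The names `H4`, `analyze` and `synthesize` are hypothetical.

```python
# Rows of the normalized 4x4 Hadamard matrix, ordered by number of
# sign changes. Each row is symmetric or antisymmetric (linear phase)
# and the rows are orthonormal.
H4 = [
    [0.5, 0.5, 0.5, 0.5],    # band 0: low-pass, symmetric
    [0.5, 0.5, -0.5, -0.5],  # band 1: antisymmetric
    [0.5, -0.5, -0.5, 0.5],  # band 2: symmetric
    [0.5, -0.5, 0.5, -0.5],  # band 3: highest frequency, antisymmetric
]

def analyze(signal):
    """Split a 1-D signal into four subbands, each decimated by 4."""
    if len(signal) % 4 != 0:
        raise ValueError("signal length must be a multiple of 4")
    bands = [[] for _ in H4]
    for start in range(0, len(signal), 4):
        block = signal[start:start + 4]
        for k, h in enumerate(H4):
            bands[k].append(sum(c * x for c, x in zip(h, block)))
    return bands

def synthesize(bands):
    """Rebuild the signal from its four subbands (transpose of analysis)."""
    signal = []
    for n in range(len(bands[0])):
        for i in range(4):
            signal.append(sum(H4[k][i] * bands[k][n] for k in range(4)))
    return signal

# A smooth block followed by a rapidly oscillating block.
x = [4.0, 4.0, 4.0, 4.0, 1.0, -1.0, 1.0, -1.0]
bands = analyze(x)
x_rec = synthesize(bands)

energy_in = sum(v * v for v in x)
energy_out = sum(v * v for band in bands for v in band)
```

The smooth block ends up entirely in band 0 and the oscillating block entirely in band 3, and because the transform is orthonormal the total energy of the subband coefficients equals that of the input. A 2-D separable version would apply the same analysis along rows and then along columns.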
The paper is organized as follows. In Section 2, we study the characteristics of temporal subband frames. In Section 3, we review the M-band filter bank, and in Section 4 we design an optimal filter bank for transforming the detail frames. Section 5 illustrates the coding performance of the proposed spatial decomposition through simulation results. We conclude in Section 6.

2. CHARACTERISTICS OF SUBBAND FRAMES

In order to study the spatial characteristics of the temporal subband frames, we compute their averaged 2D power spec-