Video Compression Using Structural Flow

Orkun Alatas, Omar Javed and Mubarak Shah
School of Computer Science
University of Central Florida
4000 Central Florida Blvd
Orlando, FL 32816
Email: alatas, ojaved, shah@cs.ucf.edu

Abstract: This paper proposes a new technique in wavelet video compression that exploits the spatiotemporal regularity of the video. A sequence of frames is said to be regular along the directions in which the pixels vary the least. The directions of regularity of a sequence depend on both its motion content and its spatial structure. We model these directions by a 3D vector field, referred to as the structural flow. This flow determines the paths of regularity, along which the entropy of the data is smaller. We use these paths to construct a special class of wavelet basis, namely the 3D orthonormal bandelet basis, for the directional decomposition of the sequence. Our experiments on several standard video sequences demonstrate a significant improvement in compression over standard wavelet video coding.

I. INTRODUCTION

Video compression is a very important part of many applications, such as video-conferencing, video storage, and broadcasting, since their performance relies largely on the efficiency of the compression. Wavelet coding, which has proved very efficient in image compression, is also used in this area, since it outperforms standard DCT (Discrete Cosine Transform) based methods such as MPEG-1 and MPEG-2.

In standard wavelet video coding, a group of frames (gof) is decomposed along the three major axes: temporal, horizontal and vertical. However, this decomposition does not take the regularity of the gof into account. In the presence of global motion, uniform 3D paths of regularity are defined in a gof, and these paths extend along the direction of motion. The situation gets more complicated when the motion is a mixture of local and global components.
In this case, subgroups of frames (subgofs) with different motion types result in multiple directions of regularity.

One way of modelling this regularity is to model the motion. The pixel correspondence information over multiple frames gives the directions of regularity of the gof. The motion-compensated (MC) wavelet coding algorithms use this approach. The choice of the motion model is an important factor in such algorithms, as its precision and compressibility directly affect the bit rate. In the recent literature, researchers have used dense motion fields modelled by Markov Random Fields [1] and deformable triangular meshes [2]. All these models, however, use only consecutive pairs of frames to compute the directions of regularity of the whole gof. Hence, the model overhead is a problem: when successive frame pairs have similar motions, the temporal redundancy in the motion model itself cannot be removed. Moreover, the MC wavelets reduce to the standard wavelets when there is no motion in the gof, which means that they cannot exploit the spatial regularity of the frames.

In this paper, we propose to model the spatiotemporal directions of regularity of a gof by a 3D vector field, called the structural flow. The structural flow can be modelled in different ways, depending on whether the regularity is spatial or spatiotemporal. Once the flow is computed, the wavelet basis can be warped along the directions of regularity to decompose the gof. The warped basis is then bandeletized, a technique first introduced by Mallat et al. [3], in order to take further advantage of the regularity. The overall compression requires partitioning the gof into subgofs whose regularities can be modelled as closely as possible by their respective structural flows. This is achieved by an octree segmentation of the gof that optimizes the reconstruction error and the bit rate of the gof.
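The claim that a gof is regular along its direction of motion can be illustrated numerically. The sketch below is not the bandelet construction of this paper: it assumes a synthetic gof with a constant, integer, circularly wrapping global translation, and it uses a single-level temporal Haar step as a stand-in for the wavelet decomposition. All function names and parameters are hypothetical and chosen for illustration only.

```python
import numpy as np

def make_translating_gof(n_frames=8, size=32, dx=2, dy=1, seed=0):
    """Synthetic gof: one random frame shifted by (dx, dy) pixels per frame.
    The circular shift is an assumption that keeps the example exact."""
    rng = np.random.default_rng(seed)
    base = rng.standard_normal((size, size))
    return np.stack([np.roll(base, shift=(t * dy, t * dx), axis=(0, 1))
                     for t in range(n_frames)])

def warp_along_flow(gof, dx, dy):
    """Undo the per-frame translation so that the paths of regularity
    (motion trajectories) become parallel to the temporal axis."""
    return np.stack([np.roll(frame, shift=(-t * dy, -t * dx), axis=(0, 1))
                     for t, frame in enumerate(gof)])

def haar_temporal_details(gof):
    """One-level Haar detail coefficients along the temporal axis (axis 0)."""
    even, odd = gof[0::2], gof[1::2]
    return (even - odd) / np.sqrt(2.0)

gof = make_translating_gof(dx=2, dy=1)

# Detail energy when decomposing along the plain temporal axis.
energy_axis = float(np.sum(haar_temporal_details(gof) ** 2))

# Detail energy when decomposing along the direction of motion.
warped = warp_along_flow(gof, dx=2, dy=1)
energy_flow = float(np.sum(haar_temporal_details(warped) ** 2))

print("detail energy along temporal axis:", energy_axis)
print("detail energy along motion paths: ", energy_flow)
```

In this idealized case the detail energy along the motion paths vanishes, because the translation is exact, while the detail energy along the temporal axis does not. Real sequences have non-integer, spatially varying motion, so the reduction would only be partial there.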
The rest of the paper is organized as follows. Section II explains the main steps of constructing a bandelet basis for a subgof: Section II-A goes into the details of the structural flow and presents some mathematical background, while Sections II-B and II-C describe how this flow can be used to construct a bandelet basis. Next, we discuss the optimal segmentation of a gof into subgofs in Section III. Finally, we demonstrate our results on standard video sequences in Section IV, and conclude with a discussion in Section V.

II. THE ORTHONORMAL 3D BANDELET BASIS

In wavelet video coding, the efficiency can be improved by analyzing the directions of regularity of the gof (F), which are represented by the structural flow. Unlike the standard wavelets, the orthonormal bandelets can benefit greatly from this direction information and can achieve higher compression rates. In this section, we explain the main steps of constructing a bandelet basis.

A. The Structural Flow

The structural flow, ζ(x, y, t), is a 3D vector field that gives the directions along which a subgof (F_i) varies regularly. De-