WAVELET DOMAIN PROCESSING FOR TRAJECTORY EXTRACTION

Alexia Briassouli, Dimitra Matsiki, Ioannis Kompatsiaris
Informatics and Telematics Institute, Centre for Research and Technology Hellas

ABSTRACT

This work presents a novel approach to the extraction of trajectories in video. It has the advantage of simultaneously processing all spatiotemporal information, i.e. all the video frames, thus overcoming the disadvantages of local approaches. A projection of the video frames over time is used to create a frequency modulated signal, which is then processed with the Continuous Wavelet Transform (CWT). The CWT's power is concentrated around the prominent time-varying frequencies, which are proportional to the object trajectory. Consequently, it can extract the time-varying motion trajectories, which can be used in future extensions to fully characterize the type of motion taking place. This approach is robust to local measurement noise and occlusions, since it processes the available data in a global, integrated manner. Experiments with synthetic and real sequences demonstrate the capabilities of this approach.

Index Terms— motion estimation, wavelets, video processing

1. INTRODUCTION

Numerous methods have been developed for the problem of motion estimation and trajectory extraction, each with its own advantages and disadvantages. Local methods find displacements between pairs of frames based on the flow equation [1], [2], [3], on feature matching, or on block matching. They rely on the constant illumination assumption and are spatiotemporally local, so they are sensitive to spatially and temporally local measurement noise and occlusions. These problems have been addressed by robust flow estimation techniques, which essentially eliminate outlier flow values [4], [5]. Global processing of the data has also been used to address these limitations [6], [7].
Constant motions form energy planes in the 3D spatiotemporal spectrum [8], which are fitted to parametric models for the extraction of the related motion features. Global methods have the inherent advantage of addressing the problem of motion estimation similarly to the way the human visual system functions, according to neurophysiological evidence [9]. Processing the entire video provides more accurate motion estimates than pairwise motion estimation [10], since all the available data is used at once. However, most of the current literature on frequency-based motion estimation assumes that the inter-frame displacement in the video is constant [8], or handles time-varying motions by processing pairs of frames instead of the entire video [7].

We present a method that extracts time-varying trajectories from a video by applying the wavelet transform to all video frames, without limiting the motion to be piecewise constant. This has several advantages:

(1) All video frames are used at once, so the proposed method is robust to local noise, both in space and time. The occlusion of a moving object over a few frames, for example, will only introduce a small gap in the extracted trajectory, while most of the motion information will still be extracted. Illumination variations between successive frames also produce local errors, which are overcome by the use of all video frames.

(2) As opposed to existing methods based on spatiotemporal filtering [6], no prior knowledge is required about the motions taking place, nor is noise-sensitive and computationally intensive filtering needed.

(3) Fast algorithms exist for computing the wavelet transform (e.g. Fast Fourier Transform (FFT) based methods), which lower its computational cost [11].

(4) The wavelet transform also provides a visualization of its coefficients' magnitudes, allowing users to observe when and which frequencies are stimulated, as well as their duration, time evolution, and density.
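To make advantage (4) concrete, the sketch below computes a CWT scalogram of a linear chirp, a signal whose instantaneous frequency rises over time, and reads off the ridge of maximum coefficient magnitude, which tracks the time-varying frequency. This is a minimal illustration of the general principle, not the paper's implementation; the Morlet wavelet, its parameters, and the direct-convolution computation (rather than the FFT-based methods of [11]) are our own assumptions for the demo.

```python
import numpy as np

def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (small DC-correction term omitted)."""
    return np.pi ** -0.25 * np.exp(1j * w0 * t) * np.exp(-t ** 2 / 2)

def cwt(signal, scales, dt=1.0, w0=6.0):
    """CWT by direct convolution with scaled Morlet wavelets.

    For the Morlet above, conj(psi(-t)) == psi(t), so the CWT integral
    reduces to a plain convolution with psi(t / s) / sqrt(s).
    """
    n = len(signal)
    coeffs = np.zeros((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        # Support of +-5 Gaussian widths, capped so the kernel fits the signal.
        m = int(min(5 * s / dt, (n - 1) // 2))
        t = np.arange(-m, m + 1) * dt
        psi = morlet(t / s, w0) / np.sqrt(s)
        coeffs[i] = np.convolve(signal, psi, mode="same") * dt
    return coeffs

# Linear chirp: instantaneous frequency rises from 0.01 to 0.05 cycles/sample.
n = 1024
t = np.arange(n)
f0, f1 = 0.01, 0.05
x = np.cos(2 * np.pi * (f0 * t + (f1 - f0) * t ** 2 / (2 * n)))

# Scales chosen so the Morlet center frequency w0/(2*pi*s) spans the chirp band.
w0 = 6.0
freqs = np.linspace(0.005, 0.08, 60)
scales = w0 / (2 * np.pi * freqs)
C = cwt(x, scales, w0=w0)

# Ridge of maximum |CWT| per time instant: an estimate of the instantaneous frequency.
ridge = freqs[np.argmax(np.abs(C), axis=0)]
```

Plotting `np.abs(C)` as an image (scales vs. time) shows the scalogram the paper refers to: a single bright ridge drifting toward smaller scales as the chirp's frequency rises, with the ridge height indicating how strongly each frequency is stimulated at each instant.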
In our proposed method, a frequency modulated (FM) signal is formed from projections of the video frames in the horizontal and vertical directions. By its construction with a method called "μ-propagation", the frequency of this signal varies in time proportionally to the object trajectory. The wavelet transform is then applied to the FM signal for the extraction of its time-varying frequency and, as a consequence, the time-varying trajectory.

The paper is organized as follows. Sec. 2 presents the basic principles of the CWT. Sec. 3 describes the algorithm used to construct the FM signal from which the trajectories will be extracted. The choice of the μ-parameter is discussed in Sec. 3.1. Experimental results with synthetic and real video sequences are presented in Sec. 4, and conclusions are drawn in Sec. 5.
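The idea of encoding a spatial trajectory as the phase of a temporal signal can be sketched on a synthetic example. The exact μ-propagation construction is given in Sec. 3; the sketch below assumes one plausible form, modulating each frame's horizontal projection by a complex exponential exp(jμx), so that the resulting signal's phase is proportional to μ times the object position. The moving-blob video, the value of μ, and the phase-unwrapping readout are all assumptions made for this illustration, not the paper's algorithm.

```python
import numpy as np

def moving_blob_projections(n_frames=200, width=256, sigma=4.0):
    """Hypothetical synthetic video, reduced to its horizontal projections:
    a Gaussian blob following a sinusoidal x-trajectory."""
    x = np.arange(width)
    t = np.arange(n_frames)
    traj = 128 + 40 * np.sin(2 * np.pi * t / n_frames)   # true trajectory d(t)
    proj = np.exp(-(x[None, :] - traj[:, None]) ** 2 / (2 * sigma ** 2))
    return proj, traj

mu = 0.05                                  # modulation parameter (assumed; cf. Sec. 3.1)
proj, traj = moving_blob_projections()
x = np.arange(proj.shape[1])

# One complex sample per frame: s(t) = sum_x p_t(x) * exp(j*mu*x).
# For a rigidly translating pattern, p_t(x) ~ p_0(x - d(t)), so the
# phase of s(t) is approximately mu * d(t) plus a constant.
s = proj @ np.exp(1j * mu * x)

# Recover the trajectory (up to a constant offset) from the unwrapped phase.
est = np.unwrap(np.angle(s)) / mu
est += traj[0] - est[0]                    # fix the unknown constant offset
```

In this noiseless toy case the phase readout alone recovers the trajectory; the paper instead applies the CWT to such an FM signal, which is what makes the extraction robust to local noise and short occlusions, since the time-frequency ridge is estimated from all frames jointly rather than from frame-to-frame phase differences.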