Local Binary Pattern Descriptors for Dynamic Texture Recognition

Guoying Zhao and Matti Pietikäinen
Machine Vision Group, Infotech Oulu and Department of Electrical and Information Engineering, P. O. Box 4500, FI-90014 University of Oulu, Finland
E-mail: {gyzhao, mkp}@ee.oulu.fi

Abstract

Dynamic texture is an extension of texture to the temporal domain. In this paper, a new method for recognizing dynamic textures is proposed. The textures are modeled with concatenated local binary patterns in three orthogonal planes. The circular neighborhoods are generalized to elliptical sampling to fit the space-time statistics. This is an extension of the LBP approach widely used in still texture analysis, combining motion and appearance. Our approach has many advantages over earlier approaches and provides better performance on the DynTex and MIT databases.

1. Introduction

Dynamic textures, or temporal textures, are textures with motion [1]. Dynamic textures (DT) encompass the class of video sequences that exhibit some stationary properties in time [2]. Many dynamic textures occur in the real world, including sea waves, smoke, foliage, fire, shower and whirlwind. Description and recognition of DT is needed, for example, in video retrieval systems, which have attracted growing attention. Because of their unknown spatial and temporal extent, the recognition of DT is a challenging problem compared with the static case [3]. Polana and Nelson classify visual motion into activities, motion events and dynamic textures [4]. Recently, a brief survey of DT description and recognition was given by Chetverikov and Péteri [5].

Methods based on optic flow [3,4,6,7] are currently the most popular ones [5], because optic flow estimation is a computationally efficient and natural way to characterize the local dynamics of a temporal texture.
Péteri and Chetverikov [3] proposed a method that combines normal flow features with periodicity features, in an attempt to explicitly characterize motion magnitude, directionality and periodicity. Lu et al. presented a new method using spatio-temporal multi-resolution histograms based on velocity and acceleration fields [7]. Fazekas and Chetverikov compared normal flow features and regularized complete flow features in dynamic texture classification [8]. They conclude that normal flow contains information on both dynamics and shape.

Saisan et al. [9] applied a dynamic texture model [1] to the recognition of 50 different temporal textures. Despite this success, their method assumes stationary DTs that are well segmented in space and time, and the accuracy drops drastically if they are not. Fujita and Nayar [10] modified the approach of [9] by using impulse responses of state variables to identify the model and texture.

Fablet and Bouthemy introduced temporal co-occurrence [6], which measures the probability that two normal velocities (normal flow magnitudes) separated by certain temporal intervals co-occur at the same image location. Recently, Smith et al. dealt with video texture indexing using spatiotemporal wavelets [11]. Otsuka et al. [12] assume that DTs can be represented by moving contours whose motion trajectories can be tracked. Zhong and Sclaroff [13] modified [12] and used 3D edges in the spatiotemporal domain.

The key problem of dynamic texture recognition is how to combine motion features with appearance features. To address this, we recently proposed a volume LBP method (VLBP) [15]. However, as the number of neighboring points increases, the number of patterns for basic VLBP becomes very large. This rapid growth makes it difficult to extend VLBP to a large number of neighboring points, which limits its applicability.
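The growth in pattern count can be made concrete with a small calculation. The sketch below assumes that basic VLBP thresholds 3P + 2 neighbors per voxel (P in each of three frames plus the two center pixels of the adjacent frames), and that a three-plane descriptor concatenates three histograms of 2^P bins each. Both counts are stated here as assumptions rather than taken from this section, and the function names are illustrative.

```python
def vlbp_pattern_count(p: int) -> int:
    """Number of distinct basic VLBP codes for P neighbors per frame.

    Assumption: 3P + 2 thresholded bits per voxel.
    """
    return 2 ** (3 * p + 2)


def three_plane_pattern_count(p: int) -> int:
    """Feature length when three P-neighbor LBP histograms are concatenated.

    Assumption: one 2**P-bin histogram per plane.
    """
    return 3 * 2 ** p


# Compare the two feature lengths for a few neighborhood sizes.
for p in (2, 4, 8):
    print(p, vlbp_pattern_count(p), three_plane_pattern_count(p))
```

Under these assumptions, P = 8 already gives 2^26 volume patterns against 768 bins for the concatenated histograms, which illustrates why large neighborhoods are impractical for the volume formulation.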
In this paper, we propose a novel, theoretically and computationally simple approach in which dynamic textures are modeled using local binary patterns in three orthogonal planes within a volume.

0-7695-2521-0/06/$20.00 © 2006 IEEE
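The three-plane idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' implementation: it assumes a gray-level volume indexed as (T, Y, X), uses the 8 nearest neighbors at radius 1 without interpolation or elliptical sampling, and processes every slice of each orientation, including boundary slices. The names `lbp_codes`, `plane_histogram` and `lbp_top` are hypothetical.

```python
import numpy as np

# Offsets (row, column) of the 8 nearest neighbors; radius 1, no interpolation.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(plane: np.ndarray) -> np.ndarray:
    """Return the 8-bit LBP code of every interior pixel of a 2D array."""
    h, w = plane.shape
    center = plane[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(OFFSETS):
        neighbor = plane[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit where the neighbor is at least as bright as the center.
        codes |= (neighbor >= center).astype(np.int64) << bit
    return codes


def plane_histogram(volume: np.ndarray, axes: tuple) -> np.ndarray:
    """Normalized LBP histogram over all 2D slices spanned by the two axes."""
    # Move the in-plane axes to the end; the remaining axis indexes the slices.
    slices = np.moveaxis(volume, axes, (-2, -1))
    hist = np.zeros(256, dtype=np.float64)
    for plane in slices:
        hist += np.bincount(lbp_codes(plane).ravel(), minlength=256)
    return hist / hist.sum()


def lbp_top(volume: np.ndarray) -> np.ndarray:
    """Concatenate the XY, XT and YT histograms of a (T, Y, X) volume."""
    return np.concatenate([
        plane_histogram(volume, (1, 2)),  # XY planes: spatial appearance
        plane_histogram(volume, (0, 2)),  # XT planes: space-time transitions along rows
        plane_histogram(volume, (0, 1)),  # YT planes: space-time transitions along columns
    ])


# Usage on a synthetic volume of 10 frames of 16 x 16 pixels.
rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(10, 16, 16))
feature = lbp_top(video)
print(feature.shape)
```

Each plane contributes its own normalized 256-bin histogram, so the concatenated feature has 768 entries and carries appearance (XY) and motion (XT, YT) statistics side by side.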