Multi-GPU based Event Detection and Localization using High Definition Videos

Sidi Ahmed Mahmoudi and Pierre Manneback
University of Mons, Faculty of Engineering, Computer Science Department
Place du Parc, 20, 7000 Mons, Belgium
Email: {Sidi.Mahmoudi, Pierre.Manneback}@umons.ac.be

Abstract: Video processing algorithms are widely used in computer vision applications such as motion tracking, human behavior understanding, and event detection and localization. Nevertheless, with the new high definition video standards (HD: 1280×720, or Full HD: 1920×1080), current implementations, even running on modern hardware, cannot meet the requirements of real-time processing. To overcome this constraint, many applications have been developed that exploit the high power of graphics processing units (GPUs). However, none of them is able to process high definition videos efficiently. In this work, we propose an effective exploitation of single and multiple GPUs in order to achieve real-time detection and localization of abnormal events in HD and Full HD videos. The proposed approach detects portions of video that correspond to sudden changes and variations of motion. It also identifies areas in video frames where motion behavior is surprising compared to the rest of the motion in the same frame. Experiments have been conducted on several videos, showing efficient detection and localization of abnormal events in multi-user scenarios. The use of multiple GPUs enabled real-time processing of high definition videos with a global speedup ranging from 5 to 35 compared with CPU implementations.

Key words: Event detection and localization, GPU, optical flow computation, motion tracking.

I. INTRODUCTION

Video surveillance is a very active research topic in computer vision. Recently, the increasing concern about public safety and law enforcement has generated a high growth in the number of surveillance cameras.
Therefore, automatic techniques which process, analyze and describe human behaviors and activities are increasingly required. In this context, event detection and localization methods are a fundamental processing step in the majority of visual surveillance algorithms. They make use of techniques able to analyze human behaviors and activities, which make it possible to identify, in real time, sudden changes in motion that are then interpreted as abnormal events.

The optical flow method, initially described in 1950 by J.J. Gibson [3], is one of the most commonly used techniques for event detection and localization. It was followed by several motion estimation techniques such as Horn & Schunck [5] or Lucas & Kanade [9], with the latter being regarded as more robust to noise and capable of tracking even small motions. However, this high accuracy is achieved at the expense of high computational complexity. Moreover, modern surveillance systems are increasingly equipped with high definition cameras whose streams, despite the increased computational burden, are still expected to be handled in real time. As a result, a need arose for high performance implementations of motion estimation algorithms.

Recently, considerable interest has been given to new computational architectures, such as GPUs, that have turned out to be very efficient in various fields of science, and particularly for image and video processing [7], [10]. Yet, even though several approaches to the problem of motion detection and tracking have been proposed lately, including those taking advantage of GPUs [13], [14], they are either unable to handle high definition video streams or are limited to a single GPU and thus do not scale up well. Therefore, we propose GPU and Multi-GPU implementations of background extraction, silhouette detection and optical flow estimation algorithms, which are exploited within our proposed approaches to event detection and localization.
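To give a rough sense of the computational burden mentioned above, the following minimal sketch computes the number of pixels per frame for the two resolutions quoted in the abstract, together with the time available per frame. The frame rate of 25 frames per second and all function names are assumptions made for illustration only; they are not taken from the paper.

```python
# Resolutions quoted in the abstract (width, height).
RESOLUTIONS = {"HD": (1280, 720), "Full HD": (1920, 1080)}

# ASSUMPTION: 25 frames per second is used here only as an illustrative
# real-time target; the paper does not state this value in this section.
ASSUMED_FPS = 25


def pixels_per_frame(width, height):
    """Number of pixels in a single frame."""
    return width * height


def frame_budget_ms(fps):
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / fps


def pixels_per_second(width, height, fps):
    """Number of pixels that must be processed each second."""
    return pixels_per_frame(width, height) * fps


if __name__ == "__main__":
    budget = frame_budget_ms(ASSUMED_FPS)
    for name, (w, h) in RESOLUTIONS.items():
        print(
            f"{name}: {pixels_per_frame(w, h)} pixels/frame, "
            f"{pixels_per_second(w, h, ASSUMED_FPS)} pixels/s, "
            f"{budget:.1f} ms per frame"
        )
```

Under this assumed frame rate, every processing stage applied to a frame (background extraction, silhouette detection and optical flow estimation in the proposed approach) has to fit within the same per-frame time budget, while a Full HD frame holds 2.25 times as many pixels as an HD frame.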
The remainder of the paper is organized as follows: related works are discussed in the second section. Section 3 presents the proposed approach for both event detection and localization, while Section 4 describes our GPU and Multi-GPU implementations of these methods. Experimental results are given in the fifth section, showing the performance of the CPU, GPU and Multi-GPU implementations. Finally, conclusions and future works may be found in Section 6.

II. RELATED WORK

Generally, event detection and localization methods consist of modeling normal behaviors and then estimating the difference between the normal behavior model and the observed behaviors. These variations can be labeled as emergency events, and the deviations from examples of normal behavior are used to characterize abnormality. In this category, the authors of [1] extract HOG descriptors in order to use predefined models (i.e. crowd scenarios) to recognize crowd events. The authors of [6] propose a context-aware method that detects anomalies by tracking all moving objects in the video. The work in [20] addresses the problem of analyzing video events in crowded scenes: a novel manifold learning method was developed to achieve an effective modeling of video events in a low dimensional space.

On the other hand, one can find several GPU implementations related to the domain of motion tracking, which is very useful for event detection and localization methods. In case