Fast Change Detection for Camera-based Surveillance Systems

Matthias Michael¹, Christian Feist², Florian Schuller², Marc Tschentscher¹

Abstract: Many parking garages and open parking spaces are already equipped with surveillance cameras to increase the security of pedestrians or to record potentially illegal actions. An additional use case for such multi-camera surveillance systems is the automatic extraction of 3D positions of objects and pedestrians. The safety of autonomous vehicles could benefit from this information in cases where the on-board sensors are unable to detect potentially dangerous situations due to occlusion. Since the cameras are installed statically, change detection is often employed as the first operation, and all subsequent processing steps rely on its quality. Different scenarios impose specific challenges on the respective algorithms. In this paper we present an efficient algorithm for change detection which is tailored to the difficulties arising in an indoor surveillance scenario, and we demonstrate its applicability by adapting an existing pipeline and improving its overall performance.

I. INTRODUCTION

Modern surveillance systems have the task of inferring information about the environment they are monitoring. This information can range from the location of movable objects and persons in the environment up to more complex statements such as the detection of potentially dangerous situations, e.g., in the context of autonomously driving vehicles. Regardless of the system's ultimate purpose, a multi-stage processing pipeline is often necessary. The first step of such a pipeline often consists of identifying moving objects and areas in the camera images.
Since surveillance cameras are usually installed permanently and statically, change detection methods (also known as background/foreground segmentation or background subtraction) can be used for this task without additional knowledge about the objects in the scene.

The basic idea of change detection is to build a representation of the static elements in a scene. New camera images can then be compared to this representation, which allows the detection of changes with respect to the static model. Areas that display changes are then denoted as foreground. This concept is visualized in Fig. 1.

Since only those areas that are identified as foreground are considered for further processing, change detection is a crucial step, and the performance of the entire system depends on its quality. If the algorithm fails to identify moving areas, the rest of the pipeline might be missing decisive information. On the other hand, if too many pixels are erroneously segmented as foreground, the rest of the system receives irrelevant information as input. It would then need to recognize that these areas should not be considered when generating statements about the scene, which in turn might slow processing down significantly, simply because larger areas of the image have to be processed. Overall, a well-functioning change detection allows for a simpler and more streamlined design of the pipeline.

¹ University of Bochum, Institute for Neural Computation, {firstname.lastname}@ini.rub.de
² AUDI AG, Ingolstadt, Germany, {firstname.lastname}@audi.de

Even though the identification of moving areas is a common task in computer vision and many algorithms attempt to solve it, no definitive state of the art has been established. This is most likely due to the various problems, such as moving background, moving camera, and shadows, which differ in each specific scenario.
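As an illustration of the basic idea described above (build a model of the static scene, then compare new frames against it), the following minimal Python sketch may help. It is not the algorithm presented in this paper. The running-average background model, the fixed threshold of 25 grey levels, the nested-list image format, and all function names are assumptions chosen purely for illustration.

```python
# Minimal sketch of change detection against a static background model.
# Assumptions (not from the paper): grayscale frames as nested lists,
# a running-average background, and a fixed difference threshold.

def update_background(background, frame, rate=0.05):
    """Blend the new frame into the background model (running average)."""
    return [
        [(1.0 - rate) * b + rate * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=25.0):
    """Mark a pixel as foreground (1) if it differs strongly from the model."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def count_errors(mask, ground_truth):
    """Return (missed foreground pixels, erroneously segmented pixels)."""
    pairs = [
        (m, g)
        for m_row, g_row in zip(mask, ground_truth)
        for m, g in zip(m_row, g_row)
    ]
    missed = sum(1 for m, g in pairs if g == 1 and m == 0)
    spurious = sum(1 for m, g in pairs if g == 0 and m == 1)
    return missed, spurious


# Static scene: uniform grey background. The new frame contains one bright
# object (two pixels) and one faint change that stays below the threshold.
background = [[100.0] * 4 for _ in range(3)]
frame = [
    [100.0, 100.0, 100.0, 100.0],
    [100.0, 200.0, 200.0, 100.0],
    [100.0, 100.0, 110.0, 100.0],
]
mask = foreground_mask(background, frame)
```

The two error counts returned by `count_errors` correspond to the two failure modes discussed above: moving areas that are missed, and pixels that are erroneously segmented as foreground.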
It is very difficult for a single algorithm to address all possible aspects at the same time while maintaining a reasonable processing speed, especially when real-time requirements are present. Therefore, a promising approach is to identify the challenges specific to the scenario at hand and to choose or design a specialized algorithm.

In this paper we present a change detection algorithm that is tuned for indoor surveillance systems in an urban environment with a non-moving camera. In Sec. II the specific challenges of this scenario are identified and the performance of existing algorithms is examined. Section III describes the concept of the algorithm, and Sec. IV investigates its performance on a popular benchmark. Besides standard performance measures like precision and recall, it is also important to evaluate performance with respect to an existing surveillance system. Therefore, we adapt the system presented in [2], a multi-camera surveillance system designed for localizing arbitrary objects in an indoor parking garage. We substitute its change detection step with our method and adjust a few other details. The entire pipeline as well as our changes are described in Sec. V, and a conclusion is given in Sec. VI.

Fig. 1. Left: Single frame of a video sequence. Center: Representation of the static elements of the scene, here displayed by an image of the scene without moving objects. Right: Ground-truth mask of moving areas. White pixels denote movement while black pixels belong to the non-moving background. (Images taken from [1]; Dataset Baseline – Highway.)

II. RELATED WORK

There are a few general approaches that can be seen as standard because they are relatively simple to implement and produce acceptable results in a short amount