Evaluation of Motion Segmentation Quality for Aircraft Activity Surveillance

Josep Aguilera, Horst Wildenauer, Martin Kampel
Pattern Recognition and Image Processing Group
Vienna University of Technology
Favoritenstr. 9, 183-2, A-1040 Vienna, Austria
{agu, wilde, kampel}@prip.tuwien.ac.at

Mark Borg, David Thirde, James Ferryman
Computational Vision Group
The University of Reading
Whiteknights, Reading, RG6 6AY, UK
{m.borg, d.j.thirde, j.m.ferryman}@reading.ac.uk

Abstract

Recent interest has been shown in the performance evaluation of visual surveillance systems. The main purpose of performance evaluation of computer vision systems is statistical testing and tuning in order to improve the algorithms' reliability and robustness. In this paper we investigate the use of empirical discrepancy metrics for the quantitative analysis of motion segmentation algorithms. We are concerned with the case of visual surveillance on an airport's apron, that is, the area where aircraft are parked and serviced by specialized ground support vehicles. Robust detection of individuals and vehicles is of major concern for the purpose of tracking objects and understanding the scene. In this paper, different discrepancy metrics for motion segmentation evaluation are illustrated and used to assess the performance of three motion segmentors on video sequences of an airport's apron.

1 Introduction

Over the last decade, increasing interest in the field of visual surveillance has led to the design of a plethora of systems for the automated visual tracking of moving objects. In many of these systems, the detection and segmentation of moving objects represent the first step on which subsequent processing stages heavily depend. Deterioration of segmentation quality can have a severe impact on the performance of a surveillance system; thus, the ability to effectively segment moving objects under a wide range of disturbing conditions is a critical requirement.
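To make the notion of an empirical discrepancy metric concrete, the following is a minimal sketch of one such measure: pixel-level false positive and false negative rates of a binary motion mask against a ground-truth mask. The function name and the specific rates are illustrative assumptions for exposition, not the metrics defined later in this paper.

```python
def discrepancy_rates(ground_truth, segmentation):
    """Return (false_positive_rate, false_negative_rate) for two binary masks.

    Both masks are 2-D lists of 0/1 values of equal size.
    FP rate: background pixels wrongly labelled as moving, over all background pixels.
    FN rate: moving pixels that were missed, over all ground-truth moving pixels.
    """
    fp = fn = gt_fg = gt_bg = 0
    for gt_row, seg_row in zip(ground_truth, segmentation):
        for gt, seg in zip(gt_row, seg_row):
            if gt:
                gt_fg += 1
                if not seg:
                    fn += 1  # moving pixel missed by the segmentor
            else:
                gt_bg += 1
                if seg:
                    fp += 1  # background pixel falsely detected
    fp_rate = fp / gt_bg if gt_bg else 0.0
    fn_rate = fn / gt_fg if gt_fg else 0.0
    return fp_rate, fn_rate


# Toy 3x3 example: one missed foreground pixel, one false detection.
gt  = [[0, 1, 1],
       [0, 1, 0],
       [0, 0, 0]]
seg = [[0, 1, 0],
       [0, 1, 1],
       [0, 0, 0]]
fp_rate, fn_rate = discrepancy_rates(gt, seg)
```

In this toy case the segmentor misses one of three ground-truth foreground pixels (FN rate 1/3) and falsely detects one of six background pixels (FP rate 1/6); the metrics evaluated in this paper refine this basic idea.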
Although considerable effort has been spent on the development of robust motion segmentation algorithms, no comparable attention has been given to their evaluation. As a consequence, there is a rising demand for the quantitative evaluation of segmentation quality in order to assess the reliability of existing approaches and to facilitate their comparability [7]. This is especially true for outdoor surveillance, where illumination changes, weather conditions, shadows, and occlusions strongly impact segmentation quality.

Empirical methods developed for the assessment of motion segmentation quality can be characterised by their basis of evaluation: (1) goodness methods, which operate without reference segmentations (ground truth), and (2) discrepancy methods, which are based on the use of ground truth.

Recently, a set of performance metrics for motion segmentation evaluation have been proposed for the case where ground truth is not available. In [5], Correia and Pereira present a methodology based on the idea of measuring intra-object homogeneity features and inter-object disparity features. Erdem et al. [9] propose two error metrics based on colour information and motion features.

Ellis [7] proposes error metrics based on correct and false matches between ground truth and observations. Erdem et al. [8] suggest the use of spatio-temporal segmentation measures for object-based motion segmentation. Both an evaluation methodology and metrics for video segmentation quality analysis, separating individual object evaluation from overall evaluation, have been introduced by Correia and Pereira [4]. Perceptually-weighted criteria, which take into account visually desirable properties of reference segmentations, have been designed by Villegas and Marichal [15] and Cavallaro et al. [3].

To facilitate and accelerate the creation of ground truth, semi-automatic frameworks such as ViPER [6] and ODViS [11] have been designed. In [13] Schlögl et al.
present a fully-automatic evaluation framework, which introduces ground truth by the use of synthetic objects.

In this paper, we investigate the use of discrepancy metrics for the quantitative analysis of the spatial accuracy of segmentations provided by motion segmentors. Specifically, we are concerned with the case of visual surveillance on an airport's apron as addressed by the European AVITRACK project [1]; that is, the robust detection of individuals and vehicles for the purpose of tracking and categorising objects in the scene. We evaluate three motion segmentation algorithms on sequences of an airport's apron across a range of conditions. The conditions studied are: varying weather and illumination conditions, different camera viewpoints, and different scene complexity. The motion segmentors used in