ARTICLE IN PRESS

Pattern Recognition ( ) –
www.elsevier.com/locate/patcog

Appearance-based recognition of 3-D objects by cluttered background and occlusions

Michael P. Reinhold ∗, Marcin Grzegorzek, Joachim Denzler, Heinrich Niemann

Chair for Pattern Recognition, University Erlangen-Nuremberg, Martensstr. 3, 91058 Erlangen, Germany

Received 30 September 2003; received in revised form 28 October 2004; accepted 28 October 2004

Abstract

In this article we present a new appearance-based approach for the classification and localization of 3-D objects in complex scenes. A main problem for object recognition is that the size and the appearance of the objects in the image vary under 3-D transformations. For this reason, we model the region of the object in the image as well as the object features themselves as functions of these transformations. We integrate the model into a statistical framework, which allows us to deal with noise and illumination changes. To handle heterogeneous background and occlusions, we introduce a background model and an assignment function. Thus, the object recognition system becomes robust, and it can reliably distinguish which features belong to the object and which to the background. Experiments on three large data sets, which contain rotations orthogonal to the image plane and scaling and comprise more than 100 000 images in total, show that the approach is well suited for this task.

© 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

Keywords: Object recognition; Appearance-based; Object representation; Statistical modelling; Background model; 3-D transformation of objects

1. Introduction

Many tasks require the recognition of objects in images, for example visual inspection or the automatic detection of objects. In most cases, both the class and the pose of the object have to be estimated.
One main aspect in object recognition is that the appearance as well as the size of the objects vary under 3-D transformations, i.e. scaling or rotations orthogonal to the image plane. An example is shown in Fig. 1. Therefore, the appearance of the objects has to be stored in a suitable way for the different possible viewpoints. In particular, the large amount of data has to be reduced.

Furthermore, for real recognition tasks one has to deal with the following problems: the illumination often changes, and the objects are situated in heterogeneous background and are partially occluded. A robust object recognition system has to handle these disturbances and still guarantee reliable recognition.

∗ Corresponding author. Tel.: +49 9131 85 27775; fax: +49 89 4129 13055.
E-mail addresses: michael.p.reinhold@web.de (M.P. Reinhold), marcin.grzegorzek@informatik.uni-erlangen.de (M. Grzegorzek), joachim.denzler@uni-jena.de (J. Denzler), niemann@informatik.uni-erlangen.de (H. Niemann).
0031-3203/$30.00 © 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
doi:10.1016/j.patcog.2004.10.008

1.1. Related work

There are two main approaches to object recognition. The first applies a segmentation process and uses geometric features such as lines or vertices, e.g. Refs. [1–6]. However, these methods suffer from segmentation errors and have difficulty with objects that lack distinct edges. Therefore, many authors, e.g. Refs. [7–14], prefer the second method, the appearance-based approach. Here, the features are calculated directly from the pixel intensities without a prior segmentation process.