Model based selection and classification of local features for recognition using Gabor filters ⋆

Plinio Moreno, Alexandre Bernardino, and José Santos-Victor
{plinio, alex, jasv}@isr.ist.utl.pt
Instituto Superior Técnico & Instituto de Sistemas e Robótica
1049-001 Lisboa - Portugal

Abstract. We propose models based on Gabor functions to address two related aspects of the object recognition problem: interest point selection and classification. We formulate interest point selection as a cascade of bottom-up and top-down stages. We define a novel type of top-down saliency operator that incorporates low-level object-related knowledge early in the recognition process, thus reducing the number of candidates. For classification, we represent each interest point by a vector of Gabor responses whose parameters are selected automatically. Both the selection and the classification procedures are designed to be invariant to rotation and scaling. We apply the approach to the problem of facial landmark classification and present experimental results illustrating the performance of the proposed techniques.

1 Introduction

The object recognition problem has recently been tackled using the concept of low-level features, with several successful results [1–4]. All of these works exploit the idea of selecting various points in the object and building a local neighborhood representation for each selected point. In this work we introduce models built with Gabor functions to address the following issues: (selection) which points are important to represent the object, and (classification) how to represent and match the information contained in each point's neighborhood.

The point selection problem, also called keypoint detection [1, 5], interest point detection [3], bottom-up saliency [6], and salient region detection [7], has been addressed in a bottom-up fashion.
Bottom-up means that the selected points are image-dependent, not task-dependent. Salient points are selected to be distinguishable from their neighbors and to have good properties for matching, repeatability, and/or invariance to common image deformations. However, there is evidence of interaction between bottom-up and top-down processes in nearly every visual search model of the human visual system [8]. In guided visual search problems, where specific objects are searched for in the scene, it is convenient to incorporate object-related knowledge (top-down information) as early as possible in the recognition process, to reduce the number of possible candidates.

⋆ Research partly funded by the FCT Programa Operacional Sociedade de Informação (POSI) in the frame of QCA III, and Portuguese Foundation for Science and Technology PhD Grant FCT SFRH\BD\10573\2002
ICIAR 2006 - Intl. Conf. on Image Analysis and Recognition, Póvoa do Varzim, Portugal, Sept. 2006.

The