Automatic Selection and Detection of Visual Landmarks Using Multiple Segmentations

Daniel Langdon, Alvaro Soto, and Domingo Mery

Pontificia Universidad Catolica de Chile
Santiago 22, Chile
dlangdon@puc.cl, {asoto, dmery}@ing.puc.cl

Abstract. Detection of visual landmarks is an important problem in the development of automated, vision-based agents working in unstructured environments. In this paper, we present an unsupervised approach to select and detect landmarks in images coming from a video stream. Our approach integrates three main visual mechanisms: attention, area segmentation, and landmark characterization. In particular, we demonstrate that an incorrect segmentation of a landmark produces severe problems in the subsequent steps of the analysis, and that by using multiple segmentation algorithms we can greatly increase the robustness of the system. We test our approach, with encouraging results, on two image sets taken in real-world scenarios. We obtained a significant 52% increase in recognition when using the multiple segmentation approach compared with using single segmentation algorithms.

1 Introduction

Vision is an attractive option to provide an intelligent agent with the type of perceptual capabilities that it needs to deal with the complexity of an unstructured natural environment. The robustness and flexibility exhibited by most seeing beings are clear proof of the advantages of an appropriate visual system. In particular, the selection and detection of relevant visual landmarks is a highly valuable perceptual capability. In effect, the ability to select relevant visual patches from an input image such that they can be detected in subsequent images is a useful tool for a wide variety of applications, such as video editing, place and object recognition, or mapping and localization by a mobile agent.
In this paper, we present an unsupervised method for the automatic selection and subsequent detection of suitable visual landmarks in images coming from a video stream. To achieve this goal, we frame the problem of landmark detection as a purely bottom-up process based on low-level visual features such as shape, color, or spatial continuity. The goal is to select interesting, meaningful, and useful landmarks. We base our approach on the integration of three main mechanisms: visual attention, area segmentation, and landmark characterization. Visual attention provides the selection mechanism to focus the processing on the most salient parts of the input image. This eliminates the detection of irrelevant landmarks

L.-W. Chang, W.-N. Lie, and R. Chiang (Eds.): PSIVT 2006, LNCS 4319, pp. 601–610, 2006. © Springer-Verlag Berlin Heidelberg 2006