Robot Navigation by Panoramic Vision and Attention Guided Features

Alexandre Bur 1, Adriana Tapus 2, Nabil Ouerhani 1, Roland Siegwart 2 and Heinz Hügli 1
1 Institute of Microtechnology (IMT), University of Neuchâtel, Neuchâtel, Switzerland
2 Swiss Federal Institute of Technology, Lausanne (EPFL), Switzerland
{alexandre.bur,nabil.ouerhani,heinz.hugli}@unine.ch, adriana.tapus@ieee.org

Abstract

In visual-based robot navigation, panoramic vision emerges as a very attractive candidate for solving the localization task. Unfortunately, current systems rely on specific feature selection processes that do not cover the requirements of general purpose robots. In order to fulfill new requirements of robot versatility and robustness to environmental changes, we propose in this paper to perform the feature selection of a panoramic vision system by means of the saliency-based model of visual attention, a model known for its universality. The first part of the paper describes a localization system combining panoramic vision and visual attention. The second part presents a series of indoor localization experiments using panoramic vision and attention guided feature detection. The results show the feasibility of the approach and illustrate some of its capabilities.

1. Introduction

Vision is an interesting and attractive choice of sensory input in the context of robot navigation. Specifically, panoramic vision is becoming very popular because it provides a wide field of view in a single image, and the visual information obtained is independent of the robot's orientation. Many robot navigation methods based on panoramic vision have been developed in the literature. For instance, a model in [9] was designed to perform topological navigation and visual path-following. The method has been tested on a real robot equipped with an omnidirectional camera. Another model for robot navigation using panoramic vision is described in [1].
Vertex and line features are extracted from the omnidirectional image and tracked in order to determine the robot's position and orientation. In [8], the authors present an appearance-based system for topological localization. An omnidirectional camera was used, and the resulting images were classified in real time based on nearest-neighbor learning, image histogram matching and a simple voting scheme. Tapus et al. [7] have conceived a multi-modal, feature-based representation of the environment, called a fingerprint of a place, for localization and mapping. The multi-modal system is composed of an omnidirectional vision system and a 360-degree laser rangefinder.

In these systems, the feature selection process is usually quite specific. In order to fulfill the new requirements of versatility and robustness imposed on general purpose robots operating in widely varying environments, adaptive multi-modal feature detection is required. Inspired by human vision, the saliency-based model of visual attention [3] is able to automatically select the most salient features in different environments. In [5], the authors presented a feature-based