LOCALIZATION USING COMBINATIONS OF MODEL VIEWS

Ronen Basri
Dept. of Applied Math
The Weizmann Institute of Science
Rehovot, Israel 76100

Abstract

A method for localization, the act of recognizing the environment, is presented. The method is based on representing the scene as a set of 2D views and predicting the appearances of novel views by linear combinations of the model views. The method accurately approximates the appearance of scenes under weak perspective projection. Analysis of this projection as well as experimental results demonstrate that in many cases this approximation is sufficient to accurately describe the scene. When the weak perspective approximation is invalid, either a larger number of models can be acquired or an iterative solution to account for the perspective distortions can be employed.

The method has several advantages over other approaches. It uses relatively rich representations; the representations are 2D rather than 3D; and localization can be done from only a single 2D view.

1 Introduction

Basic tasks in autonomous robot navigation are localization and positioning. Localization is the act of recognizing the environment, that is, assigning consistent labels to different locations, and positioning is the act of computing the coordinates of the robot in the environment. Positioning is a task complementary to localization, in the sense that position (e.g., "1.5 meters northwest of table 7") is often specified in a place-specific coordinate system ("in room 911"). This paper addresses the problem of localization. Positioning is addressed in [6]. Unlike existing methods, which represent the environment using 3D models (e.g., [1, 2, 4]), our method, based on the linear combinations scheme of [7], represents scenes by sets of their 2D images. Localization is achieved by comparing the observed image to linear combinations of model views.
Ehud Rivlin
Center for Automation Research
University of Maryland
College Park, MD 20742-3411

The rest of the paper is organized as follows. The next section describes the method of localization using linear combinations of model views. The method assumes weak perspective projection. An iterative scheme to account for perspective distortions is presented in Section 3. An analysis of the error resulting from the projection assumption is presented in Section 4. Experimental results follow.

2 Localization

The problems of localization and object recognition are similar in many ways. Both problems require the matching of visual images to stored models, either of the environment or of the observed objects. Both problems face similar difficulties, such as varying illumination conditions and changes in appearance due to viewpoint changes. Similar methodologies therefore are often used to handle both problems.

The problem of localization is defined as follows: given P, a 2D image of a place, and M, a set of stored models, find a model M_i ∈ M such that P matches M_i.

A method for localization, based on the "Linear Combinations" (LC) scheme [7], is defined as follows. Given an image, we construct two view vectors from the feature points in the image (their x- and y-coordinates). The environment is modeled by a set of such views, where the points in these views are ordered in correspondence. The appearance of a novel view of the object is predicted by applying linear combinations to the stored views. The predicted appearance is then compared with the actual image, and the object is recognized if the two match.

Formally, given P, a 2D image of a scene, and M, a set of stored models, the objective is to find a model M_i ∈ M such that P = Σ_j a_j M_ij for some constants a_j ∈ R. More concretely, let p_i = (x_i, y_i, z_i),

0-8186-3870-2/93 $3.00 © 1993 IEEE
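The matching step of the LC scheme described above can be sketched in a few lines: stack the stored view vectors as columns of a matrix, solve a least-squares problem for the combination coefficients, and accept the match if the residual is small. This is a minimal illustrative sketch, not the authors' implementation; it assumes feature points have already been extracted and placed in correspondence, and the function name and tolerance are hypothetical.

```python
import numpy as np

def match_view(image_vec, model_views, rel_tol=1e-6):
    """Test whether an observed coordinate vector lies (approximately)
    in the span of the stored model view vectors.

    image_vec   -- length-n vector of image point coordinates
    model_views -- n-by-k matrix whose columns are model view vectors
    Returns (matched, coefficients, residual norm).
    """
    # Least-squares estimate of the combination coefficients a_j
    coeffs, *_ = np.linalg.lstsq(model_views, image_vec, rcond=None)
    # Residual between the predicted appearance and the actual image
    residual = np.linalg.norm(model_views @ coeffs - image_vec)
    matched = residual <= rel_tol * max(1.0, np.linalg.norm(image_vec))
    return matched, coeffs, residual
```

With this sketch, an observed view that truly is a linear combination of the stored views is accepted with a near-zero residual, while an unrelated vector of the same length is rejected.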