3D MODELING OF INDOOR ENVIRONMENTS BY A MOBILE PLATFORM WITH A LASER SCANNER AND PANORAMIC CAMERA

Peter Biber†, Sven Fleck†, Florian Busch†, Michael Wand†, Tom Duckett‡, Wolfgang Strasser†
email: {biber,fleck,busch,wand,strasser}@gris.uni-tuebingen.de, tom.duckett@tech.oru.se
† Wilhelm Schickard Institute, Graphical-Interactive Systems (WSI/GRIS), University of Tübingen, 72070 Tübingen, Germany
‡ Applied Autonomous Sensor Systems (AASS), University of Örebro, 70182 Örebro, Sweden

ABSTRACT

One major challenge of 3DTV is content acquisition. Here, we present a method to acquire a realistic, visually convincing 3D model of indoor environments based on a mobile platform that is equipped with a laser range scanner and a panoramic camera. The data of the 2D laser scans are used to solve the simultaneous localization and mapping problem and to extract walls. Textures for walls and floor are built from the images of a calibrated panoramic camera. Multiresolution blending is used to hide seams in the generated textures. The scene is further enriched by 3D geometry calculated with a graph cut stereo technique. We present experimental results from a moderately large real environment.¹

1. INTRODUCTION

A 3D model can convey much more useful information than the typical 2D maps used in many applications. By combining vision and 2D laser range-finder data in a single representation, a textured 3D model can provide remote human observers with a rapid overview of the scene, enabling visualization of structures such as windows and stairs that cannot be seen in a 2D model. In the context of 3DTV, such models can help in planning camera paths and can provide realistic previews of large scenes with moderate effort.

We present an easy-to-use method to acquire such a model. A mobile robot equipped with a laser range scanner and a panoramic camera collects the data needed to generate a realistic, visually convincing 3D model of large indoor environments.
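As described later, wall extraction in our pipeline is a semi-automatic step with a user interface; purely as an illustrative sketch of how wall candidates could be found automatically, the following NumPy code greedily fits lines to the 2D range points with RANSAC. All function names, thresholds, and iteration counts below are our own assumptions, not the paper's implementation:

```python
import numpy as np

def ransac_line(points, iters=200, tol=0.03, min_inliers=30, rng=None):
    """Fit one dominant line to 2D points (N x 2) with RANSAC.
    Returns (inlier_mask, (n, d)) for the line n.x = d, or None."""
    if rng is None:
        rng = np.random.default_rng(0)
    best_mask, best_line = None, None
    for _ in range(iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        direction = q - p
        norm = np.linalg.norm(direction)
        if norm < 1e-9:
            continue  # degenerate sample, skip
        n = np.array([-direction[1], direction[0]]) / norm  # unit normal
        d = n @ p
        mask = np.abs(points @ n - d) < tol  # point-to-line distance (m)
        if mask.sum() >= min_inliers and \
           (best_mask is None or mask.sum() > best_mask.sum()):
            best_mask, best_line = mask, (n, d)
    return (best_mask, best_line) if best_mask is not None else None

def extract_walls(points, max_walls=8):
    """Greedily extract up to max_walls wall lines from a 2D scan."""
    walls, remaining = [], points.copy()
    for _ in range(max_walls):
        if len(remaining) < 30:
            break
        result = ransac_line(remaining)
        if result is None:
            break
        mask, line = result
        walls.append((remaining[mask], line))
        remaining = remaining[~mask]  # fit the next wall to the rest
    return walls
```

In practice such automatic candidates would still be confirmed or corrected by the operator, which is why our pipeline keeps a human in the loop for this step.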
Our geometric 3D model consists of planes that model the floor and walls (there is no ceiling, as the model is constructed from a set of bird's eye views). The geometry of the planes is extracted from the 2D laser range scanner data. Textures for the floor and the walls are generated from the images captured by the panoramic camera. Multiresolution blending is used to hide seams in the generated textures stemming, e.g., from intensity differences in the input images. The scene is further enriched by 3D geometry calculated with a graph cut stereo technique to include non-wall structures like stairs, tables, etc. An interactive editor allows fast postprocessing of the automatically generated stereo data to remove outliers or moving objects.

Thus our approach builds a hybrid model of the environment by extracting geometry and using image-based approaches (texture mapping). A similar approach was applied by Früh and Zakhor [7] for generating a 3D model of downtown Berkeley. A complete review of hybrid techniques is beyond the scope of this paper; we refer to the references in [7] and to the pioneering work of Debevec [5]. We believe that such hybrid techniques are superior to purely image-based techniques like Aliaga's work [1], which requires advanced compression and caching techniques and still provides only a limited set of viewpoints (a single plane). The indoor model presented here is much larger than other reported indoor models, yet it is possible to view it in our point cloud viewer from arbitrary viewpoints in real time.

¹ This work is supported by the EC within FP6 under Grant 511568 with the acronym 3DTV.
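The paper does not detail the blending implementation; multiresolution blending in the Burt–Adelson sense can be sketched with Laplacian pyramids as follows. This is a minimal grayscale NumPy sketch: the function names, the pyramid depth, and the 5-tap binomial kernel are our assumptions, not the system's actual code:

```python
import numpy as np

_K = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # 5-tap binomial kernel

def _blur(img):
    """Separable binomial blur (zero-padded at the borders)."""
    for axis in (0, 1):
        img = np.apply_along_axis(
            lambda r: np.convolve(r, _K, mode='same'), axis, img)
    return img

def _down(img):
    """Blur, then subsample by 2 (one pyramid level down)."""
    return _blur(img)[::2, ::2]

def _up(img, shape):
    """Upsample to `shape` by pixel doubling, then blur."""
    out = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return _blur(out[:shape[0], :shape[1]])

def pyramid_blend(a, b, mask, levels=3):
    """Blend 2D float images a and b (same shape) with a Laplacian
    pyramid; mask is 1.0 where a should win, 0.0 where b should."""
    la, lb, gm = [], [], [mask.astype(float)]
    for _ in range(levels):
        da, db = _down(a), _down(b)
        la.append(a - _up(da, a.shape))   # Laplacian (detail) level of a
        lb.append(b - _up(db, b.shape))
        a, b = da, db
        gm.append(_down(gm[-1]))          # Gaussian pyramid of the mask
    out = gm[-1] * a + (1.0 - gm[-1]) * b  # blend the coarsest level
    for i in reversed(range(levels)):      # add back blended details
        out = _up(out, la[i].shape) + gm[i] * la[i] + (1.0 - gm[i]) * lb[i]
    return out
```

Because the mask is smoothed more aggressively at coarse levels, low frequencies (e.g., overall intensity differences between input images) transition gradually while fine detail stays sharp, which is exactly the seam-hiding behavior needed for the wall and floor textures.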
Figure 1: An overview of our method to build a 3D model of an indoor environment. Shown is the data flow between the different modules.

The main idea of our method to build a 3D model of an indoor environment is to remotely steer a mobile robot through it. At regular intervals, the robot records a laser scan, an odometry reading and an image from the panoramic camera. The robot platform is described in section 2. From this data, the 3D model is constructed. Fig. 1 gives an overview of the method and shows the data flow between the different modules. Five major steps can be identified as follows (the second step, data collection, is omitted from Fig. 1 for clarity):

1. Calibration of the robot's sensors.
2. Data collection.
3. Map generation.
4. Texture generation.
5. Stereo processing.

Our method consists of manual, semi-automatic and automatic parts. Recording the data and calibration is done manually by teleoperation, and extraction of the walls is done semi-automatically with a user interface. Stereo matching is automatic, but selection of extracted 3D geometry and postprocessing includes semi-automatic and manual parts.

2. HARDWARE PLATFORM

The robot platform used in these experiments is an ActivMedia Peoplebot (see Fig. 3). It is equipped with a SICK LMS 200 laser scanner and a panoramic camera consisting of an ordinary CCD camera (interlaced and TV resolution) with an omni-directional lens attach-