Markerless Image-based 3D Tracking for Real-time Augmented Reality Applications

R. Koch, K. Koeser, B. Streckel, J.-F. Evers-Senne
Institute of Computer Science and Applied Mathematics
Christian-Albrechts-University of Kiel, 24098 Kiel, Germany
email: rk@informatik.uni-kiel.de

Abstract

In this contribution we describe a visual markerless real-time tracking system for Augmented Reality applications. The system uses a fisheye lens mounted on a firewire camera running at 10 fps for visual tracking of 3D scene points without any prior scene knowledge. All visual-geometric data is acquired online during tracking using a structure-from-motion approach. 2D image features in the hemispherical fisheye image are tracked with a 2D feature point tracker. Tracking may be facilitated by orientation compensation with an inertial sensor. Based on the image tracks, the 3D camera egomotion and the 3D features are estimated online from the image sequence. The tracking is robust even in the presence of moving objects, since the large field of view of the camera stabilizes the tracking.

1 INTRODUCTION

Augmented Reality (AR) systems aim at the superposition of additional scene data onto the video stream of a real camera. One can distinguish between offline augmentation for special effects in video post-production [3] and online augmentation, where a user typically carries a head-mounted display. Additional information is either superimposed directly onto the video stream using video see-through devices, or it is projected optically into the visual path of the user's gaze direction [1, 2].

The technical and algorithmic demands for online AR are very challenging. The AR equipment must be carried by the user, possibly for a long time; hence it should be lightweight and ergonomic and must not hinder free movement. At the same time, camera pose must be computed very fast and reliably, even in uncooperative environments with difficult lighting situations.
This places high computational demands on the system.

Recently, considerable research on online AR has been undertaken, inspired by online tracking algorithms from robotics and computer vision. In robotics, the real-time SLAM approach (Simultaneous Localization And Mapping) has been used with non-visual sensors such as odometry and ultrasound/laser sensors; these ideas were recently extended to visual tracking [4]. In computer vision, offline AR and visual reconstruction have been a focus for some years. The dominant approach in this field is termed SfM (Structure from Motion), in which simultaneous camera pose estimation, even from uncalibrated cameras, and 3D structure reconstruction are possible [8]. Both approaches have much in common and can be merged into a versatile real-time AR system [6].

2 ONLINE AR SYSTEM DESIGN

In the following we describe the components of an online AR system that allows robust 3D camera tracking in complex and uncooperative scenes where parts of the scene may move independently. It is based on the SfM approach from computer vision. Robustness is achieved in two ways:

1. A 190-degree hemispherical fisheye lens is used that captures a very large field of view of the scene. In indoor environments, the hemispherical view always sees many static visual structures, even if the scene in front of the user changes dramatically. The system is therefore mainly designed for (but not restricted to) indoor use, because in outdoor scenes sunlight falling directly onto the CCD sensor causes problems. These problems can be alleviated when CMOS sensors with logarithmic response and high dynamic range are used.

2. The 3D tracking is based on robust camera pose estimation using structure-from-motion algorithms [8] that are optimized for real-time performance. These algorithms can handle measurement outliers from the 2D tracking using robust statistics.