First International Workshop on Digital and Computational Video, December 10, 1999, Tampa, Florida, USA

STEREO MOSAICS FROM A MOVING VIDEO CAMERA FOR ENVIRONMENTAL MONITORING

Zhigang Zhu, Allen R. Hanson, Howard Schultz, Frank Stolle, Edward M. Riseman
Computer Vision Lab, Department of Computer Science
University of Massachusetts at Amherst, MA 01003, U.S.A.
{zhu, hanson, hschultz, stolle, riseman}@cs.umass.edu

Abstract

Environmental monitoring using automated analysis of high-resolution aerial video is an application of growing importance with its own set of technical challenges. A mosaic is a commonly used tool for representing the enormous amount of data generated from aerial video sequences in an easily viewable form. In contrast to the usual applications of mosaics, the environmental monitoring domain requires both geo-corrected mosaics tied to real-world coordinates and three-dimensional information. The standard techniques for generating seamless mosaics, which use only image data in a frame-by-frame registration process, cannot satisfy this requirement: registration errors accumulate over extended periods of time, and the resulting mosaics are typically only 2D images that lose 3D information. On the other hand, 3D reconstruction of the terrain on a frame-by-frame basis has proven to be both difficult and time-consuming. In this paper we present a method for automatically and efficiently generating stereoscopic mosaics by seamless registration of optical data collected by a video camera mounted on an airborne platform. The resultant mosaics are globally correct with respect to the ground and exhibit correct 3D views when viewed stereoscopically.

1. INTRODUCTION

A critical issue among nations in the coming decades will be how to manage the use of land and natural resources.
Our interdisciplinary NSF environmental monitoring project, conducted jointly by researchers from the Department of Computer Science and the Department of Natural Resources Conservation at the University of Massachusetts at Amherst, aims to develop a methodology for estimating the standing biomass of forests. The instrumentation package mounted on an airplane consists of two video cameras (with different focal lengths), a GPS system, an INS system, and a profiling pulse laser. This paper focuses on the development of automated tools that create video mosaics from high-resolution, low-altitude video sequences; these mosaics can then be registered with lower-resolution, high-altitude aerial image data (or satellite image data) and used as a tool for interpreting the lower-resolution data. The manual approach previously used by our forestry experts exploited only a fraction of the available data because of the labor involved in interpreting so much video by hand. For example, a recent project in Bolivia involved over 600 sites and more than 20 hours of video, a volume that is prohibitive to interpret manually.