IEEE Workshop on Video Registration (with ICCV’01), Vancouver, Canada, July 13, 2001

Error Characteristics of Parallel-Perspective Stereo Mosaics

Zhigang Zhu, Allen R. Hanson, Howard Schultz, Edward M. Riseman
Department of Computer Science, University of Massachusetts at Amherst, MA 01003
E-mail: zhu@cs.umass.edu

Abstract

This paper analyzes different aspects of the error characteristics of parallel-perspective stereo mosaics generated from an airborne video camera moving through a complex three-dimensional scene. First, we show that theoretically a stereo pair of parallel-perspective mosaics is a good representation for an extended scene, and that the adaptive baseline inherent to the geometry permits depth accuracy independent of absolute depth. Second, in practice, we have proposed a 3D mosaicing technique, PRISM (parallel-ray interpolation for stereo mosaicing), that uses interframe matching to interpolate the camera position between the original exposure centers of video frames taken at discrete spatial steps. By analyzing the errors introduced by a 2D mosaicing method, we explain why the "3D mosaicing" solution is important to the problem of generating smooth and accurate mosaics while preserving stereoscopic information. We further examine whether this ray interpolation step introduces extra errors in depth recovery from stereo mosaics by comparing it with the typical perspective stereo formulation. Third, we analyze the error characteristics of parallel stereo mosaics from cameras with different configurations of focal lengths and image resolutions. Results of mosaic construction from aerial video data of real scenes, and of 3D reconstruction from these mosaics, are presented. We conclude that (1) stereo mosaics generated with the PRISM method have significantly fewer errors in 3D recovery (even if the error is not depth independent) due to the adaptive baseline geometry; and (2) a longer focal length is better since stereo matching becomes more accurate.

1. Introduction

There have been attempts in a variety of applications to add 3D information into an image-based mosaic representation. Creating stereo mosaics from two rotating cameras was proposed by Huang & Hung [1], and from a single off-center rotating camera by Ishiguro et al. [2], Peleg & Ben-Ezra [3], and Shum & Szeliski [4]. In these kinds of stereo mosaics, however, the viewpoint, and therefore the parallax, is limited to images taken from a very small area.

Recently, our work [5,6,7] has focused on parallel-perspective stereo mosaics from a dominantly translating camera, which is the prevalent sensor motion during aerial surveys. A rotating camera can be easily controlled to achieve the desired motion. In contrast, the translation of a camera over a large distance is much harder to control in real vision applications such as robot navigation [8] and environmental monitoring [6, 9]. We have previously shown [5-7] that image mosaicing from a translating camera raises a different set of problems from those of circular projections of a rotating camera. These include suitable mosaic representations, the generation of a seamless image mosaic under a rather general motion with motion parallax, and the epipolar geometry associated with multiple-viewpoint geometry. In this paper we give a thorough analysis of various aspects of the error characteristics of 3D reconstruction from parallel-perspective stereo mosaics generated from real video sequences.

It has been shown independently by Chai and Shum [10] and by Zhu et al. [5,6] that parallel-perspective stereo is superior to both the conventional perspective stereo and the recently developed multi-perspective stereo for 3D reconstruction, in that the adaptive baseline inherent to the parallel-perspective geometry permits depth accuracy independent of absolute depth. However, this conclusion is obtained in an ideal case, i.e., when
enough samples of parallel projection rays from a “virtual camera” with ideal 1D or 2D motion can be generated from a complete scene model. In practice, stereo mosaicing from a real video sequence requires us to consider how the errors in the final mosaics depend on camera motion types, frame rates, focal lengths, and scene depths. The analysis of the error characteristics of 3D reconstruction from real stereo mosaics is the focus of this paper.

First, we will show why efficient “3D mosaicing” techniques are important for accurate 3D reconstruction from stereo mosaics. Standard 2D mosaicing techniques based on 2D image transformations, such as manifold projection [11], cannot generate a seamless mosaic in the presence of large motion parallax, particularly for surfaces that are highly irregular or have large height differences. Moreover, the perspective distortion that causes the geometric seams will also introduce errors in 3D reconstruction using the parallel-perspective geometry of stereo mosaics. In generating image mosaics