ASPRS/MAPPS 2009 Fall Conference
November 16 – 19, 2009 * San Antonio, Texas

AIRBORNE SYNTHETIC SCENE GENERATION (AEROSYNTH)

Karl Walli, Lt Col, USAF-AFIT/CI
Dave Nilosek, MS Student
John Schott, PhD
Carl Salvaggio, PhD
Center for Imaging Science
Rochester Institute of Technology
Rochester, NY 14623

ABSTRACT

Automated synthetic scene generation is now becoming feasible with calibrated-camera remote sensing. This paper implements recently popularized computer vision techniques to extract "structure from motion" (SfM) of a calibrated camera with respect to a target. The process is similar to Microsoft's popular "PhotoSynth" technique (Microsoft, 2009), but it blends photogrammetric and computer vision techniques and applies them to geographic scenes imaged from an airborne platform. It is further augmented with new features that increase the fidelity of the 3D structure for realistic scene modeling, including the generation of both sparse and dense point clouds useful for synthetic macro/micro-scene reconstruction. Although computer vision has been an active area of research for decades, it has recently experienced a renaissance due to a few significant breakthroughs. This paper reviews the developments in mathematical formalism, robust automated point extraction, and efficient sparse matrix algorithms that have enabled the recovery of 3D structure from multiple aerial images of the same target, and applies them to geographical scene modeling. Scenes are reconstructed on both a macro and a micro scale. The macro-scale reconstruction uses the scale invariant feature transform (SIFT) to establish initial correspondences, then extracts a scene coordinate estimate using photogrammetric techniques. These estimates, along with calibrated camera information, are fed through a sparse bundle adjustment to produce refined scene coordinates.
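The photogrammetric step that turns matched image points into a scene coordinate estimate can be sketched with the standard linear (DLT) two-view triangulation. This is a minimal illustration, not the authors' implementation; the camera parameters, baseline, and scene point below are synthetic assumptions chosen only to exercise the math.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one matched point from two views.

    P1, P2 : 3x4 camera projection matrices (calibrated cameras).
    x1, x2 : (u, v) pixel coordinates of the matched feature in each view.
    Returns the estimated 3D scene point as a length-3 array.
    """
    # Each image observation contributes two linear constraints on X.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # Solve A X = 0 via SVD; the solution is the smallest right singular vector.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Demo: two synthetic calibrated cameras separated by a 5 m baseline.
K = np.array([[1000.0, 0, 320], [0, 1000.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])               # camera 1 at origin
P2 = K @ np.hstack([np.eye(3), np.array([[-5.0], [0], [0]])])   # camera 2 offset in x

X_true = np.array([2.0, 1.0, 50.0])  # assumed scene point, 50 m from the cameras
x1 = P1 @ np.append(X_true, 1); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1); x2 = x2[:2] / x2[2]

X_est = triangulate_dlt(P1, P2, x1, x2)
print(X_est)  # recovers approximately [2.0, 1.0, 50.0]
```

In the full pipeline, many such triangulated points (one per SIFT correspondence) would serve as the initial scene coordinates that a sparse bundle adjustment then jointly refines with the camera parameters.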
The micro-scale reconstruction performs a denser correspondence on specific targets using the epipolar geometry derived in the macro method. The seeds of computer vision were actually planted by photogrammetrists over 40 years ago, through the development of "space resection" and "bundle adjustment" techniques. But it is only the parallel breakthroughs in the previously mentioned areas that have finally allowed the dream of rudimentary computer vision to be fulfilled in an efficient and robust fashion. Both fields will benefit from the application of these advancements to geographical synthetic scene modeling. This paper explores the process the authors refer to as Airborne Synthetic Scene Generation (AeroSynth).

Key words: structure from motion, bundle adjustment, multi-view imaging, scene synthesis, computer vision.

AEROSYNTH INTRODUCTION

Recovering 3D structure from 2D images requires only that the scene be imaged from two different viewing geometries and that the same features can be accurately identified. Figure 1 depicts a site of interest imaged from multiple views using an airborne sensor; here the point of interest is the top of a smokestack, which will be imaged with the effects of parallax displacing it with respect to other features within the scene. This parallax displacement effect has been used for decades within the photogrammetry community to recover the 3D structure within a scene (DeWitt & Wolf, 2000). Unfortunately, robust automated techniques to match similar features within a scene remained fairly elusive until very recent breakthroughs in the area of computer vision.
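The parallax-to-height relationship referenced above can be made concrete with the classical parallax height equation found in photogrammetry texts such as DeWitt & Wolf (2000). The numbers below are illustrative assumptions, not measurements from the paper's imagery.

```python
def height_from_parallax(H, Pb, dp):
    """Classical parallax height equation.

    For a vertical feature such as the smokestack in Figure 1:
        h = H * dp / (Pb + dp)
    where H  is the flying height above the datum,
          Pb is the absolute stereoscopic parallax at the feature's base,
          dp is the differential parallax between the feature's top and base.
    """
    return H * dp / (Pb + dp)

# Assumed example values: 1200 m flying height, 90.0 mm base parallax,
# and 2.3 mm of differential parallax measured for the smokestack.
h = height_from_parallax(1200.0, 90.0, 2.3)
print(round(h, 1))  # about 29.9 m
```

The equation shows why the displacement of the smokestack's top relative to its base encodes height: a larger differential parallax dp directly yields a taller feature for a fixed flying height and base parallax.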