Multimodality Image Registration And Fusion Using Neural Network

Mostafa G. Mostafa, Aly A. Farag, and Edward Essock*

Computer Vision and Image Processing Laboratory, Department of Electrical and Computer Engineering, University of Louisville, Louisville, KY 40292, USA.
*Department of Psychology, University of Louisville, Louisville, KY 40292, USA
E-mails: {mostafa,farag}@cairo.uofl.edu, eaesso01@gwise.louisville.edu

Abstract - Multimodality image registration and fusion are essential steps in building 3-D models from remote sensing data. In this paper, we present a neural network technique for the registration and fusion of multimodality remote sensing data for the reconstruction of 3-D models of terrain regions. A feed-forward neural network is used to fuse the intensity data sets with the spatial data set after learning its geometry. Results on real data are presented. Human performance is assessed on several perceptual tests in order to evaluate the fusion results.

Keywords: Data fusion, image registration, image interpolation, neural network, 3-D model building.

1 INTRODUCTION

Three-dimensional (3-D) digital models of objects are essential for various imaging applications. Object recognition, visualization of Earth observation data, medical imaging, regional/urban planning, and monitoring land-cover changes are just a few examples. Although information extraction in remote sensing over the past three decades has largely made use of the information contained within multispectral measurements from a single sensor, it has been recognized that supplementing multispectral data with spatial data from other sources may provide substantial improvement in the output [1]. The current trend of multisensor data includes not only spectral data, but also ground-cover maps, hyperspectral data, radar data, and topographic information such as elevation, slope, and range data.
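To make the fused product concrete, the following sketch combines a topographic map (a DEM grid) with a co-registered thematic map (a single-band intensity image) into a textured 3-D point set, one (x, y, z, intensity) point per grid cell. The arrays and `fuse_to_point_cloud` helper are hypothetical illustrations, not the paper's neural-network method.

```python
import numpy as np

# Synthetic stand-ins for two registered data sets on the same 4x4 grid:
# a DEM (elevation in metres) and a single-band intensity image.
dem = np.array([[10.0, 12.0, 15.0, 14.0],
                [11.0, 13.0, 16.0, 15.0],
                [12.0, 14.0, 17.0, 16.0],
                [12.5, 14.5, 17.5, 16.5]])
intensity = np.linspace(0.0, 1.0, 16).reshape(4, 4)

def fuse_to_point_cloud(dem, intensity, cell_size=1.0):
    """Drape an intensity image over a DEM of the same shape,
    producing (x, y, z, intensity) points for a 3-D terrain model."""
    assert dem.shape == intensity.shape
    rows, cols = np.indices(dem.shape)
    x = cols * cell_size          # easting of each grid cell
    y = rows * cell_size          # northing of each grid cell
    return np.column_stack([x.ravel(), y.ravel(),
                            dem.ravel(), intensity.ravel()])

points = fuse_to_point_cloud(dem, intensity)
print(points.shape)  # (16, 4): one textured 3-D point per grid cell
```

Any misalignment between the two grids corrupts every fused point, which is why the registration step discussed next dominates the quality of the final model.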
The synergistic use of multiple sensor data is a major factor in enabling some measure of intelligence to be incorporated into the overall operation of the system. A 3-D model of a terrain area can be built by the fusion of at least two types of data sets: first, thematic maps, e.g., images from color, Landsat multispectral, and/or AVIRIS hyperspectral sensors; second, topographic maps, e.g., digital elevation model (DEM) data from airborne laser terrain mapper (ALTM) and/or interferometric synthetic aperture radar (IFSAR) sensors. The accuracy of the reconstructed model depends on the reliability of both the data registration and the data fusion methodology.

The registration of two images allows for the combination of complementary information from both images. The goal of image registration is to establish the correspondence between two images and determine the geometric transformation that aligns one image with the other. Most fusion errors originate from poor registration, a process that is essential for fusing multisource data. For example, a shift of only one pixel in the registration can change the results of many pixels, especially with low-resolution data, e.g., Landsat and AVIRIS.

There are two main approaches to image registration [2]. One approach makes direct use of the original data, e.g., region, contour, and/or edge matching, and the other is based on feature matching. Features may be ground control points (GCPs), corners, line segments, etc. They are assumed to be available as a result of applying standard feature extraction algorithms. The first approach, which is based on a correlation measure between the two images, is computationally expensive and is also sensitive to noise unless substantial preprocessing is employed. Feature-based methods, on the other hand, tend to yield more accurate results, as features are usually more reliable than intensity or radiometric values.
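As a sketch of the feature-based approach, the geometric transformation can be estimated from matched GCP pairs by least squares; the snippet below fits an affine transform (the GCP coordinates and the `fit_affine` helper are hypothetical, chosen only to illustrate the idea).

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform mapping src GCPs onto dst GCPs.
    src, dst: (N, 2) arrays of matched ground control points, N >= 3."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    # Each GCP contributes a row [x, y, 1]; solve A @ T = dst.
    A = np.column_stack([src, np.ones(len(src))])
    T, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return T  # (3, 2) affine coefficients

# Hypothetical matched GCPs: the second image is the first
# shifted by (5, -3) pixels.
src = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], float)
dst = src + np.array([5.0, -3.0])
T = fit_affine(src, dst)
warped = np.column_stack([src, np.ones(len(src))]) @ T
print(np.allclose(warped, dst))  # True
```

With more than three point pairs the least-squares fit averages out small localization errors in individual GCPs, which is one reason feature-based methods tend to be more accurate than raw correlation.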
However, the accuracy of feature-based registration methods depends largely on the accuracy of the selected GCPs.
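For contrast, the correlation-based approach mentioned above can be sketched as an exhaustive search for the offset that maximizes normalized cross-correlation (NCC); the nested loops make the computational cost evident. The image, template, and function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation between two equal-sized patches."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return float((p * t).sum() / denom)

def register_by_correlation(image, template):
    """Slide the template over the image and return the offset with
    the highest NCC score (the estimated registration shift)."""
    th, tw = template.shape
    best, best_off = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):      # every row offset
        for c in range(image.shape[1] - tw + 1):  # every column offset
            score = ncc(image[r:r + th, c:c + tw], template)
            if score > best:
                best, best_off = score, (r, c)
    return best_off

rng = np.random.default_rng(0)
image = rng.random((20, 20))
template = image[7:12, 3:8]       # patch cut out at offset (7, 3)
print(register_by_correlation(image, template))  # (7, 3)
```

On noisy data the correlation surface develops spurious peaks, which is why substantial preprocessing is needed before this approach becomes reliable.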