Spatial content-based scene similarity assessment

Caixia Wang*, Anthony Stefanidis, Peggy Agouris
Department of Geography and Geoinformation Science, Center for Geospatial Intelligence, George Mason University, Fairfax, VA 20109, USA

Article history: Received 4 April 2011; received in revised form 25 February 2012; accepted 27 February 2012

Keywords: Scene matching; Similarity metrics; Image matching

Abstract

Scene comparison and matching is a fundamental operation in geoinformatics. However, existing solutions are inadequate for scene similarity assessment when comparing datasets collected from diverse sources, especially ones that are available in diverse modalities (e.g. comparing image to vector datasets) or that represent different time instances and thus differ partially in their content. In this paper we introduce a two-stage scene similarity assessment and matching framework that makes use of spatial scene content to compare and match two scenes as they may be captured in two different datasets (e.g. an aerial image and a map). In the first stage, our approach applies a matching algorithm based on the comparison of attributed graphs: linear feature networks (e.g. road networks) are transformed into graphs, and network properties are expressed through graph-embedded invariant attributes. By matching these graphs we can assess the similarity between two scenes. In the second stage, we proceed with an invariant scene comparison metric that incorporates additional scene content in the form of object configurations present within individual road network loops (e.g. building arrangements within city squares). By combining diverse but co-located pieces of information (e.g. roads and buildings) in an integrated process, our algorithm supports scene comparison and matching even when comparing heterogeneous datasets.
In this paper we present key theoretical concepts and provide experimental results to demonstrate the performance of the proposed approach.

© 2012 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.

1. Introduction

Scene comparison and matching is a fundamental operation in geoinformatics, supporting, for example, the co-registration or georegistration of geospatial datasets and the integration of multisource and multitemporal datasets for site monitoring and analysis. This challenging process entails the identification, in one or more datasets, of scenes that correspond to (i.e. match) a reference scene. In this paper we address the problem of comparing entities extracted from digital imagery (e.g. through photogrammetric or remote sensing techniques) to the corresponding records of these features in a geospatial database. In the context of this publication we refer to this as image-to-vector matching. While a wide variety of image-to-image matching algorithms exist that make use of image intensity values to compare and match two image windows (Babbar et al., 2010), vector datasets lack such intensity information, rendering existing intensity-based algorithms unsuitable for image-to-vector matching. The matching problem becomes even more complicated when the compared datasets (and thus the representations of entities extracted from them) vary in scale, orientation, or coverage, or when they are affected by occlusions, shadows, and other comparable artifacts. Thus, image-to-vector comparison often requires a human operator to manually identify corresponding features in the image and the GIS dataset and use them for matching. These features can be points, objects, or networks of linear features (e.g. road networks). As point features are easily extracted and widely available, point-based solutions have long been popular.
For instance, Drewniok and Rohr (1997) proposed to automatically match constellations of manhole covers extracted from large-scale urban imagery to those from a cadastral database for georegistration. In the work of Holm et al. (1995), natural objects such as islands and lakes were used as point features for matching SPOT imagery. Brown and Lowe (2007) used scale-invariant features derived through scale-space analysis as input to a RANSAC solution for stitching panoramic images. More recent approaches address point cloud matching through the use of objective functions (e.g. Kaminsky et al., 2009), through a merge-and-mesh approach (Liu et al., 2010), or through a plane reconstruction approach (Pathak et al., 2009). Point-based approaches prove inadequate in scenes where the number and spatial distribution of extracted point features is low and irregular, and they also tend to be error-prone, as points themselves carry minimal content information. Regarding object shape, position, size, and orientation, we can mention the work of Zhao et al. (2011) on spatial congruence metrics. By

http://dx.doi.org/10.1016/j.isprsjprs.2012.02.007

* Corresponding author.
E-mail addresses: cwangg@gmu.edu (C. Wang), astefani@gmu.edu (A. Stefanidis), pagouris@gmu.edu (P. Agouris).

ISPRS Journal of Photogrammetry and Remote Sensing 69 (2012) 103–120
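The first stage of the framework described in the abstract, recasting a road network as a graph whose nodes and edges carry invariant attributes, can be sketched roughly as follows. This is a minimal illustration under our own assumptions (a plain-Python adjacency structure, with junction degree, edge length, and edge orientation as the embedded attributes), not the authors' implementation.

```python
import math
from collections import defaultdict

def build_attributed_graph(segments):
    """Turn road centerline segments into an attributed graph.

    segments : list of ((x1, y1), (x2, y2)) endpoint tuples.
    Returns (nodes, edges), where nodes maps each junction point to
    its valence (degree), and edges maps each segment to a dict of
    attributes ('length' and 'angle').
    """
    adjacency = defaultdict(set)
    edges = {}
    for p, q in segments:
        # record connectivity at both endpoints
        adjacency[p].add(q)
        adjacency[q].add(p)
        # store per-edge attributes for later comparison
        edges[(p, q)] = {
            "length": math.hypot(q[0] - p[0], q[1] - p[1]),
            "angle": math.atan2(q[1] - p[1], q[0] - p[0]),
        }
    # junction degree is a graph-embedded attribute that survives
    # rotation, translation, and uniform scaling of the scene
    nodes = {n: len(neighbors) for n, neighbors in adjacency.items()}
    return nodes, edges
```

Junction degree is invariant to scale and rotation, and ratios of edge lengths (rather than the raw lengths stored here) are scale-invariant, so a matcher comparing two such graphs can operate across datasets that differ in scale or orientation, in the spirit of the attributed-graph comparison the paper proposes.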