Range segmentation of large building exteriors: A hierarchical robust approach

Reyhaneh Hesami *, Alireza BabHadiashar, Reza HosseinNezhad

Faculty of Engineering and Industrial Sciences, Swinburne University of Technology, John Street, Hawthorn, VIC 3122, Australia

Article info
Article history: Received 11 March 2009; Accepted 17 December 2009; Available online 23 December 2009
Keywords: Large-scale range data; Range segmentation; Robust estimation; Historical building exteriors

Abstract
There are three main challenges associated with processing range data of large-scale outdoor scenes: (a) significant disparity in the size of features; (b) the existence of complex and multiple structures; and (c) high uncertainty in the data due to construction errors or moving objects. Existing range segmentation methods in the computer vision literature have generally been developed for laboratory-sized objects or shapes with simple geometric features and do not address these issues. This paper studies the main problems involved in segmenting the range data of large building exteriors and presents a robust hierarchical segmentation strategy to extract fine as well as large details from such data. The proposed method employs a high-breakdown robust estimator in a coarse-to-fine approach to deal with the discrepancies in size and sampling rate among the various features of large outdoor objects. The segmentation algorithm is tested on several outdoor range datasets obtained by different laser range scanners. The results show that the proposed method is an accurate and computationally cost-effective tool that facilitates automatic generation of 3D models of large-scale objects in general and building exteriors in particular.

© 2009 Elsevier Inc. All rights reserved.

1. Introduction

Earlier development of three-dimensional imaging technologies was mainly inspired by robotics applications involving laboratory-size polyhedral objects.
In the last decade, large-scale range measurement technology has advanced significantly [1,2], to the point that accurate, dense range data of outdoor objects up to a few hundred meters in size can be produced in minutes. As a result, it is now feasible to generate accurate geometric models of urban environments. Such models are important in a variety of applications, including augmented reality (e.g. [3]), archeology [4], and automated production of urban models of whole buildings and streetscapes (e.g. [5–7]). In particular, extraction of fine architectural details embedded in the façades of important buildings has found new significance for 3D modeling and preservation of historical and cultural sites (e.g. [2,8–11]).

Segmentation of two-dimensional data (such as color images and video sequences) of urban structures has been studied for many years (e.g. [12–15]). However, 3D range data segmentation of large buildings, especially of data obtained by laser range scanners, is a relatively new topic. The existing range data segmentation methods for man-made buildings in the computer vision literature can be classified into two main categories. The first class of approaches, proposed specifically for building exteriors, is based on architectural features. Stamos and Allen [16] employed attributes such as vanishing points, and Cantzler et al. [17] used parallelism of walls and orthogonality of edges, to extract linear features of buildings.

The second class of approaches considers the 3D dataset as a collection of pre-defined classes of segments. For instance, Zhao and Shibasaki [18] used a hierarchical scheme to partition images into instances of the following classes: vertical, horizontal and non-vertical lines, vegetation, and outliers. Han et al. [19,20] used a jump-diffusion method for segmenting a range image obtained by a 3D laser range scanner together with its associated reflectance image.
This method was implemented in a Bayesian framework to allow the integration of both geometric and freeform models. To reduce the computational cost of the Markov chain search scheme, a data-driven coarse-to-fine approach is employed. Although the experimental results show that the algorithm is accurate, it is computationally expensive (it takes one hour to process a 300 × 300 pixel dataset [20]), and the quality of the segmentation depends on the availability of a priori knowledge of the models as an input to the algorithm.

* Corresponding author. Fax: +61 3 9214 8264.
E-mail addresses: rhesami@swin.edu.au (R. Hesami), abab-hadiashar@swin.edu.au (A. BabHadiashar), rhosseinnezhad@swin.edu.au (R. HosseinNezhad).
Computer Vision and Image Understanding 114 (2010) 475–490. doi:10.1016/j.cviu.2009.12.004

Anguelov et al. [21] developed a learning-based approach for the segmentation of complex scenes. In the learning phase (pre-segmentation) of this method, the scan points are labeled and weighted according to an appropriate object class (e.g. ground, building, tree and shrub) using a maximum-margin learning approach. In the segmentation phase, a Markov Random Field (MRF) segmentation algorithm is applied to the classified data. The learning phase of this method is computationally expensive and, because the method does not model the parts of objects and their spatial relations, it cannot effectively segment objects that have many parts with local similarities. For instance, it is able to