A SEGMENTATION PROCEDURE OF LIDAR DATA BY APPLYING MIXED PARAMETRIC AND NONPARAMETRIC MODELS
Fabio Crosilla, Domenico Visintini, Francesco Sepic
Department of Georesources & Territory, University of Udine, via Cotonificio, 114 I-33100 Udine, Italy
crosilla@dgt.uniud.it
KEY WORDS: LIDaR data, segmentation process, parametric and nonparametric models
ABSTRACT:
The paper proposes a segmentation procedure inspired by a robust LIDaR data filtering method recently introduced by the authors. The method is based on the application of a Simultaneous AutoRegressive (SAR) model for describing a trend surface and of an iterative Forward Search (FS) algorithm to detect clusters of non-stationary data. The procedure consists of an automatic process that identifies raw clusters of data relating to the geometrical configurations to be segmented with the robust iterative SAR-FS parametric model. The search for homogeneous clusters of points is carried out by applying a local polynomial regression algorithm, automatically adapted to the morphological variability of the LIDaR points. The combination of the parametric and nonparametric models in a mixed analytical procedure makes it possible to optimize the efficiency of the segmentation and to dramatically reduce the memory and computing time required. Some significant experiments demonstrate the potential of the proposed method.
1. INTRODUCTION
The Airborne Laser Scanning technique is extremely efficient at meeting the increasing demand for high-accuracy spatial data for civil engineering, environmental protection, planning purposes, etc. The main processing steps are the filtering of the points (to detect the ground terrain), their segmentation (to classify the point dataset into different classes), and the 3D modeling of clusters (to upgrade the data structure from an irregular, sparse one to a vector, object-oriented one).
With regard to point segmentation algorithms, which exploit geometrical and/or radiometric properties, this paper proposes a new one inspired by the procedure suggested by Crosilla, Visintini and Prearo (2004a) for the filtering of non-ground measurements from airborne laser data. The method is based on a Simultaneous AutoRegressive (SAR) model to describe the geometrical trend of a surface (chapter 2), and on an iterative Forward Search (FS) algorithm (Atkinson and Riani, 2000) to detect outliers and/or clusters of non-stationary data (chapter 3). Starting from a subset of stationary LIDaR data, the forward search approach allows a robust iterative estimation of the unknown SAR parameters. At each iteration, one or more LIDaR points are added to the subset, according to their level of agreement with the postulated surface model. Outliers and/or non-stationary data are identified by proper statistical diagnostics and are included only at the end of the iterative process.
The method has already been successfully applied to segment man-made objects characterized by plane surfaces, like roofs, or by more complicated higher-order geometry (Crosilla, Visintini and Prearo, 2004b). Nevertheless, it presents some critical aspects in the automatic extraction of the raw initial clusters and in the extension of the process to the entire set of points, which also contains points with no geometrical relationship to the particular cluster to be identified.
The paper proposes a new analytical method to automatically identify the initial raw data cluster relating to a generic geometrical feature. For every subset of homogeneous LIDaR data, the method identifies a limited number of surrounding points to submit to the refinement segmentation process, so as to dramatically reduce computing time and memory. Finally, the algorithm makes it possible to automatically perform the segmentation of the entire data collection.
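The forward search idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: an ordinary least-squares plane fit stands in for the SAR estimation, a hypothetical absolute height tolerance `tol` stands in for the statistical diagnostics, and the robust starting subset is chosen with a simple least-median-of-squares style trial over random triples. The function names `fit_plane` and `forward_search` and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def fit_plane(A, z, idx):
    """Least-squares plane coefficients estimated from the rows in idx."""
    theta, *_ = np.linalg.lstsq(A[idx], z[idx], rcond=None)
    return theta


def forward_search(E, N, z, tol=1.0, n_trials=200, seed=0):
    """Return a boolean mask of the points that agree with one plane.

    Illustrative stand-in: ordinary least squares replaces the SAR
    estimation, and the absolute height tolerance `tol` (same unit as z)
    replaces the statistical diagnostics mentioned in the text.
    """
    rng = np.random.default_rng(seed)
    n = len(z)
    A = np.column_stack([np.ones(n), E, N])  # rows [1, E_i, N_i]

    # Robust start: among random triples keep the one whose plane gives
    # the smallest median squared residual over all points.
    best_idx, best_med = None, np.inf
    for _ in range(n_trials):
        idx = rng.choice(n, size=3, replace=False)
        med = np.median((z - A @ fit_plane(A, z, idx)) ** 2)
        if med < best_med:
            best_idx, best_med = idx, med

    subset = best_idx
    while len(subset) < n:
        # Re-estimate on the current subset, then rank every point by
        # its agreement with the fitted surface.
        res = np.abs(z - A @ fit_plane(A, z, subset))
        order = np.argsort(res)
        m = len(subset)
        if res[order[m]] > tol:
            # The next candidate disagrees with the model: stop growing.
            break
        subset = order[: m + 1]  # the m + 1 best-fitting points

    mask = np.zeros(n, dtype=bool)
    mask[subset] = True
    return mask
```

Calling `forward_search(E, N, z)` on coordinate and height arrays of equal length returns `True` for the points absorbed into the growing subset and `False` for those left outside it, which play the role of the outliers or non-stationary data that the procedure includes only at the end. The fixed tolerance is a deliberate simplification and assumes the non-conforming points are well separated in height from the fitted surface.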
The search for the initial homogeneous raw cluster of points is carried out by applying a local nonparametric regression algorithm (chapter 4), while the refinement process is performed by a robust parametric model, namely the aforementioned SAR model estimated following the FS procedure. For each LIDaR point, the nonparametric algorithm computes the predicted local trend value of the surface and its partial derivatives in the East and North directions. The LIDaR points belonging to the same homogeneous subset are characterized by a significant agreement between the measured and the predicted height and, for planar surfaces, also by spatially constant values of the partial derivatives (chapter 5).
2. A SIMULTANEOUS AUTOREGRESSIVE SEGMENTATION MODEL
The proposed algorithm works under the hypothesis that LIDaR measurements of the surface point heights can be adequately represented by the SAR model (Anselin, 1988):
$$z = \rho W z + A\theta + \varepsilon \qquad (1)$$
where:
$z$ is the [n x 1] vector of laser height values (n being the total number of points to be segmented);
$\rho$ is a value (constant for the whole dataset) that measures the mean spatial interaction between neighbouring points;
$W$ is a [n x n] spatial adjacency (binary) matrix defined as $w_{ij} = 1$ if points i and j are neighbours, $w_{ij} = 0$ otherwise;
$A$ is a [n x r] matrix with rows $A_i = [\,1 \;\; E_i \;\; N_i \;\; \ldots \;\; E_i^s \;\; N_i^s\,]$, where $E_i$ and $N_i$ are the East and North coordinates of the points, interpolated by an orthogonal polynomial of degree s = (r-1)/2;
$\theta = [\,\theta_0 \;\; \theta_1 \;\; \ldots \;\; \theta_{r-1}\,]^T$ is a [r x 1] vector of parameters;
$\varepsilon$ is the [n x 1] vector of normally distributed errors (noise) with mean 0 and variance $\sigma^2$.
To solve equation (1), a Maximum Likelihood (ML) estimation of the unknown parameters has been chosen. Let us start from
ISPRS WG III/3, III/4, V/3 Workshop "Laser scanning 2005", Enschede, the Netherlands, September 12-14, 2005