Forest Mapping by Partially Supervised Classification Applied to Vegetation Indexes

Michaela DE MARTINO, Sebastiano B. SERPICO, Caterina CAMURRI
Dept. of Biophysical and Electronic Engineering, Univ. of Genoa, I-16145 Genova, Italy
michaela.demartino@dibe.unige.it, vulcano@dibe.unige.it

Abstract: The large volume of available remotely sensed data must be analyzed before it can be of practical use to its wide range of potential users. This work aims to develop an approach that makes vegetation mapping from multispectral data feasible and reliable, exploiting both pattern recognition and application-oriented data processing. An automated, partially supervised classification system is proposed that requires little interaction with the operator; it employs a feature vector composed entirely of vegetation indexes, suitable for classifying vegetated areas in mountainous regions. We focus on the detection of vegetation species within forests in mountainous regions. The proposed approach has been tested on a multispectral Landsat 7 image acquired over a mountainous area in Arizona (USA). It has proved effective, as confirmed by comparisons with the results obtained by applying the same classification procedure directly to the original bands and by using a completely supervised Maximum Likelihood classifier.

Keywords: Vegetation mapping; partially supervised classification; multispectral satellite data.

I. INTRODUCTION

Forests are a valuable resource providing food, shelter, wildlife habitat, fuel, and daily supplies such as medicinal ingredients and paper. Forests play an important role in balancing the Earth's CO2 supply and exchange, acting as a key link between the atmosphere, geosphere, and hydrosphere.
The main issues concerning forest management are depletion due to natural causes (fires and infestations) or human activity (clear-cutting, burning, land conversion), and the monitoring of health and growth for effective commercial exploitation and conservation. The location, extent, health, productivity, and sustainability of such natural resources are critical information for natural land management. Remote sensing can reduce the cost of resource inventory and monitoring if remotely sensed data are well correlated with important field measurements and are available when needed. Remote sensing technology offers two immediate advantages to a field such as land management: it provides a synoptic view of areas that would otherwise have to be surveyed at enormous human and economic cost, and it bypasses non-trivial obstacles, such as the poor accessibility of some regions or the risk of altering the environment through invasive in situ inspections [1]. To manage and interpret this kind of data, it seems wise to construct a data analysis scheme that combines the keen perceptive and associative powers of humans with the objective, quantitative abilities of computers.

Classification is one of the fundamental aspects of the analysis of remotely sensed images: exploiting the information in the imagery, it aims to label each pixel of an image as representing a particular ground cover type, or class. Image classification is the first step of most recognition techniques and change detection algorithms. It can be performed in two different ways: in a supervised way, by means of training pixels provided as ground truth images, or in an unsupervised way, i.e., without any a priori knowledge of the real nature of the pixels in the image, letting automatic cluster-seeking procedures reveal the natural classes contained in the scene.
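The distinction between the two modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method proposed in this paper: the pixels are synthetic two-feature vectors, the supervised case uses a Gaussian maximum-likelihood rule with equal priors, the unsupervised case uses a plain k-means loop, and all function names, feature values, and sample sizes are hypothetical.

```python
import numpy as np

def ml_classify(pixels, train_pixels, train_labels):
    """Supervised: Gaussian maximum-likelihood labelling from training pixels."""
    classes = np.unique(train_labels)
    scores = []
    for c in classes:
        x = train_pixels[train_labels == c]
        mean = x.mean(axis=0)
        cov = np.cov(x, rowvar=False)
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        d = pixels - mean
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", d, inv, d)
        # Log-likelihood up to an additive constant (equal priors assumed).
        scores.append(-0.5 * (maha + logdet))
    return classes[np.argmax(np.stack(scores, axis=1), axis=1)]

def kmeans_cluster(pixels, k, n_iter=20, seed=0):
    """Unsupervised: k-means clustering; no ground truth is used."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels

# Synthetic scene: two hypothetical cover types, 200 pixels each.
rng = np.random.default_rng(42)
class_a = rng.normal([0.2, 0.6], 0.03, size=(200, 2))
class_b = rng.normal([0.5, 0.3], 0.03, size=(200, 2))
pixels = np.vstack([class_a, class_b])
truth = np.repeat([0, 1], 200)

# Supervised: only 20 labelled pixels per class serve as ground truth.
train_idx = np.r_[0:20, 200:220]
supervised = ml_classify(pixels, pixels[train_idx], truth[train_idx])

# Unsupervised: cluster indices are arbitrary and must be named afterwards.
clusters = kmeans_cluster(pixels, k=2)
```

Note that the supervised output carries the class labels of the training set, whereas the cluster indices returned by the unsupervised procedure have no meaning until an analyst associates each cluster with a ground cover type.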
Often, these two methods are combined in hybrid approaches to obtain more satisfactory results.

This work proposes a classification approach based on a partially supervised method for forest mapping applications, aimed at distinguishing the different vegetation species within the general land cover class named Forest. We intend to demonstrate that a multidimensional feature vector designed ad hoc for this goal can widen the application field of multispectral data, which is usually limited to monitoring vegetation health and the distribution of general vegetation classes by means of just one spectral feature.

0-7803-9050-4/05/$20.00 ©2005 IEEE

II. METHODOLOGICAL APPROACH

Our purpose is to develop a classification approach able to discern specific information classes, such as different types of vegetation cover or vegetated areas, using a limited training set. We study the problem from two viewpoints: first, the composition of a feature vector as suitable as possible for our target, i.e., forest mapping; second, the adoption of a methodology able to exploit this vector in the best way, under the aforesaid constraint of scarce ground truth, and implemented so as to be as automated as possible.

A. Architecture of the proposed method

Supervised classification is a powerful analysis tool in terms of both accuracy and speed. Unfortunately, it also presents some drawbacks. In particular,