Special issue of the IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS)

Abstract—This paper presents hierarchical clustering algorithms for the land cover mapping problem using multi-spectral satellite images. In unsupervised techniques, the automatic generation of the number of clusters and their centers for a huge database has not been exploited to its full potential. Hence, a hierarchical clustering algorithm that uses splitting and merging techniques is proposed. Initially, the splitting method is used to search for the best possible number of clusters and their centers using Mean Shift Clustering (MSC), Niche Particle Swarm Optimization (NPSO) and Glowworm Swarm Optimization (GSO). Using these clusters and their centers, the merging method is used to group the data points based on a parametric method (k-means algorithm). A performance comparison of the proposed hierarchical clustering algorithms (MSC, NPSO and GSO) is presented using two typical multi-spectral satellite images: Landsat 7 thematic mapper and QuickBird. From the results obtained, we conclude that the proposed GSO-based hierarchical clustering algorithm is more accurate and robust.

Index Terms—Glowworm swarm optimization, mean shift clustering, niche particle swarm optimization.

I. INTRODUCTION

In the land cover mapping problem, we need actual information regarding the features of land to make good use of it. Using satellite images, we can accurately plan and use land efficiently. Satellite images offer a method of extracting temporal data that can be used to gain knowledge regarding land use. Recent advances in the realm of computer science have allowed us to perform this “intelligent” job. This has established a vast research area in solving the land cover mapping problem for city planning and land usage [1].

Manuscript received September 27, 2011. A preliminary version of this paper was presented at IGARSS 2011. J.
Senthilnath, Student Member, IEEE, is with the Department of Aerospace Engineering, Indian Institute of Science, Bangalore 560012, India (e-mail: snrj@aero.iisc.ernet.in). S. N. Omkar is with the Department of Aerospace Engineering, Indian Institute of Science, Bangalore 560012, India (corresponding author; phone: +91-80-22932873; fax: +91-80-22930134; e-mail: omkar@aero.iisc.ernet.in). V. Mani is with the Department of Aerospace Engineering, Indian Institute of Science, Bangalore 560012, India (e-mail: mani@aero.iisc.ernet.in). N. Tejovanth is with the Department of Electrical & Electronics Engineering, National Institute of Technology, Surathkal, India (e-mail: tejovanth.n@ieee.org). P.G. Diwakar is with the Earth Observation System, ISRO Headquarters, Bangalore, India (e-mail: diwakar@isro.gov.in). Archana Shenoy B is with the Department of Electronics & Communication Engineering, National Institute of Technology, Surathkal, India (e-mail: archana.shenoyb@gmail.com).

Unsupervised techniques can be used for grouping distinct land cover regions when ground truth information is lacking [2]. Based on a certain similarity metric, the data is sub-divided into clusters [3, 4] using unsupervised methods in which the number of clusters is not known a priori [5]. The objective is to maximize the inter-cluster distances while minimizing the intra-cluster distances. Clustering problems can be studied using a hierarchical approach [6], by breaking a large cluster and merging smaller groups into their closest centroid [7]. Two approaches are used in this hierarchical clustering method: (i) divisive methods, where a large cluster is split into several small clusters; (ii) agglomerative methods, where many small clusters are merged to form a large cluster. The grouping of the same clusters is regarded as a fundamental task in the land cover mapping problem, which transforms the remotely sensed images to generate thematic land-use/land-cover maps [5].
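The clustering objective described above can be illustrated with a minimal sketch. It assumes Euclidean distance between pixel feature vectors and a set of cluster centers that is already known; the function names and the toy two-band pixel values are hypothetical and are not taken from the paper. Each point is merged into its closest centroid, after which the intra-cluster and inter-cluster distances are measured.

```python
import math

def nearest_center(point, centers):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))

def assign_to_centers(points, centers):
    """Merge step: label every point with the index of its closest center."""
    return [nearest_center(p, centers) for p in points]

def mean_intra_cluster_distance(points, labels, centers):
    """Average distance from each point to the center of its own cluster."""
    total = sum(math.dist(p, centers[k]) for p, k in zip(points, labels))
    return total / len(points)

def min_inter_cluster_distance(centers):
    """Smallest distance between two distinct centers (needs at least two)."""
    return min(
        math.dist(a, b)
        for i, a in enumerate(centers)
        for b in centers[i + 1:]
    )

# Toy two-band "pixels" and centers (hypothetical values for illustration only).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]

labels = assign_to_centers(points, centers)
intra = mean_intra_cluster_distance(points, labels, centers)
inter = min_inter_cluster_distance(centers)
print(labels, intra, inter)
```

Under these assumptions, a good partition is one with a small mean intra-cluster distance and a large minimum inter-cluster distance; in the toy data the two groups are compact (distance 1 to their centers) and well separated (centers 10 apart).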
Hierarchical Clustering Algorithm for Land Cover Mapping using Satellite Images

J. Senthilnath, S.N. Omkar, V. Mani, N. Tejovanth, P.G. Diwakar and Archana Shenoy B

Several methods to compute a single-band gradient function from satellite images have been studied previously by Tarabalka et al., including pixel-wise classification methods [4, 8]. Studies show that hierarchical step-wise optimization and spectral clustering have given good results for analyses of satellite images [9]. These results have been improved by a combination of probabilistic classification and a hierarchical step-wise optimization algorithm [10].

In the literature, different methods have been developed to cluster data sets by splitting and merging [6]. Broadly, they can be classified into parametric and non-parametric methods. In parametric methods such as k-means clustering [11], prior assumptions of the number of clusters are made. This is essentially a function minimization technique, where the objective function is the squared error distance measure. In non-parametric methods such as Mean Shift Clustering (MSC) [12, 13], no prior assumptions are made on the number of clusters. This is a procedure for locating the maxima of a mapped function given a set of discrete data points sampled from that function. It is useful for detecting the modes of density given a density function. Conventionally, mean shift clustering uses a single point for locating modes (local maxima).

Recently, researchers have become interested in locating multiple local optima of a given multi-modal function in a d-dimensional search space. For this purpose, nature-inspired techniques are used. Brits et al. [14] developed Niche Particle Swarm Optimization (NPSO), which is a variant of Particle Swarm Optimization (PSO) [15], Krishnand et al. [16, 17]