Independent Component Analysis and clustering for pollution data

Asis Kumar Chattopadhyay · Saptarshi Mondal · Atanu Biswas

Asis Kumar Chattopadhyay
Calcutta University, Kolkata, India
E-mail: akcstat@caluniv.ac.in

Saptarshi Mondal
Calcutta University, Kolkata, India
E-mail: saptarshi.stat@gmail.com

Atanu Biswas
Indian Statistical Institute, Kolkata, India
E-mail: atanu@isical.ac.in

Abstract Independent Component Analysis (ICA) is closely related to Principal Component Analysis (PCA). Whereas ICA finds a set of source variables that are mutually independent, PCA finds a set of variables that are mutually uncorrelated. Here we consider an objective classification of different regions in central Iowa, USA, in order to study the pollution level. The study was part of the Soil Moisture Experiment 2002. Components responsible for significant variation have been obtained through both PCA and ICA, and the classification has been done by K-means clustering. The results show that the nature of the clustering is significantly improved by ICA.

Keywords Circular data, distance, FastICA algorithm, Independent Component Analysis, K-means clustering, negentropy, non-Gaussianity, Principal Component Analysis.

1 Introduction

1.1 Data set under consideration

We consider a data set containing measurements collected from flights conducted in June and July 2002 over the Walnut Creek watershed in central Iowa, USA. The study was part of the Soil Moisture Experiment 2002 (SMEX02) and the Soil Moisture-Atmosphere Coupling Experiment (SMACEX), run