Independent Component Analysis and clustering for pollution data

Asis Kumar Chattopadhyay · Saptarshi Mondal · Atanu Biswas

Asis Kumar Chattopadhyay
Calcutta University, Kolkata, India
E-mail: akcstat@caluniv.ac.in

Saptarshi Mondal
Calcutta University, Kolkata, India
E-mail: saptarshi.stat@gmail.com

Atanu Biswas
Indian Statistical Institute, Kolkata, India
E-mail: atanu@isical.ac.in

Abstract Independent Component Analysis (ICA) is closely related to Principal Component Analysis (PCA). Whereas ICA finds a set of source variables that are mutually independent, PCA finds a set of variables that are mutually uncorrelated. Here we consider an objective classification of different regions in central Iowa, USA, in order to study the pollution level. The study was part of the Soil Moisture Experiment 2002. Components responsible for significant variation have been obtained through both PCA and ICA, and the classification has been done by K-means clustering. The results show that the nature of the clustering is significantly improved by ICA.

Keywords Circular data, distance, FastICA algorithm, Independent Component Analysis, K-means clustering, negentropy, non-Gaussianity, Principal Component Analysis.

1 Introduction

1.1 Data set under consideration

We consider a data set containing measurements collected from flights conducted in June and July 2002 over the Walnut Creek watershed in central Iowa, USA. The study was part of the Soil Moisture Experiment 2002 (SMEX02) and the Soil Moisture-Atmosphere Coupling Experiment (SMACEX), run