Research Article

Canonical PSO Based K-Means Clustering Approach for Real Datasets

Lopamudra Dey (1) and Sanjay Chakraborty (2)

1 Heritage Institute of Technology, Kolkata, West Bengal 700 107, India
2 Institute of Engineering & Management, Kolkata, West Bengal 700 091, India

Correspondence should be addressed to Sanjay Chakraborty; sanjay ciem@yahoo.com

Received 14 June 2014; Revised 19 September 2014; Accepted 2 October 2014; Published 13 November 2014

Academic Editor: Francesco Camastra

Copyright © 2014 L. Dey and S. Chakraborty. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The significance and application of clustering are spread over various fields. Clustering is an unsupervised process in data mining, which is why the proper evaluation of the results and the measurement of the compactness and separability of the clusters are important issues. The procedure of evaluating the results of a clustering algorithm is known as a cluster validity measure. Different types of indexes are used to solve different types of problems, and index selection depends on the kind of available data. This paper first proposes a Canonical PSO based K-means clustering algorithm, analyses some important clustering indices (intercluster, intracluster), and then evaluates the effects of those indices on a real-time air pollution database and on wholesale customer, wine, and vehicle datasets using typical K-means, Canonical PSO based K-means, simple PSO based K-means, DBSCAN, and Hierarchical clustering algorithms. This paper also describes the nature of the clusters and finally compares the performances of these clustering algorithms according to the validity assessment. It also identifies which of these algorithms is most desirable for producing proper compact clusters on these particular real-life datasets.
It deals with the behaviour of these clustering algorithms with respect to validation indexes and presents their evaluation results in mathematical and graphical forms.

1. Introduction

One of the best known problems in data mining is clustering. Clustering is the task of categorising objects having several attributes into different classes such that the objects belonging to the same class are similar, and those placed in different classes are not [1]. Several clustering algorithms have been proposed so far. Because clustering uses no prior information, a suitable evaluation of the results is necessary. Evaluation means measuring the similarity between clusters, the compactness of the clusters, and the separation between clusters [2]. Evaluation measurement is also proposed as a key feature in internal and external cluster validation indexes [3]. Such a measure can be used to compare the performance of different data clustering algorithms on different real-life datasets. These measures are usually tied to the type of criterion being considered in assessing the quality of a clustering method. Three different techniques are available to evaluate the clustering results: external, internal, and relative [4]. Both internal and external criteria are based on statistical methods and have a high computational demand. The external validity methods evaluate the clustering based on some user-specific intuitions [4]. The objective of this paper is to compare the different clustering schemas that have already been proposed [5] with the Canonical PSO based K-means clustering algorithm.

The rest of the paper is organized as follows. The Canonical PSO based K-means algorithm is proposed in Section 2 along with some other existing clustering algorithms. Some popular and widely used validity indices are introduced in Section 3.
Section 4 demonstrates the clustering compactness measurements on a toy example dataset using the K-means and DBSCAN clustering algorithms. Section 5 presents the clustering compactness measurements with experimental results and outlines a comparison of the indices, and Section 7 gives a brief conclusion of this paper.

Hindawi Publishing Corporation, International Scholarly Research Notices, Volume 2014, Article ID 414013, 11 pages, http://dx.doi.org/10.1155/2014/414013

Interested