Comparative Study of Clustering Methods over Ill-Structured Datasets using Validity Indices

Sheik Faritha Begum¹*, K. P. Kaliyamurthie² and A. Rajesh³

¹,² Department of Computer Science and Engineering, Bharath University, Chennai, Tamil Nadu, India; sfaritha@gmail.com
³ Department of Computer Science and Engineering, C. Abdul Hakeem College of Engineering and Technology, Vellore – 638052, Tamil Nadu, India; amrajesh73@gmail.com
*Author for correspondence

Indian Journal of Science and Technology, Vol 9(12), DOI: 10.17485/ijst/2016/v9i12/89282, March 2016
ISSN (Print): 0974-6846, ISSN (Online): 0974-5645

Abstract

Objective: This paper compares several clustering methods on ill-structured datasets. The primary objective is to identify the best clustering method and to determine the optimal number of clusters.
Methods: The dataset used in this experiment is derived from sensor measurements at an urban waste water treatment plant. Hierarchical, K-means and PAM clustering are compared, and the internal cluster validity indices connectivity, Dunn index and silhouette index are used to validate the clusters. The optimization of clustering is expressed in terms of the number of clusters: the experiment varies the number of clusters and calculates the optimal scores.
Findings: Connectivity should be minimized, whereas the silhouette and Dunn indices should be maximized. The generated optimal scores and optimal rank list reveal that hierarchical clustering is the optimal method for the experimental dataset, and the optimal number of clusters is K=2.
Applications: The experiment was carried out on a dataset derived from sensor measurements at an urban waste water treatment plant.

1.
Introduction

Clustering identifies homogeneous groups of objects based on the nature of the domain attributes. The aim of clustering is to group similar data items together in order to reduce the amount of data. All clustering approaches face the common problem of interpreting the generated clusters. Some algorithms address this problem by using cluster shapes, assigning the data to clusters of those shapes. Therefore, inferring the cluster shape attracts more attention than compressing the dataset. Cluster analysis thus plays an important role in the entire clustering process, and the results of the cluster analysis must also be validated. Hence the primary task of the clustering process is to express the data patterns in the form of “meaningful” groups, which helps to identify similarities and dissimilarities and to draw conclusions about them. Two basic questions need to be addressed in every typical clustering method: a) the number of clusters originally present in the data and b) the quality of the clusters formed, which means that the clusters must be validated while applying a clustering technique¹.

Clustering methods are broadly divided into two groups, namely partitional clustering and hierarchical clustering. Hierarchical clustering proceeds iteratively, either by dividing larger clusters into smaller ones or by merging smaller clusters into larger ones. The former is termed top-down and the latter bottom-up. The variation among clustering methods relies on the selection of the larger cluster for splitting or in the

Keywords: Clustering Methods, Ill-Structured Datasets, Optimization, Validity Indices