A STUDY ON ANALYSIS OF GRAPH CLUSTERING TECHNIQUES IN FEATURE SELECTION

S. DeepaLakshmi 1, T. Velmurugan 2
1 Research Scholar, Bharathiar University, Coimbatore, deepa.dgvc@gmail.com
2 Associate Professor, PG and Research Dept. of Computer Science, D.G. Vaishnav College, Chennai, velmurugan_dgvc@yahoo.co.in

Abstract: Graph clustering is the task of grouping the vertices of a graph into clusters, where the grouping is based on a similarity measure defined for the data elements. The field of graph clustering has become increasingly popular. Feature selection or extraction is a technique that transforms and simplifies the data to make data mining tasks easier. Feature selection removes irrelevant and redundant features and retains the relevant and useful ones, so that classification results are at least as good as those obtained with the original data. This research work presents the application of graph clustering to feature selection in high dimensional data and aims at clustering the features using graph theoretical concepts. The irrelevant features are removed, and the relevant features are grouped into clusters using a minimum spanning tree. The main contribution of this work is to select a representative feature from each resulting cluster to form the set of relevant features.

Keywords: Feature Selection, Graph Clustering, Minimum Spanning Tree, Mutual Information.

Introduction: Data mining is the task of discovering interesting patterns in large amounts of data. Mining high dimensional data poses challenges that include the curse of dimensionality and the meaningfulness of the similarity measure in high dimensional space [1]. Feature selection, also known as attribute selection or variable subset selection, is the process of selecting the most relevant subset of attributes from a large set of attributes according to some selection criteria [2].
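The pipeline summarized in the abstract (remove irrelevant features, group the remaining ones with a minimum spanning tree, keep one representative per cluster) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The text so far names only the minimum spanning tree and mutual information, so everything else here is an assumption made for illustration: the function names, the thresholds `relevance_min` and `cut_above`, the distance `1 - I(u;v) / max(H(u), H(v))`, the rule of cutting long tree edges to form clusters, and the toy data.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def minimum_spanning_tree(nodes, weight):
    """Kruskal's algorithm on the complete graph; returns edges (w, u, v)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v

    edges = sorted((weight(u, v), u, v)
                   for i, u in enumerate(nodes) for v in nodes[i + 1:])
    tree = []
    for w, u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

def select_features(features, target, relevance_min=0.1, cut_above=0.6):
    """features: dict name -> list of discrete values; target: class labels.
    Both thresholds are illustrative assumptions, not values from the paper."""
    # Step 1 (filter): drop features that share little information with the class.
    relevance = {f: mutual_information(col, target)
                 for f, col in features.items()}
    relevant = [f for f in features if relevance[f] >= relevance_min]

    # Step 2: MST over relevant features; redundant features are close together.
    def distance(u, v):
        mi = mutual_information(features[u], features[v])
        hu = mutual_information(features[u], features[u])  # H(u) = I(u;u)
        hv = mutual_information(features[v], features[v])
        denom = max(hu, hv)
        return 1.0 - mi / denom if denom > 0 else 1.0

    adjacency = {f: set() for f in relevant}
    for w, u, v in minimum_spanning_tree(relevant, distance):
        if w <= cut_above:  # assumed rule: cutting long edges splits clusters
            adjacency[u].add(v)
            adjacency[v].add(u)

    # Step 3: one representative (most class-relevant feature) per cluster.
    representatives, seen = [], set()
    for start in relevant:
        if start in seen:
            continue
        cluster, stack = [], [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            cluster.append(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                stack.append(neighbour)
        representatives.append(max(cluster, key=relevance.get))
    return representatives

# Toy data (hypothetical): "hi_noisy" is redundant with "hi", "noise" is irrelevant.
target = [0, 0, 1, 1, 2, 2, 3, 3]
features = {
    "hi":       [0, 0, 0, 0, 1, 1, 1, 1],
    "hi_noisy": [0, 0, 0, 1, 1, 1, 1, 1],
    "lo":       [0, 0, 1, 1, 0, 0, 1, 1],
    "noise":    [0, 1, 0, 1, 0, 1, 0, 1],
}
print(select_features(features, target))  # ['hi', 'lo']
```

On this toy data, `noise` is filtered out in the first step, `hi` and `hi_noisy` fall into one cluster, and `lo` forms a cluster of its own. The sketch handles discrete features only; continuous features would need discretization or a different dependence estimate.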
The benefits of feature selection include facilitating data visualization and understanding, reducing measurement and storage requirements, reducing the training and utilization times of the final model, and countering the curse of dimensionality to improve prediction performance. Many algorithms exist that select optimal features from high dimensional datasets. The application of graph theory to feature selection has gained momentum in recent research. This research work aims at reducing the feature set of a large, high dimensional dataset using the concepts of graph theory. In graph based clustering methods, similar data are represented in a graph, and highly connected subgraphs form the clusters. The elements in a cluster are highly similar to each other. In this research work, irrelevant features are removed using a filter method in the first step. In the second step, the relevant features are grouped into clusters using graph based clustering methods. Finally, one strong representative feature is selected from each cluster. Thus, the