International Research Journal of Engineering and Technology (IRJET), e-ISSN: 2395-0056, p-ISSN: 2395-0072
Volume: 02 Issue: 09 | Dec-2015, www.irjet.net
© 2015, IRJET, ISO 9001:2008 Certified Journal, Page 2159

An Efficient Analysis for High Dimensional Dataset Using K-Means Hybridization with Ant Colony Optimization Algorithm

Prabha S.¹, Arun Prabha K.²
¹ Research Scholar, Department of Computer Science, Vellalar College for Women, Tamilnadu, India
² Head and Assistant Professor, Department of Computer Technology (IT & CT), Tamilnadu, India

Abstract - Data mining is the process of discovering meaningful new correlations, patterns and trends in large amounts of stored data. Clustering is a useful technique for discovering data distributions and patterns in the underlying data. The purpose of clustering is to group similar data. K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set into a certain number of clusters (assume K clusters) fixed a priori. This paper proposes applying the well-known Ant Colony Optimization algorithm to K-Means clustering problems. The Ant Colony algorithm is based on the behavior of ants searching for food. The convergence of the ants is used to find a shortest path, a near-optimum solution for the target problem. A new K-Means clustering method is presented in which the initial centroids are calculated instead of being selected at random, which reduces the number of iterations. The Ant Colony Optimization algorithm is evaluated for efficiency with respect to accuracy in improving the fitness values among the ants. Finally, it is concluded that the proposed scenario yields superior performance to the existing scenario, which uses the Extended Particle Swarm Optimization algorithm.
Key Words: K-Means, Clustering, Particle Swarm Optimization, Ant Colony Optimization.

1. DATA MINING

The term "data mining" refers to finding relevant and useful information in databases. Data mining and knowledge discovery in databases is a new interdisciplinary field, merging ideas from statistics, machine learning, databases and parallel computing. Data mining would have been more appropriately named "knowledge mining from data". The shorter term "knowledge mining" may not reflect the emphasis on mining from large amounts of data. Data mining tools predict future trends and behaviour, allowing businesses to make proactive, knowledge-driven decisions. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems. Data mining techniques can be broadly classified as predictive and descriptive.

Data mining is the process of discovering meaningful patterns and relationships that lie hidden within very large databases. It is part of a process called knowledge discovery in databases (KDD). This process basically consists of steps that are performed before carrying out data mining, such as data selection, data cleaning, pre-processing, and data transformation [6]. There are many other terms carrying a similar or slightly different meaning to data mining, such as knowledge mining from databases, knowledge extraction, data/pattern analysis, data archaeology and data dredging. A standard definition of data mining is the non-trivial extraction of implicit, previously unknown, and potentially useful knowledge from data.

1.1 Ant Colony Optimization (ACO)

An Ant Colony Optimization (ACO) algorithm is essentially an agent-based system that simulates the natural behaviour of ants, including mechanisms of cooperation and adaptation.
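To make the agent-based idea concrete, the following minimal Python sketch simulates a toy colony in which ants repeatedly choose among a few paths of different lengths. It is an illustration under stated assumptions, not the method evaluated in this paper: the function name `simulate_two_paths`, the deposit rule (1 / path length), the evaporation rate and the colony size are all hypothetical choices made for the example.

```python
import random


def simulate_two_paths(lengths=(1.0, 2.0), n_ants=50, n_rounds=100,
                       evaporation=0.1, seed=0):
    """Toy colony: ants pick one of several paths, guided by pheromone.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    pheromone = [1.0] * len(lengths)  # every path starts with an equal trail
    for _ in range(n_rounds):
        deposits = [0.0] * len(lengths)
        for _ in range(n_ants):
            # Cooperation: each ant chooses a path with probability
            # proportional to the pheromone left by earlier ants.
            path = rng.choices(range(len(lengths)), weights=pheromone)[0]
            # Assumed deposit rule: a shorter path earns a larger deposit.
            deposits[path] += 1.0 / lengths[path]
        # Adaptation: old pheromone partly evaporates, new deposits are added.
        pheromone = [(1.0 - evaporation) * p + d
                     for p, d in zip(pheromone, deposits)]
    return pheromone


if __name__ == "__main__":
    print(simulate_two_paths())
```

With these assumed settings, the pheromone on the shorter path would be expected to end up larger than on the longer one, because each choice of the short path is rewarded more strongly and therefore attracts more ants in later rounds.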
The use of this kind of system as a new metaheuristic was proposed in order to solve combinatorial optimization problems. This metaheuristic has been shown to be both robust and versatile, in the sense that it has been successfully applied to a range of different combinatorial optimization problems. ACO algorithms are based on the following ideas:

Each path followed by an ant is associated with a candidate solution for a given problem.

When an ant follows a path, the amount of pheromone deposited on that path is proportional to the quality of the corresponding candidate solution for the target problem.

When an ant has to choose between two or more paths, the path(s) with a larger amount of pheromone have a greater probability of being chosen by the ant.

As a result, the ants eventually converge to a short path, hopefully the optimum or a near-optimum solution for the