International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 02 Issue: 09 | Dec-2015 www.irjet.net p-ISSN: 2395-0072
© 2015, IRJET ISO 9001:2008 Certified Journal Page 2159
An Efficient Analysis for High Dimensional Dataset Using K-Means
Hybridization with Ant Colony Optimization Algorithm
Prabha S.¹, Arun Prabha K.²
¹ Research Scholar, Department of Computer Science, Vellalar College for Women, Tamilnadu, India
² Head and Assistant Professor, Department of Computer Technology (IT & CT), Tamilnadu, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Data mining is the process of discovering
meaningful new correlations, patterns and trends in
large amounts of stored data. Clustering is a useful
technique for discovering the data distribution and
patterns in the underlying data. The purpose of
clustering is to group similar data. K-means is one of
the simplest unsupervised learning algorithms that
solve the well-known clustering problem. The procedure
follows a simple and easy way to classify a given data
set through a certain number of clusters (assume K
clusters) fixed a priori. This paper proposes applying the
well-known Ant Colony Optimization algorithm to K-Means
clustering problems. The Ant Colony algorithm is
based on the behavior of ants searching for food. The
convergence of the ants is used to find a shortest path,
which corresponds to a near-optimum solution for the
target problem. In the new K-Means clustering method, the
initial centroids are calculated instead of being selected
at random, which reduces the number of iterations. The Ant
Colony Optimization algorithm is evaluated for efficiency
with respect to accuracy in improving the fitness values
among the ants. Finally, it is concluded that the proposed
approach yields superior performance to the existing
approach based on the Extended Particle Swarm Optimization
algorithm.
Key Words: K-Means, Clustering, Particle Swarm
Optimization, Ant Colony Optimization.
1. DATA MINING
The term “data mining” refers to finding relevant and
useful information in databases. Data mining and
knowledge discovery in databases is a new
interdisciplinary field, merging ideas from statistics, machine
learning, databases and parallel computing. Data mining
would have been more appropriately named “knowledge
mining from data”. The shorter term “knowledge mining” may not
reflect the emphasis on mining from large amounts of data.
Data mining tools predict future trends and behaviour,
allowing businesses to make proactive, knowledge-driven
decisions. Data mining techniques can be implemented
rapidly on existing software and hardware platforms to
enhance the value of existing information resources, and can
be integrated with new products and systems. Data mining
techniques can be broadly classified as predictive and
descriptive.
Data mining is the process of discovering meaningful patterns
and relationships that lie hidden within very large databases.
Data mining is part of a process called knowledge discovery
in databases (KDD). This process basically consists of steps
that are performed before carrying out data mining, such as
data selection, data cleaning, pre-processing, and data
transformation [6].
There are many other terms carrying a similar or slightly
different meaning to data mining, such as knowledge mining
from databases, knowledge extraction, data/pattern analysis,
data archaeology and data dredging. A standard definition of
data mining is the non-trivial extraction of implicit, previously
unknown, and potentially useful knowledge from data.
1.1 Ant Colony Optimization (ACO)
An Ant Colony Optimization (ACO) algorithm is essentially a
system based on agents which simulate the natural behaviour
of ants, including mechanisms of cooperation and adaptation.
The use of this kind of system as a new metaheuristic was
proposed in order to solve combinatorial optimization
problems. This new metaheuristic has been shown to be both
robust and versatile, in the sense that it has been successfully
applied to a range of different combinatorial optimization
problems.
ACO algorithms are based on the following ideas: Each path
followed by an ant is associated with a candidate solution for
a given problem. When an ant follows a path, the amount of
pheromone deposited on that path is proportional to the
quality of the corresponding candidate solution for the target
problem. When an ant has to choose between two or more
paths, the path(s) with a larger amount of pheromone have a
greater probability of being chosen by the ant.
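
The ideas above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper. It assumes a toy problem in which each candidate solution is a single path with a known length, takes solution quality to be the inverse of the path length, and adds pheromone evaporation, a common ACO ingredient that is not described in the text above. The names choose_path and run_colony and all parameter values are hypothetical.

```python
import random


def choose_path(pheromone, rng):
    """Pick a path index with probability proportional to its pheromone."""
    total = sum(pheromone)
    threshold = rng.random() * total
    cumulative = 0.0
    for index, amount in enumerate(pheromone):
        cumulative += amount
        if threshold < cumulative:
            return index
    return len(pheromone) - 1  # guard against floating-point round-off


def run_colony(path_lengths, n_ants=20, n_iterations=50,
               evaporation=0.1, seed=0):
    """Return the pheromone level on each path after the simulation.

    Assumptions (not taken from the paper): quality = 1 / length,
    and a fixed fraction of pheromone evaporates every iteration.
    """
    rng = random.Random(seed)
    pheromone = [1.0] * len(path_lengths)  # start with no preference
    for _ in range(n_iterations):
        # Every ant chooses a path based on the current pheromone levels.
        choices = [choose_path(pheromone, rng) for _ in range(n_ants)]
        # Evaporation lets the colony gradually forget old choices.
        pheromone = [(1.0 - evaporation) * p for p in pheromone]
        # Each ant deposits pheromone proportional to solution quality.
        for chosen in choices:
            pheromone[chosen] += 1.0 / path_lengths[chosen]
    return pheromone


if __name__ == "__main__":
    levels = run_colony([1.0, 5.0, 10.0])
    best = max(range(len(levels)), key=lambda i: levels[i])
    print("pheromone levels:", levels)
    print("preferred path index:", best)
```

In this sketch, shorter paths receive larger deposits per ant, so they are chosen more often in later iterations and accumulate pheromone faster. This is the positive feedback that drives the colony toward a short path.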
As a result, the ants eventually converge to a short path,
hopefully the optimum or a near-optimum solution for the