G.Madhumitha et al, International Journal of Computer Science and Mobile Computing, Vol.7 Issue.8, August- 2018, pg. 192-195
© 2018, IJCSMC All Rights Reserved 192
Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X
IMPACT FACTOR: 6.017
IJCSMC, Vol. 7, Issue. 8, August 2018, pg.192 – 195
A Survey on Clustering Techniques in Data Mining
G.Madhumitha¹, K.Kathiresan²
¹Student, Master of Engineering, Department of CSE, Angel College of Engineering & Technology, India
²Assistant Professor, Department of CSE, Angel College of Engineering & Technology, India
¹gmadhumitha7@gmail.com; ²kathirpk@gmail.com
Abstract— Data mining refers to the process of extracting information from a large amount of data and
transforming it into an understandable form. Clustering is one of the most important methodologies in the
field of data mining. It is an unsupervised machine learning technique. Clustering means grouping a set of
objects so that similar objects fall in the same group and dissimilar objects fall in different groups.
This paper provides a broad survey on various clustering techniques and also analyzes the advantages and
shortcomings of each technique.
Keywords— Data mining, clustering, clustering analysis, clustering techniques, advantages and limitations
I. INTRODUCTION
Data mining analyzes data from different perspectives and transforms it into useful information [4].
The goals of data mining are the fast retrieval of data or information, the discovery of knowledge, and the
identification of hidden patterns. Data mining involves various tasks such as anomaly detection, association rule
learning, classification, regression and clustering analysis. This paper focuses on clustering analysis [10], the
process of dividing a set of data objects into subsets. Each subset is a cluster. The set of clusters resulting from a
cluster analysis is referred to as a clustering [8]. Clustering is used to group similar objects from a dataset. It leads
to the discovery of previously unknown groups within the dataset. Clustering is also called data segmentation
because it partitions large data sets into groups based on their similarity. Different clustering methods can
generate different clusterings on the same data set. Clustering is a fundamental operation in data mining.
Fig 1 - Stages of Clustering (Raw data → Clustering algorithm → Set of clusters)
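As a rough illustration of the stages described above (raw data in, a clustering algorithm applied, a set of clusters out), the following minimal Python sketch may help. It uses a naive k-means procedure purely as an example; this section does not prescribe any particular algorithm. The function name `k_means`, the sample points, the choice of k = 2, and the use of the first k points as initial centroids are all assumptions made for this illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def k_means(points, k, iterations=10):
    """Partition points into k clusters (naive k-means, illustrative only)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return clusters


# Raw data: hypothetical 2-D points forming two visibly separate groups.
raw_data = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]

# Clustering algorithm -> set of clusters.
clusters = k_means(raw_data, k=2)
for index, cluster in enumerate(clusters):
    print(index, cluster)
```

With this sample data, the three points near (1, 1) end up in one cluster and the three points near (8.5, 8.5) in the other. A different algorithm, a different k, or a different initialization could partition the same data differently, in line with the observation that different clustering methods can produce different clusterings.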