(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 5, No. 7, 2014
www.ijacsa.thesai.org

Clustering of Image Data Using K-Means and Fuzzy K-Means

Md. Khalid Imam Rahmani 1
1 Associate Professor, Deptt. of Computer Sc. & Engg., Echelon Institute of Technology, Faridabad, INDIA.

Naina Pal 2, Kamiya Arora 3
2,3 M.Tech. Scholar, Deptt. of Computer Sc. & Engg., Echelon Institute of Technology, Faridabad, INDIA.

Abstract—Clustering is a major technique used for grouping numerical and image data in data mining and image processing applications. Clustering eases image retrieval by finding images similar to a given query image. The images are grouped into a given number of clusters. Image data are grouped on the basis of features such as color, texture, and shape, which are contained in the images in the form of pixels. For efficiency and better results, image data are segmented before clustering is applied. The techniques used here are K-Means and Fuzzy K-Means, which are time-saving and efficient.

Keywords—Clustering; Segmentation; K-Means Clustering; Fuzzy K-Means

I. INTRODUCTION

Clustering is the unsupervised classification of patterns, such as observations, data items, or feature vectors, into groups called clusters [1]. Applications of clustering are growing rapidly because clustering saves a lot of time and its results are well suited to the algorithms used in the later stages of an application. Clustering groups the data: the data within a group are similar to each other but quite dissimilar to the data in other groups [5]. Clustering has a very wide range of applications in research and development, for example in medical science, where the symptoms and cures of diseases are grouped into clusters to save time and achieve efficient results [10].
It is applied in image processing, data mining, marketing, and other fields. In information retrieval, clustering can considerably enhance the performance of retrieving information from the Internet: all pages are grouped into clusters, and better results are achieved.

Fig. 1. Grouping of Similar Data Points

Clustering may also be used in marketing, where it can segment the market into profitable groups for purposes such as advertising, promotions, and follow-ups [10]. In archeology, clustering can save time in investigation surveys when researchers are trying to discover stone tools, funeral tools, and similar objects [10]. Image clustering can also be used to segment a movie [4]. Clustering is defined as unsupervised learning, in which the user can randomly select the data points without the help of a supervisor. Clustering has a very large number of applications, as data clustering has proved to be a very powerful technique for organizing the data of each application into clusters and sub-clusters for easy, quick, and efficient results [11].

A brief description of the state of the art of clustering and its various forms is given in Section II. K-Means applied to images is described in Section III. Section IV gives an overview of existing methodologies. Segmentation of images is described in Section V. Section VI describes the proposed algorithm. Section VII presents the conclusion and future directions of the research work.

II. THE STATE OF THE ART

A. Clustering

Clustering is a method that groups data objects into clusters such that objects within each cluster have a high degree of similarity to one another, while objects in different clusters are dissimilar [9].
Clustering involves dividing a set of data points into non-overlapping groups, or clusters, where the points in a cluster are "more similar" to one another than to the points in other clusters [2]. Images are clustered on the basis of intra-class similarity. Target images, or images close to them, can be retrieved somewhat faster if the image collection is clustered properly [8]. Each data point is compared with the data points in a cluster, and similar data points are brought into the same cluster, so the data points within one cluster exhibit the same characteristics. A good clustering method therefore exhibits high similarity within a single cluster and very low similarity between different clusters.
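
The quality criterion stated above (high similarity within a cluster, low similarity between clusters) can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `mean_intra_inter_distance`, the use of Euclidean distance as the inverse of similarity, and the toy RGB values are all assumptions made for illustration only.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available from Python 3.8


def mean_intra_inter_distance(points, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance).

    points: list of equal-length numeric tuples (e.g. RGB color features).
    labels: one cluster id per point (hard, non-overlapping assignment).
    A small distance is used here as a stand-in for high similarity.
    """
    intra, inter = [], []
    # Compare every unordered pair of points exactly once.
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        if lp == lq:
            intra.append(dist(p, q))  # pair inside the same cluster
        else:
            inter.append(dist(p, q))  # pair across two clusters

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(intra), mean(inter)


# Hypothetical toy data: four "pixels" described by (R, G, B) values.
pixels = [(250, 10, 10), (240, 20, 15), (10, 10, 245), (20, 15, 250)]
labels = [0, 0, 1, 1]  # two reddish pixels, two bluish pixels

intra_d, inter_d = mean_intra_inter_distance(pixels, labels)
print(f"mean intra-cluster distance: {intra_d:.1f}")
print(f"mean inter-cluster distance: {inter_d:.1f}")
```

Under these assumptions, a good grouping gives a mean intra-cluster distance much smaller than the mean inter-cluster distance, whereas a poor grouping (for example, labels `[0, 1, 0, 1]` on the same pixels) reverses that relationship.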