International Conference on Electrical, Electronics, Signals, Communication and Optimization (EESCO) - 2015
978-1-4799-7678-2/15/$31.00 ©2015 IEEE

Modified Gustafson-Kessel Clustering On Medical Diagnostic Systems

B.Simhachalam 1,2
1 Department of Engineering Mathematics, GITAM University, Visakhapatnam-530045, India
drbschalam@gmail.com

G.Ganesan 2
2 Department of Mathematics, Adikavi Nannaya University, Rajahmundry-533296, India
prof.ganesan@yahoo.com

Abstract: Clustering methods are mostly unsupervised methods that can be applied to data to arrange them into groups based on the similarity among the individual data items. In this study, the Modified Gustafson-Kessel (MGK) clustering technique is applied to group patients into clusters of different thyroid diseases. Further, the results of the MGK clustering algorithm and the Fuzzy c-Means (FCM) clustering algorithm are compared according to classification performance. The results show that the MGK clustering algorithm gives better classification performance than FCM.

Keywords: Clustering, Cluster prototype, Fuzzy covariance matrix, Medical diagnostic system, GK clustering.

I. INTRODUCTION

Cluster analysis refers to methods that try to partition a dataset $Z$ of $N$ elements into $c$ ($1 < c < N$) subsets called clusters. Clustering methods can be applied to data whose elements are numerical, categorical, or both. Traditional manual data analysis has become inefficient with the rapid development of sophisticated medical devices, so reliable techniques are needed to analyze the data. Clustering algorithms can discover the underlying structures in data, which can be utilized in a wide variety of applications, including pattern recognition, image processing, classification, modeling and identification [4].
The application of fuzzy sets in a classification function makes class membership relative: an object can belong to several classes with different degrees [2]. This feature is important for medical diagnostic systems because it increases their sensitivity. This work applies two clustering techniques to the thyroid gland data obtained from Dr. Coomans [6] to assign the patients to three clusters. Five important tests were applied to the patients to measure their thyroid status, and the results were used for classification. The unsupervised clustering techniques Modified Gustafson-Kessel (MGK) clustering and Fuzzy c-Means (FCM) clustering are used, and the results are shown.

Typically, the data in a dataset $Z$ are observations of some physical process. Let $Z = \{z_1, z_2, \ldots, z_N\}$ be a set of $N$ observations. Each observation is an $n$-dimensional row vector $z_k = [z_{k1}, z_{k2}, \ldots, z_{kn}]$. The dataset $Z$ can be represented by an $N \times n$ matrix. In medical diagnosis, the rows of $Z$ represent patients and the columns represent the symptoms or laboratory measurements for these patients. A partition of the dataset $Z$ can be represented by the fuzzy partition matrix $U = [\mu_{ik}]_{c \times N}$, where $c$ is the number of clusters and $N$ is the number of observations in $Z$. In the fuzzy partition matrix, $\mu_{ik}$ represents the membership value (grade or degree) of the $k$-th object in the $i$-th cluster. The fuzzy partitioning space for $Z$ is the set

$$M_{fc} = \left\{ U \in \mathbb{R}^{c \times N} \;\middle|\; \mu_{ik} \in [0, 1],\ \forall i, k;\ \sum_{i=1}^{c} \mu_{ik} = 1,\ \forall k;\ 0 < \sum_{k=1}^{N} \mu_{ik} < N,\ \forall i \right\} \tag{1}$$

The rest of the paper is organized as follows: Section II describes the FCM algorithm, Section III describes the MGK algorithm, experimental results are presented in Section IV, and the conclusion is presented in Section V.

II. FUZZY C-MEANS CLUSTERING

FCM is also known as Fuzzy ISODATA.
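As a brief aside before the algorithm itself, the three conditions that define the fuzzy partitioning space in (1) can be checked numerically. The following is a minimal Python sketch, not part of the original study: the helper names `random_fuzzy_partition` and `is_fuzzy_partition`, the tolerance `tol`, and the sizes c = 3 and N = 5 are illustrative assumptions.

```python
import random


def random_fuzzy_partition(c, N, seed=0):
    """Return a c x N membership matrix (list of rows) whose columns sum to 1.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Strictly positive raw weights, so every membership lies strictly in (0, 1).
    raw = [[rng.uniform(0.1, 1.0) for _ in range(N)] for _ in range(c)]
    col_sums = [sum(raw[i][k] for i in range(c)) for k in range(N)]
    return [[raw[i][k] / col_sums[k] for k in range(N)] for i in range(c)]


def is_fuzzy_partition(U, tol=1e-9):
    """Check the three conditions of the fuzzy partitioning space in (1)."""
    c, N = len(U), len(U[0])
    # Condition 1: every membership value lies in [0, 1].
    in_unit_interval = all(0.0 <= mu <= 1.0 for row in U for mu in row)
    # Condition 2: the memberships of each observation sum to 1 over the clusters.
    columns_sum_to_one = all(
        abs(sum(U[i][k] for i in range(c)) - 1.0) <= tol for k in range(N)
    )
    # Condition 3: no cluster is empty and no cluster holds all of the data.
    rows_nontrivial = all(0.0 < sum(row) < N for row in U)
    return in_unit_interval and columns_sum_to_one and rows_nontrivial


U = random_fuzzy_partition(c=3, N=5)
print(is_fuzzy_partition(U))                          # True
print(is_fuzzy_partition([[1.0, 1.0], [0.0, 0.0]]))   # False: second cluster is empty
```

The second call shows why the third condition matters: a matrix that places every observation in one cluster satisfies the first two conditions but is excluded from the fuzzy partitioning space.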
FCM uses fuzzy partitioning, so that a data point can belong to more than one group with different membership values between 0 and 1. FCM computes the prototype $v_i$ of a cluster as the weighted mean of the cluster's data items, where the membership values are the weights. FCM is an iterative algorithm whose objective is to find the cluster prototypes (centroids) by minimizing the objective function, which is defined as

$$\min_{U, V} J(Z; U, V) = \sum_{i=1}^{c} \sum_{k=1}^{N} (\mu_{ik})^{m} \, \| z_k - v_i \|_A^2 \tag{2}$$

where

$$U = [\mu_{ik}] \in M_{fc} \tag{3}$$

is a fuzzy partition matrix of $Z$, vector of cluster prototypes (centers)