A Comparison between Two Fuzzy Clustering Algorithms for Mixed Features

Irene Olaya Ayaquica-Martínez and José F. Martínez-Trinidad

Instituto Nacional de Astrofísica, Óptica y Electrónica
Luis Enrique Erro No. 1, Santa María Tonantzintla, Pue. C.P. 72840, México
{ayaquica, fmartine}@inaoep.mx

Abstract. In this paper, a comparative analysis of the mixed-type variable fuzzy c-means (MVFCM) and the fuzzy c-means using dissimilarity functions (FCMD) algorithms is presented. Our analysis focuses on the dissimilarity function and on the way the centers (or representative objects) are calculated in both algorithms.

1 Introduction

Restricted unsupervised classification (RUC) problems have been studied intensively in Statistical Pattern Recognition (Schalkoff, 1992). The fuzzy c-means algorithm is based on a metric over an n-dimensional space, and it has proven effective in solving many unsupervised classification problems. The algorithm starts with an initial partition and then iteratively tries all possible moves or swaps of data from one group to another in order to optimize the objective function. The objects must be described in terms of features to which a metric can be applied to evaluate the distance.

Nevertheless, the conditions in soft sciences such as Medicine, Geology, Sociology, and Marketing are quite different. In these sciences, the objects are described in terms of both quantitative and qualitative features (mixed data). For example, in geological data, features such as age, porosity, and permeability are quantitative, while others such as rock type, crystalline structure, and facies structure are qualitative. Likewise, missing data are common in this kind of problem. In these circumstances, it is not possible to measure the distance between objects; only the degree of similarity can be determined.

The mixed-type variable fuzzy c-means algorithm (MVFCM) of Yang et al. (2003) and the fuzzy c-means using dissimilarity functions (FCMD) of Ayaquica (2002) (see also Ayaquica and Martínez (2001)) are the most recent works that solve the RUC problem when mixed data appear. In this paper, these two algorithms are analyzed and compared.

A. Sanfeliu and J. Ruiz-Shulcloper (Eds.): CIARP 2003, LNCS 2905, pp. 472−479, 2003.
© Springer-Verlag Berlin Heidelberg 2003
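To make the idea of comparing objects with mixed and incomplete descriptions concrete, the following is a minimal illustrative sketch in Python. It is not the dissimilarity function used by MVFCM or by FCMD; the function name `mixed_dissimilarity`, the `ranges` parameter, the use of `None` for missing values, and the sample geological values are all assumptions made for this illustration. Quantitative features are compared by a range-normalized absolute difference, qualitative features by a simple match/mismatch test, and features with a missing value are skipped.

```python
from typing import Optional, Sequence, Union

# A feature value is a number (quantitative), a string (qualitative),
# or None (missing). This encoding is an assumption of this sketch.
Value = Union[float, str, None]


def mixed_dissimilarity(
    x: Sequence[Value],
    y: Sequence[Value],
    ranges: Sequence[Optional[float]],
) -> float:
    """Average per-feature dissimilarity in [0, 1] between two objects.

    ranges[i] is the (positive) value range of quantitative feature i,
    or None when feature i is qualitative.
    """
    total = 0.0
    compared = 0
    for a, b, r in zip(x, y, ranges):
        if a is None or b is None:
            continue  # missing value: this feature cannot be compared
        if r is None:
            # Qualitative feature: 0 if the values match, 1 otherwise.
            total += 0.0 if a == b else 1.0
        else:
            # Quantitative feature: normalized absolute difference.
            total += min(abs(a - b) / r, 1.0)
        compared += 1
    # If no feature could be compared, treat the objects as maximally
    # dissimilar (a design choice of this sketch).
    return total / compared if compared else 1.0


# Hypothetical objects described by (age, porosity, rock type).
sample_a = (120.0, 0.25, "sandstone")
sample_b = (80.0, None, "limestone")   # porosity is missing
feature_ranges = (200.0, 1.0, None)    # None marks the qualitative feature

d = mixed_dissimilarity(sample_a, sample_b, feature_ranges)
```

In this example only age and rock type are compared, since porosity is missing in the second object. A function of this kind is in general not a metric, which is why algorithms for mixed data cannot rely on the metric-based center computation of the classical fuzzy c-means.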