Fuzzy Sets and Systems 159 (2008) 2399 – 2427
www.elsevier.com/locate/fss
Collaborative clustering with the use of Fuzzy C-Means
and its quantification
Witold Pedrycz b,∗, Partab Rai a
a Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada AB T6R 2G7
b System Research Institute, Polish Academy of Sciences, Warsaw, Poland
Received 31 March 2007; received in revised form 26 September 2007; accepted 24 December 2007
Available online 20 January 2008
Abstract
In this study, we introduce the concept of collaborative fuzzy clustering—a conceptual and algorithmic machinery for the collective
discovery of a common structure (relationships) within a finite family of data residing at individual data sites. There are two
fundamental features of the proposed optimization environment. First, given existing constraints which prevent individual sites from
exchanging detailed numeric data, any communication has to be realized at the level of information granules. The specificity of
these granules impacts the effectiveness of ensuing collaborative activities. Second, the fuzzy clustering realized at the level of the
individual data site has to constructively consider the findings communicated by other sites and act upon them while running the
optimization confined to the particular data site. Adhering to these two general guidelines, we develop a comprehensive optimization
scheme and discuss its two-phase character in which the communication phase of the granular findings intertwines with the local
optimization being realized at the level of the individual site and exploits the evidence collected from other sites. The proposed
augmented form of the objective function is essential in navigating the overall optimization, which has to be completed on the
basis of the data and the available information granules. The intensity of collaboration is optimized by choosing a suitable tradeoff
between the two components of the objective function. The objective-function-based clustering used here is the well-known
Fuzzy C-Means (FCM) algorithm. The experimental studies cover synthetic data, selected data sets from the
machine learning repository, and weather data from Environment Canada.
© 2008 Elsevier B.V. All rights reserved.
Keywords: Collaborative clustering; Fuzzy C-Means (FCM); Induced fuzzy partition matrices; Proximity measure; Consistency index; Information
granules and granular computing; Granular prototypes; Type-2 fuzzy sets
1. Introduction
Clustering and, in particular, fuzzy clustering [2,6–8,10,12,18,23] play an important role in understanding data
by revealing their underlying structures and offering useful insights into the general tendencies, associations and
dependencies manifesting therein. Within this setting, concepts such as information granules, information granulation and
granular computing have proven very useful; cf. [3,17,21,27].
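For readers less familiar with the Fuzzy C-Means (FCM) algorithm named in the abstract, the following is a minimal sketch of the standard, single-site FCM iteration in plain Python. It is the textbook algorithm only, not the collaborative scheme developed in this paper. The function name `fcm`, its parameters (`m` as the fuzzification coefficient, `iters`, `tol`, `seed`) and the use of the squared Euclidean distance are assumptions made for illustration.

```python
import random


def fcm(data, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook Fuzzy C-Means (illustrative sketch, not the paper's method).

    data: list of equal-length numeric tuples; c: number of clusters;
    m > 1: fuzzification coefficient. Returns (u, protos), where u[i][k]
    is the membership of data point k in cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Random initial partition matrix; memberships of each point sum to 1.
    u = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):
        s = sum(u[i][k] for i in range(c))
        for i in range(c):
            u[i][k] /= s

    protos = []
    for _ in range(iters):
        # Prototype update: mean of the data weighted by u_ik^m.
        protos = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            protos.append(
                [sum(w[k] * data[k][j] for k in range(n)) / total
                 for j in range(dim)]
            )

        # Membership update from squared Euclidean distances to prototypes.
        change = 0.0
        for k in range(n):
            d2 = [sum((data[k][j] - protos[i][j]) ** 2 for j in range(dim))
                  for i in range(c)]
            zeros = [i for i in range(c) if d2[i] == 0.0]
            if zeros:
                # Point coincides with a prototype: share membership there.
                new = [1.0 / len(zeros) if i in zeros else 0.0
                       for i in range(c)]
            else:
                # With squared distances the exponent 2/(m-1) becomes 1/(m-1).
                e = 1.0 / (m - 1.0)
                new = [1.0 / sum((d2[i] / d2[j]) ** e for j in range(c))
                       for i in range(c)]
            for i in range(c):
                change = max(change, abs(new[i] - u[i][k]))
                u[i][k] = new[i]

        # Stop once the partition matrix has stabilized.
        if change < tol:
            break
    return u, protos
```

Calling `fcm(points, c=2)` on a small list of 2-D tuples returns the partition matrix and the prototypes. The partition matrix is the kind of object that the collaborative setting described in the abstract exchanges at the level of information granules, instead of the raw data.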
Recently, an important extension of clustering has led to the concept of combining several clustering results.
The need for such a combination arises when several outcomes of clustering exist, resulting from several runs of the same clustering
algorithm, from several clustering methods being applied to the same data set, or from the use of clustering for several data
∗ Corresponding author. Tel.: +1 780 492 3332; fax: +1 780 492 1811.
E-mail address: pedrycz@ee.ualberta.ca (W. Pedrycz).
0165-0114/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.fss.2007.12.030