Cluster-Weighted Modeling as a Basis for Fuzzy Modeling

Madasu Hanmandlu
Dept. of Electrical Engineering, I.I.T. Delhi, New Delhi-110016, India. mhmandlu@ee.iitd.ernet.in

Vamsi Krishna Madasu
School of IT & EE, University of Queensland, QLD 4072, Australia. madasu@itee.uq.edu.au

Shantaram Vasikarla
Information Technology Dept., American InterCon. University, Los Angeles, CA 90066, U.S.A. svasikarla@la.aiuniv.edu

Abstract

Cluster-Weighted Modeling (CWM) is emerging as a versatile tool for modeling dynamical systems. It is a mixture density estimator built around local models. Specifically, the input and output regions are modeled as Gaussians that serve as local models, and these models are linked by a linear or non-linear function involving the mixture of the local models' densities. The present work shows a connection between CWM and the Generalized Fuzzy Model (GFM), thus paving the way for utilizing concepts of probability theory in the fuzzy domain, which has already emerged as a versatile tool for solving problems in uncertain dynamic systems.

1. Introduction

Cluster-Weighted Modeling, introduced by Gershenfeld et al. [1], is a versatile approach for deriving a functional relationship between input and output data using a mixture of expert clusters. Each cluster is localized to a Gaussian input region and has its own local trainable model. The CWM algorithm uses expectation-maximization (EM) to find the optimal locations of the clusters in the input space and to solve for the parameters of the local models [2]. CWM can be used as a modeling tool that allows one to characterize and predict systems of arbitrary dynamic character [3]. The framework employed in CWM is concerned with density estimation around Gaussian kernels containing simple local models that describe the system dynamics of a data subspace. In the simplest case, where only one kernel is required, the framework reduces to a model that is linear in the coefficients.
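The idea of density estimation around Gaussian kernels that each carry a simple local model can be illustrated with a minimal toy sketch. Here the prediction is the density-weighted blend of local linear models; the function names, the use of one-dimensional inputs, and the hand-set cluster parameters are illustrative assumptions, not the paper's own formulation.

```python
import numpy as np

def gaussian(x, mu, var):
    """1-D Gaussian density."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def cwm_predict(x, clusters):
    """Cluster-weighted prediction: each cluster holds a Gaussian input
    kernel (mu, var), a prior weight w, and a local linear model y = a*x + b.
    The output blends the local models, weighted by each kernel's density
    at x, so whichever cluster 'owns' the input region dominates."""
    weights = np.array([c["w"] * gaussian(x, c["mu"], c["var"]) for c in clusters])
    local_preds = np.array([c["a"] * x + c["b"] for c in clusters])
    return np.sum(weights * local_preds) / np.sum(weights)

# Two hand-set clusters: near x = 0 the local model is y = x,
# near x = 5 it is y = 10 - x.
clusters = [
    {"mu": 0.0, "var": 1.0, "w": 0.5, "a": 1.0, "b": 0.0},
    {"mu": 5.0, "var": 1.0, "w": 0.5, "a": -1.0, "b": 10.0},
]

print(cwm_predict(0.0, clusters))  # close to 0: first local model dominates
print(cwm_predict(5.0, clusters))  # close to 5: second local model dominates
```

Between the two kernel centres the prediction transitions smoothly from one local model to the other, which is how CWM obtains a globally non-linear model from transparent local pieces.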
In the most complex case, we may need non-Gaussian, discontinuous, high-dimensional and chaotic models. In between, CWM covers a wide range of models, each characterized by a different local model. We can also create globally non-linear models with transparent local structures by embedding past practice and mature techniques in the general non-linear framework.

Fuzzy modeling has evolved over the years for dealing with problems of dynamic systems. Recently, the Generalized Fuzzy Model (GFM) was proposed in [7]; it generalizes the existing fuzzy models, viz., the Compositional Rule of Inference (CRI) model and the Takagi-Sugeno (TS) model. In this paper, we show a strong connection between CWM and GFM. So far, GFM has lacked a sound mathematical footing; with this connection, GFM gains a strong foothold and can assimilate the strong points of the probabilistic framework.

The organization of the paper is as follows. Section 2 gives the concept of CWM, the use of EM in estimating density functions, and the model estimation. Section 3 briefly reviews the fuzzy models. Section 4 establishes the equivalence between CWM and GFM. Finally, conclusions are drawn in Section 5.

2. Cluster-Weighted Modeling

It is hard to capture local behavior with global beliefs. For example, if a smooth curve has some discontinuities, then in trying to fit the discontinuities we may miss the smoothness. Hence the need for a proper choice of fitting function, so that the transition from a low-dimensional space to a high-dimensional one is easily achieved. These considerations suggest that, to capture local behavior, we should estimate density using local rather than global functions. Kernel density estimation adopts this approach by placing a Gaussian at each data point, but this requires retaining every point in the model. A better approach is to find important points and fit a smaller number of local functions that can model larger neighborhoods.
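The kernel density estimate mentioned above, one Gaussian per data point, can be sketched in a few lines. The function name, bandwidth value, and sample data are illustrative assumptions.

```python
import numpy as np

def kde(x, data, bandwidth=0.5):
    """Kernel density estimate at x: average of Gaussians, one centred
    on every retained data point (this is why all points must be kept)."""
    z = (x - data) / bandwidth
    return np.mean(np.exp(-0.5 * z ** 2)) / (bandwidth * np.sqrt(2 * np.pi))

# Two well-separated groups of points.
data = np.array([0.0, 0.2, 0.4, 4.8, 5.0, 5.2])

# The estimated density is high near the groups and low in the gap.
print(kde(0.2, data) > kde(2.5, data))  # True
```

Because every data point contributes a kernel, the model grows with the dataset; the mixture-model approach described next replaces this with a small number of fitted local functions.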
Mixture models, preferably involving Gaussians, can achieve this. These models lead to the splitting of a dataset into a set of clusters. An example is an unsupervised learning algorithm, which must learn by itself where to place the local functions.

Proceedings of the International Conference on Information Technology: Computers and Communications (ITCC'03). 0-7695-1916-4/03 $17.00 © 2003 IEEE
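The unsupervised placement of clusters described above is exactly what EM accomplishes for a Gaussian mixture. The following one-dimensional sketch is a minimal, assumed implementation (function names, initialization by the data extremes, and the synthetic two-cluster data are all illustrative), not the paper's algorithm.

```python
import numpy as np

def em_gmm_1d(data, k=2, iters=50):
    """EM for a 1-D Gaussian mixture: learns, unsupervised, where the
    clusters sit (means), how wide they are (variances), and their weights."""
    mu = np.linspace(data.min(), data.max(), k)  # spread initial means apart
    var = np.full(k, np.var(data))
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each cluster for each point.
        dens = (np.exp(-0.5 * (data[:, None] - mu) ** 2 / var)
                / np.sqrt(2 * np.pi * var))
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        n = r.sum(axis=0)
        w = n / len(data)
        mu = (r * data[:, None]).sum(axis=0) / n
        var = (r * (data[:, None] - mu) ** 2).sum(axis=0) / n
    return w, mu, var

# Synthetic data from two groups centred at 0 and 5.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0.0, 0.3, 100), rng.normal(5.0, 0.3, 100)])

w, mu, var = em_gmm_1d(data, k=2)
print(np.sort(mu))  # the learned means settle near the generating centres
```

In CWM, the same EM machinery additionally re-estimates the parameters of each cluster's local model at every M-step, so cluster placement and local-model fitting are solved jointly.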