Eﬃcient Training Algorithms for the Probabilistic RBF Network Constantinos Constantinopoulos and Aristidis Likas Dept. of Computer Science Univ. of Ioannina GR 45110, Ioannina, Greece {ccostas,arly}@cs.uoi.gr Abstract. The Probabilistic RBF (PRBF) network constitutes an a- daptation of the RBF network for classiﬁcation. Moreover it extends the typical mixture model by allowing the sharing of mixture components among all classes, in contrast to the conventional approach that sug- gests mixture components describing only one class. The typical learning method of PRBF for a classiﬁcation task employs the Expectation – Max- imization (EM) algorithm. This widely used method depends strongly on the initial parameter values. The Greedy EM algorithm is a recently pro- posed method that tries to overcome this drawback, in the case of the density estimation problem using mixture models. In this work we pro- pose a similar approach for incremental training of the PRBF network for classiﬁcation. The proposed algorithm starts with a single compo- nent and incrementally adds more components. After convergence the algorithm splits all the components of the network. The addition of a new component is based on criteria for detecting a region in the data space that is crucial for the classiﬁcation task. Experimental results using several well-known classiﬁcation datasets indicate that the incremental method provides solutions of superior classiﬁcation performance. Keywords: Machine Learning, Neural Networks, Probabilistic Reason- ing, Mixture Models, Classiﬁcation. 1 Introduction An eﬃcient method to tackle the classiﬁcation problem is to construct a model that estimates the class conditional densities p(x|k) of the data and the respective prior probabilities P (k) for each class. Using Bayes theorem, we can compute the posterior probabilities P (k|x) according to: P (k|x)= p(x|k)P (k) ∑ ℓ p(x|ℓ)P (ℓ) . (1) In order to classify an unknown pattern, according to Bayes decision rule, we select the class with the higher posterior probability. In the traditional statistical approach each class density p(x|k) is estimated using a separate mixture model G.A. Vouros and T. Panayiotopoulos (Eds.): SETN 2004, LNAI 3025, pp. 183–190, 2004. c  Springer-Verlag Berlin Heidelberg 2004