Physica A 257 (1998) 85–98

Bayesian learning versus optimal learning

Mirta B. Gordon ∗, Arnaud Buhot

Département de Recherche Fondamentale sur la Matière Condensée, CEA-Grenoble, 17 rue des Martyrs, 38054 Grenoble Cedex 9, France

Abstract

We consider the optimal performance that may be reached in the problem of learning the symmetry-breaking direction of a cloud of P = αN points in an N-dimensional space. The performance is measured through the overlap R_opt between the true symmetry-breaking direction and the learnt one. Depending on the problem, the learning curves R_opt(α) may present discontinuities. We show that close to these, Bayesian learning is not optimal. © 1998 Elsevier Science B.V. All rights reserved.

1. Introduction

It has recently been shown that finding the principal component of a set of points, clustering data with a mixture of Gaussians, and learning pattern classification from examples with neural networks may be cast as particular cases of unsupervised learning [1]. In all these problems, P N-dimensional points (also called examples or patterns, and collectively the training set) are drawn from a probability density function (pdf) with axial symmetry. The determination of the symmetry-breaking direction given the training set is called learning. If we are given no information about the data besides the coordinates of the examples, this determination is called unsupervised learning, in contrast with supervised learning, in which each training example is labelled. The learning process has to detect how the pattern distribution along the symmetry-breaking direction differs from that along the orthogonal directions. Given the training set, the probability of the symmetry-breaking direction is given by Bayes' formula of statistical inference. Sampling the direction with the Bayes probability is called Gibbs learning [2].
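As a minimal sketch of Bayes' formula and Gibbs learning as just described, the Python example below makes several assumptions that are not taken from the paper: the cloud is an isotropic unit-variance Gaussian shifted by a known amount `rho` along the true direction, the space has only N = 2 dimensions so that candidate directions can be enumerated on a grid of angles, and the prior over directions is uniform. The function names are hypothetical and chosen for illustration only.

```python
import numpy as np

def sample_cloud(b_true, num_points, rho, rng):
    """Draw points from an isotropic unit Gaussian shifted by rho along b_true.

    Assumed toy model: the pdf is axially symmetric around b_true.
    """
    n = b_true.size
    return rng.standard_normal((num_points, n)) + rho * b_true

def log_posterior(candidates, points, rho):
    """Unnormalised log-posterior of each candidate unit vector (uniform prior).

    For unit vectors B, log P(xi | B) = rho * (xi . B) + terms independent of B,
    so the log-likelihood of the training set is rho * sum over points of xi . B.
    """
    return rho * (points @ candidates.T).sum(axis=0)

def gibbs_learning(candidates, points, rho, rng):
    """Sample one direction with its Bayes (posterior) probability."""
    logp = log_posterior(candidates, points, rho)
    weights = np.exp(logp - logp.max())  # subtract the max for numerical stability
    weights /= weights.sum()             # normalise, as in Bayes' formula
    idx = rng.choice(len(candidates), p=weights)
    return candidates[idx], weights

rng = np.random.default_rng(0)
b_true = np.array([1.0, 0.0])  # true symmetry-breaking direction
points = sample_cloud(b_true, num_points=200, rho=1.0, rng=rng)

# Candidate directions: a uniform grid on the unit circle (discretised uniform prior).
angles = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
candidates = np.column_stack([np.cos(angles), np.sin(angles)])

b_gibbs, weights = gibbs_learning(candidates, points, rho=1.0, rng=rng)
overlap = float(b_gibbs @ b_true)  # overlap between sampled and true direction
print(f"overlap of the Gibbs sample with the true direction: {overlap:.3f}")
```

In this toy setting the posterior concentrates around the true direction as the number of points grows, so a single Gibbs sample already has a large overlap with it. In high dimensions the directions cannot be enumerated on a grid, and a Monte Carlo sampler would be needed instead.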
The average of the solutions obtained through Gibbs learning, weighted with the corresponding probability, is called the Bayesian solution.

* Corresponding author. PII: S0378-4371(98)00130-7

Generally, the solution to the learning problem may be formalized as the search for the minimum of an adequate cost function. In fact, several ad hoc cost functions allowing