J. R. Statist. Soc. B (1978), 40, No. 1, pp. 106-112

A Quasi-Bayes Sequential Procedure for Mixtures

By A. F. M. SMITH† and U. E. MAKOV

University College London

[Received June 1977. Revised November 1977]

SUMMARY
Coherent Bayes sequential learning and classification procedures are often useless in practice because of ever-increasing computational requirements. On the other hand, computationally feasible procedures may not resemble the coherent solution, nor guarantee consistent learning and classification. In this paper, a particular form of classification problem is considered and a "quasi-Bayes" approximate solution requiring minimal computation is motivated and defined. Convergence properties are established and a numerical illustration provided.

Keywords: SEQUENTIAL; CLASSIFICATION; BAYESIAN; PATTERN RECOGNITION; STOCHASTIC APPROXIMATION; MIXTURES

1. INTRODUCTION
In this paper, we shall consider a problem of unsupervised sequential learning and classification that has received much attention in the contexts of Pattern Recognition and Signal Detection. A sequence of (possibly vector-valued) observations x_1, x_2, ..., x_n, ... is obtained, each belonging to one of k exclusive populations Π_1, Π_2, ..., Π_k (for example, "pattern types" or "signal sources"). For each n, the nth observation is to be classified on the basis of the observed values x_1, x_2, ..., x_n and without any feedback concerning the correctness, or otherwise, of previous classifications. We shall consider the situation in which, conditional on θ = (θ_1, θ_2, ..., θ_k)^T and density functions f_1, f_2, ..., f_k, we may assume that the random variables X_n are independent, with probability densities

    f(x_n | θ) = θ_1 f_1(x_n) + θ_2 f_2(x_n) + ... + θ_k f_k(x_n),    (1)

where the θ_i's are non-negative and sum to unity. The density f_i specifies the probability distribution of the observation given that it belongs to population Π_i, and θ_i denotes the probability of this latter event.
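As a minimal numerical sketch of the mixture density (1) (not part of the paper), the following evaluates f(x | θ) for k = 2; the Gaussian component densities and the particular value of θ are hypothetical choices made purely for illustration:

```python
import math

def normal_pdf(x, mean, sd):
    # Density of a normal distribution with given mean and standard deviation.
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Two hypothetical known component densities f_1 and f_2 (normals with
# different means), standing in for the populations Pi_1 and Pi_2.
def f1(x):
    return normal_pdf(x, 0.0, 1.0)

def f2(x):
    return normal_pdf(x, 3.0, 1.0)

def mixture_density(x, theta):
    """Equation (1) with k = 2: f(x | theta) = theta_1 f_1(x) + theta_2 f_2(x).

    theta is a pair of non-negative weights summing to unity.
    """
    return theta[0] * f1(x) + theta[1] * f2(x)

value = mixture_density(1.5, (0.4, 0.6))
```

Here θ = (0.4, 0.6) plays the role of the unknown mixing proportions; in the paper's setting θ is what must be learned from the sequence of observations.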
We further assume that the f_i are known and the θ_i unknown. This model is appropriate in many situations where extensive study can be made of the distribution of observations belonging to individual populations, but where the population mix is unknown in the context of interest. A non-sequential version of this problem has been considered by Shapiro (1974), and the books by Fu (1968) and Young and Calvert (1974) provide a general discussion of this and related models in Pattern Recognition and Signal Detection contexts.

2. A COHERENT BAYES SOLUTION
The formal Bayes solution to the problem of learning about θ and classifying the observations is deceptively straightforward. We suppose that p(θ) denotes a prior density for θ, p(θ | x_r) = p(θ | x_1, x_2, ..., x_r) the resulting posterior density for θ given X_1 = x_1, X_2 = x_2, ..., X_r = x_r, and p_i(θ | x_r) the posterior density for θ if, in addition to x_1, x_2, ..., x_r, it were also known that the rth observation came from Π_i.

† Present address: Department of Mathematics, University of Nottingham.
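The sequential updating behind the posterior p(θ | x_1, ..., x_r) can be sketched numerically via the recursion p(θ | x_1, ..., x_r) ∝ f(x_r | θ) p(θ | x_1, ..., x_{r-1}). The following is a minimal grid-approximation illustration (not the paper's procedure) for k = 2, with hypothetical Gaussian components and hypothetical data:

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Hypothetical known component densities (k = 2).
def f1(x):
    return normal_pdf(x, 0.0, 1.0)

def f2(x):
    return normal_pdf(x, 3.0, 1.0)

# Discretize theta_1 on [0, 1] (theta_2 = 1 - theta_1) and place a
# uniform prior p(theta) on the grid.
grid = [i / 100 for i in range(101)]
posterior = [1.0 / len(grid)] * len(grid)

def update(posterior, x):
    """One step of p(theta | x_1..x_r) proportional to f(x_r | theta) p(theta | x_1..x_{r-1})."""
    weights = [p * (t * f1(x) + (1 - t) * f2(x))
               for p, t in zip(posterior, grid)]
    total = sum(weights)
    return [w / total for w in weights]

observations = [0.1, -0.3, 2.9, 0.4]  # hypothetical data
for x in observations:
    posterior = update(posterior, x)

# Posterior mean of theta_1 after the four observations.
posterior_mean_theta1 = sum(t * p for t, p in zip(grid, posterior))
```

A grid approximation stays tractable here only because the θ-space is one-dimensional; the paper's point is precisely that the exact coherent solution, which must also track which population each past observation came from, has computational cost growing with r, motivating the quasi-Bayes approximation.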