Expectation propagation learning of a Dirichlet process mixture of Beta-Liouville distributions for proportional data clustering

Wentao Fan (a), Nizar Bouguila (b)

(a) Department of Computer Science and Technology, Huaqiao University, Xiamen, China
(b) The Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, QC, Canada H3G 1T7

Article history: Received 9 January 2014; received in revised form 24 March 2015; accepted 27 March 2015.

Keywords: Unsupervised learning; Expectation propagation; Dirichlet process; Beta-Liouville distribution; Facial expression; Action recognition

Abstract

We propose a nonparametric Bayesian model for the clustering of proportional data. Our model is based on an infinite mixture of Beta-Liouville distributions and allows a compact description of complex data. The choice of the Beta-Liouville as the basis of our model is justified by the fact that it has been shown to be a good alternative to the Dirichlet and generalized Dirichlet distributions for the statistical representation of proportional data. Using this infinite mixture, we show how careful modeling can achieve good results by allowing the elicitation of prior beliefs about the parameters and the number of clusters through suitable learning. Indeed, we develop an efficient learning algorithm, based on expectation propagation, to estimate the parameters of our infinite Beta-Liouville mixture model. The feasibility and effectiveness of the proposed method are demonstrated by two challenging applications, namely action and facial expression recognition.

© 2015 Elsevier Ltd. All rights reserved.
1. Introduction

Statistical models are becoming increasingly important because of their role in providing a concise picture of the data while taking uncertainty into account (Lewis and Catlett, 1994; Frey et al., 1995; Rosset and Segal, 2002; Keysers et al., 2004), and hence in the development of useful algorithms for pattern recognition, computer vision, and image processing (Yildizer et al., 2012; Liao et al., 2013). Finite mixtures have been widely used in the past for statistical modeling and for exploring data structure (Patrick, 1968; McLachlan and Peel, 2000; Nock and Nielsen, 2006). The modeling and clustering of images and videos is a prime example of the role mixtures play. In practice, however, mixture-based modeling generally relies on simplistic assumptions that may compromise modeling and generalization capabilities. Examples of these assumptions include supposing that the number of clusters is known in advance, which implies that we have to rely on the practitioner's ability to determine the optimal complexity, or using a multivariate normal distribution for modeling, which disregards the nature of the data.

Infinite mixtures have been proposed to overcome the deficiencies related to finite mixtures and have been shown to be effective tools in data analysis, modeling, and clustering (Lau and Green, 2007). Traditionally, there has been interest in infinite mixture models from a wide variety of disciplines, including machine learning, data mining, pattern recognition, and computer vision. The prevalent assumption when using infinite mixture models has been to consider that the component densities are Gaussian. Unfortunately, the Gaussian assumption may not be met in practice and is often violated, producing poor modeling results. This is especially true in the case of proportional data (e.g., normalized histograms), which are largely present and naturally generated in several domains.
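The normalized histograms mentioned above are the typical source of proportional data: non-negative vectors whose components sum to one. A minimal sketch of how such a vector arises from word counts (the vocabulary and document below are hypothetical, and the function name is our own, not from the paper):

```python
from collections import Counter

def normalized_histogram(tokens, vocabulary):
    """Map a token sequence to a proportional vector over a fixed vocabulary."""
    counts = Counter(t for t in tokens if t in vocabulary)
    total = sum(counts.values())
    # Each component is a relative frequency, so the vector sums to 1
    # (assuming at least one token falls inside the vocabulary).
    return [counts[w] / total for w in vocabulary]

vocab = ["ball", "goal", "team", "score"]            # hypothetical dictionary
doc = "the team scored a goal the team won".split()  # hypothetical document
hist = normalized_histogram(doc, vocab)
print(hist)  # proportions over the vocabulary, summing to 1
```

Vectors of this kind live on a simplex rather than all of Euclidean space, which is why simplex-supported densities such as the Dirichlet or Beta-Liouville are a more natural fit than the Gaussian.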
Examples include the representation of textual (or visual) documents using histograms containing the normalized frequencies of textual (or visual) words in a given dictionary (Bouguila, 2012a). The goal of this paper is to examine another alternative, based on the Beta-Liouville distribution, to suitably model proportional data. Indeed, few applications of the Beta-Liouville mixture have appeared recently, and much of the potential of this model has not been realized yet (Bouguila, 2011, 2012a,b). In Bouguila (2012a), finite Beta-Liouville mixture models are applied to scene modeling and classification, and to automatic image orientation detection. In Bouguila (2012b), infinite Beta-Liouville mixture models have been proposed and successfully applied to text classification and texture discrimination.

Infinite mixture-based modeling belongs to the group of nonparametric Bayesian approaches, which have been widely adopted recently (Hirano, 2002; Chib and Hamilton, 2002; Li et al., 2007; Ray and Mallick, 2006; Bouguila, 2012b). A challenging problem in this context is the development of efficient learning approaches.

Engineering Applications of Artificial Intelligence 43 (2015). http://dx.doi.org/10.1016/j.engappai.2015.03.016
Corresponding author: Nizar Bouguila. E-mail addresses: fwt@hqu.edu.cn (W. Fan), nizar.bouguila@concordia.ca (N. Bouguila).
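One standard way to make the infinite (Dirichlet process) mixture idea from the introduction concrete is the stick-breaking construction, in which mixing weights are produced by repeatedly breaking off pieces of a unit-length stick, so the number of effective components need not be fixed in advance. The sketch below is a generic truncated stick-breaking draw, not the learning algorithm of this paper; the concentration parameter and truncation level are illustrative choices:

```python
import random

def stick_breaking_weights(alpha, truncation, seed=0):
    """Draw mixture weights from a truncated Dirichlet process stick-breaking prior."""
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for _ in range(truncation):
        # v_k ~ Beta(1, alpha); pi_k = v_k * prod_{j<k} (1 - v_j)
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # The weights sum to slightly less than 1; the remainder is the
    # mass of the (untouched) infinite tail of components.
    return weights

pi = stick_breaking_weights(alpha=1.0, truncation=20)
print(sum(pi))  # close to 1; only a few components carry most of the mass
```

Smaller values of alpha concentrate the mass on fewer components, which is how the prior expresses a belief about the number of clusters without hard-coding it.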