MARKOV-TREE BAYESIAN GROUP-SPARSE MODELING: EFFICIENT SOLUTIONS TO LARGE INVERSE PROBLEMS

GANCHI ZHANG† AND NICK KINGSBURY†

Abstract. In this paper, we propose a new Markov-tree Bayesian model of wavelet coefficients. Based on a group-sparse GSM model with two-layer cascaded Gamma distributions for the variances, the proposed method effectively exploits both intrascale and interscale relationships across wavelet subbands. To determine the posterior distribution, we apply variational Bayesian inference with a subband-adaptive majorization-minimization method that makes the approach tractable for large problems.

Key words. Image deconvolution, Markov-tree, majorization-minimization, variational Bayesian, dual-tree complex wavelets.

1. Introduction. Linear inverse problems arise in many image processing applications, where a noisy indirect observation y of an original image x is modeled as y = Hx + n, where H of size M × N is the matrix representation of a direct linear operator and n is usually additive Gaussian noise with variance ν². Wavelet-based methods are well suited to ill-posed image restoration problems because natural images can often be sparsified in a wavelet basis [1]. Note that the statistical properties of wavelet coefficients can often be modeled by heavy-tailed Gaussian scale mixture (GSM) priors, which capture the intrascale relationships among wavelet coefficients [2, 3]. However, many authors have argued that there is strong persistence of large/small wavelet coefficients across scales, and that such interscale relationships are beneficial for modeling wavelet coefficients [4, 5, 6]. In general, this interscale dependency can be well represented by a wavelet tree structure in which child coefficient energy relates strongly to parent energy [6]. Various methods, such as bivariate shrinkage [7], hidden Markov trees [5], and overlapping-group penalties [6], have been used to exploit this parent-child relationship.
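As a concrete illustration of the observation model y = Hx + n, the following is a minimal sketch (not from the paper; all names are ours) that simulates a noisy, blurred observation of a small 1-D signal, with H built as a circulant moving-average blur matrix and n drawn as Gaussian noise with variance ν²:

```python
import numpy as np

# Hypothetical sketch: simulate y = H x + n for a 1-D "image" x,
# where H is an M x N (here square, N x N) circular blur operator
# and n is additive Gaussian noise with variance nu2.
rng = np.random.default_rng(0)
N = 64
x = rng.standard_normal(N)            # original signal (stand-in for the image)

# Build H as a circulant matrix implementing a 3-tap moving-average blur.
# Column k of a circulant matrix is the first column rolled by k, so
# H @ x performs circular convolution of the kernel with x.
kernel = np.array([0.25, 0.5, 0.25])
first_col = np.zeros(N)
first_col[[N - 1, 0, 1]] = kernel     # kernel centred (circularly) at index 0
H = np.stack([np.roll(first_col, k) for k in range(N)], axis=1)

nu2 = 0.01                            # noise variance nu^2
n = np.sqrt(nu2) * rng.standard_normal(N)
y = H @ x + n                         # noisy indirect observation
```

Because the kernel sums to one, H preserves constant signals, which is the usual normalization for a blur operator.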
2. Model Formulation. We propose a new Markov-tree based model for exploiting both intrascale and interscale dependencies among wavelet coefficients. Assume we can represent the image x by a wavelet expansion x = Mw, where M is the inverse wavelet transform and w is an N × 1 vector containing all wavelet coefficients. This yields the wavelet-based formulation y = HMw + n. Note that for an orthogonal basis, M is a square orthogonal matrix, whereas for an over-complete dictionary (e.g. a tight frame), M has N columns and M rows, with N > M [1]. The resulting likelihood of the data can be shown to be

p(y \mid w, \nu^2) = \left(2\pi\nu^2\right)^{-M/2} \exp\Big\{ -\frac{1}{2\nu^2} \left\| y - HMw \right\|^2 \Big\} \quad (2.1)

Similarly to [8], we use a non-overlapped group-sparse GSM model for w, and the conditional prior of w can then be expressed as

p(w \mid S) = \prod_{i=1}^{G} \mathcal{N}\left( w_i \mid 0, \sigma_i^2 I \right) = \mathcal{N}\left( w \mid 0, S^{-1} \right) \quad (2.2)

where the i-th group w_i is a vector of size g_i whose elements are drawn from a zero-mean Gaussian distribution with a signal variance σ_i² (as yet unknown), and where G is the number of such groups.

† Signal Processing Group, Dept. of Engineering, University of Cambridge, UK
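The likelihood (2.1) and the group prior (2.2) can be made concrete with a short numerical sketch. This is our own illustration, not code from the paper: `neg_log_likelihood` evaluates −log p(y | w, ν²), and `sample_group_prior` draws w group by group, with group i of size g_i sampled i.i.d. from N(0, σ_i²):

```python
import numpy as np

def neg_log_likelihood(y, H, M, w, nu2):
    """-log p(y | w, nu^2) as in (2.1): Gaussian with mean HMw and
    covariance nu^2 I, where y has M (here: y.size) observations."""
    r = y - H @ (M @ w)
    m = y.size
    return 0.5 * m * np.log(2 * np.pi * nu2) + (r @ r) / (2 * nu2)

def sample_group_prior(group_sizes, sigma2, rng):
    """Draw w from (2.2): the i-th group is a vector of size g_i whose
    elements are i.i.d. N(0, sigma_i^2)."""
    return np.concatenate(
        [np.sqrt(s2) * rng.standard_normal(g)
         for g, s2 in zip(group_sizes, sigma2)]
    )
```

In this setting the precision matrix S of (2.2) is diagonal, with the value 1/σ_i² repeated g_i times along the diagonal for each group, so the product of group Gaussians collapses into the single multivariate Gaussian N(w | 0, S⁻¹).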