Dictionary learning for image prediction

Mehmet Türkan¹, Christine Guillemot
INRIA, Campus Universitaire de Beaulieu, 35042 Rennes Cedex, France

Article history: Received 19 May 2012; Accepted 2 February 2013; Available online 14 February 2013

Keywords: Dictionary learning; Image prediction; Image compression; Sparse representations; Coding; Block transforms

Abstract

We present a dictionary learning algorithm tailored to the block-based image prediction problem. More precisely, we learn two related sub-dictionaries A_c and A_t: the first (A_c) approximates the known samples in a causal neighborhood of the block to be predicted, and the second (A_t) approximates the block to be predicted itself. The two dictionaries are learned jointly so that the representation vectors computed by approximating the known samples with A_c lead to a good approximation of the block to be predicted when used together with A_t. Because of its simplicity, the method can be used for on-the-fly learning of dictionaries. The proposed method has first been evaluated for intra prediction and has then been applied in a complete image compression algorithm. Experimental results show gains of up to 3 dB in terms of prediction compared to the H.264/AVC intra modes and up to 2 dB in terms of rate-distortion performance.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

The problem of learning redundant and over-complete dictionaries has gained great importance in many image processing tasks, such as texture modeling [1], image denoising [2,3], image restoration [4,5], image compression [6,7], inpainting and zooming [8], and more. The underlying idea is that natural (image) signals can be approximated more sparsely, and therefore compacted/represented more efficiently, as a weighted linear combination of a set of pre-learned dictionary basis functions (atoms) than with off-the-shelf (e.g., DCT, DFT, wavelets) bases or dictionaries.
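To make the two-dictionary prediction scheme from the abstract concrete, the following is a minimal sketch, not the authors' implementation: the causal template is sparse-coded over A_c (here with a simple orthogonal matching pursuit, one common choice of sparse coder), and the resulting representation vector is reused with the companion sub-dictionary A_t to synthesize the prediction of the unknown block. All function and variable names are illustrative assumptions.

```python
import numpy as np

def omp(D, y, sparsity):
    """Greedy orthogonal matching pursuit: find a code x with at most
    `sparsity` nonzeros such that D @ x approximates y."""
    residual = y.astype(float).copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(sparsity):
        # Select the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Refit the coefficients on the selected support by least squares.
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        x[:] = 0.0
        x[support] = coeffs
        residual = y - D @ x
    return x

def predict_block(A_c, A_t, template, sparsity=4):
    """Sparse-code the causal template over A_c, then synthesize the
    prediction of the unknown block with the companion dictionary A_t."""
    x = omp(A_c, template, sparsity)
    return A_t @ x
```

The key design point is that the representation vector x is computed only from the known (causal) samples, so the decoder can reproduce it without side information; A_t then maps that same vector onto the unknown block.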
One of the earliest dictionary learning schemes was proposed by Olshausen and Field [9]. Various learning algorithms have since been proposed in the literature in connection with the sparse signal representation problem. The most recent dictionary learning methods focus on ℓ0 and ℓ1 sparsity measures. The sparsity constraint associated with the learning problem leads to simple formulations, and hence to efficient techniques in terms of approximation accuracy, compaction rate, and computational complexity.

Non-parametric dictionary learning methods, such as the method of optimal directions (MOD) [10] and K-SVD [11], have been developed, resulting in non-structured learned dictionaries. These methods are very effective in practice; however, the computational complexity required for learning a non-structured dictionary restricts their use to low-dimensional problems. There are also parametric learning structures, such as translation-invariant dictionaries [12–15], multiscale dictionaries [5,16], and sparse dictionaries [17]. These dictionaries are usually learned by imposing various desired properties on the dictionary, typically leading to a more compact representation, and hence to a more efficient implementation, than non-parametric dictionaries [18]. Unions of orthonormal bases [19,20] can also be seen as parametric learning methods resulting in structured dictionaries in tight frames. Despite their efficiency in learning dictionaries with reduced complexity, these methods lack the flexibility to capture more complex structures in natural images. Moreover, online learning algorithms [21,22], iteration-tuned schemes [23,24], task-driven learning approaches [25], and tree-structured hierarchical methods [26–28] have been introduced in the literature, aiming at improving dictionary learning methods and their applications to various image processing tasks.
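As a concrete illustration of the non-parametric family, the MOD dictionary update admits a closed-form least-squares solution: given training signals Y and their current sparse codes X, the dictionary minimizing ||Y - D X||_F^2 is D = Y X^+ . The sketch below (an illustrative assumption, not code from the paper; the alternating sparse-coding stage is omitted) shows this single update step.

```python
import numpy as np

def mod_update(Y, X):
    """One MOD dictionary-update step: solve min_D ||Y - D X||_F^2 in
    closed form via the Moore-Penrose pseudoinverse, D = Y X^+, then
    normalize the atoms (columns) to unit Euclidean norm."""
    D = Y @ np.linalg.pinv(X)
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0  # guard against unused (all-zero) atoms
    return D / norms
```

In a full MOD iteration this update alternates with a sparse-coding stage (e.g., matching pursuit) that recomputes X for the new D; the pseudoinverse over all training samples is what makes the method costly in high dimensions, as noted above.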
Tutorial papers on dictionary learning techniques are available in [18,29].

The above-mentioned dictionary learning algorithms are, however, not well suited to the image prediction problem. These methods are mainly adapted to learning basis functions (atoms) for approximating the input data vectors (e.g., image blocks), not to the problem of predicting unknown pixels from noisy observed samples in a causal neighborhood (the approximation support, or template). Moreover, the complexity of these methods, which results from the number and the dimension of the training samples, often limits their applicability to low-dimensional data analysis problems, and makes them fragile to

http://dx.doi.org/10.1016/j.jvcir.2013.02.001
Corresponding author. Tel.: +33 642121549. E-mail addresses: Mehmet.Turkan@gmail.com (M. Türkan), Christine.Guillemot@inria.fr (C. Guillemot). URL: http://www.irisa.fr/temics/staff/guillemot/ (C. Guillemot).
¹ Mehmet Türkan is currently a researcher in Technicolor R&D, Cesson Sévigné, France.
J. Vis. Commun. Image R. 24 (2013) 426–437