Image Coding using Generalized Predictors based on Sparsity and Geometric Transformations

Luís F. R. Lucas *§, Nuno M. M. Rodrigues *†, Eduardo A. B. da Silva §, Carla L. Pagliari ‡ and Sérgio M. M. de Faria *†

* Instituto de Telecomunicações, Portugal; † ESTG, Instituto Politécnico de Leiria, Portugal; ‡ DEE, Instituto Militar de Engenharia; § PEE/COPPE/DEL/Poli, Universidade Federal do Rio de Janeiro, Brazil

e-mails: luis.lucas,eduardo@smt.ufrj.br, nuno.rodrigues,sergio.faria@co.it.pt, carla@ime.eb.br

Abstract: Directional intra prediction plays an important role in current state-of-the-art video coding standards. In directional prediction, neighbouring samples are projected along a specific direction to predict a block of samples. Ultimately, each prediction mode can be regarded as a set of very simple linear predictors, a different one for each pixel of a block. Therefore, a natural question that arises is whether one could use the theory of linear prediction in order to generate intra prediction modes that provide increased coding efficiency. However, such an interpretation of each directional mode as a set of linear predictors is too poor to provide useful insights for their design. In this paper we introduce an interpretation of directional prediction as a particular case of linear prediction, that uses first-order linear filters and a set of geometric transformations. This interpretation motivated the proposal of a generalized intra prediction framework, whereby the first-order linear filters are replaced by adaptive linear filters with sparsity constraints. In this context, we investigate the use of efficient sparse linear models, adaptively estimated for each block through the use of different algorithms, such as Matching Pursuit, Least Angle Regression, Lasso or Elastic Net. The proposed intra prediction framework was implemented and evaluated within the state-of-the-art high efficiency video coding standard.
Experiments demonstrated the advantage of this predictive solution, mainly in the presence of images with complex features and textured areas, achieving higher average bitrate savings than other related sparse representation methods proposed in the literature.

Index Terms: Intra image prediction, sparse linear prediction, least squares regression, least angle regression, lasso, geometric transformations

This work was funded by FCT - "Fundação para a Ciência e Tecnologia", Portugal, under the grant SFRH/BD/79553/2011, and by CAPES/Pro-Defesa under grant number 23038.009094/2013-83.

I. INTRODUCTION

State-of-the-art video compression standards are based on a hybrid approach that comprises three stages: a prediction step, transform-based residue coding and entropy coding. Prediction methods play an important role in image and video coding algorithms, as they provide an efficient way to reduce signal energy based on previously encoded samples. In the case of video coding, inter-prediction methods, such as motion compensation, tend to provide the highest coding gains by exploiting the temporal similarities between the current and previously encoded frames. However, in some situations intra prediction is the only available solution, as in still image coding applications [1] or for the first frame and refreshing frames of a video sequence.

The idea of intra prediction is to use previously encoded samples from spatially neighbouring blocks to predict the unknown samples. Directional prediction [2] is the main intra prediction solution adopted in the current state-of-the-art H.264/AVC [3], [4] and High Efficiency Video Coding (HEVC) [5], [6] standards. Its principle consists in projecting the reconstructed samples at block boundaries along specific directions, providing an efficient representation of the directional structures and straight edges often present in natural images.
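As a minimal sketch of the projection principle just described, the following Python snippet builds three predicted blocks by copying boundary samples along the vertical, horizontal and 45-degree down-right directions. The function names, the argument layout (`top`, `left`, `top_left`) and the sample values are assumptions made for illustration only; the interpolation and smoothing filters that H.264/AVC and HEVC apply to the reference samples are deliberately left out, so this is not the standards' exact procedure.

```python
def predict_vertical(top, size):
    """Project the top-neighbour row straight down: every row repeats `top`."""
    return [[top[c] for c in range(size)] for _ in range(size)]


def predict_horizontal(left, size):
    """Project the left-neighbour column to the right: row r repeats left[r]."""
    return [[left[r]] * size for r in range(size)]


def predict_diagonal_down_right(top_left, top, left, size):
    """Project boundary samples along the 45-degree down-right direction.

    Each sample (r, c) is traced back up-left until it hits the top row,
    the left column or the corner sample. No reference smoothing is applied.
    """
    block = []
    for r in range(size):
        row = []
        for c in range(size):
            if c > r:        # trace reaches the top row at column c - r - 1
                row.append(top[c - r - 1])
            elif c < r:      # trace reaches the left column at row r - c - 1
                row.append(left[r - c - 1])
            else:            # trace reaches the top-left corner sample
                row.append(top_left)
        block.append(row)
    return block


# Hypothetical reconstructed neighbours of a 4x4 block.
top = [10, 20, 30, 40]
left = [1, 2, 3, 4]
top_left = 5

vertical = predict_vertical(top, 4)
horizontal = predict_horizontal(left, 4)
diagonal = predict_diagonal_down_right(top_left, top, left, 4)
```

In each case a predicted sample is a copy of a single boundary sample, which is why every mode can be read as a set of very simple linear predictors, one per pixel of the block.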
While H.264/AVC only supports 8 directional modes, HEVC exploits 33 prediction directions. In addition to directional modes, these standards use the DC and planar modes, which provide efficient prediction of smooth areas.

Despite its advantages, directional intra prediction presents some issues, mostly in the presence of complex regions, such as textured areas. These issues are an intrinsic limitation of directional modes, because they only exploit the first line of samples in the top and left neighbourhoods of the block. In order to better predict textured areas, alternative methods that reuse repeated texture patterns along the image have been proposed in the literature. The most common solutions are based on block matching (BM) [7] and template matching (TM) [8] algorithms. While the BM algorithm requires some kind of signalling to indicate the optimal matched block in the causal reconstructed area, TM provides an implicit way to derive the predictor block. In the TM algorithm, a reference template is formed using the causal reconstructed samples in the neighbourhood of the block to be predicted. A search procedure is performed by comparing the reference template with each equally shaped candidate template existing in a predefined causal search window. The block predictor is given by the block associated with the candidate template that produces the lowest matching error. Improved variations of TM have been proposed for H.264/AVC, e.g. using an average of multiple predictors [9], using adaptive illumination compensation methods [10], or combining TM with BM algorithms [11].

A related class of algorithms that has been widely investigated in the literature for efficient intra image prediction is based on sparse representation. Research on these methods has been motivated by the assumption that natural image signals are formed by few structural primitives. Most solutions involve a linear combination of few patterns chosen from a large