SIViP
DOI 10.1007/s11760-016-1052-9

ORIGINAL PAPER

Facial expression recognition based on image pyramid and single-branch decision tree

Abubakar M. Ashir 1 · Alaa Eleyan 2

Received: 4 May 2016 / Revised: 4 October 2016 / Accepted: 26 December 2016
© Springer-Verlag London 2017

Abubakar M. Ashir, ashir4real@yahoo.com
Alaa Eleyan, aeleyan@avrasya.edu.tr
1 Department of Electric and Electronic Engineering, Selçuk University, Konya, Turkey
2 Department of Electric and Electronic Engineering, Avrasya University, Trabzon, Turkey

Abstract In this paper, a new approach is proposed for improved facial expression recognition. The approach is inspired by compressive sensing theory and by the multiresolution approach to facial expression problems. Initially, each image sample is decomposed into the desired number of pyramid levels at different sizes and resolutions. Pyramid features at all levels are concatenated to form a pyramid feature vector. The vectors are further reinforced and reduced in dimension using a measurement matrix based on compressive sensing theory. For classification, a multilevel classification approach based on a single-branch decision tree is proposed. The proposed multilevel classification approach trains a number of binary support vector machines equal to the number of classes in the dataset. The class of a test sample is evaluated by passing it through the nodes of the tree from the root to its apex. The results obtained from the approach are impressive and outperform most of its counterparts in the literature under the same databases and settings.

Keywords Facial expression recognition · Compressive sensing · Image pyramid

1 Introduction

Facial expression recognition (FER) is one of the branches of pattern recognition (PR) that has enjoyed increasing patronage from many walks of life in recent times. This could be attributed to developments in technology and to humans' need for information and intelligence gathering.
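The feature-extraction steps summarized in the abstract (pyramid decomposition, concatenation of all levels, and dimensionality reduction with a measurement matrix) can be illustrated with a minimal sketch. Several details here are assumptions made only for illustration and are not stated in this part of the paper: the 2x2 block-averaging used to build each pyramid level, the use of raw pixel values as features, the random Gaussian measurement matrix, and all sizes. The function names are hypothetical.

```python
import random


def downsample(img):
    """Halve each dimension by averaging non-overlapping 2x2 blocks (assumed filter)."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def pyramid_features(img, levels):
    """Concatenate the flattened pixels of every pyramid level into one vector."""
    features = []
    current = img
    for _ in range(levels):
        features.extend(v for row in current for v in row)
        current = downsample(current)
    return features


def measurement_matrix(m, n, seed=0):
    """Random Gaussian m x n matrix, a common compressive-sensing choice (assumption)."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def compress(phi, x):
    """Project the feature vector x to a lower dimension: y = phi * x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


# Toy 8x8 "image"; a real face image would be much larger.
image = [[float((r * 8 + c) % 256) for c in range(8)] for r in range(8)]
x = pyramid_features(image, levels=3)   # 64 + 16 + 4 = 84 values
phi = measurement_matrix(20, len(x))    # reduce 84 dimensions to 20
y = compress(phi, x)
```

In this sketch the reduced vector `y` would be the input to the classification stage; the target dimension of 20 is arbitrary.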
Some of the emerging applications of FER are in marketing, security, psychology, medical diagnosis, human–machine interaction and entertainment [1]. The algorithm flow for FER is not much different from that of its counterpart algorithms in PR. The steps are preprocessing, feature extraction, classification and the decision. Generally, two major approaches are adopted for feature extraction in FER. The first is the component-based (holistic) approach and the second is the feature-based (local) approach. In the former, the entire face image is used as input to extract features, while in the latter only some key points within the face image (e.g., eye, nose, mouth) are used to take geometrical measurements and to capture localized information around them [2, 3].

The use of multiresolution algorithms such as the Gabor wavelet transform (GWT) and the discrete wavelet transform (DWT), to name but a few, is very common in FER and appears to have an edge over other feature extractors such as local binary pattern (LBP), principal component analysis (PCA) and linear discriminant analysis (LDA) [2]. The authors in [2] used a multiresolution transform called the curvelet transform (CT) at different orientations and scales to form curvelet products, which were wrapped around their origin. The products were then used to extract curvelet coefficients using the inverse CT. The coefficients were subsequently used as feature vectors. Though improved performance has been reported, intensive computation is required to reach that performance. In a similar way, the authors in [3–8] used GWT in one form or another to encode features for FER. For instance, [3] subjected the face images to local, multiscale Gabor filter operations, and the resulting Gabor decompositions were then encoded using radial grids, imitating the topographical map structure of the human visual cortex (HVC). Due to the similarity of