ORIGINAL ARTICLE

Partition selection with sparse autoencoders for content based image classification

Rik Das 1 • Ekta Walia 2

Received: 13 October 2016 / Accepted: 15 June 2017
© The Natural Computing Applications Forum 2017

Abstract Managing colossal image datasets with high-dimensional hand-crafted features is no longer feasible in most cases. Content based image classification (CBIC) of these large image datasets calls for dimensionality reduction of the features extracted for the purpose. This paper identifies the escalating challenges in this domain and introduces a technique of feature dimension reduction that identifies the region of interest in a given image using reconstruction errors computed by sparse autoencoders. The automated process identifies the significant regions in an image for feature extraction. It not only improves the dimension of useful features but also contributes to improved classification results compared to earlier approaches. The reduction in the number of one kind of feature makes space for the inclusion of other features, whose fusion facilitates improved classification performance compared to individual feature extraction techniques. Two different datasets, i.e. the Wang dataset and the Corel 5K dataset, have been used for the experiments. State-of-the-art classifiers, i.e. Support Vector Machine and Extreme Learning Machine, are used for CBIC. The proposed techniques are evaluated and compared in the context of both classifiers, and analysis of the results suggests the appropriateness of the proposed methods for real time applications.
Keywords Content based image classification (CBIC) · Dimension reduction · Early fusion · Autoencoders · Extreme Learning Machine (ELM) · Support Vector Machine (SVM) · Feature extraction · Partition selection

1 Introduction

Content based image classification (CBIC) has emerged as an important research theme due to the escalating application of image data in assorted domains including medicine, entertainment, education, defence, etc. [1]. Radical improvement in accuracy has been observed in object recognition systems with the advent of various low level features [2]. Nevertheless, the efficiency of these systems is questionable in the case of large image datasets because of high computational overhead and elevated storage requirements. One of the reasons for these issues is the dimension of the feature vectors extracted to represent an image. The hand-crafted feature extraction techniques in contemporary literature extract image features by manipulating the entire image surface [3]. However, the whole image may not be necessary to create a distinct feature vector that effectively samples the image categories using fine-grained features [4]. Literature suggests the effectiveness of image blocks for local feature extraction. Significant performance enhancement in local invariant face recognition has been observed with adaptive selection of image blocks [5]. Hence, it has become imperative to locate the region of interest (ROI) for extracting useful features to facilitate effective CBIC. Image data is growing on a daily basis, and considering this growth, it is essential for image-based real time applications to reduce memory requirements and computational expenses.
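The idea of locating an ROI among image blocks by reconstruction error can be illustrated with a minimal sketch. This is not the authors' implementation: the grid size, the function names, and the choice to keep the partitions with the highest error are assumptions made purely for illustration, and `mean_reconstruct` is a trivial stand-in for a trained sparse autoencoder, which would be supplied as the `reconstruct` callable in practice.

```python
import numpy as np

def split_into_partitions(image, rows, cols):
    """Split a 2-D grayscale image into rows x cols equal blocks (hypothetical grid)."""
    h, w = image.shape
    bh, bw = h // rows, w // cols
    return [image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(rows) for c in range(cols)]

def reconstruction_errors(partitions, reconstruct):
    """Mean squared error between each partition and its reconstruction."""
    return np.array([np.mean((p - reconstruct(p)) ** 2) for p in partitions])

def select_partitions(errors, k, highest=True):
    """Return the indices of k partitions ranked by reconstruction error.

    Whether high or low error marks a region of interest is an assumption
    here; flip `highest` to use the opposite criterion.
    """
    order = np.argsort(errors)
    if highest:
        order = order[::-1]
    return sorted(order[:k].tolist())

def mean_reconstruct(block):
    """Stand-in reconstructor (NOT a sparse autoencoder): predicts the block mean."""
    return np.full_like(block, block.mean())

# Toy image: flat everywhere except a textured top-right block.
image = np.zeros((8, 8), dtype=float)
image[0:4, 4:8] = np.arange(16, dtype=float).reshape(4, 4)

parts = split_into_partitions(image, rows=2, cols=2)
errors = reconstruction_errors(parts, mean_reconstruct)
selected = select_partitions(errors, k=1)
```

Features would then be extracted only from the partitions listed in `selected`, which is how restricting extraction to a few blocks shortens the feature vector relative to processing the whole image.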
✉ Rik Das
rikdas@xiss.ac.in

1 Department of Information Technology, Xavier Institute of Social Service, Ranchi, India
2 Department of Computer Science, University of Saskatchewan, Saskatoon, Canada

Neural Comput & Applic
DOI 10.1007/s00521-017-3099-0

The authors have