An Efficient Convolutional Neural Network for Remote- Sensing Scene Image Classification Muhammad Ashad Baloch 1* , Sajid Ali 2 , Mubashir H.Malik 3 , Aamir Hussain 4 , Abdul Mustaan Madni 1 1 National College of Business Administration & Economics Multan, Pakistan. 2 Department of Information Science, University of Education Lahore, Pakistan. 3 Institute of Southern Punjab Multan, Pakistan. 4 Muhammad Nawaz Shareef University of Agriculture Multan, Pakistan. * Corresponding author. Tel.: +923056976008; email: Ashad5765@gmail.com Manuscript submitted December 12, 2019; accepted January 13, 2020. doi: 10.17706/jcp.15.2.48-58 Abstract: Deep neural networks are providing a powerful solution for remote-sensing scene image classification. However, a limited number of training samples, inter-class similarity among scene categories, and to get the benefits of multi-layer features remains a significant challenge in the remote sensing domain. Many efforts have been proposed to deal the above challenges by adapting knowledge of state-of-the-art networks such as AlexNet, GoogleNet, OverFeat, etc. However, these networks have high number of parameters. This research proposes a five-layer architecture which has fewer parameters compared with above state-of-the-art networks, and can be also complementary to other convolutional neural network features. Extensive experiments on UC Merced and WHU-RS datasets prove that although our network decreases the number of parameters dramatically, it generates more accurate results than AlexNet, OverFeat, and its accuracy is comparable with other state-of-the-art methods. Keywords: Satellite image classification, convolutional neural network, feature fusion. 1. Introduction DUE to the invention of imaging devices like hyper spectral sensors, synthetic aperture radar, airborne visible/infrared Imaging spectrometer (AVIRIS) etc., more and more instruments are being developed to facilitate earth observation that allow us to examine the ground surface in greater detail. However, the very high resolution scenes images, inter-class similarity among scene categories or intra-class variability, significantly effects the classification performance. Although, scene categories are different from each other, the differences are almost undistinguishable due to identical thematic classes. For example, images from forest and park, which belong to two different scene categories, may both consist of trees, mountains, and water at the same time but differ in the density and spatial distribution of these three thematic classes [1]. In this regard, existing approaches can be classified into three main components [2], namely: local visual methods, global visual methods and methods based on high-level vision information. Low-level features are handcrafted features, usually consist of color, shape or textual information [3], [4]. Although these features have been utilized effectively in different applications, an object or pixel-based information cannot fulfill the entire scene understanding due to its high-diversity and the various thematic classes. Mid-level features are potentially more distinctive than the traditional low-level local features and Journal of Computers 48 Volume 15, Number 2, March 2020