This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS
Remote Sensing Scene Classification Using
Convolutional Features and Deep Forest Classifier
Yaakoub Boualleg , Student Member, IEEE, Mohamed Farah , and Imed Riadh Farah
Abstract—High-resolution remote sensing scene classification (HR-RSSC) plays an increasingly important role, as it aims to enhance the semantic understanding of scenes. Recently, convolutional neural networks (CNNs) have proved their effectiveness
in learning powerful feature representations for various visual
recognition tasks. However, in the RS domain, the performance
of CNN is still limited due to the lack of sufficient labeled data.
In this letter, we propose an HR-RSSC method based on CNN
transfer learning (TL) for feature extraction (FE) and deep forest
(DF) for classification. Specifically, we extract deep features from the last convolutional layer, avoiding the fully connected layers (FCLs), which require many parameters to tune. Moreover, we train a DF model based on ensemble learning, which can achieve better performance than single classifiers and is easy to train with few parameters. We evaluate the proposed
method on two RS image data sets. Compared to full-training,
fine-tuning, and state-of-the-art CNN TL methods, the results
demonstrate the effectiveness of the DF model for HR-RSSC
based on CNN TL in terms of overall accuracy and training
time.
Index Terms— Convolutional neural network (CNN), deep
forest (DF), remote sensing (RS), scene classification, transfer
learning (TL).
I. INTRODUCTION
In recent years, large volumes of high-resolution remote sensing (RS) images have become publicly available.
In order to mine high-quality information from these available
large-scale RS images, the research community has shown
a growing interest in RS image analysis. Significant efforts
have been made to develop accurate high-resolution RS scene
classification (HR-RSSC) methods that intend to increase the
semantic understanding of the RS images by labeling each
image with a specific semantic scene category.
Most recent methods rely on deep learning (DL), with convolutional neural networks (CNNs) being the dominant DL architecture for most computer vision tasks. However, training CNNs from scratch requires a huge amount of labeled data. In addition, parameter tuning is a difficult process, and the resulting models are hard to interpret theoretically. Moreover, although the convolutional layers (Conv) of CNNs are powerful extractors of high-order features, ordinary CNN architectures use fully connected layers (FCLs) as a classifier, and FCLs are well known to overfit easily when the training data are scarce.
Manuscript received March 12, 2019; accepted April 14, 2019. (Correspond-
ing author: Yaakoub Boualleg.)
The authors are with the SIIVT–RIADI Laboratory, National School of
Computer Science, University of Manouba, Manouba 2010, Tunisia (e-mail:
yaakoub.boualleg@ieee.org).
Color versions of one or more of the figures in this letter are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/LGRS.2019.2911855
Several studies attempt to alleviate the overfitting problem by using dropout and regularization methods [1], [2] or by replacing the FCLs with a global average pooling (GAP) layer [3]. In addition, an effective way to exploit CNNs on small-scale data sets is CNN transfer learning (TL), either by fine-tuning a pretrained model or by using the CNN model as a feature extractor. Moreover, recent studies have tended to replace the common neurons with decision trees [4] as an alternative solution to alleviate the mentioned deficiencies of deep neural networks (DNNs).
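The GAP operation mentioned above collapses each feature map to its spatial mean, producing a compact feature vector with no trainable parameters, unlike an FCL. A minimal NumPy sketch (the 512 × 7 × 7 shape is only illustrative of a typical last-Conv output):

```python
import numpy as np

def global_average_pooling(fmaps):
    """Collapse each feature map to a single value by spatial averaging.

    fmaps: array of shape (channels, height, width), e.g. the output
    of the last convolutional layer of a CNN.
    Returns a (channels,) feature vector; the operation itself has
    no parameters to learn, which limits overfitting on small data.
    """
    return fmaps.mean(axis=(1, 2))

# Example: 512 feature maps of size 7 x 7
fmaps = np.random.rand(512, 7, 7)
vec = global_average_pooling(fmaps)
print(vec.shape)  # (512,)
```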
The deep forest (DF) is a recent DL architecture that
is based on an ensemble-learning method where multiple
learners are trained and combined for a single task. It was first
proposed by Zhou and Feng [4]. DF architectures have few
parameters compared to CNNs and are, therefore, easy to train
with low computational costs. Furthermore, the DF models can
be applied to small data sets and have achieved performance competitive with CNNs on various classification tasks.
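As a rough illustration of the cascade idea behind the DF, the following sketch builds a layer-by-layer forest ensemble with scikit-learn, where each layer appends its class-probability outputs to the input features before passing them on. This is a simplified sketch, not the original gcForest implementation of Zhou and Feng [4], which additionally uses completely-random forests and cross-validated probability estimates:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier

def cascade_forest_predict(X_train, y_train, X_test, n_layers=2):
    """Minimal cascade forest: each layer trains two forests, appends
    their class-probability vectors to the original features, and feeds
    the augmented representation to the next layer."""
    aug_train, aug_test = X_train, X_test
    for _ in range(n_layers):
        probs_tr, probs_te = [], []
        for Forest in (RandomForestClassifier, ExtraTreesClassifier):
            f = Forest(n_estimators=50, random_state=0)
            f.fit(aug_train, y_train)
            probs_tr.append(f.predict_proba(aug_train))
            probs_te.append(f.predict_proba(aug_test))
        # Augment the original features with this layer's probabilities
        aug_train = np.hstack([X_train] + probs_tr)
        aug_test = np.hstack([X_test] + probs_te)
    # Final prediction: average the last layer's probability vectors
    final_probs = np.mean(probs_te, axis=0)
    return final_probs.argmax(axis=1)
```

In the full gcForest, the number of cascade layers is not fixed in advance: layers are added as long as a validation measure improves, which is one reason the model adapts well to small data sets.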
In order to fully exploit the advantages of the CNN
in extracting high-order image features and the DF as an
ensemble-learning method which can achieve better perfor-
mance than a single classifier, and motivated by the promising
results of the CNN TL methods, we propose an HR-RSSC
method based on the TL of a pretrained CNN model using a
DF classifier. The pretrained CNN model is used as a feature
extractor by extracting the image convolutional features. Then,
the extracted feature maps (Fmaps) are fed into the pro-
posed DF model to predict the corresponding scene category.
Unlike standard classifiers, the DF can capture spatial relationships among the feature maps during the feature representation stage using multigrained scanning (MGS). Then, in the second stage, the cascade forest structure (CFS) performs layer-by-layer feature processing, with information fed forward through the model layers until the final layer produces the class prediction.
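The MGS step slides a window over a feature map and turns each local patch into one instance for the forests, which is how the DF preserves local spatial relationships. A minimal NumPy sketch with a single, illustrative window size (the actual gcForest scans at multiple grain sizes):

```python
import numpy as np

def multi_grained_scan(fmap, win=3, stride=1):
    """Slide a win x win window over a 2-D feature map and stack the
    flattened patches; each patch becomes one instance for the forests,
    preserving local spatial structure that a flat vector would lose."""
    h, w = fmap.shape
    patches = [
        fmap[i:i + win, j:j + win].ravel()
        for i in range(0, h - win + 1, stride)
        for j in range(0, w - win + 1, stride)
    ]
    return np.stack(patches)  # shape: (n_patches, win * win)

# A 7x7 feature map scanned with a 3x3 window yields 25 patches of 9 values
patches = multi_grained_scan(np.arange(49, dtype=float).reshape(7, 7))
print(patches.shape)  # (25, 9)
```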
II. RELATED WORK
The RS domain still suffers from the lack of sufficient
labeled samples due to the high cost of the labeling process.
Exploiting CNNs for small-scale labeled RS images has been
widely investigated. Nogueira et al. [5] examined three strategies for exploiting CNNs for HR-RSSC: full training, fine-tuning, and using CNNs as feature extractors, and concluded that the feature-extraction strategy is the best one. Hu et al. [6] investigated the strength of the features (Fmaps) extracted from the pretrained VGG16 CNN in two scenarios. In the
first scenario, the Fmaps are extracted from the last FCL.
In the second scenario, the Fmaps are extracted from the last
1545-598X © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.