Improving the Representation of CNN Based
Features by Autoencoder for a Task of
Construction Material Image Classification
S. Bunrit, N. Kerdprasop, and K. Kerdprasop
Suranaree University of Technology, Thailand
Email: sbunrit@sut.ac.th
Abstract—The deep learning model known as the Convolutional Neural Network (CNN) has been extensively employed in diverse applications involving image and video data. Because training a task-specific CNN model consumes enormous machine resources and requires a large amount of training data, pre-trained CNN models have been broadly used in transfer-learning scenarios. In this scenario, features learned by a pre-trained model on one source task can be proficiently transferred to another specific task, following the concept of knowledge transfer. A specific task can thus either employ the pre-trained features directly or train further, using the pre-trained features as a starting point. This takes little time and, as many referenced works show, can improve performance. In this work, for the specific task of construction material image classification, we investigate transfer learning with GoogleNet and ResNet101 pre-trained on the ImageNet dataset (the source task). Both transfer-learning schemes yield quite satisfactory results: GoogleNet achieves its best classification accuracy, 95.50%, with the fine-tuning scheme, while ResNet101 performs best, at 95.00%, with the fixed-feature-extractor scheme. Nevertheless, when learning-based representation methods are further applied on top of the transferred features, the results are even more appealing. The Autoencoder-based representation method improves performance more than PCA (Principal Component Analysis) in all cases. In particular, when the fixed features extracted by ResNet101 are used as the input to the Autoencoder, the classification accuracy improves to 97.83%. It can be inferred that simply applying an Autoencoder on top of the pre-trained transferred features improves performance, with no need to fine-tune the complex pre-trained model.
Index Terms—Convolution Neural Network (CNN), transfer
learning, Autoencoder, construction material, image
classification
I. INTRODUCTION
Since the emergence of deep learning, Convolutional Neural Network (CNN) based learning has been extensively employed in diverse applications, especially tasks involving image and video data. Because constructing and training a task-specific CNN model consumes enormous machine resources and requires a large amount of training data, pre-trained CNN models have been published and appreciated by many application domains. The features learned by a pre-trained model on one source task can be proficiently transferred to another specific task, following the concept of transfer learning. Through transfer learning, a specific task can either employ the pre-trained features directly or train further, using the pre-trained features as a starting point. This takes little time and, as many referenced works show, can improve performance.

Manuscript received April 17, 2020; revised September 23, 2020.
Transfer learning of a CNN model can be applied in two schemes: fixed feature extractor and fine-tuning. The fixed-feature-extractor scheme transfers the pre-trained features directly to a specific task by simply projecting (activating) the task-specific data through those features. The other popular scheme is fine-tuning, in which the pre-trained features transferred from the source task are adapted to the specific task by further training on the task-specific dataset; the resulting features are then utilized. Naturally, the fixed feature extractor can be processed faster than fine-tuning, especially when the pre-trained model is very deep: the deeper the model, the longer the fine-tuning process. In addition, fine-tuning requires setting many hyper-parameters, and searching for suitable, optimal hyper-parameters is also time-consuming and complex.
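The contrast between the two schemes can be sketched in a toy numpy example. This is an illustration under made-up sizes and data, not the paper's actual GoogleNet/ResNet101 setup: a frozen "pre-trained" layer with only a new classifier head trained, versus one further update of the transferred weights themselves.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: W_pre plays the role of pre-trained CNN
# weights, and (X, y) is a small task-specific dataset.
W_pre = rng.normal(size=(8, 4))                # "pre-trained" extractor
X = rng.normal(size=(32, 8))                   # task-specific inputs
y = rng.integers(0, 2, size=32).astype(float)  # binary labels

def features(X, W):
    """Project (activate) the task data through the transferred layer."""
    return np.maximum(X @ W, 0.0)              # ReLU activation

# Scheme 1: fixed feature extractor. W_pre stays frozen; only a new
# logistic-regression head (w, b) is trained on the projected features.
F = features(X, W_pre)
w, b = np.zeros(4), 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))     # sigmoid head
    g = p - y                                  # cross-entropy gradient
    w -= 0.1 * F.T @ g / len(y)
    b -= 0.1 * g.mean()

# Scheme 2: fine-tuning. Start from W_pre and also update the
# transferred weights with the task data (one illustrative step).
W_ft = W_pre.copy()
p = 1.0 / (1.0 + np.exp(-(features(X, W_ft) @ w + b)))
g_head = (p - y)[:, None] * w                  # backprop through the head
relu_mask = (X @ W_ft > 0).astype(float)       # ReLU derivative
W_ft -= 0.1 * X.T @ (g_head * relu_mask) / len(y)

print("fine-tuning moved the extractor:", not np.allclose(W_ft, W_pre))
```

The sketch shows why the fixed scheme is cheaper: it backpropagates only through the small head, while fine-tuning must propagate gradients through the (here one-layer, in practice very deep) transferred network.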
In this research, aiming at the best performance on the construction material image classification task, suitable novel approaches are explored. Previous works on construction material image classification were based on hand-designed features, in which prominent image-analysis algorithms were applied to extract the features and classifiers were then selected to classify them. The classification accuracy therefore depends on the manual selection of the feature-extraction algorithm. In our study, a state-of-the-art approach based on transfer learning of pre-trained CNN models/architectures is investigated. A set of construction material images acts as the task-specific dataset in the transfer-learning scenario. The two selected architectures are GoogleNet [1] and ResNet101 [2], pre-trained on the source task of the ImageNet dataset. These two architectures differ both in depth and in layer details.
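The central idea stated in the abstract, training an Autoencoder on top of the transferred features to obtain a better representation, can likewise be illustrated with a tiny numpy sketch. The dimensions and data below are made up for illustration; in the paper the inputs would be the actual GoogleNet or ResNet101 activations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in: each row of F plays the role of the CNN
# features extracted for one image; sizes are tiny for illustration.
F = rng.normal(size=(64, 16))

n_hidden = 4                                     # bottleneck dimension
W1 = rng.normal(scale=0.1, size=(16, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 16)); b2 = np.zeros(16)

def forward(F):
    H = np.tanh(F @ W1 + b1)                     # encoder: compressed code
    R = H @ W2 + b2                              # decoder: reconstruction
    return H, R

def mse(R):
    return np.mean((R - F) ** 2)                 # reconstruction error

mse_before = mse(forward(F)[1])

lr = 0.05
for _ in range(500):                             # gradient descent on MSE
    H, R = forward(F)
    d_R = 2.0 * (R - F) / len(F)                 # dLoss/dR (per-sample mean)
    d_H = (d_R @ W2.T) * (1.0 - H ** 2)          # backprop through tanh
    W2 -= lr * H.T @ d_R; b2 -= lr * d_R.sum(0)
    W1 -= lr * F.T @ d_H; b1 -= lr * d_H.sum(0)

H, R = forward(F)
mse_after = mse(R)
print("reconstruction MSE before/after:", mse_before, mse_after)
# H, the bottleneck code, is the learned lower-dimensional
# representation on which a classifier would then be trained
# instead of the raw transferred features.
```

As reconstruction error falls, the bottleneck is forced to capture the dominant structure of the feature vectors, which is the representation-improvement effect the paper exploits on top of the transferred features.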
Journal of Advances in Information Technology Vol. 11, No. 4, November 2020
© 2020 J. Adv. Inf. Technol. 192
doi: 10.12720/jait.11.4.192-199