Research Article
Metaheuristic Algorithms for Convolution Neural Network
L. M. Rasdi Rere,1,2 Mohamad Ivan Fanany,1 and Aniati Murni Arymurthy1
1Machine Learning and Computer Vision Laboratory, Faculty of Computer Science, Universitas Indonesia, Depok 16424, Indonesia
2Computer System Laboratory, STMIK Jakarta STI&K, Jakarta 12140, Indonesia
Correspondence should be addressed to L. M. Rasdi Rere; laode.mohammad@ui.ac.id
Received 29 January 2016; Revised 15 April 2016; Accepted 10 May 2016
Academic Editor: Martin Hagan
Copyright © 2016 L. M. Rasdi Rere et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
A typical modern optimization technique is usually either heuristic or metaheuristic. Such techniques have managed to solve optimization problems in the research areas of science, engineering, and industry. However, implementation strategies of metaheuristics for improving the accuracy of convolutional neural networks (CNN), a well-known deep learning method, are still rarely investigated. Deep learning is a type of machine learning technique whose aim is to move closer to the goal of artificial intelligence: creating a machine that can successfully perform any intellectual task a human can carry out. In this paper, we propose implementation strategies for three popular metaheuristic approaches, namely, simulated annealing, differential evolution, and harmony search, to optimize CNN. The performance of these metaheuristic methods in optimizing CNN on classifying the MNIST and CIFAR datasets was evaluated and compared. Furthermore, the proposed methods are also compared with the original CNN. Although the proposed methods show an increase in computation time, their accuracy is also improved (by up to 7.14 percent).
1. Introduction
Deep learning (DL) is mainly motivated by research on artificial intelligence, in which the general goal is to imitate the ability of the human brain to observe, analyze, learn, and make decisions, especially for complex problems [1]. The technique lies at the intersection of the research areas of signal processing, neural networks, graphical modeling, optimization, and pattern recognition. The current reputation of DL is largely due to drastic improvements in chip processing power, a significant decrease in the cost of computing hardware, and advances in machine learning and signal processing research [2].
In general, DL models can be classified into discriminative, generative, and hybrid models [2]. Discriminative models include, for instance, CNN, deep neural networks, and recurrent neural networks. Some examples of generative models are deep belief networks (DBN), restricted Boltzmann machines, regularized autoencoders, and deep Boltzmann machines. A hybrid model, on the other hand, refers to a deep architecture that combines a discriminative and a generative model. An example is using a DBN to pretrain a deep CNN, which can improve the performance of the deep CNN over random initialization. Among all of the hybrid DL techniques, metaheuristic optimization for training a CNN is the focus of this paper. As an illustration of the idea, a minimal simulated annealing sketch is given below.
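The following minimal Python sketch is our illustration, not the procedure proposed later in this paper: it applies simulated annealing, one of the three metaheuristics considered here, to a generic loss function. The function name, hyperparameters (initial temperature, cooling rate, step count), and the toy quadratic loss are all illustrative choices; in the paper's setting, the solution vector would hold CNN parameters and the loss would be the classification error.

import math
import random

def simulated_annealing(loss, x0, t0=1.0, cooling=0.95, steps=200):
    # Generic SA loop: perturb the current solution, always accept
    # improvements, and accept worse moves with probability
    # exp(-delta / T) so the search can escape local minima.
    x, fx, t = list(x0), loss(x0), t0
    for _ in range(steps):
        cand = [xi + random.gauss(0, 0.1) for xi in x]  # random neighbor
        fc = loss(cand)
        if fc < fx or random.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
        t *= cooling  # geometric cooling schedule
    return x, fx

# Toy usage: minimize a simple quadratic as a stand-in for a CNN loss.
best, best_loss = simulated_annealing(lambda v: sum(vi**2 for vi in v), [2.0, -3.0])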
Although DL has a proven ability to solve a variety of learning tasks, training it is difficult [3–5]. Some examples of successful methods for training DL are stochastic gradient descent, conjugate gradient, Hessian-free optimization, and Krylov subspace descent.
Stochastic gradient descent (SGD) is easy to implement and fast for problems with many training samples. However, this method needs several manual tuning schemes to make its parameters optimal, and its process is principally sequential; as a result, it is difficult to parallelize with a graphics processing unit (GPU). Conjugate gradient (CG), on the other hand, is easier to check for convergence and more stable to train. Nevertheless, CG is slow, so it needs multiple CPUs and a vast amount of RAM [6].
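To make the tuning and sequentiality issues concrete, the following Python sketch (ours, not from [6]) shows a single vanilla SGD update. The names sgd_step, params, grads, and lr are our illustrative choices; the learning rate lr is exactly the kind of parameter that must be tuned by hand, and successive calls must run in order, which is what makes plain SGD hard to parallelize.

import numpy as np

def sgd_step(params, grads, lr=0.01):
    # One vanilla SGD update: w <- w - lr * dL/dw for every parameter.
    # lr is a manually tuned hyperparameter.
    for w, g in zip(params, grads):
        w -= lr * g
    return params

# Toy usage with a single 3x3 weight matrix and a dummy gradient.
w = np.random.randn(3, 3)
g = 0.1 * np.ones((3, 3))
sgd_step([w], [g], lr=0.05)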
Hessian-free optimization has been applied to train deep autoencoders [7]; it is proficient in handling the underfitting problem and is more efficient than the pretraining + fine-tuning approach.