Evolutionary Structure Optimization of Hierarchical Neural Network for Image
Recognition
SATORU SUZUKI¹ and YASUE MITSUKURA²
¹Tokyo University of Agriculture and Technology, Japan
²Keio University, Japan
SUMMARY
The purpose of this paper is to optimize the structure
of hierarchical neural networks. Here, structure optimization
means representing a neural network with the minimum
number of nodes and connections; it is performed by
eliminating unnecessary connections from a trained neural
network by means of a genetic algorithm. We focus on
neural networks specialized for image recognition
problems. The flow of the proposed method is as follows.
First, the Walsh–Hadamard transform is applied to images
for feature extraction. Second, the neural network is trained
with the extracted features based on a back-propagation
algorithm. After neural network training, unnecessary con-
nections are eliminated from the trained neural network by
means of a genetic algorithm. Finally, the neural network
is retrained to recover from the degradation caused by
connection elimination. In order to validate the usefulness
of the proposed method, face recognition and texture
classification examples are used. The experimental results
indicate that the proposed method generates a compact
neural network while maintaining generalization
performance. © 2012 Wiley Periodicals, Inc. Electron
Comm Jpn, 95(3): 28–36, 2012; Published online in Wiley
Online Library (wileyonlinelibrary.com). DOI
10.1002/ecj.10384
Key words: neural network; genetic algorithm; face
recognition; texture classification.
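The feature-extraction step mentioned in the flow above uses the Walsh–Hadamard transform, which for a signal of length 2^k can be computed in O(n log n) with the fast transform. A minimal sketch of that step (our own illustrative code, not taken from the paper):

```python
def fwht(x):
    """In-place fast Walsh-Hadamard transform (natural/Hadamard order).

    The input length must be a power of two; returns a new list.
    """
    a = list(x)
    h = 1
    while h < len(a):
        # butterfly pass: combine pairs that are h apart within each block
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a
```

For image recognition, each (vectorized) image patch would be transformed this way and the resulting coefficients fed to the neural network as features.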
1. Introduction
A neural network (NN) is a mathematical model
which imitates human brain activity, and is widely used for
pattern recognition, prediction, motion control, and signal
processing. Although NNs have a wide variety of applications,
their performance is strongly influenced by the number of
hidden units. Too few hidden units reduce performance,
since the network cannot adequately learn the relationship
between input and output. Conversely, too many hidden
units degrade generalization performance, since the network
also learns noise contained in the training data.
Furthermore, excess hidden units make it difficult to
analyze and understand the role of each hidden unit, and
they increase computation time.
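The GA-based connection elimination summarized in the abstract can be sketched as a search over binary connection masks. The following toy illustration is our own stand-in (the function names and the simplified fitness, which trades weight magnitude against the number of kept connections, are not from the paper; the real method evaluates the pruned network's recognition error):

```python
import random

def fitness(mask, weights, lam=0.1):
    # Toy stand-in for the paper's objective (recognition performance vs.
    # network size): reward keeping large weights, penalize kept connections.
    kept = sum(abs(w) for w, m in zip(weights, mask) if m)
    return kept - lam * sum(mask)

def evolve(weights, pop_size=20, generations=30, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    n = len(weights)
    # Start from the fully connected mask plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    best = max(pop, key=lambda m: fitness(m, weights))
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            # tournament selection, one-point crossover, bit-flip mutation
            a, b = (max(rng.sample(pop, 3), key=lambda m: fitness(m, weights))
                    for _ in range(2))
            cut = rng.randrange(1, n)
            child = [bit ^ (rng.random() < p_mut) for bit in a[:cut] + b[cut:]]
            children.append(child)
        pop = children
        best = max(pop, key=lambda m: fitness(m, weights))
    return best, fitness(best, weights)
```

On a weight vector such as [2.0, 0.01, 1.5, 0.02] this search tends to keep only the large weights, illustrating how a GA can zero out connections that contribute little.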
To solve this architecture design problem, methods that
first train a network with excess hidden units and then
gradually eliminate unnecessary connections from it have
been proposed [1–5]. As connections are eliminated, some
hidden units lose all of their connections; as a result, the
proper number of hidden units is obtained, and both
generalization performance and computational efficiency
are improved. Optimal brain damage (OBD) [1] is one such
connection elimination method: it estimates the influence
that eliminating each connection would have on the network
and removes the connections whose influence is small.
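Concretely, OBD uses a diagonal second-order approximation of the training error to assign each weight a saliency; in the standard notation (our transcription of the usual OBD formula, not taken from this paper):

```latex
\delta E \approx \frac{1}{2} \sum_k h_{kk}\, \delta u_k^2,
\qquad
s_k = \frac{h_{kk}\, u_k^2}{2},
\qquad
h_{kk} = \frac{\partial^2 E}{\partial u_k^2},
```

where u_k is the k-th weight and E the training error; the connections with the smallest saliency s_k are eliminated first.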
Another connection elimination method is structural
learning with forgetting (SLF) [2, 3]. SLF adds a forgetting
term, the sum of the absolute values of the weights, to the
mean square error (MSE) criterion, so that network training
and connection elimination are performed at the same time.
However, setting the forgetting rate appropriately remains
a problem. To address this, a fuzzy-based method that
adapts the forgetting rate in response to training progress
has been proposed [4], but since there is no explicit basis
for defining its fuzzy rules and membership functions, it
also requires repeated trials under several conditions to
obtain the proper network size. MV regularization, which
learns a threshold for eliminating unnecessary connections,
has also been proposed [5], but it assumes that the proper
number of hidden units is known in advance. In terms of
Translated from Denki Gakkai Ronbunshi, Vol. 131-C, No. 5, May 2011, pp. 983–989