Evolutionary Structure Optimization of Hierarchical Neural Network for Image Recognition

SATORU SUZUKI 1 and YASUE MITSUKURA 2
1 Tokyo University of Agriculture and Technology, Japan
2 Keio University, Japan

SUMMARY

The purpose of this paper is to optimize the structure of hierarchical neural networks. Here, structure optimization means representing a neural network with the minimum number of nodes and connections, and it is performed by eliminating unnecessary connections from a trained neural network by means of a genetic algorithm. We focus on neural networks specialized for image recognition problems. The flow of the proposed method is as follows. First, the Walsh–Hadamard transform is applied to images for feature extraction. Second, the neural network is trained on the extracted features with the back-propagation algorithm. After training, unnecessary connections are eliminated from the trained network by means of a genetic algorithm. Finally, the network is retrained to recover from the degradation caused by connection elimination. To validate the usefulness of the proposed method, face recognition and texture classification tasks are used. The experimental results indicate that the proposed method generates a compact neural network while maintaining generalization performance. © 2012 Wiley Periodicals, Inc. Electron Comm Jpn, 95(3): 28–36, 2012; Published online in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/ecj.10384

Key words: neural network; genetic algorithm; face recognition; texture classification.

1. Introduction

A neural network (NN) is a mathematical model that imitates human brain activity, and is widely used for pattern recognition, prediction, motion control, and signal processing. While NNs have a variety of applications, there is a problem that the performance of NNs is influenced by the number of hidden units.
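As background on the feature-extraction step named in the summary: the Walsh–Hadamard transform can be computed with only additions and subtractions in O(n log n) time. Below is a minimal one-dimensional sketch (illustrative only; the function name is ours, and the paper applies the transform to images, not 1-D signals).

```python
def fwht(x):
    """Fast Walsh-Hadamard transform (unnormalized, natural order).

    len(x) must be a power of two. Each stage combines pairs of
    elements with a butterfly (a + b, a - b), as in a radix-2 FFT
    but without multiplications.
    """
    y = list(x)
    n = len(y)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b  # butterfly step
        h *= 2
    return y
```

For image features, such a transform is typically applied along the rows and then the columns of a 2^k x 2^k block, and a subset of low-order coefficients is kept as a compact feature vector.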
A small number of hidden units would decrease performance, since the network cannot adequately learn the relationship between input and output. Conversely, a large number of hidden units would degrade generalization performance, since the network learns even the noise included in the training data. Furthermore, excess hidden units make it difficult to analyze and understand the role of each hidden unit, and lengthen computation time.

To solve this architecture design problem, NN architecture decision methods have been proposed that first train a network with excess hidden units and then gradually eliminate unnecessary connections from it [1–5]. Eliminating unnecessary connections leaves some hidden units with no connections at all; removing such units yields the proper number of hidden units, and both generalization performance and computational efficiency are improved. Optimal brain damage (OBD) [1] is one such connection elimination method: it estimates the influence that eliminating each connection would have on the network, and eliminates the connections whose influence is small. Another method, structural learning with forgetting (SLF) [2, 3], adds a forgetting term consisting of the absolute sum of the weights to the mean square error (MSE) objective, so that network training and connection elimination are performed at the same time. However, how the forgetting rate should be set remains a problem. To address it, a fuzzy-based method that adapts the forgetting rate to training progress has been proposed [4]. But since there is no explicit basis for defining the fuzzy rules and membership functions, it too requires repeated trials under several conditions to obtain a properly sized network.
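The forgetting term in SLF is an L1 penalty, epsilon * sum(|w|), added to the MSE; its subgradient contributes epsilon * sign(w) to each weight update, which is what steadily decays weights the error term does not actively support. A minimal per-weight update sketch (names and the values of the learning and forgetting rates are ours, for illustration only):

```python
def slf_update(w, grad_mse, lr=0.1, eps=1e-4):
    """One structural-learning-with-forgetting step for a single weight.

    Descends on MSE + eps * |w|: grad_mse is the MSE gradient for this
    weight, and eps * sign(w) is the forgetting term's subgradient.
    Unnecessary weights drift toward zero and can then be pruned.
    """
    sign = (w > 0) - (w < 0)  # sign(w); 0 at exactly w == 0
    return w - lr * (grad_mse + eps * sign)
```

With eps = 0, this reduces to plain gradient descent; larger eps prunes more aggressively at the cost of fitting error, which is exactly the trade-off behind the forgetting-rate selection problem described above.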
Although MV regularization, which learns the threshold for elimination of unnecessary connections, has been proposed [5], it assumes that the proper number of hidden units is known in advance. In terms of

[Translated from Denki Gakkai Ronbunshi, Vol. 131-C, No. 5, May 2011, pp. 983–989.]