Accelerating the Multilayer Perceptron Learning
With The Davidon Fletcher Powell Algorithm *
Sabeur Abid, Aymen Mouelhi, Farhat Fnaiech, Senior Member, IEEE
ESSTT (CEREP), 5 Av. Taha Hussein, 1008 Tunis, Tunisia
Email: Farhat.Fnaiech@esstt.rnu.tn; abid2f@yahoo.fr; aymen_mouelhi@yahoo.fr
Abstract:
In this paper, the Davidon-Fletcher-Powell (DFP) algorithm
for nonlinear least squares is proposed for training the
multilayer perceptron (MLP). Applied to both a single-output-
layer perceptron and an MLP, this algorithm is found to be
faster than the Marquardt-Levenberg (ML) algorithm, until
now regarded as the fastest algorithm for training MLPs. The
number of iterations the DFP algorithm requires to converge
is less than about half of that required by the ML algorithm.
Interpretations of these results are provided in the paper.
Keywords:
Multilayer Perceptron (MLP), Marquardt-Levenberg (ML)
algorithm, Davidon-Fletcher-Powell (DFP) algorithm.
1. INTRODUCTION
The traditional method for training a multilayer perceptron is
the standard backpropagation (SBP) algorithm. It suffers from
slow convergence: many iterations are required to train even a
small network on a simple problem. Much research has been
devoted to accelerating the convergence of the algorithm, and
this work falls roughly into two approaches. The first is based
on first-order optimization techniques, such as varying the
learning rate of the gradient method used in SBP, using
momentum, and rescaling variables [6]-[7]. The second
approach applies second-order optimization methods to MLP
training. The most popular techniques in this category use
conjugate gradient or quasi-Newton (secant) methods. The
quasi-Newton methods are modified versions of the Newton
method, which is known for its high convergence speed. These
optimization techniques, such as the ML and DFP methods,
rely on approximations of the Hessian matrix used in the
Newton method and are considered more efficient, but their
storage and computational requirements grow with the square
of the size of the network.
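To illustrate this quadratic cost concretely: a quasi-Newton method such as DFP maintains a dense n-by-n approximation of the inverse Hessian, where n is the number of weights. A minimal sketch of the standard DFP rank-two update (variable names are ours, not the paper's):

```python
import numpy as np

def dfp_update(H, s, y):
    # DFP rank-two update of the inverse-Hessian approximation H:
    #   H+ = H + s s^T / (s^T y) - (H y)(H y)^T / (y^T H y)
    # where s is the step taken and y is the change in the gradient.
    # H is n x n, so storage grows with the square of the weight count.
    Hy = H @ y
    return H + np.outer(s, s) / (s @ y) - np.outer(Hy, Hy) / (y @ Hy)
```

By construction the updated matrix satisfies the secant condition H+ y = s and stays symmetric, which is what makes it a usable Newton-direction surrogate.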
In [4], a quasi-Newton method, the Broyden-Fletcher-
Goldfarb-Shanno (BFGS) method, was applied to train the
MLP, and it was found to converge much faster than the
standard backpropagation algorithm; the potential drawback
of the BFGS method is that it requires a lot of memory to
store the Hessian matrix. In [3], the Marquardt-Levenberg
algorithm was applied to train feedforward neural networks,
and simulation results on several problems showed that it is
much faster than the conjugate gradient and variable learning
rate algorithms. The main drawbacks of this algorithm are its
high computational complexity and its sensitivity to the
initial choice of the parameter µ in the update of the Hessian
matrix.

* Part of this work was published at the 3rd International
Symposium on Image and Signal Processing and Analysis,
September 18-20, 2003, Rome, Italy.
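The role of µ in the ML step can be sketched as follows: it damps the Gauss-Newton approximation JᵀJ of the Hessian, interpolating between gradient descent (large µ) and Gauss-Newton (small µ). A minimal sketch under these standard definitions (names are illustrative, not from the paper):

```python
import numpy as np

def lm_step(J, e, mu):
    # Marquardt-Levenberg weight update for residual vector e with
    # Jacobian J (residuals w.r.t. weights):
    #   dw = -(J^T J + mu * I)^{-1} J^T e
    # Large mu -> small, gradient-descent-like steps;
    # small mu -> a full Gauss-Newton step.
    n = J.shape[1]
    return -np.linalg.solve(J.T @ J + mu * np.eye(n), J.T @ e)
```

A poor initial µ either slows convergence (too large) or makes the step ill-conditioned (too small), which is the sensitivity noted above.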
This paper presents the application of a new algorithm based
on the DFP method to train an MLP. The DFP algorithm
approximates the Hessian matrix (as quasi-Newton methods
do); it is first used to train a single-output-layer perceptron
and then to train an MLP. In previous work [2], S. Abid et al.
applied the DFP algorithm to train only the output layer of
the MLP, while the hidden layers were trained with the
standard backpropagation algorithm. In this work we apply
the DFP algorithm to update the synaptic weights of all
layers. The new algorithm is compared with the ML one.
Note that there is no need to compare the proposed algorithm
with the SBP one, because the latter has a very slow rate of
convergence, being based on a first-order optimization
method, and all second-order optimization methods are
obviously much faster. Nevertheless, a brief review of the
SBP technique is useful for deriving the equations of the
other algorithms.
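For reference, the first-order SBP update amounts to plain gradient descent on the output error; a minimal sketch (the learning-rate name eta and the function name are ours, not the paper's):

```python
def sbp_update(w, grad, eta=0.1):
    # Standard backpropagation update: move each weight against its
    # error gradient, scaled by the fixed learning rate eta.
    return [wi - eta * gi for wi, gi in zip(w, grad)]
```

Because the step uses only first-order information, a small fixed eta is needed for stability, which is the root of SBP's slow convergence relative to the second-order methods discussed above.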
This paper is organized as follows. In Section II, we briefly
present the basic standard backpropagation (SBP) algorithm.
The Davidon-Fletcher-Powell algorithm is presented in
Section III. In Section IV, we present the DFP algorithm for
training the MLP. In Section V, simulation results comparing
the new algorithm with the ML one are given; finally,
Section VI contains a summary and conclusions.
Fig.1: Fully connected feedforward multilayer perceptron
0-7803-9490-9/06/$20.00 ©2006 IEEE
2006 International Joint Conference on Neural Networks,
Sheraton Vancouver Wall Centre Hotel, Vancouver, BC,
Canada, July 16-21, 2006