Faster Convergence of BP Network with Hybridization of Improved Cost Function and Control Memory Adaptation

Shafaatunnur Hasan
Soft Computing Research Group
K-Economy Research Alliance
Universiti Teknologi Malaysia, Skudai, Malaysia
Email: shafaatunnur@gmail.com

Siti Mariyam Hj. Shamsuddin
Soft Computing Research Group
K-Economy Research Alliance
Universiti Teknologi Malaysia, Skudai, Malaysia
Email: mariyam@utm.my
Abstract-To address the weaknesses of Neural Network (NN) learning, this paper proposes an alternative approach to enhancing NN learning by integrating an improved cost function with control adaptation of the nodes and address memory. As is commonly known, weight adjustment in NNs, particularly in the Backpropagation (BP) algorithm, involves the connections between neurons, the activation function used by the neurons, the learning algorithm that specifies the procedure for adjusting the weights, and the cost function. The cost function of BP is minimized through its derivatives, and these derivatives determine how successfully the network can be trained with an error function that matches the objective of the problem at hand. Accordingly, a weight-governance concept with a control mechanism between the input and hidden layers, together with unit offsets of the hidden layer, is implemented to alleviate the problems of BP learning. The address memory part of the network retains the output pattern of the hidden layer. Subsequently, the output patterns are compared with the input pattern and propagated back to the output layer after learning. Our experiments show that these mechanisms, combined with the improved cost function, yield promising results with faster convergence rates.
Keywords- Neural network; cost function; control adaptation; address memory; classification problems
I. INTRODUCTION
Despite existing for almost three decades, the Backpropagation (BP) algorithm remains popular and is used in many applications. BP works by calculating the first derivatives, or gradient, of the error function required by some optimization methods. It is certainly not the only method for estimating the gradient, but it is the most efficient [1]. The major limitations of this algorithm are the existence of temporary local minima, resulting from the saturation behavior of the activation function, and slow rates of convergence [2]. From 1988 to date, researchers have made many modifications to the standard BP algorithm to overcome these problems. A number of approaches have been implemented to improve the speed of convergence; they are based primarily on the selection of a better cost function, dynamic variation of the learning rate and momentum, and the selection of a better activation function and error function.
Rumelhart was the first to employ the BP learning algorithm for the multilayer perceptron [3]. However, BP has disadvantages: the weights and number of neurons change according to the input and output patterns or the learning method; its learning speed is slow, which makes the weights and unit offsets difficult to control when errors occur during learning; and the learning rate is sensitive to the initial weights [4]. Moreover, the error signal of an output unit can be 0 not only when the network output equals the target value, but also when the network output saturates at 0 or 1. The error signal from the output layer then tends to 0 and the network loses its learning ability.
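This saturation effect can be seen directly from the output-layer delta of standard BP with the squared-error cost and a logistic sigmoid activation. The following minimal sketch (illustrative values, not from the paper) shows the error signal vanishing at saturated outputs:

```python
# Output-layer error signal of standard BP with the squared-error
# cost and a logistic sigmoid activation:
#   delta = (target - output) * f'(net),  where f'(net) = output * (1 - output)
def output_delta(target, output):
    return (target - output) * output * (1.0 - output)

# A mid-range output gives a usable error signal ...
print(output_delta(1.0, 0.5))          # → 0.125
# ... but a saturated unit yields an almost-zero signal even though
# the prediction is maximally wrong, so learning stalls.
print(output_delta(1.0, 1e-6))         # ≈ 1e-06
print(output_delta(0.0, 1.0 - 1e-6))   # ≈ -1e-06
```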
In this paper, we propose an alternative method for enhancing BP learning by incorporating an improved cost function with modular control of the hidden layer and an address memory part. The rest of the paper is organized as follows. Section 2 gives a brief introduction to Artificial Neural Networks and the BP learning algorithm. Section 3 discusses some modifications of standard BP with an improved cost function. Section 4 describes the alternative approach to BP learning enhancement. Finally, Section 5 concludes the paper.
II. RELATED STUDIES ON BP LEARNING
The BP algorithm is popular and used in many applications. BP is a method for calculating the first derivatives, or gradient, of the cost function required by some optimization methods. It is certainly not the only method for estimating the gradient, but it is the most efficient [1]. The major limitations of this algorithm are the existence of temporary local minima, resulting from the saturation behavior of the activation function, and slow rates of convergence [2]. From 1988 to date, researchers have made many modifications to the standard BP algorithm, also known as two-term BP, to overcome these problems. A number of approaches have been implemented to improve the convergence speed; they are based primarily on dynamic variation of the learning rate and momentum, and the selection of a better activation function and cost function (Table 1).
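As a concrete reference point for the two-term BP discussed above, the following sketch trains a one-hidden-layer network on XOR with the squared-error cost, using the two standard terms: a learning-rate step and a momentum term. The network size and hyperparameters are illustrative choices, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy classification problem: XOR
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

# One hidden layer with 4 units (illustrative size)
W1 = rng.normal(0.0, 0.5, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 0.5, (4, 1)); b2 = np.zeros(1)

eta, alpha = 0.5, 0.8            # learning rate and momentum (the "two terms")
vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)

losses = []
for epoch in range(5000):
    # Forward pass
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)
    losses.append(float(np.mean((T - Y) ** 2)))
    # Backward pass: delta = error * f'(net), with f'(net) = y * (1 - y)
    dY = (T - Y) * Y * (1.0 - Y)
    dH = (dY @ W2.T) * H * (1.0 - H)
    # Two-term update: gradient step scaled by eta, plus momentum alpha
    vW2 = eta * (H.T @ dY) + alpha * vW2; W2 += vW2
    vb2 = eta * dY.sum(axis=0) + alpha * vb2; b2 += vb2
    vW1 = eta * (X.T @ dH) + alpha * vW1; W1 += vW1
    vb1 = eta * dH.sum(axis=0) + alpha * vb1; b1 += vb1

print(f"MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The momentum term alpha * v carries over a fraction of the previous update, which is the classic remedy for the slow convergence and oscillation that the modifications surveyed in Table 1 target.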
2010 Fourth Asia International Conference on Mathematical/Analytical Modelling and Computer Simulation
978-0-7695-4062-7/10 $26.00 © 2010 IEEE
DOI 10.1109/AMS.2010.38