Faster Convergence of BP Network with Hybridization of Improved Cost Function and Control Memory Adaptation

Shafaatunnur Hasan
Soft Computing Research Group
K-Economy Research Alliance
Universiti Teknologi Malaysia, Skudai, Malaysia
Email: shafaatunnur@gmail.com

Siti Mariyam Hj. Shamsuddin
Soft Computing Research Group
K-Economy Research Alliance
Universiti Teknologi Malaysia, Skudai, Malaysia
Email: mariyam@utm.my
Abstract-To address the weaknesses of Neural Network (NN) learning, this paper proposes an alternative approach to enhancing NN learning by integrating an improved cost function with control adaptation of the nodes and address memory. As is commonly known, weight adjustment in NNs, particularly in the Backpropagation (BP) algorithm, involves the connections between neurons, the activation function used by the neurons, the learning algorithm that specifies the procedure for adjusting the weights, and the cost function. The cost function of BP is minimized through its derivatives, and these derivatives determine how successfully the network can be trained with an error function that matches the objective of the problem at hand. Accordingly, a weight-governance concept with a control mechanism between the input and hidden layers, together with unit offsets of the hidden layer, is implemented to alleviate the problems of BP learning. The address memory part of the network retains the output pattern of the hidden layer. Subsequently, the output patterns are compared with the input pattern and propagated back to the output layer after learning. Our experiments show that these mechanisms, combined with the improved cost function, yield promising results with faster convergence rates.
Keywords- Neural network; cost function; control adaptation; address memory; classification problems
I. INTRODUCTION
Despite existing for almost three decades, the Backpropagation (BP) algorithm remains popular and is used in many applications. BP works by calculating the first derivatives, or gradient, of the error function required by some optimization methods. It is certainly not the only method for estimating the gradient, but it is the most efficient [1]. The major limitations of this algorithm are the existence of temporary local minima, resulting from the saturation behavior of the activation function, and slow rates of convergence [2]. From 1988 to date, researchers have made many modifications to the standard BP algorithm to overcome these problems. A number of approaches have been implemented to improve the speed of convergence; they are based primarily on the selection of a better cost function, dynamic variation of the learning rate and momentum, and the selection of a better activation function and error function.
Rumelhart was the first to employ the BP learning algorithm for the multilayer perceptron [3]. However, BP has disadvantages: the weights and number of neurons change according to the input and output patterns or the learning method; its learning speed is slow, which makes the weights and unit offsets difficult to control when errors occur during learning; and the learning rate is sensitive to the initial weights [4]. Moreover, the error signal of an output unit can be 0 not only when the network output equals the target value, but also when the network output saturates at 0 or 1. The error signal from the output layer then tends to 0 and the network loses its learning ability.
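This saturation effect can be seen directly from the output-layer delta of standard BP with the squared-error cost and a logistic sigmoid activation. The following minimal sketch (illustrative values, not from the paper) shows the error signal vanishing at saturated outputs:

```python
# Output-layer error signal of standard BP with the squared-error
# cost and a logistic sigmoid activation:
#   delta = (target - output) * f'(net),  where f'(net) = output * (1 - output)
def output_delta(target, output):
    return (target - output) * output * (1.0 - output)

# A mid-range output gives a usable error signal ...
print(output_delta(1.0, 0.5))          # → 0.125
# ... but a saturated unit yields an almost-zero signal even though
# the prediction is maximally wrong, so learning stalls.
print(output_delta(1.0, 1e-6))         # ≈ 1e-06
print(output_delta(0.0, 1.0 - 1e-6))   # ≈ -1e-06
```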
In this paper, we propose an alternative method for enhancing BP learning by incorporating an improved cost function with modular control of the hidden layer and an address memory part. The rest of the paper is organized as follows. Section 2 gives a brief introduction to Artificial Neural Networks and the BP learning algorithm. Section 3 discusses some modifications of standard BP with an improved cost function. Section 4 describes the alternative approach to BP learning enhancement. Finally, Section 5 concludes the paper.
II. RELATED STUDIES ON BP LEARNING
The BP algorithm is popular and used in many applications. BP is a method for calculating the first derivatives, or gradient, of the cost function required by some optimization methods. It is certainly not the only method for estimating the gradient, but it is the most efficient [1]. The major limitations of this algorithm are the existence of temporary local minima, resulting from the saturation behavior of the activation function, and slow rates of convergence [2]. From 1988 to date, researchers have made many modifications to the standard BP algorithm, also known as two-term BP, to overcome these problems. A number of approaches have been implemented to improve the convergence speed; they are based primarily on dynamic variation of the learning rate and momentum, and the selection of a better activation function and cost function (Table 1).
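As a concrete reference point for the two-term BP discussed above, the following sketch trains a one-hidden-layer network on XOR with the squared-error cost, using the two standard terms: a learning-rate step and a momentum term. The network size and hyperparameters are illustrative choices, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy classification problem: XOR
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

# One hidden layer with 4 units (illustrative size)
W1 = rng.normal(0.0, 0.5, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 0.5, (4, 1)); b2 = np.zeros(1)

eta, alpha = 0.5, 0.8            # learning rate and momentum (the "two terms")
vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)

losses = []
for epoch in range(5000):
    # Forward pass
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)
    losses.append(float(np.mean((T - Y) ** 2)))
    # Backward pass: delta = error * f'(net), with f'(net) = y * (1 - y)
    dY = (T - Y) * Y * (1.0 - Y)
    dH = (dY @ W2.T) * H * (1.0 - H)
    # Two-term update: gradient step scaled by eta, plus momentum alpha
    vW2 = eta * (H.T @ dY) + alpha * vW2; W2 += vW2
    vb2 = eta * dY.sum(axis=0) + alpha * vb2; b2 += vb2
    vW1 = eta * (X.T @ dH) + alpha * vW1; W1 += vW1
    vb1 = eta * dH.sum(axis=0) + alpha * vb1; b1 += vb1

print(f"MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The momentum term alpha * v carries over a fraction of the previous update, which is the classic remedy for the slow convergence and oscillation that the modifications surveyed in Table 1 target.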
2010 Fourth Asia International Conference on Mathematical/Analytical Modelling and Computer Simulation
978-0-7695-4062-7/10 $26.00 © 2010 IEEE
DOI 10.1109/AMS.2010.38