CONSCIENCE ALGORITHM IN NEURAL NETWORK

Geok See Ng, Loo See Tan
Nanyang Technological University, School of Computer Engineering, Division of Software Systems, Nanyang Avenue, Singapore 639798

ABSTRACT

A type of network called the Contender Network (CN) was earlier proposed by Ng, Erdogan and Ng [5]. In CN, a classification algorithm assigns weighted votes as a monotonically decreasing function of rank. In this paper, a modification to the CN classification algorithm, known as the conscience algorithm [2], is presented. However, a new problem was encountered when the conscience algorithm is used in CN. We name this the saturation problem (i.e. the problem that arises when the saturation stage of the neural network is reached). The saturation problem is solved by introducing a count threshold. The threshold is determined rigorously through many experiments, based on the accuracy, error and confusion rates of the network's performance. We present experiments showing that the conscience algorithm, applied during training with an appropriate count threshold, can improve network performance. Experimental results of this approach are presented and discussed through the application of the neural network to digit classification.

1. INTRODUCTION

The Contender Network (CN) [5] uses a supervised competitive learning method. In this paper, we present a modification to the CN by introducing an algorithm known as the conscience algorithm [2], which improves the network's performance in digit classification. The basic concept of the CN is described in this section, and its modification using the conscience algorithm is described in the next section. CN is a 3-layer (input layer, hidden layer and output layer) feedforward network. It is a K-nearest-neighbour classifier.
The conventional K-nearest-neighbour algorithm [6] is as follows: given an input vector x, find the k nearest neighbours of x, count the number of nearest neighbours belonging to each class, and assign x to the class with the largest number of nearest neighbours. CN's classification, however, does not depend on the number of nearest neighbours but on their weighted votes. Among the k nearest neighbours, CN finds the t contenders that are not far from the winner (i.e. the nearest neighbour of the input vector). These t contenders compete with the winner during classification. Among the t contenders there are strong and ordinary contenders: strong contenders are those that strongly oppose the winner, and the rest are ordinary contenders [5]. Weighted votes are then assigned to these t contenders. The output class is the class with the highest sum of votes from the t contenders during the meeting. CN learns by growing: the network adds a hidden node whenever a new cluster is to be formed. When no new hidden node is added, CN updates the strong contender list of the hidden node. Adding/removing a strong contender to/from the strong contender list of a hidden node suppresses/enhances the competitiveness of that hidden node in the meeting during classification. CN extends the classification process by holding a classification competition among the t contenders found from the k nearest neighbours. One of the objectives of CN is to overcome sensitivity to scale. Instead of using a fixed distance to identify the top k references closest to the input x, a factor γ as defined in Equation 1 is used:

d(x, n_i) ≤ γ · d(x, n_1)   (1)

where 1.0 < γ ≤ 2.0, d(·) is the distance between vectors, n_i is the ith closest reference for i = 2, 3, ..., k, and n_1 is the reference closest to the input pattern. Sensitivity to scale is overcome by the use of the two parameters k and γ.
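The contender-selection rule of Equation 1 can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function and variable names are our own, and Euclidean distance is assumed for d(·).

```python
import numpy as np

def find_contenders(x, references, k=5, gamma=2.0):
    """Select the winner and the t contenders among the k nearest references.

    A reference n_i (i >= 2) qualifies as a contender when it satisfies
    Equation (1): d(x, n_i) <= gamma * d(x, n_1). Names here are
    illustrative, not taken from the paper.
    """
    # Euclidean distances from x to every stored reference vector
    dists = np.linalg.norm(references - x, axis=1)
    order = np.argsort(dists)[:k]      # indices of the k nearest references
    d1 = dists[order[0]]               # distance to the winner n_1
    # Keep those of the remaining k-1 references that satisfy Equation (1)
    contenders = [i for i in order[1:] if dists[i] <= gamma * d1]
    return order[0], contenders        # winner index, contender indices
```

With γ = 2, a reference survives only if it is at most twice as far from x as the winner is, so the number of contenders t adapts to the local scale of the data rather than to a fixed distance cut-off.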
Only t of the k references (t ≤ k) satisfy Equation 1. These t references are called contenders, since they contend for the input pattern's class membership during classification. The class membership of the input vector x is a function of these t references and x itself, i.e.

class_of(x) = f(x, n_1, n_2, ..., n_t)   (2)

In the experiments, γ is set to 2 so that references whose distance is too great are not considered as contenders, and k is set to 5, since this is 50% of the 10 possible output classes. The weighted vote of the ith closest reference n_i, for i > 1, is defined by:

weighted_vote(n_i) = { s,             if n_i is a strong contender of n_1
                     { 1.1 − 0.1 · i, if n_i is an ordinary contender of n_1   (3)

where s is the weighted vote assigned to a strong contender. In our work, the weighted vote ranges from 0 to 1. The
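The voting scheme of Equations 2 and 3 can be sketched as below. This is a simplified illustration under our own assumptions: the value s = 0.5 for strong contenders is an arbitrary example, not the paper's setting, and the winner's own vote is omitted for brevity.

```python
def weighted_vote(i, is_strong, s=0.5):
    """Weighted vote of the i-th closest reference n_i (i >= 2), per Equation (3).

    Ordinary contenders receive 1.1 - 0.1*i, a monotonically decreasing
    function of the rank; strong contenders receive the vote s. The value
    s = 0.5 is an illustrative choice, not taken from the paper.
    """
    return s if is_strong else 1.1 - 0.1 * i

def classify(contenders):
    """Sum the contenders' votes per class and return the winning class.

    `contenders` is a list of (rank_i, class_label, is_strong) tuples,
    one per contender found via Equation (1).
    """
    totals = {}
    for i, label, strong in contenders:
        totals[label] = totals.get(label, 0.0) + weighted_vote(i, strong)
    return max(totals, key=totals.get)
```

For k = 5, the ordinary-contender votes fall from 0.9 (i = 2) to 0.6 (i = 5), so closer references carry more weight in the meeting, consistent with votes lying in the range 0 to 1.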