"Clearning" Neural Networks with Continuity Constraint for Prediction of Noisy Time Series Benyang Tang, William Hsieh, and Fred Tangang Dept. of Earth and Ocean Sciences, University of British Columbia Vancouver, Canada, V6T 1Z4 tang@ocgy.ubc.ca To appear in Conference Proceeding of ICONIP96 , Hong Kong, 1996, (Springer-Verlag, 1996) Abstract ––Neural networks with "clearning" and continuity constraints are described. When a "clearning" neural network is trained, not only the weights, but also the input to the network are adjusted, to minimize a cost function consisting of three terms: The first term measures the difference between the network output and the data (the output constraint), the second term measures the difference between the network input and the data (the input constraint), and the third term measures the difference between the network output and the network input of the next step (the continuity constraint). Both the in- sample and out-sample tests on the Mackey-Glass time series show that the new network gives better performance than a traditional neural network when there is noise in the time series. 1 Introduction to "Clearning" Normally, when a neural network is trained, only the network weights are adjusted to minimize a cost function which measures only the difference between the network output and the data. Input data are fed directly into the neural network without modification, implying an assumption that the input data are error free. However, in many applications involving noisy data, this assumption does not hold. Here we use a simple curve fitting to illustrate the problem. Let ~ x t and ~ y t (t=1,...,T) be the observations of two variables. We want to use a neural network to find a functional relationship y t =f(x t , w) between the two variables, where w are the adjustable weights. 
[Figure 1: scatter of the observations $(\tilde{x}_t, \tilde{y}_t)$ in the x-y plane]

[Figure 2: scatter of the observations $(\tilde{x}_t, \tilde{y}_t)$ in the x-y plane]

The backpropagation training of a traditional neural network minimizes the following cost function,

$$J = \frac{1}{2} \sum_t \left( f(\tilde{x}_t, w) - \tilde{y}_t \right)^2, \qquad (1)$$
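The traditional cost (1) can be evaluated as follows; only the weights w are free, and the observed inputs enter unchanged. The linear stand-in model is an assumption for illustration, since the paper's f is a neural network.

```python
import numpy as np

def cost_traditional(w, x_obs, y_obs, f):
    """Cost (1): J = 1/2 * sum_t (f(x_t, w) - y_t)^2.

    The observed inputs x_obs are fed in as-is; only w is adjusted.
    """
    return 0.5 * np.sum((f(x_obs, w) - y_obs) ** 2)

def linear_f(x, w):
    """Stand-in model (an assumption): f(x) = w0 * x + w1."""
    return w[0] * x + w[1]
```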