PERGAMON
Mathematical and Computer Modelling 35 (2002) 259-271
www.elsevier.com/locate/mcm

Generalization and Learning Error for Nonlinear Perceptron

M. SHCHERBINA
Institute for Low Temperature Physics, Ukraine Academy of Science
47 Lenin Avenue, Kharkov, Ukraine

B. TIROZZI
Department of Physics, Rome University "La Sapienza"
5, p-za A. Moro, Rome, Italy

(Received and accepted June 2001)

Abstract—A rigorous derivation of the asymptotic behaviour of the learning and prediction error for the nonlinear perceptron is presented. The saddle-point method is used for evaluating these quantities. © 2002 Elsevier Science Ltd. All rights reserved.

Keywords—Perceptron, Law of large numbers, Saddle point, Learning.

1. INTRODUCTION

Neural networks (NN) have proved very useful in many tasks of data analysis. Almost any problem of modelling a set of data {(x^(μ), y^(μ))}, μ = 1, ..., p, as well as predicting new data, has been successfully solved by back-propagation neural networks. For example, if we have a stochastic process z(t), we can take as an input vector x = (z(t-1), z(t-2), ..., z(t-n)) and as an output y = z(t), the value of the process at time t (here the time t plays the role of μ). This kind of NN is widely used and applied in many fields, such as meteorology [1], geology [2], economy [3], and recognition processes [4]. Back propagation for the two-layer perceptron is an algorithm inspired by the adaptation process of the brain when solving particular tasks. It has long been known that the synaptic weights of the brain change during human growth. Each group of synaptic weights of neurons changes in such a way that an elementary task, e.g., recognition of an object situated at a certain angle in the plane of observation, is solved.
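The time-series setup described above, with input window x = (z(t-1), ..., z(t-n)) and target y = z(t), can be sketched as follows. This is an illustrative construction only; the function name make_windows and the example series are assumptions, not part of the paper.

```python
def make_windows(z, n):
    """Build (input, target) pairs from a scalar series z.

    Each input is the window (z(t-1), z(t-2), ..., z(t-n)), most recent
    value first, as in the text; the target is z(t).
    """
    xs, ys = [], []
    for t in range(n, len(z)):
        xs.append([z[t - k] for k in range(1, n + 1)])  # (z(t-1), ..., z(t-n))
        ys.append(z[t])                                  # target z(t)
    return xs, ys
```

For the series z = (0, 1, 2, ..., 9) with n = 3, the first pair is the input (2, 1, 0) with target 3, and p = 7 pairs are produced in total.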
The use of sigmoid functions as an input-output relationship and the terminology of neurons and synaptic weights come from the analogy with real neurons. For NN used in solving the data analysis problem (also called artificial neural networks, ANN), an architecture is also introduced. There is a first layer of n input neurons; i.e., suppose that our data are n-dimensional vectors x^(μ) = (x_1^(μ), ..., x_n^(μ)), and to each neuron j there is associated an input datum x_j^(μ). Then the output neuron is connected through a synaptic weight w_j with the input neuron j, and it receives as total synaptic input the sum Σ_j w_j x_j^(μ) = (x^(μ), w); the output of this neuron is
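The forward pass just described, a weighted sum (x^(μ), w) followed by a sigmoid nonlinearity, can be sketched as below. The logistic function σ(u) = 1/(1 + e^(-u)) is an assumed choice; the paper's analysis concerns a general nonlinear input-output function.

```python
import math

def sigmoid(u):
    # Logistic sigmoid, an assumed concrete choice of the nonlinear
    # input-output function discussed in the text.
    return 1.0 / (1.0 + math.exp(-u))

def perceptron_output(x, w):
    # Total synaptic input (x, w) = sum_j w_j * x_j, then the nonlinearity.
    u = sum(wj * xj for wj, xj in zip(w, x))
    return sigmoid(u)
```

With zero total synaptic input the output is exactly 1/2, and the output always lies strictly between 0 and 1, as expected for a sigmoid unit.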