Associative Neural Network

IGOR V. TETKO $

Institute for Bioinformatics, MIPS, GSF, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany and Biomedical Department, Institute of Bioorganic and Petroleum Chemistry, Ukrainian Academy of Sciences, Murmanskaya 1, Kiev-660, 253660, Ukraine. e-mail: itetko@vcclab.org

Abstract. An associative neural network (ASNN) is a combination of an ensemble of feed-forward neural networks and the k-nearest neighbor technique. The introduced network uses the correlation between ensemble responses as a measure of distance among the analyzed cases for the nearest neighbor technique and provides an improved prediction by bias correction of the neural network ensemble, both for function approximation and classification. In essence, the proposed method corrects the bias of a global model for a considered data case by analyzing the biases of its nearest neighbors, determined in the space of the calculated models. An associative neural network has a memory that can coincide with the training set. If new data become available, the network can provide a reasonable approximation of such data without a need to retrain the neural network ensemble. Applications of ASNN to the prediction of lipophilicity of chemical compounds and to the classification of the UCI letter and satellite data sets are presented. The developed algorithm is available on-line at http://www.virtuallaboratory.org/lab/asnn.

Key words. associative memory, bias correction, classification, function approximation, k-nearest neighbors, memory-based methods, memoryless, prototype selection

1. Introduction

The traditional multilayer perceptron (MLP) is a memoryless approach. This means that after training is complete, all information about the input patterns is stored in the neural network weights and the input data are no longer needed, i.e. there is no explicit storage of any presented example in the system.
Contrary to that, such methods as the k-nearest-neighbors (KNN) (e.g., [1]), the Parzen-window regression (e.g., [2]), etc. represent the memory-based approaches. These approaches keep in memory the entire database of examples, and their predictions are based on some local approximation of the stored examples. The neural networks can be considered global models, while the other two approaches are usually considered local models [3].

Consider a problem of multivariate function approximation from examples, i.e. finding a mapping R^m -> R^n from a given set of sampling points. For simplicity, let us assume that n = 1. A global model provides a good approximation of the global metric of the input data space R^m. However, if the analyzed function, f, is too

$ Address for correspondence: Dr. Igor Tetko, Institute for Bioinformatics, GSF - Forschungszentrum für Umwelt und Gesundheit, GmbH, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany.

Neural Processing Letters 16: 187–199, 2002. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.
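The bias-correction scheme summarized in the abstract can be sketched in code. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `asnn_predict`, the toy ensemble of scalar models, and the use of Pearson correlation (via `numpy.corrcoef`) as the similarity measure are all choices made here for compactness. The key idea it demonstrates is that neighbors are found in the space of ensemble responses, not in the input space, and that the ensemble's global prediction is then shifted by the mean bias of those neighbors.

```python
import numpy as np

def asnn_predict(query_x, models, train_X, train_y, k=3):
    """Sketch of ASNN-style bias correction: the ensemble's mean
    prediction for a query is corrected by the average bias of the
    query's nearest neighbors, where 'nearest' is measured by the
    correlation of ensemble responses rather than input distance."""
    # Responses of every ensemble member for each stored training case
    # (shape: n_train x n_models) and for the query (n_models,).
    train_resp = np.array([[m(x) for m in models] for x in train_X])
    query_resp = np.array([m(query_x) for m in models])

    # Correlation between ensemble response vectors defines similarity
    # in the space of the calculated models.
    sims = np.array([np.corrcoef(query_resp, r)[0, 1] for r in train_resp])
    neighbors = np.argsort(-sims)[:k]          # k most correlated cases

    # Global (ensemble-mean) predictions and their biases on the memory set.
    train_pred = train_resp.mean(axis=1)
    biases = train_y[neighbors] - train_pred[neighbors]

    # Corrected prediction = global prediction + local mean bias.
    return query_resp.mean() + biases.mean()

# Toy usage: three linear "networks" whose average underestimates the
# target function y = x + 1; the neighbor biases repair the offset.
models = [lambda x: 1.0 * x, lambda x: 2.0 * x, lambda x: 0.5 * x]
train_X = [1.0, 2.0, 3.0]
train_y = np.array([2.0, 3.0, 4.0])            # y = x + 1 on the memory set
print(asnn_predict(2.0, models, train_X, train_y))  # ~3.0, the true value
```

Because the correction consults the stored examples at prediction time, new data can be added to the memory (the `train_X`/`train_y` arrays here) and immediately influence predictions without retraining the ensemble, which mirrors the property claimed in the abstract.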