SMART ARCHIVE FOR ON-LINE LEARNING SYSTEMS

Kauko Väinämö, Juha Röning, Perttu Laurinen
Machine Vision and Media Processing Group, Infotech Oulu, Department of Electrical Engineering, University of Oulu, FIN-90570 Oulu, Finland
email: verneri@ee.oulu.fi, jjr@ee.oulu.fi, perttu@ee.oulu.fi

ABSTRACT

The amount of information grows exponentially, placing increasing demands on the systems that process it. Future intelligent systems will focus on finding and storing the information that is significant for the overall system. This involves data warehousing or data mining techniques as well as machine learning. Meaningful information affects the ability of systems to learn and adapt to an ever-changing reality. In this paper, we present the concept of a smart archive, to be used as a memory and information processor for on-line learning systems. The overall concept includes three logical parts: a smart archive, a static classifier (static model) and a neural network structure (dynamic model). The smart archive consists of four sub-objects: a data pre-processor, a smart data storage, a training set generator and a control object. The smart archive involves on-line and off-line algorithms. The on-line algorithms are used for filtering, data validation, data annotation and the creation and maintenance of a history buffer. The off-line algorithms are used for sampling control, filtering control and network structure re-creation and training. The smart archive is presented through the case of an industrial classifier in which visual inspection is used for production quality control. A smart application on a production line must be robust, accurate and reliable; it must be able to cope with changing situations, and it must have a short start-up and installation period.
The proposed approach meets these demands: the presented system can filter outliers from the measurement data, and it has a short start-up time, because a static model can be used from the beginning. The dynamic model learns gradually and improves the accuracy of the system. The concept is efficient because real-time algorithms are used and memory consumption is minimised by smart archiving algorithms.

Keywords: smart archive, artificial neural networks, on-line learning, machine learning

1 INTRODUCTION

Today's information society, not to mention industrial production systems, generates huge amounts of information that need to be processed. This volume of information places challenging requirements on information models and systems, which must also be able to cope with changing situations. The key point is to find the relevant information in the data and to realise that relevance is quite often a function of time. This involves data mining and data warehousing