Data Sniffing - Monitoring of Machine Learning for Online Adaptive Systems

Yan Liu yanliu@csee.wvu.edu
Tim Menzies tim@menzies.com
Bojan Cukic cukic@csee.wvu.edu
Lane Department of Computer Science & Electrical Engineering
West Virginia University, Morgantown, WV 26505, U.S.A.

Abstract

Adaptive systems are systems whose function evolves while adapting to current environmental conditions. Because adaptation occurs in real time, newly learned data have a significant impact on system behavior. When online adaptation is part of system control, anomalies can cause an abrupt loss of system functionality and possibly result in a failure. In this paper we present a framework for reasoning about the online adaptation problem. We describe a machine learning tool that sniffs data and detects anomalies before they are passed to the adaptive components for learning. Anomaly detection is based on distance computation. An algorithm for evaluating the framework, together with a sample implementation and empirical results, is discussed. The method we propose is simple and reasonably effective, so it can be easily adopted for testing.

1 Introduction

Adaptive systems can be applied to domains where autonomy is significant or environmental conditions tend to be unpredictable. Usually, the aim of an adaptive system is to perform appropriately under both foreseen and unforeseen circumstances through adaptation. If the adaptation occurs after the system is deployed, the system is called an online adaptive system. In recent years, online adaptive systems have been proposed and implemented in flight control, with the expectation that they will react promptly to unforeseen flight conditions and subsystem failures.

As an example of an online adaptive system, Figure 1 illustrates a simple development paradigm. Before Wednesday, the system is built and validated on the training data sets. After fielding, both unseen and previously seen data enter the system, causing it to learn and change.
Thus, the system will react differently depending on the specific data.

Figure 1. An Online Adaptive System Cycle (timeline: Train, Test, Field, Use across Monday through Friday)

With the growing use of such systems, validation techniques have been developed to assure system stability and reliability. Most of them are static techniques that focus on system performance before fielding; effective methods for dynamically validating the running system are rare. One promising dynamic strategy, novelty detection, estimates the reliability of the outputs using probability density estimates with respect to the data items already seen by the system [1]. Researchers such as J. A. Leonard have conducted experiments on radial basis function neural networks, extending the network structure with additional output nodes that calculate confidence intervals for the outputs. However, this technique usually requires a relatively large amount of computation and burdens system performance, which makes it unsuitable for validating online adaptive systems in real time without imposing excessive computational effort on the running system.

In this paper, we propose a dynamic method based on distance measurement to validate system adaptation. By sniffing the incoming data in real time, not only before but also after it enters the system, we can prevent anomalies from influencing system adaptation and discard surprising results that may cause unreliable system performance.

The paper is organized as follows. Section 2 describes a framework comprising two agents that assures system performance. Section 3 presents the data sniffing strategy for assessing an online adaptive system; there we describe the distance measuring techniques and propose an algorithm for testing our approach.
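To make the distance-based sniffing idea concrete, the following is a minimal sketch of one plausible realization: an incoming sample is flagged as anomalous when its Euclidean distance to the nearest previously seen data point exceeds a threshold. The function name `sniff`, the nearest-neighbor distance rule, and the fixed threshold are illustrative assumptions, not the exact algorithm presented in Section 3.

```python
import numpy as np

def sniff(sample, seen_data, threshold):
    """Illustrative data sniffer (assumed design, not the paper's exact method).

    Flags `sample` as anomalous when its Euclidean distance to the
    nearest point in `seen_data` exceeds `threshold`.

    sample:    1-D array, the incoming data item
    seen_data: 2-D array, data items the system has already seen
    threshold: maximum distance still considered "familiar"
    """
    # Distance from the sample to every previously seen point.
    dists = np.linalg.norm(seen_data - sample, axis=1)
    # Anomalous if even the closest seen point is too far away.
    return bool(dists.min() > threshold)

# Toy usage: two seen points; a nearby sample passes, a far one is flagged.
seen = np.array([[0.0, 0.0], [1.0, 1.0]])
print(sniff(np.array([0.1, 0.1]), seen, 0.5))  # close to [0, 0] -> not anomalous
print(sniff(np.array([5.0, 5.0]), seen, 0.5))  # far from all seen data -> anomalous
```

In an online setting, samples that pass the check would be appended to `seen_data` and forwarded to the adaptive component, while flagged samples are withheld from learning, which is the gating role the data sniffer plays in the proposed framework.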