Reinforcing Network Security by Converting Massive Data Flow to Continuous Connections for IDS

Maher Salem
Group of Network and Data Security
University of Applied Sciences, Fulda, Germany
Maher.salem@informatik.hs-fulda.de

Ulrich Buehler
Group of Network and Data Security
University of Applied Sciences, Fulda, Germany
u.buehler@informatik.hs-fulda.de

Abstract—Massive data flow has become a serious challenge in intrusion detection research. The challenge stems from the difficulty of handling heterogeneous and non-stationary data streams and from the inability to uncover anomalies reliably in online operation mode. This paper proposes a novel online method that constructs connections from the massive data flow for evaluating IDS models. The proposed method overcomes this challenge by using a queuing concept with a dynamic window size. It captures network traffic and host events continuously and handles them synchronously within time-slot windows inside the queue in order to construct connection vectors based on certain features. We have evaluated the method in offline mode using the DARPA dump data flow and in online mode using a simulated network at the university campus. In addition, we have evaluated our IDS model using the constructed connections to prove the feasibility and plausibility of the proposed method in the IDS area. The performance evaluation confirms that the proposed method is able to operate efficiently in offline as well as online mode. Moreover, the constructed connections are well suited for training and evaluating IDS models.

Keywords—intrusion detection; data aggregation; real-time systems; performance monitoring

I. INTRODUCTION

The online operation mode of an Intrusion Detection System (IDS) has become a real challenge, since the amount of generated heterogeneous and non-stationary data and the interconnection between communication networks are increasing rapidly [1],[2],[3].
Moreover, the voluminous traffic increases system vulnerabilities and leads to the emergence of new and complex types of attacks. In order to detect these attacks effectively, the traffic must be processed and analyzed carefully [32]. Accordingly, IDS models can operate effectively and achieve high performance over computer networks only in offline mode [4]. Usually, IDS models are trained using an offline and outdated dataset such as KDDCup99 [15],[16], GureKDD [5],[6], and Kyoto2006+ [17]. However, the KDDCup dataset is still the most widely used benchmark for evaluating network security applications, although both the data collection method and the characteristics of the data have been criticized [7],[8]. On the other hand, providing IDS models with continuous datasets for training or testing purposes will improve performance and enhance the detection rate. Therefore, a reliable method that handles massive data flow in computer networks has become a pressing concern in network security.

In this paper, we present a novel method that handles the problem of massive data flow in the IDS area. The proposed method captures network packets and host events continuously in real time, processes and analyzes them using a queuing concept with a dynamic window size, and then constructs sequential connection vectors based on selected features. The performance evaluation demonstrates the feasibility of the queuing concept in large-scale, heterogeneous networks and shows high performance in IDS models.

The rest of this paper is organized as follows. Section II discusses previous approaches in the scope of our research. The architecture of our proposed method is described in Section III. Section IV presents an evaluation study, shows the results of a comparative study, and discusses them. Finally, Section V concludes our work.

II. RELATED WORK

Traffic aggregation, measurement, monitoring, and management are always active topics in the area of computer and communication networks. Several proposals have demonstrated positive and promising results in flow measurement and monitoring [9],[10]. However, aggregation and management of network traffic in online mode are still open issues in intrusion detection. In this regard, A. Cardigliano et al. propose a framework design and implementation called vPF_RING to capture packets on virtual machines [11]. They have exploited the queuing method PF_RING and enhanced it to ease the monitoring task and reduce the hardware cost. This idea encouraged us to propose a novel queuing method that is able to capture network packets continuously in online mode. In contrast, S. Blagodurov and M. Arlitt discuss the use of the DataSeries format as an online logging format for network applications. They examine the idea by modifying the Webalizer tool to analyze Apache Web server and Bro IDS logs efficiently, and they report promising results, for the Bro IDS in particular [12]. Other works, such as [13], have also discussed the characteristics of network traffic and offer a perspective on online mode.

Several frameworks have been proposed to ease the aggregation of network and host data and to improve IDS performance in real time. The collaborative program between MIT and DARPA proposed a first evaluation environment to dump network packet headers and host events [14], which are exploited to construct the