Network Intrusion Alert Aggregation Based on PCA and Expectation
Maximization Clustering Algorithm
Maheyzah Md Siraj+, Mohd Aizaini Maarof and Siti Zaiton Mohd Hashim
Faculty of Computer Science and Information Systems
Universiti Teknologi Malaysia
81310 Skudai Johor, Malaysia
Abstract. Most organizations deploy various security sensors to strengthen information security and
assurance. A popular choice is Network Intrusion Detection Systems (NIDSs). Unfortunately, NIDSs
trigger a massive number of alerts even in a single day, overwhelming security experts. Worse, a large
number of these alerts are false positives, redundant warnings for the same attack, or alert notifications
from erroneous activity. Such low-quality alerts have a negative impact on alert analysis. We propose an
alert aggregation model based on Principal Component Analysis (PCA) coupled with an unsupervised
learning clustering algorithm, Expectation Maximization for Gaussian Mixture (EM_GM), to aggregate
similar alerts and reduce the number of alerts. Our empirical results show that the proposed model
effectively clustered NIDS alerts and significantly reduced the alert volume.
Keywords: alert clustering, alert filtering, alert aggregation, alert reduction, PCA, EM_GM
1. Introduction
Network Intrusion Detection Systems (NIDSs) have been extensively used by researchers and
practitioners to monitor intrusive activities in computer networks. NIDSs usually generate thousands of
alerts even in a single day. Worse, those alerts are mixed with false positives, repeated warnings for the
same attack, or alert notifications from erroneous activity [1]. Therefore, manually analyzing those alerts is
tedious, time-consuming and error-prone [1].
A promising technique for automatically analyzing intrusion alerts is correlation. Alert
Correlation Systems (ACSs) are post-processing modules that provide high-level insight into the security
state of the network and efficiently filter false positives as well as redundant alerts from the output of
NIDSs [2]. The analysis results become important guidance for the security expert (SE) in planning and
developing responsive and preventive mechanisms. Generally, correlation can be of two types: structural
correlation and causal correlation. In this paper, we address the structural correlation (or alert clustering)
aspect of NIDS data, that is, grouping (or aggregating) alerts with similar features.
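
To make the notion of aggregating similar alerts concrete, the following is a minimal sketch of a deliberately naive aggregation rule. It is not the PCA and EM_GM model proposed in this paper: it merges alerts that share the same signature, source and destination and occur close together in time. The Alert fields, the aggregate function and the 60-second window are hypothetical choices made only for illustration.

```python
from typing import List, NamedTuple


class Alert(NamedTuple):
    # Hypothetical alert fields; real NIDS alerts carry many more attributes.
    timestamp: float  # seconds since an arbitrary start time
    signature: str
    src_ip: str
    dst_ip: str


def aggregate(alerts: List[Alert], window: float = 60.0) -> List[List[Alert]]:
    """Group alerts with identical (signature, src_ip, dst_ip) whose
    timestamps lie within `window` seconds of the previous alert in the
    same group. The window value is an assumed illustrative default."""
    clusters: List[List[Alert]] = []
    latest = {}  # feature key -> index of the most recent cluster for that key
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.signature, alert.src_ip, alert.dst_ip)
        idx = latest.get(key)
        if idx is not None and alert.timestamp - clusters[idx][-1].timestamp <= window:
            # Same features and close in time: treat as a redundant warning.
            clusters[idx].append(alert)
        else:
            # New feature combination, or too far apart in time: new cluster.
            latest[key] = len(clusters)
            clusters.append([alert])
    return clusters


# Toy data (fabricated for the example, not taken from any dataset).
alerts = [
    Alert(0.0, "ICMP PING", "10.0.0.5", "10.0.0.9"),
    Alert(2.0, "ICMP PING", "10.0.0.5", "10.0.0.9"),
    Alert(3.0, "ICMP PING", "10.0.0.5", "10.0.0.9"),
    Alert(4.0, "SQL injection attempt", "10.0.0.7", "10.0.0.9"),
    Alert(500.0, "ICMP PING", "10.0.0.5", "10.0.0.9"),
]
clusters = aggregate(alerts)
# Fraction of alert volume removed by reporting one item per cluster.
reduction = 1 - len(clusters) / len(alerts)
print(len(alerts), "alerts ->", len(clusters), "clusters")
```

An exact-match rule of this kind only collapses identical repeats. Grouping alerts that are merely similar in feature space is the case that motivates a clustering approach.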
The main problem with existing ACSs is that they require a high level of human SE involvement in
creating and/or maintaining the system. For instance, the algorithm introduced in [3] required a significant
number of alerts to be managed manually (i.e., hand-clustered) beforehand. Likewise, the system in [1]
required periodic manual tuning. Moreover, at its first deployment, network properties had to be encoded
to assist the clustering algorithm. These approaches are time-consuming since they demand regular setup
and maintenance. Those constraints make supervised learning-based correlation systems less practical.
Our goal is to minimize the intervention of the SE (i.e., to ease their burden) as much as possible, but not
to replace them. Therefore, an unsupervised learning-based
+ Corresponding author. Tel.: +607 5532245; fax: +607 5593185.
E-mail address: maheyzah@utm.my
2009 International Conference on Computer Engineering and Applications
IPCSIT vol.2 (2011) © (2011) IACSIT Press, Singapore