Proceedings of the 2017 Winter Simulation Conference
W. K. V. Chan, A. D’Ambrogio, G. Zacharewicz, N. Mustafee, G. Wainer, and E. Page, eds.
MELODY: SYNTHESIZED DATASETS FOR EVALUATING INTRUSION DETECTION
SYSTEMS FOR THE SMART GRID
Vignesh Babu
Rakesh Kumar
Hoang Hai Nguyen
David M. Nicol
Kartik Palani
Elizabeth Reed
Department of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign
Urbana, IL 61801, USA
ABSTRACT
As smart grid systems become increasingly reliant on networks of control devices, attacks on their
inherent security vulnerabilities could lead to catastrophic system failures. Network Intrusion Detection
Systems (NIDS) detect such attacks by learning traffic patterns and finding anomalies in them. However,
data for robust training and evaluation of NIDS is rarely available because of the operational and
security risks of sharing it. Consequently, we present Melody, a scalable framework for synthesizing
such datasets. Melody models both the cyber and physical components of the smart grid by integrating
a simulated physical network with an emulated cyber network while using virtual time for high temporal
fidelity. We present a systematic approach to generate traffic representing multi-stage attacks, where each
stage is either emulated or recreated with a mechanism to replay arbitrary packet traces. We describe
and evaluate the suitability of Melody's datasets for intrusion detection by analyzing the extent to which
temporal accuracy of pertinent features is maintained.
1 INTRODUCTION
The smart grid is representative of a cyber-physical system, which uses a networked set of devices that sense
its state and take appropriate control decisions (e.g. open/close a circuit breaker). Due to vulnerabilities in
the communication protocols, end-host firmware, and control algorithms, the smart grid’s control network
becomes a potential attack vector. Recent attacks on power grids indicate the use of sophisticated
attack campaigns characterized by multi-stage exploits (Falliere, Murchu, and Chien 2011; Bencsáth, Pék,
Buttyán, and Félegyházi 2012; Assante and Lee 2015). In a typical attack campaign, the attacker creeps
through different network layers by stealing legitimate credentials and/or exploiting vulnerabilities in the
network services, progressively acquiring more privileged access to one or more of the “critical assets” before
finally delivering the attack, e.g. opening multiple circuit breakers at once (Lee, Assante, and Conway
2016). Such multi-stage attacks are observable on both the cyber (e.g. packet counts measured at network
devices) and physical (e.g. power values measured by a phasor measurement unit) attributes of the smart
grid system.
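As a minimal illustration of the cyber-side attribute mentioned above (packet counts measured at a network device), the sketch below bins packet arrival times into fixed-width windows and flags windows whose count departs from a baseline. The function names, the one-second window, and the three-standard-deviation threshold are assumptions made for this example and are not taken from Melody itself.

```python
from collections import Counter


def packet_counts(timestamps, window=1.0):
    """Count packets per fixed-width time window.

    timestamps: non-negative packet arrival times in seconds (hypothetical input).
    window: window width in seconds (assumed value, not from the paper).
    Returns one count per window, starting with the window that contains t = 0.
    """
    counts = Counter(int(t // window) for t in timestamps)
    if not counts:
        return []
    last = max(counts)
    # Fill in empty windows with zero so the series is contiguous in time.
    return [counts.get(i, 0) for i in range(last + 1)]


def flag_deviations(counts, baseline_mean, baseline_std, k=3.0):
    """Return indices of windows whose count lies more than k standard
    deviations from the baseline mean (a simple illustrative rule)."""
    return [
        i for i, c in enumerate(counts)
        if abs(c - baseline_mean) > k * baseline_std
    ]


if __name__ == "__main__":
    series = packet_counts([0.1, 0.2, 0.9, 1.5, 3.2])
    print(series)
    print(flag_deviations(series, baseline_mean=1.0, baseline_std=0.5))
```

Because each count depends on which window a packet's timestamp falls into, any timing error in a synthesized trace directly shifts this kind of feature, which is why temporal fidelity matters for such datasets.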
Machine-learning-based network intrusion detection systems (NIDS) can detect such multi-stage attacks
by observing statistical patterns in such attributes. These systems are trained with historical data comprising
normal background and attack traffic. Two kinds of training occur: one on normal traffic, so as to be
able to detect abnormalities by deviation from the norm, and another on specific patterns from known
attacks, to detect specific abnormalities. We are concerned with both kinds of training. The accuracy of a
978-1-5386-3428-8/17/$31.00 ©2017 IEEE