Network Anomaly Detection in the Cloud: The Challenges of Virtual Service Migration

Kirila Adamova, Dominik Schatzmann, and Bernhard Plattner
ETH Zürich
Zürich, Switzerland
Email: kirila.adamova@gmail.com {dominik.schatzmann, plattner}@tik.ee.ethz.ch

Paul Smith
AIT Austrian Institute of Technology
2444 Seibersdorf, Austria
Email: paul.smith@ait.ac.at

Abstract: The use of virtualisation technology in the cloud enables services to migrate within and across geographically diverse data centres, e.g., to enable load balancing and fault tolerance. An important part of securing cloud services is being able to detect anomalous behaviour, caused by attacks, that is evident in network traffic. However, it is not clear whether virtual service migration adversely affects the performance of contemporary network-based anomaly detection approaches. In this paper, we explore this issue, and show that wide-area virtual service migration can adversely affect state-of-the-art approaches to network flow-based anomaly detection, potentially rendering them unusable.

I. INTRODUCTION

Cloud computing has proved to be a popular way for organisations to provision services for their users. There are a number of reasons for this popularity, including potential reductions in operating costs, flexible and on-demand service provisioning, and increased fault-tolerance. Drawn to these benefits, operators of critical information infrastructures – the ICT infrastructures that support gas and electricity utilities and government services, for example – are considering using the cloud to provision their high assurance services. This is reflected in a recent white paper produced by the European Network and Information Security Agency (ENISA), which provides specific guidelines in this area [1].
Deploying high assurance services in the cloud increases cyber-security concerns – successful attacks could lead to outages of key services that our society depends on, and disclosure of sensitive personal information. To address these concerns, a range of security measures must be put in place, such as cryptographic storage and network firewalls. An important measure is the ability to detect when a cloud infrastructure, and the services it hosts, is under attack via the network, e.g., from a Distributed Denial of Service (DDoS) attack. A number of approaches to network attack detection exist, based on the detection of anomalies in relation to normal network behaviour [2].

One of the essential characteristics of cloud computing is the use of virtualisation technology, which supports the migration of services across a physical infrastructure within and between large-scale cloud data centres – known as local and wide-area migration, respectively. The reasons for service migration are manifold, including responding to hardware faults, planned maintenance tasks, and handling localised peaks in service requests by moving services “closer” to their user or to underutilized resources. Whilst virtual service migration has a number of benefits, it has the potential to make the implementation of security measures challenging, therefore introducing new vulnerabilities [3].

In this paper, we are specifically interested in examining the effect virtual service migration has on network anomaly-based attack detection techniques – as services move, migration may be observable in the network traffic that is being used for anomaly detection. Such techniques aim to detect anomalous traffic in relation to a learned baseline that represents normal behaviour. It is unclear to what extent virtual service migration, which is arguably representative of “normal” cloud behaviour, can be incorrectly observed as an anomaly, and therefore an attack.
Conversely, attacks may be missed because of virtual service migration. If this problem is significant, anomaly detection techniques could be rendered unusable for the cloud, thus representing a significant vulnerability and a potential inhibitor to the deployment of high assurance services.

Using a novel toolchain, which simulates attacks and virtual service migration in network flow traces, we have examined the detection performance of two anomaly detection techniques – Principal Component Analysis (PCA) [4], [5], [6] and the Expectation-Maximisation (EM) clustering algorithm [7], [8]. In previous research, these detection techniques have been shown to give acceptable detection performance in non-cloud settings. Under different attack and virtual service migration scenarios, we have measured their ability to reliably detect attack behaviour in the cloud. Our results suggest that, in some configurations, a potentially unsafe number of attacks is missed, and an unusably high number of alarms pertaining to normal behaviour is generated. This result calls into question the use of these techniques, and potentially others, in large cloud data centres, in which virtual service migration is a common undertaking.

The rest of this paper is organised as follows: Section II discusses related work – to the best of our knowledge, no previous work directly addresses the problem explored in this paper. A discussion on virtual service migration and its effect from a network perspective is presented in Section III. Section IV describes the toolchain and traffic data that we used to obtain the experimental results, which are described in Section V. We conclude and discuss potential solutions to the problem we have explored in Section VI.
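
To make the concern raised in this introduction concrete, the following minimal sketch shows how a detector that compares traffic against a learned baseline could raise an alarm for a migration as well as for an attack. It is not the PCA or EM technique evaluated in this paper: it substitutes a much simpler z-score test on per-interval flow counts. The function names, the threshold of 3.0, and all traffic values are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def learn_baseline(flow_counts):
    """Return (mean, sample standard deviation) of training flow counts."""
    return mean(flow_counts), stdev(flow_counts)


def is_anomalous(flow_count, baseline, threshold=3.0):
    """Flag an interval whose flow count deviates from the baseline mean
    by more than `threshold` standard deviations (hypothetical threshold)."""
    mu, sigma = baseline
    return abs(flow_count - mu) > threshold * sigma


# Hypothetical flows per interval observed on one data centre link.
training = [100, 104, 98, 101, 97, 103, 99, 102]
baseline = learn_baseline(training)

normal_interval = 101    # ordinary traffic
attack_interval = 400    # e.g., a flood of flows towards a hosted service
migrated_interval = 20   # a service has migrated away, so its flows vanish

alarms = {
    "normal": is_anomalous(normal_interval, baseline),
    "attack": is_anomalous(attack_interval, baseline),
    "migration": is_anomalous(migrated_interval, baseline),
}
print(alarms)
```

In this toy setting the detector cannot tell the attack interval from the migration interval, since both deviate strongly from the baseline. Whether, and to what degree, the PCA and EM techniques behave this way is what the experiments in Section V examine.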