Network Anomaly Detection with Net-GAN, a Generative Adversarial Network for Analysis of Multivariate Time-Series

Gastón García González, Universidad de la República & AIT, gastong@fing.edu.uy
Pedro Casas, AIT Austrian Institute of Technology, pedro.casas@ait.ac.at
Alicia Fenández, Universidad de la República, alicia@fing.edu.uy
Gabriel Gómez, Universidad de la República, ggomez@fing.edu.uy

ABSTRACT
We introduce Net-GAN, a novel approach to network anomaly detection in time-series, using recurrent neural networks (RNNs) and generative adversarial networks (GANs). Unlike the state of the art, which traditionally focuses on univariate measurements, Net-GAN detects anomalies in multivariate time-series, exploiting temporal dependencies through RNNs. Net-GAN discovers the underlying distribution of the baseline, multivariate data without making any assumptions on its nature, offering a powerful approach to detect anomalies in complex, difficult-to-model network monitoring data. We present preliminary detection results in different monitoring scenarios, including anomaly detection in sensor data and intrusion detection in network measurements.

CCS CONCEPTS
Computing methodologies → Anomaly detection; Machine learning algorithms

KEYWORDS
Anomaly Detection; Multivariate Time-Series; Generative Models; GAN; LSTM

ACM Reference Format:
Gastón García González, Pedro Casas, Alicia Fenández, and Gabriel Gómez. 2020. Network Anomaly Detection with Net-GAN, a Generative Adversarial Network for Analysis of Multivariate Time-Series. In ACM Special Interest Group on Data Communication (SIGCOMM ’20 Demos and Posters), August 10–14, 2020, Virtual Event, USA. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3405837.3411393

1 INTRODUCTION
Network monitoring data generally consists of hundreds or thousands of counters periodically collected in the form of time-series, resulting in a complex-to-analyze multivariate time-series (MTS) process.
In particular, detecting anomalies in such multivariate, temporal data is challenging. Without loss of generality, we refer to the MTS as a set of $n$ non-iid time series sampled at the same rate, denoted $x_t = \{x_t(1), x_t(2), \ldots, x_t(n)\} \in \mathbb{R}^n$. Current approaches to anomaly detection tackle this challenge either by focusing on univariate time-series analysis, running an independent detector for each time series $x_t(i)$, or by considering multi-dimensional input data $x_t \in \mathbb{R}^n$ at each time $t$, neglecting the temporal aspects of the MTS.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). SIGCOMM ’20 Demos and Posters, August 10–14, 2020, Virtual Event, USA. © 2020 Copyright held by the owner/author(s). ACM ISBN 978-1-4503-8048-5/20/08...$15.00 https://doi.org/10.1145/3405837.3411393

To improve the state of affairs, we propose Net-GAN, a novel unsupervised approach to anomaly detection in MTS data, based on Recurrent Neural Networks (RNNs) trained through a Generative Adversarial Network (GAN) framework [2]. The use of generative models for semi-supervised anomaly detection helps to solve two major problems faced in this specific field: the high imbalance between normal-operation and anomaly instances, and the lack of labeled instances for learning and validation purposes. Generative models such as Variational Auto-Encoders (VAEs) or Generative Adversarial Networks (GANs) are powerful approaches to learn the underlying distributions of data samples in a purely data-driven, model-agnostic manner.
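To make the MTS notation introduced earlier in this section concrete, the following minimal sketch contrasts the three views of the data: a single series $x_t(i)$ as seen by a univariate detector, a single snapshot $x_t \in \mathbb{R}^n$ as seen by a detector that ignores time, and a run of consecutive snapshots as consumed by a sequence model. The array layout, the function name `sliding_windows`, the window length, and the random data are all assumptions made for illustration; they are not taken from the Net-GAN implementation.

```python
import numpy as np


def sliding_windows(mts: np.ndarray, length: int) -> np.ndarray:
    """Stack every run of `length` consecutive samples of an MTS.

    `mts` has shape (num_samples, n), one row per time step.
    The result has shape (num_samples - length + 1, length, n).
    """
    if mts.ndim != 2:
        raise ValueError("expected a 2-D array of shape (num_samples, n)")
    num_samples = mts.shape[0]
    if not 1 <= length <= num_samples:
        raise ValueError("length must be between 1 and num_samples")
    # One window per valid start index; each window keeps all n series.
    return np.stack(
        [mts[start:start + length] for start in range(num_samples - length + 1)]
    )


# Hypothetical data: 100 time steps of n = 3 counters.
rng = np.random.default_rng(0)
mts = rng.normal(size=(100, 3))

univariate = mts[:, 0]              # one series x_t(i): shape (100,)
snapshot = mts[10]                  # one time step x_t in R^n: shape (3,)
windows = sliding_windows(mts, 8)   # sequences of snapshots: shape (93, 8, 3)
```

Only the windowed view keeps both the cross-series structure and the temporal ordering, which is the information a recurrent model can exploit.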
Such models can be used in practice to construct better baselines (i.e., profiles for normal operation) for the anomaly detection task, improving the identification of instances which deviate from this baseline. Examples of VAEs and GANs for anomaly detection are presented in [6] and [7], respectively. Most previous work in this direction treats data as temporally independent samples, losing the information provided by causality and temporal correlation.

To capture the temporal correlations characterizing an MTS, we adapt the original GAN model proposed in [2], replacing the multi-layer perceptrons with recurrent LSTM networks for both the generator and the discriminator models. The input data therefore consists of sequences of multi-dimensional measurements, of length $T$: $\{x_{t-T}, \ldots, x_t\}$. Net-GAN is inspired by previous work on GANs for time-series synthesis and anomaly detection [1, 3, 4].

2 THE NET-GAN APPROACH
Fig. 1 depicts the Net-GAN architecture and both the model training and anomaly detection procedures. In the training phase (left), the generator $G$ draws synthetic sample sequences $G(z)$ from Gaussian noise (the latent space $Z$), with the objective of deceiving the discriminator $D$, which in turn learns to determine whether training samples are real or derived from the generative distribution. The classification result produced by $D$ is additionally fed back to $G$, serving as a reinforcement loop to guide the generation process. As both $G$ and $D$ compete to achieve their adversarial tasks, synthetic samples become more and more “realistic”, and the discriminator becomes robust to noise, improving the detection of non-conforming