Vesper: Using Echo-Analysis to Detect Man-in-the-Middle Attacks in LANs

Yisroel Mirsky, Naor Kalbo, Yuval Elovici, and Asaf Shabtai
Department of Software and Information Systems Engineering, Ben-Gurion University, Beer-Sheva, Israel.
yisroel@post.bgu.ac.il, kalbo@post.bgu.ac.il, elovici@bgu.ac.il, and shabtaia@bgu.ac.il

Abstract—The Man-in-the-Middle (MitM) attack is a cyber-attack in which an attacker intercepts traffic, thus harming the confidentiality, integrity, and availability of the network. It remains a popular attack vector due to its simplicity. However, existing solutions are either not portable, suffer from a high false positive rate, or are simply not generic. In this paper, we propose Vesper: a novel plug-and-play MitM detector for local area networks. Vesper uses a technique inspired by impulse response analysis, which is used in the domain of acoustic signal processing. Analogous to how echoes in a cave capture the shape and construction of the environment, so too can a short and intense pulse of ICMP echo requests model the link between two network hosts. Vesper uses neural networks called autoencoders to model the normal patterns of the echoed pulses and to detect when the environment changes. Using this technique, Vesper is able to detect MitM attacks with high accuracy while incurring minimal network overhead. We evaluate Vesper on LANs consisting of video surveillance cameras, servers, and PC workstations. We also investigate several possible adversarial attacks against Vesper, and demonstrate how Vesper mitigates these attacks.

Index Terms—Man in the middle, anomaly detection, echo-analysis, LAN security.

I. INTRODUCTION

A Man-in-the-Middle (MitM) attack is one in which a malicious third party takes control of a communication channel between two or more endpoints by intercepting and forwarding the traffic in transit.
An attacker in the middle can harm the confidentiality, integrity, and availability of the user's content by eavesdropping on, manipulating, crafting, and dropping traffic on the network.

In general, the MitM attack model on a local area network (LAN) has three steps: (1) gain access to the network, (2) intercept traffic in transit, and (3) manipulate, craft, or drop traffic. Depending on the scenario, access can be achieved by connecting to a public Wi-Fi access point (e.g., at a café or airport) or by connecting physically to an exposed network cable or network switch. The attacker can also conduct this attack remotely via malware that has infected a trusted computer within the existing network [1]. After gaining access, interception can be achieved by exploiting known vulnerabilities in network protocols. For example, the attacker can poison a host's address resolution protocol (ARP) table to capture local traffic [2]–[4], or spoof a Domain Name System (DNS) server to intercept all web traffic [5]–[7]. The attacker can easily exploit these vulnerabilities with free tools that work out-of-the-box, such as Ettercap, Cain and Abel, Evilgrade, arpspoof, dsniff, and many others.

Although MitM attacks on LANs have been known for some time, they are still considered a significant threat [8], [9], and have gained academic attention over the years. This is likely because the attack is relatively easy to achieve, yet challenging to detect [10].

Encryption can protect the integrity and confidentiality of the traffic in transit. However, according to [11], 30% of the world's web traffic is not encrypted. Furthermore, in many cases networked systems do not encrypt their traffic by default (e.g., SCADA control systems [12]). Moreover, even if the traffic is encrypted, encryption protocols may have flaws [13], [14], be misconfigured, or simply be left out by a manufacturer (e.g., CVE-2017-15643).
We also note that LAN-based MitM attacks are used in advanced persistent threats (APTs) to achieve lateral movement [15]. Therefore, there is a need to detect the presence of a MitM, even when encryption is employed.

A. The Proposed Solution

Our proposed solution is inspired by the signal processing domain. In a dynamic system, the output (reaction) of the system to a short input signal is called the impulse response (IR). A common use of impulse responses is the modeling and recreation of acoustic environments, such as small rooms or concert halls. As an intuitive example, one can hear the IR of a room by clapping one's hands. The sound of the clap changes based on the size, shape, and materials of the room.

Using this concept, we propose a MitM detector called Vesper. Vesper bats are the largest and best-known family of bats. Akin to its namesake, our detector captures the impulse response of a LAN by measuring the round-trip times (RTTs) resulting from a short, intense burst of ICMP echo requests. This impulse response is used to model the normal behavior of the network from the perspective of two communicating hosts. When a third party intercepts traffic, the harmonic composition of the impulse response between the hosts changes significantly. Vesper detects this change using an autoencoder neural network as an anomaly detector.

In this paper, we show how Vesper works with various devices, in the presence of diverse traffic, and across multiple switches. We also show that Vesper is robust against adversarial attacks.

B. Contributions

To summarize, the contributions of this paper are as follows.

arXiv:1803.02560v1 [cs.CR] 7 Mar 2018