Quantifying the accuracy of the ground truth associated with Internet traffic traces

Maurizio Dusi, Francesco Gringoli, Luca Salgarelli
Università degli Studi di Brescia, Italy

Article info

Article history:
Received 9 February 2010
Received in revised form 21 July 2010
Accepted 9 November 2010
Available online 23 November 2010
Responsible Editor: I.F. Akyildiz

Keywords:
Internet Measurement
Traffic Characterization

Abstract

Ground truth information for Internet traffic traces is often derived by means of port analysis and payload inspection (Deep Packet Inspection – DPI). In this paper we analyze the errors that DPI and port analysis commit when assigning protocol labels to traffic traces. We compare the ground truth provided by these approaches with that derived by gt, a tool that we developed, which provides error-free ground truth at the application level by construction. Experimental results demonstrate that, depending on the protocols composing a trace, ground truth information from port analysis and DPI can be incorrect for up to 91% and 26% of the labeled bytes, respectively.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

An increasing number of research activities in the field of Internet traffic measurement and analysis rely on the availability of network traces associated with ground truth data, i.e., information about the protocol and the application behind each flow. The research community commonly adopts two methods to derive ground truth, based on the analysis of port numbers at the transport layer and of application payloads, the latter through Deep Packet Inspection (DPI) techniques.

However, known protocols can work on ports different than those assigned by IANA, for example in order to circumvent security restrictions. Furthermore, emerging protocols such as those used by Peer-to-Peer (P2P) and Streaming applications do not even use standard ports.
Finally, even DPI mechanisms often fail with encrypted or obfuscated traffic: this is the case with protocols protected by TLS or with applications such as Skype.

In this paper we quantify the error that classical, port-based and DPI-based approaches commit in establishing ground truth related to application-layer protocols. We evaluate these approaches on traffic traces that we collected, for which ground truth is made available by means of gt [1], a tool that we developed and that, by construction, generates accurate ground truth at the application level.

We show that classical approaches can lead to high true-positive rates labeling the traffic produced by clear text protocols such as HTTP and SMTP, yet they sometimes fail to produce a label for some of the traffic produced by those same protocols, and end up mis-labeling much of the P2P, Streaming and Voice over IP traffic. For example, we show that in our traces DPI mislabels up to 6% of the bytes produced by email-related protocols, and up to 60% of those produced by P2P applications.

Finally, we use our findings to evaluate the errors one would commit if they were to use transport ports as ground truth in one of the most recent publicly-available, anonymized traces such as the one available at [2].

The rest of the paper is organized as follows. In Section 2 we report on related work. In Section 3 we describe the methodology we used to evaluate the accuracy of the DPI

Computer Networks 55 (2011) 1158–1167. Journal homepage: www.elsevier.com/locate/comnet
1389-1286/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.comnet.2010.11.006
Corresponding author. Tel.: +39 030 371 5847; fax: +39 030 380 014. E-mail addresses: luca.salgarelli@ing.unibs.it, first.last@ing.unibs.it (L. Salgarelli).