How to Validate Traffic Generators?

Sándor Molnár¹, Péter Megyesi
High Speed Networks Lab., Dept. of Telecomm. and Mediainformatics, Budapest Univ. of Technology and Economics, Budapest, Hungary
Email: {molnar, megyesi}@tmit.bme.hu

Géza Szabó
TrafficLab, Ericsson Research, Budapest, Hungary
Email: geza.szabo@ericsson.com

Abstract—Network traffic generators are widely used in networking research and they are validated by a very broad range of metrics (mainly traffic characteristics). In this paper we overview the state of the art of these metrics and unveil that there is no consensus in the research community on how to validate these traffic generators or which metrics to choose for validation purposes. This situation makes it extremely difficult to evaluate validation results and compare different traffic generators. We advocate research toward finding a common set of metrics for the validation and comparative evaluation of traffic generators.

I. INTRODUCTION

Network traffic generators are vital in the design, development and management of our networks. Their importance has become even more pronounced as the increasing complexity of our networks has made simulation methodologies (e.g. ns2 or ns3) less accurate. On the other hand, network data is the property of the operator, which raises a number of privacy issues that limit the replay of measured traces. As a result, a huge number of traffic generators have been developed in the last decades based on different methodologies, and they have always been adapted to the current needs of network environments, application sets and purposes of use. See Table I and its reference list for an overview. The main function of these traffic generators is to inject packets into the network in a controlled fashion, generating synthetic traffic. The crucial requirement is that the characteristics of the synthetic traffic must capture the characteristics of actual traffic in the network.
In spite of the long history of traffic generators and the large number of them proposed so far, it seems that there is no consensus in the research community on how to validate these traffic generators and which metrics should be used to evaluate the accuracy of the generator under investigation. In this paper we address the issue of finding appropriate and common metrics for the validation of traffic generators. We overview the most recent metrics researchers use for their traffic generators and categorize them. The main motivation of the paper is to unveil the current situation and show that no common metric is used in the state-of-the-art traffic generator literature, which makes the evaluation of validation results and the comparison of different traffic generators very difficult, if not impossible. Therefore, finding a common set of metrics for this purpose is a key factor in categorizing recent and future traffic generators from the most important point of view: how accurately they can generate traffic that is reliable and can be used for the design, development and management of our networks and devices. We are raising an alert here, while the solution to the problem is not easy and deserves a deep study as future work.

This paper is organized as follows. In Section II we present the state-of-the-art traffic generation tools along with the validation techniques used when they were introduced. Then, in Section III a categorization is given for the most frequent validation metrics. Finally, Section IV concludes the paper with a discussion of a possible set of metrics that could serve as the basis for a set agreed upon by the research community as the common validation measure for newly developed traffic generation tools.

¹ The research was supported by OTKA-KTIA grant CNK77802.

II. TRAFFIC GENERATORS AND VALIDATION TECHNIQUES

We have investigated a sufficient number of traffic generators found in the recent literature and classified them into five categories according to the metrics used in their validation. Table I contains a brief overview of the presented traffic generators.

A. Replay Engines

Replay engines take previously captured traffic and send the packets out on the network interface with the same timing with which they were recorded. Given their purpose, the only question that arises during their operation is whether the packets follow each other in the same way as they were captured. Deviations from the captured sequence can appear as both Inter Packet Timing (IPT) skewness (usually due to inaccurate software interrupts) and throughput saturation (due to bandwidth limitation).

The most common open-source replay application is TCPreplay [1], which can use libpcap files as input. It is also capable of rewriting Layer 2, 3 and 4 header information for various testing purposes. Since TCPreplay is general, user-level software working on any UNIX platform, its performance may depend heavily on the environment in which it is installed. In [2] the authors present TCPivo, an open-source, high-speed packet replay engine for commodity hardware. That paper shows examples of the IPT errors using different execution approaches. As a solution for bandwidth limitation, Ye et al. [3] present a technique to replay a captured OC-48 trace on multiple commodity PCs with Gigabit Ethernet interfaces. The authors