IEEE COMMUNICATIONS LETTERS

Quantifying the Deployment of TCP Options - A Comparative Study

Kostas Pentikousis and Hussein Badr

Manuscript received February 26, 2004; revised April 15, 2004. K. Pentikousis is with the Department of Computer Science, Stony Brook University, Stony Brook, NY 11794-4400, USA (phone: 631-632-3800; fax: 631-632-8334; email: kostas@cs.stonybrook.edu). H. Badr is with the Department of Computer Science, Stony Brook University, Stony Brook, NY 11794-4400, USA (email: badr@cs.stonybrook.edu).

Abstract— TCP performance evaluation studies need to use protocol configurations representative of real networks. However, the actual deployment of TCP options has not yet been quantified on a large scale. After analyzing a large set of traffic traces collected at 12 different monitoring points, we find that TCP segment sizes have a bimodal distribution, not a trimodal one as reported in recent studies. We show that the overwhelming majority of senders employ the maximum segment size option, large windows do not accompany SACK deployment, and ECN usage is negligible.

Index Terms— Transport Protocols, Transmission Control Protocol, Trace Analysis.

I. INTRODUCTION

Given the overwhelming presence of the Transmission Control Protocol (TCP) [1] in Internet traffic, determining the exact TCP version deployed is critical for performance evaluation and an active area of research [2], [3]. Measuring performance gains and evaluating fairness mandate the comparison of new transport protocols and TCP modifications against both state-of-the-art and the “most common” TCP versions. Gauging progress from Internet Engineering Task Force (IETF) standardization to deployment is also of great interest.

Padhye and Floyd [2] employ an active measurement methodology to infer TCP behavior. Using custom software, they connect to 4,550 web servers, test conformance to specifications, identify bad implementations, and classify servers according to TCP behavior. Jaiswal et al. [3] use a passive measurement methodology and focus on inferring the exact TCP flavor used by end hosts, estimating congestion window sizes and round-trip times for each connection by analyzing 3 traces containing 202 million TCP segments.

II. TRACE ANALYSIS

We quantify the deployment of TCP options by analyzing a very large set of packet traces, complementing the work in [2], [3]. We developed Net::Traces::TSH [4] to analyze packet traces in Time Sequenced Headers (TSH) format, available from the National Laboratory for Applied Network Research Passive Measurement and Analysis (NLANR/PMA) repository. Between October 29, 2003 and January 31, 2004 we analyzed 5,955 traces (Table I), containing 5,063,796,854 records of IP packets, carrying more than 3 terabytes. The traces were captured at 12 passive monitoring points (PMPs) located at 5 university campus gateways and 7 (Giga)PoPs aggregating traffic from many sites. Each PMP records every passing IP packet and generates TSH trace files. Details about the PMPs are available at http://pma.nlanr.net.

Each record examined contained a timestamp, the standard IP header, and the first 16 bytes of the standard TCP header without the TCP options (if any). Trace durations ranged from 8.47 to 4380.7 seconds, with 75% between 87 and 97 seconds. The total aggregate duration exceeds 13 days of continuous monitoring.

Our main contribution is a framework for TCP-centered analysis of TSH traces. In contrast to related work [2], [3], we assess the deployment of TCP options at a diverse set of sites and examine traffic regardless of application protocol. We quantify on an unprecedented scale the deployment of TCP options and Explicit Congestion Notification (ECN) [5] on end hosts and routers.
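
As a minimal sketch of the record layout described in this section (a timestamp, the standard IP header, and the first 16 bytes of the TCP header), the Python fragment below decodes a single TSH record. The 44-byte record length, the split of the 8-byte timestamp field into seconds, interface number, and microseconds, and the names TshRecord and parse_tsh_record are assumptions made for illustration; this is not the Net::Traces::TSH implementation.

```python
import struct
from dataclasses import dataclass

# Assumed layout: 8-byte timestamp/interface field, 20-byte IP header,
# and the first 16 bytes of the TCP header (44 bytes in total).
RECORD_LEN = 44

# ECN codepoints carried in the two low-order bits of the IP TOS byte (RFC 3168).
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


@dataclass
class TshRecord:  # hypothetical container, not part of any library
    timestamp: float      # seconds, including the microsecond fraction
    interface: int        # capture interface number
    ip_total_len: int     # IP total length field, in bytes
    protocol: int         # IP protocol number (6 means TCP)
    ecn: str              # ECN codepoint name
    tcp_data_offset: int  # TCP header length in 32-bit words
    syn: bool
    ack: bool


def parse_tsh_record(buf: bytes) -> TshRecord:
    """Decode one record; TCP fields are meaningful only if protocol == 6."""
    if len(buf) != RECORD_LEN:
        raise ValueError(f"expected {RECORD_LEN} bytes, got {len(buf)}")

    # Assumed: 32-bit seconds, then 8-bit interface and 24-bit microseconds.
    secs, if_usecs = struct.unpack("!II", buf[:8])
    interface = if_usecs >> 24
    usecs = if_usecs & 0xFFFFFF

    ip = buf[8:28]    # standard 20-byte IP header, no IP options
    tcp = buf[28:44]  # ports, sequence, ack, offset/flags, window

    total_len = struct.unpack("!H", ip[2:4])[0]
    off_flags = struct.unpack("!H", tcp[12:14])[0]

    return TshRecord(
        timestamp=secs + usecs / 1_000_000,
        interface=interface,
        ip_total_len=total_len,
        protocol=ip[9],
        ecn=ECN_NAMES[ip[1] & 0b11],
        tcp_data_offset=off_flags >> 12,
        syn=bool(off_flags & 0x02),
        ack=bool(off_flags & 0x10),
    )
```

Because a record stops at byte 16 of the TCP header, a data offset above 5 on a SYN or SYN/ACK segment indicates only that some options were present, not which ones.
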
Moreover, we present the distribution of TCP segment sizes and how segments are acknowledged.

Our work was not without its challenges and limitations. The traces used in this study were sanitized before becoming publicly available. Thus, we cannot reliably detect duplicate packets or report on path asymmetries. However, our software can be extended easily to collect such statistics for non-sanitized traces. Another point worth highlighting is that, because we do not have access to the actual peer-advertised TCP options, we have to infer their deployment by carefully analyzing records corresponding to SYN and SYN/ACK segments.

III. RESULTS

Table I shows that ECN deployment is still negligible. Most sites carry practically no TCP/ECN traffic. FRG is the site that carries the most TCP/ECN traffic (0.15%). The proportion of TCP segments marked as Congestion Experienced (“CE”) is within the margin of statistical error, indicating that the number of ECN-aware routers is extremely small.

A. TCP Segment Sizes and the ACK Factor

Fig. 1 illustrates that, contrary to previously published studies [6], [7], the distribution of TCP segment sizes is distinctly bimodal across virtually all sites. The major modes are due to minimally sized segments, which include TCP ACKs, and segments with sizes near the Ethernet maximum transmission unit (MTU). Sizes that formed a third mode in previous studies (at 552 and 576 bytes) are consistently less common in our measurements, accounting for only 1-2% of the total segments at 10 of the 12 sites. COS and FRG are exceptional in this regard, having 12% and 6%, respectively, of their segments between 512 and