Towards Enabling Web Proxy Control of TCP Splice Transfer Rates

Jiantao Kong*
College of Computing, Georgia Institute of Technology, Atlanta, GA 30332
jiantao@cc.gatech.edu

Daniela Roşu, Marcel C. Roşu
IBM T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, USA
{drosu,rosu}@us.ibm.com

* Work done during a summer internship at IBM T.J. Watson Research Center.

Abstract

This paper addresses the problem of CPU resource contention between a Web proxy cache and the TCP Splice kernel service, which the proxy employs for serving cache misses. TCP Splice improves the performance of a proxy cache by reducing the CPU utilization and latency of cache misses. However, TCP Splice implementations based on packet forwarding at the IP or socket level can delay the serving of cache hits when large bursts of packets move through the in-kernel infrastructure. In these implementations, the TCP Splice activity has higher priority than the application, and the application has no means of controlling the pace of spliced transfers. In this paper, we propose an alternative paradigm for TCP Splice implementation that enables application control while providing reasonably large reductions in CPU overheads.

1 Introduction

Previous research has shown that the TCP Splice kernel service can significantly reduce transfer overheads in Internet servers such as firewalls, mobile gateways, and content-based routers [5, 4, 2, 1, 7]. Web proxy cache servers can also benefit from exploiting TCP Splice, but the benefits may be limited by the resource contention between the kernel-level spliced transfers and the cache-served transfers [7].

More specifically, the TCP Splice implementations proposed by previous research can increase response times for cache hits. These implementations, based on either IP-level or socket-layer mechanisms, perform most of their activity in interrupts.
Thus, the forwarding of packets on spliced connections receives higher priority than the application itself. This delays application-level activities, including the serving of cache hits, when large bursts of packets move through the TCP Splice infrastructure.

Figure 1: Timeline with application-level splicing.

Figure 2: Timeline with interrupt-driven TCP Splice.

To illustrate this effect, Figures 1 and 2 represent the execution timelines when the application performs application-level splicing and when it exploits an interrupt-driven TCP Splice, respectively. We consider an event-driven application, such as the Squid Web proxy cache, using a connection-state tracking mechanism such as select. The application alternates between the runnable state, in which it serves all the events signaled by the most recent invocation of this mechanism, and the blocked state, in which it waits for I/O events. When exploiting TCP Splice, the application is not notified of the I/O events related to