Evaluation of a Zero-Copy Protocol Implementation

Karl-André Skevik, Thomas Plagemann, Vera Goebel
Department of Informatics, University of Oslo
P.O. Box 1080, Blindern, N-0316 OSLO, Norway
{karlas, plageman, goebel}@ifi.uio.no

Pål Halvorsen
UniK, University of Oslo
P.O. Box 70, N-2027 KJELLER, Norway
paalh@unik.no

Abstract

Internet services like the world-wide web and multimedia applications like News- and Video-on-Demand have become very popular in recent years. Since a high and rapidly increasing number of users retrieve multimedia data at high data rates, the data servers can represent a severe bottleneck. Traditional time- and resource-consuming operations, like memory copy operations, limit the number of concurrent streams that can be transmitted from the server, for two reasons: (1) memory space is wasted holding identical data copies in different address spaces; and (2) a lot of CPU resources are used on copy operations. To avoid this bottleneck and make memory and CPU resources available for other tasks, i.e., for serving more concurrent clients, we have implemented a zero-copy data path through the communication protocols to support high-speed network communication, based on UVM [6]. In this paper, we describe the implementation and evaluation of the zero-copy protocol mechanism, and we show the potential for substantial performance improvement when moving data through the communication system without any copy operations.

1. Introduction

There has been a tremendous growth in the use of multimedia Internet services, and in particular, applications like News-on-Demand (NoD) and Video-on-Demand (VoD) have become very popular. Thus, the number of users, as well as the amount of data each user downloads from servers on the Internet, is rapidly increasing.
Today, mid-price personal computers are capable of handling the load that such multimedia applications impose on the client system, but in Media-on-Demand (MoD) servers, the potentially very high number of concurrent users retrieving data represents a problem. In MoD servers in general, commodity operating systems represent the major performance bottleneck, because operating systems are not getting faster as fast as hardware [11]. There are two basic orthogonal approaches to this problem: (1) development of an architecture for a single server that makes optimal use of a given set of resources, i.e., maximizes the number of concurrent clients a single server can support; and (2) combination of multiple single servers, e.g., in a server farm or cluster, to scale up the number of concurrent users.

We have concentrated on the first approach. To support multiple concurrent users each retrieving a high data rate multimedia stream, the operating system and server architecture must be improved and optimized. Crucial issues include copy operations and multiple copies of the same data in main memory [12]. A major bottleneck in high-throughput systems is the send() system call (and equivalents), which copies data from the application buffer in user space to the kernel memory region. This is expensive for several reasons [6]:

- The bandwidth of the main memory is limited, and every copy operation is affected by this.
- A lot of CPU cycles are consumed for every copy operation. Often, the CPU must move the data word-by-word from the source buffer to the destination, i.e., all the data flows through the CPU. This means that the CPU is unavailable for other work during the copy operation.
- Data copy operations affect the cache. Since the CPU accesses main memory through the cache, useful information resident in the cache before the copy operation is flushed out.
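The copy semantics of send() can be observed directly from user space. The following is a minimal Python sketch for illustration only; it is not part of the implementation evaluated in this paper. The function name `send_copies_buffer` and the 4096-byte payload are hypothetical choices, and a local socket pair stands in for a real network connection. The sketch overwrites the application buffer after the send call has returned and checks that the receiver still sees the original bytes, which is only possible because the kernel took its own copy of the data.

```python
import socket


def send_copies_buffer(payload_size: int = 4096) -> bytes:
    """Send a buffer over a local socket pair, overwrite the buffer after the
    send call returns, and return what the receiver actually reads.

    Hypothetical helper for illustration; the payload is kept small so that it
    fits in the socket buffer and the send does not block.
    """
    sender, receiver = socket.socketpair()
    try:
        # Application buffer in user space.
        app_buffer = bytearray(b"A" * payload_size)

        # Conventional send path: the kernel copies the bytes out of
        # app_buffer into kernel memory before the call returns.
        sender.sendall(app_buffer)

        # Reuse the application buffer immediately. With copy semantics this
        # cannot change the data that is already queued in the kernel.
        app_buffer[:] = b"B" * payload_size

        received = bytearray()
        while len(received) < payload_size:
            chunk = receiver.recv(payload_size - len(received))
            if not chunk:
                break
            received += chunk
        return bytes(received)
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    data = send_copies_buffer()
    print("receiver saw only the original bytes:", data == b"A" * len(data))
```

The received data consists solely of the bytes that were in the buffer at the time of the call. This safety for the application is exactly what costs one full pass over the data in main memory per send, and it is the pass that a zero-copy data path aims to remove.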
To avoid the memory copy bottleneck, several solutions have been proposed (for a state-of-the-art overview, see [12]), using mechanisms like programmed I/O, page remapping, shared memory, etc. For example, Afterburner [7] and Medusa [3] copy data directly onto the on-board memory, using programmed I/O with integrated checksum and data length calculation. Using DMA and a user-level implementation of the communication software, the application device channel [8] gives restricted but direct access to an