A Survey of Proxy Cache Evaluation Techniques

Brian D. Davison
Department of Computer Science
Rutgers, The State University of New Jersey
New Brunswick, NJ 08903 USA
davison@cs.rutgers.edu, http://www.cs.rutgers.edu/~davison/

Abstract

Proxy caches are increasingly used around the world to reduce bandwidth requirements and alleviate delays associated with the World-Wide Web. In order to compare the performance of proxy caches, objective measurements must be made. In this paper, we define a space of proxy evaluation methodologies based on the source of the workload used and the form of algorithm implementation. We then survey recent publications and show their locations within this space.

1 Introduction

Proxy caches are increasingly used around the world to reduce bandwidth requirements and alleviate delays associated with the World-Wide Web. This paper describes the space of proxy cache evaluation methodologies and places current research within that space. The primary contributions of this paper are threefold: 1) definition and description of the space of evaluation techniques; 2) appraisal of the different methods within that space; and 3) a survey of cache evaluation techniques from the research literature.

In the next section we provide background on web caching, including the levels of caching present on the web and an overview of some of the current research issues in web proxy caching. We then describe current proxy cache evaluation methods and place existing research in this space.

2 Background

Caching has a long history and is a well-studied topic in the design of computer memory systems (e.g. [HP87, Man82, MH99]), in virtual memory management in operating systems (e.g. [Dei90, GC97]), in file systems (e.g. [CFKL95]), and in databases (e.g. [SSV96]).
Caching on the Internet is also performed for other network services such as DNS [Moc87a, Moc87b], Gopher and FTP [Pet98, Wes98, RH98, Cat92]. In fact, much of today's web caching research can be traced back to the effort to reduce the bandwidth used by FTP [DHS93].

2.1 Web caching

Web caching is the temporary storage of web objects (such as HTML documents) for later retrieval.¹ Proponents of web caching claim three significant advantages: reduced bandwidth consumption (fewer requests and responses that need to go over the network), reduced server load (fewer requests for a server to handle), and reduced latency (since cached responses are available immediately, and closer to the client being served). A fourth is sometimes added: improved reliability, as some objects may be retrievable via cache even when the original servers are not reachable. Together, these features can make the World Wide Web less expensive and better performing. One drawback of caching is the possibility of using an out-of-date object stored in a cache instead of fetching the current object from the origin server. Another is the lack of logs of client page views for advertising purposes (although this is being addressed, e.g. [ML97]).

¹ Actually, web caches store HTTP responses, but we will use the more generic phrase, web object, as a simplification throughout.
² See http://www.web-caching.com/ for lists of proxies and browser extensions.

Caching can be performed by the client application, and is built into virtually every web browser. There are a number of products that extend or replace the built-in caches² with systems that contain larger storage, more features (such as keeping commonly used pages up-to-date and prefetching likely pages), or better performance (such as faster response times as a result of better caching