A CONTENT-TYPE BASED EVALUATION OF WEB
CACHE REPLACEMENT POLICIES
F.J. González-Cañete, E. Casilari, A. Triviño-Cabrera
Department of Electronic Technology, University of Málaga, Spain
University of Málaga, E.T.S.I. Telecomunicación, Campus de Teatinos, 29071, Málaga, Spain
{fgc,ecasilari,atc}@uma.es
ABSTRACT
This paper studies the performance of six replacement policies when only one document content-type (Application,
Audio, Images, Text or Video) is considered at a time, with the aim of implementing a proxy cache that differentiates
between types of traffic. The classical caching algorithms LRU, LFU and LFU-DA and the caching schemes
specifically developed for Web documents, GD-SIZE, GDSF and GD*, have been studied. Using a trace log of a
real proxy cache, the main properties of the documents of each content-type have been characterized.
Finally, a trace-driven simulation study of the performance of the six replacement policies has been carried out for the
traffic generated by each content-type considered. From these results we can conclude which replacement policies
perform best for each content-type and cache size.
KEYWORDS
Web caching, replacement policies, document content-types.
1. INTRODUCTION
The Internet and the World Wide Web (the Web) are continuously evolving and growing, so many
efforts have been made to optimize them. One of the most important optimization techniques is Web
proxy caching, which stores the documents requested by users close to them. Since it was proposed in
(Luotonen, 1997), Web proxy caching has been used to reduce the latency perceived by users, the
HTTP traffic and the server load. Following this original proposal of a Web proxy cache, many research
activities have aimed to study and develop replacement policies (Poplipnig, 2003) (Balamash, 2004),
algorithms for cache coherence (Krishmamurthy, 1999) and cache architectures (Busari, 2000) in order to
improve the performance of the caching system.
One of the main research lines is based on differentiating the types of documents present in the Web
(Images, Text, Video, …). Khayari proposed to store in the cache only the most frequently requested
document types (mpeg, gif, jpg, flash, html and plain), although his proposal did not improve cache
performance (Khayari, 2005). In this paper we determine, by means of simulations, the best replacement
policy for each document type.
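As an illustration of what a per-content-type, trace-driven simulation involves, the following minimal Python sketch replays a request trace through a byte-bounded LRU cache, once for each content-type category. It is not the simulator used in this study. The function names, the `(url, mime_type, size)` trace format, the mapping from the top-level MIME type to the five categories and the choice of hit ratio and byte hit ratio as metrics are all assumptions made for the example.

```python
from collections import OrderedDict

# Assumed mapping from the top-level MIME type to the five categories
# named in this paper; the exact classification rule is hypothetical.
CATEGORIES = {
    "application": "Application",
    "audio": "Audio",
    "image": "Images",
    "text": "Text",
    "video": "Video",
}


def category_of(mime_type):
    """Return the category of a MIME type such as 'image/gif', or None."""
    top_level = mime_type.split("/", 1)[0].strip().lower()
    return CATEGORIES.get(top_level)


def simulate_lru(requests, capacity_bytes):
    """Replay (url, size) requests through a byte-bounded LRU cache.

    Returns (hit_ratio, byte_hit_ratio).
    """
    cache = OrderedDict()  # url -> size, least recently used first
    used = 0
    hits = hit_bytes = total = total_bytes = 0
    for url, size in requests:
        total += 1
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # document can never fit; do not cache it
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict LRU entry
            used -= evicted_size
        cache[url] = size
        used += size
    if total == 0:
        return 0.0, 0.0
    byte_hit_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return hits / total, byte_hit_ratio


def simulate_by_category(trace, capacity_bytes):
    """Split a (url, mime_type, size) trace by category and simulate each."""
    per_category = {}
    for url, mime_type, size in trace:
        category = category_of(mime_type)
        if category is not None:
            per_category.setdefault(category, []).append((url, size))
    return {
        category: simulate_lru(requests, capacity_bytes)
        for category, requests in per_category.items()
    }
```

Calling `simulate_by_category(trace, capacity_bytes)` returns one `(hit_ratio, byte_hit_ratio)` pair per category found in the trace. Under the same assumptions, another policy could be compared by replacing `simulate_lru` with a function that has the same interface.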
This paper is organized as follows. Section 2 summarizes the trace processing and the statistical
characterization of the workload based on the content-type. Section 3 presents and evaluates the performance
of a proxy cache that takes into account only one content-type of the downloaded documents. Finally, Section
4 presents the main conclusions of this paper.
2. TRACE PROCESSING AND CHARACTERIZATION
To evaluate the performance of a cache that considers only one document content-type at a time, a
workload trace containing HTTP requests from a proxy of the IRCache project has been used
(IRCache). This proxy is located in the Research Triangle Park (North Carolina, USA). The traces include
requests from the 7th to the 11th of June 2004 generated by the Squid Web proxy cache software (Squid
ISBN: 978-972-8924-30-0 © 2007 IADIS