A Web Object Management Policy for Cooperative Hybrid Caching Architecture

Jinsuk Baek, Mingyung Kwak, Paul S. Fisher, and Elva J. Jones
Department of Computer Science
Winston-Salem State University
Winston-Salem, NC 27110 USA
e-mail: {baekj, mkwak106, fisherp, jonese}@wssu.edu

Abstract—We discuss a predictive, hybrid caching structure that eliminates the performance issues of existing structures, and we propose a new policy for discarding web objects that are unlikely to be accessed by clients. The proposed approach is based on a predictive technique that uses tables of rules derived from actual web object request histories. We review previously proposed schemes and their disadvantages, and then describe an improved approach that maintains a summary table in each proxy cache and limits the number of executions of the expensive predictions. We present a simulation of the proposed policy using the NLANR proxy traces to show its impact on the performance of the Internet under web object traffic.

Keywords—Finite Inductive Sequences, Object Discarding, Hybrid Caching Architecture, Reference Table, Summary Table.

I. INTRODUCTION

With the increasing popularity of the WWW, web traffic has become one of the most resource-consuming applications on the Internet. This growing use of the web increases network bandwidth consumption, straining the capacity of the supporting networks. Hence, many solutions to improve the bandwidth utilization of web traffic have been proposed, including compression and delta-encoding [1], multicast [2], and new congestion avoidance algorithms [3]. Among these proposals, caching popular web objects at proxy sites at different levels of the network is one of the most efficient approaches, promising network bandwidth savings and reduced Internet traffic.
Proxy caches can be placed at different levels in the network to serve clients, and they also cooperate with one another in the case of a cache miss. If a web object requested by a client is not found in the local proxy cache, that cache queries its sibling or nearby proxy caches for the object; only if the object is not found in the nearby caches is the request forwarded to the origin server. This cooperation between proxy caches is called cooperative caching. The two most common architectures for implementing large-scale proxy cache cooperation are hierarchical [4] and distributed [5] caching systems. A new architecture, the hybrid caching architecture [6], which uses both hierarchical caches and same-level caches, has recently been proposed by one of the authors. Under this architecture, an FI-based object discarding policy [7] was proposed and shown to improve hit ratio and response time compared to the two existing architectures. However, we have found that frequent executions of FI systems can lower the performance of the proxy cache.

We propose a new object management policy focused on this issue. Under our policy, each lower-level proxy cache maintains a summary table containing information about the objects held by its neighbor proxy caches. The summary table specifies the location of a requested object when that object is not available in the local proxy cache. In addition, to boost the performance of the proxy cache, the proposed solution limits the number of executions of FI systems depending on the currently available space in the proxy cache. As a result, our solution reduces the response time for requested objects by minimizing unnecessary traffic and bandwidth usage between the lower-level proxy caches and the upper-level proxy cache.

The outline of the rest of this paper is as follows.
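The policy just described can be illustrated with a minimal sketch: on a miss, the summary table directs the request to the single neighbor believed to hold the object (avoiding a broadcast to all siblings), and the expensive FI-based discarding is executed only when free space falls below a threshold. All names here (ProxyCache, fi_discard, FI_SPACE_THRESHOLD, and the threshold value itself) are hypothetical illustrations, not the paper's actual implementation; the FI victim selection [7] is replaced by a trivial stand-in.

```python
from dataclasses import dataclass, field

FI_SPACE_THRESHOLD = 0.10  # assumed: run FI discarding only when free space is this low


@dataclass
class ProxyCache:
    capacity: int
    store: dict = field(default_factory=dict)      # object id -> object
    summary: dict = field(default_factory=dict)    # object id -> neighbor believed to hold it
    neighbors: dict = field(default_factory=dict)  # neighbor id -> ProxyCache

    def free_fraction(self):
        return 1.0 - len(self.store) / self.capacity

    def request(self, oid):
        # 1. Local hit.
        if oid in self.store:
            return self.store[oid], "local"
        # 2. Consult the summary table instead of querying every sibling.
        nid = self.summary.get(oid)
        if nid is not None and oid in self.neighbors[nid].store:
            return self.neighbors[nid].store[oid], f"neighbor:{nid}"
        # 3. Not found nearby: fetch via the upper level / origin server.
        obj = self.fetch_from_origin(oid)
        self.admit(oid, obj)
        return obj, "origin"

    def admit(self, oid, obj):
        # FI discarding is expensive, so execute it only when free space
        # drops below the threshold; otherwise admit without running it.
        if self.free_fraction() <= FI_SPACE_THRESHOLD:
            victim = self.fi_discard()
            self.store.pop(victim, None)
        if len(self.store) < self.capacity:
            self.store[oid] = obj

    def fi_discard(self):
        # Stand-in for the FI-rule-based victim selection [7]:
        # here we simply pick an arbitrary resident object.
        return next(iter(self.store))

    def fetch_from_origin(self, oid):
        # Placeholder for the upper-level / origin-server fetch.
        return f"object-{oid}"
```

The point of the sketch is the ordering of the three lookup steps and the gating of `fi_discard`: a neighbor hit never triggers FI execution, and admissions run FI only when the cache is nearly full.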
In Section 2, we briefly review the existing caching architectures, including the hybrid caching architecture. In Section 3, we describe a new object management policy that can be applied to the hybrid caching architecture. We show the performance of the proposed solution in Section 4 and conclude in Section 5.

II. EXISTING SOLUTION

Hierarchical caching systems basically work from the lowest levels toward the highest. If a cache miss occurs at a given level, the request is forwarded to a higher-level cache; this forwarding continues until the requested object is found. In distributed caching, each object is allowed to be cached only at the lowest level, and a cache can obtain an object from neighboring caches; this effectively addresses many of the drawbacks of hierarchical caching. Several such approaches exist, including the Internet Cache Protocol (ICP) [8], the Cache Array Routing Protocol (CARP) [9], Summary Cache [10], and Home Cache [11]. Unfortunately, each architecture has performance limitations in terms of message overhead, response time, and object duplication. Comprehensive studies [12, 13] of both architectures found that hierarchical caching provides shorter connection times than distributed caching, while the latter provides shorter transmission times and higher bandwidth usage. To overcome the drawbacks of both architectures, the hybrid caching architecture [6] was proposed.

978-1-4244-4520-2/09/$25.00 ©2009 IEEE
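The bottom-up forwarding of hierarchical caching described above can be sketched in a few lines; the list-of-dicts representation of the cache levels is purely illustrative and not from the paper.

```python
def hierarchical_lookup(levels, oid):
    """Resolve oid by walking up the hierarchy.

    levels[0] is the lowest-level cache and levels[-1] stands in for the
    origin server; each level is modeled here as a plain dict. On a miss
    at one level, the request is forwarded to the next-higher level, as
    in hierarchical caching. (Real systems also leave copies at each
    traversed level on the way back down; omitted for brevity.)
    """
    for depth, cache in enumerate(levels):
        if oid in cache:
            return cache[oid], depth
    raise KeyError(oid)
```

Note that every miss costs one additional level of forwarding, which is the source of the longer transmission times attributed to hierarchical caching in the studies cited above.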