IJRET: International Journal of Research in Engineering and Technology, eISSN: 2319-1163 | pISSN: 2321-7308
Volume: 03 Issue: 04 | Apr-2014, Available @ http://www.ijret.org

HYBRID WEB CACHING FRAMEWORK FOR REDUCTION OF WEB LATENCY

Ranju Khemka 1, Aruna Jain 2
1 M.Tech Student, Dept. of Computer Science and Engineering, Birla Institute of Technology, Jharkhand, India
2 Associate Professor, Dept. of Computer Science and Engineering, Birla Institute of Technology, Jharkhand, India

Abstract
Distributed web caching and hierarchical web caching are two important techniques for reducing web latency, the time a user spends retrieving web documents. The main performance problems with distributed caching are longer connection times and overheads, such as resolution delay and queuing delay, in the bandwidth used. The main issue with hierarchical caching, by contrast, is longer transmission times. A hybrid scheme that carefully combines the two is therefore needed to reduce web latency effectively. Studies show that a combination of hierarchical and distributed web caching reduces both transmission time and connection time, thereby reducing overall latency. Our results show that the hybrid scheme improves the hit ratio over the hierarchical and distributed caching strategies by 55% and 42%, respectively.

Keywords: Distributed caching, Hierarchical caching, Hybrid caching, Web Latency, Hit ratio

1. INTRODUCTION
The unparalleled growth of the Internet in terms of total bytes transferred among hosts, coupled with the dominance of the HTTP protocol, shows that much can be gained through World Wide Web caching technology [1].
Web caching is a mechanism that not only enhances the end user's experience by reducing latency, server load and perceived lag, but also saves bandwidth for Internet Service Providers (ISPs). In simple terms, a web cache is a temporary storage place for data requested from the Internet. The data can be an HTML page, an image or a multimedia file. The first request for a particular item is fulfilled from the Internet, and the web cache stores copies of the documents passing through it. Subsequent requests for the same item can be satisfied from the cache, if certain conditions are met.

Websites are composed of many web pages and web documents. These in turn are composed of many small parts such as logos, images, tables, text and audio files. Each part is cached as a separate object, and some parts may not be cached at all. For example, when we access a news website, if the logo, some advertising bars and some static content can be cached, only the dynamic news content needs to be downloaded [2].

A proxy server works by intercepting the connections between the sender and the receiver. To the client it acts as the Origin Server, and to the Origin Server it acts as a client.

Fig-1: A proxy server

Proxies predominantly cache web pages. Each time an internal user requests a URL from outside, a temporary copy is stored locally. The next time any internal user requests the same URL, the proxy can serve the local copy instead of retrieving the original across the network, thereby reducing latency and improving performance [3].

Proxy caching improves performance in several ways. First, caching attempts to reduce web latency, the time required to obtain web documents. Latency can be reduced because the proxy cache is much closer to the end user than the original content provider or Origin Server. Second, caching reduces network traffic to the web server.
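The proxy behaviour described above (store a local copy on the first request, serve that copy on later requests for the same URL) can be illustrated with a minimal sketch. It rests on several assumptions that are not part of the paper: the names `ProxyCache` and `fake_origin` are hypothetical, the cache is an unbounded in-memory dictionary, and the freshness conditions mentioned earlier are ignored, so a stored copy is always treated as valid.

```python
from typing import Callable, Dict


class ProxyCache:
    """Toy in-memory proxy cache keyed by URL (illustrative only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # fetch_from_origin is a hypothetical stand-in for a request
        # to the Origin Server.
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        # Cache hit: serve the local copy, the Origin Server is not contacted.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Cache miss: fetch from the Origin Server and keep a copy.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body

    def hit_ratio(self) -> float:
        # Fraction of requests answered from the cache.
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def fake_origin(url: str) -> bytes:
    # Assumed placeholder content; a real proxy would issue an HTTP request.
    return f"content of {url}".encode()


cache = ProxyCache(fake_origin)
for url in ["/logo.png", "/news", "/logo.png", "/logo.png"]:
    cache.get(url)
print(cache.hit_ratio())  # 2 hits out of 4 requests -> 0.5
```

In this sketch the hit ratio is simply hits divided by total requests; a practical proxy would also bound the cache size and check whether a stored copy is still fresh before serving it.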
Network load can be reduced because documents served from the cache travel a shorter distance over the network than documents served by the Origin Server. Finally, proxy caching can reduce the service demands on Origin Servers, since cache hits need not involve the Origin Server. It may also lower transit costs for access providers or ISPs. Furthermore, because cached objects are delivered from proxy servers, caching reduces external latency and improves reliability: a user can obtain a cached copy even if the remote server is unavailable. Because a large number of users are connected to the same proxy server, object characteristics (popularity, spatial and temporal locality) can be better exploited, increasing the cache hit ratio and improving Web performance [4].

The exponential increase in interest in dynamically generated content created a need for more and faster proxy server caches. Because replacing these limited-size proxy caches each time was not a cost-effective solution, many proxy servers were grouped together to form a cluster. The clustering technology handled the issues of scalability, load balancing and fault