Modeling and Analysis of Multi-Level Caching in Distributed Database Systems

Doaa S. El Zanfaly, Info. Systems Dept., Faculty of Computers & Info., Helwan Univ., Cairo
Reda A. Ammar, Computer Science & Engr. Dept., Univ. of Connecticut, Storrs, CT 06269-1155
A. Sharaf Eldin, Info. Systems Dept., Faculty of Computers & Info., Helwan Univ., Cairo

Abstract: - Caching frequently asked queries is an effective way to improve the performance of both centralized and distributed database systems. Intensive work has been done in this area to propose different query caching techniques and to evaluate their performance. However, most of this work was confined to caching previous query results in a single-level caching architecture, and the evaluations were based on simulations. In [1], we proposed a new query caching technique for caching both query results and execution plans in a multi-level caching architecture. The centralized version of this technique was evaluated and the results were reported in [2]. In this paper, we present an analytical model to evaluate the performance of the proposed technique in distributed database systems.

Key-Words: - Multi-Level Caching, Query Caching, Query Processing, Performance Modeling.

1 Introduction

Most query caching techniques are confined to caching prior query results in a single-level caching architecture, so that the underlying database need not be accessed each time a user submits the same query [3-7]. Although these techniques play a vital role in improving the performance of distributed queries, they do have some drawbacks. Caching query results requires space and an appropriate updating strategy to keep cached results consistent with the underlying databases at all times [4]. To overcome these drawbacks, we developed a new query caching approach based on caching query execution plans, together with some results, in a multi-level caching architecture [1].
We also developed an analytical model to evaluate the performance of the new architecture when it is implemented on a single computer [2]. In this paper, we first describe the new multi-level caching technique. We then present an analytical model to evaluate its performance when it is deployed in distributed systems. The results show that the multi-level caching technique yields substantial performance gains.

2 Multi-Level Caching Technique

Instead of caching prior query results in a single-level cache architecture, we developed a new caching technique that caches query execution plans, subplans, and results in a multi-level cache architecture. The cache is divided linearly into multiple levels, each of which contains a subset of the subplans of global queries. Plans are cached in the form of interrelated but independent subplans: interrelated because together they represent a global execution plan, yet independent because each subplan can be reused on its own, in combination with other cached subplans, to construct another execution plan. Each cache level is further divided into two partitions: one for caching query execution plans (subplans) and the other for caching the results of those plans (subplans) that are frequently used. New plans are cached at the top cache level; as the system is used, subplans are distributed among the different cache levels according to their recency. Through this architecture, we extend the idea of query caching by doing the following:

● We cache a combination of query results and execution plans. Caching query results reduces the access time, while caching execution plans makes the cache independent of the database: each time data in a database is modified, there is no need to propagate these modifications to the cache, because the plan is simply reused whenever needed with its new attribute values.

● We cache global plans, and hence some of their subplans are implicitly cached.
Thus, cached plans can be reused to execute different queries instead of serving only the single query for which they were built. This reduces processing time by avoiding the optimization and reconstruction of global execution plans (subplans). It also reduces search time, since only the relevant parts of the cache are searched rather than the entire cache.
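The structure described above can be sketched in code. The following is a minimal illustration, not the paper's implementation: class and method names are invented for exposition, and simple LRU demotion stands in for the recency-based distribution of subplans among levels. Each level holds two partitions (one for subplans, one for results), new subplans enter the top level, the least recently used entry of a full level is demoted to the level below, and a lookup searches level by level and promotes a hit back to the top.

```python
from collections import OrderedDict

class MultiLevelPlanCache:
    """Hypothetical sketch of the multi-level cache: each level has two
    partitions, one for execution (sub)plans and one for the results of
    frequently used (sub)plans. New entries are cached at the top level;
    overflow demotes the least recently used entry to the next level."""

    def __init__(self, num_levels=3, capacity_per_level=4):
        self.levels = [
            {"plans": OrderedDict(), "results": OrderedDict()}
            for _ in range(num_levels)
        ]
        self.capacity = capacity_per_level

    def put_plan(self, key, plan):
        # New (sub)plans are always cached at the top level.
        self._insert(0, "plans", key, plan)

    def put_result(self, key, result):
        self._insert(0, "results", key, result)

    def _insert(self, level, part, key, value):
        if level >= len(self.levels):
            return  # fell off the lowest level: evicted entirely
        store = self.levels[level][part]
        store[key] = value
        store.move_to_end(key)          # mark as most recently used
        if len(store) > self.capacity:
            old_key, old_val = store.popitem(last=False)   # LRU entry
            self._insert(level + 1, part, old_key, old_val)  # demote

    def get_plan(self, key):
        # Search level by level; a hit is promoted back to the top, so
        # frequently reused subplans stay in the upper levels.
        for level in self.levels:
            if key in level["plans"]:
                plan = level["plans"].pop(key)
                self._insert(0, "plans", key, plan)
                return plan
        return None  # miss: the plan must be optimized from scratch
```

Because each subplan is cached under its own key, a lookup can stitch together subplans that originated in different global plans, which is what lets a cached plan serve queries other than the one that produced it.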