CC-GiST: Cache Conscious-Generalized Search Tree for Supporting Various Fast Intelligent Applications

Won-Sik Kim 1, Woong-Kee Loh 2, and Wook-Shin Han 1,*

1 Department of Communication Engineering, Kyungpook National University, Korea
wskim@www-db.knu.ac.kr, wshan@knu.ac.kr
2 Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Korea
woong@mozart.kaist.ac.kr

As technology advances, the speed gap between the CPU and main memory grows larger every year. Because of this gap, making the most of the cache that sits between the CPU and main memory is regarded as important, and there has been a great deal of research on the issue. One line of this research concerns cache conscious trees, which reduce the cost of accessing main memory indexes. Cache conscious trees are designed around the characteristics of the cache so that they cause as few cache misses as possible. The most widely known cache conscious trees are the CSB+-tree, the pkB-tree, and the CR-tree.

Since it is costly and error-prone to implement every cache conscious tree separately, a new systematic approach is needed. An analogous approach exists for disk based indexes: the Generalized Search Tree (GiST) was proposed as a framework for developing them. The GiST provides the features common to disk based balanced search trees. Hence, when developing a disk based index using the GiST, only the features specific to that index need to be implemented. However, the GiST cannot be used efficiently for main memory indexes because it was originally designed for disk based indexes.

In this paper, we propose the Cache Conscious-Generalized Search Tree (CC-GiST), which extends the GiST to be cache conscious.
By analyzing the techniques used by the existing cache conscious trees, we derive two generalized techniques that can be applied to any cache conscious tree: pointer compression and key compression. The CC-GiST incorporates these techniques in extending the disk based GiST.

The pointer compression technique in cache conscious trees reduces the number of pointers in an internal node. It removes a subset of the n pointers $ptr_i$ ($1 \le i \le n$) in the internal node, and stores the child node (pointed to by a pointer $ptr_j$ that is not removed) together with the subsequent child nodes (originally pointed to by the removed pointers $ptr_{j+1}$, $ptr_{j+2}$, ...) physically consecutively in the same segment.

The key compression technique in cache conscious trees reduces the size of a key so that $Len(Compress(BaseKey_i, Key_i)) \le Len(Key_i)$ holds, where $Key_i$ is the i-th key in a node, and $BaseKey_i$

* Corresponding author.

S. Mehrotra et al. (Eds.): ISI 2006, LNCS 3975, pp. 657–658, 2006. © Springer-Verlag Berlin Heidelberg 2006
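
The pointer compression and key compression techniques described in the abstract can be illustrated with a minimal Python sketch. It is not the CC-GiST implementation. All names (`InternalNode`, `Leaf`, `find_child`, `compress`, `decompress`, `byte_len`) are hypothetical. The sketch assumes that only the first child pointer is retained, in the style of the CSB+-tree, that a Python list stands in for a physically contiguous segment, and that keys are non-negative integers compressed as an offset from a base key that is no larger than the key. The abstract does not define how the base key is chosen, so that choice is left to the caller here.

```python
import bisect
from dataclasses import dataclass
from typing import List


@dataclass
class Leaf:
    keys: List[int]


@dataclass
class InternalNode:
    separators: List[int]  # child i + 1 holds keys >= separators[i]
    first_child: int       # the single retained pointer (a slot in the segment)


def find_child(segment: List[Leaf], node: InternalNode, key: int) -> Leaf:
    """Pointer compression: locate child i by offset from the retained pointer.

    The removed pointers are implied because the children are stored
    consecutively in the same segment.
    """
    i = bisect.bisect_right(node.separators, key)
    return segment[node.first_child + i]


def byte_len(value: int) -> int:
    """Minimal number of bytes needed to store a non-negative integer."""
    return max(1, (value.bit_length() + 7) // 8)


def compress(base_key: int, key: int) -> bytes:
    """Key compression (assumed scheme): store the offset from base_key.

    Because 0 <= key - base_key <= key, the result is never longer than
    the minimal encoding of key itself.
    """
    if not 0 <= base_key <= key:
        raise ValueError("expected 0 <= base_key <= key")
    delta = key - base_key
    return delta.to_bytes(byte_len(delta), "big")


def decompress(base_key: int, data: bytes) -> int:
    """Recover the original key from its compressed form."""
    return base_key + int.from_bytes(data, "big")


# Three sibling leaves stored consecutively; the parent keeps one pointer.
segment = [Leaf([1, 5]), Leaf([10, 15]), Leaf([20, 25])]
root = InternalNode(separators=[10, 20], first_child=0)
```

With this layout the internal node stores one pointer instead of three, which leaves more room for keys in a cache line. The delta encoding is only one way to satisfy the length inequality; prefix-based or quantization-based schemes would need their own `compress` and `decompress` pair.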