CARDs: Cluster-Aware Remote Disks

Vlad Olaru, Walter F. Tichy
Computer Science Department
Karlsruhe University, Germany
E-mail: {olaru,tichy}@ipd.uka.de

Abstract

This paper presents Cluster-Aware Remote Disks (CARDs), a Single System I/O architecture for cluster computing. CARDs virtualize accesses to remote cluster disks over a System Area Network. Their operation is driven by cooperative caching policies that jointly manage the cluster caches. All the CARDs of a given disk employ a common policy, independently of other CARD sets. CARD drivers have been implemented as Linux kernel modules that can flexibly accommodate various cooperative caching algorithms. We designed and implemented a decentralized policy called Home-based Serverless Cooperative Caching (HSCC). HSCC showed cache hit ratios over 50% for workloads that exceed the capacity of the global cache. The best speedup of a CARD over a remote disk interface was 1.54.

1 Introduction

High-speed System Area Networks (SANs) have latencies and bandwidths comparable to those of a memory subsystem. This makes a case for integrating cluster resources into Single System Image services. Integrating distributed resources into a single system has been the subject of substantial prior research. User-space communication subsystems [23, 20] improved SAN performance by removing the kernel from the critical path. However, message passing is not a convenient programming model, so high-level software abstractions (memory pages, disk blocks, etc.) were developed. A notable research effort in this direction was that of software Distributed Shared Memory (DSM) systems [1, 24].

Cooperative caching network filesystems [2, 8] changed the distributed filesystem memory hierarchy (client cache, server cache, server disk) by letting client cache misses be checked against other client caches before the server cache.
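The lookup order just described can be illustrated with a minimal sketch. It is not taken from the cited systems: the function name `cooperative_read`, the dictionary-based caches, and the choice to copy a block into the local cache after a remote hit are all assumptions made for illustration. Real systems typically locate blocks through a directory or hints rather than probing every peer, and they also handle eviction, which is omitted here.

```python
from typing import Dict, List, Tuple

Block = bytes


def cooperative_read(
    block_id: int,
    local_cache: Dict[int, Block],
    peer_caches: List[Dict[int, Block]],
    server_cache: Dict[int, Block],
    server_disk: Dict[int, Block],
) -> Tuple[Block, str]:
    """Return (data, source) using the cooperative caching lookup order.

    Hypothetical helper: caches are plain dicts keyed by block number.
    """
    # 1. Local client cache.
    if block_id in local_cache:
        return local_cache[block_id], "local cache"

    # 2. Other clients' caches, consulted before the server cache.
    for peer in peer_caches:
        if block_id in peer:
            data = peer[block_id]
            local_cache[block_id] = data  # assumption: keep a local copy
            return data, "peer cache"

    # 3. Server cache.
    if block_id in server_cache:
        data = server_cache[block_id]
        local_cache[block_id] = data
        return data, "server cache"

    # 4. Server disk, the slowest level; fill the caches on the way back.
    data = server_disk[block_id]
    server_cache[block_id] = data
    local_cache[block_id] = data
    return data, "server disk"
```

In this sketch a block that is absent locally but held by any peer is served from that peer's memory, so the server cache and disk are reached only when no client in the cluster caches the block.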
Thus, the working set grew beyond the local memory limit while read latency improved, because remote caches were accessed faster than the disk (even a local one).

Flexible/extensible kernels [5, 9] have responded better than conventional general-purpose kernels to the challenges raised by the new class of highly I/O-intensive applications (mostly related to multimedia and Web/Internet). These systems manage resources jointly: applications manage their own resources, while the system software continues to provide general mechanisms such as protection domains, resource allocation, scheduling, etc.

Our work is inspired by all these trends. CARD drivers hide the distributed nature of the cluster disk caches by offering the local hosts an interface to a global unified buffer cache (from here on called the cooperative cache). Similar to DSMs, CARDs use a high-level abstraction (disk blocks) to deal with remote resources, and cooperative caching algorithms [8] to jointly manage the cluster caches. They rely on the low communication latencies of powerful interconnects to minimize block access times. Applications may download their own caching policies into the CARD driver. Thus, the kernel provides the block access mechanism (the CARD driver) while applications can specify their block management policy at will. Different caching policies can be in use at the same time in the cluster, but the set of CARDs of a given disk must employ a common policy.

This paper evaluates the performance of CARDs as a distributed storage system. We present and evaluate a simple and efficient decentralized cooperative caching policy, Home-based Serverless Cooperative Caching. We emphasize the flexibility of policy choice in our system by implementing and evaluating another cooperative caching algorithm, Hash Distributed Caching [8].

2 Cluster-Aware Remote Disks

CARDs are block devices that virtualize remote disk accesses over a SAN.
They can be mounted on the local system as regular block devices. Without cooperative caching, a CARD behaves like a remote disk interface (RD). The RD also checks every miss in the local buffer cache against the remote buffer cache at the physical disk node. A set of CARDs using a common cooperative caching pol-