System Co-Design and Data Management for Flash Devices

Philippe Bonnet
IT University of Copenhagen
Denmark
phbo@itu.dk

Luc Bouganim
INRIA and University of Versailles
France
Luc.Bouganim@inria.fr

Ioannis Koltsidas
IBM Research, Zurich
Switzerland
iko@zurich.ibm.com

Stratis D. Viglas
School of Informatics, University of Edinburgh
United Kingdom
sviglas@inf.ed.ac.uk

ABSTRACT

Flash devices are emerging as a replacement for disks. How does this evolution impact the design of data management systems? While flash devices have been available for years, this question is still open. In this tutorial, we share two views on the development of data management systems for flash devices. The first view considers that flash devices introduce so much complexity that it is necessary to reconsider the strictly layered approach between storage system, operating system and data management system. The second view considers that data management systems should recognize the complexity of flash devices and leverage the characteristics of different classes of devices for different usage patterns. Throughout the tutorial, we will cover the data management stack: from the fundamentals of flash technology, through storage for database systems and the manipulation of flash-resident data, to query processing.

1. SYSTEM CO-DESIGN

Since the advent of Unix, the stability of disk characteristics and interfaces has guaranteed the timelessness of major database system design decisions, i.e., pages are the unit of IO; random accesses are avoided.

Today, the quest for energy-proportional systems and the growing performance gap between processors and magnetic disks are pushing flash devices as replacements for disks. Indeed, flash devices rely on tens of flash chips wired in parallel that together can deliver hundreds of thousands of accesses per second with low energy consumption.
Flash devices embed complex software, called the Flash Translation Layer (FTL), in order to hide flash chip constraints (erase-before-write, limited number of erase-write cycles, sequential page-writes within a flash block). An FTL provides address translation and wear leveling, and strives to hide the impact of updates and random writes based on observed update frequencies, access patterns, and temporal locality.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Articles from this volume were invited to present their results at The 37th International Conference on Very Large Data Bases, August 29th - September 3rd 2011, Seattle, Washington.
Proceedings of the VLDB Endowment, Vol. 4, No. 12
Copyright 2011 VLDB Endowment 2150-8097/11/08... $ 10.00.

This trend towards flash devices has created a mismatch between the simple disk model that underlies the design of today’s database systems and the complex flash devices of today’s computers. This mismatch results in sub-optimal IO performance, which is costly both in terms of throughput and energy consumption. In fact, a tension exists between the design goals of flash devices and those of DBMSs. Flash device designers aim at hiding the constraints of flash chips to compete with hard disk providers. They also compete with each other, tweaking their FTLs to improve overall performance, and masking their design decisions to protect their advantage. Database designers, on the other hand, have full control over the IOs they issue. What they need is a clear and stable distinction between efficient and inefficient IO patterns to produce a stable (re)design of core database techniques.
They might even be able to trade increased complexity for improved performance and stable behavior across devices.

The goal of the first part of this tutorial is to offer database researchers and practitioners an insight into flash chip management as well as a survey of the constraints and opportunities it creates for database system or algorithm designers. We will stress the need for a tighter form of collaboration between the database system, the operating system and the FTL to reconcile the complexity of flash chip management with the performance goals of a database system.

2. DATA MANAGEMENT

In the near future, commodity and enterprise-level hardware is expected to incorporate both flash Solid State Drives (SSDs) and magnetic disks as storage media. In light of this, fundamental principles of data management need to be revisited, as all existing database systems and algorithms have been designed for disks consisting of rotating platters.

However, the term SSD covers multiple classes of devices. The only major characteristic common to all these devices is their excellent random read performance. The remaining characteristics vary by more than two orders of magnitude across different devices. Some SSDs are more than an order of magnitude slower than disks at random writes, while other SSDs dominate disks in both random read and write throughput and latency. The most important