Implementation of a Linux Log-Structured File System with a Garbage Collector

Martin Jambor, Dept. of Software Engineering, Charles University, Prague, jamborm@matfyz.cz
Tomas Hruby, Dept. of Software Engineering, Charles University, Prague, byjac@matfyz.cz
Jan Taus, Dept. of Software Engineering, Charles University, Prague, pan tau@matfyz.cz
Kuba Krchak, Dept. of Software Engineering, Charles University, Prague, gkg@matfyz.cz
Viliam Holub, Dept. of Software Engineering, Charles University, Prague, holub@dsrg.mff.cuni.cz

ABSTRACT

In many workloads, most write operations performed on a file system modify only a small number of blocks. The log-structured file system was designed for such a workload, additionally with the aim of fast crash recovery and system snapshots. Surprisingly, although implemented for Berkeley Sprite and BSD systems, there was no complete implementation for the current Linux kernel. In this paper, we present a complete implementation of the log-structured file system for the Linux kernel, which includes a user-space garbage collector and additional tools. We evaluate the measurements obtained in several test cases and compare the results with widely-used ext3.

Categories and Subject Descriptors

D.4.3 [Operating Systems]: File Systems Management—file organization; D.4.2 [Operating Systems]: Storage Management—garbage collection; E.5 [Data]: Files—organization/structure

Keywords

Log-structured file systems, Linux file systems, garbage collection

1. INTRODUCTION

As random access memory is getting cheaper and more abundant in both personal computers and servers, many more workloads fit entirely into the disk cache. With most of the read requests satisfied by the cache, it is reasonable to optimize file systems primarily for writing. Moreover, quick crash recovery is a common requirement for a production file system and the most common way of achieving it is journaling.
On the other hand, this technique incurs a write performance penalty [14] because it needs to write a portion of the data twice to achieve a consistent metadata state at any given moment [21].

Log-structured file systems (LFS) were proposed in [15] and first implemented for the Berkeley Sprite system [17] in order to address these two issues. A log-structured file system writes all new information to a sequential structure referred to as the log, thus minimizing the number of seeks and allowing fast crash recovery. Such file systems can also provide additional functionality not easily implemented by traditional file systems, such as snapshots.

Following the Sprite-LFS mentioned above, Seltzer et al. [18] implemented an LFS for contemporary BSD systems in 1993. Unfortunately, it has since been removed from FreeBSD and OpenBSD. It is still present in NetBSD, but it appears to be no longer completely functional as of NetBSD 2.0.2 [22]. There have been several attempts to write an LFS for Linux, but all except one have been abandoned without achieving their goals. The only exception is the NILFS project [11], which is currently under development, but it still lacks a working garbage collector, which is a vital part of any LFS and also the part that poses the most implementation issues. As a result, there has not been a working implementation of a traditional¹ LFS for an open-source operating system since 4.4BSD.

In this paper, we present a design and an implementation of an LFS for the Linux 2.6 kernel which takes full advantage of the page cache, has a working garbage collector, uses sophisticated data structures for large directories that considerably speed up directory operations, implements snapshots and is capable of fast recovery from a system failure.
We concentrate on those parts that differ from the BSD implementation [18], on how the file system is integrated into the current Linux environment, and on our solutions to the problems encountered during the implementation of the garbage collector and of segment management in general. We have also performed a series of measurements to compare our file system with ext3 [21].

¹ There are LFSs for flash-based devices, but they pursue different goals and are not considered in this paper.