Freewrite: Creating (Almost) Zero-Cost Writes to SSD in Applications

Chunyi Liu,†‡ Fan Ni,† Xingbo Wu,† Xiao Zhang,‡ Song Jiang†
†University of Texas at Arlington, Texas, USA
‡Northwestern Polytechnical University, Xi'an, China
†{chun.liu@uta.edu, fan.ni@mavs.uta.edu, xingbo.wu@mavs.uta.edu, song.jiang@uta.edu}
‡{corey@mail.nwpu.edu.cn, zhangxiao@nwpu.edu.cn}

ABSTRACT

While flash-based SSDs have much higher access speed than hard disks, they have an Achilles heel: the service of write requests. Not only is writing slower than reading, but it can also incur expensive garbage collection operations and reduce an SSD's lifetime. Deduplication can help avoid writing data objects whose contents are already on the disk. A typical such object is the disk block, for which a block-level deduplication scheme can identify duplicates and avoid writing them. For the technique to be effective, data written to the disk must not only be identical to data currently on the disk but also be block-aligned. In this work, we show that many deduplication opportunities are lost due to block misalignment, leading to a substantially large number of unnecessary writes. As case studies, we develop a scheme that uses small amounts of additional space to retain the alignment of data read from the disk when files are modified, and apply it to two important applications: a log-based key-value store (e.g., FAWN) and an LSM-tree-based key-value store (e.g., LevelDB). Our experiments show that the proposed scheme improves throughput by up to 4.5X for FAWN and 26% for LevelDB, with less than 5% space overhead.

CCS Concepts

• Information systems → Data layout;

1. INTRODUCTION

Flash-based SSDs are high-performance storage devices whose access speed is much faster than that of hard disks. Increasingly more applications rely on their sustained high throughput and low latency to provide performance-sensitive services.
These applications include databases [17, 14] and key-value stores [9, 10, 2]. In addition to being power efficient, SSDs have much lower access latency than hard disks, and their throughput is much less sensitive to access sequentiality.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
SYSTOR'17, May 22-24, 2017, Haifa, Israel
© 2017 ACM. ISBN 978-1-4503-5035-8/17/05...$15.00
DOI: http://dx.doi.org/10.1145/3078468.3078471

However, SSDs show an asymmetry between read and write accesses in multiple aspects, which complicates use of the device. First, writes are slower than reads. For example, writing and reading a 4KB page may take 200 μs and 25 μs, respectively [11, 1]. Writes also entail identifying a write address and updating the mapping table. Second, the flash does not allow in-place page overwrite: a page has to be erased before new content can be written into it. Even worse, the erase operation must be conducted in units of blocks, each of which is much larger than a page (a block can have 64 or more pages [2]), and erasing a block takes around 1.5 ms. If block-level address mapping is used and a write request overwrites a page in a block, all live pages currently in the block must be migrated to an erased block, adding cost to the service of the write request. Third, writing can lead to expensive garbage collection operations, which in turn can degrade both write and read performance.
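The overwrite cost under block-level mapping can be illustrated with a back-of-the-envelope model built from the figures cited above (25 μs page read, 200 μs page write, 1.5 ms block erase, 64 pages per block). This is only an illustrative sketch; real flash translation layers batch and pipeline these operations.

```python
# Back-of-the-envelope cost model for overwriting one page under
# block-level address mapping, using the figures cited in the text:
# 25 us page read, 200 us page write, 1.5 ms block erase, 64 pages/block.
READ_US, WRITE_US, ERASE_US = 25, 200, 1500
PAGES_PER_BLOCK = 64

def overwrite_cost_us(live_pages: int) -> int:
    """Overwriting one page forces every other live page in its block
    to be read and rewritten to an erased block, after which the old
    block is erased."""
    migrated = live_pages - 1          # all live pages except the target
    return migrated * (READ_US + WRITE_US) + WRITE_US + ERASE_US

# With a fully live block, a single-page overwrite is roughly 80x more
# expensive than a plain 200 us page write:
print(overwrite_cost_us(PAGES_PER_BLOCK))  # 15875 us
```

Even with no pages to migrate, the mandatory erase alone dominates the cost of the page write itself, which is why reducing the sheer number of writes pays off disproportionately.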
This issue is especially serious for lower-end SSDs, whose over-provisioning space is limited, and for SSDs that are nearly full. Garbage collection operations can also make an SSD's performance less predictable [19]. Fourth, each flash page endures as few as a few thousand P/E (program/erase) cycles, so excessive writes compromise the SSD's lifespan [21].

Deduplication is an effective approach to avoiding writes to the disk. Though it may compromise the sequentiality of a file's block layout on the disk [22], this is not a concern for SSDs, whose performance is less sensitive to that property. In many scenarios, block-level deduplication, which partitions a file into a number of fixed-size blocks and identifies duplicates among them, is preferred to file-level deduplication, which attempts to find identical files, because the former is more likely to find redundancy and thus removes more writes and saves more space [26]. Furthermore, block-level deduplication can be deployed at a position in the I/O stack close to the storage device [23, 7], or even within an SSD [6], so that its benefit is transparently received by upper-level software and applications.

While block-level deduplication can potentially remove writes of redundant data to the disk, it requires the redundant data to be one or more blocks long and block-aligned. A common on-disk data processing pattern is to read a file into memory, insert new data and/or modify or delete existing data in the file, and finally write the updated file back to the disk. Often this process is conducted at byte offsets not aligned to block boundaries. To keep the data in a file contiguously laid out, or to maintain a compact data layout, one may have to shift data that is not being updated in order to make room for new data, or to eliminate the space left by deleted data.
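The alignment sensitivity described above can be sketched with a hypothetical fingerprint-based deduplicator operating on 4 KB blocks (the block size and the `dedup_write` helper are illustrative assumptions, not the paper's implementation). Rewriting an unchanged file costs no block writes, but inserting just ten bytes at the head shifts every block boundary, so no aligned block matches any fingerprint already on disk:

```python
import hashlib
import os

BLOCK = 4096  # assumed deduplication block size (4 KB)

def block_hashes(data: bytes):
    """Fingerprint each fixed-size, aligned block of a file."""
    return [hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def dedup_write(data: bytes, on_disk: set) -> int:
    """Write only blocks whose fingerprints are not yet on disk;
    return how many blocks were actually written."""
    written = 0
    for h in block_hashes(data):
        if h not in on_disk:
            on_disk.add(h)
            written += 1
    return written

disk = set()
original = os.urandom(8 * BLOCK)      # an 8-block file of random content
print(dedup_write(original, disk))    # 8: every block is new
print(dedup_write(original, disk))    # 0: rewriting the same file is free
shifted = b"0123456789" + original    # insert 10 bytes at the head
print(dedup_write(shifted, disk))     # 9: misalignment defeats every match
```

The third call writes all nine blocks of the shifted file even though all but ten of its bytes are already on disk, which is exactly the class of lost opportunity this work targets.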