ThinDedup: An I/O Deduplication Scheme that Minimizes Efficiency Loss due to Metadata Writes

Fan Ni†, Xingbo Wu⊥, Weijun Li‡, Song Jiang†
†University of Texas at Arlington, Arlington, Texas, USA
⊥University of Illinois at Chicago, Chicago, Illinois, USA
‡Shenzhen Dapu Microelectronics Co. Ltd, Shenzhen, China
†fan.ni@mavs.uta.edu, ⊥wuxb@uic.edu, ‡liweijun@dputech.com, †song.jiang@uta.edu
Abstract—I/O deduplication is an important technique for
saving I/O bandwidth and storage space for storage systems.
However, it requires an additional level of address indirection,
and consequently needs to maintain corresponding metadata.
To meet requirements on data persistency and consistency, the
metadata writes are likely to make deduplication operations much
fatter, in terms of the amount of additional writes on the critical
I/O path, than one might expect. In this paper we propose to
compress the data and insert metadata into data blocks to reduce
metadata writes. Assuming that performance-critical data are
usually compressible, we can mostly remove separate writes of
metadata out of the critical path of servicing users’ requests, and
make I/O deduplication much thinner. Accordingly the scheme is
named ThinDedup. In addition to metadata insertion, ThinDedup
also uses persistency of data fingerprints to evade enforcement of
write order between data and metadata. We have implemented
ThinDedup in the Linux kernel as a device mapper target to
provide block-level deduplication. Experimental results show that,
compared to existing deduplication schemes, ThinDedup achieves
up to 3X higher I/O throughput and up to 88% lower latency
without compromising data persistency.
Index Terms—Deduplication, compression, flush, consistency
I. INTRODUCTION
I/O Deduplication has been widely used in storage systems
for saving storage space [1]–[4]. With data growth at an
explosive rate, deduplication plays an important role in various
computing environments, including data centers, portable
devices, and cyberphysical systems. Deduplication in primary
storage has “cascading benefits across all tiers”, contributing
to reduction of network and I/O loads to other storage tiers and
of space demand on all the tiers [5]. In addition, the advantages
of inline deduplication are also well recognized [3]: it saves
disk space and disk bandwidth in the first place.
However, even with these clear benefits inline deduplication
is rarely deployed for performance-sensitive primary storage
systems in production systems [3]. There are two main con-
cerns from the user side. One is degraded read performance
due to compromised locality. When data are deduplicated
on hard disks (HDDs), sequentiality of data layout can be
disrupted, leaving one or even multiple disk seeks during
originally sequential reads. This issue has been well addressed
by the iDedup scheme by performing selective deduplication
to retain data spatial locality [3]. The other concern, which
can be more challenging to tackle, is additional writes for
persistency of deduplication metadata.
Deduplication systems have to maintain mappings from
the logical address space exposed to users of the storage system to
the physical address space supported by the storage devices¹.
In particular the mapping is from a logical block number
(LBN) to a physical block number (PBN) for block-level
deduplication. To service a synchronous write request, the
corresponding address mapping must be persisted onto the
disk even when the data is deduplicated. In addition to address
mappings, there are other metadata whose persistency can be
expensive, including the mapping between a data block's fingerprint
and its physical address, and a data block's reference count
indicating the number of logical block addresses mapped to it.
Furthermore, the metadata are updated frequently.
In many deduplication systems, out-of-place block writing is
used to enable efficient maintenance of consistency between a
block’s content and its fingerprint [3], [6], [7]. By arranging
the data of the out-of-place writes into a log, slow random
accesses can be turned into fast sequential ones. These benefits
come at the cost of high metadata maintenance overhead: every
write creates a new LBN-to-PBN address mapping.
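To make the bookkeeping concrete, the three metadata structures described above (the LBN-to-PBN map, the fingerprint index, and the per-block reference counts) can be sketched as follows. This is a minimal illustration of generic block-level deduplication, not ThinDedup's actual implementation; all class and method names are hypothetical.

```python
import hashlib

class DedupMetadata:
    """Illustrative sketch of block-level deduplication metadata
    (hypothetical; not ThinDedup's actual data structures)."""

    def __init__(self):
        self.lbn_to_pbn = {}  # logical block number -> physical block number
        self.fp_to_pbn = {}   # content fingerprint -> physical block number
        self.refcount = {}    # physical block -> # of LBNs mapped to it
        self.next_pbn = 0     # next slot in the out-of-place write log

    def write(self, lbn, data):
        """Service a write; return True if the block was deduplicated."""
        fp = hashlib.sha256(data).hexdigest()
        old_pbn = self.lbn_to_pbn.get(lbn)
        if fp in self.fp_to_pbn:
            # Duplicate content: no data write, but metadata must
            # still be updated (and persisted) for this request.
            pbn, deduped = self.fp_to_pbn[fp], True
        else:
            # New content: append to the log at a fresh physical block.
            pbn, deduped = self.next_pbn, False
            self.next_pbn += 1
            self.fp_to_pbn[fp] = pbn
        if old_pbn != pbn:
            # Every write creates a new LBN-to-PBN mapping and
            # touches reference counts on up to two physical blocks.
            self.lbn_to_pbn[lbn] = pbn
            self.refcount[pbn] = self.refcount.get(pbn, 0) + 1
            if old_pbn is not None:
                self.refcount[old_pbn] -= 1
        return deduped

meta = DedupMetadata()
meta.write(0, b"hello")        # new content stored at PBN 0
dup = meta.write(1, b"hello")  # duplicate: LBN 1 also maps to PBN 0
```

Note that even the fully deduplicated second write mutates two metadata structures; persisting those updates synchronously is exactly the cost that motivates ThinDedup.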
For high persistency and strong consistency of the system,
immediate persistency of the metadata is required, which poses
significant challenges to use of inline deduplication in the
primary storage. First, the metadata are small compared to
the data block size. Writing them through the block interface
of disks can introduce significant write amplification. Second,
many of today's applications prefer to quickly persist user data to
minimize the chance of losing them, reflecting a trend in which concern
for user experience outweighs concern for resource consumption [8].
This may neutralize the effort of collectively writing metadata
through batched service of requests for high I/O efficiency and
make metadata I/O even more expensive. Third, metadata may
be retained in non-volatile memory without being immediately
persisted on the disks. However, this requires special hardware
support, such as supercapacitor-backed RAM, which may not
be available. A general-purpose solution is desired that does not
assume the availability of such hardware while still achieving
similar performance and persistency. Fourth, to maintain crash
consistency between metadata and data, one has to pay extra
¹In this context the physical address is distinct from the one internal to
the storage devices. It refers to a logical address in the linear address space
exposed by the device(s).

978-1-5386-6808-5/18/$31.00 ©2018 IEEE