Kinetic Action: Performance Analysis of Integrated Key-Value Storage
Devices vs. LevelDB Servers
Manas Minglani, Jim Diehl, Xiang Cao∗, Bingzhe Li, Dongchul Park†, David J. Lilja, and David H.C. Du
University of Minnesota - Twin Cities, Minneapolis, Minnesota, USA
{mingl001, jdiehl, lixx1743, lilja, du}@umn.edu
∗School of Computing and Information Systems, Grand Valley State University, Allendale, Michigan, USA
caox@gvsu.edu
†Computer & Electronic Systems Engineering, Hankuk University of Foreign Studies, South Korea
dpark@hufs.ac.kr
Abstract—With the rise of cloud storage and many data-intensive
applications, there is an unprecedented growth in the
volume of unstructured data. In response, key-value object
storage is becoming more popular for the ease with which it
can store, manage, and retrieve large amounts of this data.
Seagate recently launched Kinetic direct-access-over-Ethernet
hard drives which incorporate a LevelDB key-value store
inside each drive. In this work, we evaluate these drives using
micro as well as macro benchmarks to help understand the
performance limits, trade-offs, and implications of replacing
traditional hard drives with Kinetic drives in data centers and
high-performance systems. We perform in-depth throughput
and latency benchmarking of these Kinetic drives (each acting
as a tiny independent server) from a client machine connected
to them via Ethernet. We compare these results to a SATA-based
and a faster SAS-based traditional server running LevelDB. Our
sample Kinetic drives are CPU-bound, but they still average
a sequential write throughput of 63 MB/sec and a sequential
read throughput of 78 MB/sec for 1 MB value sizes. They also
demonstrate unique Kinetic features including direct disk-to-disk
data transfer. Our macro benchmarking using the Yahoo Cloud
Serving Benchmark (YCSB) shows that mid-range LevelDB
servers outperform the Kinetic drives for several workloads;
however, this is not always the case. For larger value sizes, even
these first-generation sample Kinetic drives outperform a full
server for several different workloads.
Keywords— Performance Evaluation, Data Center Storage
Architecture, Key-Value Store, Cloud Applications
I. INTRODUCTION
The amount of digital data is growing at an extremely
rapid pace, and it is estimated that its volume will grow at
40%-50% per year [1]. Most of this data explosion is due to
unstructured data. According to predictions from International
Data Corporation (IDC), 80% of the 133 exabytes of global
data growth in 2017 will be unstructured [2]. Managing such
an enormous amount of data is a challenging task.
Most of the data in existing systems is stored and accessed
using traditional file-based or block-based systems [3]. Unfortunately,
these traditional systems are becoming inefficient
as file-based access and hardware requirements limit their
scalability. Therefore, there is a need for a data access method
that is flexible and capable of horizontal scale-out [4].
Object storage overcomes limitations of file-based systems
by offering scalability and being software defined [5]. The
data is communicated as objects rather than files or blocks,
and additional metadata can be stored alongside the object
[3]. Object storage has a flat structure and spares higher-level
applications from manipulating data at the lowest
level. Therefore, object storage has the flexibility to scale out
horizontally. When a unique identifier, called a "key," is used
to access the object, or "value," this form of storage can be
called a key-value store (KV store).
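As a minimal illustration of the flat key-to-object model described above, the following Python sketch stores each value together with optional metadata under an opaque key. The class name FlatKVStore and its methods are hypothetical and chosen only for this example; they do not correspond to the Kinetic or LevelDB APIs.

```python
class FlatKVStore:
    """Toy in-memory key-value store (hypothetical, for illustration only).

    Keys are opaque byte strings in a single flat namespace: a key such as
    b"photos/2017/a.jpg" is just a label, not a path through directories.
    """

    def __init__(self):
        # key -> (value, metadata); no hierarchy is maintained
        self._objects = {}

    def put(self, key, value, metadata=None):
        """Store `value` under `key`, with optional metadata alongside it."""
        self._objects[key] = (value, dict(metadata or {}))

    def get(self, key):
        """Return (value, metadata) for `key`, or None if the key is absent."""
        return self._objects.get(key)

    def delete(self, key):
        """Remove `key` if present; return True if something was removed."""
        return self._objects.pop(key, None) is not None
```

In this sketch, looking up b"photos/2017" after storing b"photos/2017/a.jpg" returns nothing, because the store has no notion of a directory; only exact keys identify objects.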
Several key-value stores are deployed to support
large websites, including Dynamo at Amazon [6], Redis at
GitHub [7], and RocksDB at Facebook [8]. All these systems
store ordered <key, value> pairs. Even though key-value stores
address the problem of scaling and managing huge amounts
of data for the above systems, the existing key-value stores
also have several limitations. They run on top of multiple
layers of legacy software and hardware, such as POSIX and RAID
controllers, designed for file-based systems [5]. Also,
these huge systems consume significant amounts of power and
rack space [9].
To help overcome these hardware and software limitations,
Seagate has announced a new class of hard drives called
Kinetic drives [5]. These drives have a built-in processor that
runs a LevelDB-based key-value store directly on the drive
[10]. Rather than the typical SATA or SAS interface, Kinetic
drives communicate externally via TCP/IP over Ethernet. Each
drive acts as a tiny server in itself. An important function
of the drives is direct P2P (Peer-to-Peer) transfer that allows
direct data transfer from one drive to another via Ethernet
without the need to copy data through a storage controller or
other server [11]. By replacing hardware and software layers,
Kinetic drives can reduce the cost and complexity of a large-scale
object storage system.
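To make the benefit of P2P transfer concrete, the following Python sketch contrasts copying objects through a client with pushing them directly from one drive to another. It is a toy in-memory model under stated assumptions: ToyDrive, p2p_push, and copy_via_client are hypothetical names introduced here, not part of the Kinetic API, and no network traffic is actually generated. The sketch only counts how many value bytes would cross the client's link in each case.

```python
class ToyDrive:
    """Stand-in for a network-attached key-value drive (not the Kinetic API)."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> value, both bytes

    def put(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store[key]

    def p2p_push(self, peer, keys):
        """Copy the given keys straight to `peer`.

        Returns the number of value bytes sent drive-to-drive. None of
        these bytes pass through the client in this model.
        """
        sent = 0
        for key in keys:
            value = self.store[key]
            peer.put(key, value)
            sent += len(value)
        return sent


def copy_via_client(src, dst, keys):
    """Baseline: the client reads each value and writes it back out.

    Returns the number of value bytes crossing the client's link,
    counting each value once for the read and once for the write.
    """
    moved = 0
    for key in keys:
        value = src.get(key)   # drive -> client
        moved += len(value)
        dst.put(key, value)    # client -> drive
        moved += len(value)
    return moved
```

In this model, copying a 1 KB value through the client moves 2 KB over the client's link, whereas p2p_push moves the same 1 KB between the drives and nothing through the client. Real drives would also exchange request and response messages with the client, which the sketch ignores.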
In this work, we seek to better understand the key functionalities,
features, and performance of the drives. We use this information
to compare Kinetic drives with other LevelDB-based
servers and to derive insights about the possibility of replacing
traditional hard drives with Kinetic drives. Furthermore, the
specification sheets [12] do not provide a detailed analysis
of the throughput, latency, and other features, especially in
comparison to the other LevelDB-based servers.
We also study the ease of programmability of the Kinetic
drives and share our experiences. To the best of our knowledge,
no prior work evaluates Kinetic drives this
thoroughly. To understand several salient features including
P2P transfer, "Get," "Put," and others, we develop several tests
2017 IEEE 23rd International Conference on Parallel and Distributed Systems
978-1-5386-2129-5/17/$31.00 ©2017 IEEE
DOI 10.1109/ICPADS.2017.00072