Reliability Analysis of An Energy-Aware RAID System
Shu Yin, Yun Tian, Jiong Xie, and Xiao Qin∗
Department of Computer Science and Software Engineering
Auburn University, Auburn, AL 36849
Email: {szy0004, tianyun, jzx0009, and xqin}@auburn.edu
Mohammed Alghamdi
Department of Computer Science
Al-Baha University, Al-Baha City, Kingdom of Saudi Arabia
Email: mialmushilah@bu.edu.sa
Xiaojun Ruan
Department of Computer Science
West Chester University of Pennsylvania, West Chester, PA 19383
Email: xruan@wcupa.edu
Meikang Qiu
Department of Electrical and Computer Engineering
University of Kentucky, Lexington, Kentucky, 40506
Email: mqiu@engr.uky.edu
Abstract—We develop a mathematical model, MREED, to quantitatively
evaluate the failure rate of energy-efficient parallel storage systems.
The Power-Aware Redundant Array of Inexpensive Disk (PARAID) aims
to reduce energy use of commodity server-class disks without specialized
hardware. PARAID uses a skewed striping pattern to adapt
to the system load by changing the number of powered disks. By
spinning down disks during light workloads, PARAID can reduce power
consumption, while still meeting performance demands. We show that
MREED can be used to estimate the reliability of a five-disk PARAID-0 system. We
validate the accuracy of MREED using the DiskSim simulator. Our
approach shows that MREED can rely on file access patterns to estimate
system utilization correctly. Furthermore, even though PARAID may
achieve reasonable reliability, our model shows that PARAID’s reliability
is affected by data locality.
Keywords-Parallel storage system, RAID, energy-efficient, reliability
I. INTRODUCTION
Existing reliability models for conventional parallel and distributed
disk systems do not consider energy-saving issues or data-striping
mechanisms. In this paper, we first study the reliability of a parallel
disk system equipped with the PARAID [1] technique by employing
the Mathematical Reliability model for Energy-Efficient RAID
system called MREED. As a mathematical model, MREED shows its
advantage of presenting the reliability trend of energy-aware storage
systems. However, it is challenging to validate the MREED model.
To address the correctness issue of MREED, we validate the access-
rate-utilization model, which converts file access rate to utilization of
the storage system, in MREED. Finally, we study the impacts of the I/O
load skewing technique, gear shifting, on the reliability of PARAID,
a well-known energy-aware data-striping storage system.
Existing energy conservation techniques can yield significant en-
ergy savings in disks. While several energy conservation schemes
like cache-based energy-saving approaches normally have marginal
impact on disk reliability, many energy-saving schemes (e.g., dynamic
power management and workload skew techniques) inevitably have
noticeable adverse impacts on the reliability of storage systems [2][3].
For example, dynamic power management (DPM) techniques save energy by
using frequent disk spin-downs and spin-ups, which in turn can
shorten disk lifetime [4][5][6]. Other energy-saving approaches include
redundancy techniques [7][8][9][10], workload skew [11][12][13], and
multi-speed settings [14][15]. In this paper, we focus on the reliability
of RAID systems. Existing energy conservation techniques cannot be
readily applied to RAID systems for the following reasons:
∙ Conventional RAIDs balance I/O load across all disks in the
array to maximize disk parallelism and performance, meaning
that all disks are spinning even under a light load. No
opportunity is offered to spin down any of the disks;
∙ Server-class disks are not designed for frequent power cycles,
which significantly reduce life expectancy;
∙ Server systems cannot rely on caching and dynamic power
management because the servers are too busy to have long idle
time.
In this paper, our contributions are summarized as follows:
1) We propose a reliability model MREED for Power-Aware
RAID (i.e., an energy-aware data-striping parallel storage
system);
2) We introduce Weibull distribution analysis to MREED. Using
the utilization of a storage system as an input, we can estimate
and forecast the annual failure rate (AFR) of this system;
3) We validate the access-rate-utilization model of MREED;
4) We study the impacts of the gear-shifting schemes on the
reliability of PARAID.
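As a hedged illustration of the second contribution, the sketch below shows how a two-parameter Weibull lifetime distribution yields a one-year failure probability, which is one common reading of an annual failure rate. The function name and the shape and scale values are hypothetical placeholders chosen for illustration; they are not the parameters fitted in MREED.

```python
import math


def weibull_failure_probability(t_years, shape, scale_years):
    """Probability that a disk fails within t_years, assuming its lifetime
    follows a two-parameter Weibull distribution:
        F(t) = 1 - exp(-(t / scale) ** shape)
    """
    if t_years < 0 or shape <= 0 or scale_years <= 0:
        raise ValueError("t_years must be >= 0; shape and scale must be > 0")
    return 1.0 - math.exp(-((t_years / scale_years) ** shape))


# Hypothetical parameters, for illustration only (not taken from the paper).
first_year_afr = weibull_failure_probability(1.0, shape=1.2, scale_years=30.0)
print(f"first-year failure probability: {first_year_afr:.4f}")
```

A shape value above 1 models a failure rate that grows with age, while a shape value below 1 models early-life failures that become less likely over time.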
We study the impacts of the I/O load skewing technique, especially on
PARAID-0, which is an energy-aware RAID-0 system. Experimental
results show that gear shifting affects the reliability of parallel disks
for two reasons. First, disks working at all gears tend to have higher
I/O utilization than disks that work only at high gears. Second, disks
with high utilization are likely to have a high risk of breaking down.
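The first reason can be made concrete with a toy calculation. The sketch below is not the access-rate-utilization model of MREED. It assumes, purely for illustration, that each gear is described by the fraction of time spent in it, the number of powered disks, and an aggregate load measured in units of one disk's capacity, and that the load is striped evenly over the powered disks. The function name and the gear values are hypothetical.

```python
def per_disk_utilization(gears):
    """Time-averaged utilization of each disk under a toy gear model.

    gears: list of (time_fraction, active_disks, array_load) tuples.
    array_load is the aggregate demand while in that gear, in units of one
    disk's full capacity (an assumption of this sketch). A gear with k
    active disks is assumed to power disks 0..k-1 and to stripe its load
    evenly over them.
    """
    if any(active <= 0 for _, active, _ in gears):
        raise ValueError("each gear needs at least one active disk")
    n_disks = max(active for _, active, _ in gears)
    utilization = [0.0] * n_disks
    for time_fraction, active, array_load in gears:
        share = time_fraction * array_load / active
        for disk in range(active):
            utilization[disk] += share
    return utilization


# Hypothetical two-gear schedule for a five-disk array:
# half the time in a low gear (2 disks), half in the high gear (5 disks).
example = per_disk_utilization([(0.5, 2, 0.8), (0.5, 5, 3.0)])
print([round(u, 3) for u in example])
```

With these made-up numbers, the two disks that serve both gears end up at about 0.5 utilization, while the three disks powered only in the high gear stay at about 0.3.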
The remainder of this paper is organized as follows. Section II
presents the overview of the MREED model. In Section III, we
apply the MREED model to quantitatively estimate the reliability of
PARAID. Section IV demonstrates a solution to validate the
access-rate-utilization model in MREED. Section V presents experimental
results and performance evaluation. In Section VI, the related work is
discussed. Finally, Section VII concludes the paper with discussions.
II. THE MREED MODELING FRAMEWORK
A. Overview
MREED is a framework developed to model reliability of paral-
lel disk systems employing energy conservation techniques. In the
MREED framework, we evaluate the reliability impacts of a specific
energy-saving technique, the Power-Aware RAID. One critical
module in MREED models the impact of energy-efficient schemes
on the utilization and power-state transition frequency of each disk
in a parallel disk system. Another important module developed in
MREED calculates the annual failure rate of each disk as a
function of the disk’s utilization and power-state transition frequency.
Given the annual failure rate of each disk in the parallel disk system,
MREED is able to derive the reliability of an energy-efficient parallel
disk system. As such, we used MREED to study the reliability of a
parallel disk system equipped with the PARAID technique.
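The last step of this pipeline, going from per-disk annual failure rates to a system-level figure, can be sketched for the RAID-0 case. The sketch assumes that disk failures are independent, that each AFR can be treated as a one-year failure probability, and that a RAID-0 array is lost as soon as any single disk fails. It illustrates the idea only and is not necessarily how MREED aggregates disk failure rates. The function name and input values are hypothetical.

```python
def raid0_annual_failure_probability(disk_afrs):
    """One-year failure probability of a RAID-0 array.

    Assumes independent disk failures and no redundancy, so the array
    survives the year only if every disk survives:
        P(array fails) = 1 - prod(1 - afr_i)
    """
    survive_all = 1.0
    for afr in disk_afrs:
        if not 0.0 <= afr <= 1.0:
            raise ValueError("each AFR must be a probability in [0, 1]")
        survive_all *= 1.0 - afr
    return 1.0 - survive_all


# Hypothetical per-disk AFRs for a five-disk array in which the two disks
# active in every gear are assumed to have a higher failure rate.
system_afr = raid0_annual_failure_probability([0.04, 0.04, 0.02, 0.02, 0.02])
print(f"array-level annual failure probability: {system_afr:.4f}")
```

Because the per-disk survival probabilities multiply, the array-level figure is always at least as large as the worst single-disk AFR.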
978-1-4673-0012-4/11/$26.00 ©2011 IEEE