Performance Evaluation 70 (2013) 197–211
Dynamic software rejuvenation policies in a transaction-based system under Markovian arrival processes
Hiroyuki Okamura∗, Tadashi Dohi
Department of Information Engineering, Graduate School of Engineering, Hiroshima University, 1–4–1 Kagamiyama, Higashi-Hiroshima 739–8527, Japan
article info
Article history:
Available online 31 August 2012
Keywords:
Software aging
Software rejuvenation
Long-run average reward
Power efficiency
Markov decision process
Optimality of rejuvenation policy
abstract
This paper presents a Markov decision process (MDP) formulation for a transaction-based
system with software aging and rejuvenation. In our formulation, the arrival process of
transactions is described as a Markovian arrival process (MAP). In addition, we introduce
a probabilistically degrading processing rate to model the software aging. Furthermore,
the paper focuses on two performance criteria to determine the optimal rejuvenation
strategy: the long-run average reward and the power efficiency. Under these performance
criteria, we formulate the optimality equations of MDPs for the maximization of the
long-run average reward and power efficiency. Numerical experiments, comprising sensitivity
and statistical analyses with real traffic and aging data, show that the optimal rejuvenation
policy has a monotone property and can be characterized as a threshold policy on the number
of transactions.
© 2012 Elsevier B.V. All rights reserved.
1. Introduction
The concept of software aging and rejuvenation has spread widely in system design as a low-cost fault-tolerance
technique. Software aging is caused by aging-related bugs [1]. Such bugs cause performance degradation or a sudden
hang/crash of the system, which is called the software aging phenomenon. Typical examples of software aging are memory
leaks and round-off errors. They lead to the exhaustion of system resources and accumulation of errors.
In general, software aging can be predicted by monitoring elapsed time, workload, or other system attributes. Based on
the monitored attributes, a proactive action such as a system reboot can be taken to prevent the performance degradation
and system failure caused by software aging. Such proactive actions are called software rejuvenation. Garbage collection,
flushing operating system kernel tables, reinitializing internal data structures, and hardware reboot are examples of software
rejuvenation [2,3].
Software rejuvenation has been recognized as an important technique for software applications that execute
continuously for long periods of time. One of the significant issues in software rejuvenation is how to determine when
to rejuvenate, because rejuvenation incurs overhead time. There are two main approaches to determining the time for
software rejuvenation: model-based and measurement-based approaches.
Huang et al. [4] provided a seminal work on the model-based approach for the software aging and rejuvenation process
in a real telecommunication billing application. Their model was based on a continuous-time Markov chain (CTMC) with
four states, and focused on the steady-state system availability and the expected operation cost per unit time in steady state.
Since Huang et al.’s work, many authors have discussed software rejuvenation policies from the viewpoint of model-based
analysis [5–11].
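To make the model-based idea concrete, the following is a minimal sketch of how steady-state availability can be computed for a four-state CTMC. The state layout (robust, failure-probable, failed, rejuvenating), the transition structure, the function names, and all rate values are assumptions chosen for illustration; they are not taken from Huang et al. [4] or from this paper.

```python
def steady_state(Q):
    """Solve pi Q = 0 with sum(pi) = 1 for a small CTMC generator Q."""
    n = len(Q)
    # Transpose Q to get the balance equations, then replace the last
    # (redundant) equation with the normalization constraint.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            f = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= f * A[col][k]
            b[row] -= f * b[col]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x


def four_state_generator(aging, failure, repair, rejuv_trigger, rejuv_done):
    """Hypothetical generator matrix; all rates are events per unit time.

    State order: 0 robust, 1 failure-probable, 2 failed, 3 rejuvenating.
    """
    return [
        [-aging, aging, 0.0, 0.0],
        [0.0, -(failure + rejuv_trigger), failure, rejuv_trigger],
        [repair, 0.0, -repair, 0.0],
        [rejuv_done, 0.0, 0.0, -rejuv_done],
    ]


def availability(aging, failure, repair, rejuv_trigger, rejuv_done):
    """Steady-state probability of being in an up state (0 or 1)."""
    Q = four_state_generator(aging, failure, repair, rejuv_trigger, rejuv_done)
    pi = steady_state(Q)
    return pi[0] + pi[1]


if __name__ == "__main__":
    # Illustrative rates only: compare two rejuvenation trigger rates.
    for trigger in (0.01, 0.3):
        a = availability(aging=0.5, failure=0.1, repair=2.0,
                         rejuv_trigger=trigger, rejuv_done=4.0)
        print(f"rejuv_trigger={trigger}: availability={a:.4f}")
```

Sweeping `rejuv_trigger` in such a sketch shows the trade-off that model-based analysis studies: rejuvenating more often reduces time spent in the failed state but adds rejuvenation downtime.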
On the other hand, Garg et al. [12] tried to characterize and predict software aging by system attributes that can be
observed in real systems. Their approach is called measurement-based analysis. Vaidyanathan et al. [13], Alonso et al. [14],
∗ Corresponding author. Tel.: +81 82 424 7697; fax: +81 82 422 7025.
E-mail address: okamu@rel.hiroshima-u.ac.jp (H. Okamura).
doi:10.1016/j.peva.2012.07.004