Self-healing systems — survey and synthesis
Debanjan Ghosh ᵃ, Raj Sharman ᵇ,⁎, H. Raghav Rao ᵃ,ᵇ, Shambhu Upadhyaya ᵃ
ᵃ Department of CSE, SUNY, Buffalo, United States
ᵇ Department of Management Science and Systems, School of Management, SUNY, Buffalo, NY 14260, United States
Received 25 April 2005; received in revised form 9 March 2006; accepted 7 June 2006
Available online 17 August 2006
Abstract
As modern software-based systems and applications gain in versatility and functionality, the ability to manage inconsistent
resources and service disparate user requirements becomes increasingly imperative. Furthermore, as systems increase in complexity,
rectification of system faults and recovery from malicious attacks become more difficult, labor-intensive, expensive, and error-prone.
These factors have motivated research on the concept of self-healing systems. Self-healing systems attempt to “heal”
themselves in the sense of recovering from faults and regaining normative performance levels independently; the concept derives from
the manner in which a biological system heals a wound. Such systems employ models, whether external or internal, to monitor system
behavior and use the inputs thus obtained to adapt themselves to the run-time environment. Researchers have approached this
concept from several different angles; this paper surveys research in this field and proposes a strategy of synthesis and classification.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Software engineering designing; Software architecture; Fault tolerance; Self-healing; Decision support systems; Distributed systems;
Adaptive systems; Survivable systems
1. Introduction
An increasingly significant requisite for software-based systems is the ability to handle resource variability,
ever-changing user needs and system faults. Certain standard programming practices, such as building in extensive
error-handling capabilities through exception-catching schemes, do contribute towards rendering systems
fault-tolerant or self-adaptive;¹ however, these methods are tightly coupled with the software code and are highly
application-specific. Designs that enable software systems to heal themselves of system faults and to survive
malicious attacks would radically improve the reliability and consistency of technology in the field. The endeavor
to secure these benefits has given rise to the concept of self-healing systems. Self-healing can be defined as the
property that enables a system to perceive that it is not operating correctly and, without (or with) human
intervention, make the necessary adjustments to restore itself to normalcy. Healing systems that require human
intervention or intervention of an agent external to the system can be categorized as assisted-healing systems. The key
focus or contrasting idea as compared to dependable²
Decision Support Systems 42 (2007) 2164 – 2185
www.elsevier.com/locate/dss
⁎ Corresponding author. E-mail address: rsharman@buffalo.edu (R. Sharman).
¹ A system in which the programs will evolve to optimize an extrinsic fitness function imposed on their environment.
Typically these programs will only be estimated for their fitness, and those with the best health will be chosen to
survive and propagate. For more on adaptive systems refer to [22].
² According to Avizienis et al. [3], dependability is an integrating concept that encompasses availability (readiness
for correct service), reliability (continuity of correct service), safety (absence of catastrophic consequences on the
users and the environment), integrity (absence of improper system alterations) and maintainability (ability to undergo
modifications and repairs).
0167-9236/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.dss.2006.06.011