Self-healing systems: survey and synthesis

Debanjan Ghosh a, Raj Sharman b,*, H. Raghav Rao a,b, Shambhu Upadhyaya a

a Department of CSE, SUNY, Buffalo, United States
b Department of Management Science and Systems, School of Management, SUNY, Buffalo, NY 14260, United States

Received 25 April 2005; received in revised form 9 March 2006; accepted 7 June 2006
Available online 17 August 2006

Abstract

As modern software-based systems and applications gain in versatility and functionality, the ability to manage inconsistent resources and service disparate user requirements becomes increasingly imperative. Furthermore, as systems increase in complexity, rectification of system faults and recovery from malicious attacks become more difficult, labor-intensive, expensive, and error-prone. These factors have motivated research dealing with the concept of self-healing systems. Self-healing systems attempt to "heal" themselves in the sense of recovering from faults and regaining normative performance levels independently; the concept derives from the manner in which a biological system heals a wound. Such systems employ models, whether external or internal, to monitor system behavior and use the inputs obtained from that monitoring to adapt themselves to the run-time environment. Researchers have approached this concept from several different angles; this paper surveys research in this field and proposes a strategy of synthesis and classification.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Software engineering designing; Software architecture; Fault tolerance; Self-healing; Decision support systems; Distributed systems; Adaptive systems; Survivable systems

1. Introduction

An increasingly significant requirement for software-based systems is the ability to handle resource variability, ever-changing user needs and system faults.
Certain standard programming practices, such as providing extensive error-handling capabilities through exception-catching schemes, do contribute towards rendering systems fault-tolerant or self-adaptive;¹ however, these methods are tightly coupled with the software code and are highly application-specific. Designs that enable software systems to heal themselves of system faults and to survive malicious attacks would radically improve the reliability and consistency of technology in the field. The endeavor to secure these benefits has given rise to the concept of self-healing systems. Self-healing can be defined as the property that enables a system to perceive that it is not operating correctly and, without (or with) human intervention, make the necessary adjustments to restore itself to normalcy. Healing systems that require human intervention or intervention of an agent external to the system can be categorized as assisted-healing systems.

Decision Support Systems 42 (2007) 2164–2185
www.elsevier.com/locate/dss

* Corresponding author. E-mail address: rsharman@buffalo.edu (R. Sharman).

¹ A system in which the programs will evolve to optimize an extrinsic fitness function imposed on their environment. Typically these programs will only be estimated for their fitness, and those with the best health will be chosen to survive and propagate. For more on adaptive systems refer to [22].

² According to Avizienis et al. [3], dependability is an integrating concept that encompasses availability (readiness for correct service), reliability (continuity of correct service), safety (absence of catastrophic consequences on the users and the environment), integrity (absence of improper system alterations) and maintainability (ability to undergo modifications and repairs).

0167-9236/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.dss.2006.06.011

The key focus or contrasting idea as compared to dependable²