186 IEEE SYSTEMS JOURNAL, VOL. 4, NO. 2, JUNE 2010

Hazard Analysis and Validation Metrics Framework for System of Systems Software Safety

James Bret Michael, Senior Member, IEEE, Man-Tak Shing, Senior Member, IEEE, Kristian John Cruickshank, and Patrick James Redmond

Abstract—Safety-critical software-intensive systems of systems require rigorous verification and validation to ensure that they function as per requirements. Unlike verification, validation is typically an ill-defined activity for software development. This paper presents a well-defined validation metrics framework which uses hazard analysis, and the derived software requirements for mitigating the identified hazards, as proxies in gauging the sufficiency of the software safety requirements early in the software development process. Moreover, traditional hazard analysis techniques are insufficient to deal with the complexity and size of systems of systems. This paper examines the nature and types of hazards associated with systems of systems and presents a new technique for analyzing one type of emergent hazard known as an interface hazard.

Index Terms—Goal question metric, goal structuring notation, hazard analysis, interface hazard, safety, software, system of systems, validation metrics.

I. INTRODUCTION

SOFTWARE has a growing, even predominant, role in automating the decisions taken in safety-critical systems, including large-scale weapon systems (e.g., command of missile launchers via an engage-on-remote capability). Current and future generations of military capabilities require dependable systems of systems (SoSs) that integrate multiple software applications along with many physical systems. All safety hazards of a SoS must be addressed before the final system is deployed. One approach to risk management is Hazard Analysis (HA).
HA techniques for SoS must address both the size and complexity of a SoS, and include defined practices that allow an engineer to keep pace with the evolution of a SoS over its lifecycle. These practices must cover the full scope of a SoS; they support both design and analysis to clarify what risks and hazards in a SoS may have been overlooked. HA answers the question for a SoS, “What are the known and possible residual hazards in this system?”

Manuscript received August 26, 2009; revised April 05, 2010. Date of publication May 24, 2010; date of current version June 03, 2010. This work was supported in part by a grant from the National Aeronautics and Space Administration. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright annotations thereon.
J. B. Michael and M.-T. Shing are with the Department of Computer Science, Naval Postgraduate School, Monterey, CA 93943 USA (e-mail: bmichael@nps.edu; shing@nps.edu).
K. J. Cruickshank and P. J. Redmond are with the Royal Australian Air Force, RAAF Base Williams, Laverton Vic., 3027 Australia (e-mail: kristian.cruickshank@defence.gov.au; patrick.redmond@defence.gov.au).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSYST.2010.2050159

Hazard Analysis, as applied in this paper, also plays an important role in the practice of Software System Safety; this practice depends on a set of management and engineering activities from the System Safety and Software Engineering domains. The intent is to identify, analyze, design, and track software mitigation and control of hazards and hazardous functions [1].
Effective analysis of the safety of software requires the engineer to work within the system context in which the software is executing. Therefore, the Software System Safety Engineering process typically starts with the System Safety Engineering activities to identify potential hazards at the System Engineering level. In this case, potential hazards and safety-critical functions are traced through high-level design and architecture, and the process ends with validation and verification (V&V) of the software safety features required for controlling the hazard causal factors. (Readers can refer to [2] for definitions of error, failure, mishap, hazard, hazard causal factor, and risk.)

In [3], Weaver identified two important considerations in constructing a safety case: the sufficiency of hazard identification, and the adequacy of hazard analysis in identifying software’s role in causing these hazards. He also advocated the use of traceability from the derived software safety requirements to the identified system hazards as an indication of the completeness of the set of software safety requirements.

Measuring the effects of software on system safety is a relatively unexplored aspect of software engineering. In [4], Basili et al. presented an approach for developing software safety measures to gain early insights into potential safety problems and risks. While the metrics suggested in [4] cannot provide proof or validation of the software safety requirements, they can be used to identify potential weaknesses in the software system safety process and an associated likelihood that the system will not be safe.

This paper addresses the need for management to assess the adequacy of the software safety engineering process.
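The idea of using traceability from software safety requirements to identified hazards as an indication of completeness can be illustrated with a minimal sketch. It is not the metrics framework of this paper: the function name `traceability_report`, the hazard identifiers (`H1`, ...), the requirement identifiers (`SSR-1`, ...), and the dictionary layout are all hypothetical, chosen only to show how a simple coverage ratio and its gaps could be computed from trace links.

```python
def traceability_report(hazard_ids, requirement_links):
    """Summarize requirement-to-hazard traceability (illustrative only).

    hazard_ids: iterable of identified hazard IDs (hypothetical naming).
    requirement_links: dict mapping a requirement ID to the hazard IDs
        that the requirement is claimed to mitigate.
    """
    hazards = set(hazard_ids)
    links = {req: set(targets) for req, targets in requirement_links.items()}

    # Every hazard ID that at least one requirement traces to.
    traced = set().union(*links.values())

    # Identified hazards with no mitigating requirement.
    untraced = sorted(hazards - traced)

    # Requirements that trace to no identified hazard.
    orphans = sorted(req for req, targets in links.items()
                     if not (targets & hazards))

    # Fraction of identified hazards with at least one trace link;
    # left undefined (None) when no hazards have been identified.
    coverage = (len(hazards) - len(untraced)) / len(hazards) if hazards else None

    return {
        "coverage": coverage,
        "untraced_hazards": untraced,
        "orphan_requirements": orphans,
    }


# Hypothetical data: four hazards, three derived requirements.
report = traceability_report(
    ["H1", "H2", "H3", "H4"],
    {"SSR-1": ["H1"], "SSR-2": ["H1", "H3"], "SSR-3": []},
)
print(report)
```

For this made-up data the coverage is 0.5, hazards H2 and H4 have no mitigating requirement, and SSR-3 traces to no hazard. A coverage of 1.0 would show only that every identified hazard has a trace link, not that the requirements are sufficient or that hazard identification was complete.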
The paper focuses on the contribution of software toward the safety afforded by SoSs and builds upon Weaver’s and Basili’s work by introducing a means for measuring the sufficiency of software safety requirements with a set of metrics derived from the hazard identification, hazard analysis, and requirements traceability artifacts. We present a Validation Metrics Framework for management to gauge the sufficiency of the software safety requirements early in the software development process. The framework describes a validation process in which the software safety engineering team acts as an advocate for safety on behalf of the stakeholders (Fig. 1). It complements the traditional methods of