IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 37, NO. 2, MARCH 2007 249
Automating Scenario Analysis of Human
and System Reliability
Alistair G. Sutcliffe, Member, IEEE, and Andreas Gregoriades
Abstract—The system reliability analyzer tool for analyzing the
reliability of system designs is described and its use illustrated
in a system engineering case study of a naval command and
control system. The performance of systems consisting of human
operators and technology components is assessed by Bayesian
nets, which calculate error probabilities from inputs of agent
properties and environmental conditions. The tool tests scenarios
representing the system design and its operational behavior, which
is modeled as cycles of command and control tasks. The tool
indicates weak points in the scenario sequence and assesses the
reliability of one or more system designs with a set of operational
scenarios and a variety of environmental conditions.
Index Terms—Human factors, system reliability, system re-
quirements and specifications.
I. INTRODUCTION

PREVIOUS approaches to assessing human reliability have
employed fault/event trees to diagnose potential failure
points in system operation (e.g., technique for human error
rate prediction (THERP) [44]). Performance-shaping factors
[43] have been used to estimate the probable effect of hu-
man variables such as operator fatigue on system failure [45].
However, several authors have called for a more systematic
approach to estimating human error based on sound models of
psychology [12], [33], [46]. Taxonomies of errors such as slips
and mistakes [33], phenotypes and genotypes [13], and types
of slips and lapses [27] provide a more principled approach,
which has been used to calculate the influence of factors such
as task load, stress, and training on the probability of slip and
mistake errors in system designs with differing complexities
[45]. However, a large number of events can potentially cause
failures, and event tree analysis methods can only address a
small number of causes that are described a priori in the
event/fault tree. Furthermore, hierarchical models hinder the
analysis of multiple interactions between events and system
states that frequently lead to accidents. As Reason [33], [34]
notes, failure has diverse and multiple potential causes arising
from the social/organizational environment, poor maintenance
of equipment, adverse operating environments, human error,
and poor design. To improve design and prevent, or at least
reduce, the potential for system failure, analysis methods that
estimate the probable influence of multiple factors on system
failure are required.

Manuscript received November 12, 2004; revised June 3, 2005, August 16,
2005, and September 27, 2005. This work was supported in part by EPSRC
Systems Integration in Major Projects (SIMP). This paper was recommended
by Associate Editor J. K. Kuchar.

A. G. Sutcliffe is with the School of Informatics, University of Manchester,
M60 1QD Manchester, U.K. (e-mail: a.g.sutcliffe@man.ac.uk).

A. Gregoriades is with the Surrey Defence Technology Centre, The School
of Management, University of Surrey, GU2 7XH Surrey, U.K. (e-mail:
a.gregoriades@surrey.ac.uk).

Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSMCA.2006.886375
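The idea behind performance-shaping factors in THERP-style analyses, i.e., scaling a nominal human error probability (HEP) by multipliers for conditions such as fatigue or task load, can be sketched as follows. The factor names, weights, and base rate here are illustrative assumptions, not values prescribed by the technique.

```python
# Illustrative THERP-style adjustment of a nominal human error
# probability (HEP) by performance-shaping factors (PSFs).
# All names and numbers are hypothetical examples.

NOMINAL_HEP = 0.003  # assumed base error rate for the task

# Each active PSF multiplies the nominal rate; weights > 1 degrade
# performance, weights < 1 improve it.
PSF_WEIGHTS = {
    "operator_fatigue": 2.0,
    "high_task_load": 3.0,
    "good_training": 0.5,
}

def adjusted_hep(active_factors):
    """Scale the nominal HEP by the product of active PSF weights,
    capping the result at 1.0 so it remains a probability."""
    hep = NOMINAL_HEP
    for factor in active_factors:
        hep *= PSF_WEIGHTS[factor]
    return min(hep, 1.0)

print(round(adjusted_hep(["operator_fatigue", "high_task_load"]), 6))  # 0.018
```

The multiplicative form captures the intuition that adverse conditions compound, which is precisely why methods restricted to a few a priori causes underestimate the joint effect of many factors.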
To address this problem, probabilistic models that combine
environmental and psychological influences on human errors
have been developed [16], [45]. However, these models cannot
account for different types of initiating event that may lead to
system failure. While it is impossible to anticipate all possible
hazardous events that a system may encounter, there has been
increasing interest in using scenarios as test probes for safety
analysis [1], [10]. Scenarios are narratives that describe usage
or operation of a system either drawn from experience of acci-
dents or imagined future situations for system operation. They
usually contain an event sequence with contextual information
that allows the analyst to interpret the likelihood of a design
being successful or failing. Scenarios can be used as test data
to challenge designs and their implicit assumptions by positing
obstacles that might prevent safe system operation from
being achieved and hence refine system requirements to create
defenses against failure [30], [31], [40]. In a previous work, we
created checklists that probed for potential causes of failure, de-
veloped from Hollnagel’s taxonomy [13] and Reason’s theory
of human error [33], which enabled scenarios to be “walked
through” to evaluate the likelihood of failure in event sequences
[9]. This approach was partially automated by using a pathway
expansion algorithm that detected branch points in an event
sequence and then provided test questions to probe alternative
paths [42]. Unfortunately, this produced too many alternatives
and questions, making the analysis too time-consuming.
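The pathway-expansion idea can be illustrated with a small sketch that enumerates alternative paths through an event sequence and poses a test question for each non-nominal outcome at a branch point. The event names and function are hypothetical illustrations, not the tool's actual algorithm [42].

```python
# Hypothetical sketch of pathway expansion over an event sequence:
# each step maps to its possible outcomes, and more than one outcome
# marks a branch point. All names are illustrative.

from itertools import product

sequence = [
    ("detect_contact", ["radar_detects", "radar_misses"]),
    ("classify_contact", ["correct_class", "misclassified"]),
    ("issue_order", ["order_sent"]),
]

def expand_paths(seq):
    """Enumerate every alternative path through the event sequence."""
    return [list(zip((name for name, _ in seq), combo))
            for combo in product(*(outcomes for _, outcomes in seq))]

paths = expand_paths(sequence)
print(len(paths))  # 4 alternative paths (2 x 2 x 1)

# One probe question per non-nominal alternative at each branch point.
for step, outcomes in sequence:
    for alt in outcomes[1:]:
        print(f"What if '{step}' results in '{alt}'?")
```

Because the number of paths grows multiplicatively with the branch points, even modest sequences generate many alternatives and questions, which is the scalability problem noted above.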
Bayesian nets (BNs) have been applied to reason about
reliabilities based on the properties of products and devel-
opment processes in several domains ranging from military
vehicles to software [2], [6], [8], [26]. However, to date, BNs
have only been applied to assessing the reliability of designs
based on component properties. No account has been taken of
the operational events the system has to respond to. In this
paper, we describe a software tool that takes the automation
of reliability analysis one step further by reasoning not only
about properties of the design but also how the design interacts
with the environment and the events it has to respond to in
one or more operational scenarios. This paper is organized
into three sections. First, the system reliability analyzer (SRA)
architecture and the BNs are briefly described. This is followed
by a case study in which the tool is applied. Finally, this paper
describes the lessons learned from the validation experiences to
date and discusses future development of our approach.
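As a rough illustration of the kind of calculation a BN performs in this setting, the following sketch computes a marginal operator-error probability from two parent variables by enumerating their states. The network structure and all probability values are assumptions for illustration, not those of the SRA tool.

```python
# Minimal Bayesian-net fragment in plain Python: two parent nodes
# (operator fatigue, task load) feed an operator-error node.
# All probabilities are illustrative assumptions.

P_FATIGUED = 0.2   # P(fatigue = high)
P_HIGH_LOAD = 0.3  # P(task_load = high)

# Conditional probability table: P(error | fatigue, task_load)
CPT_ERROR = {
    (True,  True):  0.30,
    (True,  False): 0.10,
    (False, True):  0.08,
    (False, False): 0.01,
}

def p_error():
    """Marginal P(error), obtained by enumerating the parent states."""
    total = 0.0
    for fatigued in (True, False):
        for high_load in (True, False):
            p_parents = ((P_FATIGUED if fatigued else 1 - P_FATIGUED) *
                         (P_HIGH_LOAD if high_load else 1 - P_HIGH_LOAD))
            total += p_parents * CPT_ERROR[(fatigued, high_load)]
    return total

print(round(p_error(), 4))  # 0.0568
```

In a full net of the kind described above, the parent probabilities would themselves be conditioned on agent properties and scenario events rather than fixed, but the enumeration principle is the same.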
1083-4427/$25.00 © 2007 IEEE