Annals of Software Engineering 4 (1997) 11–29

Modeling reliability growth during non-representative testing *

Brian Mitchell and Steven J. Zeil

Old Dominion University, Department of Computer Science, Norfolk, VA 23529-0162, USA

A reliability growth model is presented that permits prediction of operational reliability without requiring that testing be conducted according to the operational profile of the program input space. Compared to prior growth models, this one shifts the observed random variable from interfailure time to a post-mortem analysis of the debugged faults, using order statistics to combine the observed failure rates of faults no matter how those faults were detected. The primary advantages of this model are:

• the flexibility it offers to test planners, as the choice of testing method is no longer solely determined by the desire to predict operational reliability, and
• the more robust experimental designs that can be formulated by taking advantage of a wider variety of options for data collection.

1. Introduction

Directed testing methods are those methods that seek to manipulate the choice of test inputs so as to increase the probability and/or rate of fault detection. These include most well-known testing methods, such as functional and structural testing and data flow coverage, as well as less widely used methods such as mutation or domain testing [Adrion et al. 1982; White 1987].

Directed testing methods have been criticized for a lack of quantifiable results. Once a prescribed level of coverage has been achieved, what conclusions can actually be drawn about the quality of the tested software? Granted, some of these methods permit one to demonstrate the absence of either finite [DeMillo et al. 1978] or infinite [Zeil 1989] sets of “potential” faults. Still, an infinite number of potential faults will remain possible, so an a priori statement of the reliability to be gained from any directed test method appears unlikely.
* This work was supported by grant NAG-1-439 from the NASA Langley Research Center and grant CCR-9312386 from the National Science Foundation.

J.C. Baltzer AG, Science Publishers

It is partly for this reason that much attention has been given to representative testing, in which test data is chosen to reflect the operational distribution of the program’s inputs. A variety of reliability growth models have been based upon representative testing [Littlewood 1980; Musa et al. 1987], which provide the desired