Testing for Imperfect Debugging in Software Reliability

ERIC SLUD
Mathematics Department, University of Maryland

ABSTRACT. This paper continues the study of the software reliability model of Fakhre-Zakeri & Slud (1995), an "exponential order statistic model" in the sense of Miller (1986) with general mixing distribution, imperfect debugging and large-sample asymptotics reflecting increase of the initial number of bugs with software size. The parameters of the model are θ (proportional to the initial number of bugs in the software), G(·, μ) (the mixing df, with finite dimensional unknown parameter μ, for the rates λ_i with which the bugs in the software cause observable system failures), and p (the probability with which a detected bug is instantaneously replaced with another bug instead of being removed). Maximum likelihood estimation theory for (θ, p, μ) is applied to construct a likelihood-based score test for large sample data of the hypothesis of "perfect debugging" (p = 0) vs "imperfect" (p > 0) within the models studied. There are important models (including the Jelinski–Moranda) under which the score statistics with 1/√n normalization are asymptotically degenerate. These statistics, illustrated on a software reliability data set of Musa (1980), can serve nevertheless as important diagnostics for inadequacy of simple models.

Key words: consistent asymptotically normal estimator, counting process likelihood, exponential order statistic model, failure intensity, identifiability, imperfect debugging, likelihood ratio, mixture model, score statistic

1. Introduction

Many parametric models have been proposed over the last 15 years for software reliability growth in time to failure data during the testing stage of software development. There are excellent surveys by Littlewood (1980), Shantikumar (1983), Langberg & Singpurwalla (1985), Musa et al. (1987) and Xie (1991).
Fakhre-Zakeri & Slud (1995) studied a class of models (the "exponential order statistic models" of Miller (1986) with general mixing distribution and imperfect debugging) with respect to simultaneous identifiability of the mixing distribution function G, the parameter p indicating the probability of re-introducing a fault at the time a bug is fixed, and the initial number n_0 of bugs. It was found there (a) that these three parameters generally are simultaneously identifiable in the sense of being uniquely determined by the means and variances of the observed counts (N(t), 0 < t < δ) of failures up to CPU time on test t, but (b) that in many cases of finite dimensional parametric G = G(·, μ), such as the Γ(α, β) form, the parameters (n_0, p, μ) are uniquely determined from the expected counts (EN(t), 0 < t < δ). In the present paper we show that in many of the cases (b), inference for the parameters can be based on counting process likelihoods resembling those of van Pul (1992). Although the joint maximum likelihood estimators cannot be given in closed form, we describe an explicit score-based hypothesis test for p = 0 when G takes the Γ form or has two-point support.

We formulate the models and fix notation in section 2, find the likelihood and give formulas for the score statistics in section 3, establish asymptotic properties of estimators in section 4 and illustrate the tests in section 5. We develop in section 5 a new mixed Jelinski–Moranda and Poisson model to fit data of Musa (1980) and show how the score statistics indicate lack of fit of other models. Proofs and technical lemmas are deferred to section 6.

© Board of the Foundation of the Scandinavian Journal of Statistics 1997. Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA. Vol 24: 555–572, 1997
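To make the roles of n_0, p and G concrete, the following is a minimal simulation sketch of an exponential order statistic model with imperfect debugging. It is an illustration under stated assumptions, not the construction used in the paper: the function name `simulate_failure_times` and its arguments are hypothetical, each bug is assumed to cause failures at its own exponential rate λ_i drawn independently from G, and a bug that is replaced rather than removed is assumed to receive a fresh rate drawn from the same G.

```python
import random

def simulate_failure_times(n0, p, draw_rate, t_max, rng):
    """Simulate failure times on [0, t_max] for n0 initial bugs.

    draw_rate(rng) returns one positive failure rate (a draw from the mixing df G).
    With probability p a detected bug is replaced by another bug (assumption:
    the replacement gets a fresh rate from G); otherwise it is removed.
    """
    rates = [draw_rate(rng) for _ in range(n0)]
    t, times = 0.0, []
    while rates:
        # Time to the next failure is exponential with the summed rate.
        t += rng.expovariate(sum(rates))
        if t > t_max:
            break
        times.append(t)
        # The failing bug is chosen with probability proportional to its rate.
        i = rng.choices(range(len(rates)), weights=rates)[0]
        if rng.random() < p:
            rates[i] = draw_rate(rng)   # imperfect debugging: bug replaced
        else:
            rates.pop(i)                # perfect debugging: bug removed
    return times

# Example: Gamma mixing with shape 2.0 and scale 0.5 (illustrative values only).
rng = random.Random(0)
times = simulate_failure_times(50, 0.1, lambda r: r.gammavariate(2.0, 0.5), 5.0, rng)
counts = [sum(1 for s in times if s <= t) for t in (1.0, 2.0, 5.0)]  # N(t) at a few t
```

Taking `draw_rate` to be a constant function gives equal rates for all bugs, which with p = 0 corresponds to the Jelinski–Moranda setting; with p > 0 the number of observed failures can exceed n_0.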