Journal of Software Testing, Verification, and Reliability (to appear)

An Empirical Investigation of the Relationship Between Spectra Differences and Regression Faults

Mary Jean Harrold†, Gregg Rothermel‡, Kent Sayre‡, Rui Wu*, Liu Yi‡

† College of Computing, Georgia Institute of Technology, Atlanta, GA 30332
harrold@cc.gatech.edu

‡ Department of Computer Science, Oregon State University, Corvallis, OR 97331
{grother,ksayre,liuyi}@cs.orst.edu

* Department of Computer and Information Science, Ohio State University, Columbus, OH 43210
rwu@cis.ohio-state.edu

Abstract

Many software maintenance and testing tasks involve comparing the behaviors of program versions. Program spectra have recently been proposed as a heuristic for use in performing such comparisons. To assess the potential usefulness of spectra in this context an experiment was conducted, examining the relationship between differences in program spectra and the exposure of regression faults (faults existing in a modified version of a program that were not present prior to modifications, or not revealed in previous testing), and empirically comparing several types of spectra. The results reveal that certain types of spectra differences correlate with high frequency, at least in one direction, with the exposure of regression faults. That is, when regression faults are revealed by particular inputs, spectra differences are likely also to be revealed by those inputs, though the reverse is not true. The results also suggest that several types of spectra that appear, analytically, to offer greater precision in predicting the presence of regression faults than other, cheaper, spectra may provide no greater precision in practice. These results have ramifications for future research on, and for the practical uses of, program spectra.

Keywords: program spectra, software testing, empirical studies

1 Introduction

Various software testing and maintenance tasks require comparisons of the behaviors of program versions.
For example, when a program is modified, regression testing is used to compare the behavior of the modified version to the behavior of its previous version, in the hope of detecting regression faults – faults existing in a modified version of a program that were not present prior to modifications, or not revealed in previous testing. Similarly, when a modified program fails, the behaviors of versions are compared in the hope of locating the cause of that failure. Tasks such as these constitute a significant percentage of the costs of software testing and maintenance, and thus, techniques that reduce the costs of these tasks are valuable.