The Effects of Time Constraints on Test Case Prioritization: A Series of Controlled Experiments

Hyunsook Do
North Dakota State U.
hyunsook.do@ndsu.edu

Siavash Mirarab, Ladan Tahvildari
U. of Waterloo
{smirarab, ltahvild}@uwaterloo.ca

Gregg Rothermel
U. of Nebraska - Lincoln
grother@cse.unl.edu

December 6, 2009

Abstract

Regression testing is an expensive process used to validate modified software. Test case prioritization techniques improve the cost-effectiveness of regression testing by ordering test cases such that those that are more important are run earlier in the testing process. Many prioritization techniques have been proposed and evidence shows that they can be beneficial. It has been suggested, however, that the time constraints that can be imposed on regression testing by various software development processes can strongly affect the behavior of prioritization techniques. If this is correct, a better understanding of the effects of time constraints could lead to improved prioritization techniques, and improved maintenance and testing processes. We therefore conducted a series of experiments to assess the effects of time constraints on the costs and benefits of prioritization techniques. Our first experiment manipulates time constraint levels and shows that time constraints do play a significant role in determining both the cost-effectiveness of prioritization and the relative cost-benefit tradeoffs among techniques. Our second experiment replicates the first experiment, controlling for several threats to validity including numbers of faults present, and shows that the results generalize to this wider context. Our third experiment manipulates the numbers of faults present in programs to examine the effects of faultiness levels on prioritization, and shows that faultiness level affects the relative cost-effectiveness of prioritization techniques.
Taken together, these results have several implications for test engineers wishing to cost-effectively regression test their software systems. These include suggestions about when and when not to prioritize, what techniques to employ, and how differences in testing processes may relate to prioritization cost-effectiveness.

Keywords: regression testing, test case prioritization, cost-benefits, Bayesian networks, empirical studies.

1 Introduction

Software systems that succeed must evolve. Software engineers who enhance and maintain systems, however, run the risk of adversely affecting system functionality. To reduce this risk, engineers rely on regression testing: they rerun test cases from existing test suites, and create and run new test cases, to build confidence that changes have the intended effects and no unintended side-effects.

Regression testing is almost universally employed by software organizations [39]. It is important for software quality, but it can also be prohibitively expensive. For example, we are aware of one software development organization that has, for one of its primary products, a regression test suite containing over 30,000 functional test cases that require over 1,000 machine hours to execute. Hundreds of hours of engineer time are also needed to oversee this regression