Automated Test Case Generation based on Coverage Analysis

Tim A. Majchrzak
Department of Information Systems, University of Muenster, Muenster, Germany
Email: tima@wi.uni-muenster.de

Herbert Kuchen
Department of Information Systems, University of Muenster, Muenster, Germany
Email: kuchen@uni-muenster.de

Abstract

We present a tool for the automated generation of unit tests. It symbolically executes Java bytecode in order to find execution paths through a program. To accomplish this task efficiently, it uses constraint solving, choice-point generation and backtracking. As the number of test cases found might be very high and most of them are redundant, we propose a novel way to eliminate test cases based on their contribution to the global coverage of the control flow and data flow. Besides discussing the techniques used to achieve this, we present experimental results to demonstrate the feasibility of our approach.

1. Introduction

Testing software is a cumbersome and expensive task. Unit testing is an important part of the overall test process as it offers the chance to discover a substantial share of errors in early development phases. Today, software developers and testers have to write unit tests manually, which is a demanding task and often seen as a boring job. Even worse, it is hard to write unit tests that guarantee the desired coverage of the code. Ideally, unit tests could be created automatically with little or no intervention by testers. We hence investigate the automated generation of unit tests.

Our tool, Muggl (Muenster generator of glass-box test cases), is based on GlassTT [1], which was designed at our department. Due to changes in fundamental design principles, we have rewritten Muggl from scratch. It only incorporates the constraint solver built for GlassTT, which has proven to be powerful. With the design changes made, we aim at learning from problems encountered with symbolic execution while keeping the already known advantages.
Along with the main proposal of this paper, we will discuss some further advantages over the old tool.

Muggl symbolically executes class files containing bytecode, as generated for example by the Java compiler javac. It uses a symbolic implementation of the Java virtual machine (JVM) [2] that offers the ability to treat the input parameters of a method as logic variables. Conditional jumps and other instructions that can alter the control flow generate choice points when executed. Using a search algorithm that processes the tree of potential paths through the program, Muggl tries to determine sets of parameters for the method under test. Each set of parameters found, together with the corresponding result, constitutes a test case. Since the number of paths through a program is typically infinite, we need means to select a small finite set of representative test cases. In the approach presented, this selection is based on control-flow and data-flow coverage: we suggest using control-flow and data-flow information derived from Java bytecode to eliminate redundant test cases.

This paper is structured as follows. Section 2 introduces Muggl. In Section 3 we discuss how we generate control-flow and data-flow information and keep track of the coverage. The elimination algorithm is described in Section 4, followed by experimental results in Section 5. Section 6 presents related work and Section 7 draws a conclusion.

2. Muggl

2.1. Basic Ideas and Architecture

Instead of using source code, Muggl processes class files consisting of Java bytecode [2]. Using bytecode instead of source code has two main advantages. Firstly, optimizations done by the compiler are taken into account. Secondly, many languages can be compiled to Java bytecode, and all of them can be tested using just one tool. As a low-level representation, Java bytecode is similar in concept to the Microsoft Common Intermediate Language (CIL) [3] and even, with limitations, to assembler code.
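The choice-point search outlined in the Introduction can be illustrated with a minimal sketch. It is not Muggl's implementation: it assumes a toy program with a single int input x whose only conditions have the form "x < bound", and it replaces the constraint solver by simple interval narrowing. All names (PathExplorer, Node, Choice, TestCase, sampleProgram) are hypothetical and chosen for illustration only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy path explorer for programs over one int input x with conditions "x < bound". */
class PathExplorer {

    /** Program node: either a conditional jump on "x < value" or a return of value. */
    static final class Node {
        final boolean isReturn;
        final int value;            // bound of the condition, or the returned result
        final Node ifTrue, ifFalse; // successors of a conditional jump (null for returns)

        private Node(boolean isReturn, int value, Node ifTrue, Node ifFalse) {
            this.isReturn = isReturn; this.value = value;
            this.ifTrue = ifTrue; this.ifFalse = ifFalse;
        }
        static Node ret(int result) { return new Node(true, result, null, null); }
        static Node ifLess(int bound, Node t, Node f) { return new Node(false, bound, t, f); }
    }

    /** Open alternative of a choice point: where to resume, under lo <= x <= hi. */
    static final class Choice {
        final Node node; final long lo, hi;
        Choice(Node node, long lo, long hi) { this.node = node; this.lo = lo; this.hi = hi; }
    }

    /** A generated test case: concrete input plus the result returned on that path. */
    static final class TestCase {
        final int input, expected;
        TestCase(int input, int expected) { this.input = input; this.expected = expected; }
        @Override public String toString() { return "f(" + input + ") == " + expected; }
    }

    /** Depth-first search over the path tree; the stack holds open choice-point alternatives. */
    static List<TestCase> explore(Node entry) {
        List<TestCase> tests = new ArrayList<>();
        Deque<Choice> open = new ArrayDeque<>();
        open.push(new Choice(entry, Integer.MIN_VALUE, Integer.MAX_VALUE));
        while (!open.isEmpty()) {
            Choice c = open.pop();               // backtrack to the latest open alternative
            if (c.lo > c.hi) continue;           // constraints unsatisfiable: prune this path
            if (c.node.isReturn) {               // path complete: lo is one satisfying input
                tests.add(new TestCase((int) c.lo, c.node.value));
                continue;
            }
            long bound = c.node.value;
            // Choice point: push the "x >= bound" branch first so "x < bound" is explored first.
            open.push(new Choice(c.node.ifFalse, Math.max(c.lo, bound), c.hi));
            open.push(new Choice(c.node.ifTrue, c.lo, Math.min(c.hi, bound - 1)));
        }
        return tests;
    }

    /** if (x < 0) return -1; if (x < 100) return 1; if (x < 50) return 99; return 2; */
    static Node sampleProgram() {
        return Node.ifLess(0, Node.ret(-1),
               Node.ifLess(100, Node.ret(1),
               Node.ifLess(50, Node.ret(99), Node.ret(2))));
    }

    public static void main(String[] args) {
        for (TestCase t : explore(sampleProgram())) System.out.println(t);
    }
}
```

For the sample program, the search yields one test case per feasible path. The branch returning 99 requires both x >= 100 and x < 50, so its interval is empty and the path is discarded without producing a test case. A real symbolic JVM has to handle loops, which make the path tree unbounded, and this is why a coverage-based selection of test cases becomes necessary.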
Extending the tool to CIL and possibly assembler is hence a future option (see [4] for possible problems). Having no source code available is unproblematic, as bytecode can be linked to source code with constructs like the LineNumberTable attribute [2].

Dynamically analyzing a program requires the program code to be executed in a runtime environment. Since the kind of analysis done by Muggl is sophisticated and employs symbolic execution as well as manipulation techniques, we decided against a vendor-supplied runtime environment. Rather than utilizing interfaces to the JVM developed by Sun, Muggl uses its own virtual machine. It implements the JVM specification for Java 1.4 except for threading. Currently the changes for Java 1.5 and 1.6 are completed.

Muggl's architecture is separated into a GUI and the execution core (Fig. 1). The GUI is designed to be used seamlessly.

2009 Third IEEE International Symposium on Theoretical Aspects of Software Engineering, 978-0-7695-3757-3/09 $25.00 © 2009 IEEE, DOI 10.1109/TASE.2009.33