Fault Detection Likelihood of Test Sequence Length
Fevzi Belli, Michael Linschulte
University of Paderborn,
Germany
e-mail:
{belli, linschu}@upb.de
Christof J. Budnik
Siemens Corporation, Corporate Research,
Princeton, NJ 08540, USA
e-mail:
christof.budnik@siemens.com
Harald A. Stieber
University of Applied Sciences
Nuremberg, Germany
e-mail:
harald.stieber@ohm-hochschule.de
Abstract— Testing of graphical user interfaces is important
due to its potential to reveal faults in the operation and
performance of the system under consideration. Most existing
test approaches generate test cases as sequences of events of
different lengths. The cost of the test process depends on the
number and total length of those test sequences. One of the
problems encountered is determining the test sequence length.
A widely accepted hypothesis is that the longer the test
sequences, the higher the chance of detecting faults. However,
there is no evidence that increasing the test sequence length
really affects fault detection. This paper introduces a
reliability-theoretical approach to analyze the problem in the
light of real-life case studies. Based on a reliability growth
model, the paper predicts the expected number of additional
faults that will be detected when the length of test sequences
is increased.
Keywords: Software Testing, Graphical User Interfaces,
Event Sequence Graphs, Software Reliability
I. INTRODUCTION
Graphical user interfaces (GUIs) account for half or more
of the source code in software [1]. Testing GUIs is a difficult
and challenging task for many reasons. First, the input space
possesses a potentially indefinite number of combinations of
events. Second, even simple GUIs possess an enormous number of
states due to interaction with the inputs. Last but not least,
many complex dependencies may hold between different states of
the GUI system, and between its states and inputs. Test inputs
for GUIs usually represent sequences of graphical object
activities and/or selections that operate interactively with
the objects, such as the Interaction Sequences and Event
Sequences in [2, 11, 13, 15].
During testing, the crucial decision is when to stop testing
(test termination problem) [1, 3, 13]. After a set of test
cases has been exercised, the test results may be satisfactory,
yet they are limited to those particular test cases. Thus, the
quality judgment of a system under consideration (SUC) needs
further quantitative arguments, usually realized by
well-defined coverage criteria. Most of the existing approaches
are based on test sequences to be covered when testing GUIs
[4, 15]. The present paper analyzes the dependence of fault
detection on the length of test sequences. Thus, the question
we attempt to answer is:
To what extent does the likelihood of detecting faults
increase if the length of the test sequences is increased?
To answer this question, the analysis considers the following
aspects:
• Suitable software reliability growth models are selected
by their appropriateness for predicting the expected
number of additional faults that will be detected when
increasing the length of test sequences.
• For our experiments, the length of sequences varied from
2 to 4, defining three groups of test sets which needed
special care for estimating model parameters.
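To make the notion of test sequences of length 2 to 4 concrete, the following is a minimal Python sketch, not the procedure used in the case studies. It assumes that a sequence of length k is a walk over k events in a directed graph whose edges mean "may be followed by". The graph `ESG`, its event names, and the helper `sequences_of_length` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy event graph (assumption, not taken from the paper):
# each key is an event, each value lists the events that may follow it.
ESG: Dict[str, List[str]] = {
    "open": ["edit", "close"],
    "edit": ["edit", "save", "close"],
    "save": ["edit", "close"],
    "close": [],
}


def sequences_of_length(graph: Dict[str, List[str]], k: int) -> List[Tuple[str, ...]]:
    """Return every walk in `graph` that consists of exactly k events."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Start with all single events, then extend each walk by one successor
    # per round until the walks contain k events.
    walks: List[Tuple[str, ...]] = [(event,) for event in graph]
    for _ in range(k - 1):
        walks = [walk + (nxt,) for walk in walks for nxt in graph[walk[-1]]]
    return walks


if __name__ == "__main__":
    for k in (2, 3, 4):
        seqs = sequences_of_length(ESG, k)
        total_events = sum(len(s) for s in seqs)
        print(f"length {k}: {len(seqs)} sequences, {total_events} events in total")
```

Even in this small graph the number of sequences, and with it the total number of events to execute, grows with k, which is the cost side of the trade-off the paper examines.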
The data used for the reliability analysis performed in
this paper are taken from our previous paper [4], which
presented event sequence graphs (ESGs) as a testing approach
that enables testing with sequences of different lengths. ESGs,
similar to the concept of event flow graphs [15, 16], are used
for analysis and validation of user interface requirements
prior to implementation and testing of the code. The present
paper chooses the ESG notation [2, 3] because it makes
intensive use of formal, graph-theoretical notions and
algorithms which were developed independently of, and prior to,
event flow graphs.
Related work is summarized in the next section. Section 3
introduces the terminology and notions used in our approach,
and discusses various aspects of software reliability
determination. Section 4 reports on two case studies. The
first one is performed on a public domain software system
for personal music management. The second one is performed on
a large commercial online touristic reservation system called
ISELTA. Reliability analysis of the results is carried out in
Section 5. Section 6 concludes the paper by summarizing the
results and outlining our planned research work.
II. RELATED WORK
Methods based on finite-state automata have long been used
for modeling and validating complex systems, e.g., for
conformance testing [8, 13, 18], as well as for specification
and testing of system behavior [1, 2, 4, 13]. Numerous methods
for GUI testing, including convincing empirical studies to
validate the approaches, have been introduced in [9, 15, 16,
19]. These methods are quite different from the combinatorial
ones, e.g., pairwise testing, which requires that for each pair
of input parameters of a system, every combination of these
parameters' valid values must be covered by at least one test
case [21].
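The pairwise requirement stated above can be checked mechanically. The following Python sketch is an illustration under assumptions, not a method from [21]: the function `uncovered_pairs` and the parameter names `a`, `b`, `c` are hypothetical. It lists every value combination of every parameter pair that no test case covers.

```python
from itertools import combinations, product
from typing import Dict, Hashable, List, Set, Tuple

Pair = Tuple[Tuple[str, Hashable], Tuple[str, Hashable]]


def uncovered_pairs(
    params: Dict[str, List[Hashable]],
    tests: List[Dict[str, Hashable]],
) -> Set[Pair]:
    """Return the value pairs that no test case in `tests` covers.

    `params` maps each parameter name to its valid values; each test case
    assigns one value to every parameter.
    """
    names = sorted(params)
    # Every combination of valid values for every pair of parameters.
    required: Set[Pair] = set()
    for p, q in combinations(names, 2):
        for vp, vq in product(params[p], params[q]):
            required.add(((p, vp), (q, vq)))
    # The value pairs that the given test cases actually exercise.
    covered: Set[Pair] = set()
    for test in tests:
        for p, q in combinations(names, 2):
            covered.add(((p, test[p]), (q, test[q])))
    return required - covered


if __name__ == "__main__":
    # Hypothetical system with three binary input parameters.
    params = {"a": [0, 1], "b": [0, 1], "c": [0, 1]}
    tests = [
        {"a": 0, "b": 0, "c": 0},
        {"a": 0, "b": 1, "c": 1},
        {"a": 1, "b": 0, "c": 1},
        {"a": 1, "b": 1, "c": 0},
    ]
    print("uncovered pairs:", uncovered_pairs(params, tests))
```

In this example four test cases are enough to cover all pairs, whereas exhaustive testing of three binary parameters would need eight.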
A different approach to GUI testing has been introduced
in [16], which deploys methods of knowledge engi-
2010 Third International Conference on Software Testing, Verification and Validation
978-0-7695-3990-4/10 $26.00 © 2010 IEEE
DOI 10.1109/ICST.2010.51