16 IEEE SOFTWARE Published by the IEEE Computer Society 0740-7459/10/$26.00 © 2010 IEEE
What if someone argued that one of your basic conceptions about how to develop software was misguided? What would it take to change your mind?
That’s essentially the dilemma faced by advocates of test-driven development (TDD). The TDD paradigm argues that the basic cycle of developing code and then testing it to make sure it does what it’s supposed to do (a habit drilled into most of us from the time we began learning software development) isn’t the most effective approach. TDD inverts the traditional “code then test” cycle: first, you develop test cases for a small increment of functionality; then you write code that makes those tests pass. After each increment, you refactor the code to maintain code quality.[1]
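The red-green-refactor rhythm described above can be sketched in a few lines of Python. This is a minimal illustration under assumed names: the shopping-cart scenario and the `total_price` function are hypothetical examples, not drawn from the study.

```python
# Step 1 (red): write the tests first, before the code exists.
# Running them at this point fails with a NameError, which is the
# point: the tests define the behavior we want, in advance.
def test_empty_cart_totals_zero():
    assert total_price([]) == 0

def test_sums_item_prices():
    assert total_price([3, 4]) == 7

# Step 2 (green): write just enough code to make the tests pass.
def total_price(prices):
    return sum(prices)

# Step 3 (refactor): with the tests as a safety net, clean up the
# implementation, rerunning the tests after every change.
test_empty_cart_totals_zero()
test_sums_item_prices()
```

In practice each increment is this small: a test or two, the minimal code to satisfy them, then refactoring before the next increment.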
TDD proponents assert that frequent, incremental testing not only improves the delivered code’s quality but also generates a cleaner design. If you haven’t already tried TDD, what data might convince you to try radically changing your software development approach to get those benefits? Would the experience of a recognized expert help? In this column, we offer both data regarding TDD’s effectiveness and the critique of an expert based on applying it in the field.
Compiling the Evidence
Our data comes from a study conducted by five of us: Burak Turhan, Lucas Layman, Madeline Diep, Forrest Shull, and Hakan Erdogmus.[2] The study was based on a systematic literature review to aggregate demonstrated evidence about TDD’s effectiveness. The review searched the literature from 1999 onward, looking for any study that provided some quantitative assessment of TDD’s effectiveness compared to traditional software development. The search results were filtered for quality, which left 22 published articles that described 33 unique studies.
The review distinguished three types of studies:

■ Controlled experiments compared TDD to traditional development under controlled conditions to minimize the effects of confounding factors, such as developer experience or the type of software being developed.
■ Pilot studies reported comparisons under somewhat realistic conditions but tended to be of short duration or on small problems.
■ Industry studies reported comparisons regarding TDD’s effectiveness on real projects being developed for a customer under real commercial pressures.
Reasoning that more rigorous studies might be fewer in number but should be more trustworthy, the reviewers defined a category of “high rigor” studies that met the following conditions:

■ The subjects included only graduate students or professionals, that is, people who are more experienced than the general population and who should behave the most like developers in industry or government organizations.
■ The study used a TDD process description that matched the textbook definition and
What Do We Know about Test-Driven Development?
Forrest Shull, Grigori Melnik, Burak Turhan, Lucas Layman, Madeline Diep, and Hakan Erdogmus

Voice of Evidence
Editor: Forrest Shull ■ Fraunhofer Center for Experimental Software Engineering, Maryland ■ fshull@fc-md.umd.edu