REMOTE SENS. ENVIRON. 47:362-368 (1994)
Assessing the Classification Accuracy of
Multisource Remote Sensing Data
R. W. Fitzgerald* and B. G. Lees*
Classification accuracy has traditionally been expressed
by the overall accuracy percentage computed from the
sum of the diagonal elements of the error, confusion, or
misclassification matrix resulting from the application of
a classifier. This article assesses the adequacy of the
overall accuracy measure and demonstrates that it can
give misleading and contradictory results. The Kappa test
statistic assesses interclassifier agreement and is applied
in assessing the classification accuracy of two classifiers,
a neural network and a decision tree model, on the same
data set. The Kappa statistic is shown to be a more discern-
ing statistical tool for assessing the classification accuracy
of different classifiers and has the added advantage of
being statistically testable against the standard normal
distribution. It gives the analyst better interclass discrimination
than the overall accuracy measure. The authors
recommend that the Kappa statistic be used in preference
to the overall accuracy as a means of assessing classifica-
tion accuracy.
INTRODUCTION
In his timely article on assessing classification accuracy,
Congalton (1991) reviewed a number of interrelated
aspects of classification accuracy. The aspects reviewed
by Congalton include site- and non-site-specific accu-
racy, statistical techniques (the Kappa test statistic) for
comparing between classifiers, ground truthing errors,
the discreteness of classification schemes, spatial auto-
correlation, and sampling issues. Like Congalton, we
believe that the assessment of errors in the classification
of remote sensing and GIS data has been poorly exam-
ined and deserves intensive scrutiny.
*Geography Department, Australian National University, Can-
berra.
Address correspondence to R. W. Fitzgerald, Geography Dept.,
Australian National Univ., Canberra ACT 0200, Australia.
Received 29 April 1992; revised 8 June 1993.
The article examines two of the issues raised by
Congalton (1991): the assessment of site-specific accu-
racy by the application of a statistical technique which
was designed to test interclassifier agreement and the
Kappa test statistic itself. In doing so, we demonstrate
that the accepted method of assessing classification accuracy,
the overall accuracy percentage, is misleading, especially
when applied at the class comparison level.
The Kappa statistic is shown to be a statistically more
sophisticated measure of interclassifier agreement than
the overall accuracy, and to give better interclass discrimination.
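To make the contrast concrete, the following is a minimal sketch of both measures computed from an error matrix. It assumes the usual definition of Kappa as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (the overall accuracy) and p_e is the agreement expected by chance from the row and column totals. The two matrices and the function names are hypothetical illustrations, not data from this study.

```python
def overall_accuracy(matrix):
    """Sum of the diagonal elements divided by the total number of samples."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return diagonal / total


def kappa(matrix):
    """Kappa = (p_o - p_e) / (1 - p_e) for a square error matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_o = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement: sum over classes of (row total * column total) / N^2
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical two-class error matrices (rows: reference, columns: classified).
balanced = [[45, 5], [5, 45]]   # both classes classified well
skewed = [[90, 5], [5, 0]]      # the minority class is never classified correctly

for name, m in (("balanced", balanced), ("skewed", skewed)):
    print(name, round(overall_accuracy(m), 3), round(kappa(m), 3))
```

Both hypothetical matrices have an overall accuracy of 0.9, yet Kappa is 0.8 for the first and slightly negative for the second, in which one class is never correctly identified.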
The article advances the application of the Kappa
statistic by using it to compare the classification results
of two supervised classifiers, a neural network and a decision
tree classifier, applied to the same input data set.
Each classifier successfully fused multisource remote
sensing and GIS images into a single classified floristic
land cover image. These two fused images were compared
to highlight the relative utility of the Kappa statistic
over the overall accuracy percentage, especially at the
interclass level.
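A comparison of this kind can be sketched as a Z test on the difference between two Kappa values, referred to the standard normal distribution. The sketch below is illustrative only. It assumes the simple large-sample variance approximation p_o(1 - p_o) / (N(1 - p_e)^2), which is not necessarily the variance estimator used in this study, and it treats the two error matrices as independent samples. The matrices and function names are hypothetical.

```python
import math


def kappa_and_variance(matrix):
    """Return Kappa and an approximate large-sample variance for a square error matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_o = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    k = (p_o - p_e) / (1 - p_e)
    # Assumption: simple approximation, not the full delta-method variance.
    variance = p_o * (1 - p_o) / (total * (1 - p_e) ** 2)
    return k, variance


def z_between(matrix_a, matrix_b):
    """Z statistic for the difference between two Kappa values."""
    k_a, v_a = kappa_and_variance(matrix_a)
    k_b, v_b = kappa_and_variance(matrix_b)
    return (k_a - k_b) / math.sqrt(v_a + v_b)


# Hypothetical error matrices for two classifiers.
classifier_a = [[45, 5], [5, 45]]
classifier_b = [[30, 20], [20, 30]]

z = z_between(classifier_a, classifier_b)
# |Z| > 1.96 indicates a significant difference at the 5% level (two-sided).
print(round(z, 2), abs(z) > 1.96)
```

For these hypothetical matrices the Kappa values are 0.8 and 0.2, and the Z statistic is well above 1.96, so the difference would be judged significant under the stated assumptions.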
A backpropagation neural network (Fitzgerald and
Lees, 1992) and a decision tree model (Lees and Ritman,
1991) were applied to the task of floristic land cover
classification. Both classification techniques come from
the artificial intelligence field and have been applied in
many disciplines where traditional statistical classifiers
have been found wanting.
Decision trees are rule-based classifiers that apply
top-down induction to the input data, assigning each
input record to a class at the end of a branch according
to user-defined decision rules. The resulting classification
is robust against the underlying statistical distributions
of the input variables and easily mixes continuous
variables (remote sensing data) with categorical variables (GIS
attribute data). It does, however, require a large number
of training sites, which grows as the square of the number of
decision nodes.
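The way a decision tree routes a record through rules that mix continuous and categorical inputs can be sketched as follows. This is a toy illustration of applying rules only, not of inducing them, and it is not the model of Lees and Ritman (1991). The variable names (band4, elevation, geology), the thresholds, and the class labels are all hypothetical.

```python
def classify(record):
    """Route one input record down a small, hand-written decision tree."""
    # Node 1: categorical GIS attribute (hypothetical).
    if record["geology"] == "sandstone":
        # Node 2: continuous remote sensing value (hypothetical threshold).
        if record["band4"] > 60.0:
            return "class_A"
        return "class_B"
    # Node 3: continuous terrain variable (hypothetical threshold).
    if record["elevation"] > 200.0:
        return "class_C"
    return "class_D"


# Hypothetical input records, one per pixel.
records = [
    {"geology": "sandstone", "band4": 72.0, "elevation": 150.0},
    {"geology": "sandstone", "band4": 41.0, "elevation": 150.0},
    {"geology": "shale", "band4": 55.0, "elevation": 320.0},
    {"geology": "shale", "band4": 55.0, "elevation": 90.0},
]

labels = [classify(r) for r in records]
print(labels)
```

Each test compares a single variable against a rule, so no assumption about the statistical distribution of the inputs is needed, and categorical and continuous variables can sit at different nodes of the same tree.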
The most popular and robust neural network is
0034-4257 / 94 / $7.00
©Elsevier Science Inc., 1994
655 Avenue of the Americas, New York, NY 10010