852 IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 43, NO. 4, APRIL 2005
Use of the Bradley–Terry Model to Quantify
Association in Remotely Sensed Images
Alfred Stein, Jagannath Aryal, and Gerrit Gort
Abstract—Thematic maps prepared from remotely sensed im-
ages require a statistical accuracy assessment. For this purpose,
the κ-statistic is often used. This statistic does not distinguish
between whether one unit is classified as another, or vice versa. In
this paper, the Bradley–Terry (BT) model is applied for accuracy
assessment. This model compares categories pairwise. The prob-
ability of one class over another class is estimated as well as the
expected values of class pixels. The study is illustrated with an Ad-
vanced Spaceborne Thermal Emission and Reflection Radiometer
image from the Netherlands, to which a maximum-likelihood
classification with the Euclidean distance is applied. An error
matrix is generated using an IKONOS image from the same area
as ground truth. It is shown to which degree the BT model extends
the κ-statistic. A comparison with the Mahalanobis distance
is made. Standardization is carried out to overcome problems
emerging from the fact that a common BT model does not include
the number of correctly classified pixels. The study shows how the
BT model serves as an alternative to the usual κ-statistic.
Index Terms—Bradley–Terry (BT) model, estimates of parame-
ters, measures of association.
I. INTRODUCTION
REMOTELY sensed images are widely used for collection
of spatial information, e.g., for the preparation of maps.
The collection of data by remote sensing is generally more
efficient and cheaper than by direct observation and measure-
ment on the ground. Data collected by sensors, however, may
be affected by atmospheric factors between the sensor and the
earth surface, local impurities at the earth
surface, technical deficiencies of sensors, and other factors.
Consequently, the quality of maps produced by remote sensing
needs to be assessed [1]. Classification accuracy refers to the
extent to which the classified image or map corresponds with
the description of a class at the earth surface. This is commonly
described by an error matrix, in which the overall accuracy and
the accuracy of the individual classes can be calculated. The
κ-statistic may then be used to test for homogeneity [2].
Its use is somewhat controversial, because its value depends
strongly on the marginal distributions [3]. Indeed, it measures
the strength of agreement without taking into account the
strength of disagreement.
Manuscript received July 22, 2004; revised August 30, 2004.
A. Stein is with International Institute for Geo-Information Science and Earth
Observation (ITC), 7500 AA Enschede, The Netherlands (e-mail: stein@itc.nl).
J. Aryal is with His Majesty’s Government of Nepal, Ministry of Land Re-
form and Management, Kathmandu, Nepal (e-mail: aryal@itc.nl).
G. Gort is with Biometris, Wageningen University, 6700 AC, Wageningen,
The Netherlands.
Digital Object Identifier 10.1109/TGRS.2005.843569
As an alternative, we focus in this study on pairwise comparisons,
dealing with the structure of agreement and disagreement
between categories. An extension to the κ-statistic is the
Bradley–Terry (BT) model [4]. Based on the logistic regression
model, it may provide more detail on the strength and direction
of disagreement. It has been used in the past, but as far as we are
aware never for remote sensing images.
The aim of the research described in this paper is to study the use
of the BT model as an alternative to the κ-statistic for measuring
association in remotely sensed images. To do so, we implemented
and interpreted a test of significance of the BT model for
paired preferences in a supervised maximum-likelihood classification
of classes from an Advanced Spaceborne Thermal Emission
and Reflection Radiometer (ASTER) image from 2000. The BT
model is fitted to the error matrix, and a test of significance for
the parameter estimate of every class is carried out.
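As a rough illustration of what fitting a BT model to an error matrix can look like, the sketch below estimates class "strengths" π_i such that the probability of class i being preferred over class j is π_i / (π_i + π_j). Several things here are assumptions and not taken from the paper: the function name `fit_bradley_terry` is hypothetical, the off-diagonal cell (i, j) is read as a count of preferences for class i over class j, and the estimates are obtained with the classical iterative (minorization-maximization) update, which may differ from the logistic regression software the authors used. The diagonal is ignored, in line with the remark in the abstract that a common BT model does not include the correctly classified pixels. No significance test is included.

```python
import numpy as np

def fit_bradley_terry(counts, n_iter=1000, tol=1e-10):
    """Estimate Bradley-Terry strengths from a square matrix of counts.

    Assumption: counts[i, j] (i != j) is the number of times class i was
    "preferred" over class j; the diagonal is ignored. Every class is
    assumed to have at least one win and one loss.
    """
    w = np.asarray(counts, dtype=float).copy()
    np.fill_diagonal(w, 0.0)            # correctly classified pixels are not used
    wins = w.sum(axis=1)                # total preferences for each class
    pairs = w + w.T                     # comparisons made between each pair
    pi = np.full(len(w), 1.0 / len(w))  # start from equal strengths
    for _ in range(n_iter):
        denom = (pairs / (pi[:, None] + pi[None, :])).sum(axis=1)
        new_pi = wins / denom
        new_pi /= new_pi.sum()          # fix the scale: strengths sum to 1
        converged = np.max(np.abs(new_pi - pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi

# Hypothetical 3-class error matrix (not data from the paper).
example = [[90, 12, 7],
           [5, 80, 9],
           [3, 4, 70]]
strengths = fit_bradley_terry(example)
# Estimated probability that class 0 is preferred over class 1.
p_01 = strengths[0] / (strengths[0] + strengths[1])
```

Because only ratios of the strengths matter, the normalization to a sum of one is a convention; fixing one class as a reference with strength 1 would give the same pairwise probabilities.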
II. CONCEPTS AND METHODS
A common aim in classifying a multispectral image is to au-
tomatically categorize all pixels in the image into classes [5].
Each pixel in the image is assigned in a Boolean fashion into
one class. As a result, a thematic layer from the multispectral
image emerges. Digital supervised image classification depends
on choices of land cover or land use type as based upon the
study area. A classification algorithm gives a classified thematic
layer after performing the automated classification. Pixel-by-
pixel spectral information is used as the basis for automated land
cover classification. The thematic layer is assessed next in terms
of accuracy by relating the thematic characteristics of a scene
to identified objects [6]. Representative samples are selected for
each class from different locations of the image. The amount of
training data usually ranges from less than 1% to 5% of the pixels
[7].
Several classification algorithms exist [8]. Accuracy assess-
ment of classification is usually carried out by evaluating error
matrices. An error matrix is a square matrix, with an equal
number of rows and columns. The columns contain the reference
data, and the rows contain the classified data [2], [7].
On the basis of the error matrix, overall classification accuracy,
producer’s accuracy, and user’s accuracy are calculated. In addition,
the κ-statistic for each individual category within the matrix
can be calculated. In all these cases, the accuracy of any
individual pixel is associated in a strictly Boolean fashion, i.e.,
the classification is either right or wrong.
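The measures named above can be sketched as follows, using the convention stated in the text (rows are classified data, columns are reference data). The function name `accuracy_measures` and the example matrix are hypothetical. Only the overall κ is computed; the per-category κ is left out for brevity.

```python
import numpy as np

def accuracy_measures(error_matrix):
    """Return overall accuracy, producer's and user's accuracies, and kappa.

    Rows are classified data, columns are reference data.
    """
    m = np.asarray(error_matrix, dtype=float)
    n = m.sum()
    diag = np.diag(m)
    row_tot = m.sum(axis=1)             # pixels assigned to each class
    col_tot = m.sum(axis=0)             # reference pixels in each class
    overall = diag.sum() / n
    users = diag / row_tot              # correct share of each classified class
    producers = diag / col_tot          # correct share of each reference class
    chance = (row_tot * col_tot).sum() / n**2   # agreement expected by chance
    kappa = (overall - chance) / (1.0 - chance)
    return overall, producers, users, kappa

# Hypothetical 3-class error matrix (not data from the paper).
overall, producers, users, kappa = accuracy_measures(
    [[50, 5, 5],
     [10, 40, 10],
     [0, 5, 25]]
)
```

Note that κ depends on the matrix only through the diagonal and the row and column totals, so it cannot tell whether class i was mistaken for class j or the reverse.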
A. Bradley–Terry Model
The Bradley–Terry (BT) model makes a pairwise compar-
ison among individuals, classes, or categories. It focuses
0196-2892/$20.00 © 2005 IEEE