852 IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 43, NO. 4, APRIL 2005

Use of the Bradley–Terry Model to Quantify Association in Remotely Sensed Images

Alfred Stein, Jagannath Aryal, and Gerrit Gort

Abstract—Thematic maps prepared from remotely sensed images require a statistical accuracy assessment. For this purpose, the κ-statistic is often used. This statistic does not distinguish between whether one unit is classified as another, or vice versa. In this paper, the Bradley–Terry (BT) model is applied for accuracy assessment. This model compares categories pairwise. The probability of one class over another class is estimated as well as the expected values of class pixels. The study is illustrated with an Advanced Spaceborne Thermal Emission and Reflection Radiometer image from the Netherlands, to which a maximum-likelihood classification with the Euclidean distance is applied. An error matrix is generated using an IKONOS image from the same area as ground truth. It is shown to which degree the BT model extends the κ-statistic. A comparison with the Mahalanobis distance is made. Standardization is carried out to overcome problems emerging from the fact that a common BT model does not include the number of correctly classified pixels. The study shows how the BT model serves as an alternative to the usual κ-statistic.

Index Terms—Bradley–Terry (BT) model, estimates of parameters, measures of association.

I. INTRODUCTION

Remotely sensed images are widely used for collection of spatial information, e.g., for the preparation of maps. The collection of data by remote sensing is generally more efficient and cheaper than by direct observation and measurement on the ground. Data collected by sensors, however, may be affected by atmospheric factors between sensor and the value reflected at the earth surface, local impurities at the earth surface, technical deficiencies of sensors, and other factors.
Consequently, the quality of maps produced by remote sensing needs to be assessed [1]. Classification accuracy refers to the extent to which the classified image or map corresponds with the description of a class at the earth surface. This is commonly described by an error matrix, from which the overall accuracy and the accuracy of the individual classes can be calculated. The κ-statistic may then be used for testing for homogeneity [2]. Its use is somewhat controversial, because its value depends strongly on the marginal distributions [3]. Indeed, it measures the strength of agreement without taking into account the strength of disagreement.

Manuscript received July 22, 2004; revised August 30, 2004.
A. Stein is with the International Institute for Geo-Information Science and Earth Observation (ITC), 7500 AA Enschede, The Netherlands (e-mail: stein@itc.nl).
J. Aryal is with His Majesty’s Government of Nepal, Ministry of Land Reform and Management, Kathmandu, Nepal (e-mail: aryal@itc.nl).
G. Gort is with Biometris, Wageningen University, 6700 AC, Wageningen, The Netherlands.
Digital Object Identifier 10.1109/TGRS.2005.843569

As an alternative, we focus in this study on pairwise comparisons, dealing with the structure of agreement and disagreement between categories. An extension to the κ-statistic is the Bradley–Terry (BT) model [4]. Based on the logistic regression model, it may provide more detail on the strength and direction of disagreement. It has been used in the past, but as far as we are aware never for remote sensing images.

The aim of the research described in this paper is to study the use of the BT model as an alternative to the κ-statistic for measuring association in remotely sensed images. To do so, we implemented and interpreted a test of significance of the BT model for paired preferences in a supervised maximum-likelihood classification of classes from an Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) image from 2000.
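To make the quantities discussed above concrete, the following minimal sketch computes the overall accuracy and the κ-statistic from an error matrix, using the standard definition of Cohen's κ (observed agreement corrected for chance agreement derived from the marginals). The function names and the 3-class matrix are hypothetical and chosen for illustration only; they are not taken from the study.

```python
def overall_accuracy(error_matrix):
    """Share of pixels on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in error_matrix)
    correct = sum(error_matrix[i][i] for i in range(len(error_matrix)))
    return correct / total


def kappa_statistic(error_matrix):
    """Cohen's kappa: (p_observed - p_chance) / (1 - p_chance)."""
    k = len(error_matrix)
    total = sum(sum(row) for row in error_matrix)
    p_observed = overall_accuracy(error_matrix)
    # Marginal shares: rows = classified data, columns = reference data.
    row_share = [sum(error_matrix[i]) / total for i in range(k)]
    col_share = [sum(error_matrix[i][j] for i in range(k)) / total
                 for j in range(k)]
    # Agreement expected by chance if rows and columns were independent.
    p_chance = sum(r * c for r, c in zip(row_share, col_share))
    return (p_observed - p_chance) / (1 - p_chance)


def transpose(matrix):
    """Swap the roles of classified and reference data."""
    return [list(column) for column in zip(*matrix)]


# Hypothetical 3-class error matrix (illustrative counts, not study data).
example = [
    [50, 5, 5],
    [10, 40, 10],
    [0, 5, 25],
]

print(overall_accuracy(example))
print(kappa_statistic(example))
print(kappa_statistic(transpose(example)))
```

In this sketch, transposing the matrix (so that every "class i labeled as class j" count becomes "class j labeled as class i") leaves κ unchanged, which illustrates why κ alone cannot convey the direction of disagreement.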
The BT model is fitted to the error matrix, and a test of significance for the parameter estimate of every class is carried out.

II. CONCEPTS AND METHODS

A common aim in classifying a multispectral image is to automatically categorize all pixels in the image into classes [5]. Each pixel in the image is assigned in a Boolean fashion to one class. As a result, a thematic layer emerges from the multispectral image. Digital supervised image classification depends on the choice of land cover or land use types, based on the study area. A classification algorithm gives a classified thematic layer after performing the automated classification. Pixel-by-pixel spectral information is used as the basis for automated land cover classification. The thematic layer is assessed next in terms of accuracy by relating the thematic characteristic of a scene to identified objects [6]. Representative samples are selected for each class from different locations of the image. The amount of training data usually represents less than 1% to 5% of the pixels [7].

Several classification algorithms exist [8]. Accuracy assessment of classification is usually carried out by evaluating error matrices. An error matrix is a square matrix, having an equal number of rows and columns. The columns contain the reference data, and the rows contain the classified data [2], [7]. On the basis of the error matrix, overall classification accuracy, producer’s accuracy, and user’s accuracy are calculated. In addition, the κ-statistic for each individual category within the matrix can be calculated. In all these cases, the accuracy of any individual pixel is associated in a strictly Boolean fashion, i.e., the classification is either right or wrong.

A. Bradley–Terry Model

The Bradley–Terry (BT) model makes a pairwise comparison among individuals, classes, or categories. It focuses