Statistics & Probability Letters 44 (1999) 221–228
www.elsevier.nl/locate/stapro

Kernel estimators of the ROC curve are better than empirical

Chris J. Lloyd a,*, Zhou Yong b

a Australian Graduate School of Management, University of New South Wales, Kensington 2052, Australia
b Institute of Applied Mathematics, Academia Sinica, Beijing, People’s Republic of China

Received November 1998; received in revised form December 1998

Abstract

The receiver operating characteristic (ROC) is a curve used to summarise the performance of a binary decision rule. It can be expressed in terms of the distribution functions of the diagnostic measurement that underlies the rule. Lloyd (1998) has proposed estimating the ROC curve from kernel smoothing of these distribution functions and has presented asymptotic formulas for the bias and standard deviation of the resulting curve estimator. This paper compares the asymptotic accuracy of the kernel-based estimator with the fully empirical estimator. It is shown that the empirical estimator is deficient compared to the kernel estimator and that this deficiency is unbounded as sample size increases. A simulation study using both unimodal and bimodal distributions indicates that the gains in accuracy are significant for realistic sample sizes. Kernel-based ROC estimators can now be recommended. © 1999 Elsevier Science B.V. All rights reserved

MSC: primary 62G05; 60F17; secondary 62E20; 62G20

Keywords: Relative deficiency; Empirical estimator; Kernel estimator; ROC curve

1. Introduction

The receiver operating characteristic (ROC) curve is used to describe a diagnostic test which predicts presence or absence of a binary trait, often disease. It is a plot of the “true positive fraction”, i.e. the power of the test, against the “false positive fraction”, i.e. the size of the test. It is commonly supposed that the test is governed by an underlying continuous variable X such that disease is diagnosed if X > c.
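As a minimal sketch of this threshold rule (an illustration only, not the authors' code), the following Python computes the empirical true and false positive fractions for a grid of thresholds c. The function name `roc_points` and the sample measurements are hypothetical and were invented for the example.

```python
def roc_points(diseased, healthy, thresholds):
    """Return (false positive fraction, true positive fraction) pairs.

    A subject is diagnosed as diseased when its measurement X exceeds c,
    so each fraction is the share of a sample lying strictly above c.
    """
    points = []
    for c in thresholds:
        tpf = sum(x > c for x in diseased) / len(diseased)  # power of the test
        fpf = sum(x > c for x in healthy) / len(healthy)    # size of the test
        points.append((fpf, tpf))
    return points


# Hypothetical measurements, invented purely for illustration.
diseased = [2.1, 3.4, 1.8, 2.9, 3.7]
healthy = [0.5, 1.2, 2.0, 0.9, 1.6]

for fpf, tpf in roc_points(diseased, healthy, [0.0, 1.0, 2.0, 3.0, 4.0]):
    print(f"FPF = {fpf:.1f}, TPF = {tpf:.1f}")
```

Sweeping c from below the smallest measurement to above the largest traces the empirical curve from the point (1, 1) down to (0, 0).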
If the distribution function of X is $F_1$ conditional on disease and $F_0$ conditional on non-disease, then the ROC curve can be written as

$R(p) = 1 - F_1(F_0^{-1}(1 - p)), \quad 0 \le p \le 1.$

When $F_0$ is difficult to invert, the ROC curve can be displayed by plotting $1 - F_1(c)$ against $1 - F_0(c)$ for a range of threshold values c.

* Corresponding author.
0167-7152/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved
PII: S0167-7152(99)00012-7

There have been several attempts in the literature to summarise a test’s