doi: 10.1111/j.1467-9469.2006.00541.x
Board of the Foundation of the Scandinavian Journal of Statistics 2006. Published by Blackwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA
Vol 34: 419–431, 2007

Improving the Efficiency of the Nelson–Aalen Estimator: the Naive Local Constant Estimator

MONTSERRAT GUILLEN
Department of Econometrics, University of Barcelona

JENS P. NIELSEN
Festina Lente and University of Copenhagen

ANA M. PEREZ-MARIN
Department of Econometrics, University of Barcelona

ABSTRACT. The Nelson–Aalen estimator is well known to be an asymptotically efficient estimator of the cumulative hazard function, see Andersen et al. (Statistical models based on counting processes, Springer-Verlag, New York, 1993) among many others. In this paper, we show that the efficiency of the Nelson–Aalen estimator can be considerably improved by using more information in the estimation process than the traditional Nelson–Aalen estimator uses. While our approach results in a biased estimator, the variance improvement is substantial. By optimizing the balance between the bias loss and the variance improvement, we obtain results on the efficiency gain. Several examples for known failure time distributions are used to illustrate these ideas.

Key words: cumulative hazard function, efficiency, Nelson–Aalen estimator, survival analysis

1. Introduction

There is a comprehensive knowledge of the main statistical properties of the basic nonparametric estimators of the survival function, the hazard rate, the density function and the distribution function. Azzalini (1981), Reiss (1981) and Falk (1983) can be mentioned as examples of contributions to the knowledge of the theoretical properties of the kernel distribution function estimator introduced by Nadaraya (1964). Bandwidth selection is essential in nonparametric estimation.
A number of methods have been proposed for selecting the smoothing parameter in kernel density estimation (see, e.g. Rudemo, 1982; Bowman, 1984). Wand & Jones (1995, Ch. 3) give a thorough discussion of those methods. Sarda (1993) and Altman & Leger (1995) studied bandwidth selection for estimating distribution functions. Falk (1983) also gave relative efficiencies of kernel type estimators of distribution functions.

Bowman et al. (1998) discussed the performance of optimal, data-based methods of bandwidth choices for distribution functions leading to results which do not have analogues in the context of density estimation. In the discussion they pointed out that ‘care should be taken in cases where the largest survival times are censored. A further issue arises from the fact that survival times are usually greater than zero. This “edge effect” will also require special attention’.

In this paper, we will use counting process theory for nonparametric estimation of the cumulative hazard rate function of a duration variable in the context of censored data. Moreover, the behaviour of durations near zero (edge effect) is studied in detail. Both problems were mentioned in Bowman et al. (1998) but have not been addressed before.

The Nelson–Aalen estimator, devised by Nelson (1969, 1972) and Aalen (1978), has proved useful for a number of applications in fields including actuarial science, biostatistics, finance