Modeling Spatial Variation in Leukemia Survival Data
Robin Henderson, Silvia Shimakura, and David Gorst
In this article we combine ideas from spatial statistics with lifetime data analysis techniques to investigate possible spatial variation in
survival of adult acute myeloid leukemia patients in northwest England. Exploratory analysis suggests both clinically and statistically
significant variation in survival rates across the region. A multivariate gamma frailty model incorporating spatial dependence is proposed
and applied, with results confirming the dependence of hazard on location.
KEY WORDS: Cancer; Frailty; Geostatistics; Hierarchic model; Latent process; Semiparametric model.
1. INTRODUCTION
Although leukemia survival rates continue to improve as
more effective therapies are introduced, considerable between-
patient heterogeneity remains conditional on treatment and
known prognostic factors (see, e.g., Cassileth et al. 1992;
Estey, Shen, and Thall 2001; Schoch et al. 2001). In this arti-
cle we investigate whether at least part of this heterogeneity
might be linked to spatial effects, using data maintained by the
North West Leukemia Register in the United Kingdom. This
is a high-quality database that holds records of incidence and
subsequent survival status of all leukemia cases in northwest
England. In a previous informal study, Gorst (1995) suggested
that there could be district-to-district variation in survival rates
above and beyond what might be expected to occur by chance
alone. Such a finding, if substantiated, would be of considerable
interest. It could be due to patient management differences
between treatment centers, which could have an important
influence on future clinical practice, or due to background
variation in population or environmental characteristics,
necessitating further epidemiologic study.
We investigate whether the survival distribution for acute
myeloid leukemia (AML) in adults is homogeneous across the
region after allowing for known risk factors. We use regis-
ter data on the 1,043 cases recorded between 1982 and 1998.
AML represents the biggest single category of adult leukemia
in the register. Figure 1 shows residential locations of the
AML cases in the study period, together with the 24 admin-
istrative districts that make up the region. The boxed area is
100 km × 120 km, and the numbers are district identifiers,
used for later reference. Apparent clustering is of course due
in large part to the population distribution. In this work we
do not discuss the detection and modeling of spatial variation
in disease incidence, for which there are now well-established
methods (Elliott, Wakefield, Best, and Briggs 2000). Instead,
we concentrate on subsequent survival by extending standard
survival models to the spatial setting. The simple choropleth
map in Figure 2 of estimated relative risks between districts,
explained more fully later, suggests substantial variability
between districts and also some apparent clustering of districts
with similar risks. There seems to be a region of high risk
running from northeast to southwest, with a low-risk region to the
west. To investigate, we adopt a multivariate frailty approach
that incorporates the effects of known covariates, individual
heterogeneity, and spatial traits. Our ultimate goal is to model
possible residual spatial variation in survival after accounting
for known subject-specific prognostic factors and unobserved
individual heterogeneity.
Robin Henderson is Reader, Department of Mathematics and Statistics,
Lancaster University, Lancaster LA1 4YF, U.K. (E-mail: Robin.Henderson@
lancaster.ac.uk). Silvia Shimakura is Lecturer, Departamento de Estatística,
Universidade Federal do Paraná, Caixa Postal 19081, 81531-990, Curitiba,
PR, Brazil (E-mail: Silvia.Shimakura@est.ufpr.br). David Gorst is Consultant
Haematologist, Department of Haematology, Royal Lancaster Infirmary,
Lancaster LA1 4RP, U.K. This work is supported in part by the CAPES
Foundation, Brazil. The authors thank the editor and three reviewers for constructive
comment on earlier versions of this article. Silvia Shimakura was sponsored
by CAPES grant BEX1139/96-7.
The article is organized as follows. In Section 2 we summa-
rize an initial survival analysis of the AML data using stan-
dard univariate methods based on a Cox model with and with-
out frailty. In Section 3 we investigate possible variation in
survival across the region after allowing for covariate effects,
using a lattice structure based on the 24 districts. We use a
Bayesian hierarchic multivariate gamma model and Markov
Chain Monte Carlo (MCMC) methodology for estimation, and
the deviance information criterion (DIC) (Spiegelhalter et al.
2002) to compare competing models. In Section 4 we take
an alternative approach, using the exact locations of the sub-
jects’ residences rather than knowledge only of their district,
using an additive gamma frailty model that allows a propor-
tion of the total frailty to be explained by a spatially varying
component. We provide closing remarks and conclusions in
Section 5.
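As a brief illustration of the model comparison criterion mentioned above, DIC is commonly computed from MCMC output as the posterior mean deviance plus an effective number of parameters, the latter being the posterior mean deviance minus the deviance at the posterior mean of the parameters. The following is a minimal sketch under that standard definition; the function name and the toy deviance values are hypothetical and are not taken from the AML analysis.

```python
import numpy as np


def deviance_information_criterion(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, pD) from MCMC output (hypothetical helper).

    deviance_draws: deviance, -2 * log-likelihood, at each retained MCMC draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior mean
        of the parameters.
    """
    mean_deviance = float(np.mean(deviance_draws))      # posterior mean deviance
    p_d = mean_deviance - deviance_at_posterior_mean    # effective number of parameters
    dic = mean_deviance + p_d                           # same as D(theta_bar) + 2 * pD
    return dic, p_d


# Toy values for illustration only (not results from the study).
draws = np.array([2010.0, 2014.0, 2012.0, 2016.0])
dic, p_d = deviance_information_criterion(draws, 2009.0)
```

When competing models are fitted to the same data, the one with the smaller DIC is generally preferred.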
2. INITIAL SURVIVAL ANALYSIS
To set the scene, we begin with a standard survival analysis
ignoring any spatial variation across the region. Data consist
of observation times $t$, death/censoring indicators $\delta$, and
covariates $x$ for 1,043 patients. Median survival time was just
over 6 months, though some patients survived for more than
13 years. Some 16% of cases were censored. Complete information
is available for four covariates: age; sex (0 = F, 1 = M);
white blood cell count (WBC) at diagnosis, truncated at 500
units with 1 unit = $50 \times 10^{9}$/L; and a measure of deprivation
for the enumeration district of residence. For this, we use the
Townsend score, which is a quantitative measure with a range
of $-7$ to 10 in the AML data, higher values indicating less
affluent areas (Townsend, Phillimore, and Beattie 1988). The
Townsend score is available for each of the 8,131 enumeration
© 2002 American Statistical Association
Journal of the American Statistical Association
December 2002, Vol. 97, No. 460, Application and Case Studies
DOI 10.1198/016214502388618753