Research Article
Received 4 June 2012, Accepted 5 December 2013, Published online 13 January 2014 in Wiley Online Library
(wileyonlinelibrary.com) DOI: 10.1002/sim.6084
A maximum likelihood method for secondary analysis of nested case-control data
Agus Salim (a, *, †), Ma Xiangmei (b), Li Jialiang (c) and Marie Reilly (d)
Many epidemiological studies use a nested case-control (NCC) design to reduce cost while maintaining study
power. Because NCC sampling is conditional on the primary outcome, routine application of logistic regression
to analyze a secondary outcome will generally produce biased estimates. Several methods have recently been proposed
to obtain unbiased estimates of risk for a secondary outcome from NCC data. All current methods share two
requirements: that the times of onset of the secondary outcome are known for cohort members
not selected into the NCC study, and that the hazards of the two outcomes are conditionally independent given the
available covariates. This last assumption will not be plausible when the individual frailty of study subjects
is not captured by the measured covariates. We provide a maximum-likelihood method that explicitly models
the individual frailties and also avoids the need to have access to the full cohort data. We derive the likelihood
contribution by respecting the original sampling procedure with respect to the primary outcome. We use proportional
hazards models for the individual hazards and Clayton’s copula to model additional dependence
between primary and secondary outcomes beyond that explained by the measured risk factors. We show that the
proposed method is more efficient than weighted likelihood and is unbiased in the presence of shared frailty for
the primary and secondary outcomes. We illustrate the method with an application to a study of risk factors for
diabetes in a Swedish cohort. Copyright © 2014 John Wiley & Sons, Ltd.
Keywords: biobank; case-cohort; frailty models; historical controls; registry
1. Introduction
The nested case-control (NCC) study design offers impressive cost and time savings compared with a
full cohort study, while retaining most of the study power. Moreover, valid estimates of odds ratios
can easily be obtained using conditional logistic regression. However, in addition to the primary
outcome for which the study was designed, researchers will often be interested in analyzing a secondary
outcome. For example, a variable collected as a predictor of the primary outcome may subsequently be
interesting as an outcome. Because the sampling was conditional on the primary outcome (and perhaps
also on one or more matching variables), routine application of ordinary or conditional logistic regression
for the secondary analysis will generally produce biased estimates of odds ratios. Until recently,
this inability to analyze a secondary outcome has been considered a drawback of the NCC design when
compared with other cost-saving designs such as the case-cohort design, where the sub-cohort is selected without
regard to any particular outcome, and thus analysis of multiple outcomes does not require new controls
to be selected [1]. Instead, the same sub-cohort is used, and parameters of interest can be estimated using
the weighted likelihood (WL) method [2]. Recently, several articles have proposed methodology for analyzing
a secondary outcome using data collected in NCC studies [3–5]. Salim et al. [4] proposed a WL
a Department of Mathematics and Statistics, La Trobe University, Bundoora VIC 3086, Australia
b Saw Swee Hock School of Public Health, National University of Singapore, 21 Lower Kent Ridge Road, Singapore
c Department of Applied Probability and Statistics, National University of Singapore, 21 Lower Kent Ridge Road, Singapore
d Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
* Correspondence to: Agus Salim, Department of Mathematics and Statistics, La Trobe University, Bundoora VIC 3086, Australia.
† E-mail: a.salim@latrobe.edu.au
Copyright © 2014 John Wiley & Sons, Ltd. Statist. Med. 2014, 33 1842–1852