Research Article

Received 4 June 2012, Accepted 5 December 2013, Published online 13 January 2014 in Wiley Online Library (wileyonlinelibrary.com)
DOI: 10.1002/sim.6084
Statist. Med. 2014, 33, 1842–1852

A maximum likelihood method for secondary analysis of nested case-control data

Agus Salim (a*), Ma Xiangmei (b), Li Jialiang (c) and Marie Reilly (d)

(a) Department of Mathematics and Statistics, La Trobe University, Bundoora VIC3086, Australia
(b) Saw Swee Hock School of Public Health, National University of Singapore, 21 Lower Kent Ridge Road, Singapore
(c) Department of Applied Probability and Statistics, National University of Singapore, 21 Lower Kent Ridge Road, Singapore
(d) Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
*Correspondence to: Agus Salim, Department of Mathematics and Statistics, La Trobe University, Bundoora VIC3086, Australia. E-mail: a.salim@latrobe.edu.au

Many epidemiological studies use a nested case-control (NCC) design to reduce cost while maintaining study power. Because NCC sampling is conditional on the primary outcome, routine application of logistic regression to analyze a secondary outcome will generally be biased. Recently, several methods have been proposed to obtain unbiased estimates of risk for a secondary outcome from NCC data. All current methods share two requirements: that the times of onset of the secondary outcome are known for cohort members not selected into the NCC study, and that the hazards of the two outcomes are conditionally independent given the available covariates. This last assumption will not be plausible when the individual frailty of study subjects is not captured by the measured covariates. We provide a maximum-likelihood method that explicitly models the individual frailties and also avoids the need to have access to the full cohort data. We derive the likelihood contribution by respecting the original sampling procedure with respect to the primary outcome. We use proportional hazards models for the individual hazards and Clayton's copula to model additional dependence between the primary and secondary outcomes beyond that explained by the measured risk factors. We show that the proposed method is more efficient than weighted likelihood and is unbiased in the presence of shared frailty for the primary and secondary outcomes. We illustrate the method with an application to a study of risk factors for diabetes in a Swedish cohort. Copyright © 2014 John Wiley & Sons, Ltd.

Keywords: biobank; case-cohort; frailty models; historical controls; registry

1. Introduction

The nested case-control (NCC) study design offers impressive cost and time savings compared with a full cohort study, while retaining most of the study power. Moreover, valid estimates of odds ratios can be easily obtained using conditional logistic regression. However, in addition to the primary outcome for which the study was designed, researchers will often be interested in analyzing a secondary outcome. For example, a variable collected as a predictor of the primary outcome may subsequently be interesting as an outcome. Because the sampling was conditional on the primary outcome (and perhaps also on one or more matching variables), routine application of ordinary or conditional logistic regression for the secondary analysis will generally produce biased estimates of odds ratios.

Until recently, this inability to analyze a secondary outcome has been considered a drawback of the NCC design when compared with other cost-saving designs such as the case-cohort design, where the sub-cohort is selected without regard to any particular outcome, and thus analysis of multiple outcomes does not require new controls to be selected [1]. Instead, the same sub-cohort is used, and parameters of interest can be estimated using the weighted likelihood (WL) method [2]. Recently, several articles have proposed methodology for analyzing a secondary outcome using data collected in NCC studies [3–5]. Salim et al. [4] proposed a WL