Unlinked vital events in census-based longitudinal studies can bias subsequent analysis

Dermot O’Reilly*, Michael Rosato, Sheelah Connolly

Department of Epidemiology and Public Health, Queen’s University, Mulhouse Building, Royal Victoria Hospital, Grosvenor Road, Belfast BT12 6BJ, Northern Ireland, UK.

Accepted 25 May 2007

Abstract

Objective: To examine the potential biases arising from the nonlinkage of census records and vital events in longitudinal studies.

Study Design and Setting: A total of 56,396 deaths of residents of Northern Ireland in the 4 years after the 2001 Census were linked to the 2001 Census records. The characteristics of matched and nonmatched death records were compared using multivariate logistic regression. Subject attributes were as recorded on the death certificate.

Results: In total, 3,392 (6.0%) deaths could not be linked to a census record. Linkage rates were lowest in young adults, males, the unmarried, people living in communal establishments, or living in areas that were more deprived or had recorded low census enumeration. For those aged less than 65 years at census, this linkage would exclude from analysis 20.2% of suicides and 19.7% of deaths by external causes.

Conclusion: The nonlinkage of census and death records is a combination of nonenumeration at census and deficient information about the deceased recorded at the time of death. Unmatched individuals may have been more disadvantaged or socially isolated, and analysis based on the linked data set may therefore show some bias and perhaps understate true social gradients.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Nonlinkage bias; Census non-enumeration; Mortality; Longitudinal studies; Study design; Northern Ireland

1. Introduction

Longitudinal studies represent something of a gold standard in epidemiological research.
Although often difficult to set up and maintain, they eventually provide a potent mechanism for examining the relationship between exposure status and a range of outcomes. Increasingly, longitudinal studies based on enumerated census populations are becoming the norm. Such studies are particularly efficient as the data, routinely collected government vital statistics, involve no additional responder burden, and nonresponder bias, which plagues longitudinal studies, is not a problem. One such successful study is the United Kingdom (UK) Office for National Statistics Longitudinal Study [1,2]. This is an ongoing decennial linkage of all censuses from 1971 to 2001 for a representative 1% sample of the England and Wales population, coupled with a yearly linkage of routine vital events such as deaths, births, cancer registrations, and migration. All other countries within the UK are currently preparing similar studies: Scotland, linking the 1991 and 2001 Censuses [3]; and Northern Ireland, starting with the 2001 Census [4]. Additionally, the Swiss National Cohort has linked the 1990 and 2000 Censuses and included mortality data from 1991 to 2005 [5]. Since 1981, New Zealand has linked mortality for the 3 years after each of its quinquennial censuses [6]. Finally, the US National Longitudinal Mortality Study incorporates data from the 1980 Census [7].

The quality of these census-event studies depends on the completeness of the linkage between the census and subsequent events. Even if the initial cohort is representative, systematic problems with linkage may result in findings that are not. Although linkage rates close to 100% are routinely achieved in countries with successful population registration systems (e.g., Finland, Sweden, or Denmark) [8–10], this is not usually possible in countries such as the UK, which have no such systems in place.

* Corresponding author. Tel.: +44-02890-632746; fax: +44-02890-231907.
E-mail address: d.oreilly@qub.ac.uk (D. O’Reilly).
0895-4356/08/$ – see front matter © 2008 Elsevier Inc. All rights reserved.
doi: 10.1016/j.jclinepi.2007.05.012
Journal of Clinical Epidemiology 61 (2008) 380–385

Crucially, because complete information from the data sets to be combined is generally not available, it is not possible to test the completeness of any census-event linkage [11,12]. This can only be estimated, usually by comparing the actual number of events linked to the data against the expected number in the population, given that the longitudinal study