METHODS

The effect of non-random loss to follow-up on group mean estimates in a longitudinal study

Ludovic G.P.M. van Amelsvoort 1, Anna J.H.M. Beurskens 1,2, IJmert Kant 1 & Gerard M.H. Swaen 1

1 Department of Epidemiology, Maastricht University, Maastricht, The Netherlands; 2 Hogeschool Zuyd, Department of Physiotherapy, Heerlen, The Netherlands

Accepted in revised form 29 July 2003

Abstract. Bias due to selective non-response is often neglected in large-scale epidemiological studies. Although some recent techniques enable adjustment for selective non-response, these are rarely applied. The Maastricht Cohort Study, a study on fatigue at work among 12140 respondents at baseline, enabled us to estimate the degree of bias in a real-life data set. After seven subsequent measurements, spanning a 2-year period, 8070 respondents remained in the cohort. Two traditional ways of presenting longitudinal mean levels (means using all data, and means using only complete cases) are compared with adjusted mean levels estimated using mixed models. The differences between the complete case and overall mean levels on the one hand and the adjusted means on the other were about 2% for the continuous fatigue score and 6% for the proportion of fatigued cases. For the company mean scores, the observed bias due to selective non-response might be as much as 30% for some of the company means for the continuous fatigue score and up to 160% for the estimated number of fatigued cases. We therefore conclude that bias due to selective non-response needs serious attention. In addition to making vigorous attempts to minimize longitudinal non-response, the use of statistical adjustment is recommended.

Keywords: Cohort study, Data analysis, Loss to follow up

Introduction

The effect of selective non-response on parameter estimates in longitudinal studies is usually not mentioned in the description of epidemiological studies and certainly not routinely investigated.
It may be that most epidemiologists assume that the bias due to selective drop out is small as long as the drop out remains within reasonable limits. Most of the studies describing bias due to longitudinal non-response have been clinical trials [1, 2]. It is, however, of interest to know whether, and to what extent, this bias also occurs in observational studies.

Traditionally there are two ways to compare the mean levels between different time points: (1) comparing the average levels using all available data per time point (all data mean); (2) comparing the average levels using only data from individuals with a complete longitudinal record (complete case mean). If, for example, individuals with a high baseline level of the observed parameter were more likely to drop out, one would expect method 1 to underestimate this parameter on subsequent measurements and thus to show a lower change over time. Method 2 would underestimate the parameter under study on all measurements, but has the advantage that the time trend will be estimated more accurately.

Over the last decades two new methods have been developed: mixed model and GEE estimates, as described by Diggle et al. [3]. These have the advantage of using all data and lead to unbiased estimates as long as the process leading to missing data meets the criteria for missing completely at random (MCAR) or, for mixed model estimates, even missing at random (MAR). Under the MCAR mechanism, as defined by Rubin [4] and Little and Wang [5], missing data are not related to the values of any of the variables, whether missing or observed. Under MAR, as defined by the same authors, given two variables, the probability of non-response depends on only one of the two variables (the observed one), not on the other (the one subject to non-response). According to this definition, drop out may depend on the level of the preceding measurement.
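The expected behaviour of methods 1 and 2 described above can be illustrated with a small simulation. The following Python sketch is not part of the original analysis and does not fit a mixed model; it assumes a hypothetical two-wave cohort in which drop out depends only on the observed baseline score. The function name `simulate_dropout`, the score distribution, the threshold of 60 and the drop-out probabilities are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_dropout(n=50_000, seed=0):
    """Simulate a two-wave cohort in which high baseline scorers drop out more often."""
    rng = np.random.default_rng(seed)

    # Hypothetical random-intercept model: stable person level + time trend + noise.
    person_level = rng.normal(50.0, 10.0, size=n)
    true_change = 2.0  # assumed true mean change between the two waves
    baseline = person_level + rng.normal(0.0, 5.0, size=n)
    followup = person_level + true_change + rng.normal(0.0, 5.0, size=n)

    # Drop out depends only on the observed baseline value (a MAR mechanism):
    # 50% drop out above a score of 60, 10% otherwise (assumed values).
    p_drop = np.where(baseline > 60.0, 0.5, 0.1)
    retained = rng.random(n) >= p_drop

    return {
        # Means that would be observed without any drop out.
        "true": (baseline.mean(), followup.mean()),
        # Method 1: all available data per time point.
        "all_data": (baseline.mean(), followup[retained].mean()),
        # Method 2: only individuals with a complete record.
        "complete_case": (baseline[retained].mean(), followup[retained].mean()),
    }


results = simulate_dropout()
for name, (m1, m2) in results.items():
    print(f"{name:>13}: baseline {m1:.2f}, follow-up {m2:.2f}, change {m2 - m1:+.2f}")
```

Under these assumed settings, the all data mean at follow-up and the complete case mean at baseline both fall below the corresponding full-sample means, while the change over time estimated from complete cases lies closer to the true change than the change estimated from all available data.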
The MAR condition thus stands in contrast to the MCAR condition, under which the missing values must indeed be missing completely at random. Most studies we encountered assumed the impact of bias due to selective non-response to be small.

European Journal of Epidemiology 19: 15–23, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Non-response in longitudinal studies cannot be avoided, and a number of methods have been described to decrease longitudinal drop out [6]. As long as the longitudinal drop out is completely at random (missing completely at random, MCAR), differential bias that may affect the study results is not