Handling missing data in Patient Reported Outcome: an application to the Barthel Index on elderly people with fractures

Urko Aguirre, José M Quintana, Susana García, Carlota Las-Hayas, Miren Orive, Nerea González, Gemma Navarro

Abstract

Introduction

Medical research studies based on follow-up measurements are often subject to high rates of missingness. This nonresponse may be due to dropout before the end of the study, or to unavailability at one or more measurement occasions. In addition, when the information comes from surveys, patients often decline to respond or are unable to answer several questions. Consequently, outcome measures and relevant covariates collected in the study can reach substantial missingness rates (up to 40%), which leads to a Complete Case (CC) analysis, where only observations without missing values are considered. As a result, the beta estimates and p-values obtained for these exposure factors could be biased [1]. This is a common problem in studies dealing with Patient Reported Outcome (PRO) tools.

For this reason, many researchers use imputation methods, ranging from simple ones (mean imputation, regression imputation, K-Nearest Neighbours (KNN)) to more advanced ones, to avoid losing valuable information from those who are unable or unwilling to answer the questions and to avoid drawing misleading inferences about changes in the mean response over time. The most popular approaches for handling incomplete data are K-Nearest Neighbours [1] and Multiple Imputation techniques using Markov Chain Monte Carlo (MCMC) [2].

Given a pattern with missing values, and according to a distance metric, K-Nearest Neighbours selects the K closest observations with known values in the variables to be imputed. The replacement value is computed depending on the type of data: the mode can be used for qualitative data, and the mean for continuous data.
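The K-Nearest Neighbour step described above can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes that donors are restricted to complete cases and that the distance is Euclidean over the columns observed in the incomplete row. The function name `knn_impute` and the `categorical` parameter are hypothetical names introduced for this example.

```python
import numpy as np

def knn_impute(X, k=10, categorical=()):
    """Fill NaNs in X from the k nearest complete rows (illustrative sketch).

    Assumptions (not from the source): donors are complete cases only, and
    columns listed in `categorical` hold numeric codes for qualitative data.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    donors = X[~np.isnan(X).any(axis=1)]  # rows without any missing value
    if len(donors) == 0:
        raise ValueError("no complete rows available as donors")
    for i, row in enumerate(X):
        missing = np.isnan(row)
        if not missing.any():
            continue
        observed = ~missing
        # Euclidean distance computed on the columns observed in this row
        diff = donors[:, observed] - row[observed]
        dist = np.sqrt((diff ** 2).sum(axis=1))
        nearest = donors[np.argsort(dist, kind="stable")[:k]]
        for j in np.flatnonzero(missing):
            if j in categorical:
                # qualitative variable: most frequent value among neighbours
                values, counts = np.unique(nearest[:, j], return_counts=True)
                filled[i, j] = values[np.argmax(counts)]
            else:
                # continuous variable: mean of the neighbours
                filled[i, j] = nearest[:, j].mean()
    return filled
```

In practice the variables are usually put on a comparable scale before distances are computed, so that no single variable dominates the neighbour selection; that step is omitted here for brevity.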
The Euclidean distance is the most commonly used metric. The MCMC method, on the other hand, uses a single chain to produce the imputations. The posterior mode computed from the Expectation-Maximization (EM) algorithm is used as the starting value for the MCMC process. Selecting an appropriate imputation method is itself difficult. Literature on the benefits of each method exists, but it is not conclusive.

Methodology

Study population

This is a prospective study of people older than 65 years who attended the emergency room (ER) of 7 acute hospitals after suffering a sudden fall and fracturing their wrist or hip. At the time of the fall they were asked, retrospectively, about their previous ability to care for themselves by means of the Barthel Index (BI) [3], and they were asked again 6 months after the wrist or hip fracture. This index consists of 10 items, each with a response scale of two to three options, and it is summarized into a total score ranging from 0 to 100. The higher the BI, the more functionally independent the patient.

Statistical analysis

We applied the MCMC and K-NN algorithms and compared them to the CC method. For the K-NN method, we selected K = 10 and the Euclidean distance. The Barthel Index was measured before and after the joint fracture. The change, measured as the difference between the post- and pre-fracture scores, was the outcome of interest,