Chapter 3
From Dirty Data to Credible Scientific Evidence: Some Practices Used to Clean Data in Large Randomised Clinical Trials¹
Claes-Fredrik Helgesson

Clean Data, Dirty Data and Data Cleaning

There are cleaned data, but the cleaned set is not complete yet. The cleaning is under way. They have also to call sites and monitors. Stefan is, for instance, cleaning access databases.
At data-management, excerpt from field-notes [055:001].

The excerpt is from a visit to a company specialised in data management services for large clinical trials. The company specialises in gathering, preparing and analysing data about patients participating in clinical trials, and regularly performs these tasks for pharmaceutical companies. What piques my interest here is the term data cleaning, and the metaphors of dirty and clean data that come with it. In a handbook of clinical trials, dirty data is, for instance, defined as ' ... a collection of data that has not been cleaned, checked or edited, and may therefore contain errors and omissions. See Data cleaning.' (Earl-Slater, 2002).

Randomised clinical trials (RCTs) are often described as the gold standard for gaining scientific evidence about drug therapies. Given this strong position, it seems pertinent to take a closer look at the practices involved in solidifying their results. This chapter contributes to such an endeavour by focusing on how data is corrected and verified in large RCTs. Drawing on participant observation

¹ The research for this project, 'Market and evidence', was supported by a research grant from The Bank of Sweden Tercentenary Foundation. Earlier versions have been presented at the workshop on evidence-based practice organised by Ingemar Bohlin and Morten Sager at the University of Gothenburg, 19-20 May 2008; at 4S/EASST, Rotterdam, 21-23 August 2008; and at EGOS, Amsterdam, 10-12 July 2008. This chapter has benefited from comments by a number of people who have read and commented on different earlier versions.
In addition to the participants at the P6 seminar at Technology and Social Change, Linköping University, I would in particular like to mention Ingemar Bohlin, Ben Heaven, Ericka Johnson, Tiago Moreira, Morten Sager, Catherine Will, and Steve Woolgar.