Introducing an unbiased software procedure for image checking in a large research institution

Enrico M. Bucci 1,2, Giorgia Adamo 3, Alessandro Frandi 3, Cinzia Caporale 3

1. Sbarro Health Research Organization, c/o Temple University, 1900 N 12th St - 19122 Philadelphia, PA - US
2. Resis Srl - Via Ivrea, 8/1 - 10010 Samone (TO) - Italy
3. Institute of Biomedical Technologies, National Research Council of Italy, Rome, Italy

The status of relative objectivity attributed to photographic documents was severely challenged in the transition from classical photography to digital imaging, because the same software used for producing and analysing digital images was, from very early on, also used to retouch the images to be published. While this can be acceptable in principle (for example, intensity calibration of a digital image can be required for a quantitative analysis), it is also true that image manipulation aiming to deceive the readers of a scientific paper has unfortunately become extremely easy.

Photographic retouching, once difficult, is today technically available to anyone; thus, an easy prediction would be that illicit manipulation of scientific images should be highly prevalent. In particular, once the original obstacle (i.e. technical feasibility) has been lifted, certain conditions would lead to a higher number of misconduct cases connected to image manipulation, namely: 1) the manipulation confers some strong advantage on the person committing it; 2) the probability of being discovered is low; 3) even after an actual fraud is discovered, the consequences for the offender are mild, if any.

Indirect evidence for the hypothesis that fraudulent image manipulations are indeed increasingly common comes from the U.S. Office of Research Integrity (ORI) database. In fact, since the introduction of Photoshop in 1988, the number of ORI cases with questioned images has been growing exponentially.
1 However, the image manipulations that surfaced in ORI cases by definition originate from a tiny selection of research groups (only cases involving US federal funding are reported to and considered by ORI) and, even for the population considered, ORI cases are suspected to be only the tip of the iceberg. 2

In recognition of this problem, we decided to measure the actual extent of suspect image manipulation in the biomedical literature by performing an unbiased, automated analysis of a large image sample obtained from recent scientific publications, supplemented by expert analysis to verify the findings. To this aim, we combined in-house software with available open-source and commercial tools to obtain an efficient pipeline for extracting and processing images from the scientific literature in bulk. Briefly, we proceeded through the following steps, using the specified tools:

a) using in-house software able to process the different PDF formats efficiently, we first performed an image-extraction step, extracting all figures from each article and then breaking the figures down into single panels representing the results of specific experiments;
b) using a commercially available tool, we identified duplicated panels, both within the same paper and among the papers included in the sample;