RESEARCH Open Access

Reproducible bioinformatics project: a community for reproducible bioinformatics analysis pipelines

Neha Kulkarni 1, Luca Alessandrì 1, Riccardo Panero 1, Maddalena Arigoni 1, Martina Olivero 2, Giulio Ferrero 3, Francesca Cordero 3*, Marco Beccuti 3† and Raffaele A. Calogero 1*†

From Italian Society of Bioinformatics (BITS): Annual Meeting 2017, Cagliari, Italy. 05-07 July 2017

Abstract

Background: Reproducibility of research is a key element of modern science and is mandatory for any industrial application. It represents the ability to replicate an experiment independently of the location and the operator. Therefore, a study can be considered reproducible only if all the data used are available and the computational analysis workflow is clearly described. However, to reproduce a complex bioinformatics analysis today, the raw data and the list of tools used in the workflow may not be enough to guarantee the reproducibility of the results. Indeed, different releases of the same tools and/or of the system libraries used by those tools can lead to subtle reproducibility issues.

Results: To address this challenge, we established the Reproducible Bioinformatics Project (RBP), a non-profit, open-source project whose aim is to provide a schema and an infrastructure, based on Docker images and an R package, for producing reproducible results in bioinformatics. One or more Docker images are defined for a workflow (typically one for each task), while the workflow implementation is handled via R functions embedded in a package available in a GitHub repository. Thus, a bioinformatician participating in the project first has to integrate her/his workflow modules into Docker image(s), using an Ubuntu Docker image developed ad hoc by RBP to make this task easier.
Secondly, the workflow implementation must be written in R according to an R skeleton function made available by RBP to guarantee homogeneity and reusability among different RBP functions. Moreover, she/he has to provide an R vignette explaining the package functionality, together with an example dataset that can be used to improve the user's confidence in using the workflow.

Conclusions: The Reproducible Bioinformatics Project provides a general schema and an infrastructure to distribute robust and reproducible workflows. Thus, it guarantees that final users can consistently repeat any analysis independently of the UNIX-like architecture used.

Keywords: Reproducible research, Docker, Whole transcriptome sequencing, microRNA sequencing, Chromatin immunoprecipitation sequencing, Community, Single nucleotide variants

* Correspondence: francesca.cordero@unito.it; raffaele.calogero@unito.it
† Marco Beccuti and Raffaele A. Calogero contributed equally to this work.
3 Department of Computer Sciences, University of Torino, Torino, Italy
1 Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino, Italy
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Kulkarni et al. BMC Bioinformatics 2018, 19(Suppl 10):349
https://doi.org/10.1186/s12859-018-2296-x