Future Generation Computer Systems 23 (2007) 382–390
www.elsevier.com/locate/fgcs

GEMMA - A Grid environment for microarray management and analysis in bone marrow stem cells experiments

Francesco Beltrame, Adam Papadimitropoulos, Ivan Porro, Silvia Scaglione, Andrea Schenone, Livia Torterolo, Federica Viti

Department of Communication, Computer and System Sciences, University of Genoa, Italy

Received 31 January 2006; received in revised form 21 April 2006; accepted 2 July 2006
Available online 30 August 2006

Abstract

Microarray techniques are successfully used to profile the expression of thousands of genes in a variety of genomic analyses such as gene identification, drug discovery and clinical diagnosis, providing a large amount of genomic data for the overall research community. A Grid based Environment for distributed Microarray data Management and Analysis (GEMMA) is being built. This platform is planned to provide shared, standardized and reliable tools for managing and analyzing biological data related to bone marrow stem cell cultures, in order to maximize the results of distributed experiments. Different microarray analysis algorithms may be offered to the end-user through a web interface. A set of modular and independent applications may be published on the portal, and either single algorithms or a combination of them might be invoked by the user through a workflow strategy. Services may be implemented within an existing Grid computing infrastructure to solve problems concerning both the storage of large datasets (a data-intensive problem) and long computation times (a computing-intensive problem). Moreover, experimental data annotation may be collected according to the same rules and stored through the Grid portal by using a metadata schema, which allows a comprehensive and replicable sharing of microarray experiments among different researchers.
The environment has been tested so far with regard to the performance of Grid parallelization of a microarray-based gene expression analysis. First results show a very promising speedup ratio.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Grid; Bioinformatics; Mesenchymal stem cells; Microarray

Corresponding address: Department of Communication, Computer and System Sciences, University of Genoa, Viale Causa 13, 16145 Genova, Italy. Tel.: +39 010 3532789; fax: +39 010 353 2948. E-mail address: pivan@bio.dist.unige.it (I. Porro).

1. Introduction

In recent years, genomics and proteomics have fundamentally changed the scientific approach to the study of the molecular basis of cell and tissue behaviours, both in physiological and pathological conditions, giving a new comprehensive view to the research community. In particular, microarray techniques have been successfully used to profile the expression of thousands of genes in a variety of genomic analyses such as gene identification, drug discovery and clinical diagnosis, providing a genome-wide, system-level understanding of cellular processes and transcriptional networks. Experiments based on this technology typically generate overwhelming volumes of data, unprecedented in biological research, which may become a mine of information for the overall research community. One of the major advantages of microarray screening is that it delivers, for each biological system studied, a valuable database which can be interrogated each time new aspects of the system come under investigation. Moreover, microarrays can be used to query many genes at the same time. A single array can contain anywhere from a few hundred to tens of thousands of spots (e.g. Affymetrix’s HG U133 Plus 2.0 array is able to analyse more than 47,000 transcripts). A huge number of quantified values is then derived after hybridisation and image processing.
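As a rough illustration of the data volumes involved, the following sketch estimates the size of a dense matrix of quantified values with one row per transcript and one column per array. Only the figure of 47,000 transcripts comes from the text; the function name, the number of arrays (100) and the 8 bytes assumed per value are hypothetical choices made for this example, and raw image files are not counted.

```python
def expression_matrix_bytes(n_transcripts: int, n_arrays: int, bytes_per_value: int = 8) -> int:
    """Return the size in bytes of a dense transcripts x arrays matrix of quantified values."""
    return n_transcripts * n_arrays * bytes_per_value


# 47,000 transcripts is the figure quoted in the text;
# 100 arrays and 8-byte values are assumptions made for illustration only.
size = expression_matrix_bytes(47_000, 100)
print(f"{size / 1e6:.1f} MB")  # prints "37.6 MB"
```

The estimate grows linearly with the number of arrays, so a collection of experiments shared among several laboratories scales the storage requirement accordingly.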
However, the great potential of microarrays can be exploited only if many such experiments are performed under various biological conditions and on many individuals. The amount of data generated by this type of experiment can therefore be overwhelming but, at the same time, it may provide great opportunities.

0167-739X/$ - see front matter. doi:10.1016/j.future.2006.07.008

Techniques related to microarray experiments allow researchers to design experiments with more biological