Journal of Computer Aided Chemistry, Vol. 7, 125-136 (2006) ISSN 1345-8647

Integrated Data Mining of Transcriptome and Metabolome Based on BL-SOM

Mitsuru Yano a,c, Shigehiko Kanaya b, Md. Altaf-Ul-Amin b, Ken Kurokawa b, Masami Yokota Hirai c, Kazuki Saito* a,c

a Graduate School of Pharmaceutical Science, Chiba University, Inage-ku, Chiba 263-8522, Japan
b Graduate School of Information Science, Nara Institute of Science and Technology, Nara 630-0101, Japan
c RIKEN Plant Science Center, Yokohama 230-0045, Japan

(Received July 10, 2006; Accepted August 28, 2006)

With recent advances in bioinformatics tools and analytical technology, the systems-biology approach has become a more realistic way to solve biological problems. The aim of this study is to illustrate the feasibility of the batch-learning self-organizing map (BL-SOM) for high-throughput analysis of post-genomic data in plant biology. BL-SOM is a modification of the conventional SOM that provides colored feature maps independent of the order of data input. We conducted nutritional stress experiments on the model plant Arabidopsis thaliana and applied BL-SOM to the analysis of metabolome and transcriptome data sets obtained by Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICRMS) and DNA microarray, respectively. BL-SOM analyses of non-targeted metabolome data provided metabolic fingerprints of the global responses to nutritional stress and development. Using the clustering performance of BL-SOM, we deduced metabolite identities from accurate mass values and elucidated a metabolic mechanism responding to sulfur deficiency. Transcriptome data were also analyzed by BL-SOM, and genes responding to sulfur deficiency were classified according to their expression patterns. The results showed that functionally related genes were clustered at the same or neighboring lattice points. We examined each cluster and deduced putative functions of genes involved in glucosinolate biosynthesis, and the functions of some of those genes were identified by biochemical experiments. Our study suggests that applying BL-SOM to integrated post-genomic omics data offers great potential for more accurate prediction in systems biology. The BL-SOM software is freely available at our web site (http://prime.psc.riken.jp).

Key Words: batch-learning SOM, metabolome, transcriptome, data-mining, plant science

* ksaito@faculty.chiba-u.jp

Copyright 2006 Chemical Society of Japan
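The property stated in the abstract, that the BL-SOM feature map does not depend on the order of data input, can be illustrated with a minimal batch-update SOM sketch. This is not the authors' BL-SOM software. The function name `batch_som`, the grid size, the Gaussian neighborhood, the shrinking schedule for `sigma`, and the deterministic initialization along the first two principal axes are all assumptions made for illustration. The point it shows is that each epoch assigns every sample to its best-matching lattice point first and only then updates all weight vectors at once from sums over the whole data set, so permuting the samples leaves the map unchanged up to floating-point rounding.

```python
import numpy as np


def batch_som(data, rows=5, cols=5, epochs=20, sigma_start=2.0, sigma_end=0.5):
    """Illustrative batch-update SOM (hypothetical helper, not the BL-SOM program).

    Returns (weights, bmu): weights has shape (rows*cols, n_features) and
    bmu[i] is the lattice-point index assigned to sample i.
    """
    x = np.asarray(data, dtype=float)
    mean = x.mean(axis=0)

    # Assumed deterministic initialization: spread the lattice over the plane
    # spanned by the two leading principal axes (needs >= 2 features).
    evals, evecs = np.linalg.eigh(np.cov(x, rowvar=False))
    pcs = evecs[:, [-1, -2]]                      # top two eigenvectors (copy)
    scales = np.sqrt(np.maximum(evals[[-1, -2]], 0.0))
    for k in range(2):                            # fix the arbitrary sign
        if pcs[np.argmax(np.abs(pcs[:, k])), k] < 0:
            pcs[:, k] = -pcs[:, k]

    rr, cc = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    grid = np.stack([rr.ravel(), cc.ravel()], axis=1).astype(float)
    u = np.linspace(-1.0, 1.0, rows)[rr.ravel()] * scales[0]
    v = np.linspace(-1.0, 1.0, cols)[cc.ravel()] * scales[1]
    weights = mean + np.outer(u, pcs[:, 0]) + np.outer(v, pcs[:, 1])

    # Squared distances between lattice points, used by the neighborhood.
    grid_d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=-1)

    def best_matching_units(w):
        d2 = ((x[:, None, :] - w[None, :, :]) ** 2).sum(axis=-1)
        return d2.argmin(axis=1)

    for t in range(epochs):
        frac = t / max(epochs - 1, 1)
        sigma = sigma_start * (sigma_end / sigma_start) ** frac
        bmu = best_matching_units(weights)        # assign ALL samples first
        h = np.exp(-grid_d2[:, bmu] / (2.0 * sigma ** 2))  # (nodes, samples)
        # Batch update: neighborhood-weighted mean over the whole data set.
        weights = (h @ x) / h.sum(axis=1, keepdims=True)

    return weights, best_matching_units(weights)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy = rng.normal(size=(60, 4))                # synthetic data, not from the paper
    w, assignment = batch_som(toy)
    print(w.shape, np.bincount(assignment, minlength=25))
```

Samples that share a value in the returned assignment vector fall on the same lattice point, which is the sense in which the abstract speaks of genes or metabolites clustering at the same or neighboring lattice points. In an online SOM, by contrast, weights are updated after each individual sample, so the result can depend on presentation order.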