Journal of Scientific & Industrial Research
Vol. 70, November 2011, pp. 976-981

HaloBase: development of database system for halophilic bacteria and archaea with respect to proteomics, genomics & other molecular traits

Hetal Ukani 1, Megha K Purohit 1, John J Georrge 2, Sneha Paul 2 and Satya P Singh 1*
1 Department of Biosciences, Saurashtra University, Rajkot 360 005, India
2 Department of Bioinformatics, Christ College, Vidya Niketan, Rajkot 360 005, India
*Author for correspondence. E-mail: satyapsingh@yahoo.com; satyapsingh125@gmail.com

Received 18 February 2011; revised 01 September 2011; accepted 05 September 2011

This study presents HaloBase, a specialized genome database for halophilic bacteria and archaea that covers molecular aspects and diversity-based studies. HaloBase integrates detailed information on 23 selected organisms, comprising more than 55,000 genes, about 50,000 proteins (with functionality to view their 3D structures) and about 1,000 structural RNAs. The database, available at http://halobase.info, will provide a platform for storage and analysis of halophile data for research and commercial purposes.

Keywords: Database, Genome, Halophiles, Microsoft Access, Proteins

Introduction

Extremophiles, in general, hold secrets about the molecular evolution of life and are being explored from both microbial diversity and biotechnological standpoints 1,2. Halophilic microbes are one of the most widely studied groups of extremophiles. Over the years, the authors have worked with halophilic/haloalkaliphilic bacteria from saline habitats of coastal Gujarat, exploring diversity, phylogeny, enzymatic profiling and properties of extracellular enzymes 3-10. In addition to haloalkaliphilic bacteria, salt-tolerant alkaliphilic actinomycetes are also being explored with respect to their diversity and biotechnological potential 11-16.
In recent years, attention has focused on the entire community of an ecological niche using metagenomic approaches 17-20. This study presents the creation of HaloBase, a database system for halophilic bacteria and archaea, which would be useful to researchers in microbiology and molecular biology who focus on genes and proteins. HaloBase reflects an exhaustive diversity and functional analysis of saline systems.

Methodology

HaloBase contains a range of information about genes, proteins, RNAs, genome sequences, protein structures, proteomics and genomics. The database also supports searching the data by a wide range of criteria. It was constructed in Microsoft Access 2007 (http://office.microsoft.com/en-gb/access/default.aspx) for easy access and portability. Halophilic bacteria and archaea were selected for inclusion in HaloBase. The backend database was created in Microsoft Access and the front end was written in ASP (http://msdn.microsoft.com/en-us/library/aa286483.aspx), which provides easy web access to the database for data entry, retrieval and analysis. Data were collected from various publicly available databases [NCBI (www.ncbi.nlm.nih.gov), PDB (www.pdb.org) and GOLD (http://genomesonline.org)].

Data Input

For HaloBase, 23 organisms (Table 1) were selected and their detailed information was integrated, comprising more than 55,000 genes, about 50,000 proteins (with functionality to view their 3D structures) and about 1,000 structural RNAs. Whole genome sequences were also collected in 5 different formats for the selected organisms. HaloBase also includes functionality to add and edit data through a user-friendly web-based portal, enabling researchers to update and maintain the database in line with new research and findings. Major aspects taken into consideration are habitat characteristics, biocatalytic potential, salinity and molecular properties, proteomics and genomics. Whole