JOURNAL OF PROTEINS AND PROTEOMICS 8(3), 2017, pp. 159-167

Review Article

DATABASES DEVELOPED IN INDIA FOR BIOLOGICAL SCIENCES

Gitanjali Yadav 1 and Debasisa Mohanty 2*

1 National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067, India
2 National Institute of Immunology, Aruna Asaf Ali Marg, New Delhi-110067, India

* Corresponding Author: Debasisa Mohanty, E-mail: deb@nii.res.in

Received: September 1, 2017; Accepted: September 20, 2017; Published: September 24, 2017

Abstract: The complexity of biological systems requires the use of a variety of increasingly sophisticated experimental methods to probe various cellular processes at molecular and atomic resolution. The availability of technologies for determining nucleic acid sequences of genes and atomic resolution structures of biomolecules prompted the development of major biological databases like GenBank and PDB almost four decades ago. India was one of the few countries to recognize early the utility of such databases for progress in modern biology and biotechnology. The Department of Biotechnology (DBT), India, established the Biotechnology Information System (BTIS) network in the late 1980s. Starting with the genome sequencing revolution at the turn of the century, the application of high-throughput sequencing technologies in biology and medicine for analysis of genomes, transcriptomes, epigenomes and microbiomes has generated massive volumes of sequence data. The BTIS network has not only provided state-of-the-art computational infrastructure to research institutes and universities for utilizing various biological databases developed abroad in their research, but has also actively promoted research and development (R&D) projects in bioinformatics to develop a variety of biological databases in diverse areas.
It is encouraging to note that a large number of biological databases or data-driven software tools developed in India have been published in leading peer-reviewed international journals such as Nucleic Acids Research, Bioinformatics and Database, and in the BMC, PLoS and NPG series of publications. Some of these databases are not only unique but also highly accessed, as reflected in their number of citations. Apart from databases developed by individual research groups, BTIS has initiated consortium projects to develop major India-centric databases on Mycobacterium tuberculosis, rice and mango, which can potentially have practical applications in health and agriculture. Many of these biological databases have also helped in the development of novel data mining methods, prediction strategies and data-driven application software or web servers. In this article, we give an overview of biological databases developed in India and their impact on data-driven research in biology. We also provide some suggestions for planning training programs in biological data science to enable the transition to the big data revolution in biology by combining advanced techniques like deep learning with biological big data.

Keywords: Biological big data; Database development; Indian efforts; Biomedical databases; Plant databases; BTIS network

1. Introduction

A variety of increasingly sophisticated experimental methods are being used to probe complex biological systems at molecular and atomic resolution. Applications of high-throughput sequencing technologies in biology and medicine for analysis of genomes, transcriptomes, epigenomes and microbiomes have generated massive volumes of sequence data. Advances in mass spectrometry-based techniques have also helped in generating huge volumes of proteomic data, and together these constitute a major component of big data in biology (Li and Chen, 2014). High-throughput studies have also generated