De Novo Transcriptome Sequencing Reveals Important Molecular Networks and Metabolic Pathways of the Plant, Chlorophytum borivilianum

Shikha Kalra 1, Bhanwar Lal Puniya 2, Deepika Kulshreshtha 2, Sunil Kumar 1, Jagdeep Kaur 1, Srinivasan Ramachandran 2, Kashmir Singh 1 *

1 Department of Biotechnology, Panjab University, Chandigarh, India, 2 G N Ramachandran Knowledge Center, Institute of Genomics and Integrative Biology (Council of Scientific and Industrial Research), New Delhi, India

Abstract

Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties, which are attributed to the saponins present in the plant. Transcriptome information for this species is limited, and only a few hundred expressed sequence tags (ESTs) are available in public databases. To gain molecular insight into this plant, high-throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single-end reads were retained after quality filtering. Available (e.g., De Bruijn/Eulerian graph-based) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with a coverage size of 22.42 Mb and an average length of 221 bp. The guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis using the non-redundant protein, Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases identified all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several unique cytochrome P450 (CYP450) and glycosyltransferase sequences were found. We identified simple sequence repeat (SSR) motifs in the transcripts, with an abundance of di-nucleotide SSR markers (43.1%).
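As a rough illustration of the assembly statistics quoted above (GC content, average transcript length, and the share of di-nucleotide SSRs), the following minimal Python sketch computes them for a list of transcript sequences. It is not the in-house tooling used in the study: the function names, the toy sequences and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen only for demonstration.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies for a motif of a
# given unit length to count as an SSR (not taken from the paper).
MIN_REPEATS = {2: 6, 3: 5}


def gc_content(seqs):
    """Return the fraction of G and C bases over all sequences."""
    total = sum(len(s) for s in seqs)
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in seqs)
    return gc / total if total else 0.0


def mean_length(seqs):
    """Return the average sequence length in bases."""
    return sum(len(s) for s in seqs) / len(seqs) if seqs else 0.0


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (unit_length, unit, start, copies) for each tandem repeat found."""
    hits = []
    seq = seq.upper()
    for unit_len, n in min_repeats.items():
        # One unit followed by at least n-1 further copies of the same unit.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if len(set(unit)) == 1:
                continue  # skip homopolymer runs such as AAAAAA...
            hits.append((unit_len, unit, m.start(), len(m.group(0)) // unit_len))
    return hits


def dinucleotide_fraction(seqs):
    """Return the share of detected SSRs whose repeat unit is a di-nucleotide."""
    all_hits = [h for s in seqs for h in find_ssrs(s)]
    if not all_hits:
        return 0.0
    return sum(1 for h in all_hits if h[0] == 2) / len(all_hits)


if __name__ == "__main__":
    # Toy sequences, invented for this example only.
    transcripts = ["TTAGAGAGAGAGAGCC", "ATGCGCATTA"]
    print(gc_content(transcripts), mean_length(transcripts))
    print(dinucleotide_fraction(transcripts))
```

On a real assembly the sequences would be read from the transcript FASTA file, and the repeat thresholds would need to match whatever SSR criteria the analysis actually adopts.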
Large-scale expression profiling using Reads per Kilobase per Million mapped reads (RPKM) revealed the major genes involved in different metabolic pathways of the plant. The genes, ESTs and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum.

Citation: Kalra S, Puniya BL, Kulshreshtha D, Kumar S, Kaur J, et al. (2013) De Novo Transcriptome Sequencing Reveals Important Molecular Networks and Metabolic Pathways of the Plant, Chlorophytum borivilianum. PLoS ONE 8(12): e83336. doi:10.1371/journal.pone.0083336

Editor: Luis Herrera-Estrella, Centro de Investigación y de Estudios Avanzados del IPN, Mexico

Received April 2, 2013; Accepted November 1, 2013; Published December 23, 2013

Copyright: © 2013 Kalra et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: Funding came from the Council of Scientific and Industrial Research (CSIR), India (csirhrdg.res.in). Grant number: 38(1338)/12/EMRII. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: kashmirbio@pu.ac.in

Introduction

Chlorophytum borivilianum is an important species of the family Liliaceae owing to its exceptional medicinal properties. Overexploitation and extensive harvesting of the wild stands have threatened the species, which is now listed as 'endangered' by the International Union for Conservation of Nature and Natural Resources (IUCN) [1]. Because of its high medicinal value, the species has been ranked 26th among the top-priority medicinal plants to be protected and promoted by the Medicinal Plant Board, Government of India.
Chlorophytum borivilianum is referenced extensively in many Ayurvedic classics, such as Charka Samhita (2nd century B.C.), Sushrut Samhita (2nd century A.D.) and Raja Nighantu (17th century A.D.) (http://www.safedmusli.net). Its tubers are used for aphrodisiac, adaptogenic, anti-ageing, health-restorative and health-promoting purposes. The combination of C. borivilianum leaves with other herbs, such as Withania somnifera and Emblica officinalis, is reported to make the body resistant to sex-related diseases and to delay menopause [2]. These attributes have made C. borivilianum an essential ingredient in Ayurvedic, Unani and Allopathic formulations.

Major phytochemical components reported from the roots of C. borivilianum include steroidal saponins, fructans and fructooligosaccharides (FOS), acetylated mannans, phenolic compounds and proteins [3,4]. Steroidal saponins are considered to be the principal bioactive components responsible for the pharmacological properties [5], and borivilianosides, which are furostane-type steroidal saponins, have been isolated and characterized from this plant [6,7]. Steroidal saponins are synthesized via the mevalonic acid (MVA) pathway, which operates in the cytoplasm [8], or through the more recently discovered non-mevalonate (MEP) pathway located in plastids [9,10]. Cyclization of the precursor compound 2,3-oxidosqualene by oxidosqualene cyclase (OSC), combined with modifications of the steroid skeleton such as hydroxylations and glycosylations, leads to the formation of various saponins. Several OSC genes, such as cycloartenol synthase (CAS), lupeol synthase (LS) and β-amyrin synthase (β-AS), have been cloned from various plant systems [11,12]. According to the proposed pathway [13], some specific CYP450s and UDP-glycosyltransferases (UGTs) may catalyze the conversion of cycloartenol to various steroidal

PLOS ONE | www.plosone.org 1 December 2013 | Volume 8 | Issue 12 | e83336