American Journal of Bioinformatics Research 2014, 4(1): 11-22
DOI: 10.5923/j.bioinformatics.20140401.03
Functional Characterization of Expressed Sequence Tags
of Bread Wheat (Triticum aestivum) and Analysis of
CRISPR Binding Sites for Targeted Genome Editing
Shailesh Sharma*, Santosh Kumar Upadhyay*
National Agri-Food Biotechnology Institute (Department of Biotechnology, Government of India), C-127, Industrial Area,
S.A.S. Nagar, Phase 8, Mohali, Punjab, 160071, India
Abstract Bread wheat (Triticum aestivum) is one of the leading food crops worldwide. However, functional
characterization of the wheat genome is still in progress because of its huge size (~17 Gb). We aimed to contribute to this
effort through functional characterization of EST sequences. Wheat EST sequences (1.2 million available in the EST database)
were cleaned and assembled into 27268 contigs under stringent parameters. About 89% (24339) of the contigs were functionally
annotated by BlastX search against the NCBI-NR protein database with an e-value cutoff of 10^-5. The annotated contigs were
further classified into Gene Ontology (GO) terms and mapped to KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways
using the Blast2GO program. A total of 78827 GO terms and 132 KEGG pathways were identified. Purine metabolism and starch
and sucrose metabolism were the major pathways. The inositol phosphate metabolism pathway, responsible for the synthesis of
phytic acid (an anti-nutritional component), was also significantly represented in wheat. We identified 3327 EST-SSRs in 2832
contigs and probable CRISPR binding sites in each contig. Further, a hypothetical phytic acid biosynthetic pathway and
possible important target genes for reducing the phytic acid content of wheat with the CRISPR-Cas system are described. Our
study provides genetic information about an important food crop as well as a method for nutritional improvement using a
modern biotechnology tool.
Keywords Bread wheat, EST, KEGG, GO, CRISPR, Phytic acid
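As an illustration of the two sequence scans summarized in the abstract (EST-SSR detection and the search for probable CRISPR binding sites), the following minimal Python sketch may be helpful. It is not the pipeline used in this study. It assumes that a candidate Cas9 site is a 20-nt protospacer immediately followed by an NGG PAM, and that an SSR is a 2-6 bp motif repeated at least four times in tandem; the function names, the repeat threshold, and the example sequences are hypothetical.

```python
import re

# Assumption (not taken from the paper): a candidate site is a 20-nt
# protospacer immediately followed by an NGG PAM. The lookahead lets
# overlapping sites be reported.
PAM_SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")

# Translation table used to complement a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_cas9_sites(seq):
    """List (strand, start, protospacer, pam) for both strands.

    `start` is 0-based and counted on the strand that was scanned.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in PAM_SITE.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


def find_ssrs(seq, min_repeats=4):
    """List (start, motif, repeat_count) for tandem repeats of 2-6 bp motifs.

    The threshold `min_repeats` is an illustrative assumption. Runs of a
    single base are also reported, as a repeated 2-bp motif such as "AA".
    """
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    contig = "ACGTACGTACGTACGTACGTTGG"  # hypothetical 23-nt contig
    for hit in find_cas9_sites(contig):
        print(hit)
    for ssr in find_ssrs("TTGAGAGAGAGACC"):
        print(ssr)
```

A real analysis would additionally filter candidate sites for uniqueness across the three wheat sub-genomes, which this sketch does not attempt.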
1. Introduction
Bread wheat (Triticum aestivum) is one of the most
important food crops, accounting for ~21% of the food
calories of 75% of the world population (Braun et al. 2010).
This figure is continuously increasing, and it is estimated
that the demand for wheat will double by 2050. On the other
hand, changes in climatic conditions might decrease wheat
production in the coming years (Rosegrant et al. 2010). New
genetic and molecular biology tools for genome sequencing
and genome engineering, used alongside breeding programs,
might be very useful for understanding wheat biology and
improving crop yield (Wilson et al. 2004; Upadhyay et al.
2013). Bread wheat has one of the largest and most complex
genomes, allohexaploid and ~17 Gb in size, which is about
40 times the size of the rice genome (Arumuganathan and
Earle 1991). Characterizing such a genome is itself a big
challenge; however, it is urgently needed.
* Corresponding author:
haitoshailesh@gmail.com (Shailesh Sharma)
santoshnbri@hotmail.com (Santosh Kumar Upadhyay)
Published online at http://journal.sapub.org/bioinformatics
Copyright © 2014 Scientific & Academic Publishing. All Rights Reserved
Expressed sequence tags (ESTs) provide very useful
information about gene sequences and their expression
(Duggan et al. 1999). EST sequencing of many plant species
is either completed or under way, and ESTs are very useful in
gene discovery (Ewing et al. 1999; Fulton et al. 2002;
Hughes et al. 2004; Ronning et al. 2003; Schlueter et al.
2004). Since the volume of genome sequence data is
continuously increasing owing to the decrease in sequencing
cost, functional annotation and characterization have become
a great challenge. In the case of wheat
(www.wheatgenome.org), genome sequencing is progressing
rapidly, and functional characterization of wheat ESTs can
be a quick and complementary approach. Further, this
resource will be highly valuable in crop improvement
programs as well as during annotation of the wheat genome.
Genome manipulation has become a very important factor
in crop improvement. Transcription activator-like effector
nucleases and zinc finger nucleases have been used to
introduce valuable mutations in plants and other organisms
(Chen et al. 2013; Zhang et al. 2010). However, these
technologies require protein engineering and are complicated
to design. A new technology based on the prokaryotic type II
CRISPR-Cas9 (clustered regularly interspaced short
palindromic repeats-CRISPR-associated protein) system has
been reported