Open Access Research Article
Journal of Data Mining in Genomics & Proteomics, ISSN: 2153-0602
Kaur et al., J Data Mining Genomics Proteomics 2016, 7:3
DOI: 10.4172/2153-0602.1000203
JDMGP, an open access journal, Volume 7 • Issue 3 • 1000203

Keywords: PPR protein; Legumes; Restorer of fertility like-PPR (RFL); Synteny; P sub-class; Mitochondrion

Introduction

Proteins containing pentatricopeptide repeat (PPR) motifs were first discovered in the genome of Arabidopsis thaliana [1,2] and later reported in other sequenced eukaryotes. PPR proteins have gained importance because of their roles in various RNA processing events such as RNA stabilization, splicing, editing, cleavage and transcriptional activation [3]. Though PPRs are encoded by the nuclear genome, they are mostly targeted to either mitochondria or plastids [4] and thus play an important role in organellar gene regulation. Using classical genetic screens, a number of PPR mutants have been characterized, with phenotypes ranging from photosynthetic defects [5] to restricted growth [6], defective seed and embryo development [7], aberrant leaf growth [8] and restoration of pollen fertility [9], implying that PPRs act as sequence-specific RNA-binding proteins in organelles. Other reports also point to important roles for PPRs: abnormal splicing of the chloroplast-targeted, PPR-encoding Rpl2 gene in rice resulted in a white stripe leaf (WSL) mutant characterized by enhanced sensitivity to abiotic stresses and chlorotic striations during early development [10]; Rf1A in rice functions in atp6 mRNA editing [11]; and RPF2 affects the mitochondrial nad9 and cox3 mRNAs in Arabidopsis [12]. Non-plant organisms have very few PPRs, whereas a great expansion of this gene family via retrotransposition has been observed in plants [13].
Their number in a given species ranges from fewer than 30 in the green alga Chlamydomonas reinhardtii [14] to 1882 members in T. aestivum [15].

PPR proteins are categorized into different sub-classes and sub-groups on the basis of the sequence content and arrangement of the peptide repeat motifs that underlie their structural and functional divergence [16]. It is the sequence variability within repeats that gives different members of this protein family their specificity of action. The two major sub-classes are denoted P and PLS. Classical or P-class PPRs are defined as those containing a degenerate 35-amino-acid motif present in multiple tandem repeats, and this sub-class constitutes about half of the PPR family in any plant species. The PPR motif forms two anti-parallel α-helices that interact to produce a helix-turn-helix fold; a series of these forms a superhelix with a central groove for interaction with RNA [17]. Many P-class proteins carry additional C-terminal domains (PRORP, SMR, LAGLIDADG etc.) that confer functional specificity. Proteins with a LAGLIDADG motif are involved in catalytic processes owing to their similarity to group I intron maturases [18], and those with an SMR domain are related to the MutS2 family, which participates in transcription or repair of chloroplast DNA [19]. The PRORP (proteinaceous RNase P) sub-class possesses a metallonuclease domain involved in processing of mitochondrial tRNA, for example the Arabidopsis PRORP3 protein [20]. When the classical P motif is interspersed with L motifs (36 amino acids) and S motifs (31 amino acids) in triplets, the protein belongs to the PLS sub-class, in which this ordered arrangement can have a variable number of S motif repeats [21]. PLS-PPRs also possess additional C-terminal domains designated E (extended), E+ (slightly longer than the E domain) and DYW (characterized by an Asp-Tyr-Trp triplet at the C-terminal end).
Thus, a PLS protein terminates with either a PPR motif or a non-PPR motif, i.e., an E, EE+ or EE+DYW motif sequence. The members of these three sub-groups are mainly involved in RNA editing in chloroplasts and mitochondria [22].

*Corresponding author: Kishor Gaikwad, National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, New Delhi, India, Tel: 011-25841787/25842789; Fax: +911125843984; E-mail: kish2012@nrcpb.org

Received June 30, 2016; Accepted July 11, 2016; Published July 18, 2016

Citation: Kaur P, Verma M, Chaduvula PK, Saxena S, Baliyan N, et al. (2016) Insights into PPR Gene Family in Cajanus cajan and Other Legume Species. J Data Mining Genomics Proteomics 7: 203. doi:10.4172/2153-0602.1000203

Copyright: © 2016 Kaur P, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Insights into PPR Gene Family in Cajanus cajan and Other Legume Species

Parampreet Kaur, Mohit Verma, Pavan K Chaduvula, Swati Saxena, Nikita Baliyan, Alim Junaid, Ajay K Mahato, Nagendra Kumar Singh and Kishor Gaikwad*

National Research Centre on Plant Biotechnology, Pusa, New Delhi, India

Abstract

PPR proteins comprise several hundred members in land plants and govern a fascinating array of functions in organellar genomes, ranging from stabilization of organellar transcripts and RNA editing to fertility restoration of CMS lines. Despite the availability of genome sequences for several legume species, comprehensive cataloguing of the members of the PPR gene family has not been carried out.
In the current study, we identified 523, 830, 534, 816, 441 and 677 PPR proteins in the Cajanus, Glycine, Phaseolus, Medicago, Vigna and Cicer genomes, respectively, and carried out a complete in silico categorization to classify them into sub-classes and predict their localization. Chromosomal coordinates of 271 Cajanus PPR genes were predicted and their homologues were identified in the 5 other legumes, revealing extensive genome conservation. The PPR genes of all 6 legume species were further probed to identify restorer of fertility-like PPRs (RFLs) on the basis of protein clustering followed by homology searches against known Rf-PPR genes. Seventy RFL PPR genes (P sub-class) were identified and scrutinized by phylogenetic analysis, which revealed extended similarity and common features shared by these RFLs across species. Some of these RFL PPRs were present as small clusters in the Glycine, Phaseolus, Vigna and Cicer genomes. This study has generated a knowledge base on the PPR gene family in legumes and opens several avenues for future investigation of their molecular functions, evolutionary relationships and potential for identifying markers to enable cloning of Rf genes.
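
The sub-class scheme summarized in the Introduction (P versus PLS proteins, with E, E+ and DYW C-terminal extensions) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names `classify_ppr` and `repeat_span`, the motif labels, and the idea that each protein has already been reduced to an ordered list of motif labels by an upstream motif search are assumptions made purely for illustration. Only the motif lengths (P = 35, L = 36, S = 31 amino acids) and the naming of sub-groups by their terminal domain come from the text.

```python
from typing import Sequence

# Repeat motif lengths in amino acids, as stated in the Introduction.
MOTIF_LENGTH = {"P": 35, "L": 36, "S": 31}

# C-terminal extensions described for PLS proteins (labels are illustrative).
C_TERMINAL = ("E", "E+", "DYW")


def classify_ppr(motifs: Sequence[str]) -> str:
    """Return 'P', 'PLS', 'E', 'E+' or 'DYW' for an ordered list of motif labels.

    Hypothetical helper: assumes motif labels were assigned beforehand.
    """
    unknown = [m for m in motifs if m not in MOTIF_LENGTH and m not in C_TERMINAL]
    if unknown:
        raise ValueError(f"unknown motif labels: {unknown}")
    repeats = [m for m in motifs if m in MOTIF_LENGTH]
    if not repeats:
        raise ValueError("a PPR protein needs at least one P, L or S motif")
    # A protein ending in a non-PPR domain is named after that terminal domain.
    last = motifs[-1]
    if last in C_TERMINAL:
        return last
    # Otherwise it ends in a PPR motif: P-only arrays are classical P class.
    if set(repeats) == {"P"}:
        return "P"
    return "PLS"


def repeat_span(motifs: Sequence[str]) -> int:
    """Total length in amino acids of the P, L and S repeats only."""
    return sum(MOTIF_LENGTH[m] for m in motifs if m in MOTIF_LENGTH)


if __name__ == "__main__":
    print(classify_ppr(["P"] * 12))
    print(classify_ppr(["P", "L", "S", "P", "L", "S", "E", "E+", "DYW"]))
    print(repeat_span(["P", "L", "S"]))
```

Under these assumptions, a protein is assigned to the sub-group of its terminal domain when it ends in E, E+ or DYW, and to P or PLS otherwise. P-class proteins carrying PRORP, SMR or LAGLIDADG domains are deliberately left out of the sketch.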