GC constituents and relative codon expressed amino acid composition in cyanobacterial phycobiliproteins

Vinod K. Kannaujiya, Rajesh P. Rastogi, Rajeshwar P. Sinha

Laboratory of Photobiology and Molecular Microbiology, Centre of Advanced Study in Botany, Banaras Hindu University, Varanasi 221005, India

Article info

Article history: Received 15 October 2013; Received in revised form 17 April 2014; Accepted 12 June 2014; Available online 14 June 2014

Keywords: Amino acids; Codon usage; Cyanobacteria; GC constituents; Phycobiliproteins

Abstract

The genomic and structural relationships of phycobiliproteins (PBPs) among cyanobacterial species are determined by their nucleotide and amino acid composition. The genomic GC constituents influence the amino acid variability and codon usage of particular PBP subunits. We analyzed 11 cyanobacterial species to explore amino acid variation and the causal relationship between GC constituents and codon usage. Analysis of GC content at the first, second and third codon positions showed relatively more amino acid variability at the G3 + C3 position than at the first and second positions. The GC richness of the codons encoding an amino acid (G-rich, C-rich, or both) correlates with codon variability and amino acid availability. Fluctuation in amino acids such as Arg, Ala, His, Asp, Gly, Leu and Glu in the α and β subunits was observed at the G1C1 position, whereas fluctuation in other amino acids such as Ser, Thr, Cys and Trp was observed at the G2C2 position. The coding selection pressure on amino acids such as Ala, Thr, Tyr, Asp, Gly, Ile, Leu, Asn, and Ser in the α and β subunits of PBPs was more pronounced at the G3C3 position. In this study, we observed that each PBP subunit is codon-specific for particular amino acids.
These results suggest that genomic constraints linked with GC constituents select the codons for particular amino acids; furthermore, codon-level analysis may be a novel approach to exploring many problems associated with the genomics and proteomics of cyanobacteria.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Cyanobacteria are Gram-negative photoautotrophic organisms and an excellent source of several natural products (Rastogi and Sinha, 2009; Rastogi et al., 2010; Richa et al., 2011). Given the vast genetic diversity of cyanobacteria, rapid analytical methods are being developed to monitor their metabolites and phycobilisome composition (Parsiegla et al., 2012). Since cyanobacteria are obligate photosynthetic organisms, analysis of the chemical composition of the components of the photosynthetic machinery is indispensable. Fast progress in genome sequencing has opened many new research avenues for exploring hidden biochemical and molecular phenomena in cyanobacteria. Presently, more than 41 cyanobacterial strains have been fully sequenced; these genomes can be used to analyze the path from gene to amino acid at different codon positions and to address many questions in molecular biology. Investigation of the concealed information on nucleotide variability and the related amino acid codons, derived from the differing gene compositions of cyanobacterial species, is now possible to a certain extent, in contrast to bacterial species. In bacteria, it is well established that base (G + C) composition correlates with amino acid composition (Sueoka, 1961). Bacterial GC content can range from 22.5 to 72% of the total genome. However, Lightfield et al. (2011) have recently reported variations in GC content ranging from 16.6 to 74.9% across total bacterial genomes.
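The GC measures discussed above (overall G + C content and the G + C content at each of the three codon positions) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `gc_by_codon_position` and the toy sequence are hypothetical, and the sketch assumes an in-frame coding sequence made up only of A, C, G and T (or U).

```python
def gc_by_codon_position(cds: str) -> dict:
    """Return percent G + C overall and at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence (hypothetical helper,
    not taken from the paper).
    """
    seq = cds.upper().replace("U", "T")
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")

    n_codons = len(seq) // 3
    result = {}
    for pos in range(3):
        # Every third base starting at offset `pos` belongs to codon position pos + 1.
        bases = seq[pos::3]
        gc_count = sum(base in "GC" for base in bases)
        result[f"GC{pos + 1}"] = 100.0 * gc_count / n_codons

    # Overall G + C percentage across all positions.
    result["GC"] = 100.0 * sum(base in "GC" for base in seq) / len(seq)
    return result


# Toy sequence (invented for illustration): ATG GCC GGC AAG TAA
toy_cds = "ATGGCCGGCAAGTAA"
toy_result = gc_by_codon_position(toy_cds)
print(toy_result)
```

In this toy sequence the third codon position is richer in G + C than the first and second, which is the kind of positional difference the analysis compares across genes and species.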
In bacterial systems, G + C variability results in changes in amino acid composition through codon redundancy, involving both synonymous and non-synonymous codon changes, and more so at the third codon position than at the first and second (D'Onofrio et al., 1991; de Miranda et al., 2000; Harrison and Charlesworth, 2011; Knight et al., 2001; Singer and Hickey, 2000; Wada, 1992; Wilquet and Van de Casteele, 1999). Mutations within synonymous codons do not alter amino acid composition, but bias still exists within and between species because specific codons are preferred for each amino acid (Agashe et al., 2013; Sharp et al., 1995). The GARP amino acids (Gly, Ala, Arg and Pro) are encoded by GC-rich codons (~100% GC content), yet they still show bias in base composition (Bharanidharan et al., 2004; Lobry, 1997; Singer and Hickey, 2000). Natural bias in codon usage invariably promotes heterologous gene expression among nucleotides (Plotkin and Kudla, 2011). Such variation arises mainly through redundancy at the third codon position, often in synonymous regions of the nucleotide composition; however, redundancy at the first and second codon positions also produces smaller variation in amino acid composition (Wada, 1992).

Gene 546 (2014) 162–171

Abbreviations: PBPs, phycobiliproteins; GC, guanine and cytosine; AT, adenine and thymine; TM, transmembrane; PS, photosystem; C-PC, cyanobacterial phycocyanin; C-PE, cyanobacterial phycoerythrin; C-APC, cyanobacterial allophycocyanin; kDa, kiloDalton; Sp, species; SD, standard deviation; SE, standard error; P-value, probability-value; R, correlation coefficient.

Corresponding author. E-mail addresses: r.p.sinha@gmx.net, rpsinhabhu@gmail.com (R.P. Sinha).

http://dx.doi.org/10.1016/j.gene.2014.06.024
0378-1119/© 2014 Elsevier B.V. All rights reserved.
Journal homepage: www.elsevier.com/locate/gene