Biochemistry 2010, 49, 7190–7201. DOI: 10.1021/bi101093a. Published on Web 07/20/2010. © 2010 American Chemical Society. pubs.acs.org/Biochemistry

Sole and Stable RNA Duplexes of G-Rich Sequences Located in the 5′-Untranslated Region of Protooncogenes

Sarika Saxena, Daisuke Miyoshi,*,‡,§ and Naoki Sugimoto*,‡,§

‡ Frontier Institute for Biomolecular Engineering Research (FIBER) and § Faculty of Frontiers of Innovative Research in Science and Technology (FIRST), 7-1-20 Minatojima-Minamimachi, Chuo-ku, Kobe 650-0047, Japan

Received July 9, 2010

ABSTRACT: Guanine- (G-) rich nucleic acid sequences can form four-stranded structures called G-quadruplexes. It is widely held that the formation of a G-quadruplex in RNA is more feasible than in DNA because of the lack of a complementary strand in mRNA. Here, we analyzed sequences of 5′-untranslated regions of protooncogenes and surprisingly found that these regions showed an enrichment of not only guanine (G) but also cytosine (C) nucleotides. Since neighboring cytosine- (C-) rich regions can affect the formation and stability of a G-quadruplex structure, we further investigated the properties of DNA and RNA structures of G-rich and GC-rich regions. We selected typical GC-rich RNA sequences from protooncogenes and the corresponding DNA sequences and investigated their structures. The GC-rich RNA sequences formed stable A-form duplexes as their major structure independent of the surrounding conditions, including the presence of different cations (Na+, K+, or Li+) or molecular crowding with 40 wt % poly(ethylene glycol) with an average molecular mass of 200 Da, although there were a few exceptions in which only a combination of K+ and molecular crowding induced a G-quadruplex structure of an extremely G-rich RNA sequence. In contrast, structural polymorphisms involving duplexes, G-quadruplexes, and i-motifs were observed for GC-rich DNA sequences depending on the surrounding factors.
These results demonstrate the considerable structural and functional differences in GC-rich sequences of the genome (DNA) and transcriptome (mRNA) with respect to the nucleic acid backbone. Moreover, the results suggest that structural studies of G-rich RNA sequences should be carried out under cell-mimicking conditions in which K+ and crowding cosolutes are present.

DNA is known to form duplexes with Watson-Crick base pairs (1). However, polymorphic structures of DNA have attracted great attention over the past two decades (2, 3). In particular, it is well-known that G-rich and C-rich nucleic acid sequences can form four-stranded structures, the G-quadruplex and the i-motif, respectively (4, 5). Sequences with high potential to form DNA G-quadruplexes have been found in many regions of the genome such as telomeres (6), promoter regions of protooncogenes (7), growth factors (8), immunoglobulin switch regions (3), insulin regulatory sequences (9), and the region responsible for fragile X syndrome (10). Recent publications have demonstrated the presence of G-quadruplex-forming sequences throughout the human genome and their enrichment adjacent to transcription start sites, thus emphasizing the biological significance of this structure (11-20). Structural studies using X-ray crystallography (21-23) and NMR (24-27) have further revealed that naturally occurring G-rich DNA sequences form G-quadruplexes with various combinations of strand directions. Moreover, it has been demonstrated that, since genomic DNA is double stranded except at its telomeres, G-quadruplex formation by G-rich sequences requires the canonical Watson-Crick base pairs to open so that Hoogsteen base pairs can form (28, 29). These transitions among secondary structures, which modulate gene function, are influenced by pH, temperature, cations, and molecular crowding, all of which act as essential chemical stimuli inside cells (30-33).
For RNA as well as DNA, bioinformatic analysis has revealed numerous G-quadruplex-forming sequences in transcribed mRNAs, especially in their 5′-untranslated regions (UTR)¹ (11). For example, RNA G-quadruplex formation has been proposed in insulin-like growth factor-II (IGF-II) (34), fragile X mental retardation protein (FMR1) (35), fibroblast growth factor 2 (FGF-2) (36), matrix metalloproteinase (MT3MMP) (37), human neuroblastoma RAS viral oncogene (v-ras) (38), and zinc-finger protein (Zic-1) (39). Moreover, RNA G-quadruplex structures are thought to be involved in the regulation of translation both in vitro and in vivo (40-42) and in various other biological functions such as structural roles (43), intron splicing (44), and protein binding (37, 45).

This work was supported in part by Grants-in-Aid for Scientific Research, the “Core Research” project (2009-2014), and the “Academic Frontier” project (2004-2009) from the Ministry of Education, Culture, Sports, Science, and Technology, Japan, the Hirao Taro Foundation of the Konan University Association for Academic Research, and the Long-Range Research Initiative Project of Japan Chemical Industry Association.

*To whom correspondence should be addressed. D.M.: phone, +81-78-303-1426; fax, +81-78-303-1495; e-mail, miyoshi@center.konan-u.ac.jp. N.S.: phone, +81-78-303-1416; fax, +81-78-303-1495; e-mail, sugimoto@konan-u.ac.jp.

¹ Abbreviations: UTR, untranslated region; CD, circular dichroism; PAGE, polyacrylamide gel electrophoresis; Tm, melting temperature; PEG, poly(ethylene glycol); UV, ultraviolet.

Recently, the thermodynamic stability of the RNA G-quadruplex at its natural position within the 5′-UTR of NRAS has