NMR Observation of a Novel C-Tetrad in the Structure
of the SV40 Repeat Sequence GGGCGG
P. K. Patel, Neel S. Bhavesh, and R. V. Hosur¹
Department of Chemical Sciences, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400 005, India
Received March 3, 2000
We report the NMR structure of the DNA sequence d-TGGGCGGT, which
contains a repeat sequence from the SV40 viral genome, in Na⁺
solutions at neutral pH. The structure is a novel quadruplex
incorporating a C-tetrad formed by symmetrical pairing of four Cs via
NH₂–O₂ H-bonds in a plane. The C-tetrad has a wider cavity than
G-tetrads and stacks well over the adjacent G4 tetrad, but poorly on
the G6 tetrad. The quadruplex helix is largely underwound by 8–10°
compared to B-DNA except at the C5–G6 step. To our knowledge this is
the first report of C-tetrad formation in DNA structures, and it would
be of significance from the point of view of both structural diversity
and specific recognition. © 2000 Academic Press
Key Words: NMR structure; G-quadruplex; C-tetrad;
GGGCGG repeat.
Multistranded DNA structures have assumed great importance in recent
years with the realization that they play important roles in DNA
recombination, replication, disease control, etc. on the one hand and
DNA packaging inside a living cell on the other (1, 2). Among these,
the quadruplex structures formed by many repeat sequences containing
stretches of Gs have exhibited great variety and dependence on
experimental conditions, especially the cations (3–6). They have
created new paradigms and underscored the possibility of hitherto
unrecognized roles for DNA. The variability in the quadruplex
structures is so large that, given a repeat sequence and the
experimental conditions, it is hardly possible to predict the
characteristics of the structure. Thus investigations on different
sequences under different conditions have led to the discoveries of
many new structural motifs such as the G:C:G:C tetrad (7, 8), U-tetrad
(9), A-tetrad (10, 11), and T-tetrad (12). We report here the
observation of yet another new motif, namely, the C-tetrad, which we
discovered in the sequence d-TGGGCGGT. The GGGCGG sequence contained in
this DNA is biologically significant for several reasons: (i) it is a
repeat sequence in simian virus 40 (SV40), playing important roles in
viral encapsidation (13, 14); (ii) it is a target for many anticancer
drugs (13); (iii) it is the recognition sequence of the SP1
transcription factor (15); and (iv) it is a very common sequence in
CpG islands in vertebrate genomes (16).
MATERIALS AND METHODS
DNA samples. The oligonucleotide was synthesized on an Applied
Biosystems 392 automated DNA synthesizer on a 10 µM scale using
solid-phase β-cyanoethyl phosphoramidite chemistry, cleaved from the
support, and purified by standard procedures (17, 18). The NMR sample
was prepared at a monomer strand concentration of 1–2 mM in 0.6 ml
(90% H₂O/10% D₂O) containing 10 mM sodium phosphate, 0.2 mM EDTA,
pH 7.0, and 200 mM NaCl. For experiments in D₂O, the same sample was
repeatedly lyophilized from D₂O.
NMR data acquisition and processing. NMR data were obtained on a
VARIAN UNITY-plus 600 spectrometer. Temperature-dependent
one-dimensional spectra (−5 to 50°C) and NOESY spectra in H₂O were
recorded using the jump-and-return pulse sequence (19) for H₂O
suppression. Phase-sensitive NOESY (20) and TOCSY (21) spectra in D₂O
were recorded with mixing times of 80, 100, 200, and 300 ms for NOESY
and 60 and 20 ms for TOCSY. A DQF-COSY spectrum was recorded for
coupling constant estimation. In all the 2D experiments, the time
domain data consisted of 2048 complex points in t₂ and 400–600 FIDs
(free induction decay signals) in the t₁ dimension. The VARIAN data
were processed using VNMR, Felix-230, and Felix-97 software on an IRIS
workstation. The data were apodized by shifted (60–90°) sine bell
functions prior to 2D Fourier transformation.
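The apodization step can be illustrated with a minimal sketch. It assumes the common convention in which a sine bell shifted by an angle φ runs in phase from φ at the first point to 180° at the last point; the exact window definition used by Felix is not stated in the text, and the function names `shifted_sine_bell` and `apodize` are hypothetical. Quadrature handling in t₁ and the Fourier transformation itself are omitted.

```python
import math


def shifted_sine_bell(n_points, shift_deg):
    """Sine-bell window whose phase runs from shift_deg to 180 degrees.

    Assumed convention (not taken from the paper):
        w[k] = sin(phi + (pi - phi) * k / (n_points - 1))
    A 90 degree shift gives a cosine-like window that starts at 1 and
    decays to 0; a 60 degree shift starts at sin(60 deg), rises to 1,
    then decays to 0.
    """
    if n_points < 2:
        raise ValueError("a window needs at least two points")
    phi = math.radians(shift_deg)
    step = (math.pi - phi) / (n_points - 1)
    return [math.sin(phi + step * k) for k in range(n_points)]


def apodize(fid, shift_deg):
    """Multiply one FID (a sequence of complex points) by the window."""
    window = shifted_sine_bell(len(fid), shift_deg)
    return [point * weight for point, weight in zip(fid, window)]


if __name__ == "__main__":
    # Synthetic decaying signal with 2048 complex points, matching the
    # t2 size quoted in the text; the frequency and decay are arbitrary.
    fid = [complex(math.cos(0.05 * k), math.sin(0.05 * k)) * math.exp(-k / 500.0)
           for k in range(2048)]
    weighted = apodize(fid, 60.0)
    print(len(weighted), abs(weighted[0]), abs(weighted[-1]))
```

In a full 2D workflow the same window would be applied along t₂ for every FID and then along t₁ across the 400–600 increments before each Fourier transformation.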
Experimental restraints. The cross-peaks in the NOESY spectra in D₂O
were integrated, and the intensities in a low mixing time NOESY were
translated into interproton distances under the initial rate
approximation, with the CH5–CH6 cross-peak intensity as the reference
(2.46 Å). These cross-peaks in the different spectra were then
classified as strong, medium, and weak according to their relative
intensities, and the interproton distances were restrained with upper
and lower bounds of 0.2, 0.5, and 1.0 Å from their calculated
distances, respectively. The narrow bounds were mostly on strong
intranucleotide cross-peaks, for which the possible distance ranges
are small and known. For the sequential internucleotide NOEs, loose
Abbreviations used: DNA, deoxyribonucleic acid; NMR, nuclear magnetic
resonance; NOESY, nuclear Overhauser enhancement spectroscopy; TOCSY,
total correlation spectroscopy; DQF-COSY, double quantum filtered
correlation spectroscopy; IRMA, iterative relaxation matrix analysis;
MD, molecular dynamics.
¹ To whom correspondence should be addressed. Fax: 091-22-215 2110.
E-mail: hosur@tifr.res.in.
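The distance calibration described under Experimental restraints can be sketched as follows. Under the initial rate approximation the NOE cross-peak intensity scales as r⁻⁶, so an unknown distance follows from the reference distance and the intensity ratio. The 2.46 Å reference and the 0.2, 0.5, and 1.0 Å bound widths are taken from the text. The function names, and the choice to pass the strong/medium/weak class in by hand, are assumptions, since the intensity thresholds for the classes are not given.

```python
# Reference distance for the cytosine H5-H6 pair, as quoted in the text (in Å).
REFERENCE_DISTANCE = 2.46

# Bound widths (in Å) applied around the calculated distance for each class,
# following the "respectively" ordering in the text.
BOUND_WIDTH = {"strong": 0.2, "medium": 0.5, "weak": 1.0}


def noe_distance(intensity, ref_intensity, ref_distance=REFERENCE_DISTANCE):
    """Distance from the r**-6 scaling of NOE intensity.

    r = r_ref * (I_ref / I) ** (1/6)
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


def distance_restraint(intensity, ref_intensity, strength):
    """Return (lower, upper) bounds in Å for one cross-peak.

    `strength` is one of "strong", "medium", or "weak" and is supplied by
    the caller, because the classification thresholds are not stated.
    """
    distance = noe_distance(intensity, ref_intensity)
    width = BOUND_WIDTH[strength]
    return distance - width, distance + width


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary units, for illustration only.
    lower, upper = distance_restraint(intensity=250.0, ref_intensity=1000.0,
                                      strength="medium")
    print(f"restraint: {lower:.2f} to {upper:.2f} Å")
```

Because of the sixth-root dependence, even a large error in a measured intensity changes the calculated distance only modestly, which is one reason bounds of a few tenths of an ångström are workable.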
Biochemical and Biophysical Research Communications 270, 967–971 (2000)
doi:10.1006/bbrc.2000.2479, available online at http://www.idealibrary.com on
Copyright © 2000 by Academic Press