928 VOLUME 33 NUMBER 9 SEPTEMBER 2015 NATURE BIOTECHNOLOGY
Eric Westhof is at Architecture & Reactivity
of RNA, University of Strasbourg, Institute of
Molecular and Cellular Biology of the CNRS,
Strasbourg, France.
e-mail: e.westhof@ibmc-cnrs.unistra.fr
approaches allow experimental probing of
accessible RNA regions and determination of
RNA structure (essentially at the secondary-
structure level) on a transcriptome-wide scale
(refs. 4,5).
However, although they identify nucleotides
or regions that form helical domains or are
engaged in some sort of base pairing, they
cannot identify the paired strand or nucleotide.
This is why additional data—such as phyloge-
netic analysis, free-energy minimizations and
computational modeling—are needed. If all the
proximity and pairing information in a given
RNA structure were available at sufficiently
high resolution, in principle the structure
could then be determined by computational
methods.
The approach of Ramani et al. (ref. 1), called
RNA proximity ligation (RPL), is a first
step toward determining the missing three-
dimensional connectivity. The authors use
deep sequencing and proximity ligation to
obtain information on the spatial proximity
of nucleotides in a complex mixture of nonde-
natured RNA transcripts extracted from yeast
cells stripped of their cell wall or from cultured
human cells (Fig. 1).
In the first step, RNases (endogenous RNases
for yeast cells, exogenous RNases for human
cells) are allowed to cleave RNA. Next, exog-
enous T4 RNA ligase I is added to randomly
ligate free RNA ends, and the chimeric mol-
ecules resulting from ligation are submitted
for deep sequencing. The sequencing results
are compared to the known primary structure
of an RNA of interest. The authors find that
the vast majority of ligations are intra- rather
than inter-molecular. The data are noisy, with
random ligations seen between termini that are
close or far in sequence and in space. To correct
for PCR errors, ligations between termini closer
than 10 bases are not counted. Ligations that
restore the original sequence are not detected.
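As a minimal sketch of the counting rules in this passage, the Python below drops ligation junctions whose termini lie closer than 10 bases and then pools the survivors into 21-nucleotide windows, the window size the authors use to improve signal-to-noise. The representation of a junction as a pair of 0-based positions, the function names and the use of non-overlapping bins are assumptions made for illustration; the published analysis may use sliding windows and additional normalization.

```python
from collections import Counter

MIN_SEPARATION = 10  # from the text: termini closer than 10 bases are not counted
WINDOW = 21          # from the text: 21-nucleotide windows


def filter_junctions(junctions, min_separation=MIN_SEPARATION):
    """Keep (i, j) junctions whose termini are at least min_separation apart.

    Each junction is a hypothetical pair of 0-based positions in the
    reference sequence of the RNA of interest.
    """
    return [(i, j) for i, j in junctions if abs(i - j) >= min_separation]


def window_counts(junctions, window=WINDOW):
    """Pool junctions into window-by-window bins (non-overlapping, an assumption).

    Returns a Counter keyed by (bin_a, bin_b) with bin_a <= bin_b, so that
    a ligation is counted the same way regardless of its orientation.
    """
    counts = Counter()
    for i, j in junctions:
        a, b = sorted((i // window, j // window))
        counts[(a, b)] += 1
    return counts


if __name__ == "__main__":
    raw = [(5, 12), (3, 60), (10, 55), (100, 20)]
    kept = filter_junctions(raw)
    for bins, n in sorted(window_counts(kept).items()):
        print(bins, n)
```

Pooling by window trades positional resolution for signal: individual junctions are noisy, but pairs of windows that are ligated together repeatedly stand out from the background.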
Although the ligation data are noisy, the
authors can improve the signal-to-noise ratio
by analyzing the data with 21-nucleotide-long
windows, which reveals an enrichment of
Our knowledge of RNA structure derives from
studies using X-ray crystallography and nuclear
magnetic resonance spectroscopy. However,
many RNAs and RNA-protein complexes are not
amenable to these time-consuming methods for
a variety of reasons, such as the need for
large quantities of starting RNA or for prior
knowledge of the solubility and crystallization
conditions that preserve molecular integrity.
This has led to the development of alternative
approaches for inferring RNA structure.
Phylogenetic analysis, which looks for
patterns of nucleotide co-variation across
conserved RNA sequences, is very useful
for predicting RNA secondary structure.
Homologous sequences are expected to yield
similar folds and maintain the same number
and lengths of core helices. However, this
method requires sufficient sequence variation
data and careful sequence alignments.
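One common way to quantify nucleotide co-variation between two alignment columns is mutual information. The sketch below is illustrative only: the function name and the toy alignment are assumptions, and real analyses typically add corrections for phylogenetic relatedness and small sample sizes that are omitted here.

```python
import math
from collections import Counter


def mutual_information(alignment, col_i, col_j):
    """Mutual information (in bits) between two columns of an alignment.

    `alignment` is a list of equal-length aligned sequences. High values
    mean that the identity of the base in one column predicts the base in
    the other, as expected for positions that pair with each other.
    """
    n = len(alignment)
    pairs = [(seq[col_i], seq[col_j]) for seq in alignment]
    freq_i = Counter(a for a, _ in pairs)
    freq_j = Counter(b for _, b in pairs)
    freq_ij = Counter(pairs)

    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Toy alignment: columns 0 and 4 co-vary; columns 1 to 3 are invariant.
    toy = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
    print(mutual_information(toy, 0, 4))
    print(mutual_information(toy, 1, 2))
```

The toy alignment also illustrates the limitation noted above: invariant columns carry no co-variation signal, so the method needs enough sequence variation to be informative.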
Another type of in silico approach relies
on experimentally derived energies of base-
paired stacks to compute the secondary struc-
tures of a given RNA sequence with the lowest
free energy and maximum number of base
pairs. Although these methods are constantly
improving, they are mathematically complex
and can yield several potential solutions with
accuracies of 75–80% (ref. 3), making the true
structure difficult to determine.
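To give a flavor of how such calculations work, the sketch below implements a Nussinov-style dynamic program that maximizes the number of base pairs in a sequence. It is a deliberate simplification and not the free-energy minimization described above: it ignores stacking energies entirely, and the allowed pairs (including G-U wobble), the minimum hairpin loop of three nucleotides and the function name are assumptions chosen for illustration.

```python
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in `seq` (Nussinov-style recursion).

    `min_loop` is the minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j], inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            # case 2: j pairs with some k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAACCC"))
```

Even this toy version shows why several solutions can arise: many different pairings may reach the same optimal score, and nothing in the score alone says which one the molecule adopts.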
Experimental methods involving chemi-
cal and enzymatic probes have been used to
determine the accessibility and dynamics of
RNA structures by measuring the susceptibil-
ity of specific nucleotides to modification or
cleavage. The chemicals are chosen according
to their reactivity with specific atoms on the
base or sugar-phosphate backbone, whereas
the endonucleases cleave folded RNA mole-
cules with a preference for unpaired or paired
regions. The resulting data can be used to con-
strain the number of RNA secondary structures
predicted by computational methods.
In recent years, chemical and enzymatic
probing methods have been adapted for use
with deep sequencing technologies. These
Solving the three-dimensional structure of an
RNA molecule means laborious study by X-ray
crystallography or nuclear magnetic resonance
spectroscopy. However, faster complementary
methods are on the horizon, many involving
deep sequencing and sophisticated computational
analysis. In this issue, Ramani et al. (ref. 1)
use deep sequencing and proximity ligation
to identify nucleotide regions that interact in
folded RNA molecules. The method provides
an entirely new source of information on intra-
molecular RNA interactions that, with further
improvement, may enable accurate prediction
of RNA structure.
Single-stranded RNA molecules have a
strong tendency to fold back on themselves,
locally and globally, creating complex spatial
architectures. Folding relies on stacking and
hydrogen bonding between nucleobases. All base-base
interactions that involve at least two ‘stan-
dard’ hydrogen bonds can be classified into
12 families. Each family is a 4 × 4 matrix of
the four RNA bases, U, C, A and G (ref. 2).
The common Watson-Crick pairs belong to one of these
families, and the other 11 families are made up
of non-Watson-Crick pairs.
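The count of 12 can be reproduced with a short enumeration. This sketch assumes the passage refers to the Leontis-Westhof nomenclature, in which each base presents three interacting edges (Watson-Crick, Hoogsteen and Sugar) and a pair forms in either a cis or a trans orientation; that detail is not spelled out in the text above, and the variable and function names are illustrative.

```python
from itertools import combinations_with_replacement, product

EDGES = ("Watson-Crick", "Hoogsteen", "Sugar")  # assumed edge names
ORIENTATIONS = ("cis", "trans")
BASES = ("A", "C", "G", "U")

# Six unordered edge combinations times two orientations gives 12 families.
FAMILIES = [
    (orientation, edge_1, edge_2)
    for orientation, (edge_1, edge_2) in product(
        ORIENTATIONS, combinations_with_replacement(EDGES, 2)
    )
]


def empty_family_matrix():
    """A 4 x 4 table of base combinations for one family, left unfilled."""
    return {(b1, b2): None for b1, b2 in product(BASES, repeat=2)}


if __name__ == "__main__":
    for family in FAMILIES:
        print(" ".join(family))
    print(len(FAMILIES), "families,", len(empty_family_matrix()), "cells each")
```

Under this scheme the familiar Watson-Crick pairs correspond to the cis Watson-Crick/Watson-Crick family, and the other 11 combinations are the non-Watson-Crick families.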
Watson-Crick pairs form the double-
stranded hairpins of RNA secondary struc-
ture. The remaining families are involved in
the formation of RNA modules—the building
blocks of tertiary structure—and long-range
intramolecular contacts. RNA architecture
can therefore be viewed as the hierarchical
assembly of preformed hairpins defined by
Watson-Crick base pairs and RNA modules
maintained by non-Watson-Crick base pairs.
Computational approaches to solving RNA
structure often follow this model, determin-
ing secondary structure before building up
tertiary structure.
RNA structure from deep sequencing
Eric Westhof
A method to identify interacting regions in a folded RNA is a step toward solving RNA structures from sequencing data.
NEWS AND VIEWS
© 2015 Nature America, Inc. All rights reserved.