Considerable progress has been made over the past
year in linking experimental and theoretical approaches to
protein folding. Recent results from several independent lines
of investigation suggest that protein folding mechanisms and
landscapes are largely determined by the topology of the
native state and are relatively insensitive to details of the
interatomic interactions. This dependence on low-resolution
structural features, rather than high-resolution detail, suggests
that it should be possible to describe the fundamental physics
of the folding process using relatively low-resolution models.
Recent experiments have set benchmarks for testing new
models and progress has been made in developing theoretical
models for interpreting and predicting experimental results.
Addresses
Department of Biochemistry, Box 357350, University of Washington,
Seattle, WA 98195, USA
*e-mail: dabaker@u.washington.edu
Current Opinion in Structural Biology 1999, 9:189–196
http://biomednet.com/elecref/0959440X00900189
© Elsevier Science Ltd ISSN 0959-440X
Abbreviations
CI-2 chymotrypsin inhibitor II
CspB Bacillus cold-shock protein
DC diffusion collision
HP hydrophobic–polar
SH Src homology
Introduction
An exciting feature of research into protein folding is that
experimental and theoretical approaches can be closely
integrated. There can be considerable synergy between
the two: new experimental results can drive new theoretical
developments, whereas the failures of current
theories highlight areas demanding further experimental
investigation. In this review, we focus on recent advances
at the interface between theory and experiment in protein
folding, with an emphasis on small, two-state folding
proteins because of their simplicity.
Is the folding process extensively optimized by natural selection?
Before considering models for folding based on simple
physical principles, it is important to determine whether
the folding process has been extensively optimized by
natural selection; if it has, then a workable theory must
take into account the features that have been optimized.
Clearly, protein stability and function are the result of
extensive evolutionary optimization. Very few of the
enormous number of possible sequences actually fold to
a unique three-dimensional structure and, of these, perhaps
an even smaller fraction carry out a useful
biological function.
Are folding rates and mechanisms also subject to evolutionary
optimization? This question can be investigated
by comparing the folding rates of protein sequences generated
in the laboratory with those of naturally occurring
sequences. Phage display selection for properly folded
proteins was used to retrieve heavily mutated sequences
of two small protein domains that retain the ability to
fold from high complexity libraries [1••,2••]. Although
the stabilities of all the variants were less than those of
the parent wild-type proteins, their folding rates fluctuated
around those of the native proteins, with roughly
half of the variants folding faster and half folding slower
than the wild-type proteins. A highly simplified SH3
domain variant, in which the vast majority of the
residues outside the binding pocket were replaced by
one of five amino acids (isoleucine, lysine, glutamic acid,
glycine or alanine), folded several times faster than the
wild-type SH3 domain. These data suggest that folding
rates are not extensively optimized by natural selection,
as proteins selected in the laboratory without regard to
the folding rate fold as fast as or faster than naturally
occurring proteins.
Native state topology is a critical determinant of folding rates and mechanisms
The phage display studies suggest that folding rates are
not strongly dependent on the protein sequence, as large
sequence perturbations have relatively small effects on
folding rates. A comparison of the folding rates of homologous
proteins has led to similar conclusions. For example,
despite large differences in the stabilities of mesophilic,
thermophilic and hyperthermophilic members of the
Bacillus cold-shock protein (CspB) family, the folding rates
and solvent accessibilities of the folding transition states
are virtually identical for the three proteins [3••]. The
interactions that are responsible for the differences in stability
thus appear to be made after the folding transition
state. Similar folding rates and mechanisms are also
observed across the SH3 protein family [4–6].
A higher resolution comparison of folding mechanisms of
homologous proteins has been made possible by the characterization
of the effects of point mutations on the
folding rates of the Src and spectrin SH3 domains, which
have only 30% sequence identity. Despite their very different
sequences, the available data suggest that their
folding transition states are similar [7••,8••]. Together
with the results from the homologous protein sets and the
phage display libraries described in previous paragraphs,
these data suggest that topology is a critical determinant
of folding rate and mechanism.
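
To make the "30% sequence identity" figure concrete, the following is a minimal sketch of how percent identity can be computed from a pairwise alignment. It is an illustration only, not the procedure used in the cited studies: the function name, the gap character and the toy sequences are all hypothetical, and the choice to count only ungapped columns in the denominator is an assumption (other conventions divide by the alignment length or by the shorter sequence length).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: the two strings are already aligned (equal length, gaps marked
    with `gap`). The denominator convention used here is one of several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0  # columns with a residue in both sequences
    matches = 0   # compared columns with identical residues
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment; these are NOT real SH3 domain sequences.
toy_a = "ALYDYEAR-T"
toy_b = "ALVDFQGKST"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Two domains at roughly 30% identity by such a measure differ at most aligned positions, which is why similar transition states in the Src and spectrin SH3 domains point toward topology rather than sequence detail.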
The preceding studies demonstrate that native state topology
is an important determinant of the folding mechanism,
Matching theory and experiment in protein folding
Eric Alm and David Baker*