NATURE BIOTECHNOLOGY VOLUME 25 NUMBER 3 MARCH 2007 297
Sequence-activity relationships guide directed evolution
Joelle N Pelletier & Robert Lortie
A new method for in vitro evolution integrates computational analysis and experimental screening.
Directed evolution to improve the properties of proteins faces a problem of numbers: for an average-sized protein, the sequence space to be explored is astronomically large, whereas practical screening capacity is limited to thousands or millions of variants. Virtual screening in silico can analyze a much higher fraction of the total number of variants and has considerable potential, but is not yet widely implemented. In this issue, Fox et al.¹ present an alternative: a combined computational and experimental method in which sequence space is sifted along multiple paths marked by experimental data points. Their approach, which incorporates a protein sequence-activity relationship (ProSAR) algorithm, promises to become a useful general strategy for evolving proteins with novel properties.
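To make the "problem of numbers" concrete, the short sketch below counts possible variants. It assumes a hypothetical 300-residue protein built from the 20 standard amino acids; the length and the function names are illustrative choices, not figures from the article.

```python
import math

AMINO_ACIDS = 20   # standard proteinogenic amino acids
LENGTH = 300       # hypothetical protein length, chosen only for illustration


def sequence_space(length):
    """Number of distinct sequences of the given length."""
    return AMINO_ACIDS ** length


def variants_with_k_mutations(length, k):
    """Number of variants that differ from the parent at exactly k positions."""
    # Choose which k positions change, then one of 19 alternatives at each.
    return math.comb(length, k) * (AMINO_ACIDS - 1) ** k


total = sequence_space(LENGTH)
print(len(str(total)))                        # 391 decimal digits, i.e. about 10^390
print(variants_with_k_mutations(LENGTH, 1))   # 5700 single mutants
print(variants_with_k_mutations(LENGTH, 2))   # 16190850 double mutants
```

Under these assumptions, even the set of all double mutants already exceeds a screening capacity of millions of variants, which is why a method for choosing where to look matters.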
The authors apply ProSAR to address the challenging synthesis of the (3R,5S)-dihydroxyheptanoate side chain of atorvastatin, the active ingredient in the blockbuster cholesterol-lowering drug Lipitor. Their starting point, a halohydrin dehalogenase from Agrobacterium radiobacter, catalyzes the cyanation of ethyl (S)-4-chloro-3-hydroxybutyrate into the synthetic intermediate ethyl (R)-4-cyano-3-hydroxybutyrate, but with an efficiency under process conditions that is several thousand-fold too low to be economically viable. The authors' investment in 18 rounds of directed evolution, each providing a modest improvement over the previous one, paid off handsomely: they achieved
Joelle N. Pelletier is at the Département de chimie and Département de biochimie, Université de Montréal, C.P. 6128, Succursale Centre-Ville, Montréal, Québec, Canada, H3C 3J7, and Robert Lortie is at the Biotechnology Research Institute, National Research Council, 6100 Royalmount Avenue, Montréal, Québec, Canada, H4P 2R2, and Département de chimie, Université de Montréal.
e-mail: joelle.pelletier@umontreal.ca or robert.lortie@cnrc-nrc.gc.ca
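[Figure 1 graphic. Panel a: schematic in which mutated sequences and their measured performance are analyzed by partial least-squares regression to postulate each mutation's effect on performance (include mutation 1, discard mutations 2 and 4, retest mutation 3); the retained mutations are combined with new mutations on a new backbone. Panel b: plot of relative productivity (axis 0 to 4,000) against generations (axis 0 to 15), with the target productivity marked.]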
Figure 1 Mathematical analysis bolsters directed evolution. (a) Partial least-squares regression is used
to predict the effects of individual mutations on the performance of enzyme variants, each of which
carries multiple mutations. Deleterious mutations are discarded, and favorable mutations are included
for the next round of screening. Whenever the effect of a given mutation is uncertain, it is retested in
a subsequent round. (b) The rewards of a small increase in enzyme productivity (~1.5-fold) per round
become evident in the later rounds.
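The two panels described in the caption can be illustrated with a small sketch. It is not the ProSAR implementation of Fox et al.: it assumes a purely additive model, fits partial least-squares regression with a single latent component, uses invented toy data for eight variants carrying four mutations, and applies a hypothetical threshold of 0.1 as a stand-in for a proper uncertainty estimate. The last lines compound a 1.5-fold gain over the 18 rounds mentioned in the text.

```python
import math


def center(values):
    """Subtract the mean so that the values average to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pls1_one_component(X, y):
    """PLS regression with one response and one latent component.

    Returns one coefficient per column of X (here, per mutation).
    """
    n, m = len(X), len(X[0])
    cols = [center([row[j] for row in X]) for j in range(m)]
    yc = center(y)
    # Weight vector: covariance of each mutation column with performance.
    w = [sum(c * v for c, v in zip(col, yc)) for col in cols]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each variant onto the weight vector.
    t = [sum(cols[j][i] * w[j] for j in range(m)) for i in range(n)]
    # Regress performance on the scores, then map back to the mutations.
    q = sum(a * b for a, b in zip(t, yc)) / sum(a * a for a in t)
    return [wj * q for wj in w]


def classify(coefs, threshold):
    """Label each mutation (numbered from 1) as include, discard or retest."""
    decisions = {}
    for number, coef in enumerate(coefs, start=1):
        if coef > threshold:
            decisions[number] = "include"
        elif coef < -threshold:
            decisions[number] = "discard"
        else:
            decisions[number] = "retest"  # effect too small to call
    return decisions


# Invented toy data: rows are variants, columns are mutations 1-4 (1 = present).
X = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]
# Invented performance values, generated from additive effects
# of +0.4, -0.3, +0.05 and -0.2 on a baseline of 1.0.
y = [1.0, 0.85, 0.5, 0.75, 1.2, 1.45, 1.1, 0.95]

coefs = pls1_one_component(X, y)
decisions = classify(coefs, threshold=0.1)
print(decisions)  # {1: 'include', 2: 'discard', 3: 'retest', 4: 'discard'}


def cumulative_gain(fold_per_round, rounds):
    """Overall improvement if every round multiplies productivity equally."""
    return fold_per_round ** rounds


print(round(cumulative_gain(1.5, 18)))  # 1478
```

Because the toy design is balanced, the single-component fit recovers the additive effects exactly; real screening data are noisy and unbalanced, so more components and a statistical criterion for "uncertain" would be needed. The final line shows that a steady 1.5-fold gain per round would compound to roughly 1,500-fold over 18 rounds.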
NEWS AND VIEWS
© 2007 Nature Publishing Group http://www.nature.com/naturebiotechnology