PROTEINS: STRUCTURE • FUNCTION • BIOINFORMATICS

Computing van der Waals energies in the context of the rotamer approximation

Gevorg Grigoryan,1 Alejandro Ochoa,2 and Amy E. Keating1*

1 Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
2 Lewis Thomas Lab, Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544

INTRODUCTION

It has long been known to chemists that molecules tend to adopt staggered, rather than eclipsed, dihedral conformations.1 When the first few crystal structures of proteins were solved, it became apparent that the same is true for amino-acid side chains.2 Side-chain χ angles do not vary over all possible values, but rather cluster in tight distributions around conformations called rotamers (rotational isomers). Beginning in the 1970s, rotamer libraries were compiled to represent side-chain conformations observed in proteins of known structure.2–4 Ponder and Richards developed the first complete rotamer library by examining 19 high-resolution crystal structures,5 and many variants of this work based on larger structural datasets have been published since then6–8 (reviewed by Dunbrack9). The differences between most rotamer libraries lie in their size (number of rotamers per amino acid), the procedure used for discarding potentially bad experimental data, and whether or not rotamers are defined as a function of backbone conformation. The rotamer libraries developed by Dunbrack and Cohen10 and by Lovell et al.11 are among the most commonly used today.

Most protein side chains adopt conformations very close to library rotamers, a concept referred to as "rotamericity." Schrauber et al. have shown that although significant outliers from rotameric conformations do exist,

ABSTRACT

The rotamer approximation states that protein side-chain conformations can be described well using a finite set of rotational isomers.
This approximation is often applied in the context of computational protein design and structure prediction to reduce the complexity of structural sampling. It is an effective way of reducing the structure space to the most relevant conformations. However, the appropriateness of rotamers for sampling structure space does not imply that a rotamer-based energy landscape preserves any of the properties of the true continuous energy landscape. Specifically, because the energy of a van der Waals interaction can be very sensitive to small changes in atomic separation, meaningful van der Waals energies are particularly difficult to calculate from rotamer-based structures. This presents a problem for computational protein design, where the total energy of a given structure is often represented as a sum of precalculated rigid rotamer self and pair contributions. A common way of addressing this issue is to modify the van der Waals function to reduce its sensitivity to atomic position, but excessive modification may result in a strongly nonphysical potential. Although many different van der Waals modifications have been used in protein design, little is known about which performs best, and why. In this paper, we study 10 ways of computing van der Waals energies under the rotamer approximation, representing four general classes, and compare their performance using a variety of metrics relevant to protein design and native-sequence repacking calculations. Scaling van der Waals radii by anywhere from 85 to 95% gives the best performance. Linearizing and capping the repulsive portion of the potential can give additional improvement, which comes primarily from getting rid of unrealistically large clash energies. On the other hand, continuously minimizing individual rotamer pairs prior to evaluating their interaction works acceptably in native-sequence repacking, but fails in protein design.
Additionally, we show that the problem of predicting relevant van der Waals energies from rotamer-based structures is strongly nonpairwise decomposable and hence further modifications of the potential are unlikely to give significant improvement.

Proteins 2007; 68:863–878. © 2007 Wiley-Liss, Inc.

Key words: van der Waals energy; rotamer approximation; protein design; pairwise energy functions; discrete structural sampling.

Abbreviations: AAD, average absolute deviation; Db02, Dunbrack rotamer library from 2002; Db99, Dunbrack rotamer library from 1999; L-J, Lennard-Jones; LR^A_90, linearly repulsive van der Waals using 90% radii with all non-bonded terms capped; LR_90, linearly repulsive van der Waals with 90% radii; MAD, median absolute deviation; NCE, neighborhood conformational energy; PRM, pairwise rotamer minimization; R60-95, modifications in which van der Waals radii are scaled by 60 to 95%; RCE, rotameric conformational energy; RR00, Richardson and Richardson penultimate rotamer library; RRexp, Richardson and Richardson penultimate library with expanded aromatics; RRX1, Richardson and Richardson penultimate rotamer library with expanded χ_1; vdW, van der Waals.

Grant sponsor: National Institutes of Health; Grant number: GM67681; Grant sponsor: National Science Foundation; Grant number: 0216437.

*Correspondence to: Amy E. Keating, Department of Biology, Massachusetts Institute of Technology, Room 68-622A, Cambridge, MA 02139. E-mail: keating@mit.edu

Received 16 October 2006; Accepted 20 January 2007

Published online 6 June 2007 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.21470
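Two of the modification classes named in the abstract, scaled van der Waals radii and a linearized, capped repulsive wall, can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: it uses a generic 12-6 Lennard-Jones form, places the switch to the linear regime at the zero crossing of the potential, and uses arbitrary parameter values. The function names, the switch point, and the cap value are hypothetical and are not taken from the paper's methods.

```python
def lj(r, r_min, eps):
    """Generic 12-6 Lennard-Jones energy with a minimum of -eps at r == r_min."""
    x = (r_min / r) ** 6
    return eps * (x * x - 2.0 * x)


def lj_slope(r, r_min, eps):
    """Analytical derivative dE/dr of lj()."""
    x = (r_min / r) ** 6
    return -12.0 * eps * (x * x - x) / r


def scaled_lj(r, r_min, eps, scale=0.9):
    """Radius scaling: shrink the minimum-energy distance by `scale` (0.9 = 90% radii)."""
    return lj(r, scale * r_min, eps)


def linear_capped_lj(r, r_min, eps, scale=0.9, cap=1.0):
    """Scaled-radius potential with a linearized, capped repulsive wall.

    ASSUMPTION: the linear regime starts at the zero crossing of the scaled
    potential and continues along the tangent line there; `cap` is an
    arbitrary illustrative upper bound on the energy.
    """
    rm = scale * r_min
    r_switch = rm * 2.0 ** (-1.0 / 6.0)  # distance at which the 12-6 energy is zero
    if r >= r_switch:
        energy = lj(r, rm, eps)
    else:
        # Tangent-line extrapolation of the repulsive wall below r_switch.
        energy = lj(r_switch, rm, eps) + lj_slope(r_switch, rm, eps) * (r - r_switch)
    return min(energy, cap)


if __name__ == "__main__":
    r_min, eps = 4.0, 0.1  # illustrative values only (distance in angstroms assumed)
    for r in (2.8, 3.0, 3.2, 3.6, 4.0):
        print(r, lj(r, r_min, eps), scaled_lj(r, r_min, eps),
              linear_capped_lj(r, r_min, eps))
```

With these illustrative parameters, moving two atoms from 3.0 to 3.2 Å apart changes the unmodified energy by more than a factor of two, which is the kind of sensitivity to small displacements that the abstract describes. Scaling the radii lowers the clash energy at the same separation, and the linearized, capped form bounds it from above.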