Protein folds

Robert L. Jernigan

National Institutes of Health, Bethesda, Maryland, USA

Important recent theoretical studies of proteins have aimed to provide fewer details and more global descriptions of protein folds. Several newer theoretical approaches attempt to achieve correct overall folds at less than atomic detail. Atomic structures based on these fold skeletons can presumably be completed, as has been adequately demonstrated for Cα coordinates. A rapid evaluation of large numbers of low-resolution fold patterns is possible by using pairwise hydrophobicities. Such lower-resolution methods are needed to model larger biological assemblies. Experimental screening of sequences by cassette mutagenesis or other similar approaches can permit a rapid cataloging of viable sequences. Also, the screening of sequences by calculation against a library of known structures has been demonstrated to be an important general method. To supplement an incomplete library of structures, however, large numbers of folded conformations could be generated by computer.

Current Opinion in Structural Biology 1992, 2:248-256

Introduction

The role of theory and calculation in protein folding and structural studies now spans a wide range of approaches for diverse purposes, from averaging and mining the structural data to more abstract considerations such as the evolution of function and structure. Because of the growth in the number of protein structures, it is always useful to collate and collect new averages and distributions of their structural properties, such as was reported in [1]. Recently, interesting reviews were published on folding intermediates [2], molten globules [3] and folding patterns [4]. Chan and Dill [5] reviewed polymer collapse models and their relationship to folded proteins. Recently, Allegra et al. [6] have used a worm-like chain model to look at the collapsed state. In work with a similar flavor, Bairamov et al.
[7] compared the locations of turns in proteins with their positions in a uniformly segmented chain. Alonso and Dill [8] developed a theory of solvophobic interactions opposing conformational entropy. They used this simple approach to account for the dependence of folding free energies on denaturant concentration for a series of mutants of staphylococcal nuclease.

Interesting new approaches have been developed to deal with a wide range of protein folds. The bases for evaluating overall folds are often non-atomic potentials of mean force that have been derived from known structures. In addition, new ways of reducing the number of points for defining a fold have emerged. Furthermore, a new, direct way of comparing unknown sequences against known structures in order to determine the most likely fold is now available. This method does depend, however, on having a complete set of folded protein conformations. As an alternative, it may become possible, by using fast computers, to generate large numbers of protein folds against which to test unknown sequences.

Potentials of mean force for testing protein folds

Early on, Tanaka and Scheraga [9] counted long-range, nonbonded, physically close residue pairs. Miyazawa and Jernigan [10] repeated this in a more sophisticated way for a larger sample, also deriving residue-residue pair potentials. More recently, Gregoret and Cohen [11] also derived similar pair potentials, but with a larger sample. Comparisons of these different pairwise hydrophobicities reveal that the values of Tanaka and Scheraga are similar to those of Miyazawa and Jernigan for taking residues from solvent interaction to interaction with another specific residue, whereas the values of Gregoret and Cohen correspond closely to those of Miyazawa and Jernigan for replacing an interaction with an average residue with another specific residue (RL Jernigan, unpublished data).
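
As a rough illustration of how counts of close residue pairs can be turned into pair potentials and then used to score a low-resolution fold, the following Python sketch counts sequence-distant contacts from one point per residue and converts the counts to log-odds scores. It is a simplified log-odds estimate, not the procedure of any of the studies cited above; the function names, the 6.5 Å cutoff, the minimum sequence separation and the pseudocount are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def count_contacts(sequence, coords, cutoff=6.5, min_separation=3):
    """Count residue-type pairs that are close in space but distant in sequence.

    sequence: one-letter residue codes; coords: one (x, y, z) point per residue.
    cutoff (angstroms) and min_separation are illustrative assumptions.
    """
    counts = Counter()
    for i, j in combinations(range(len(sequence)), 2):
        if j - i < min_separation:
            continue  # skip pairs that are near neighbours along the chain
        if math.dist(coords[i], coords[j]) <= cutoff:
            pair = tuple(sorted((sequence[i], sequence[j])))
            counts[pair] += 1
    return counts


def log_odds_potential(contact_counts, composition, pseudocount=1.0):
    """Return e(a, b) = -ln(observed / expected) for every residue-type pair.

    The expected count assumes random pairing according to composition.
    The pseudocount (an assumption) keeps the logarithm finite for unseen pairs.
    """
    total_contacts = sum(contact_counts.values())
    total_residues = sum(composition.values())
    types = sorted(composition)
    potential = {}
    for index, a in enumerate(types):
        for b in types[index:]:
            fa = composition[a] / total_residues
            fb = composition[b] / total_residues
            pair_probability = fa * fb if a == b else 2 * fa * fb
            expected = pair_probability * total_contacts
            observed = contact_counts.get((a, b), 0)
            potential[(a, b)] = -math.log(
                (observed + pseudocount) / (expected + pseudocount)
            )
    return potential


def fold_energy(sequence, coords, potential, cutoff=6.5, min_separation=3):
    """Score one fold as the sum of pair energies over its contacts."""
    contacts = count_contacts(sequence, coords, cutoff, min_separation)
    return sum(potential.get(pair, 0.0) * n for pair, n in contacts.items())
```

In practice the counts would be accumulated over many known structures before the potential is computed, and the same potential could then be used to rank many candidate folds of one sequence quickly, since each evaluation needs only one coordinate per residue.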
That these comparisons agree so closely may indicate that the results are relatively insensitive to the number of proteins in the sample. A recent study reports some frequent close contact pairs and triplets [12].

All such inspections of globular protein structures have concluded that the hydrophobic interactions are the strongest. An inspection of the frequencies of long-range close pairs of residues [10] has indicated that hydrophobic-hydrophobic pairs are strongest but are not very specific [10,11,13], whereas the polar-polar interactions are the most specific but are not so strong. Of course, the polar-hydrophobic pairs are intermediate in strength between the other two classes.

248 © Current Biology Ltd ISSN 0959-440X

These pairwise