A New Folding Paradigm for Repeat Proteins

Tommi Kajander,† Aitziber L. Cortajarena,† Ewan R. G. Main,†,‡ Simon G. J. Mochrie,*,†,§ and Lynne Regan†,|

Departments of Molecular Biophysics and Biochemistry, Physics and Applied Physics, and Chemistry, Yale University, New Haven, Connecticut 06520, and Department of Chemistry, University of Sussex, Falmer, BN1 9QG, U.K.

Received April 14, 2005; E-mail: simon.mochrie@yale.edu

How a protein's amino acid sequence specifies its structure and properties stands as a grand challenge of the post-genomic era. Repeat proteins,¹⁻⁵ which are composed of tandem arrays of a basic structural motif, account for more than 5% of the proteins in multicellular organisms in the Swiss-Prot database. In addition, leucine-rich repeats, zinc finger repeats, ankyrin repeats, and tetratricopeptide repeats (TPRs)² all rank among the 20 most common protein folds in the Pfam database. It is therefore surprising that the folding of repeat proteins has been little studied,³ especially because their modular, repetitive structures promise a more tractable folding problem than for globular proteins. Here, we demonstrate that the folding of TPR proteins can be quantitatively described by the classical one-dimensional Ising model,⁶,⁷ which thus represents a new folding paradigm for repeat proteins. Moreover, for the first time, a theoretical model predicts protein stability in detail.

Our approach has been to synthesize and then examine the structure and behavior of a series of designed proteins containing different numbers of an identical repeated unit, which is a consensus sequence based on the natural prevalence of each amino acid at each position in the TPR motif.⁴ We have determined the crystal structure of such a protein, CTPRa8*, which contains eight identical consensus TPR repeats and which is shown in Figure 1A,B. As may be seen from the figure, each repeat is composed of two helices, which are arrayed to form a superhelix.
A key feature of this structure, and those of repeat proteins in general,⁵ is that, in contrast to globular proteins, there are no sequentially distant amino acid contacts.² This is illustrated in Figure 1C, which shows a contact map for CTPRa8*, making it clear that CTPRa8* exhibits extensive amino acid contacts only within a helix and between nearest-neighbor helices. This observation suggests that it may be possible to understand the stability of TPRs on the basis of the collective behavior of the individual helices, interacting with each other via nearest-neighbor interactions. Indeed, as we show in the present communication, the folding/unfolding transitions within a series of consensus TPRs are quantitatively well described by the classical one-dimensional Ising model.⁶,⁷ According to this description, the TPRs' constituent helices correspond to Ising spins ($s_i = \pm 1$) and interact via a nearest-neighbor coupling. Thus, spin up ($s_i = +1$) in the Ising model corresponds to the folded state of a TPR helix, while spin down ($s_i = -1$) corresponds to the unfolded state. It follows that folding/unfolding of TPRs, and likely of all repeat proteins,⁸ does not conform to the all-or-nothing, folded-or-unfolded, two-state transition that is generally assumed for small globular proteins.⁹ Instead, the Ising description prescribes the existence of partially folded configurations with significant statistical weight.

† Department of Molecular Biophysics and Biochemistry, Yale University.
‡ University of Sussex.
§ Departments of Physics and Applied Physics, Yale University.
| Department of Chemistry, Yale University.

Figure 1. Crystal structure of CTPRa8*. (A) View perpendicular to the superhelical axis. Each TPR repeat is colored either red or blue. (B) View along the superhelical axis. (C) Contact map for CTPRa8*. The axes correspond to the residue numbers in the protein sequence.
A square is placed at each position where two residues lie within 3-5 Å of each other in the structure. Therefore, points near the diagonal represent local contacts, while points far from the diagonal correspond to sequentially distant contacts. Contacts between backbone atoms are given above the diagonal, and contacts between all atoms are given below the diagonal. The diagonal is color-coded according to (A) and (B).

Published on Web 06/30/2005. J. Am. Chem. Soc. 2005, 127, 10188-10190. DOI: 10.1021/ja0524494. © 2005 American Chemical Society
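
As a minimal sketch of the Ising description given in the text, the following Python enumerates every configuration of a short open chain of helices, each treated as a spin $s_i = \pm 1$ with a nearest-neighbor coupling. It computes the mean fraction of folded helices and the probability that the chain is in a partially folded configuration (neither all folded nor all unfolded). The energy function $E = -J \sum_i s_i s_{i+1} - h \sum_i s_i$ is the textbook one-dimensional Ising form. The function names, the parameter names `coupling` and `field`, the use of kT units, and all numerical values are illustrative assumptions; they are not the parameterization or fitted values of this work.

```python
import itertools
import math


def ising_configurations(n_helices, coupling, field):
    """Yield (spins, Boltzmann weight) for every state of an open chain.

    Assumed energy, in units of kT (hypothetical parameter names):
        E = -coupling * sum_i s_i * s_(i+1) - field * sum_i s_i
    with s_i = +1 for a folded helix and s_i = -1 for an unfolded helix.
    """
    for spins in itertools.product((-1, 1), repeat=n_helices):
        # Nearest-neighbor term: products of adjacent spins along the chain.
        pair_term = sum(a * b for a, b in zip(spins, spins[1:]))
        energy = -coupling * pair_term - field * sum(spins)
        yield spins, math.exp(-energy)


def folding_statistics(n_helices, coupling, field):
    """Return (mean fraction of folded helices, P(partially folded))."""
    partition = 0.0          # sum of all Boltzmann weights
    folded_sum = 0.0         # weight-averaged number of folded helices
    two_state_weight = 0.0   # weight of the all-folded and all-unfolded states
    for spins, weight in ising_configurations(n_helices, coupling, field):
        n_folded = spins.count(1)
        partition += weight
        folded_sum += weight * n_folded
        if n_folded in (0, n_helices):
            two_state_weight += weight
    fraction_folded = folded_sum / (partition * n_helices)
    p_partial = 1.0 - two_state_weight / partition
    return fraction_folded, p_partial


if __name__ == "__main__":
    # Illustrative values only; 8 helices keeps the enumeration small (2**8 states).
    for field in (-0.5, 0.0, 0.5):
        frac, partial = folding_statistics(n_helices=8, coupling=1.0, field=field)
        print(f"field={field:+.1f}  fraction folded={frac:.3f}  "
              f"P(partially folded)={partial:.3f}")
```

Brute-force enumeration is chosen here only for transparency: it scales as 2^N and is practical for chains of roughly 20 helices or fewer. A transfer-matrix calculation gives the same partition function in time linear in N and would be the natural choice for longer arrays. In a strict two-state model, P(partially folded) would be zero, so a nonzero value in this sketch shows how the Ising picture departs from the all-or-nothing assumption.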