Domain Behavior during the Folding of a Thermostable Phosphoglycerate Kinase

Martin J. Parker,*,‡ James Spencer, Graham S. Jackson, Steven G. Burston,‡,§ Laszlo L. P. Hosszu,| C. Jeremy Craven,| Jonathan P. Waltho,| and Anthony R. Clarke

Department of Biochemistry, School of Medical Sciences, University of Bristol, University Walk, Bristol BS8 1TD, U.K. and Krebs Institute for Biomolecular Research, Department of Molecular Biology and Biotechnology, University of Sheffield, P.O. Box 594, Sheffield S10 2UH, U.K.

Received June 4, 1996; Revised Manuscript Received September 19, 1996 X

ABSTRACT: Bacillus stearothermophilus phosphoglycerate kinase (bsPGK) is a monomeric enzyme of 394 residues comprising two globular domains (N and C), covalently linked by an interdomain α-helix (residues 170-185). The molecule folds to the native state in three stages. In the first, each domain rapidly and independently collapses to form an intermediate in which the N-domain is stabilized by 5.1 kcal mol⁻¹ and the C-domain by 3.3 kcal mol⁻¹ over their respective unfolded conformations. The N-domain then converts to a folded state at a rate of 1.2 s⁻¹ (ΔG_I-F = 3.8 kcal mol⁻¹), followed by the C-domain at 0.032 s⁻¹ (ΔG_I-F = 12.1 kcal mol⁻¹). It is this last step that limits the rate of acquisition of enzyme activity. In the dynamics of unfolding in water, the N-domain converts to the intermediate state at a rate of 8 × 10⁻⁴ s⁻¹, some 10⁷ times faster than the C-domain. Consequently, the most populated intermediate in the folding reaction has a native-like N-domain, while that in the unfolding direction has a native-like C-domain. In a conventional sense, therefore, the folding/unfolding kinetics of bsPGK can be described as random order. Consistent with these observations, cutting the molecule in the interdomain helix produces two, independently stable units comprising residues 1-175 and 180-394.
A detailed comparison of their folding behavior with that of the whole molecule reveals that true interdomain contacts are relatively weak, contributing 1.4 kcal mol⁻¹ to the stability of the active enzyme. The only interactions which contribute to the stability of rapidly formed intermediates or to transition states along the productive folding pathways are those within domain cores. Contacts formed either between domains or with the interdomain helix are made only in the folded ground state, but do not constitute a separate step in the folding mechanism. Intriguingly, the most pronounced effect of interdomain contacts on the kinetics of folding is inhibitory, the presence of the C-domain appearing to reduce the effective rate of acquisition of native structure within the N-domain.

For methodological reasons and because they serve as the simplest model objects, small single-domain proteins have formed the focus of most folding studies (Kim & Baldwin, 1990; Creighton, 1992). There is less information, however, about the development of higher levels of organization in large proteins which are composed of well-defined structural domains. Although these proteins require specific interdomain interactions to maintain the native state, little is known of the strength of interdomain contacts at each stage in folding and of the part these interactions play in the folding mechanism. Addressing this problem produces an immediate difficulty, that of defining the term “domain”. A domain is usually described as a substructure within a protein with one or more of the following properties (Garel, 1992). (i) When isolated, it forms the same well-defined folded conformation as it does in the intact molecule. In this respect it has a high degree of structural autonomy, the energetic definition. (ii) It acts as a discrete genetic unit which can be identified in different proteins, the evolutionary definition.
(iii) On examination of the global, three-dimensional fold, it appears as a distinct substructure, the morphological or topological definition. (iv) It has a particular mechanistic function within the protein, e.g., a unit within an enzyme which binds one of the reactants in a multisubstrate reaction, the functional definition.

At one extreme, there are proteins which fulfill all four criteria, but are composed of domains with no intimate noncovalent interactions necessary for their individual biological activities. In these, chain connectivity is used to maintain crude proximity (e.g., the type κ-immunological light chain (Tsunenga et al., 1987)). In other proteins, the domains interact more extensively and cannot fold as separate units, making it impossible to examine them in isolation (e.g., α-lactalbumin (Peng & Kim, 1994)).

Phosphoglycerate kinase (PGK)¹ is a more promising paradigm for studies of the folding of multi-domain proteins (Betton et al., 1984, 1985; Yon et al., 1988, 1990). This

* Author to whom correspondence should be addressed.
‡ University of Bristol.
| University of Sheffield.
§ Present address: Howard Hughes Medical Institute Research Laboratories, Boyer Center for Molecular Medicine, 295 Congress Avenue, New Haven, CT 06510.
X Abstract published in Advance ACS Abstracts, November 15, 1996.
¹ Abbreviations: 3-PGA, 3-phosphoglycerate; bsPGK, phosphoglycerate kinase from Bacillus stearothermophilus; bsPGK, W290Y bsPGK; CD, circular dichroism; DTT, dithiothreitol; EDTA, ethylenediaminetetraacetic acid; GAPDH, glyceraldehyde-3-phosphate dehydrogenase from horse muscle; GuHCl, guanidinium hydrochloride; IPTG, isopropyl β-D-thiogalactoside; NMR, nuclear magnetic resonance; PCR, polymerase chain reaction; PGK, 3-phosphoglycerate kinase; PMSF, phenylmethanesulfonyl fluoride; SDS-PAGE, sodium dodecyl sulfate polyacrylamide gel electrophoresis; TEA, triethanolamine hydrochloride; TrisHCl, tris(hydroxymethyl)methylamine hydrochloride.
15740 Biochemistry 1996, 35, 15740-15752 S0006-2960(96)01330-X CCC: $12.00 © 1996 American Chemical Society
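As a rough illustration of what the rate constants and free energies quoted in the abstract imply, the following Python sketch converts each first-order rate constant into a half-time (t½ = ln 2 / k) and each stabilization free energy into a population ratio (K = exp(ΔG/RT)). The temperature of 298.15 K, the function names `half_time` and `equilibrium_constant`, and the C-domain unfolding rate (obtained here by dividing the N-domain rate by the quoted factor of about 10⁷) are assumptions made for illustration and are not taken from the paper.

```python
import math

R_KCAL = 1.987204e-3   # molar gas constant in kcal mol^-1 K^-1
T_ASSUMED = 298.15     # K; ASSUMPTION: temperature is not stated in this section


def half_time(k):
    """Half-time in seconds of a first-order step with rate constant k (s^-1)."""
    return math.log(2) / k


def equilibrium_constant(delta_g, temperature=T_ASSUMED):
    """Population ratio (more folded / less folded) for a stabilization
    free energy delta_g given in kcal mol^-1, using K = exp(dG / RT)."""
    return math.exp(delta_g / (R_KCAL * temperature))


if __name__ == "__main__":
    # Rate constants quoted in the abstract (s^-1).
    rates = {
        "N-domain folding (I -> F)": 1.2,
        "C-domain folding (I -> F)": 0.032,
        "N-domain unfolding (F -> I)": 8e-4,
        # ASSUMPTION: derived from "some 10^7 times faster than the C-domain".
        "C-domain unfolding (F -> I), approx.": 8e-4 / 1e7,
    }
    for step, k in rates.items():
        print(f"{step}: k = {k:.3g} s^-1, half-time = {half_time(k):.3g} s")

    # Stabilization free energies quoted in the abstract (kcal mol^-1).
    energies = {
        "N-domain intermediate over unfolded": 5.1,
        "C-domain intermediate over unfolded": 3.3,
        "N-domain folded over intermediate": 3.8,
        "C-domain folded over intermediate": 12.1,
        "interdomain contacts": 1.4,
    }
    for label, dg in energies.items():
        print(f"{label}: dG = {dg} kcal/mol, K = {equilibrium_constant(dg):.3g}")
```

Run as a script, this prints one line per step. Because the population ratio depends exponentially on ΔG/RT, the values shift noticeably if the experimental temperature differs from the assumed one.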