Enhanced Prediction Accuracy of Protein Secondary Structure Using Hydrogen Exchange Fourier Transform Infrared Spectroscopy

Bernoli I. Baello, Petr Pancoska, and Timothy A. Keiderling¹

Department of Chemistry, University of Illinois at Chicago, 845 W. Taylor Street (M/C 111), Chicago, Illinois 60607-7061

Received July 7, 1999

A novel equilibrium hydrogen exchange Fourier transform IR (HX-FTIR) spectroscopy method for predicting secondary structure content was employed using spectra obtained for a training set of 23 globular proteins. The IR bandshape and frequency changes resulting from controlled levels of H–D exchange were observed to be protein-dependent. Their analysis revealed these variations to be partly correlated with secondary structure. For each protein, a set of six spectra was measured with a systematic variation of the solvent H–D ratio and was subjected to factor analysis. The most significant component spectra for each protein, representing independent aspects of the spectral response to deuteration, were each subjected to a second factor analysis over the entire training set. Restricted multiple regression (RMR) analysis using the loadings of the principal components from 19 of these H–D analyses revealed an improvement in prediction accuracy compared with conventional bandshape-based analyses of FTIR data. Nearly a factor of 2 reduction in error for prediction of helix fractions was found using s₁, the average spectral response for the H–D set. In some cases, significant error reduction for prediction of minor components was found using higher factors. Using the same analytical methods, prediction errors with this new deuteration–response–FTIR method were shown to be lower than those obtained from electronic circular dichroism (ECD) data for helix predictions and significantly lower than the ECD-based errors for sheet predictions, making these the best secondary structure predictions obtained with the RMR method.
Tests of a limited variable selection scheme showed further improvements, consistent with previous results of this approach using ECD data. © 2000 Academic Press

Optical spectra, FTIR,² Raman, and electronic and vibrational CD (ECD and VCD) have been widely used as a basis for secondary structure analyses of proteins. A variety of mathematical tools have been developed for extracting quantitative estimates of the fractions of helix, sheet, and other components from these spectral data (1–29). Bandshape-based analyses have dominated ECD and VCD methods (30, 31), while bandshape- and frequency-based approaches (20, 32) have been employed for IR and Raman analyses. These techniques involve relatively rapid measurements with an intrinsically fast time scale whose analyses provide structural insight either by themselves or upon combination with data from other techniques. Combined spectral studies, for instance, data from FTIR with ECD (13, 19, 24), or VCD with ECD (23, 24), often provide structural details and precision not available from one technique alone.

Similarly, hydrogen exchange is a broadly used technique for protein structure studies, particularly for folding analyses. Its chemical and physical mechanisms have been studied extensively (33–40), and their impact has proven to be most valuable with deuteration-sensitive spectroscopic methods such as IR (41–51), Raman (52), NMR (37, 38, 53–62), mass spectrometry (63–68), and neutron diffraction (69).

In particular, FTIR and VCD measurements have often been carried out on proteins in D₂O-based solutions due to interference from H₂O absorbances. The H–O–H deformation band (1650 cm⁻¹) overlaps the amide I (primarily C=O stretch) frequency region, which provides the most structurally informative IR spectral changes. Using D₂O, the amide I' band (N-

¹ To whom correspondence should be addressed. Fax: (312) 996-0431. E-mail: tak@uic.edu.
² Abbreviations used: FTIR, Fourier transform infrared; ECD, electronic circular dichroism; VCD, vibrational CD; PC/FA, principal component method of factor analysis; RMR, restricted multiple regression; FC, fractional secondary structure composition.

Analytical Biochemistry 280, 46–57 (2000). doi:10.1006/abio.2000.4483, available online at http://www.idealibrary.com
0003-2697/00 $35.00. Copyright © 2000 by Academic Press. All rights of reproduction in any form reserved.