Solvation and Hydration of Proteins and Nucleic Acids: A Theoretical View of Simulation and Experiment

VLADIMIR MAKAROV AND B. MONTGOMERY PETTITT*
Department of Chemistry, University of Houston, Houston, Texas 77204-5641

MICHAEL FEIG
Department of Molecular Biology, TPC-6, The Scripps Research Institute, La Jolla, California 92037

Received October 3, 2001

ABSTRACT
Many theoretical, computational, and experimental techniques have recently been used successfully to describe the solvent distribution around macromolecules. In this Account, we consider recent developments in the areas of protein and nucleic acid solvation and hydration as seen by experiment, theory, and simulations. We find that in most cases not only the general phenomena of solvation but even local hydration patterns are more accurately discussed in the context of water distributions rather than individual molecules of water. While a few localized or high-residency waters are often associated with macromolecules in solution (or crystals from aqueous liquors), these are readily and accurately included in this more general description. The goal of this Account is to review the theoretical models used to describe the interfacial solvent structure near DNA and protein molecules. In particular, we hope to highlight the progress in this field over the past five years with a focus on comparison of simulation and experimental results.

Introduction
Water plays a central role in the thermodynamics and structure of macromolecules. In particular, the stability and functionality of proteins and nucleic acids are dictated by specific as well as nonspecific solvent effects. The biological activity of these molecules generally occurs within a relatively narrow range of temperature, solvent chemical potential, and ionic concentration.
Most cellular functions are driven not by temperature gradients but by changes in solvent environments, including varying pH and ionic activities as well as different solute concentrations between cellular and subcellular compartments. It is thus of both practical and fundamental interest to understand the relation of the aqueous solvent to these important macromolecules.

In this Account, we will consider the central role water plays in biochemistry from a structural perspective. We give a perspective on the hydration patterns of both proteins and nucleic acids so that we may compare them. It is the understanding of context-sensitive hydration patterns that ultimately yields our most detailed understanding of molecular recognition. Such recognition, whether intermolecular (binding) or intramolecular (folding), occurs because of specific hydration patterns which mediate all outer-sphere interactions and contacts.

In our own work we have found that the language of probability distributions describes most hydration and solvation phenomena for these macromolecules. We find that in most cases not only the general phenomena of solvation but even local hydration patterns are more accurately discussed in the context of water distributions rather than individual molecules of water. While a few localized or high-residency waters are often associated with macromolecules in solution (or crystals from aqueous liquors), these are readily and accurately included in this more general description. 1,2

We wish to consider the distribution of water given a nearby macromolecule. While much of the theoretical literature has used pair distribution functions in one form or another, 3 we and others have found it more convenient to use conditional single-molecule densities. 4 These have been termed perpendicular or proximal distribution functions, 4-6 and they essentially count water atoms with respect to the closest atom on the macromolecule or perpendicular to the surface.
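The counting procedure just described can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the procedure used in the cited work: the function names (`nearest_solute_distance`, `proximal_distribution`), the bin settings, and the Monte Carlo estimate of the shell volumes used for normalization are all assumptions introduced here. Coordinates are taken to be in Å for a single configuration, and `bulk_density` is the bulk number density of the water atoms being counted.

```python
import numpy as np

def nearest_solute_distance(points, solute_xyz):
    """Distance from each point to its closest solute atom, and that atom's index."""
    # Dense (n_points, n_solute) distance matrix; adequate for small systems only.
    diff = points[:, None, :] - solute_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist.min(axis=1), dist.argmin(axis=1)

def proximal_distribution(solute_xyz, water_xyz, box_lo, box_hi, bulk_density,
                          r_max=8.0, n_bins=16, n_probe=100_000, seed=0):
    """Histogram water atoms by distance to the nearest solute atom.

    Returns the bin edges, the raw counts, and the counts normalized by the
    estimated shell volume times the bulk density (assumed normalization).
    """
    edges = np.linspace(0.0, r_max, n_bins + 1)

    # Assign every water atom to its closest solute atom and bin the distance.
    d_water, _ = nearest_solute_distance(water_xyz, solute_xyz)
    counts, _ = np.histogram(d_water, bins=edges)

    # Assumed normalization: estimate the volume of each proximal shell by
    # scattering random probe points through the box and binning them the same way.
    rng = np.random.default_rng(seed)
    probes = rng.uniform(box_lo, box_hi, size=(n_probe, 3))
    d_probe, _ = nearest_solute_distance(probes, solute_xyz)
    hits, _ = np.histogram(d_probe, bins=edges)
    shell_volume = hits / n_probe * np.prod(box_hi - box_lo)

    # Bins that received no probe points have no volume estimate and are set to NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        g = np.where(shell_volume > 0,
                     counts / (shell_volume * bulk_density),
                     np.nan)
    return edges, counts, g
```

For liquid water at ambient conditions, a `bulk_density` of roughly 0.033 oxygen atoms per Å^3 would be a reasonable input. In practice the counts would be averaged over many simulation frames, and the index returned by `nearest_solute_distance` could be used to accumulate separate histograms by the element of the closest solute atom.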
These distributions give a direct measure of the local density of solvent in the context of the macromolecular structure. Rather than consider individual, distinguishable molecules of hydration, one can consider the probability of finding a solvent molecule near the macromolecular solute of interest. It is natural to consider X-ray data in this probabilistic context rather than attempting to fit whole waters into density which may be only partially occupied. 1,7 It is useful to distinguish between water correlations (and thus interactions) with different solute atoms since they exhibit different equilibrium distances. 8,9

In Figure 1, we show a typical simulated density of water near a protein. If we consider the water distribution radially out from the surface near an atom, we find the distributions given in Figure 2 for C, N, and O. 9 This model has had a striking experimental confirmation by careful analysis of X-ray data. 10 This method gives a quantitative measure of some familiar small-molecule concepts that can be transferred to macromolecules. The positions of the first minima in the solvent proximal distributions can be used to define the solvation shell or hydration layer around a given macromolecule. We will use the language of these distributions throughout the rest of this Account.

Vladimir Makarov was a scientist at Paracel when this article was written and currently is with BionomiX.
B. M. Pettitt is the Cullen Professor of Chemistry and Director of the Institute for Molecular Design. His research area is theoretical chemistry.
Michael Feig is currently a research scientist at the Scripps Institute.

Acc. Chem. Res. 2002, 35, 376-384. Accounts of Chemical Research, Vol. 35, No. 6, 2002. 10.1021/ar0100273 CCC: $22.00 © 2002 American Chemical Society. Published on Web 03/08/2002.

Proteins
Over the past decade there has been an extensive accumulation of evidence pointing to the critical role that solvation plays in the function and structural stability of proteins. 11,12 Simulation and theoretical studies of solva-