Nonatomic Solvent-Driven Voronoi Tessellation of Proteins: An Open Tool to Analyze Protein Folds

Borislav Angelov,1 Jean-François Sadoc,1* Rémi Jullien,2 Alain Soyer,3 Jean-Paul Mornon,3 and Jacques Chomilier3

1 Laboratoire de Physique des Solides, Université Paris 11, Orsay, France
2 Laboratoire des Verres, Université Montpellier 2, Montpellier, France
3 Laboratoire de Minéralogie Cristallographie, Universités Paris 6 et 7, case 115, Paris, France

ABSTRACT A three-dimensional Voronoi tessellation of folded proteins is used to analyze geometrical and topological properties of a set of proteins. Each amino acid is associated with a central point surrounded by a Voronoi cell. Voronoi cells describe the packing of the amino acids. Special attention is given to reproduction of the protein surface. Once the Voronoi cells are built, many tools from geometrical analysis can be applied to investigate the protein structure; the volume of cells, the number of faces per cell, and the number of sides per face are the usual signatures of the protein structure. A distinct difference between faces related to primary, secondary, and tertiary structures has been observed. Faces threaded by the main chain have on average more than six edges, whereas those related to helical packing of the amino acid chain have fewer than five edges. The faces on the protein surface have on average five edges within 1% error. The average number of faces on the protein surface for a given type of amino acid offers a new point of view on the characterization of exposure to the solvent and on the classification of amino acids as hydrophilic or hydrophobic. It may be a convenient tool for model validation. Proteins 2002;49:446–456. © 2002 Wiley-Liss, Inc.

Key words: Voronoi tessellation; protein folding; hydrophilic/hydrophobic properties

INTRODUCTION

The folding of an amino acid chain into a protein of well-defined structure is still an enigma.
A tremendous amount of experimental work has been done in molecular biology, biochemistry, and biological physics to understand this complex phenomenon [1–4]. In this work, we lay the groundwork for the development of a geometrical theory of protein folding. As a theoretical description, it was first accepted that the folding of a protein is governed by the general principle of minimal free energy. This is commonly referred to as the “old view” of protein folding. More recently, a new view [5–7] was introduced, which admits a funnel-like energy surface [8,9] consistent with multiple folding pathways. In addition, it has been assumed that topology determines protein folding mechanisms [10]. Statistical analysis of contacting residues has shown that their positions along the chain are not randomly distributed but strongly favor particular peptide lengths between them. The terminology in the literature devoted to this subject has not yet been standardized: one finds different terms such as contact order [11], closed loops [12], or tightened end fragments [13]. How to predict the native-state structure of a protein from its sequence [14,15] remains unclear. One possible way to overcome this lack of predictability of the molecular structure is to look not only at the energy landscape but also, in more detail, at the information that comes from the coordinates (i.e., from the pure geometry of the protein structure). In the field of liquids, liquid crystals, and crystalline and amorphous solids, the geometrical approach has yielded many fruitful results [16–20]. To analyze the structure of folded proteins, some of us proposed [21] the use of a very sensitive geometrical method based on the so-called Voronoi tessellation (VT) [22]. A tessellation is a means of describing space as filled by a packing of solid polyhedra connected by their faces, with no empty space between them.
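
As a rough illustration of the Voronoi idea named above, the sketch below assigns sample points to the cell of their nearest seed and estimates each cell's share of a cubic box. It is not the construction used in this work: the seed coordinates, the function names `nearest_seed` and `cell_volume_fractions`, and the brute-force grid sampling are all assumptions made for illustration, and no exact polyhedra or solvent treatment are involved.

```python
import math


def nearest_seed(point, seeds):
    """Return the index of the seed closest to `point` (ties go to the lowest index)."""
    return min(range(len(seeds)), key=lambda i: math.dist(point, seeds[i]))


def cell_volume_fractions(seeds, lo, hi, n=20):
    """Estimate each cell's share of the cube [lo, hi]^3 from an n*n*n grid of sample points."""
    counts = [0] * len(seeds)
    step = (hi - lo) / n
    for ix in range(n):
        for iy in range(n):
            for iz in range(n):
                # Sample at the centre of each small grid cube.
                p = (lo + (ix + 0.5) * step,
                     lo + (iy + 0.5) * step,
                     lo + (iz + 0.5) * step)
                counts[nearest_seed(p, seeds)] += 1
    total = n ** 3
    return [c / total for c in counts]


# Hypothetical seed points, not taken from any protein structure.
seeds = [(0.25, 0.5, 0.5), (0.75, 0.5, 0.5)]
fractions = cell_volume_fractions(seeds, 0.0, 1.0)
```

Because the two hypothetical seeds are mirror images about the plane x = 0.5, each cell should receive half of the sample points. An exact tessellation would instead compute the polyhedral faces directly, which is what gives access to quantities such as the number of faces per cell and edges per face.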
Given a set of discrete points in space, a Voronoi tessellation associates with each point a polyhedral domain, called a Voronoi cell, containing all of the space closer to that point than to any other. There are several examples of VT methods applied to proteins in the literature [23–28], but only a few of them [26–28] directly concern the packing of amino acids (AA) or fold recognition [29]. Moreover, in Refs. 26–28, the investigators used a Delaunay tessellation, which can be viewed as a first step before VT, and considered the α-carbon locations as the starting set of points. Because an α-carbon is almost

Abbreviations: AA, amino acid; PDB, Protein Data Bank; RRPS, relaxed random packing of spheres; RSA, random sequential aggregation; VT, Voronoi tessellation; VC, Voronoi cell.
Grant sponsor: Marie Curie Program of the European Union; Grant sponsor: Centre National de la Recherche Scientifique, France.
B. Angelov’s permanent address is Institute of Biophysics, Bulgarian Academy of Science, Acad. G. Bonchev Str. Bl. 21, Sofia 1113, Bulgaria.
*Correspondence to: Jean-François Sadoc, Laboratoire de Physique des Solides, Université Paris 11, Centre d’Orsay, 91405, Orsay, France. E-mail: sadoc@lps.u-psud.fr
Laboratoire associé au Centre National de la Recherche Scientifique (CNRS, France).
Received 9 January 2002; Accepted 7 June 2002
Published online 00 Month 2002 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.10220
PROTEINS: Structure, Function, and Genetics 49:446–456 (2002) © 2002 WILEY-LISS, INC.