Nonatomic Solvent-Driven Voronoi Tessellation of Proteins:
An Open Tool to Analyze Protein Folds
Borislav Angelov,1 Jean-François Sadoc,1* Rémi Jullien,2 Alain Soyer,3 Jean-Paul Mornon,3 and Jacques Chomilier3
1Laboratoire de Physique des Solides, Université Paris 11, Orsay, France
2Laboratoire des Verres,† Université Montpellier 2, Montpellier, France
3Laboratoire de Minéralogie Cristallographie, Universités Paris 6 et 7, case 115, Paris, France
ABSTRACT A three-dimensional Voronoi tessellation of folded proteins is used to analyze geometrical and topological properties of a set of proteins. Each amino acid is associated with a central point surrounded by a Voronoi cell. The Voronoi cells describe the packing of the amino acids. Special attention is given to the reproduction of the protein surface. Once the Voronoi cells are built, many tools from geometrical analysis can be applied to investigate the protein structure: the volume of cells, the number of faces per cell, and the number of sides per face are the usual signatures of the protein structure. A distinct difference between faces related to primary, secondary, and tertiary structures has been observed. Faces threaded by the main chain have on average more than six edges, whereas those related to helical packing of the amino acid chain have fewer than five edges. The faces on the protein surface have on average five edges, within 1% error. The average number of faces on the protein surface for a given type of amino acid offers a new point of view on the characterization of solvent exposure and the classification of amino acids as hydrophilic or hydrophobic. It may be a convenient tool for model validation. Proteins 2002;49:446–456.
© 2002 Wiley-Liss, Inc.
Key words: Voronoi tessellation; protein folding; hydrophilic/hydrophobic properties
INTRODUCTION
The folding of an amino acid chain into a protein of well-defined structure is still an enigma. A tremendous amount of experimental work has been done in the fields of molecular biology, biochemistry, and biological physics to understand this complex phenomenon.1–4 In this work, we lay the groundwork for the development of a geometrical theory of protein folding. As a theoretical description, it was first accepted that the folding of a protein is ruled by the general principle of minimal free energy. This is commonly referred to as the “old view” of protein folding. More recently, a new view5–7 was introduced, which admits a funnel-like energy surface8,9 consistent with multiple folding pathways. In addition, it has been assumed that topology determines protein folding mechanisms.10 Statistical analysis of contacting residues has shown that their positions along the chain are not randomly distributed but strongly favor particular peptide lengths between them. The terminology used in the literature devoted to this subject is not yet standardized: one finds different terms such as contact order,11 closed loops,12 or tightened end fragments.13
How to predict the native state structure of a protein from its sequence14,15 remains unclear. One possible way to overcome this lack of predictability of the molecular structure is to look not only at the energy landscape but also to examine in more detail the information that comes from the coordinates (i.e., from the pure geometry of the protein structure). In the fields of liquids, liquid crystals, and crystalline and amorphous solids, the geometrical approach has yielded many fruitful results.16–20 To analyze the structure of folded proteins, some of us21 proposed using a very sensitive geometrical method based on the so-called Voronoi tessellation (VT).22
A tessellation is a means of describing space as filled by a packing of solid polyhedra connected by their faces, with no empty space between them. Given a set of discrete points in space, a Voronoi tessellation associates with each point a polyhedral domain, called a Voronoi cell, containing the whole neighborhood that is closer to that point than to any other. There
are several examples of VT methods applied to proteins in the literature,23–28 but only a few of them26–28 directly concern the packing of amino acids (AA) or fold recognition.29
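The definition of a Voronoi cell given earlier in this paragraph can be illustrated with a minimal Python sketch. It is not the construction used in this work: it only applies the nearest-point rule to decide which cell a location belongs to, and it estimates cell volumes by random sampling. The function names, the four sample coordinates, and the bounding box (introduced here only to keep the outer cells finite) are assumptions made for illustration.

```python
import math
import random


def nearest_site(point, sites):
    """Return the index of the site closest to `point` (Euclidean distance).

    By the definition of a Voronoi cell, `point` lies in the cell of that site.
    Ties are resolved in favor of the lowest index.
    """
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))


def estimate_cell_volumes(sites, lo, hi, n_samples=20000, seed=0):
    """Monte Carlo estimate of each cell's volume inside the cube [lo, hi]^3.

    The cube is an assumption of this sketch: it clips the otherwise
    unbounded outer cells so that every volume is finite.
    """
    rng = random.Random(seed)
    counts = [0] * len(sites)
    for _ in range(n_samples):
        p = (rng.uniform(lo, hi), rng.uniform(lo, hi), rng.uniform(lo, hi))
        counts[nearest_site(p, sites)] += 1
    box_volume = (hi - lo) ** 3
    return [box_volume * c / n_samples for c in counts]


# Hypothetical coordinates in arbitrary units, not taken from any protein.
sites = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
volumes = estimate_cell_volumes(sites, lo=-1.0, hi=2.0)
```

Because every sampled location is assigned to exactly one site, the estimated volumes add up to the volume of the bounding cube, which reflects the absence of empty space between cells in a tessellation.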
Moreover, in Refs. 26–28, the investigators used a Delaunay tessellation, which can be viewed as a first step before VT, and considered the α-carbon locations as the starting set of points. Because an α-carbon is almost
Abbreviations: AA, amino acid; PDB, Protein Data Bank; RRPS, relaxed random packing of spheres; RSA, random sequential aggregation; VT, Voronoi tessellation; VC, Voronoi cell.
Grant sponsor: Marie Curie Program of the European Union; Grant
sponsor: Centre National de la Recherche Scientifique, France.
B. Angelov’s permanent address is Institute of Biophysics, Bulgarian Academy of Science, Acad. G. Bonchev Str. Bl. 21, Sofia 1113, Bulgaria.
*Correspondence to: Jean-François Sadoc, Laboratoire de Physique des Solides, Université Paris 11, Centre d’Orsay, 91405, Orsay, France. E-mail: sadoc@lps.u-psud.fr
†Laboratoire associé au Centre National de la Recherche Scientifique (CNRS, France).
Received 9 January 2002; Accepted 7 June 2002
Published online 00 Month 2002 in Wiley InterScience
(www.interscience.wiley.com). DOI: 10.1002/prot.10220
PROTEINS: Structure, Function, and Genetics 49:446–456 (2002)