Prediction of Peptide Structure: How Far Are We?

Annick Thomas,1* Sébastien Deshayes,1 Marc Decaffmeyer,1 Marie Hélène Van Eyck,2 Benoit Charloteaux,1 and Robert Brasseur1

1 Centre de Biophysique Moléculaire Numérique (CBMN) FSAGx, 2, Passage des Déportés, Gembloux 5030, Belgium
2 Biosiris SA, Parc Crealys, rue Phocas Lejeune 30/17, Gembloux 5032, Belgium

ABSTRACT

Rational design of peptides is a challenge that would benefit from better knowledge of the rules governing sequence–structure–function relationships. Peptide structures can be approached by spectroscopy and NMR techniques, but data from these approaches too frequently diverge. Structures can also be calculated in silico from primary sequence information using three algorithms: Pepstr, Robetta, and PepLook. The most recent algorithm, PepLook, introduces indexes for evaluating structural polymorphism and stability. For peptides with converging experimental data, calculated structures from PepLook and, to a lesser extent, from Pepstr are close to NMR models. In these cases, the PepLook index for polymorphism is low and the index for stability points out possible binding sites. For peptides with divergent experimental data, calculated and NMR structures can be similar or different. These differences are apparently due to polymorphism and to the different conditions of structure assays and calculations. The PepLook index for polymorphism maps the fragments encoding disorder. This should provide new means for the rational design of peptides. Proteins 2006;65:889–897. © 2006 Wiley-Liss, Inc.

Key words: peptides; structural prediction; modelling; CD; NMR structures; Boltzmann-Stochastic

INTRODUCTION

Interest in biological applications of peptides has expanded during the last decade. One very promising application for therapeutics and basic research is the development of CPPs (cell-penetrating peptides), which carry proteins and nucleic acids into cells.1–4 Unfortunately, our understanding of sequence-to-function relationships for peptides is most often quite poor and does not aid the efficient design of new molecules. This partly results from the current approach to the characterization of peptide structure. We think of peptides as we think of proteins, and the "one sequence, one structure" paradigm, which is common for many proteins, is very limiting when applied to peptides. Indeed, peptides are short and therefore have fewer possibilities of self-stabilization than proteins, and several structures may have the same degree of stability. Our Cartesian education has not prepared us to juggle with the diversity of conformations, and it would be interesting to open our minds to this diversity in order to understand the numerous "selective" interactions peptides may be involved in.

Experimental approaches to elucidate peptide structures include CD, EPR, FTIR, NMR, and X-ray crystallography. These techniques demonstrate that, in some instances, the solvent may have a major influence on peptide structure. Here, we argue that molecular modeling could contribute significantly to the study of peptide sequence-to-function relationships. We first evaluate data from three in silico algorithms available on web servers (Pepstr, Robetta, and PepLook), compare the model structures with experimental data, and discuss their relevance. Pepstr is the oldest server; it is intended for sequences of 7–25 amino acids but was also used for the structure of transportan (27 aa). Robetta is clearly dedicated to proteins rather than to peptides, but the possibility of calculating peptides as short as 20–25 residues remains available.
PepLook is specifically dedicated to peptides of up to 30 residues. In this article, we compare five peptides shorter than 30 amino acids that were previously studied by experimental approaches: an antimicrobial peptide (magainin 2), two cell-penetrating peptides (transportan and hCT (9–32)), a fusion peptide (HA2 fusion peptide), and a β hairpin.

METHODS

Peptide Sequences and PDB Code of Structures

We excluded peptides stabilized by a disulfide bridge.

Magainin 2: GIGKFLHSAKKFGKAFVGEIMNS (2MAG.pdb)
Transportan: GWTLNSAGYLLGKINLKALAALAKKIL (1SMZ.pdb)
hCT (9–32): LGTYTQDFNKFHTFPQTAIGVGAP
Fusion peptide: GLFGAIAGFIENGWEGMIDG (1IBO.pdb)
TRP zipper: GEWTWDDATKTWTWTE (1LE3.pdb)

Grant sponsor: Interuniversity Poles of Attraction Programme - Belgian State, Prime Minister's Office - Federal Office for Scientific, Technical and Cultural Affairs (PAI); Grant number: P5/33; Grant sponsor: Ministère de la Région Wallonne; Grant numbers: 14540 (PROTMEM) and 215140 (aBUSTEC); Grant sponsor: FNRS; Grant number: 3.4546.02.

*Correspondence to: Annick Thomas, Centre de Biophysique Moléculaire Numérique (CBMN) FSAGx, 2, Passage des Déportés, Gembloux 5030, Belgium. E-mail: thomas.a@fsagx.ac.be

Received 7 February 2006; Revised 12 May 2006; Accepted 3 July 2006

Published online 3 October 2006 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.21151

PROTEINS: Structure, Function, and Bioinformatics 65:889–897 (2006). © 2006 Wiley-Liss, Inc.
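
As a reader's aid, the peptide list given under Methods can be checked against the size limits quoted in the Introduction. The Python sketch below is illustrative only and is not part of the authors' workflow: the names `PEPTIDES`, `PEPSTR_RANGE`, `PEPLOOK_MAX`, and `summarize` are hypothetical, the length limits are simply the figures stated in the text (7–25 residues for Pepstr, up to 30 for PepLook), and counting cysteines is only a rough proxy for "could form a disulfide bridge".

```python
# Sequences and PDB codes copied from the Methods list.
# hCT (9-32) has no PDB code in the list, so None is stored for it.
PEPTIDES = {
    "Magainin 2": ("GIGKFLHSAKKFGKAFVGEIMNS", "2MAG"),
    "Transportan": ("GWTLNSAGYLLGKINLKALAALAKKIL", "1SMZ"),
    "hCT (9-32)": ("LGTYTQDFNKFHTFPQTAIGVGAP", None),
    "Fusion peptide": ("GLFGAIAGFIENGWEGMIDG", "1IBO"),
    "TRP zipper": ("GEWTWDDATKTWTWTE", "1LE3"),
}

# Assumed limits, taken from the figures quoted in the Introduction.
PEPSTR_RANGE = (7, 25)   # residues
PEPLOOK_MAX = 30         # residues


def summarize(peptides):
    """Return one dict per peptide with its length and simple size checks."""
    rows = []
    for name, (sequence, pdb_code) in peptides.items():
        length = len(sequence)
        rows.append({
            "name": name,
            "length": length,
            "pdb": pdb_code,
            "in_pepstr_range": PEPSTR_RANGE[0] <= length <= PEPSTR_RANGE[1],
            "within_peplook_max": length <= PEPLOOK_MAX,
            # Rough proxy: a disulfide bridge needs at least two cysteines.
            "could_form_disulfide": sequence.count("C") >= 2,
        })
    return rows


if __name__ == "__main__":
    for row in summarize(PEPTIDES):
        print(row)
```

Under these assumptions, transportan (27 residues) is the only peptide outside the quoted Pepstr range, which matches the remark in the Introduction, and none of the five sequences contains a cysteine.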