Available online at www.sciencedirect.com

Computational Biology and Chemistry 31 (2007) 389–392

Software Note

PreSSAPro: A software for the prediction of secondary structure by amino acid properties

Susan Costantini a,b, Giovanni Colonna b, Angelo M. Facchiano a,b,*

a Laboratory of Bioinformatics and Computational Biology, Institute of Food Science, CNR, via Roma 52 A/C, 83100 Avellino, Italy
b CRISCEB, Research Center of Computational and Biotechnological Sciences, Second University of Naples, via Costantinopoli 16, 80138 Naples, Italy

* Corresponding author at: Institute of Food Science, CNR, via Roma 52 A/C, 83100 Avellino, Italy. Tel.: +39 0825 299625; fax: +39 0825 299813. E-mail addresses: angelo.facchiano@isa.cnr.it, angelo.facchiano@unina2.it (A.M. Facchiano).

Received 22 June 2007; accepted 10 August 2007

1476-9271/$ – see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.compbiolchem.2007.08.010

Abstract

PreSSAPro is a software tool, available to the scientific community as a free web service, that predicts secondary structure from the amino acid sequence of a given protein. Predictions are based on our recently published work on the amino acid propensities for secondary structures, evaluated both in large but non-homogeneous protein data sets and in smaller but homogeneous data sets corresponding to protein structural classes, i.e. all-alpha, all-beta, or alpha–beta proteins. Predictions are improved by using propensities evaluated for the appropriate protein class. PreSSAPro predicts the secondary structure according to the protein's structural class, if known, or gives multiple predictions, one for each structural class. The comparison of these predictions represents a novel tool to evaluate which sequence regions can assume different secondary structures depending on the structural class assignment, with a view to identifying proteins able to fold into different conformations. The service is available at the URL http://bioinformatica.isa.cnr.it/PRESSAPRO/.

© 2007 Elsevier Ltd. All rights reserved.

Keywords: Amino acid propensities; Structural class of proteins; Secondary structure prediction; Protein structure

1. Introduction

The propensities for different secondary structures are intrinsic properties of amino acids and have been used over the last three decades to investigate protein structure. In the 1970s, Chou and Fasman developed their pioneering prediction method based on the statistical propensity of amino acids for secondary structures, evaluated on the few tens of proteins whose three-dimensional structures had been determined by X-ray diffraction. On the basis of such propensities, it was possible to evaluate the mean propensity for the different secondary structures along a given sequence, and so to predict its secondary structure (Chou and Fasman, 1974a,b; Chou, 1989). Propensities evaluated in the early works, or their re-evaluated versions, are still used for developing new algorithms and predictive methods (Wang and Feng, 2005; Fuchs and Alix, 2005).

The PreSSAPro service is based on our recent paper (Costantini et al., 2006), which examined amino acid propensities from a new point of view. The main question in that work was which protein dataset is best suited to evaluating the amino acid propensities, a larger but non-homogeneous set or a smaller but homogeneous one, and how the composition of the protein dataset affects these propensities. We evaluated the amino acid propensities for three types of secondary structure (i.e. helix, beta-strand and coil) for 2168 proteins reported in the PDBselect dataset. Predictions based on these propensities were more successful than those of the original Chou and Fasman method, which was based on a few tens of proteins. Then, this dataset was subdivided into three subsets corresponding to the secondary structural classes, i.e. all-alpha, all-beta and alpha–beta proteins, according to the definition of Nakashima et al. (1986), which considers proteins with >15% alpha-helical content and <10% beta-strand content as all-alpha proteins, with <15% alpha-content and >10% beta-content as all-beta proteins, with >15% alpha-content and >10% beta-content as mixed proteins, and the remaining as irregular. For each subset, the amino acid propensities were calculated and used for predicting the secondary structure of the proteins belonging to that subset. The success of