Evolutionary Algorithm to ab initio Protein Structure Prediction with Hydrophobic Interactions

T. W. de Lima, P. H. R. Gabriel, A. C. B. Delbem, R. A. Faccioli, I. N. da Silva

Abstract: Proteins are polymers whose chains are composed of 20 different monomers, called amino acids. The Protein Structure Prediction (PSP) problem is the determination of a protein's 3D conformation from its amino acid sequence. Two main strategies are usually employed for PSP: homology and Ab initio approaches. This paper presents an Evolutionary Algorithm for PSP using an Ab initio approach (ProtPred). The predictions are evaluated using fitness functions based on potential energies (electrostatic and van der Waals) and hydrophobic interactions. The proposed approach uses dihedral angles and the main angles of the side chains to model a protein structure. ProtPred is evaluated on cases that are relatively complex for an Ab initio approach. The results show that ProtPred is a consistent approach.

I. INTRODUCTION

Currently, several proteins are used as drugs. Prominent examples are the well-known protein insulin, the major drug against diabetes, and, more recently, the anti-HIV protein T20 [1]. However, using proteins as drugs sometimes requires knowledge of their tertiary structure (3D conformation), which determines their function. In recent years, the amino acid sequences of many proteins have been obtained, but their tertiary structures remain unknown.

PSP is a computationally open problem, and several methodologies have been investigated to solve it. The interest in discovering a methodology to solve PSP extends into many fields of research, including biochemistry, medicine, biology, engineering, and other scientific disciplines. Native protein structures have been determined using X-ray crystallography and nuclear magnetic resonance [2].
The latter is restricted to small proteins, while the former requires extensive and costly laboratory processing.

Computational approaches to PSP, on the other hand, range from empirical research to mathematical modeling of protein potential energy. One algorithmic strategy for solving PSP uses information from protein homology to guide the search process. Despite the relevant results obtained using such strategies, algorithms based on protein homology are highly dependent on the set of proteins whose native structure is known. This set is far smaller than the universe of proteins [3].

Telma Woerle de Lima, Paulo Henrique Ribeiro Gabriel and Alexandre Claudio Botazzo Delbem are with the Institute of Mathematics and Computer Sciences of the University of São Paulo, 400 Trabalhador São-carlense Avenue, São Carlos, Brazil, e-mail: {telma,acbd}@icmc.usp.br, phrg@grad.icmc.usp.br. Rodrigo Antonio Faccioli and Ivan Nunes da Silva are with the São Carlos Engineering School of the University of São Paulo, 400 Trabalhador São-carlense Avenue, São Carlos, Brazil, e-mail: {faccioli,insilva}@sel.eesc.usp.br.

Ab initio PSP, in contrast, does not depend on previous knowledge of protein structures. This is one of the most important unsolved problems in molecular biophysics [4]. At first glance, it may not seem complex: by knowing the exact formulation of the physical environment within a cell, where proteins fold, it should be possible to mimic the natural folding process by computing the molecular dynamics from the known physical laws [5]. Nevertheless, we do not completely understand the driving forces involved in protein folding. Perturbations in the potential energy landscape may result in a different folding pathway, generating a different 3D structure.

Because the results obtained so far in predicting tertiary protein structure from amino acid sequences are insufficient, many different computational algorithms have been investigated.
Among these algorithms, Evolutionary Algorithms (EAs) have presented relevant results [6]–[12]. EAs are powerful optimization tools inspired by natural evolution and have been applied to many complex problems across widely differing areas of human knowledge [13].

Another aspect of PSP is that it is a multi-criterion problem, since several energies should be considered in order to evaluate the biochemical interactions in a protein structure [3]. A complete model of the acting forces is hard to implement computationally. For example, the interaction of the protein with the solvent is very complex: it depends on how the solvent components change their distribution in relation to the protein molecule during folding. Due to difficulties like this, approximate models are used. This implies that only the main energies are employed.

This paper presents an evolutionary Ab initio approach to PSP (called ProtPred) whose fitness is based on potential energies and the hydrophobic interactions between amino acids. Section II presents basic concepts of evolutionary algorithms. Section III introduces the PSP problem and its characteristics. Section IV presents ProtPred. The results obtained are shown in Section V, and Section VI presents the conclusions of this work.

1-4244-1340-0/07/$25.00 © 2007 IEEE