Bioinfo Publications

ARABESQUE: A TOOL FOR PROTEIN STRUCTURAL COMPARISON USING DIFFERENTIAL GEOMETRY AND KNOT THEORY

World Research Journal of Peptide and Protein
ISSN: 2278-4586 & E-ISSN: 2278-4608, Volume 1, Issue 1, 2012, pp.-33-40.
Available online at http://www.bioinfo.in/contents.php?id=131

HOI TIK ALVIN LEUNG 1, BERNARDO OCHOA MONTAÑO 2, TOM BLUNDELL 2, MICHELE VENDRUSCOLO 1 AND RINALDO WANDER MONTALVÃO 1*

1 Department of Chemistry, University of Cambridge, Lensfield Road, CB2 1EW, Cambridge, UK.
2 Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, CB2 1GA, Cambridge, UK.
*Corresponding Author: Email- rwm35@cam.ac.uk

Received: January 11, 2012; Accepted: February 14, 2012

Abstract- We present ARABESQUE, a new tool for protein structure analysis, which includes structure comparison, generation of annotated structural alignments, and annotated superposition of structures. By combining differential geometry and knot theory, this method produces an accurate analysis of structural conservation in a family of proteins. The annotated alignment and superposed structures are used to characterise the local and global structural information content, to refine the sequence alignment and to produce fragments and 3D probability density functions for comparative modelling.

Key words- Protein structure comparison, Protein structure alignment, Differential geometry, Knot theory.

Introduction

The comparison of the structures of proteins enables one to determine close and distant relationships amongst them [1]. This type of analysis can be used to classify proteins into families with similar folds and properties [2]. Such a classification is very useful since ensembles of related proteins within a family contain enough information to allow patterns in both sequences and structures to be identified.
These patterns play a vital role in the understanding of a variety of aspects of protein behaviour, including their structural stability, biological activity, molecular evolution and structural conservation. While close relationships can easily be identified by using sequence similarity alone, distant relationships can often be determined only through a comparison of three-dimensional structures, since two proteins with low sequence identity can share similar folds, biological function and physicochemical properties. These aspects follow as a direct consequence of the fact that the tertiary structure of proteins is more conserved than their sequences due to the action of selective pressures on the protein function(s) [3-4].

The systematic organisation of proteins into families can be used to predict the fold of proteins through homology modelling [5]. In this approach, the structure of a protein is predicted from its sequence using information derived from homologous (i.e. divergently evolved) structures, together with additional rules inferred from general structural data [5-8]. Homology modelling programs predict the structures of proteins from their amino acid sequences by extrapolating structural features from the known structures in their families. MODELLER [5] and ORCHESTRAR [6-8] are examples of homology modelling packages used for building new structures from currently available structures.

It is not always easy to identify structurally conserved regions based on sequence alone, especially when the average percentage of identity (PID) for a given protein family is low. Consequently, the development of methods to identify such regions is essential not only for protein comparison but also for homology modelling.

The successful identification of structurally conserved regions demands a measure of structural divergence between two protein fragments that satisfies the triangle inequality rule [9].
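As an illustration of the triangle inequality requirement, the following minimal Python sketch checks whether a matrix of pairwise divergences d satisfies d(A,C) <= d(A,B) + d(B,C) for every triple of fragments. The function name triangle_violations, the tolerance tol and the two toy matrices are hypothetical and serve only as an example; they are not part of ARABESQUE and do not correspond to any real structural measure.

```python
from itertools import permutations


def triangle_violations(d, tol=1e-9):
    """Return every index triple (i, j, k) with d[i][k] > d[i][j] + d[j][k].

    d is a square matrix (list of lists) of pairwise divergences.
    An empty result means the triangle inequality holds for all triples.
    """
    n = len(d)
    return [
        (i, j, k)
        for i, j, k in permutations(range(n), 3)
        if d[i][k] > d[i][j] + d[j][k] + tol
    ]


# Hypothetical divergences between three fragments A, B, C (toy values).
metric_like = [
    [0.0, 1.0, 1.5],
    [1.0, 0.0, 1.0],
    [1.5, 1.0, 0.0],
]

# Here A and C are both close to B, yet far from each other.
non_metric = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 1.0],
    [3.0, 1.0, 0.0],
]

print(triangle_violations(metric_like))
print(triangle_violations(non_metric))
```

In the second matrix, fragments A and C are each close to B but far from one another, which is the kind of inconsistency that makes the grouping of fragments into clusters ambiguous.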
However, most of the current measures, such as RMSD, violate this rule and cannot reliably judge dissimilarity [9], which creates substantial difficulties when clustering algorithms are used to identify structurally conserved regions in protein families with large structural divergences.

Citation: Hoi Tik Alvin Leung, et al (2012) ARABESQUE: A tool for protein structural comparison using differential geometry and knot theory. World Research Journal of Peptide and Protein, ISSN: 2278-4586 & E-ISSN: 2278-4608, Volume 1, Issue 1, pp.-33-40.

Copyright: Copyright©2012 Hoi Tik Alvin Leung, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Other geometrical measures have been employed in order to