Defining scaffold geometries for interacting with proteins: geometrical classification of secondary structure linking regions

Tran T. Tran, Christina Kulis, Steven M. Long, Darryn Bryant, Peter Adams, Mark L. Smythe

Received: 9 December 2009 / Accepted: 31 August 2010 / Published online: 23 September 2010
© Springer Science+Business Media B.V. 2010

Abstract
Medicinal chemists synthesize arrays of molecules by attaching functional groups to scaffolds. There is evidence suggesting that some scaffolds yield biologically active molecules more often than others; these are termed privileged substructures. One role of the scaffold is to present its side-chains for molecular recognition, and biologically relevant scaffolds may present side-chains in biologically relevant geometries or shapes. Since drug discovery is primarily focused on the discovery of compounds that bind to proteinaceous targets, we have been deciphering the scaffold shapes that are used for binding proteins, as they reflect biologically relevant shapes. To decipher the scaffold architecture that is important for binding protein surfaces, we have analyzed the scaffold architecture of protein loops, which are defined in this context as continuous four-residue segments of a protein chain that are not part of an α-helix or β-strand secondary structure. Loops are an important molecular recognition motif of proteins. We have found that 39 clusters reflect the scaffold architecture of 89% of the 23,331 loops in the dataset, with average intra-cluster and inter-cluster RMSD of 0.47 and 1.91, respectively. These protein loop scaffolds all have distinct shapes. We have used these 39 clusters that reflect the scaffold architecture of protein loops as biological descriptors. This involved generation of a small dataset of scaffold-based peptidomimetics.
We found that peptidomimetic scaffolds with reported biological activities matched loop scaffold geometries, and those peptidomimetic scaffolds with no reported biological activities did not. This preliminary evidence suggests that organic scaffolds that tightly match the preferred loop scaffolds of proteins are likely to be biologically relevant.

Keywords: Protein interaction · Privileged structures · β-turns · Clustering · Protein loops · Molecular recognition · Scaffold · Biologically relevant · Peptidomimetics · Biological descriptors

Electronic supplementary material: The online version of this article (doi:10.1007/s10822-010-9384-y) contains supplementary material, which is available to authorized users.

T. T. Tran (corresponding author) · S. M. Long · M. L. Smythe (corresponding author)
Protagonist Pty Ltd, PO Box 6421, St Lucia, QLD 4067, Australia
e-mail: t.tran@protagonist.com.au
M. L. Smythe e-mail: m.smythe@protagonist.com.au

T. T. Tran · C. Kulis · M. L. Smythe
Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia

S. M. Long · D. Bryant · P. Adams
Department of Mathematics, University of Queensland, Brisbane, QLD 4072, Australia

J Comput Aided Mol Des (2010) 24:917–934
DOI 10.1007/s10822-010-9384-y

Introduction
Scaffolds are one of the cornerstones of drug discovery. They are also referred to as templates, substructures, chemotypes, core structures or molecular frameworks. In hit discovery, the goal is to identify a series of molecules based on a scaffold that show good preliminary structure–activity relationships. In lead optimization, affinity and pharmacokinetic parameters are optimized by modulating functional groups attached to the scaffold, or even by amending elements of the scaffold. In lead hopping,