Adaptations of the Helix-Grip Fold for Ligand Binding and Catalysis in the START Domain Superfamily

Lakshminarayan M. Iyer, Eugene V. Koonin, and L. Aravind*

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland

ABSTRACT With a protein structure comparison, an iterative database search with sequence profiles, and a multiple-alignment analysis, we show that two domains with the helix-grip fold, the StAR-related lipid-transfer (START) domain of the MLN64 protein and the birch allergen, are homologous. They define a large, previously underappreciated superfamily that we call the START superfamily. In addition to the classical START domains that are primarily involved in eukaryotic signaling mediated by lipid binding and the birch antigen family that consists of plant proteins implicated in stress/pathogen response, the START superfamily includes bacterial polyketide cyclases/aromatases (e.g., TcmN and WhiE VI) and two families of previously uncharacterized proteins. The identification of this domain provides a structural prediction of an important class of enzymes involved in polyketide antibiotic synthesis and allows the prediction of their active site. It is predicted that all START domains contain a similar ligand-binding pocket. Modifications of this pocket determine the ligand-binding specificity and may also be the basis for at least two distinct enzymatic activities, those of a cyclase/aromatase and an RNase. Thus, the START domain superfamily is a rare case of the adaptation of a protein fold with a conserved ligand-binding mode for both a broad variety of catalytic activities and noncatalytic regulatory functions. Proteins 2001;43:134–144. Published 2001 Wiley-Liss, Inc.†

Key words: START; aromatase; cyclase; lipid binding; birch allergen

INTRODUCTION

The interaction of proteins with small molecules is the basis of enzymatic activity and its regulation.
The recognition of small molecules by proteins involves a variety of globular domains that catalyze specific reactions on them, use them as cofactors in other reactions, or bind them and undergo a conformational change in the process. Comparative analyses of protein sequences and structures have resulted in the elucidation of diverse modes of regulatory interactions between distinct protein domains and small molecules.¹⁻⁷ Several domains, such as the Per–Arnt–Sim domain (PAS),⁴ the cGMP phosphodiesterase–adenylyl cyclase–FhlA domain (GAF),⁵ the aspartokinase–chorismate mutase–TyrR domain (ACT),³ ATP-cone,⁸ the Pleckstrin homology domain (PH),⁹ and Sec14p,¹⁰ are specialized small-molecule-binding domains that regulate metabolic or signal transduction pathways based on interactions with specific ligands. Alternatively, some of these domains use the bound ligands as sensors for stimuli such as redox potential and light. Most of these domains have diversified in evolution to accommodate different ligands but retain some shared structural features, such as various amino acids for the ACT domain.³ On a very small number of occasions, variants of the same domain, such as the double-stranded β-helix¹¹ and USPA¹² domains, are used both for catalysis and for ligand binding. These cases are of particular interest for understanding the general principles of protein/small-molecule interactions and the evolutionary diversification of protein families for performing distinct functions. Here, we describe the StAR-related lipid-transfer (START) domain superfamily, which represents a case of exaptation of the same fold for the simple binding of different molecules and catalytic activity.

The START domain was initially identified as a widespread lipid-binding domain present in multicellular eukaryotes (plants and animals) with a regulatory role in signal transduction,¹³ analogous to the PH⁹ and Sec14p¹⁰ domains.
Representatives of the START domain family have been shown to bind different ligands such as sterols (steroidogenic acute regulatory protein) and phosphatidylcholine (PCTP).¹⁴,¹⁵ Ligand binding by the START domain can also regulate the activities of other domains that co-occur with the START domain in multidomain proteins, such as the Rho-GAP, homeodomain, and thioesterase domains.¹³ The subsequent solution of the crystal structure of the START domain from the MLN64 protein¹⁶ revealed that this domain adopts an α/β structure of the helix-grip [TATA binding protein (TBP)-like] fold, as classified in the Structural Classification of Proteins (SCOP) database.¹⁷ The examination of the START domain structure has also suggested the presence of a binding pocket that has been implicated in the accommodation of the lipid ligand.

A list of the GenBank identifier numbers of all the identified START domains, including the 59 plant proteins, is available at ftp://ncbi.nlm.nih.gov/pub/aravind/.
*Correspondence to: L. Aravind, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894. E-mail: aravind@ncbi.nlm.nih.gov
Received 26 September 2000; Accepted 14 December 2000
Published online 16 February 2001
PROTEINS: Structure, Function, and Genetics 43:134–144 (2001)
†This article is a US government work and, as such, is in the public domain in the United States of America.

A similarity between the structures