Identification of a Novel Archaebacterial Thioredoxin: Determination of Function through Structure

Sudeepa Bhattacharyya, Bahram Habibi-Nazhad, Godwin Amegbey, Carolyn M. Slupsky,§ Adelinda Yee,| Cheryl Arrowsmith,| and David S. Wishart*,‡

Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, Alberta, Canada T6G 2N8, Department of Biochemistry, University of Alberta, Edmonton, Alberta, Canada T6G 2H7, and Division of Molecular and Structural Biology, Ontario Cancer Institute, 610 University Avenue, Toronto, Ontario, Canada M5G 2M9

Received July 20, 2001; Revised Manuscript Received November 28, 2001

ABSTRACT: As part of a high-throughput, structural proteomic project we have used NMR spectroscopy to determine the solution structure and ascertain the function of a previously unknown, conserved protein (MtH895) from the thermophilic archaeon Methanobacterium thermoautotrophicum. Our findings indicate that MtH895 contains a central four-stranded β-sheet core surrounded by two helices on one side and a third on the other. It has an overall fold superficially similar to that of a glutaredoxin. However, detailed analysis of its three-dimensional structure along with molecular docking simulations of its interaction with T7 DNA polymerase (a thioredoxin-specific substrate) and comparisons with other known members of the thioredoxin/glutaredoxin family of proteins strongly suggest that MtH895 is more akin to a thioredoxin. Furthermore, measurement of the pKa values of its active site thiols along with direct measurements of the thioredoxin/glutaredoxin activity has confirmed that MtH895 is, indeed, a thioredoxin and exhibits no glutaredoxin activity. We have also identified a group of previously unknown proteins from several other archaebacteria that have significant (34-44%) sequence identity with MtH895. These proteins have unusual active site -CXXC- motifs not found in any known thioredoxin or glutaredoxin.
On the basis of the results presented here, we predict that these small proteins are all members of a new class of truncated thioredoxins.

The exponential growth in genome sequence data has placed increasing pressure on protein chemists to rapidly identify the function of many unknown or unclassified proteins. In cases where sequence comparisons fail to identify potential homologues or functional analogues, structural studies may go a long way toward revealing the function of the protein of interest (1-4). Because of the potential applications in functional classification, structural biologists are beginning to develop high-throughput X-ray and NMR methods for rapid functional and structural characterization of proteins. Indeed, a number of international structural genomics initiatives are now underway aimed at solving the structure (and identifying the function) of a large number of proteins from a variety of model organisms (4). One such model organism is the archaebacterium Methanobacterium thermoautotrophicum (ΔH) (MtH),¹ a small thermophilic bacterium first sequenced in 1996 (5). This particular archaeon was chosen for this pilot project not only for its phylogenetic uniqueness but also because it offered an opportunity to better understand the structural basis of the differential thermostability between thermophilic and mesophilic proteins (1). To date, more than a dozen protein structures have been solved and characterized for this particular organism (1).

The genome of M. thermoautotrophicum (MtH) contains about 1870 proteins, of which fewer than 50% have been assigned functions based on BLAST sequence analysis (6). MtH895 is a small 77-residue protein identified in 1999 as a conserved hypothetical protein with unassigned function. Because of its small size and good solution behavior, this protein was chosen for detailed structural analysis by NMR spectroscopy.
Here we report the high-resolution structure of MtH895 and the subsequent identification of this protein (through sequence, structural, and biochemical comparisons) as what appears to be the smallest known member of the thioredoxin family.

Supported by the Natural Sciences and Engineering Research Council (GP01957270), the Protein Engineering Network of Centres of Excellence (PENCE), and the Ontario Cancer Institute (OCI).
* To whom correspondence should be addressed. Tel: 780-492-0383. Fax: 780-492-5305. E-mail: david.wishart@ualberta.ca.
‡ Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta.
§ Department of Biochemistry, University of Alberta.
| Division of Molecular and Structural Biology, Ontario Cancer Institute.
¹ Abbreviations: BLAST, basic local alignment search tool; CE, combinatorial extension; DSS, 2,2-dimethyl-2-silapentane-5-sulfonic acid; DTNB, 5,5′-dithiobis(2-nitrobenzoic acid); DTT, dithiothreitol; EDTA, ethylenediaminetetraacetic acid; Grx, glutaredoxin; HSQC, heteronuclear single-quantum coherence spectroscopy; MD, molecular dynamics; MtH, Methanobacterium thermoautotrophicum (strain ΔH); NADPH, nicotinamide adenine dinucleotide phosphate (reduced); NMR, nuclear magnetic resonance; NOE, nuclear Overhauser effect; NOESY, nuclear Overhauser effect spectroscopy; PSI-BLAST, position-specific iterative BLAST; RMSD, root mean square deviation; SCOP, structural classification of proteins; TOCSY, total correlation spectroscopy; Trx, thioredoxin.

Biochemistry 2002, 41, 4760-4770. DOI: 10.1021/bi0115176. © 2002 American Chemical Society. Published on Web 03/22/2002.