TRENDS in Biochemical Sciences Vol.26 No.4 April 2001 215
http://tibs.trends.com 0968-0004/01/$ – see front matter. Published by Elsevier Science Ltd. PII: S0968-0004(01)01826-6

Research Update

THUMP – a predicted RNA-binding domain shared by 4-thiouridine, pseudouridine synthases and RNA methylases

L. Aravind and Eugene V. Koonin

Sequence profile searches were used to identify an ancient domain in ThiI-like thiouridine synthases, conserved RNA methylases, archaeal pseudouridine synthases and several uncharacterized proteins. We predict that this domain is an RNA-binding domain that adopts an α/β fold similar to that found in the C-terminal domain of translation initiation factor 3 and ribosomal protein S8.

The biochemical diversity and hence the functional versatility of cellular RNAs is vastly extended by numerous nucleotide modifications. These modifications range from simple methylation of the primary bases to the formation of a variety of atypical bases such as pseudouridine, archaeosine, thiouridines, wyosine and queuosine [1–3]. A combination of sequence analysis and experimental studies has identified several classes of enzymes involved in base modification [4–8]. A common feature of many of these in situ RNA-base-modifying enzymes is the fusion of specific RNA-binding domains, such as S4, PUA and NusB, to the respective catalytic domains [8]. These RNA-binding domains have also been detected in other proteins involved in translation and RNA processing [8]. Here, we show that the 4-thiouridine biosynthesis enzyme ThiI (Ref. 9) shares a previously unknown domain with other RNA-modifying enzymes, including pseudouridine synthases and RNA methylases, and predict that this domain binds RNA.

4-Thiouridine (s4U) is a modified base that is present in bacterial and archaeal tRNAs; in Escherichia coli, the formation of s4U is catalyzed by the IscS and ThiI enzymes, which also participate in thiamine biosynthesis [10–12].
The ThiI protein contains a central PP-loop ATPase domain [13] and a C-terminal rhodanese-like domain [14,15], which together catalyze the formation of s4U. This occurs via adenylylation of the 4-carbonyl group of uridine, followed by sulfur insertion through nucleophilic attack on the adenylylated carbonyl group [12]. No function has been assigned to the N-terminal portion of the ThiI protein.

A PSI-BLAST [16] search of the non-redundant protein database (NCBI), run with an expect (E)-value threshold of 0.01 for inclusion of sequences into the profile and with the N-terminal region of ThiI (GenBank GI:1773107, residues 57–169) as the query, detected similar sequences in a variety of predicted RNA methylases from archaea, eukaryotes and bacteria, and in several uncharacterized proteins. Further PSI-BLAST searches exploiting a profile constructed from all these sequences additionally detected a similar region in an archaea-specific family of predicted pseudouridine synthases (PSUSs) typified by the MJ0421 protein from Methanococcus jannaschii. However, this conserved region was not detectable in other thiouridine synthases such as TrmU, or in MiaB, which is involved in the synthesis of thiolated adenine derivatives. Thus, this conserved region is shared by enzymes that are predicted to carry out at least three unrelated types of RNA modification, namely methylation, pseudouridylation and thiouridylation, and can be predicted to define a previously undetected domain involved in RNA metabolism. We named this domain THUMP after thiouridine synthases, methylases and PSUSs. The THUMP domain consists of 100–110 amino acid residues, and a multiple-alignment-based secondary-structure prediction using the PHD program [17] revealed an α/β fold (Fig. 1).
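The logic of the iterative profile search described above can be illustrated with a toy sketch. This is not PSI-BLAST itself: the `search` callable, the sequence labels and all E-values below are hypothetical stand-ins, and only the 0.01 inclusion threshold is taken from the text. The sketch shows why a family such as the archaeal PSUSs can become detectable only after earlier hits have been added to the profile.

```python
from typing import Callable, Dict, Set

INCLUSION_E = 0.01  # inclusion threshold quoted in the text


def iterate_profile(query: str,
                    search: Callable[[Set[str]], Dict[str, float]],
                    max_rounds: int = 10) -> Set[str]:
    """Toy PSI-BLAST-style loop (an illustration, not the real program).

    `search` is a hypothetical callable that takes the current profile
    members and returns a mapping of sequence name -> E-value.
    """
    profile = {query}
    for _ in range(max_rounds):
        hits = search(profile)
        # Only hits scoring below the inclusion threshold join the profile.
        new = {name for name, e in hits.items() if e < INCLUSION_E} - profile
        if not new:  # converged: no new sequences were added
            break
        profile |= new
    return profile


def mock_search(profile: Set[str]) -> Dict[str, float]:
    """Hypothetical database search with made-up E-values."""
    hits = {"ThiI_Nterm": 1e-30, "methylase_A": 1e-4, "unrelated": 5.0}
    # Assumption for illustration: the PSUS family only scores well
    # once a methylase sequence has broadened the profile.
    if "methylase_A" in profile:
        hits["PSUS_MJ0421"] = 2e-3
    return hits


members = iterate_profile("ThiI_Nterm", mock_search)
```

In this made-up example the loop converges after the second round has added the PSUS sequence, and the hit with an E-value of 5.0 never enters the profile.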
The succession of the predicted secondary-structure elements in the THUMP domain is identical to that in the experimentally determined structures of two proteins, the C-terminal domain of the translation initiation factor 3 (IF3-C) and the N-terminal domain of ribosomal protein S8. Sequence-structure threading using the hybrid fold recognition method [18] and the ThiI THUMP domain as query recovers IF3-C (PDB: 1TIG) as the best hit. Thus, despite the lack of detectable sequence similarity, the THUMP domain probably shares a common structural fold with these proteins that are involved in translation. The architectures of the THUMP-domain-containing proteins are generally analogous to those of the S4-, PUA- and NusB-domain-containing proteins, which combine a variety of catalytic domains with an RNA-binding domain [8]. Together, these observations suggest that THUMP is an RNA-binding domain that mediates the delivery of various RNA-modifying activities to their target RNAs. ThiI-like s4U synthases are common in bacteria and archaea, but are apparently absent from eukaryotes, which is consistent with the presence of this modification only in prokaryotic tRNAs (Ref. 2). The C-terminal rhodanese-like domain is present only in the ThiI proteins from proteobacteria and Thermoplasma; in other organisms this activity is probably supplied by a distinct, stand-alone version of this domain (Fig. 2). The PSUSs that contain the THUMP domain are found only in the archaea, and the THUMP domains in these proteins differ from all other versions in having a C4-Zn-finger insertion near the N-terminus (Figs 1,2). In most archaea, the gene encoding this PSUS is adjacent to a large operon that contains genes for several ribosomal proteins, which suggests that this PSUS modifies rRNA. The THUMP-domain-containing predicted RNA methylases comprise two families, one of which (typified by mouse ROSA26AS and YpsC from Bacillus subtilis) is conserved in bacteria, archaea and eukaryotes.
Together with the presence, in the methyltransferase domain, of an (ND)PPY signature characteristic of adenine methylases, this suggests that these enzymes methylate an adenine in the conserved core of rRNA. In proteobacteria, this methylase is fused to a second predicted purine methylase domain (Fig. 2), which indicates that these proteins catalyze two distinct RNA modifications. The other family of