STRUCTURE NOTE

Structure of HI1333 (YhbY), a Putative RNA-Binding Protein From Haemophilus influenzae

Mark A. Willis,¹ Wojciech Krajewski,¹ Vani Rao Chalamasetty,² Prasad Reddy,² Andrew Howard,³,⁴ and Osnat Herzberg¹*

¹ Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, Rockville, Maryland
² The National Institute of Standards and Technology, Gaithersburg, Maryland
³ Advanced Photon Source, Argonne National Laboratory, Argonne, Illinois
⁴ Illinois Institute of Technology, Chicago, Illinois

Introduction. The structures of a number of small α/β RNA-binding proteins with diverse biological functions are known.¹ Their topologies and the locations of the RNA-binding sites vary considerably, consistent with the plasticity of RNA due to base pair mismatches, bulges, and loops. Yet the protein-binding surfaces can be recognized because they are enriched with positively charged residues that either form salt bridges with the negatively charged RNA or contribute favorably to the electrostatic environment. Protein regions that exhibit conformational flexibility are also good candidates for RNA-protein interactions because binding is usually accompanied by some mutual conformational adjustments.¹ We have determined the crystal structure of HI1333 (YhbY) from Haemophilus influenzae, a protein annotated as hypothetical in sequence databases. We propose that this protein and its close sequence relatives (25 in the nonredundant sequence database at the time of writing) comprise a new class of RNA-binding proteins.

Materials and Methods. The gene encoding HI1333 was amplified from Haemophilus influenzae KW20 genomic DNA and cloned into pRE1² for expression in E. coli MZ1. Cells were grown in LB media containing ampicillin (50 µg/mL) at 32°C until the A₆₅₀ reached 0.4 and diluted with an equal volume of fresh LB media kept at 60°C, achieving 42°C, where protein induction occurs.
Cells were broken by passage through a French press, and cell debris was removed by centrifugation at 100,000 × g for 1 h. The protein was purified by a combination of DE-52 anion-exchange chromatography and cation exchange on a Shodex CM-2025 HPLC column. Fractions containing the protein were pooled, concentrated, and dialyzed against 20 mM NaPO₄, pH 7.0, 100 mM NaCl to achieve a final protein concentration of 18 mg/mL. The molecular weight of the protein was confirmed by MALDI-TOF mass spectrometry, and dynamic light scattering indicated that the protein is monomeric in solution.

Crystals of HI1333 belonging to space group P2₁ (with cell dimensions of a = 30.6 Å, b = 51.9 Å, c = 59.1 Å, β = 102.2° and two molecules in the asymmetric unit) appeared in a few days at room temperature in hanging-drop vapor diffusion experiments using 5 µL of protein (18 mg/mL in 20 mM NaPO₄, pH 7.0, 100 mM NaCl), 1.5 µL of 15% heptane-1,2,3-triol, and 3.5 µL of well solution (29–30% PEG 4000, 100 mM Tris, pH 8.0, 10 mM CaCl₂).

Diffraction data at 100 K on cryoprotected crystals (using perfluoropolyether oil, MW = 2800, and 32% PEG 4000, 100 mM Tris, pH 8.0, 15 mM CaCl₂, 6.1% heptane-1,2,3-triol, 15% glycerol) were collected on the IMCA-CAT beamlines (17-ID and 17-BM) at the Advanced Photon Source (Argonne National Laboratory, Argonne, IL). In addition to the 1.37 Å native data, two MAD data sets were collected at 1.85 Å. One set, for a platinum derivative, was obtained by soaking crystals in cryosolution augmented with 2 mM K₂PtCl₄ for 3 days before flash-cooling; the second set, for a bromide derivative, was obtained by soaking crystals for 90 s in a 1 M NaBr cryosolution. All data sets were processed with the HKL Suite³ and scaled by using SOLVE⁴ (MAD data) and XPREP⁵ (native data). Heavy atom sites were found by using SOLVE⁴ and CNS.⁶
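As a rough consistency check on the crystal data reported above, the following sketch computes the monoclinic unit-cell volume and a Matthews coefficient from the stated cell dimensions. The molecular mass is not given in the text, so the value used here (99 residues at an assumed average of about 110 Da each) is an illustrative assumption, as are the function names. The factor of two asymmetric units per cell follows from the two general positions of space group P2₁.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume in cubic angstroms of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(volume, n_sym_ops, n_mol_asu, mol_mass_da):
    """Matthews coefficient V_M: cell volume per dalton of protein in the cell."""
    return volume / (n_sym_ops * n_mol_asu * mol_mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from the common estimate 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / v_m


# Values reported in the text.
A, B, C, BETA_DEG = 30.6, 51.9, 59.1, 102.2
N_MOL_ASU = 2          # two molecules in the asymmetric unit
N_SYM_OPS = 2          # P2_1 has two general positions per unit cell

# ASSUMPTION: the molecular mass is not stated in the text.
# 99 residues at roughly 110 Da per residue is used for illustration only.
ASSUMED_MASS_DA = 99 * 110.0

volume = monoclinic_cell_volume(A, B, C, BETA_DEG)
v_m = matthews_coefficient(volume, N_SYM_OPS, N_MOL_ASU, ASSUMED_MASS_DA)
solvent = solvent_fraction(v_m)

print(f"Unit-cell volume: {volume:.0f} cubic angstroms")
print(f"Matthews coefficient: {v_m:.2f} cubic angstroms per Da")
print(f"Estimated solvent fraction: {solvent:.2f}")
```

Under the assumed mass, the cell volume is about 9.2 × 10⁴ Å³ and the Matthews coefficient is close to 2.1 Å³/Da, which would be consistent with two molecules in the asymmetric unit. The exact figures shift with the true molecular mass.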
The SOLVE phases derived from the combined MAD data sets were modified by the program RESOLVE⁷ and used to produce a high-quality electron density map, into which the two molecules in the asymmetric unit were built with the program O.⁸ Initial refinement using all the native data from 25.9 to 1.37 Å was performed with CNS,⁶ followed by refinement with SHELX-97⁹ using anisotropic displacement parameters. Native data processing statistics and refinement statistics are shown in Tables I and II, respectively.

Grant sponsor: National Institutes of Health; Grant number: P01 GM57890.
*Correspondence to: Osnat Herzberg, Center for Advanced Research in Biotechnology, 9600 Gudelsky Drive, Rockville, MD 20850. E-mail: osnat@carb.nist.gov
Received 24 June 2002; Accepted 26 June 2002
Published online 00 Month 2002 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.10225
PROTEINS: Structure, Function, and Genetics 49:423–426 (2002)
© 2002 WILEY-LISS, INC.

Results and Discussion. HI1333 is a 99 amino acid α/β protein consisting of a four-stranded mixed β-sheet sandwiched between two helices on one side and one helix on the other [Fig. 1(a)]. The packing of molecules in the