STRUCTURE NOTE
Structure of HI1333 (YhbY), a Putative RNA-Binding
Protein From Haemophilus influenzae
Mark A. Willis,¹ Wojciech Krajewski,¹ Vani Rao Chalamasetty,² Prasad Reddy,² Andrew Howard,³,⁴ and Osnat Herzberg¹*
¹ Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, Rockville, Maryland
² The National Institute of Standards and Technology, Gaithersburg, Maryland
³ Advanced Photon Source, Argonne National Laboratory, Argonne, Illinois
⁴ Illinois Institute of Technology, Chicago, Illinois
Introduction. The structures of a number of small α/β RNA-binding proteins with diverse
biological functions are known.[1] Their topologies and the locations of the RNA-binding
sites vary considerably, consistent with the plasticity of RNA due to base pair mismatches,
bulges, and loops. Yet the binding surfaces on the proteins can be recognized because they
are enriched with positively charged residues that either form salt bridges with the
negatively charged RNA or contribute favorably to the electrostatic environment. Protein
regions that exhibit conformational flexibility are also good candidates for RNA-protein
interactions because binding is usually accompanied by some mutual conformational
adjustments.[1] We have determined the crystal structure of HI1333 (YhbY) from Haemophilus
influenzae, a protein annotated as hypothetical in sequence databases. We propose that this
protein and its close sequence relatives (25 in the nonredundant sequence database at the
time of writing) comprise a new class of RNA-binding proteins.
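The enrichment in positively charged residues described above can be illustrated with a crude sequence-level proxy. The sketch below counts Lys/Arg against Asp/Glu and finds the most positively charged window in a sequence. It is only an illustration: the function names and the toy sequence are hypothetical (the HI1333 sequence is not given here), histidine is ignored, and a real analysis would examine the electrostatic potential on the three-dimensional surface rather than the linear sequence.

```python
POSITIVE = set("KR")  # Lys, Arg: positively charged near neutral pH
NEGATIVE = set("DE")  # Asp, Glu: negatively charged near neutral pH


def net_charge(seq: str) -> int:
    """Approximate net charge: (Lys + Arg) minus (Asp + Glu); His is ignored."""
    seq = seq.upper()
    positives = sum(aa in POSITIVE for aa in seq)
    negatives = sum(aa in NEGATIVE for aa in seq)
    return positives - negatives


def most_positive_window(seq: str, width: int = 10) -> tuple[int, int]:
    """Return (start index, net charge) of the most positive window.

    Ties are resolved in favor of the earliest window.
    """
    seq = seq.upper()
    if width <= 0 or width > len(seq):
        raise ValueError("width must be between 1 and len(seq)")
    starts = range(len(seq) - width + 1)
    best = max(starts, key=lambda i: net_charge(seq[i:i + width]))
    return best, net_charge(seq[best:best + width])


# Hypothetical toy sequence, NOT the HI1333 sequence.
TOY_SEQUENCE = "DDDDKRKRKRDDDD"

start, charge = most_positive_window(TOY_SEQUENCE, width=6)
print(f"most positive 6-residue window starts at {start}, net charge {charge:+d}")
```

A window with a high net positive charge is only a candidate region; whether it lies on the protein surface has to be checked against the structure.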
Materials and Methods. The gene encoding HI1333 was amplified from Haemophilus influenzae
KW20 genomic DNA and cloned into pRE1[2] for expression in E. coli MZ1. Cells were grown in
LB medium containing ampicillin (50 µg/mL) at 32°C until the A₆₅₀ reached 0.4 and were then
diluted with an equal volume of fresh LB medium kept at 60°C, bringing the culture to 42°C,
the temperature at which protein expression is induced. Cells were broken by passage
through a French press, and cell debris was removed by centrifugation at 100,000 × g for
1 h. The protein was purified by a combination of DE-52 anion-exchange chromatography and
cation exchange on a Shodex CM-2025 HPLC column. Fractions containing the protein were
pooled, concentrated, and dialyzed against 20 mM NaPO₄, pH 7.0, 100 mM NaCl to achieve a
final protein concentration of 18 mg/mL. The molecular weight of the protein was confirmed
by MALDI-TOF mass spectrometry, and dynamic light scattering indicated that the protein is
monomeric in solution.
Crystals of HI1333 belonging to space group P2₁ (with cell dimensions of a = 30.6 Å,
b = 51.9 Å, c = 59.1 Å, β = 102.2° and two molecules in the asymmetric unit) appeared in a
few days at room temperature in hanging-drop vapor diffusion experiments using 5 µL of
protein (18 mg/mL in 20 mM NaPO₄, pH 7.0, 100 mM NaCl), 1.5 µL of 15% heptane-1,2,3-triol,
and 3.5 µL of well solution (29–30% PEG 4000, 100 mM Tris, pH 8.0, 10 mM CaCl₂).
Diffraction data at 100 K on cryoprotected crystals (using perfluoropolyether oil
MW = 2800 and 32% PEG 4000, 100 mM Tris, pH 8.0, 15 mM CaCl₂, 6.1% heptane-1,2,3-triol,
15% glycerol) were collected on the IMCA-CAT beamlines (17-ID and 17-BM) at the Advanced
Photon Source (Argonne National Laboratory, Argonne, IL). In addition to the 1.37 Å native
data, two MAD data sets were collected at 1.85 Å. One set for a platinum derivative was
obtained by soaking crystals in cryosolution augmented with 2 mM K₂PtCl₄ for 3 days before
flash-cooling, and the second set for a bromide derivative was obtained by soaking crystals
for 90 s in a 1 M NaBr cryosolution.
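As a rough consistency check on the crystal parameters quoted above, the following sketch computes the monoclinic unit-cell volume and a Matthews coefficient. The cell dimensions, the space group, and the two molecules per asymmetric unit come from the text. Everything else is an assumption made for illustration: the molecular weight is approximated as 99 residues at an average of 110 Da each (the measured mass is not quoted here), the constant 1.23 corresponds to a typical protein partial specific volume, and the function names are hypothetical.

```python
import math


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Unit-cell volume in cubic angstroms: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(volume: float, mol_weight: float, molecules_per_cell: int) -> float:
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return volume / (mol_weight * molecules_per_cell)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction, assuming a typical protein density (1.23 / Vm rule)."""
    return 1.0 - 1.23 / vm


# Values taken from the text.
A, B, C, BETA = 30.6, 51.9, 59.1, 102.2
MOLECULES_PER_ASU = 2
ASU_PER_CELL = 2  # space group P2_1 has two symmetry-equivalent positions

# Assumption: 99 residues at an average of 110 Da per residue.
APPROX_MOL_WEIGHT = 99 * 110.0

volume = monoclinic_cell_volume(A, B, C, BETA)
vm = matthews_coefficient(volume, APPROX_MOL_WEIGHT, MOLECULES_PER_ASU * ASU_PER_CELL)
solvent = solvent_fraction(vm)

print(f"cell volume      : {volume:,.0f} A^3")
print(f"Matthews coeff.  : {vm:.2f} A^3/Da")
print(f"solvent fraction : {solvent:.0%}")
```

Under these assumptions the Matthews coefficient comes out near 2.1 Å³/Da, which falls within the range commonly seen for protein crystals and is compatible with two molecules in the asymmetric unit.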
All data sets were processed with the HKL Suite[3] and scaled by using SOLVE[4] (MAD data)
and XPREP[5] (native data). Heavy atom sites were found by using SOLVE[4] and CNS.[6] The
SOLVE phases derived from the combined MAD data sets were modified by the program
RESOLVE[7] and used to produce a high-quality electron density map into which the two
molecules in the asymmetric unit were built with use of the program O.[8] Initial
refinement using all the native data from 25.9 to 1.37 Å was performed with CNS, which was
followed by refinement with SHELX-97[9] using anisotropic displacement parameter
refinement. Native data processing statistics and refinement statistics are shown in
Tables I and II, respectively.
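To give a feel for how much data the 25.9 to 1.37 Å range provides for anisotropic refinement, the sketch below estimates the number of unique reflections expected for the unit cell quoted earlier. It counts reciprocal-lattice points in a spherical shell and divides by the four symmetry-equivalent reflections of Laue group 2/m. This is a textbook approximation, not the value reported in Table I, and the function name is hypothetical.

```python
import math


def estimate_unique_reflections(cell_volume: float, d_min: float, d_max: float,
                                laue_multiplicity: int) -> float:
    """Estimate the number of unique reflections between two resolution limits.

    The number of reciprocal-lattice points inside a sphere of radius 1/d is
    approximately (4/3) * pi * V / d**3; dividing by the Laue-group order
    removes symmetry-equivalent reflections (including Friedel mates).
    """
    points_in_shell = (4.0 * math.pi / 3.0) * cell_volume * (1.0 / d_min**3 - 1.0 / d_max**3)
    return points_in_shell / laue_multiplicity


# Cell dimensions and resolution limits from the text (monoclinic, space group P2_1).
cell_volume = 30.6 * 51.9 * 59.1 * math.sin(math.radians(102.2))
LAUE_2_M_ORDER = 4  # Laue group 2/m

n_unique = estimate_unique_reflections(cell_volume, d_min=1.37, d_max=25.9,
                                       laue_multiplicity=LAUE_2_M_ORDER)
print(f"estimated unique reflections: {n_unique:,.0f}")
```

The estimate is on the order of 37,000 unique reflections for complete data. Each anisotropically refined atom carries nine parameters (three positional and six displacement), so the resulting observation-to-parameter ratio depends on the number of atoms in the model, which is not stated in this section.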
Results and Discussion. HI1333 is a 99 amino acid α/β protein consisting of a four-stranded
mixed β-sheet sandwiched between two helices on one side and one helix on the other
[Fig. 1(a)]. The packing of molecules in the
Grant sponsor: National Institutes of Health; Grant number: P01
GM57890.
*Correspondence to: Osnat Herzberg, Center for Advanced Research in Biotechnology, 9600 Gudelsky Drive, Rockville, MD 20850.
E-mail: osnat@carb.nist.gov
Received 24 June 2002; Accepted 26 June 2002
Published online 00 Month 2002 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.10225
PROTEINS: Structure, Function, and Genetics 49:423–426 (2002)
© 2002 WILEY-LISS, INC.