Genome Sequences of Escherichia coli B strains
REL606 and BL21(DE3)
Haeyoung Jeong¹, Valérie Barbe², Choong Hoon Lee¹,³, David Vallenet²,
Dong Su Yu¹, Sang-Haeng Choi¹, Arnaud Couloux², Seung-Won Lee¹,
Sung Ho Yoon¹, Laurence Cattolico², Cheol-Goo Hur¹,⁴, Hong-Seog Park¹,⁴,
Béatrice Ségurens², Sun Chang Kim³, Tae Kwang Oh¹,⁵, Richard E. Lenski⁶,
F. William Studier⁷*, Patrick Daegelen²,⁸* and Jihyun F. Kim¹,⁴*
Escherichia coli K-12 and B have been the subjects of classical experiments
from which much of our understanding of molecular genetics has emerged.
We present here complete genome sequences of two E. coli B strains, REL606,
used in a long-term evolution experiment, and BL21(DE3), widely used to
express recombinant proteins. The two genomes differ in length by 72,304 bp
and have 426 single base pair differences, a seemingly large difference for
laboratory strains having a common ancestor within the last 67 years.
Transpositions by IS1 and IS150 have occurred in both lineages. Integration
of the DE3 prophage in BL21(DE3) apparently displaced a defective
prophage in the λ attachment site of B. As might have been anticipated
from the many genetic and biochemical experiments comparing B and K-12
over the years, the B genomes are similar in size and organization to the
genome of E. coli K-12 MG1655 and have >99% sequence identity over ∼92%
of their genomes. E. coli B and K-12 differ considerably in distribution of IS
elements and in location and composition of larger mobile elements. An
unexpected difference is the absence of a large cluster of flagella genes in B,
due to a 41 kbp IS1-mediated deletion. Gene clusters that specify the LPS
core, O antigen, and restriction enzymes differ substantially, presumably
because of horizontal transfer. Comparative analysis of 32 independently
isolated E. coli and Shigella genomes, both commensals and pathogenic
strains, identifies a minimal set of genes in common plus many strain-specific
genes that constitute a large E. coli pan-genome.
© 2009 Elsevier Ltd. All rights reserved.
¹ Korea Research Institute of Bioscience and Biotechnology (KRIBB), 111 Gwahangno, Yuseong, Daejeon 305-806, Korea
² CNRS UMR 8030, Genoscope (CEA), 2 rue Gaston Crémieux, CP 5706, 91000 Evry Cedex, France
³ Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea
⁴ Functional Genomics Program, University of Science and Technology, Yuseong, Daejeon 305-333, Korea
⁵ 21C Frontier Microbial Genomics and Applications Center, Yuseong, Daejeon 305-806, Korea
⁶ Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI 48824, USA
⁷ Biology Department, Brookhaven National Laboratory, P.O. Box 5000, Upton, NY 11973-5000, USA
⁸ Inserm, 101 rue de Tolbiac, 75013 Paris, France
*Corresponding authors. E-mail addresses: jfk@kribb.re.kr; daegelen@genoscope.cns.fr; studier@bnl.gov.
Abbreviations used: SNP, single base pair difference; LPS, lipopolysaccharide.
doi:10.1016/j.jmb.2009.09.052 J. Mol. Biol. (2009) 394, 644–652
0022-2836/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.