Review
TRENDS in Genetics Vol.17 No.10 October 2001
http://tig.trends.com 0168-9525/01/$ – see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0168-9525(01)02447-7

Deletional bias and the evolution of bacterial genomes

Alex Mira, Howard Ochman* and Nancy A. Moran
Dept of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721, USA.
*e-mail: hochman@email.arizona.edu

Although bacteria increase their DNA content through horizontal transfer and gene duplication, their genomes remain small and, in particular, lack nonfunctional sequences. This pattern is most readily explained by a pervasive bias towards higher numbers of deletions than insertions. When selection is not strong enough to maintain them, genes are lost in large deletions or inactivated and subsequently eroded. Gene inactivation and loss are particularly apparent in obligate parasites and symbionts, in which dramatic reductions in genome size can result not from selection to lose DNA, but from decreased selection to maintain gene functionality. Here we discuss the evidence showing that deletional bias is a major force that shapes bacterial genomes.

When compared with eukaryotes, bacteria, including eubacteria and archaebacteria, accommodate a rather narrow range of variation in genome size. Whereas eukaryotic genomes vary in size by four orders of magnitude (from about 10^7–10^11 basepairs), there is only about one order-of-magnitude difference across bacterial genome sizes [1–3]. However, the difference in the ranges of genome size in eukaryotes and bacteria is not reflected in corresponding differences in gene number. Unlike eukaryotes, the genome size variation in bacteria translates almost directly into biochemical, physiological and organismal complexity because the majority of sequences are functional protein-coding regions (Fig. 1). Among bacteria for which complete genomic sequences are available, a tenfold variation in genome size is reflected by a similar difference in total gene number [4,5] (Table 1). By contrast, yeast and humans have genomes that differ by almost 300-fold in size, yet they have only a sixfold difference in gene number [6–8].

[Fig. 1. Association between genome size and gene number in bacteria. Axes: genome size (kb) and total number of genes; labeled point: M. leprae. Numbers include protein-coding and RNA genes (R^2 = 0.945). When the number of annotated pseudogenes is added to the number of functional genes, Mycobacterium leprae falls on the regression line. Taxa are listed in Table 1.]

What is the source of variation in genome size in bacteria? On the basis of the distribution of genome sizes and the orientation of apparently duplicated genes, it was once thought that new bacterial genomes evolved by repeated events of genome doubling [1,9,10]. However, subsequent analyses of additional genomes provided several lines of evidence against this hypothesis. First, related bacteria having genomes of similar sizes often contain very different complements of genes, and arrangements of duplicated genes are not consistent across taxa [11,12]. Second, the variation in