TRENDS in Microbiology Vol.10 No.2 February 2002 http://tim.trends.com 0966-842X/01/$ – see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0966-842X(01)02293-4

Review

Robert A. Edwards* University of Tennessee Health Sciences Center, MSB 101, 858 Madison Ave, Memphis, TN 38163, USA. *e-mail: redwards@utmem.edu
Gary J. Olsen, Stanley R. Maloy Dept of Microbiology, University of Illinois at Urbana-Champaign, 601 S. Goodwin Ave, Urbana, IL 61801, USA.

Microbial genomics initially focused on sequencing genomes from diverse parts of the evolutionary tree. This approach allowed broad comparisons of distantly related species, and provided novel insights into bacterial evolution [1–3]. Subsequently, by various combinations of accident, competition and deliberate design, the sequences of multiple closely related genomes have been determined (Table 1, published as supplementary information online at http://archive.bmn.com/supp/tim/edwards.pdf). Comparisons of closely related genomes provide novel insights into evolutionary forces acting over a shorter time scale [4,5].

Comparisons of multiple Salmonella enterica genomes are particularly revealing because (1) the sequence data from multiple serovars are publicly available; (2) the serovars are very closely related; (3) the serovars have unique features that affect their virulence; (4) a great deal is known about the pathogenesis, genetics and biochemistry of some serovars; and (5) the plethora of genetic and biochemical tools available for Salmonella allow facile biological tests of predictions made in silico.

This review will focus on comparative genomics of Salmonella serovars. We will briefly review the rationale behind sequencing different serovars, and discuss regions of similarity and differences among the salmonellae.
Using select examples, we will discuss the sequence variation, acquisition of sequences via gene transfer, and evolution of the serovars, questions that will become the focus of comparative genomics between serovars in the post-genomic era.

Approaches for comparison of Salmonella genomes

A major goal of comparative analysis of Salmonella genomes is to identify the genetic similarities and differences responsible for the unique virulence attributes of these closely related bacterial pathogens. Several other methods that do not require genome sequence comparisons have been used to identify unique sequences within the genomes of related bacteria. For example, several genes unique to different Salmonella enterica genomes have been identified by subtractive hybridization [6,7]. An alternative technique used physical mapping to identify insertions or deletions from the chromosome; however, this technique only identifies differences resulting from large chromosomal insertions and deletions [8].

Brute-force genome sequencing approaches have recently become an economically viable alternative to these biochemical approaches. Owing to the low cost of high-throughput DNA sequencing, comparative genome sequencing is both a cheaper and more efficient way of identifying all of the genetic differences between closely related bacteria. Genome sequencing also enables the use of microarray approaches that provide simple ways to compare larger numbers of similar genomes. An array is constructed based upon the DNA sequence of one genome and is then hybridized with labeled probes from the second genome. Spots of the array that do not hybridize identify sequences missing from the second genome. These individual spots can be targeted for mutation, sequencing or genetic transfer.
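The microarray comparison described above can be illustrated with a minimal sketch. It assumes each spot has been measured with two signals (one from the reference genome used to build the array, one from the second genome) and that a spot is called "missing" when the log2 ratio of the two falls below a cutoff. The function name, the spot identifiers, the intensity values and the cutoff of -1.0 are all hypothetical choices made for illustration; none of them comes from the studies reviewed here.

```python
import math
from typing import Dict, List, Tuple


def absent_spots(
    signals: Dict[str, Tuple[float, float]],
    cutoff: float = -1.0,
) -> List[str]:
    """Return the IDs of spots that failed to hybridize with the second genome.

    signals maps a spot ID to (reference_signal, test_signal); both values
    must be positive. The cutoff is an illustrative assumption, not a
    published threshold.
    """
    absent = []
    for spot, (reference, test) in signals.items():
        # log2 ratio near 0: sequence present in both genomes.
        # Strongly negative ratio: little or no hybridization by the test genome.
        log_ratio = math.log2(test / reference)
        if log_ratio < cutoff:
            absent.append(spot)
    return sorted(absent)


# Hypothetical intensities for four spots on an array built from one genome.
signals = {
    "spot_001": (1000.0, 950.0),
    "spot_002": (1200.0, 60.0),
    "spot_003": (800.0, 500.0),
    "spot_004": (1500.0, 150.0),
}

missing = absent_spots(signals)
print(missing)
```

With these made-up values, spot_002 and spot_004 are flagged as candidates for sequences missing from the second genome; in practice such calls would be followed up by the mutation, sequencing or genetic transfer experiments mentioned above.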
Properties of Salmonella serovars

Salmonella enterica includes several closely related serovars that cause disease in humans and animals (Box 1). Long before the advent of full-genome sequencing, hybridization analysis indicated that the ~2100 Salmonella serovars share >90% DNA content [9]. These studies also indicated that most genes are present in Salmonella's nearest cousin, Escherichia coli, with between 80% and 85% identity between corresponding genes [9,10]. Genome sequencing demonstrated that the median homology between E. coli and Salmonella genomes is 80% [11].

Although the Salmonella serovars are closely related, there are important differences between them. Many serovars have different host ranges or cause distinct disease symptoms in different hosts [11,12]. Some Salmonella serovars are 'generalists', infecting a wide variety of animals; for example, Salmonella enterica Typhimurium and Enteritidis infect humans, mice and chickens, causing gastroenteritis in humans, a systemic infection in mice and an asymptomatic chronic infection in chickens. Other serovars are host-adapted, infecting only a few species; for example, Salmonella enterica Choleraesuis primarily infects swine and Salmonella enterica Dublin primarily infects cattle, although these serovars rarely cause disease in other animals.

As the number of completed genome sequences increases, there is increasing emphasis on comparative genomic analysis of closely related organisms. Comparison of the similarities and differences between the five publicly available Salmonella genome sequences reveals extensive sequence conservation among the Salmonella serovars. However, horizontal gene transfer has provided each genome with between 10% and 12% of unique DNA. Genome comparisons of the closely related salmonellae emphasize the insights that can be gleaned from sequencing genomes of a single species.
Comparative genomics of closely related salmonellae

Robert A. Edwards, Gary J. Olsen and Stanley R. Maloy