Environmental Microbiology (2005) 7(2), 149–152 doi:10.1111/j.1462-2920.2005.00774.x
© 2005 Society for Applied Microbiology and Blackwell Publishing Ltd
(No claim to original US government works)
*For correspondence. E-mail galperin@ncbi.nlm.nih.gov; Tel.
(+1) 301 435 5910; Fax (+1) 301 435 7794.
Genomics update
Life is not defined just in base pairs
Michael Y. Galperin*
National Center for Biotechnology Information, National
Library of Medicine, National Institutes of Health,
Bethesda, MD 20894, USA
This year marks the 10th anniversary of the completion of
the Haemophilus influenzae genome sequence (Fleischmann
et al., 1995), which ushered in the
genome era. This field has seen breathtaking progress
ever since with more than 200 complete microbial
genomes available in public databases by the end of
2004. Arguably, the most visible accomplishment of
genome sequencing has been completion of the human
genome, announced recently by the International Human
Genome Sequencing Consortium (2004). Although the
practical completion of the human genome was already
announced in 2001 (Lander et al., 2001), and reports
that it is ‘practically finished’ have been appearing on a
regular basis, the current version covers 2.85 billion
bases, which is ~99% of the euchromatic part of the human
genome, and contains just 341 gaps. It is also extremely
accurate, with an estimated error rate of 1 base per 10^5.
While humans can hardly be considered an object of environmental
microbiology, they have an enormous environmental
impact, which justifies listing this genome in Table 1. Furthermore,
the human genome already serves as a reference
point in genome analyses of other eukaryotic organisms.
Comparisons of human and chimp genomes (Clark et al.,
2003; Watanabe et al., 2004) have been particularly
instructive. The importance of the human genome sequence
will increase even further once we get a better idea of the
protein set that it encodes.
A clear benefit of microbial genome sequencing has
been a much better understanding of the minimal requirements
for cellular life. We now know, for example, that the
bacterium Mycoplasma genitalium has a single chromosome,
which consists of 580 074 base pairs and carries genes
for three ribosomal RNAs (5S, 16S, and 23S), 36 tRNAs,
and 478 proteins (Fraser et al., 1995). We also know that
about a hundred of the protein-coding genes can be dis-
rupted without impairing the ability of this bacterium to
grow on a synthetic peptide-rich broth containing the nec-
essary nutrients (Hutchison et al., 1999), suggesting that
the truly minimal gene set necessary for cellular life might
be even smaller, in the 300–350 gene range (Mushegian
and Koonin, 1996; Koonin, 2000; Peterson and Fraser,
2001). Furthermore, we know that the cell of Aquifex aeoli-
cus with its 1521 protein-coding genes is capable of
autonomous, autotrophic existence in the environment,
requiring for growth only hydrogen, oxygen, carbon diox-
ide, and mineral salts (Deckert et al., 1998). It would seem
that 60 years after Erwin Schrödinger wrote his book
‘What is Life?’ we should finally be able to answer the
question. However, Nature never ceases to challenge the
limits of our imagination. First there were highly degraded
genomes of the bacteria Buchnera sp., Wigglesworthia
sp., and Blochmannia sp., which are obligate intracellular
symbionts of insect cells and function as organelles rather
than as separate organisms (Andersson, 2000). The cryp-
tomonad Guillardia theta, in addition to its own nucleus, a
chloroplast and a mitochondrion, contains a nucleomorph
– the vestigial nucleus of a former algal endosymbiont – with
a 551-kb genome. The archaeon Nanoarchaeum equitans
only grows in a mixed culture with another archaeon,
Ignicoccus sp., and appears even to borrow its membrane
lipids (Jahn et al., 2004). Still, all these genomes encode
their own transcription and translation systems that certify
them as (former?) ‘organisms’. To challenge our defini-
tions of life even further, here comes the discovery of a
virus that raises all sorts of uneasy questions. Mimivirus
was first described in the amoeba Acanthamoeba polyphaga
as particles 400–800 nm in diameter with the characteristic
icosahedral capsid morphology; it was assumed to have
a double-stranded DNA genome of ~800 kb, larger
than that of any virus seen before (La Scola et al., 2003).
Sequencing of its genome revealed that it is actually even
larger, almost 1.2 Mb, and carries 1262 predicted genes,
911 of them protein-coding (Raoult et al., 2004). The gene
list includes most of the genes typical for the nucleocytoplasmic
large DNA viruses, such as poxviruses. However,
mimivirus additionally encodes proteins involved in trans-
lation and DNA repair, chaperonins and metabolic
enzymes, many of which have never been seen in viruses.
These include, among others, Arg-, Cys-, Met- and Tyr-
specific aminoacyl-tRNA synthetases, translation initiation