Environmental Microbiology (2005) 7(2), 149–152 doi:10.1111/j.1462-2920.2005.00774.x
© 2005 Society for Applied Microbiology and Blackwell Publishing Ltd (No claim to original US government works)

Genomics update

Life is not defined just in base pairs

Michael Y. Galperin*
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

*For correspondence. E-mail galperin@ncbi.nlm.nih.gov; Tel. (+1) 301 435 5910; Fax (+1) 301 435 7794.

This year marks the 10th anniversary of the completion of the Haemophilus influenzae genome sequence (Fleischmann et al., 1995), which opened the genome era. The field has seen breathtaking progress ever since, with more than 200 complete microbial genomes available in public databases by the end of 2004. Arguably, the most visible accomplishment of genome sequencing has been the completion of the human genome, announced recently by the International Human Genome Sequencing Consortium (2004). Although the practical completion of the human genome was announced back in 2001 (Lander et al., 2001), and reports that it is ‘practically finished’ have been appearing on a regular basis, the current version covers 2.85 billion bases, which is ~99% of the euchromatic part of the human genome, and contains just 341 gaps. It is also extremely accurate, with an estimated error rate of 1 base per 10^5. While humans can hardly be considered an object of environmental microbiology, they have an enormous environmental impact, which justifies listing this genome in Table 1. Furthermore, the human genome already serves as a reference point in genome analyses of other eukaryotic organisms.
Comparisons of human and chimp genomes (Clark et al., 2003; Watanabe et al., 2004) have been particularly instructive. The importance of the human genome sequence will increase even further once we get a better idea of the protein set that it encodes.

A clear benefit of microbial genome sequencing has been a much better understanding of the minimal requirements for cellular life. We now know, for example, that the bacterium Mycoplasma genitalium has a single chromosome, which consists of 580 074 base pairs and carries genes for three ribosomal RNAs (5S, 16S, and 23S), 36 tRNAs, and 478 proteins (Fraser et al., 1995). We also know that about a hundred of the protein-coding genes can be disrupted without impairing the ability of this bacterium to grow on a synthetic peptide-rich broth containing the necessary nutrients (Hutchison et al., 1999), suggesting that the truly minimal gene set necessary for cellular life might be even smaller, in the 300–350 gene range (Mushegian and Koonin, 1996; Koonin, 2000; Peterson and Fraser, 2001). Furthermore, we know that the cell of Aquifex aeolicus, with its 1521 protein-coding genes, is capable of autonomous, autotrophic existence in the environment, requiring for growth only hydrogen, oxygen, carbon dioxide, and mineral salts (Deckert et al., 1998). It would seem that 60 years after Erwin Schrödinger wrote his book ‘What is Life?’ we should finally be able to answer the question. However, Nature never ceases to challenge the limits of our imagination. First, there were the highly degraded genomes of the bacteria Buchnera sp., Wigglesworthia sp., and Blochmannia sp., which are obligate intracellular symbionts of insect cells and function as organelles rather than as separate organisms (Andersson, 2000). The cryptomonad Guillardia theta, in addition to its own nucleus, a chloroplast and a mitochondrion, contains a nucleomorph – the vestigial nucleus of a former algal endosymbiont – with a 551-kb genome.
The archaeon Nanoarchaeum equitans grows only in a mixed culture with another archaeon, Ignicoccus sp., and even appears to borrow its membrane lipids (Jahn et al., 2004). Still, all these genomes encode their own transcription and translation systems, which certify them as (former?) ‘organisms’. To challenge our definitions of life even further, here comes the discovery of a virus that raises all sorts of uneasy questions. Mimivirus was first described in the amoeba Acanthamoeba polyphaga as particles of 400–800 nm in diameter with the characteristic icosahedral capsid morphology, and was assumed to have a double-stranded DNA genome of ~800 kb, larger than that of any virus seen before (La Scola et al., 2003). Sequencing of its genome revealed that it is actually even larger, almost 1.2 Mb, and carries 1262 predicted genes, 911 of them protein-coding (Raoult et al., 2004). The gene list includes most of the genes typical of the nucleocytoplasmic large DNA viruses, such as poxviruses. However, mimivirus additionally encodes proteins involved in translation and DNA repair, chaperonins and metabolic enzymes, many of which have never been seen in viruses. These include, among others, Arg-, Cys-, Met- and Tyr-specific aminoacyl-tRNA synthetases, translation initiation