Evolutionary Dynamics of the Human Endogenous Retrovirus Family HERV-K Inferred from Full-Length Proviral Genomes

Javier Costas

Departamento de Bioloxía Fundamental, Facultade de Bioloxía, Universidade de Santiago de Compostela, Campus Sur s/n, E-15782 Santiago de Compostela (A Coruña), Spain

Received: 5 February 2001 / Accepted: 22 March 2001

Abstract. Several distinct families of endogenous retroviruses exist in the genomes of primates. Most of them are remnants of ancient germ-line infections. The human endogenous retrovirus family HERV-K represents the only known case of an endogenous retrovirus that amplified in the human genome after the divergence of the human and chimpanzee lineages. There are two types of HERV-K proviral genomes, differing by the presence or absence of 292 bp at the pol-env boundary. Human-specific insertions exist for both types. The analyses shown in the present work reveal that several lineages of type 1 and type 2 HERV-K proviruses remained transpositionally active after the human/chimpanzee split. The data also reflect the important role of mosaic evolution (either by recombination or gene conversion) during the evolutionary history of HERV-K.

Key words: Human endogenous retrovirus; HERV; HML-2; HERV-K; Master gene model; Retrovirus-like elements; Transposable elements; Interspersed elements; Mosaic evolution

Introduction

The human genome harbors a wide variety of endogenous retroviruses, representing at least 5% of total DNA (Smit 1996). They most likely stem from germ-line retroviral integrations at different times during primate evolution. Presumably, subsequent retrotransposition (although re-infection cannot be formally ruled out) often led to an increase in copy number (Löwer et al. 1996).
The vast majority of these insertions persist within the human genome as solitary long terminal repeats (LTRs), created by homologous recombination between the 5' and 3' LTRs of an intact proviral element (Löwer et al. 1996).

HERV-K, also referred to as HML-2, is the most biologically active human endogenous retrovirus (HERV) family, retaining the capacity to be expressed at the RNA and protein levels and to form virus-like particles (Löwer et al. 1993). A few proviruses have been detected that preserve long open reading frames (ORFs) for several proteins (Barbulescu et al. 1999; Mayer et al. 1999; Tönjes et al. 1999). DNA hybridization data have suggested that HERV-K first entered the primate genome shortly after the split of New World and Old World monkeys (Mariani-Costantini et al. 1989). Phylogenetic analyses of HERV-K LTR sequences revealed the existence of distinct subgroups with different integration times, determined by locus-specific PCR of several members from each subgroup (Medstrand and Mager 1998). Remarkably, this and other studies have shown that the most recent insertions of HERV-K post-date the human/chimpanzee split, making HERV-K the only known HERV family that includes human-specific insertions (Medstrand and Mager 1998; Barbulescu et al. 1999; Mayer et al. 1999; Lebedev et al. 2000). The hallmark of this long period of activity is the existence of several thousand

Correspondence to: Javier Costas; email: bfcostas@usc.es

J Mol Evol (2001) 53:237–243 DOI: 10.1007/s002390010213 © Springer-Verlag New York Inc. 2001