GENOMICS Vol. 80, Number 4, October 2002 Copyright © 2002 Elsevier Science (USA). All rights reserved. 0888-7543/02 $35.00 402 Article doi:10.1006/geno.2002.6843, available online at http://www.idealibrary.com on IDEAL

A New Family of Chimeric Retrotranscripts Formed by a Full Copy of U6 Small Nuclear RNA Fused to the 3' Terminus of L1

Anton Buzdin,* Svetlana Ustyugova, Elena Gogvadze, Tatiana Vinogradova, Yuri Lebedev, and Eugene Sverdlov

Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow 117871, Russia

* To whom correspondence and reprint requests should be addressed. Fax: 7-(095)-330-65-38. E-mail: anton@humgen.siobc.ras.ru.

Long interspersed nuclear elements (LINE-1, L1) constitute a large family of mammalian retrotransposons that have been replicating and evolving in mammals for more than 100 million years and now compose 17% of the human genome. They have an important creative role in human genomic evolution through mechanisms such as new integrations, generation of processed pseudogenes, and transfer of non-L1 DNA flanking their 3' ends to new genomic locations. Here we present evidence that the L1 integration machinery was used for the creation of a new family of chimeric retrotranscripts, which contain a full copy of U6 small nuclear RNA and a 3' part of L1 at their 5' and 3' ends, respectively. There are at least 56 members of this family in the human genome. The integrations of such fused retrotranscripts into the human genome took place until recently. Here we report one U6–L1 insertion that is polymorphic in humans. We also propose a mechanism used to generate chimeric retrotranscripts.

Key Words: L1, LINE, retroelements, human genome, template switch mechanism, chimeras, integrations

INTRODUCTION

L1s, the most abundant long interspersed nuclear elements (LINEs) in the human genome, are mostly defective in retrotransposition; only some of them (30–60) retain the capacity for retrotransposition. The representatives of this minor group harbor an internal RNA polymerase II (PolII) promoter and two open reading frames, ORF1 and ORF2. ORF1 encodes a 40-kDa RNA binding protein that colocalizes with L1 RNA in cytoplasmic ribonucleoprotein particles (RNPs) [1], which are likely intermediates in retrotransposition. ORF2 encodes a protein possessing endonuclease (EN) and reverse transcriptase (RT) activities [2]. A remarkable feature of L1 retrotransposition is its cis preference [3], due to which retrotransposition-competent L1s predominantly transpose their own copies. Due to a trans-complementation effect, the L1 transposition machinery is probably also used for Alu retrotranspositions and for rare retrotranspositions of mRNA copies [4].

We have detected a new family of retrotranscripts spread throughout the human genome that likely also uses the transposition machinery of L1s. The family includes at least 56 members, which contain a full copy of U6 small nuclear RNA (snRNA) and a 3' part of L1 at their 5' and 3' ends, respectively. According to the evidence obtained, the integrations of such fusions in the human genome took place until recently.

RESULTS AND DISCUSSION

To reveal differences between the human and chimpanzee genomes in integrations of L1 interspersed repeats, we prepared a library of the genomic sequences flanking the human-specific integrations of L1 (A.B. et al., manuscript in preparation). We used our recently published technique, targeted genomic difference analysis, which allows genome-wide screening of such differences [5]. The individual clones of the library were sequenced, and the sequences of the L1 flanking regions were searched against GenBank to assign them to specific genomic positions. Among other clones of the library, there was one with a sequence in its unique part (Fig. 1). This sequence matched a DNA stretch (acc. no.
AL138764) located on human chromosome 10p13 and was characterized by an unusual fusion of a complete U6 small nuclear RNA copy with a 3' portion of an L1 (Fig. 1). This retrotranscript sequence was called U6–L1 10p13. Its 5' part is a full-size, 107-bp sequence of U6 snRNA in the sense orientation, 100% identical to the human U6 consensus (given in RepBase