GENOMICS Vol. 80, Number 4, October 2002
Copyright © 2002 Elsevier Science (USA). All rights reserved.
0888-7543/02 $35.00
Article
doi:10.1006/geno.2002.6843, available online at http://www.idealibrary.com on IDEAL
A New Family of Chimeric Retrotranscripts Formed
by a Full Copy of U6 Small Nuclear RNA
Fused to the 3' Terminus of L1
Anton Buzdin,* Svetlana Ustyugova, Elena Gogvadze, Tatiana Vinogradova,
Yuri Lebedev, and Eugene Sverdlov
Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow 117871, Russia
*To whom correspondence and reprint requests should be addressed. Fax: 7-(095)-330-65-38. E-mail: anton@humgen.siobc.ras.ru.
Long interspersed nuclear elements (LINE-1, L1) constitute a large family of
mammalian retrotransposons that have been replicating and evolving in mammals
for more than 100 million years and now compose 17% of the human genome. They
have an important creative role in human genomic evolution through mechanisms
such as new integrations, generation of processed pseudogenes, and transfer of
non-L1 DNA flanking their 3' ends to new genomic locations. Here we present
evidence that the L1 integration machinery was used for the creation of a new
family of chimeric retrotranscripts, which contain a full copy of U6 small
nuclear RNA and a 3' part of L1 at their 5' and 3' ends, respectively. There
are at least 56 members of this family in the human genome. The integrations
of such fused retrotranscripts into the human genome took place until
recently. Here we report one U6–L1 insertion that is polymorphic in humans. We
also propose a mechanism used to generate chimeric retrotranscripts.
Key Words: L1, LINE, retroelements, human genome,
template switch mechanism, chimeras, integrations
INTRODUCTION
L1s, the most abundant representatives of long interspersed nuclear elements
(LINEs) in the human genome, are mostly retrotranspositionally defective; only
a few of them (30–60) retain the capacity for retrotransposition. The
representatives of this minor group harbor an internal RNA polymerase II
(PolII) promoter and two open reading frames, ORF1 and ORF2. ORF1 encodes a
40-kDa RNA binding protein colocalized with L1 RNA in cytoplasmic
ribonucleoprotein particles (RNPs) [1], which are likely intermediates in
retrotransposition. ORF2 encodes a protein possessing endonuclease (EN) and
reverse transcriptase (RT) activities [2]. A remarkable feature of L1
retrotransposition is its cis preference [3], owing to which
retrotransposition-competent L1s predominantly transpose their own copies.
Through a trans-complementation effect, the L1 transposition machinery is
probably also used for Alu retrotransposition and for rare retrotransposition
of mRNA copies [4]. We have detected a new family of retrotranscripts, spread
throughout the human genome, that likely also use the L1 transposition
machinery. The family includes at least 56 members, which contain a full copy
of U6 small nuclear RNA (snRNA) and a 3' part of L1 at their 5' and 3' ends,
respectively. According to the evidence obtained, integrations of such fusions
into the human genome took place until recently.
RESULTS AND DISCUSSION
To reveal differences between the human and chimpanzee genomes in integrations
of L1 interspersed repeats, we have prepared a library of the genomic
sequences flanking the human-specific integrations of L1 (A.B. et al.,
manuscript in preparation). We used our recently published technique, targeted
genomic difference analysis, which allows genome-wide screening of such
differences [5]. The individual clones of the library were sequenced and the
sequences of L1 flanking regions were searched against GenBank to assign them
to certain genomic positions. Among other clones of the library, there was one
with a sequence in its unique part (Fig. 1). This sequence matched a DNA
stretch (acc. no. AL138764) located on human chromosome 10p13 and was
characterized by an unusual fusion of a complete U6 small nuclear RNA copy
with a 3' portion of an L1 (Fig. 1). This retrotranscript sequence was called
U6–L1 10p13. Its 5' part is a full-size, 107-bp sequence of U6 snRNA in the
sense orientation, 100% identical to the human U6 consensus (given in RepBase