Mol Genet Genomics (2008) 279:385–401
DOI 10.1007/s00438-008-0319-4

ORIGINAL PAPER

Transposable elements in Coffea (Gentianales: Rubiacea) transcripts and their role in the origin of protein diversity in flowering plants

Fabrício Ramon Lopes · Marcelo Falsarella Carazzolle · Gonçalo Amarante Guimarães Pereira · Carlos Augusto Colombo · Claudia Marcia Aparecida Carareto

Received: 25 June 2007 / Accepted: 2 January 2008 / Published online: 30 January 2008
© Springer-Verlag 2008

Abstract Transposable elements are major components of plant genomes and they influence their evolution, acting as recombination hot spots, acquiring specific cell functions or becoming part of protein-coding regions. The latter is the subject of the present analysis. This study is a report on the annotation of transposable elements (TEs) in expressed sequences of Coffea arabica, Coffea canephora and Coffea racemosa, showing the occurrence of 383 ESTs and 142 unigenes with TE fragments in these three Coffea species. Based on selected unigenes, it was possible to suggest 26 putative proteins with TE-cassette insertions, demonstrating a likely contribution to protein variability. The genes for two of those proteins, the fertility restorer (FR) and the pyrophosphate-dependent phosphofructokinase (PPi-PFK) genes, were selected for evaluating the impact of TE-cassettes on host gene evolution in other plant genomes (Arabidopsis thaliana, Oryza sativa and Populus trichocarpa). This survey identified an FR gene in O. sativa harboring multiple insertions of LTR retrotransposons that originated new exons, which, however, does not necessarily mean a case of molecular domestication. A possible transduction event of a fragment of the PPi-PFK β-subunit gene mediated by Helitron ATREPX1 in Arabidopsis thaliana was also highlighted.
Keywords Transposable elements · Coffea genome · Protein diversity · Molecular domestication · Gene transduction

Communicated by M.-A. Grandbastien.

F. R. Lopes · C. M. A. Carareto (corresponding author)
Laboratory of Molecular Evolution, Department of Biology, UNESP, São Paulo State University, 15054-000 São José do Rio Preto, São Paulo, Brazil
e-mail: carareto@ibilce.unesp.br

M. F. Carazzolle · G. A. G. Pereira
Laboratory of Genomics and Expression, Department of Genetics and Evolution, Institute of Biology, UNICAMP, State University of Campinas, 13083-970 Campinas, São Paulo, Brazil

C. A. Colombo
IAC, Agronomic Institute of Campinas, 13001-970 Campinas, São Paulo, Brazil

Introduction

Transposable elements (TEs) are genetic units capable of moving within genomes and often making duplicate copies of themselves. As a consequence of this activity, they are mutagenic and can produce sequence changes in single genes, as well as large genome rearrangements (Zhang and Peterson 1999), both of which can alter the pattern of gene expression and function (Jordan et al. 2003). Moreover, TEs generate an enormous variability that can be used to create new genes or exons (reviewed in Kidwell and Lisch 1997; Bennetzen 2000, 2005; Jordan et al. 2003; van de Lagemaat et al. 2003; Volff 2006) and new regulatory sequences (Jordan et al. 2003), besides being the source of transcription-regulating signals (Thornburg et al. 2006) and genome expansion or contraction (Fedoroff 2000; Bennetzen 2002). TEs are present in almost all organisms so far studied (Kidwell 2002; Shapiro and von Sternberg 2005), and in some genomes, like that of Zea mays, they can represent about 60–80% of the nuclear genome (Meyers et al. 2001).

The occurrence of TEs in intronic and intergenic regions has been widely reported (SanMiguel et al. 1996; Tikhonov et al. 1999; Bennetzen 2000). It was further demonstrated that these elements also contribute substantially to the