Invasion of protein coding genes by green algal ribosomal group I introns

Hilary A. McManus a,⇑, Louise A. Lewis a, Karolina Fučíková a, Peik Haugen b

a Department of Ecology and Evolutionary Biology, University of Connecticut, 75 North Eagleville Rd., Storrs, CT 06269, USA
b Department of Chemistry, and The Norwegian Structural Biology Centre, University of Tromsø, N-9037 Tromsø, Norway

Article info

Article history: Received 7 October 2010; Revised 4 June 2011; Accepted 17 September 2011; Available online 24 October 2011

Keywords: Chlorophyceae; Chloroplast rbcL gene; GIY-YIG; Group I introns; HNH; Homing endonucleases

Abstract

The spread of group I introns depends on their association with intron-encoded homing endonucleases. Introns that encode functional homing endonuclease genes (HEGs) are highly invasive, whereas introns that encode only the group I ribozyme responsible for self-splicing are generally stably inherited (i.e., vertical inheritance). A number of recent case studies have provided new knowledge on the evolution of group I introns; however, there are still large gaps in our understanding of their distribution on the tree of life and of how they have spread into new hosts and genic sites. During a larger phylogenetic survey of chlorophycean green algae, we found that 23 isolates contain at least one group I intron in the rbcL chloroplast gene. Structural analyses show that the introns belong to one of two intron lineages: group IA2 intron-HEG (GIY-YIG family) elements inserted after position 462 in the rbcL gene, and group IA1 introns inserted after position 699. The latter intron type sometimes encodes HNH homing endonucleases. The distribution of introns was analyzed on an exon phylogeny, and patterns were recovered that are consistent with vertical inheritance and possible horizontal transfer.
The rbcL462 introns are thus far reported only within the Volvocales, Hydrodictyaceae and Bracteacoccus, and closely related isolates of algae differ in the presence of rbcL introns. Phylogenetic analysis of the intron conserved regions indicates that the rbcL699 and rbcL462 introns have distinct evolutionary origins. The rbcL699 introns were likely derived from ribosomal RNA L2449 introns, whereas the rbcL462 introns form a close relationship with psbA introns.

Published by Elsevier Inc.

1. Introduction

Group I introns are widely distributed in genes of various lineages throughout the tree of life (Bhattacharya et al., 1996; Haugen et al., 2005; Saldanha et al., 1993). Transcribed as part of precursor transcripts, the introns are precisely removed post-transcriptionally by a group I ribozyme encoded by the intron itself. Group I introns are therefore known as self-splicing introns. Among photosynthetic lineages, group I introns have been found in the nuclear genes of diatoms, euglenoids and red algae and in the chloroplast genes of brown algae and other chlorophyll a and c-containing algae. The nucleus, mitochondrion and chloroplast of the green algae contain group I introns, and the embryophytes have group I introns in the chloroplast and mitochondrion (Bhattacharya et al., 1994, 1996; Haugen et al., 2005).

The widespread distribution of group I introns in nature is consistent with their ability to spread into homologous but intronless DNA by a highly efficient mechanism known as “homing” (Edgell, 2009). Mobility of introns through homing is promoted by highly specific intron-encoded endonucleases (i.e., homing endonucleases, or HEs) that typically recognize and cleave sequences of 15–25 bp in length. This cleavage creates a double-stranded break and initiates a repair pathway that results in insertion of the intron sequence into the intron-lacking allele.
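The net outcome of homing described above can be illustrated with a toy string model. This is only a sketch under simplifying assumptions, not a model of the actual break-and-repair biochemistry: the function name `home_intron`, the 18 bp target site, the cut offset, and the intron sequence are all hypothetical and chosen purely for illustration.

```python
# Toy model of the net result of intron homing.
# All names and sequences here are hypothetical, for illustration only.

def home_intron(recipient: str, target_site: str, cut_offset: int, intron: str) -> str:
    """Return `recipient` with `intron` inserted inside the first `target_site`.

    `cut_offset` is the position within the target site where the allele is
    opened. If the target site is absent, the recipient is returned unchanged,
    mimicking an endonuclease that finds nothing to recognize.
    """
    if not 0 <= cut_offset <= len(target_site):
        raise ValueError("cut_offset must lie within the target site")
    site_start = recipient.find(target_site)
    if site_start == -1:
        return recipient
    cut = site_start + cut_offset
    return recipient[:cut] + intron + recipient[cut:]


TARGET_SITE = "GATTCAGCTAGGCATCGA"  # 18 bp, within the 15-25 bp range cited above
INTRON = "gtgcaattccg"              # lowercase to make the inserted element visible

intronless_allele = "ATGGC" + TARGET_SITE + "TTGCA"
invaded_allele = home_intron(intronless_allele, TARGET_SITE, 9, INTRON)
print(invaded_allele)

# In this toy model the insertion splits the target site, so a second
# round of homing leaves the intron-containing allele unchanged.
print(home_intron(invaded_allele, TARGET_SITE, 9, INTRON) == invaded_allele)
```

In this sketch the intron-containing allele no longer carries an intact target site, which is why the second call returns its input unchanged.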
Conserved protein motifs found within HEs are used to distinguish among the five different families, i.e., LAGLIDADG, GIY-YIG, His-Cys box, HNH and PD-(D/E)XK (Marcaida et al., 2010). In addition to the process of homing, phylogenetic analyses suggest that intron-HEG elements also have the ability to spread into heterologous sites (see Haugen and Bhattacharya, 2004). Movement into heterologous sites is restricted, in part, by the limited ability of homing endonucleases to recognize and cleave new DNA targets. This limitation is sometimes overcome by the insertion of the intron-HEG element into neighboring sites. One striking example of the same intron-HEG element in neighboring genic positions is found in the small subunit (S) rRNA gene of fungi. Here, a group IC2 ribozyme with a LAGLIDADG HEG inserted into the P9 structure has spread into positions S1210, S1224 and S1247 (Haugen and Bhattacharya, 2004).

⇑ Corresponding author. E-mail addresses: mcmanuha@lemoyne.edu (H.A. McManus), louise.lewis@uconn.edu (L.A. Lewis), karolina.fucikova@uconn.edu (K. Fučíková), peik.haugen@uit.no (P. Haugen).

Molecular Phylogenetics and Evolution 62 (2012) 109–116. doi:10.1016/j.ympev.2011.09.027

Within the green algae the majority of group I introns are found in nuclear and chloroplast genes (Haugen et al., 2005). Chloroplast group I introns have been reported from a wide variety of genes, but relatively few have been described in the rbcL gene, the first