Phylogenetics and repeatome analysis of the southern African tribe Heliophileae (Brassicaceae)

Mert Dogan, Milan Pouch, Petra Hloušková, Terezie Mandáková & Martin A. Lysak
CEITEC – Central European Institute of Technology, Masaryk University, Brno 625 00, Czech Republic

Literature cited
Mandáková, T., et al. (2012) Taxon 61: 989-1000.
Novák, P., et al. (2013) Bioinformatics 29: 792-793.

This study was supported by the CEITEC 2020 (grant no. LQ1601) project.

Figure legend (repeat classes): Ty1/copia, Ty3/gypsy, LINE, DNA transposons, rDNA, satellite, unclassified.

Introduction
The unigeneric tribe Heliophileae (Brassicaceae) includes c. 90 Heliophila species, all endemic to southern Africa (Fig. 1). The tribe is morphologically the most diversified Brassicaceae lineage in every aspect of habit, foliage, flower, and fruit morphology. Despite this diversity, genome evolution across the tribe has not been analyzed thoroughly. Here we present an updated phylogenetic framework and summarize our recent progress in analyzing the composition and evolution of nuclear repeatomes in Heliophila species representing different intra-tribal clades.

Material & Methods
We constructed a large-scale phylogeny of Heliophileae based on ITS sequences for 188 accessions, uncovering four main clades with limited inter- and intra-clade resolution (Fig. 2). To improve resolution, we employed 48 newly developed nuclear single-copy (SC) gene markers to gain new insights into intra-tribal relationships (Fig. 6). Low-coverage genome sequencing (Illumina MiSeq) was conducted in 16 Heliophila species (Table 1). Repeats were annotated using the graph-based clustering method of the RepeatExplorer pipeline (Novák et al., 2013) with 156-bp paired-end reads, and their genome abundances were analyzed (Fig. 3).
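The genome coverage values reported in Table 1 are not accompanied by a formula. A minimal sketch, assuming coverage is simply the number of analyzed reads multiplied by the 156-bp read length and divided by the genome size (with 1 Mbp taken as 10^6 bp), gives values close to those in the table for the three species used here. The function name and the dictionary layout are illustrative only.

```python
def genome_coverage(n_reads: int, read_length_bp: int, genome_size_mbp: float) -> float:
    """Return sequencing depth (x) as total sequenced bases / genome size.

    Assumption: 1 Mbp = 1,000,000 bp and every read has the full length.
    """
    total_bases = n_reads * read_length_bp
    return total_bases / (genome_size_mbp * 1_000_000)


# Genome size (Mbp) and number of analyzed reads, copied from Table 1.
TABLE1_SUBSET = {
    "H. cornuta": (474.33, 1_071_450),
    "H. linearis": (361.86, 675_960),
    "H. lactea": (405.87, 1_301_448),
}
READ_LENGTH_BP = 156  # read length stated in Material & Methods

if __name__ == "__main__":
    for species, (size_mbp, n_reads) in TABLE1_SUBSET.items():
        depth = genome_coverage(n_reads, READ_LENGTH_BP, size_mbp)
        print(f"{species}: {depth:.2f}x")
```

For some other species in Table 1 this simple formula differs from the reported value in the second decimal place, so the original estimates may have been truncated or computed slightly differently.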
Comparative analyses were performed to determine repeat clusters shared among the analyzed species. A repeat abundance matrix was created and used for phylogenetic analyses (Fig. 5). Repeat abundances were treated as continuous characters and used to construct phylogenies under a Brownian motion model with a single rate parameter in RevBayes. Probes and primers were designed from the most abundant satellite repeats and transposable elements (TEs) to determine the chromosomal localization of the repeats (Fig. 4).

Objectives
✓ to infer robust intra-tribal phylogenetic hypotheses based on multiple nuclear and chloroplast markers
✓ to identify and characterize dominating DNA repeats in selected Heliophila species
✓ to analyze evolution of repeatomes within given phylogenetic frameworks

Results & Conclusions
✓ LTR retrotransposons were the predominant repeats in all Heliophila genomes analyzed; Ty3/gypsy retroelements (particularly Athila) were the most abundant repeatome component (6.23% to 20.42%). Tandem repeats were found at relatively low abundances in all species (1.32% to 7.67%; Fig. 3).
✓ Four main intra-tribal clades (A to D) were identified in the updated ITS phylogeny (Fig. 2), compared to only three clades resolved earlier (Mandáková et al., 2012). The four clades were corroborated by single-copy nuclear gene trees (see Fig. 6 for an example) as well as by phylogenies based on repeat abundances (Fig. 5).
✓ Given the high number of shared satellite repeats, clade A may represent a younger radiation than clades B and C (Fig. 3). This hypothesis will be tested by analysis of dated nuclear and chloroplast phylogenetic trees.
✓ Comparative cytogenetic analyses as well as phylogenetic reconstruction of intra-tribal relationships based on single-copy nuclear and chloroplast genes are in progress (Fig. 6).

Table 1. Genome size, number of analyzed Illumina reads and estimated genome coverage in 16 Heliophila species.

Species            Genome size (Mbp)   No. analyzed reads   Genome coverage
H. africana        -                   1,102,494            -
H. arenaria        c. 489              974,744              c. 0.31x
H. cornuta         474.33              1,071,450            0.35x
H. linearis        361.86              675,960              0.29x
H. lactea          405.87              1,301,448            0.5x
H. pusilla         366.75              737,284              0.31x
H. elongata        c. 400.98           1,415,490            c. 0.55x
H. juncea          -                   1,795,764            -
H. amplexicaulis   -                   902,884              -
H. crithmifolia    356.97              1,523,142            0.66x
H. collina         c. 410.76           1,253,998            c. 0.47x
H. deserticola     c. 371.64           1,475,492            c. 0.62x
H. seselifolia     c. 312.96           865,938              c. 0.43x
H. variabilis      337.41              1,200,834            0.55x
H. diffusa         425.43              712,492              0.26x
H. circaeoides     -                   1,119,664            -

Figure 1. Distribution of Heliophila species in South Africa and Namibia. The highest species diversity is found in the Cape Floristic Region.

Figure 2. Fifty percent majority-rule consensus tree of the Bayesian inference of the ITS region. Four main clades (A-D) were identified.

Figure 3. Repeat abundances in Heliophila species analyzed. DNA satellite repeats shared among two or more species are highlighted by colour coding, and their genome proportion (GP) and monomer length are given. Single/low hits were discarded. Axis: GP (%), 0 to 50.

Figure 4. Fluorescence in situ hybridization localization of the most abundant tandem repeats and TEs on mitotic chromosomes in six Heliophila species. All but one tandem repeat (Heli_Sat5 in H. juncea) localized to (peri)centromeric regions; two TEs showed dispersed localization.
Panels:
H. juncea (2n = 16): Heli_Sat5 (CL1), 45S rDNA
H. diffusa (2n = 60): satellite DNA (CL2)
H. africana (2n = 20): Heli_Sat1 (CL1)
H. circaeoides (2n = 38): Ty1/copia (Gag domain)
H. arenaria (2n = 20): Ty3/gypsy (Gag domain)
H. variabilis (2n = 22): Heli_Sat6 (CL1), (sub)telomeric, satellite DNA (CL2)

Figure 5. Maximum a posteriori trees based on repeat abundances. (A) Ty3/gypsy, (B) tandem repeats. Note that analysis of Ty1/copia elements yielded a tree topology identical to that in (A).

Figure 6. Fifty percent majority-rule consensus tree of the Bayesian inference of nuclear gene AT3G17850. Two paralogous copies of the gene were identified in most Heliophila species. This partly corroborates the purported mesohexaploid origin of the tribe (Mandáková et al., 2012).
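The repeat abundance matrix and the search for shared repeat clusters described in Material & Methods can be illustrated with a small sketch. It assumes that the genome proportion (GP) of a cluster is estimated as the percentage of analyzed reads assigned to that cluster, and that a cluster counts as shared when it has reads in at least two species. The read counts, species labels, cluster names and the `min_species` threshold below are hypothetical and are not data from this study.

```python
# Hypothetical read counts per repeat cluster (not data from the poster).
CLUSTER_READS = {
    "species_1": {"CL1": 5000, "CL2": 1200, "CL3": 0},
    "species_2": {"CL1": 3000, "CL2": 0, "CL3": 800},
    "species_3": {"CL1": 0, "CL2": 0, "CL3": 400},
}
# Hypothetical number of analyzed reads per species.
TOTAL_READS = {"species_1": 100_000, "species_2": 100_000, "species_3": 80_000}


def genome_proportion(reads_in_cluster: int, reads_total: int) -> float:
    """Assumed GP estimate: percentage of analyzed reads falling in a cluster."""
    return 100.0 * reads_in_cluster / reads_total


def abundance_matrix(cluster_reads, total_reads):
    """Return (cluster names, {species: [GP per cluster]}) with a fixed column order."""
    clusters = sorted({c for counts in cluster_reads.values() for c in counts})
    matrix = {
        species: [genome_proportion(counts.get(c, 0), total_reads[species])
                  for c in clusters]
        for species, counts in cluster_reads.items()
    }
    return clusters, matrix


def shared_clusters(cluster_reads, min_species: int = 2):
    """Return clusters with at least one read in min_species or more species."""
    presence = {}
    for counts in cluster_reads.values():
        for cluster, n_reads in counts.items():
            if n_reads > 0:
                presence[cluster] = presence.get(cluster, 0) + 1
    return sorted(c for c, n in presence.items() if n >= min_species)


if __name__ == "__main__":
    clusters, matrix = abundance_matrix(CLUSTER_READS, TOTAL_READS)
    print("clusters:", clusters)
    for species, row in matrix.items():
        print(species, [round(gp, 2) for gp in row])
    print("shared:", shared_clusters(CLUSTER_READS))
```

Each row of such a matrix is a vector of continuous characters for one species, which is the kind of input a Brownian motion analysis of repeat abundances would take.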