Data Concatenation, Bayesian Concordance and Coalescent-Based Analyses of the Species Tree for the Rapid Radiation of Triturus Newts

Ben Wielstra 1,2*, Jan W. Arntzen 1, Kristiaan J. van der Gaag 3, Maciej Pabijan 4,5, Wieslaw Babik 4

1 Naturalis Biodiversity Center, Leiden, The Netherlands, 2 Department of Animal and Plant Sciences, University of Sheffield, Sheffield, United Kingdom, 3 Leiden University Medical Center, Forensic Laboratory for DNA Research, Leiden, The Netherlands, 4 Institute of Environmental Sciences, Jagiellonian University, Kraków, Poland, 5 Institute of Zoology, Jagiellonian University, Kraków, Poland

Abstract

The phylogenetic relationships for rapid species radiations are difficult to disentangle. Here we study one such case, namely the genus Triturus, which is composed of the marbled and crested newts. We analyze data for 38 genetic markers, positioned in 3-prime untranslated regions of protein-coding genes, obtained with 454 sequencing. Our dataset includes twenty Triturus newts and represents all nine species. Bayesian analysis of population structure allocates all individuals to their respective species. The branching patterns obtained by data concatenation, Bayesian concordance analysis and coalescent-based estimations of the species tree differ from one another. The data concatenation based species tree shows high branch support but branching order is considerably affected by allele choice in the case of heterozygotes in the concatenation process. Bayesian concordance analysis expresses the conflict between individual gene trees for part of the Triturus species tree as low concordance factors. The coalescent-based species tree is relatively similar to a previously published species tree based upon morphology and full mtDNA and any conflicting internal branches are not highly supported.
Our findings reflect high gene tree discordance due to incomplete lineage sorting (possibly aggravated by hybridization) in combination with low information content of the markers employed (as can be expected for relatively recent species radiations). This case study highlights the complexity of resolving rapid radiations and we acknowledge that to convincingly resolve the Triturus species tree even more genes will have to be consulted.

Citation: Wielstra B, Arntzen JW, van der Gaag KJ, Pabijan M, Babik W (2014) Data Concatenation, Bayesian Concordance and Coalescent-Based Analyses of the Species Tree for the Rapid Radiation of Triturus Newts. PLoS ONE 9(10): e111011. doi:10.1371/journal.pone.0111011

Editor: Helge Thorsten Lumbsch, Field Museum of Natural History, United States of America

Received June 19, 2014; Accepted September 22, 2014; Published October 22, 2014

Copyright: © 2014 Wielstra et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. The raw 454 reads (in SFF format), the additional sequences obtained with Sanger sequencing (in ABI format), the SNP report produced with GS Amplicon Variant Analyzer 2.8, the alignments (raw, with indels removed, and collapsed into haplotypes), the input files for BAPS, MrBayes, BUCKy and *BEAST, and the individual gene trees resulting from MrBayes (created during the BUCKy exercise) and the species tree resulting from the eight *BEAST runs are available under Dryad Digital Repository entry doi:10.5061/dryad.mm81p.

Funding: BW is a Newton International Fellow. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.

* Email: b.wielstra@sheffield.ac.uk

Introduction

The importance of molecular data in biological systematics can hardly be overstated, but potential pitfalls should be considered. In particular, a single gene tree does not necessarily reflect the phylogenetic relationships among species – hereafter referred to as the species tree – as phenomena like incomplete lineage sorting and introgression cloud the pattern of descent [1]. The mitochondrial genome is inherited as a single unit and gives rise to a single gene tree. On the other hand, the nuclear genome, due to its recombining nature, represents a collection of gene trees embedded in the species tree. To distill the species tree, a multitude of nuclear genes should be employed [2]. The progress in next-generation sequencing facilitates the production of large datasets and the use of multilocus datasets in systematics will soon become the norm [3,4].

The increase in molecular data is followed by advances in analytical methods. It is now realized that combining alignments in a supermatrix that is treated as if it were a single ‘supergene’ can be misleading, as this approach ignores the discordance among gene trees in phylogeny reconstruction. High confidence can be assigned to incorrectly inferred evolutionary relationships under data concatenation [3,5,6]. Furthermore, choice of allele in the concatenation process in the case of heterozygote marker-individual combinations can lead to widely differing topologies [7,8]. Gene tree discordance is explicitly incorporated in Bayesian concordance analysis, in which individual gene trees are summarized to provide a ‘concordance factor’ per clade, representing the proportion of gene trees in which that clade is present [9,10].
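As a rough illustration of the concordance factor described above, the following Python sketch counts, for each clade, the fraction of gene trees that contain it. This is a deliberate simplification: it treats every gene tree as known without error, whereas Bayesian concordance analysis also has to deal with uncertainty in the individual gene trees. The function name `concordance_factors`, the taxon labels A to D and the four toy gene trees are hypothetical and are not taken from the Triturus dataset.

```python
from collections import Counter


def concordance_factors(gene_trees):
    """Return {clade: fraction of gene trees that contain the clade}.

    Each gene tree is given as a collection of clades, and each clade
    as a frozenset of taxon labels (a hypothetical toy representation).
    """
    counts = Counter()
    for clades in gene_trees:
        # set() ensures a clade is counted at most once per gene tree.
        counts.update(set(clades))
    n_trees = len(gene_trees)
    return {clade: count / n_trees for clade, count in counts.items()}


# Four toy rooted gene trees on taxa A-D, listed by their non-trivial clades.
toy_gene_trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"B", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
]

cf = concordance_factors(toy_gene_trees)
for clade, value in sorted(cf.items(), key=lambda kv: -kv[1]):
    print(sorted(clade), value)
```

In this toy example the clade {A, B} is present in three of the four gene trees and so receives a value of 0.75, while the conflicting clade {B, C} is present in one and receives 0.25. Low values of this kind are how conflict between gene trees shows up for a given part of the tree.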
Recent coalescent-based methods of species tree estimation take advantage of the information contained in a sample of gene trees more directly by conjointly estimating the species tree and individual gene trees, while explicitly taking into account incomplete lineage sorting as a source of gene tree discordance [2,11]. Despite these analytical advances, rapid radiations with temporally closely spaced branching events tend to show much PLOS ONE | www.plosone.org 1 October 2014 | Volume 9 | Issue 10 | e111011