Review

Applications of next-generation sequencing to phylogeography and phylogenetics

John E. McCormack a,*, Sarah M. Hird a,b, Amanda J. Zellmer b, Bryan C. Carstens b, Robb T. Brumfield a,b

a Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, United States
b Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, United States

Article info

Article history: Available online 14 December 2011

Keywords: Population genomics; Coalescence; Reduced representation library; Target enrichment; High-throughput sequencing

Abstract

This is a time of unprecedented transition in DNA sequencing technologies. Next-generation sequencing (NGS) clearly holds promise for fast and cost-effective generation of multilocus sequence data for phylogeography and phylogenetics. However, the focus on non-model organisms, together with uncertainty about which sample preparation methods and analyses are appropriate for different research questions and evolutionary timescales, has contributed to a lag in the application of NGS to these fields. Here, we outline some of the major obstacles specific to the application of NGS to phylogeography and phylogenetics, including the focus on non-model organisms, the necessity of obtaining orthologous loci in a cost-effective manner, and the predominant use of gene trees in these fields. We describe the most promising methods of sample preparation that address these challenges. Methods that reduce the genome by restriction digest and manual size selection are most appropriate for studies at the intraspecific level, whereas methods that target specific genomic regions (i.e., target enrichment or sequence capture) have wider applicability from the population level to deep-level phylogenomics.
Additionally, we give an overview of how to analyze NGS data to arrive at data sets applicable to the standard toolkit of phylogeography and phylogenetics, from initial data processing through alignment and genotype calling (both SNPs and loci involving many SNPs). Although whole-genome sequencing is likely to become affordable soon, phylogeography and phylogenetics often rely on analysis of hundreds of individuals, so methods that reduce the genome to a subset of loci should remain more cost-effective for some time to come.

© 2011 Elsevier Inc. All rights reserved.

Contents

1. Introduction
   1.1. Multilocus studies and the promise of next-generation sequencing
   1.2. Specific challenges to applying NGS to phylogeography and phylogenetics
      1.2.1. The need for homologous DNA regions from many individuals
      1.2.2. Cost-effective multiplexing and library preparation
      1.2.3. The long reign of the gene tree in phylogeography and phylogenetics
   1.3. Primary data collection or marker development?
2. Review of wet lab methods for sample preparation
   2.1. Multiplex PCR and amplicon sequencing
   2.2. Restriction digest-based methods
      2.2.1. RAD sequencing
      2.2.2. Other restriction digest-based methods
   2.3. Target enrichment
      2.3.1. Probe sets designed from ultraconserved elements for phylogenomics
      2.3.2. Probe sets designed from closely related genomes and transcriptome libraries
   2.4. Transcriptome sequencing
3. Data analysis and bioinformatics
   3.1. The difference between Sanger and NGS data and the importance of coverage

* Corresponding author. Address: Moore Laboratory of Zoology, Occidental College, 1600 Campus Rd., Los Angeles, CA 90041, United States. E-mail address: mccormack@oxy.edu (J.E. McCormack).

Molecular Phylogenetics and Evolution 66 (2013) 526–538
doi:10.1016/j.ympev.2011.12.007
Journal homepage: www.elsevier.com/locate/ympev