Bayesian analysis of molecular variance in pyrosequences quantifies population genetic structure across the genome of Lycaeides butterflies
- 1 May 2010
- journal article
- research article
- Published by Wiley in Molecular Ecology
- Vol. 19 (12) , 2455-2473
- https://doi.org/10.1111/j.1365-294x.2010.04666.x
Abstract
The distribution of genetic variation within and among populations is commonly used to infer their demographic and evolutionary histories. This endeavour has the potential to benefit substantially from high-throughput next-generation sequencing technologies through a rapid increase in the amount of data available and a corresponding increase in the precision of parameter estimation. Here we report the results of a phylogeographic study of the North American butterfly genus Lycaeides using 454 sequence data. This study serves the dual purpose of demonstrating novel molecular and analytical methods for population genetic analyses with 454 sequence data and expanding our knowledge of the phylogeographic history of Lycaeides. We obtained 341 045 sequence reads from 12 populations that we were able to assemble into 15 262 contigs (most of which were variable), representing one of the largest population genetic data sets for a non-model organism to date. We examined patterns of genetic variation using a hierarchical Bayesian analysis of molecular variance model, which provides precise estimates of genome-level phi(ST) while appropriately modelling uncertainty in locus-specific phi(ST). We found that approximately 36% of sequence variation was partitioned among populations, suggesting historical or current isolation among the sampled populations. Estimates of pairwise genome-level phi(ST) were largely consistent with a previous phylogeographic model for Lycaeides, suggesting fragmentation into two to three refugia during Pleistocene glacial cycles followed by post-Pleistocene range expansion and secondary contact leading to introgressive hybridization. This study demonstrates the potential of using genome-level data to better understand the phylogeographic history of populations.