Inferring aspects of the population histories of species using coalescent analyses of non-coding nuclear DNA has grown in popularity. These inferences, such as divergence, gene flow, and changes in population size, assume that genetic data reflect simple population histories and neutral evolutionary processes. However, violating model assumptions can result in a poor fit between empirical data and the models. We sampled 22 nuclear intron sequences from at least 19 different chromosomes (a genomic transect) to test for deviations from selective neutrality in the gadwall (Anas strepera), a Holarctic duck. Nucleotide diversity among these loci varied by nearly two orders of magnitude (from 0.0004 to 0.029), and this heterogeneity could not be explained by differences in substitution rates. Using two different coalescent methods to infer models of population history and then simulating neutral genetic diversity under these models, we found that the among-locus heterogeneity in nucleotide diversity was significantly higher than expected for these simple models. Defining more complex models of population history demonstrated that a pre-divergence bottleneck was also unlikely to explain this heterogeneity. However, both selection and interspecific hybridization could account for the heterogeneity observed among loci. Regardless of the cause of the deviation, our results illustrate that violating key assumptions of coalescent models can mislead inferences of population history.
Microsatellite locus A27E1 alleles
Contains DNA sequences for 100 phased and aligned alleles from the microsatellite locus A27E1 that were sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
A27E1.fas
Intron 1 of CRYAB gene
Contains DNA sequences for 100 phased and aligned alleles from intron 1 of the CRYAB gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
CRYAB.fas
Intron 8 of Sf3A2 gene
Contains DNA sequences for 100 phased and aligned alleles from intron 8 of the Sf3A2 gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
Sf3A2.fas
Intron 5 of ANXA11 gene
Contains DNA sequences for 100 phased and aligned alleles from intron 5 of the ANXA11 gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
ANXA11.fas
Intron 5 of CD4 gene
Contains DNA sequences for 100 phased and aligned alleles from intron 5 of the CD4 gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
CD4.fas
Intron 19 of CHD1Z gene
Contains DNA sequences for 100 phased and aligned alleles from intron 19 of the CHD1Z gene (sex-linked; Z-chromosome) sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
CHD1Z.fas
Intron 9 of CPD gene
Contains DNA sequences for 100 phased and aligned alleles from intron 9 of the CPD gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
CPD.fas
Intron 8 of ENO1 gene
Contains DNA sequences for 100 phased and aligned alleles from intron 8 of the ENO1 gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
ENO1.fas
Intron 2 of FAST gene
Contains DNA sequences for 100 phased and aligned alleles from intron 2 of the FAST gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
FAST.fas
Intron 7 of FGB gene
Contains DNA sequences for 100 phased and aligned alleles from intron 7 of the FGB gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
FGB.fas
Intron 3 of GH1 gene
Contains DNA sequences for 100 phased and aligned alleles from intron 3 of the GH1 gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
GH1.fas
Intron 3 of GHRL gene
Contains DNA sequences for 100 phased and aligned alleles from intron 3 of the GHRL gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
GHRL.fas
Intron 11 of GRIN1 gene
Contains DNA sequences for 100 phased and aligned alleles from intron 11 of the GRIN1 gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
GRIN1.fas
Intron 2 of LCAT gene
Contains DNA sequences for 100 phased and aligned alleles from intron 2 of the LCAT gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
LCAT.fas
Intron 2 of LDHB gene
Contains DNA sequences for 100 phased and aligned alleles from intron 2 of the LDHB gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
LDHB.fas
Intron 2 of MSTN gene
Contains DNA sequences for 100 phased and aligned alleles from intron 2 of the MSTN gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
MSTN.fas
Intron 12 of NCL gene
Contains DNA sequences for 100 phased and aligned alleles from intron 12 of the NCL gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
NCL.fas
Intron 5 of ODC1 gene
Contains DNA sequences for 100 phased and aligned alleles from intron 5 of the ODC1 gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
ODC1.fas
Intron 9 of PCK1 gene
Contains DNA sequences for 100 phased and aligned alleles from intron 9 of the PCK1 gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
PCK1.fas
Intron 2 of SAA gene
Contains DNA sequences for 100 phased and aligned alleles from intron 2 of the SAA gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
SAA.fas
Intron 12 of SOAT1 gene
Contains DNA sequences for 100 phased and aligned alleles from intron 12 of the SOAT1 gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
SOAT1.fas
Intron 2 of SOX9 gene
Contains DNA sequences for 100 phased and aligned alleles from intron 2 of the SOX9 gene sampled from gadwalls (Anas strepera). The first 50 alleles were sampled from Eurasia, and the second set of 50 alleles were sampled from North America. The sequences were aligned in Sequencher.
SOX9.fas
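The per-locus files above share one layout: an aligned FASTA file in which the first 50 alleles are from Eurasia and the remaining 50 from North America. The following is a minimal Python sketch, not part of the deposited data or the authors' workflow, of how such a file could be split by population and summarized with nucleotide diversity (mean pairwise differences per site). The function names are hypothetical, and the handling of gaps and ambiguity codes (skipped pairwise) is an assumption rather than the treatment used in the study.

```python
from itertools import combinations


def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples, keeping file order."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def split_by_population(records, n_first=50):
    """Split sequences by position: the first n_first alleles, then the rest.

    With the default, this follows the file descriptions (alleles 1-50 from
    Eurasia, alleles 51-100 from North America).
    """
    seqs = [seq for _, seq in records]
    return seqs[:n_first], seqs[n_first:]


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences.

    Assumption: sites where either sequence has a gap or an ambiguity code
    are skipped for that pair.
    """
    valid = set("ACGT")
    total, n_pairs = 0.0, 0
    for a, b in combinations(seqs, 2):
        compared = diffs = 0
        for x, y in zip(a, b):
            if x in valid and y in valid:
                compared += 1
                diffs += x != y
        if compared:
            total += diffs / compared
            n_pairs += 1
    return total / n_pairs if n_pairs else float("nan")
```

For a downloaded file, one could read it with `read_fasta(open("CRYAB.fas").read())`, pass the result to `split_by_population`, and call `nucleotide_diversity` on each half and on the pooled sequences.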
IM input file, 22 loci, 2 populations of gadwall
Contains the input file used in the IM analysis, with sequences truncated to meet the assumption of no intralocus recombination. Data were filtered using IMgc to optimally remove base pairs and/or chromosomes (a maximum of 5% of chromosomes, N = 5 copies) for each locus.
infile-IMgcFiltered.txt
Characteristics and primers of loci sequenced
This table contains full locus names, locus abbreviations, location of each locus within the chicken and zebra finch genomes, length of the intron, and primer sequences used for amplifying each locus.
Locus characteristics.docx
Input for *BEAST - 8 taxa, 22 loci
Input file for analyses of relative substitution rates in *BEAST. This file contains 8 deeply divergent taxa of Anseriformes sequenced at 22 nuclear introns. The command lines direct *BEAST to use a relaxed lognormal clock and the multispecies coalescent for rate calculations.
8taxa-22loci-RelaxClock.xml
ms.out.v4.2
This file contains the R script used to analyze data simulated with the coalescent program MS. The script is annotated in detail to describe the code. See the readme file for more details.
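For readers who want to inspect simulated data without the R script, the sketch below shows one way to read standard ms output in Python and compute the mean number of pairwise differences for each simulated replicate. It is not a translation of the deposited script. The function names are hypothetical, and the parser assumes the usual ms layout in which each replicate begins with a line starting with `//` and haplotypes are written as strings of 0s and 1s.

```python
def parse_ms(text):
    """Return one list of 0/1 haplotype strings per simulated replicate.

    Assumes standard ms output: each replicate starts with a line beginning
    with '//', followed by 'segsites:' and 'positions:' lines and then one
    haplotype per line. Replicates with no segregating sites yield [].
    """
    replicates = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("//"):
            current = []
            replicates.append(current)
        elif current is not None and line and set(line) <= {"0", "1"}:
            current.append(line)
    return replicates


def mean_pairwise_differences(haplotypes):
    """Mean number of differences between all pairs of haplotypes.

    Each segregating site with k derived copies among n haplotypes
    contributes k * (n - k) differing pairs.
    """
    n = len(haplotypes)
    if n < 2:
        return 0.0
    differing_pairs = 0
    for site in zip(*haplotypes):
        derived = site.count("1")
        differing_pairs += derived * (n - derived)
    return differing_pairs / (n * (n - 1) / 2)
```

Dividing each value by the locus length gives a per-site diversity that can be set against the empirical range reported in the abstract (0.0004 to 0.029).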