Dryad

Data from: Can the genomics of ecological speciation be predicted across the divergence continuum from host races to species? A case study in Rhagoletis

Cite this dataset

Doellman, Meredith M. et al. (2020). Data from: Can the genomics of ecological speciation be predicted across the divergence continuum from host races to species? A case study in Rhagoletis [Dataset]. Dryad. https://doi.org/10.5061/dryad.nk98sf7pr

Abstract

Studies assessing the predictability of evolution typically focus on short-term adaptation within populations or the repeatability of change among lineages. A missing consideration in speciation research is to determine whether natural selection predictably transforms standing genetic variation within populations into differences between species. Here, we test whether host-related selection on diapause timing anticipates genome-wide differentiation during ecological speciation by comparing ancestral hawthorn and newly formed apple-infesting host races of Rhagoletis pomonella to their sibling species R. mendax that attacks blueberries. The responses of 57,857 single nucleotide polymorphisms in a diapause study on the hawthorn race strongly predicted the direction and magnitude of genomic divergence among the three flies at a field site in Fennville, Michigan, USA. As anticipated, the apple race and R. mendax show parallel changes in the frequencies of putative inversions on three chromosomes associated with the earlier fruiting times of apples and blueberries compared to hawthorns. A diapause experiment on R. mendax revealed compensatory mutations throughout the genome accounting for the earlier eclosion of blueberry, but not apple flies. Thus, a degree of predictability, although not complete, exists in the genomics of diapause across the ecological speciation continuum in Rhagoletis. The generality of this result is placed in the context of other similar systems.

Usage notes

Filtered vcf file with genotype probabilities

This is a vcf file generated using the GATK UnifiedGenotyper, then filtered as described in the paper. The PL field provides the phred-scaled and normalized genotype probabilities produced using the GATK model.

meyers.doellman.Rmendax.pops.e.l.vcf
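
For readers who want to work with the PL field directly, the following is a minimal sketch of how phred-scaled values can be turned back into per-genotype probabilities. It assumes the usual VCF convention that PL = -10 * log10(likelihood), scaled so the best genotype has PL 0, and it assumes a flat prior across genotypes when renormalizing. The function names `parse_pl` and `pl_to_probs` and the example record are illustrative only and do not come from the dataset.

```python
def pl_to_probs(pl_values):
    """Convert phred-scaled genotype values (PL) to probabilities summing to 1.

    Assumes PL = -10 * log10(likelihood) and a flat prior over genotypes.
    """
    likelihoods = [10 ** (-pl / 10.0) for pl in pl_values]
    total = sum(likelihoods)
    return [lk / total for lk in likelihoods]


def parse_pl(format_field, sample_field):
    """Extract the PL values from one sample column of a VCF data line.

    Returns None when the sample has no PL entry (for example a missing call).
    """
    keys = format_field.split(":")
    values = sample_field.split(":")
    if "PL" not in keys:
        return None
    index = keys.index("PL")
    if index >= len(values) or values[index] == ".":
        return None
    return [int(x) for x in values[index].split(",")]


# Hypothetical FORMAT and sample columns for a biallelic site
# (genotype order: 0/0, 0/1, 1/1).
pl = parse_pl("GT:AD:DP:GQ:PL", "0/1:5,4:9:99:120,0,150")
probs = pl_to_probs(pl)
print(pl, [round(p, 6) for p in probs])
```

Working with probabilities rather than hard genotype calls keeps the uncertainty of low-coverage sites in downstream allele frequency estimates.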

File mapping barcodes and ids to eclosion phenotypes

column 1 = inline barcode (5' end of the forward read), column 2 = sample id (sample id in the vcf file), column 3 = population_phenotype: Blueberry_E = R. mendax early, Blueberry_L = R. mendax late

meyers.doellman.Rmendax.e.l.barcode.map.txt
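
The three-column layout described above can be read with a few lines of code. This is a sketch only: it assumes the columns are separated by whitespace and that the file has no header row, neither of which is stated here, and the barcodes and sample ids in the example are invented for illustration.

```python
import io


def read_barcode_map(handle):
    """Return {sample_id: (barcode, phenotype)} from a barcode map file.

    Assumes whitespace-separated columns and no header row.
    phenotype is None when a line has only two columns.
    """
    samples = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        barcode, sample_id = fields[0], fields[1]
        phenotype = fields[2] if len(fields) > 2 else None
        samples[sample_id] = (barcode, phenotype)
    return samples


# Hypothetical file contents; real barcodes and ids will differ.
example = io.StringIO(
    "ACGTA\tsample_01\tBlueberry_E\n"
    "TGCAT\tsample_02\tBlueberry_L\n"
)
barcode_map = read_barcode_map(example)
early = [sid for sid, (_, pheno) in barcode_map.items() if pheno == "Blueberry_E"]
late = [sid for sid, (_, pheno) in barcode_map.items() if pheno == "Blueberry_L"]
print(early, late)
```

With a real file, `open("meyers.doellman.Rmendax.e.l.barcode.map.txt")` would replace the `io.StringIO` object, and the resulting sample ids can be matched against the sample columns of the vcf file.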

140110_I481_FCC3KD2ACXX_L1_PJM_3_1.fq.gz

Illumina forward reads, R. mendax early and late samples

140110_I481_FCC3KD2ACXX_L1_PJM_3_2.fq.gz

Illumina paired reads, R. mendax early and late samples

File mapping barcodes to ids for population sample

column 1 = inline barcode (5' end of the forward read), column 2 = sample id (sample id in the vcf file)

meyers.doellman.Rmendax.pop.barcode.map.txt
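
Because the barcodes are inline at the 5' end of the forward read, assigning a read to a sample amounts to matching the start of its sequence against the barcode column. The sketch below shows the idea with exact matching only. It is not the demultiplexing procedure used in the paper; the function name `assign_read` and the example barcodes and ids are hypothetical, and established tools typically also tolerate sequencing errors in the barcode.

```python
def assign_read(sequence, barcode_to_sample):
    """Match a forward read to a sample by its inline 5' barcode.

    Returns (sample_id, trimmed_sequence), or (None, sequence) if no
    barcode matches. Longer barcodes are tried first so that a short
    barcode that is a prefix of a longer one does not win by accident.
    """
    for barcode in sorted(barcode_to_sample, key=len, reverse=True):
        if sequence.startswith(barcode):
            return barcode_to_sample[barcode], sequence[len(barcode):]
    return None, sequence


# Hypothetical barcode-to-sample lookup built from columns 1 and 2.
lookup = {"ACGTA": "sample_01", "TGCAT": "sample_02"}

print(assign_read("ACGTATTGGCCAA", lookup))
print(assign_read("GGGGGTTGGCCAA", lookup))
```

The paired read carries no inline barcode under this layout, so in practice it would inherit the sample assignment of its forward mate.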

150206_I1125_FCC638CACXX_L5_Rmendax_1_2_1.fq.gz

Illumina forward reads, R. mendax population sample

150206_I1125_FCC638CACXX_L5_Rmendax_1_2_1.fq.gz

Illumina paired reads, R. mendax population sample

Funding

National Science Foundation, Award: DEB-1638951

National Science Foundation, Award: DEB-1638997

National Science Foundation, Award: DEB-1639005

United States Department of Agriculture, Award: NIFA-2015-67013-23289

European Research Council Consolidator Grant