Data from: Broad geographic sampling reveals predictable, pervasive, and strong seasonal adaptation in Drosophila

Cite this dataset

Machado, Heather E. et al. (2021). Data from: Broad geographic sampling reveals predictable, pervasive, and strong seasonal adaptation in Drosophila [Dataset]. Dryad.


To advance our understanding of adaptation to temporally varying selection pressures, we identified signatures of seasonal adaptation occurring in parallel among Drosophila melanogaster populations. Specifically, we estimated allele frequencies genome-wide from flies sampled early and late in the growing season from 20 widely dispersed populations. We identified parallel seasonal allele frequency shifts across North America and Europe, demonstrating that seasonal adaptation is a general phenomenon of temperate fly populations. Seasonally fluctuating polymorphisms are enriched at large chromosomal inversions and we find a broad concordance between seasonal and spatial allele frequency change. The direction of allele frequency change at seasonally variable polymorphisms can be predicted by weather conditions in the weeks prior to sampling, linking the environment and the genomic response to selection. Our results suggest that fluctuating selection is an important evolutionary force affecting patterns of genetic variation in Drosophila.


We assembled 73 samples of D. melanogaster, 61 representing newly collected and sequenced samples and 12 representing previously published samples (Bergland et al., 2014; Kapun et al., 2016). Locations, collection dates, number of individuals sampled, and depth of sequencing for all samples are listed in Supplemental Table 1 (Machado et al., 2021). For each sample, members of the Drosophila Real-Time Evolution Consortium collected an average of 75 male flies using direct aspiration from substrate, netting, or trapping at orchards and residential areas. Flies were confirmed to be D. melanogaster by examination of the male genital arch. We extracted DNA by first pooling all individuals from a sample, grinding the tissue together in extraction buffer, and using a lithium chloride – potassium acetate extraction protocol (see Bergland et al. 2014 for details on buffers and solutions). We prepared sequencing libraries using a modified Illumina protocol (Bergland et al. 2014) and Illumina TruSeq adapters. Paired-end 125 bp libraries were sequenced to an average of 94x coverage either at the Stanford Sequencing Service Center on an Illumina HiSeq 2000 or at the Stanford Functional Genomics Facility on an Illumina HiSeq 4000.


The following sequence data processing was performed on both the new and the previously published data. We trimmed low-quality 3’ and 5’ read ends (sequence quality < 20) using the program cutadapt v1.8.1 (Martin 2011). We mapped the reads to the D. melanogaster genome v5.5 (and to the D. simulans genome v2.01) using the bwa v0.7.12 mem algorithm with default parameters (Li & Durbin 2009), and used the program SAMtools v1.2 for bam file manipulation (functions index, sort, and mpileup) (Li et al. 2009). We used the program picard v2.0.1 to remove PCR duplicates and the program GATK v3.2-2 for indel realignment (McKenna et al. 2010). We called SNPs and indels using the program VarScan v2.3.8 with a p-value threshold of 0.05, a minimum variant frequency of 0.005, a minimum average quality of 20, and a minimum coverage of 10 (Koboldt et al. 2012). We filtered out SNPs within 10 bp of an indel (these are more likely to be spurious), variants in repetitive regions (identified by RepeatMasker and downloaded from the UCSC Genome Browser), and sites with more than two alleles. Because we sequenced only male individuals, the X chromosome had lower coverage and was not used in our analysis.
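The post-calling filters described above can be illustrated with a minimal Python sketch. It is not the code used to produce this dataset: the function name `filter_snps`, the tuple layout of the input, and the choice to treat "within 10 bp" as an inclusive distance are all assumptions made for illustration. The repetitive-region filter is omitted because it depends on the RepeatMasker track.

```python
from bisect import bisect_left

def filter_snps(snps, indel_positions, min_distance=10, excluded_chroms=("X",)):
    """Illustrative filter (hypothetical helper, not the dataset's pipeline).

    snps: iterable of (chrom, pos, alleles) tuples.
    indel_positions: dict mapping chrom -> sorted list of indel positions.
    Drops SNPs on excluded chromosomes, SNPs with more than two alleles,
    and SNPs at most `min_distance` bp from an indel (inclusive: an assumption).
    """
    kept = []
    for chrom, pos, alleles in snps:
        # The X chromosome was not analyzed because only males were sequenced.
        if chrom in excluded_chroms:
            continue
        # Keep only biallelic sites.
        if len(set(alleles)) > 2:
            continue
        # Binary search for the first indel at or after (pos - min_distance).
        indels = indel_positions.get(chrom, [])
        i = bisect_left(indels, pos - min_distance)
        if i < len(indels) and indels[i] <= pos + min_distance:
            continue
        kept.append((chrom, pos, alleles))
    return kept

# Toy input: positions and alleles are invented for the example.
example_snps = [
    ("2L", 100, ("A", "T")),       # far from any indel: kept
    ("2L", 205, ("A", "G")),       # 5 bp from the indel at 200: dropped
    ("2L", 300, ("A", "C", "G")),  # three alleles: dropped
    ("X", 50, ("A", "T")),         # X chromosome: dropped
    ("3R", 211, ("C", "T")),       # 11 bp from the indel at 200: kept
]
example_indels = {"2L": [200], "3R": [200]}
filtered = filter_snps(example_snps, example_indels)
```

Keeping the indel positions sorted per chromosome lets each SNP be checked with a binary search rather than a scan over all indels, which matters at genome scale.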

Usage notes