The European gypsy moth (Lymantria dispar L.) was first introduced to Massachusetts in 1869 and within 150 years has spread throughout eastern North America. This large-scale invasion across a heterogeneous landscape allows examination of the genetic signatures of adaptation potentially associated with rapid geographic spread. We tested the hypothesis that spatially divergent natural selection has driven observed changes in three developmental traits that were measured in a common garden for 165 adult moths sampled from six populations across a latitudinal gradient covering the entirety of the range. We generated genotype data for 91,468 single nucleotide polymorphisms (SNPs) based on double digest restriction-site associated DNA sequencing (ddRADseq) and used these data to discover genome-wide associations for each trait, as well as to test for signatures of selection on the discovered architectures. Genetic structure across the introduced range of gypsy moth was small in magnitude (FST = 0.069), with signatures of bottlenecks and spatial expansion apparent in the rare portion of the allele frequency spectrum. Results from applications of Bayesian sparse linear mixed models were consistent with the presumed polygenic architectures of each trait. Further analyses were indicative of spatially divergent natural selection acting on larval development time and pupal mass, with the linkage disequilibrium-like component of this test acting as the main driver of observed patterns. The populations most important for these signals were two range-edge populations established less than 30 generations ago. We discuss the importance of rapid polygenic adaptation to the ability of non-native species to invade novel environments.
Phenotypic data
This .csv file contains phenotypic measurements for each gypsy moth sample. The three phenotypic measurements are Mass, Pupal Development Time (PD), and Total Development Time (TDT). TDT and PD were used to calculate Larval Development Time (LDT), which is TDT - PD. Further descriptions of these data are available in the manuscript. Each sample label consists of a population identifier followed by an individual identifier, separated by an underscore.
gypsymoth_phenotypes.csv
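As a minimal sketch of how the LDT calculation and the sample-label convention described above could be applied, the Python snippet below derives LDT = TDT - PD and splits each label into its population and individual parts. The column names (Sample, Mass, PD, TDT), the choice to split on the first underscore, and the example rows are assumptions made for illustration; check them against the actual header of the .csv file before use.

```python
import csv
import io


def load_phenotypes(handle):
    """Read phenotype rows and derive LDT = TDT - PD for each sample.

    Column names are hypothetical; adjust them to the real file header.
    """
    records = []
    for row in csv.DictReader(handle):
        # Labels are "<population>_<individual>"; splitting on the first
        # underscore is an assumption about how identifiers are formed.
        population, individual = row["Sample"].split("_", 1)
        tdt = float(row["TDT"])
        pd_value = float(row["PD"])
        records.append({
            "population": population,
            "individual": individual,
            "Mass": float(row["Mass"]),
            "PD": pd_value,
            "TDT": tdt,
            "LDT": tdt - pd_value,  # larval development time
        })
    return records


# Made-up rows for illustration only; these are not values from the data set.
example = io.StringIO("Sample,Mass,PD,TDT\nA_1,1.2,12,50\nB_7,0.9,14,47\n")
phenotypes = load_phenotypes(example)
```

To run this on the real file, open gypsymoth_phenotypes.csv with newline="" and pass the handle to load_phenotypes. Rows with missing values would need to be handled separately, since float() fails on empty fields.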
Geographic Locations
This file contains the geographic coordinates for each sampled population. It is a tab-delimited text file.
gm_pop_loc.txt
Variance components for FST calculation
This file contains the variance components estimated using hierfstat. There is one set of components for each SNP. Components are: population (Pop), individual (Ind), and error (Error). The first column is the SNP identifier (SNP_id).
FSTvarcomp.csv
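The per-SNP variance components can be combined into FST estimates. The sketch below follows the usual Weir and Cockerham-style convention, in which FST is the among-population component divided by the sum of all three components, and a multilocus estimate is the ratio of sums across SNPs. It assumes the column headers match the names given above (SNP_id, Pop, Ind, Error); whether this reproduces the value reported in the manuscript exactly is not confirmed here, and the example rows are made up.

```python
import csv
import io


def fst_from_components(handle):
    """Return (per-SNP FST dict, multilocus FST) from variance components."""
    per_snp = {}
    sum_pop = 0.0
    sum_total = 0.0
    for row in csv.DictReader(handle):
        pop = float(row["Pop"])
        total = pop + float(row["Ind"]) + float(row["Error"])
        # Per-locus FST: among-population variance over total variance.
        per_snp[row["SNP_id"]] = pop / total if total > 0 else float("nan")
        sum_pop += pop
        sum_total += total
    # Multilocus FST as a ratio of sums, not a mean of per-locus ratios.
    return per_snp, sum_pop / sum_total


# Made-up components for illustration only.
example = io.StringIO(
    "SNP_id,Pop,Ind,Error\n"
    "snp1,0.02,0.01,0.17\n"
    "snp2,0.00,0.05,0.25\n"
)
per_snp_fst, global_fst = fst_from_components(example)
```

The ratio-of-sums form weights each SNP by its total variance, so low-variance (rare-allele) loci contribute less than they would in a simple average of per-SNP values.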
Exploratory Data Analysis Script
This R script was used to explore patterns within the data. Comments are used to briefly describe the analysis. A header gives contact information for the script author.
gm_exploratory_analysis.r
Input and Output Files for Berg & Coop
The zipped archive contains a directory in which input files and output files from the method of Berg & Coop (2014) are housed. Please refer to Jeremy Berg's GitHub repository (cited in the associated manuscript) for detailed instructions on how to execute these files with his scripts.
Berg_Coop.zip
Miscellaneous R Scripts and Inputs
This archive contains two directories, both of which were used to explore summary statistics and/or performance of the inference step for sets of GWAS SNPs. Each directory contains one or more R scripts, as well as the input files needed for the script. Please see the manuscript for further details. The main results from GEMMA are also located in this archive (see BSLMM_norm.csv and the files ending in "_hyp.csv").
miscellaneous_scripts_data.zip
012 file
This .txt file is a 012 file of the genotypes of all individuals used in this study. Rows are individuals and columns are loci.
gm_genotypes_012_final_5122017.txt
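A 012 matrix with individuals in rows and loci in columns can be summarized per locus with a few lines of code. The sketch below computes the frequency of the counted allele at each locus. It assumes whitespace-delimited values, no header row, genotypes coded as the number of copies of the non-reference allele, and missing calls coded as -1 (the VCFtools convention); none of these details are stated above, so verify them against the file. The skip_first_column option is a hypothetical convenience for files that carry an individual index in the first column.

```python
import io


def alt_allele_frequencies(handle, skip_first_column=False, missing="-1"):
    """Per-locus allele frequency from a 012 matrix (rows = individuals)."""
    counts = []  # summed allele counts per locus
    called = []  # number of non-missing genotypes per locus
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        if skip_first_column:
            fields = fields[1:]
        if not counts:
            counts = [0.0] * len(fields)
            called = [0] * len(fields)
        for j, genotype in enumerate(fields):
            if genotype == missing:
                continue  # skip missing calls
            counts[j] += float(genotype)
            called[j] += 1
    # Diploid individuals carry two alleles per called genotype.
    return [c / (2 * n) if n else float("nan") for c, n in zip(counts, called)]


# Toy matrix: 3 individuals x 3 loci, one missing call. Illustration only.
toy = io.StringIO("0 1 2\n1 1 -1\n2 0 2\n")
freqs = alt_allele_frequencies(toy)
```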
gemma_input
This zip file contains a weighted genotype file and all transformed phenotype files that were used as input to the program GEMMA. Phenotypes (Mass, LDT, and PD) were transformed in two ways: one set was corrected for population structure based on PC axes, and the other was normal quantile transformed.
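For readers unfamiliar with the normal quantile transformation, the following sketch shows one common rank-based variant: values are ranked (ties share their average rank), ranks are converted to probabilities, and the probabilities are mapped through the inverse standard normal CDF. The (rank - 0.5) / n offset is an assumption; the exact formula used to produce the files in this archive is not stated here, and the input values are made up.

```python
from statistics import NormalDist


def normal_quantile_transform(values):
    """Rank-based inverse normal transform; ties receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied values.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    standard_normal = NormalDist()
    # (rank - 0.5) / n keeps every probability strictly between 0 and 1.
    return [standard_normal.inv_cdf((r - 0.5) / n) for r in ranks]


# Made-up phenotype values for illustration only.
transformed = normal_quantile_transform([3.1, 1.4, 2.2, 5.0, 2.2])
```

The transform preserves the rank order of the phenotypes while forcing an approximately normal marginal distribution. Missing phenotypes would need to be removed before calling the function.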
VCF file
This gzipped VCF file contains the final, filtered genotypes for all individuals used in this study.
gm_genotypes_VCF_5122017.vcf.gz
assembly
The zip file contains a reference assembly generated with MaSuRCA and a GFF file with annotations from MAKER.