Dryad

Whole genome sequencing (WGS) data from invasive pine sawfly Diprion similis

Cite this dataset

Davis, Jeremy S.; Linnen, Catherine (2023). Whole genome sequencing (WGS) data from invasive pine sawfly Diprion similis [Dataset]. Dryad. https://doi.org/10.5061/dryad.g4f4qrfv3

Abstract

Biological introductions are unintended “natural experiments” that provide unique insights into evolutionary processes. Invasive phytophagous insects are of particular interest to evolutionary biologists studying adaptation, as introductions often require rapid adaptation to novel host plants. However, adaptive potential of invasive populations may be limited by reduced genetic diversity—a problem known as the “genetic paradox of invasions”. One potential solution to this paradox is if there are multiple invasive waves that bolster genetic variation in invasive populations. Evaluating this hypothesis requires characterizing genetic variation and population structure in the invaded range. To this end, we assemble a reference genome and describe patterns of genetic variation in the introduced white pine sawfly, Diprion similis. This species was introduced to North America in 1914, where it has rapidly colonized the thin-needled eastern white pine (Pinus strobus), making it an ideal invasion system for studying adaptation to novel environments. To evaluate evidence of multiple introductions, we generated whole-genome resequencing data for 64 D. similis females sampled across the North American range. Both model-based and model-free clustering analyses supported a single population for North American D. similis. Within this population, we found evidence of isolation-by-distance and a pattern of declining heterozygosity with distance from the hypothesized introduction site. Together, these results support a single-introduction event. We consider implications of these findings for the genetic paradox of invasion and discuss priorities for future research in D. similis, a promising model system for invasion biology.

Methods

Full methods describing how these data were prepared and processed can be found in the linked publication. In brief:

Raw genetic material was collected from insect larval tissue and extracted using Qiagen kits. DNA libraries were prepared using KAPA HyperPrep kits and then sequenced on a NovaSeq 6000 S4 flow cell.

VCFtools and ANGSD were used to prepare the sam and beagle files, respectively. See the methods section of the paper for more details.
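For readers who want to inspect the beagle files, the following is a minimal Python sketch of a reader. It assumes the files follow the Beagle genotype-likelihood layout that ANGSD writes: tab-separated text, optionally gzipped, with a header row, three leading columns (marker, allele1, allele2), and three likelihood columns per individual. The function names `read_beagle` and `open_beagle` are hypothetical, and the actual file names and layout should be checked against the dataset and the linked publication.

```python
import gzip
import io


def open_beagle(path):
    """Open a plain-text or gzipped file for reading as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def read_beagle(handle):
    """Parse Beagle-style genotype likelihoods from an open text handle.

    Assumed layout: marker, allele1, allele2, then three likelihood
    columns per sample, with each sample name repeated three times
    in the header.
    """
    header = handle.readline().rstrip("\n").split("\t")
    sample_cols = header[3:]
    if len(sample_cols) % 3 != 0:
        raise ValueError("expected three likelihood columns per sample")
    samples = sample_cols[::3]  # one name per group of three columns

    records = []
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        marker, allele1, allele2 = fields[:3]
        values = [float(x) for x in fields[3:]]
        # Group the flat list into one (L0, L1, L2) tuple per sample.
        likelihoods = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
        records.append((marker, allele1, allele2, likelihoods))
    return samples, records


# Toy input for illustration only; these values are not from the dataset.
demo = (
    "marker\tallele1\tallele2\tInd0\tInd0\tInd0\tInd1\tInd1\tInd1\n"
    "chr1_10\t0\t2\t0.9\t0.1\t0.0\t0.2\t0.5\t0.3\n"
)
samples, records = read_beagle(io.StringIO(demo))
```

With a real file, the same parser would be called as `read_beagle(open_beagle(path))`, where `path` points to one of the downloaded beagle files.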

Funding

National Science Foundation, Award: 2020660

National Science Foundation, Award: DEB-1257739

National Science Foundation, Award: DEB-1750946

Agricultural Research Service, Award: 2040-22430-028-00-D

American Genetic Association, Award: EECG award