Sequencing data for: Chronosequence of invasion reveals minimal losses of population genomic diversity, niche expansion, and trait divergence in the polyploid, leafy spurge
Data files
Sep 11, 2023 version files 43.78 GB
- README.md
- spurge_gbs_fastq_gz.zip
Abstract
Rapid evolution may play an important role in the range expansion of invasive species and modify forecasts of invasion, which are the backbone of land management strategies. However, losses of genetic variation associated with colonization bottlenecks may constrain trait and niche divergence at leading range edges, thereby impacting management decisions that anticipate future range expansion. The spatial and temporal scales over which adaptation contributes to invasion dynamics remain unresolved. We leveraged detailed records of the ~130-year invasion history of the invasive polyploid plant, leafy spurge (Euphorbia virgata), across ~500km in Minnesota, U.S.A. We examined the consequences of range expansion for population genomic diversity, niche breadth, and the evolution of germination behavior. Using genotyping-by-sequencing, we found some population structure in the range core, where introduction occurred, but panmixia among all other populations. Range expansion was accompanied by only modest losses in sequence diversity, with small, isolated populations at the leading edge harboring similar levels of diversity to those in the range core. The climatic niche expanded during most of the range expansion, and the niche of the range core was largely non-overlapping with the invasion front. Ecological niche models indicated that mean temperature of the warmest quarter was the strongest determinant of habitat suitability and that populations at the leading edge had the lowest habitat suitability. Guided by these findings, we tested for rapid evolution in germination behavior over the time course of range expansion using a common garden experiment and temperature manipulations. Germination behavior diverged from early to late phases of the invasion, with populations from later phases having higher dormancy at lower temperatures. 
Our results suggest that trait evolution may have contributed to niche expansion during invasion and that distribution models, which inform future management planning, may underestimate invasion potential without accounting for evolution.
README: Genomic sequencing Data For Manuscript: Chronosequence of invasion reveals minimal losses of population genomic diversity, niche expansion, and trait divergence in the polyploid, leafy spurge
Supporting genomic sequencing data for manuscript: Chronosequence of invasion reveals minimal losses of population genomic diversity, niche expansion, and trait divergence in the polyploid, leafy spurge
Thomas A. Lake, Ryan D. Briscoe Runquist, Lex E. Flagel, David A. Moeller
bioRxiv 2023.04.04.535556; doi: https://doi.org/10.1101/2023.04.04.535556
In 2019, the authors collected leafy spurge (Euphorbia virgata) leaf tissue from six individuals in each of 14 populations in Minnesota (hereafter: population samples). In addition, the authors collected tissue from one individual in each of 157 populations throughout Minnesota, northern Iowa, eastern South Dakota, eastern North Dakota, and western Wisconsin (hereafter: landscape samples).
The authors extracted DNA using QIAGEN DNeasy Plant Mini Kits (QIAGEN Inc.). Dual-indexed GBS (genotyping-by-sequencing) libraries were created using the BamHI + NsiI enzyme combination. All libraries were pooled and sequenced on an Illumina NovaSeq System (Illumina Inc., San Diego, CA, USA) with 1x100-bp sequencing. Once sequenced, the reads were demultiplexed and balanced with a mean quality score ≥ Q30 for all libraries.
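For readers unfamiliar with the Q30 threshold mentioned above: a Phred quality score Q relates to the estimated base-calling error probability P by Q = -10 log10(P), so Q30 corresponds to an error probability of 1 in 1,000. The short sketch below illustrates that standard conversion only; the function name is hypothetical and is not part of the authors' pipeline.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score to an estimated base-call error probability."""
    # Standard Phred definition: Q = -10 * log10(P)  =>  P = 10 ** (-Q / 10)
    return 10 ** (-q / 10)


for q in (20, 30, 40):
    print(f"Q{q} -> error probability {phred_to_error_probability(q):.4f}")
```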
The 241 .fastq.gz files in this dataset contain the DNA sequencing reads output by the Illumina NovaSeq System.
Description of the Data and file structure
The .fastq.gz files contain sequencing data output from the Illumina NovaSeq System. They are gzip-compressed text files; once decompressed, they can be opened and manipulated with any text editor. Each file name consists of the sample name, the sample number, the read, and a fixed suffix.
Example: ANO001-0_S616_R1_001.fastq.gz
ANO001-0: The sample name. The prefix gives the geographic location of the sample, abbreviated by Minnesota county (e.g., ANO for Anoka County) or by state (ND for North Dakota, SD for South Dakota, WIS for Wisconsin, and IOWA for Iowa). The number after the hyphen distinguishes individuals sampled from the same population.
S616: The sample number identifier. This corresponds to the order in which the samples were sequenced.
R1: The read. In this example, R1 means Read 1. Because sequencing was single-end (1x100 bp), every file is a Read 1 file.
001: The last segment is always 001.
fastq.gz: The file is in FASTQ format and is compressed with gzip to reduce file size.
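The naming convention above can be split into its parts programmatically. The following is a minimal sketch, assuming every file follows the same pattern as the example; the regular expression, the `FastqName` type, and the `parse_fastq_name` function are hypothetical helpers inferred from that example, not something shipped with the dataset.

```python
import re
from typing import NamedTuple


class FastqName(NamedTuple):
    population: str     # e.g. "ANO001"
    individual: int     # e.g. 0 (number after the hyphen)
    sample_number: int  # e.g. 616 (sequencing order)
    read: int           # e.g. 1 (Read 1)


# Pattern inferred from the example file name in this README (an assumption).
_NAME_RE = re.compile(
    r"^(?P<population>[^-_]+)-(?P<individual>\d+)"
    r"_S(?P<sample_number>\d+)_R(?P<read>\d+)_001\.fastq\.gz$"
)


def parse_fastq_name(filename: str) -> FastqName:
    """Split a file name such as 'ANO001-0_S616_R1_001.fastq.gz' into its parts."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return FastqName(
        population=match["population"],
        individual=int(match["individual"]),
        sample_number=int(match["sample_number"]),
        read=int(match["read"]),
    )


print(parse_fastq_name("ANO001-0_S616_R1_001.fastq.gz"))
```

File names that deviate from the assumed pattern raise a `ValueError`, so they are reported and not silently misparsed.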
The sample names also correspond to Table S1 in the manuscript.
Table S1. Population name, geographic location, and number of individuals sampled per population, totaling 14 population samples and 157 landscape samples of leafy spurge. Population samples are indicated in bold.
Population  Latitude  Longitude  Number of individuals sampled
ANO001      45.28849  -93.12503  6
ANO002      45.20712  -93.29607  1
In the case of the ANO001 population, six .fastq.gz files correspond to the six individuals sampled (population samples).
ANO001-0_S616_R1_001.fastq.gz
ANO001-1_S756_R1_001.fastq.gz
ANO001-2_S689_R1_001.fastq.gz
ANO001-3_S700_R1_001.fastq.gz
ANO001-4_S712_R1_001.fastq.gz
ANO001-5_S724_R1_001.fastq.gz
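Grouping files by the population prefix is one way to tell population samples (six files each) from landscape samples (one file each). The sketch below assumes that the population identifier is everything before the last hyphen in the sample name; `group_by_population` is a hypothetical helper, shown here on the six ANO001 files listed above.

```python
from collections import defaultdict


def group_by_population(filenames):
    """Map each population identifier to the list of its FASTQ file names."""
    groups = defaultdict(list)
    for name in filenames:
        sample = name.split("_", 1)[0]         # "ANO001-0"
        population = sample.rsplit("-", 1)[0]  # "ANO001" (assumed convention)
        groups[population].append(name)
    return dict(groups)


files = [
    "ANO001-0_S616_R1_001.fastq.gz",
    "ANO001-1_S756_R1_001.fastq.gz",
    "ANO001-2_S689_R1_001.fastq.gz",
    "ANO001-3_S700_R1_001.fastq.gz",
    "ANO001-4_S712_R1_001.fastq.gz",
    "ANO001-5_S724_R1_001.fastq.gz",
]

for population, members in sorted(group_by_population(files).items()):
    print(population, len(members))
```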
Sharing/access Information
This work is licensed under a CC0 1.0 Universal (CC0 1.0) Public Domain Dedication license.
Methods
Sampling and sequencing
In 2019, we collected leaf tissue from six individuals in each of 14 populations distributed evenly across Minnesota (hereafter: population samples). In addition, we collected tissue from one individual in each of 157 populations distributed relatively evenly across Minnesota, eastern South Dakota, eastern North Dakota, and western Wisconsin (hereafter: landscape samples). We sampled tissue from individuals that were at least five meters apart to minimize collecting from the same genet and placed tissues immediately in silica for preservation until DNA extraction.
We extracted DNA using QIAGEN DNeasy Plant Mini Kits (QIAGEN Inc.). Dual-indexed GBS (genotyping-by-sequencing) libraries were created using the BamHI + NsiI enzyme combination. All libraries were pooled and sequenced on an Illumina NovaSeq System (Illumina Inc., San Diego, CA, USA) with 1x100-bp sequencing. Once sequenced, the reads were demultiplexed and balanced with a mean quality score ≥ Q30 for all libraries. We filtered low-quality bases using Trimmomatic (Bolger et al. 2014) and used Stacks v.2.5.9 (Rochette et al. 2019) to build loci de novo (i.e., without aligning reads to a reference genome). Overall, we obtained 510 million reads across the 241 samples (599,386 to 3,376,078 raw reads per individual). Mean read depth per locus ranged from 14x to 26x.
Usage notes
FASTQ is a text-based format for storing both a biological sequence (usually a nucleotide sequence) and its corresponding quality scores. After decompression with gzip, FASTQ files may be opened with any standard text editor (e.g., Notepad).
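Because the files are large, it is often more practical to stream them than to decompress and open them in an editor. The sketch below reads a gzipped FASTQ file with Python's standard library and reports the number of reads and the mean base quality. It assumes the usual four-line record layout (header, sequence, `+` separator, quality string) and Phred+33 quality encoding; `fastq_summary` is a hypothetical helper, not part of the authors' pipeline.

```python
import gzip


def fastq_summary(path):
    """Return (number of reads, mean Phred quality) for a gzipped FASTQ file."""
    n_reads = 0
    quality_sum = 0
    n_bases = 0
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline()  # line 1: '@' + read identifier
            if not header:
                break                   # end of file
            handle.readline()           # line 2: nucleotide sequence
            handle.readline()           # line 3: '+' separator
            quality = handle.readline().strip()  # line 4: quality string
            n_reads += 1
            # Phred+33 encoding (assumed): score = ASCII code - 33
            quality_sum += sum(ord(char) - 33 for char in quality)
            n_bases += len(quality)
    mean_quality = quality_sum / n_bases if n_bases else float("nan")
    return n_reads, mean_quality
```

For example, `fastq_summary("ANO001-0_S616_R1_001.fastq.gz")` would return the read count and mean quality for that individual, provided the file has been extracted from spurge_gbs_fastq_gz.zip into the working directory.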