Data from: Population genomic insights into recent nutria (Myocastor coypus) invasion dynamics
Data files
Nov 18, 2025 version; files total 68.91 MB
- finalRADseq_filtered_all.vcf (58.07 MB)
- priv_allelesEvsW_CA.vcf (10.84 MB)
- README.md (2.15 KB)
Abstract
Nutria (Myocastor coypus) are semi‐aquatic rodents native to South America, introduced to the USA for fur farming during the early twentieth century. This species' herbivory can cause extensive damage to agriculture and wetland ecosystems. Although declared eradicated from California, USA, in the 1970s, nutria populations were recently discovered in the state's Central Valley and subsequently in the Sacramento–San Joaquin Delta, areas of significant agricultural and conservation importance. We report the use of a combination of nuclear single-nucleotide polymorphisms (SNPs) and mitochondrial (mtDNA; cytochrome b locus) markers to characterize the source and demographic history of the current invasion, to inform eradication efforts. Our study is the first to develop a SNP dataset for nutria, utilizing 6809 loci to characterize genetic diversity in comparison to several potential source populations. Multivariate analysis and Bayesian clustering of the SNP dataset found the greatest similarity to invasive nutria in central Oregon, USA, with minimal genetic differentiation in the Central Valley, excluding the leading edges of the invasion. Cytochrome b sequencing yielded a single contemporary California haplotype shared with nutria in Oregon and Washington, as well as in museum samples from California fur farms that predated eradication. Mantel tests found genetic differentiation between nutria in the Central Valley was best explained by ecological distance along rivers, while estimated effective migration surface (EEMS) analysis indicated gene flow was characterized by infrequent dispersal followed by rapid expansion in large, protected areas of emergent wetland habitat. These combined findings suggest contemporary California nutria represent a recent introduction that underwent rapid expansion. 
Our data further support treating the Central Valley as a single eradication unit while investing additional resources in targeting dispersal corridors to best achieve management goals. This study presents the first characterization of a regional nutria invasion within the larger context of global population and phylogenetics.
Dataset DOI: 10.5061/dryad.f1vhhmh92
Description of the data and file structure
VCF files (RADseq SNP dataset) and FASTA files (mitochondrial Cyt b and D-loop)
Individual FASTQ files can be found on NCBI's Sequence Read Archive: BioProject PRJNA1328208 (accession numbers SAMN51307640-SAMN51307946).
Novel mitochondrial sequences were uploaded to NCBI GenBank under accession numbers PX240733-PX240735.
Sample metadata can be found in the supplemental materials (Table S1) of the publication.
Files and variables
File: finalRADseq_filtered_all.vcf
Description: final filtered dataset with all 6809 SNP loci and all individuals. This was used for all analyses except the "private allele" analyses described below. In some instances this VCF was subsampled by individual for computational efficiency or to avoid sample size biases; see the supplemental materials (Table S1) of the publication for the individuals included in the EEMS, RAxML phylogenetics, PCoA in Fig. 5A, and Structure analyses.
File: priv_allelesEvsW_CA.vcf
Description: filtered dataset with 403 SNP loci identified as containing alleles private to either the East (Louisiana, Maryland, Texas and Virginia; 87 loci) or West (Oregon and Washington; 316 loci) region of the USA. After private alleles were identified, California samples were added back to the dataset to determine the relationship between California and other USA nutria populations.
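As a quick sanity check after download, the sample IDs and record count of either VCF can be read with the Python standard library alone. The sketch below is illustrative and not part of the deposited dataset: the function name `summarize_vcf` is hypothetical, and it assumes a standard tab-delimited VCF with a `#CHROM` header line and one variant per record.

```python
import gzip
from pathlib import Path


def summarize_vcf(path):
    """Return (sample_names, n_records) for a plain-text or gzipped VCF.

    Hypothetical helper for illustration; assumes a tab-delimited VCF
    with a single '#CHROM' header line.
    """
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    samples = []
    n_records = 0
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                continue  # meta-information lines
            if line.startswith("#CHROM"):
                # The first nine columns are fixed VCF fields;
                # any remaining columns are sample IDs.
                samples = line.rstrip("\n").split("\t")[9:]
                continue
            if line.strip():
                n_records += 1  # one data line per variant record
    return samples, n_records
```

Calling `summarize_vcf("finalRADseq_filtered_all.vcf")` returns the list of sample IDs and the number of variant records. If each record corresponds to one SNP locus, the counts would be expected to agree with the 6809 and 403 loci described above, although that correspondence is an assumption here and not something stated in the file descriptions.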
Code/software
Access information
Other publicly accessible locations of the data:
- NCBI's Sequence Read Archive: BioProject PRJNA1328208 (accession numbers SAMN51307640-SAMN51307946)
Data was derived from the following sources:
- GBS RADseq - Illumina HiSeq5000
- Sanger sequencing for mitochondrial data
