Whole-genome evaluation of genetic rescue: The case of a curiously isolated and endangered butterfly
Data files
Jan 21, 2025 version files (2.13 GB total)
- final_semiluna_22_chr_maxmiss_0.75_99_cov.recode.vcf.zip (2.13 GB)
- README.md (3.40 KB)
Abstract
Genetic rescue, or the translocation of individuals among populations to augment gene flow, can help combat genetic erosion, inbreeding depression, and loss of adaptive potential in small and isolated populations. Genetic rescue is currently being considered for an endangered butterfly in Canada, the half-moon hairstreak (Satyrium semiluna). A small, unique population persists in Waterton Lakes National Park, Alberta, isolated from other populations by >350km. However, whether genetic rescue would actually be helpful has not been evaluated. Here, we generate the first chromosome-level genome assembly and whole-genome resequence data for the species. We find that the Alberta population’s genetic diversity is extremely low and very divergent from the nearest populations in British Columbia and Montana. Runs of homozygosity suggest this is due to a long history of inbreeding, and coalescent analyses show that the population has been small, isolated, yet stable for up to 40k years. When a population like this maintains its viability despite inbreeding and low genetic diversity, it has likely undergone purging of deleterious recessive alleles and could be threatened by their reintroduction via translocations. Ecological niche modelling indicates that the Alberta population also exhibits environmental associations that are atypical of the species. Together, these results suggest that translocations are likely to result in outbreeding depression. We infer that genetic rescue has a unique potential to be harmful rather than helpful at present. However, due to reduced adaptive potential, this population may still benefit from future genetic rescue as climate conditions change, and experimental population crosses should be completed.
README: Whole-genome evaluation of genetic rescue: The case of a curiously isolated and endangered butterfly
https://doi.org/10.5061/dryad.hmgqnk9tx
Description of the data and file structure
We collected a total of 19 adult *S. semiluna* using aerial nets throughout the summer of 2021, preferentially collecting worn individuals when possible to minimize impacts on populations. Eight individuals were collected from Blakiston Fan, Waterton Lakes National Park, Alberta, Canada, four from Richter Pass, British Columbia, Canada, three from Anarchist Mountain, BC, and four near Red Lodge, Montana, USA. Specimens from Waterton Lakes National Park were collected under the Parks Canada Agency Research and Collection Permit: WL-2021-39020. Specimens collected in British Columbia were collected on private land with landowner permissions, Nature Conservancy Canada Research Permit No. NCC_BC_2021_SS001, and Nature Trust of British Columbia #3461.
This data file is a filtered genotype VCF file generated as follows. Short-read, whole-genome resequencing was completed on 15 individuals. We extracted genomic DNA from thoracic tissue using DNeasy Kits (Qiagen, Hilden, Germany), following the manufacturer's protocol with the addition of a bovine pancreatic ribonuclease A treatment (RNaseA, 4 μl at 100 mg/ml; Sigma-Aldrich Canada Co., Canada). Following extraction, genomic DNA was ethanol precipitated and stored in purified (50 μl Millipore) water at -20°C. PCR-free whole-genome library preparation was completed using an Ultra II FS DNA Library Prep Kit (New England Biolabs, Ipswich, MA, USA), followed by paired-end, 150-bp sequencing on an Illumina NovaSeq S1 300 flowcell (total output 1600M read pairs, 500 Gbp), aiming for ~20x coverage per sample, at the Centre for Health Genomics and Informatics, University of Calgary. We processed raw reads using a pipeline based on recommendations from Genome Analysis Toolkit (GATK) Best Practices Guides (Van der Auwera & O'Connor 2020). After trimming adapter sequences and individual indexes, we aligned reads to our *S. semiluna* reference genome using BWA-MEM2 (Vasimuddin et al. 2019). Following removal of duplicate reads in BAM files using MarkDuplicates (Picard), alignments were passed to GATK’s HaplotypeCaller, which assembles local de novo haplotypes on an individual-by-individual basis, generating an intermediate GVCF file for each individual. GVCF files were then used in GATK’s GenotypeGVCFs for joint genotyping of all individuals, using the “-all-sites” option. From the resulting multisample VCF file, we removed loci occurring on small, unassembled scaffolds (<1 Mb) and the Z chromosome, leaving only loci that occur on assembled autosomes. Filtering was completed using VCFtools 0.1.14 (Danecek et al. 2011), including the removal of indels, sites with >2 alleles, and sites with less than 99.9% accuracy (Phred scores < 30).
We then applied an individual-specific read depth filter, removing loci with depths less than five or exceeding the 99th percentile of each individual. Finally, we removed loci with more than 25% missing data across all individuals, resulting in a final dataset of 15 individuals and 23,889,641 SNPs.
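The last two filtering steps (individual-specific depth limits, then the 25% missing-data cutoff) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used VCFtools. The function names, the in-memory depth matrix, and the nearest-rank percentile definition are all assumptions made for illustration, as is the choice to compute each individual's 99th percentile over all of its called sites.

```python
MIN_DEPTH = 5                 # minimum per-genotype read depth (from the text)
DEPTH_PERCENTILE = 99         # per-individual upper depth limit (from the text)
MAX_MISSING_FRACTION = 0.25   # sites with more missing data are dropped (from the text)


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile of a non-empty list (assumed definition)."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def filter_sites(depths):
    """Return indices of sites that pass both filters.

    depths[i][j] is the read depth of individual j at site i,
    or None where no genotype was called (hypothetical layout).
    """
    n_individuals = len(depths[0])

    # Upper depth limit for each individual, from its own depth distribution.
    upper_limits = []
    for j in range(n_individuals):
        observed = [row[j] for row in depths if row[j] is not None]
        upper_limits.append(percentile_nearest_rank(observed, DEPTH_PERCENTILE))

    kept = []
    for i, row in enumerate(depths):
        # A genotype counts as missing if uncalled or outside the depth window.
        missing = sum(
            1
            for j, depth in enumerate(row)
            if depth is None or depth < MIN_DEPTH or depth > upper_limits[j]
        )
        if missing / n_individuals <= MAX_MISSING_FRACTION:
            kept.append(i)
    return kept
```

Because the depth limit is computed per individual, a sample sequenced more deeply than the others is not penalized by a single global threshold.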
Files and variables
File: final_semiluna_22_chr_maxmiss_0.75_99_cov.recode.vcf.zip
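Description: Filtered multisample VCF file (zip-compressed) containing genotypes for 15 individuals at 23,889,641 SNPs on assembled autosomes.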
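As a quick sanity check after download, the sample names and record count can be read straight from the archive without unpacking it to disk. The sketch below uses only the Python standard library. It assumes the zip archive contains one plain-text member whose name ends in `.vcf`, which has not been verified against the actual file, and the function names are hypothetical.

```python
import io
import zipfile


def summarize_vcf(lines):
    """Return (sample_names, n_records) from an iterable of VCF text lines."""
    samples = []
    n_records = 0
    for line in lines:
        if line.startswith("##"):
            continue  # meta-information lines
        if line.startswith("#CHROM"):
            # Sample names follow the nine fixed VCF columns.
            samples = line.rstrip("\n").split("\t")[9:]
        elif line.strip():
            n_records += 1
    return samples, n_records


def summarize_zipped_vcf(zip_path):
    """Stream the first .vcf member of a zip archive (assumed layout)."""
    with zipfile.ZipFile(zip_path) as archive:
        member = next(name for name in archive.namelist() if name.endswith(".vcf"))
        with archive.open(member) as raw:
            return summarize_vcf(io.TextIOWrapper(raw, encoding="utf-8"))
```

Calling `summarize_zipped_vcf("final_semiluna_22_chr_maxmiss_0.75_99_cov.recode.vcf.zip")` streams the file line by line, so memory use stays small even though the archive is about 2 GB.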