Data from: Immediate genetic augmentation and enhanced habitat connectivity are required to secure the future of an iconic endangered freshwater fish population
Data files
Oct 14, 2024 version files (13.85 MB total)
- Dgen.csv (862 B)
- Dgeo.csv (504 B)
- MurrCat_data_results.csv (73.66 KB)
- Murrumbidgee_CataractDam_covariate.age.cohort.growth.csv (37.31 KB)
- README.md (11.03 KB)
- Report_DMacq23-8576_6_moreOrders_SNP_mapping_2.csv (13.72 MB)
Abstract
Genetic diversity is rapidly lost from small, isolated populations by genetic drift. Measuring the level of genetic drift using effective population size (Ne) is highly useful for management. Single-cohort genetic Ne estimators approximate the number of breeders in one season (Nb): a value <100 signals likely inbreeding depression. Per-generation Ne <1000 estimated from multiple cohorts signals reduced adaptive potential. Natural populations rarely meet assumptions of Ne-estimation, so interpreting estimates is challenging. Macquarie perch is an endangered Australian freshwater fish threatened by severely reduced range, habitat loss and fragmentation. To counteract low Ne, augmented gene flow is being implemented in several populations. In the Murrumbidgee River, unknown effects of water management on among-site connectivity impede the design of effective interventions.
Using DArT SNPs for 328 Murrumbidgee individuals sampled across several sites and years with different flow conditions, we assessed population structure, site isolation, heterozygosity, inbreeding, and Ne. We tested for inbreeding depression, assessed genetic diversity and dispersal and evaluated whether individuals translocated from Cataract Reservoir to the Murrumbidgee River bred, and interbred with local fish.
We found strong genetic structure, indicating complete or partial isolation of river fragments. This structure violates assumptions of Ne estimation, resulting in strongly downwardly biased Nb estimates unless assessed per-site, highlighting the necessity to account for population structure while estimating Ne. Inbreeding depression was not detected, but with low Nb at each site, inbreeding and inbreeding depression are likely. These results flagged the necessity to address within-river population connectivity through flow-management and genetic mixing through translocations among sites and from other populations. Three detected genetically diverse offspring of a translocated Cataract fish and a local parent indicated that genetic mixing is in progress. Including admixed individuals in estimates yielded lower Ne but higher heterozygosity, suggesting heterozygosity is a preferable indicator of genetic augmentation.
https://doi.org/10.5061/dryad.j3tx95xqc
Description of the data and file structure
Dataset overview
Macquarie perch is an endangered Australian freshwater fish threatened by severely reduced range, habitat loss and fragmentation. The upper Murrumbidgee River harbors a key Macquarie perch population, where changes in river regulation resulted in many potential barriers to fish movement. Augmented gene flow from Cataract Reservoir is being implemented to improve population fitness, genetic diversity and adaptive potential. We used data for 330 individuals sampled from 2002 to 2023 across 12 upper Murrumbidgee sites, including juveniles born over 5 years (2018-2022) with different flow conditions, to assess (i) population structure and the degree of isolation between river fragments, (ii) genetic diversity, (iii) the level of inbreeding, and (iv) the number of breeding Macquarie perch individuals (Nb) within each site, and (v) to evaluate whether the individuals translocated from Cataract Reservoir to the Murrumbidgee River bred, and interbred with local fish.
The commercial provider Diversity Arrays Technology (DArT) generated the reduced-representation genomic sequencing dataset Report_DMacq23-8576_6_moreOrders_SNP_mapping_2.csv. SNP genotypes were obtained by co-analysing new Murrumbidgee samples and previously analysed Cataract Reservoir samples (included as a reference for detecting admixed offspring resulting from interbreeding of Murrumbidgee and translocated Cataract Reservoir individuals). DArT also used BLAST to map sequenced tags to the draft Macquarie perch genome (DU_Maus_v1.0, NCBI reference GCA_005408345.1).
The R script 2.Analyses_of_DArT_genotypes.R was used to conduct data filtering, run analyses and create input files for other genetic software. This script requires two files as input: Report_DMacq23-8576_6_moreOrders_SNP_mapping_2.csv (genotype file, containing data per locus) and Murrumbidgee_CataractDam_covariate.age.cohort.growth.csv (covariate file, containing data per individual). The isolation-by-distance analysis conducted in this R script requires two further files, also provided here: Dgen.csv (matrix of FST values) and Dgeo.csv (matrix of river distances, in km). Per-individual results of genetic analyses conducted in this script are collected in MurrCat_data_results.csv.
Acknowledgements
We acknowledge the First Nations throughout Australia, recognise their continuing connection to land, waters and culture, and pay our respects to their Elders past, present and emerging. This research was conducted on Ngarigo Country. This project was partially supported by South East Local Land Services through funding from the Australian Government’s National Landcare Program, NSW Department of Primary Industry, Icon Water, Australian Research Council Linkage Grant LP160100482 to Monash University, La Trobe University and University of Canberra, with Partner Organizations Department of Environment, Land, Water and Planning (DELWP, Victoria), Diversity Arrays Technology, Zoos Victoria, Environment, Planning & Sustainable Development Directorate (ACT Government), and Department of Biodiversity, Conservation and Attractions (Western Australia). Sampling was conducted under NSW Fisheries Scientific Collection Permit No. P07/0007-6.0 and University of Canberra ethics approval AEC 10389.
Files and variables
File: Dgeo.csv
Description: A matrix of pairwise per-site river distances (in km, geographic distances through river stream) for isolation-by-distance analysis of the upper Murrumbidgee Macquarie perch
File: Dgen.csv
Description: A matrix of pairwise per-site FST values (genetic distances) for isolation-by-distance analysis of the upper Murrumbidgee Macquarie perch, produced by SambaR through analyses described in R script 2.Analyses_of_DArT_genotypes.R
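Isolation-by-distance is commonly examined by correlating the off-diagonal entries of a genetic distance matrix with those of a geographic distance matrix, for example with a Mantel permutation test. The sketch below illustrates that general idea in Python; it is not the analysis performed in 2.Analyses_of_DArT_genotypes.R. The function names are hypothetical and the two small matrices are made up for illustration (they are not values from Dgen.csv or Dgeo.csv).

```python
import random
from math import sqrt


def lower_triangle(matrix):
    """Return the below-diagonal entries of a square matrix as a flat list."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mantel(dgen, dgeo, permutations=999, seed=1):
    """One-sided Mantel test: permute the site labels of dgen, keep dgeo fixed."""
    rng = random.Random(seed)
    n = len(dgen)
    geo = lower_triangle(dgeo)
    observed = pearson(lower_triangle(dgen), geo)
    at_least_as_large = 1  # the observed ordering counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[dgen[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(lower_triangle(permuted), geo) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)


# Made-up 4-site example (NOT values from Dgen.csv / Dgeo.csv).
toy_dgeo = [[0, 10, 25, 40], [10, 0, 15, 30], [25, 15, 0, 15], [40, 30, 15, 0]]
toy_dgen = [[0, 0.02, 0.05, 0.09], [0.02, 0, 0.03, 0.06],
            [0.05, 0.03, 0, 0.04], [0.09, 0.06, 0.04, 0]]
r, p = mantel(toy_dgen, toy_dgeo)
```

To apply this to the provided files, both matrices would first need to be read in with their sites in the same row and column order; the exact CSV layout should be checked against the files themselves.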
File: Report_DMacq23-8576_6_moreOrders_SNP_mapping_2.csv
Description: Genotype data file for Macquarie perch (Macquaria australasica) in the original format from Diversity Arrays Technology (DArT, a commercial provider), to be analyzed using the dartRverse R package, as per script 2.Analyses_of_DArT_genotypes.R. SNP genotypes were obtained from reduced-representation sequencing data for individuals from the Murrumbidgee River and Cataract Reservoir populations, using the DS14 pipeline.
File: Murrumbidgee_CataractDam_covariate.age.cohort.growth.csv
Description: A covariate file with field data collected for each genotyped individual, results of the analysis of fish length (see the manuscript for details), and columns needed for creating input files for the Colony2 parentage/sibship analyses. NA: not available.
Variables
- id: individual identifier
- pop: where fish was caught (Murrumbidgee or Cataract)
- year_capture: year of capture
- Length: length of fish, in mm
- site: Site number (if Murrumbidgee, numbered from upstream to downstream) or name (if Cataract Reservoir)
- sampling_date: Date of sampling in DD/MM/YEAR format
- age.cat: Inferred age category at capture based on length, as defined in the manuscript, and calculated in the R script as follows:
  # If pop=CataractReservoir, assign “NA”;
  # for the remaining individuals, if year_capture<2020 OR Length>270, assign “adult”;
  # for the remaining individuals, if Length<90, assign “0-1YO”;
  # for the remaining individuals, if year_capture=2020 AND Length is from 90 to 210, assign “1-2YO”; if year_capture is 2021, 2022 or 2023 AND Length is from 90 to 180, also assign “1-2YO”;
  # assign “2-3YO” to the remaining individuals.
- Inf.birth.year: Birth cohort, if identified based on size
- Sampling.year: Year of sampling (same as year of capture)
- infDOB: inferred date of birth (assumed 30/11 of the birth year)
- inf.age.at.capture: inferred age at capture, in years
- Growth_residuals: growth residual from Gompertz model
- Offspring: if YES, include in sibship analysis as offspring
- Mother: if YES, include in sibship analysis as parent1
- Father: if YES, include in sibship analysis as parent2
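The age.cat rule described in the variable list can be written as a short cascade of conditions. The following Python sketch mirrors the order of the steps as listed; it is not the code from the R script. It assumes that the length ranges ("from 90 to 210", "from 90 to 180") include both endpoints and that Cataract Reservoir fish can be recognised by a pop value starting with "Cataract". Missing lengths are not handled. The function name is hypothetical.

```python
def age_category(pop, year_capture, length_mm):
    """Assign an age.cat label, applying the rules in the order listed above."""
    # Assumption: any pop label beginning with "Cataract" marks a reservoir fish.
    if pop.startswith("Cataract"):
        return "NA"
    if year_capture < 2020 or length_mm > 270:
        return "adult"
    if length_mm < 90:
        return "0-1YO"
    # Assumption: the stated length ranges are inclusive at both ends.
    if year_capture == 2020 and 90 <= length_mm <= 210:
        return "1-2YO"
    if year_capture in (2021, 2022, 2023) and 90 <= length_mm <= 180:
        return "1-2YO"
    return "2-3YO"


# Made-up example fish (not rows from the covariate file).
examples = [
    age_category("Cataract", 2015, 300),      # reservoir fish
    age_category("Murrumbidgee", 2019, 150),  # captured before 2020
    age_category("Murrumbidgee", 2022, 85),   # shorter than 90 mm
    age_category("Murrumbidgee", 2020, 200),  # 2020 capture, within 90-210 mm
    age_category("Murrumbidgee", 2021, 200),  # 2021 capture, above 180 mm
]
```

Because each condition is checked only for individuals not already classified, the order of the tests matters: a 200 mm fish is "1-2YO" if captured in 2020 but "2-3YO" if captured in 2021.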
File: MurrCat_data_results.csv
Description: A per-individual data file containing field data, inferences from field data, genetic results and inferences from genetic analyses. NA: not available.
Variables
- id: individual identifier
- pop: where fish was caught (Murrumbidgee or Cataract)
- year_capture: year of capture
- Length: length of fish, in mm
- site: Site number (if Murrumbidgee, numbered from upstream to downstream) or name (if Cataract Reservoir)
- sampling_date: Date of sampling
- age.cat: Inferred age category at capture based on length (calculated as explained above)
- Inf.birth.year: Birth cohort, if identified based on size
- Sampling.year: Year of sampling (same as year of capture)
- infDOB: inferred date of birth (assumed 30/11 of the birth year)
- inf.age.at.capture: inferred age at capture, in years
- Growth_residuals: growth residual from Gompertz model
- Offspring: if YES, include in sibship analysis as offspring
- Mother: if YES, include in sibship analysis as parent1
- Father: if YES, include in sibship analysis as parent2
- FSFamilyID: family ID from Colony2 sibship analysis
- Duplicate of: sample with identical genotype
- PHt.Murr.Cat: heterozygosity (PHt, proportion of heterozygous sites) estimated from all Murrumbidgee and Cataract Dam samples
- PC1.MurrCat: value of PC1 from Murrumbidgee+Cataract PCoA
- PC2.MurrCat: value of PC2 from Murrumbidgee+Cataract PCoA
- Ancestry: Murrumbidgee, Cataract or admixed
- PC1.Murr.no.adm: value of PC1 from Murrumbidgee-only ancestry PCoA
- PC2.Murr.no.adm: value of PC2 from Murrumbidgee-only ancestry PCoA
- PHt.Murr.adm.Cat.recapture: heterozygosity (PHt) estimated from all fish captured in Murrumbidgee
- stream.dist.from.site1: stream distance from site to the most upstream Site 1
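As a rough illustration of what a per-individual "proportion of heterozygous sites" measures, the Python sketch below computes it for one individual. It assumes genotypes coded as alternate-allele counts (0, 1 or 2, with 1 being the heterozygote) and None for missing calls, and it assumes missing loci are left out of the denominator. Both are illustrative choices: the raw DArT report uses its own coding, and the exact calculation in the R script may differ. The function name is hypothetical.

```python
def proportion_heterozygous(genotypes):
    """Share of scored (non-missing) loci at which the individual is heterozygous."""
    # Assumed coding: 0 / 2 = homozygotes, 1 = heterozygote, None = missing.
    scored = [g for g in genotypes if g is not None]
    if not scored:
        return None  # no scored loci, so the proportion is undefined
    return sum(1 for g in scored if g == 1) / len(scored)


# Made-up genotype vector: four scored loci, two of them heterozygous.
toy_pht = proportion_heterozygous([0, 1, 2, 1, None])
```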
Code/software
R script 2.Analyses_of_DArT_genotypes.R was run using R version 4.3.3 (R Core Team 2021) with RStudio v2023.03.1+446 (RStudio Team 2020).
This script uses the following packages and functions:
dartr v2.0.4
dartRverse
adegenet
ade4
dartR
devtools
gdsfmt
SNPRelate
ggplot2
LEA
pcadapt
The following analyses are performed:
I. Genetic analyses of Murrumbidgee and Cataract Data
1. GENERAL FILTERING
2. CALCULATING INDIVIDUAL AND POP GENETIC DIVERSITY, PHt.Murr.Cat
3. PCA Murrumbidgee + CATARACT
II. ANALYSIS OF ADULTS of Murrumbidgee-only ancestry fish in SAMBAR
III. CREATING a DATASET of all fish PRUNED FOR PHYSICALLY LINKED LOCI
IV. Creating files for “naive” sibship analysis (no parentage)
V. Estimating effective population size Ne and Nb for different groups of samples (as explained in the manuscript)
1. LDNE on groups with pooled sites per separate cohorts (min N=12)
2. LDNE on all.sites cohort 2022 with admixed
3. LDNE on 10 sep.sites.sep.cohorts
4. LDNE on sep.sites.all.cohorts.no.admixed
5. LDNE on sep.sites.all.cohorts.with.admixed
6. LDNE on sep.sites.all.cohorts.no.admixed
7. LDNE on sep.sites.all.cohorts.with.admixed
VI. Creating a dataset for analysis in BAYESASS, via SAMBAR
VII. PCoA of Murrumbidgee-only (no admixed fish), by sampling sites
VIII. ESTIMATING POP GEN DIVERSITY FOR THE SAME POPULATIONS AS THOSE USED FOR NE CALCULATION
1. pooled sites per separate cohorts (min N=12)
2. all.sites cohort 2022 with admixed
3. sep.sites.sep.cohorts
3a. Murrells.2022 with admixed
4. sep.sites.all.cohorts.no.admixed
5. sep.sites.all.cohorts.with.admixed
6. sep.sites.all.cohorts.no.admixed
7. sep.sites.all.cohorts.with.admixed
IX. Creating an input for an additional COLONY2 analysis using a pruned dataset of loci with MAF>0.01
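Step IX refers to keeping loci with a minor allele frequency (MAF) above 0.01. A minimal Python sketch of such a filter is given below, assuming genotypes coded as alternate-allele counts (0, 1, 2) with None for missing calls. The function names and the toy loci are hypothetical; in the R script this step is carried out with the packages listed above.

```python
def minor_allele_frequency(genotypes):
    """MAF of one biallelic locus from alternate-allele counts (None = missing)."""
    scored = [g for g in genotypes if g is not None]
    if not scored:
        return None  # locus not scored in any individual
    alt_freq = sum(scored) / (2 * len(scored))  # two alleles per diploid individual
    return min(alt_freq, 1 - alt_freq)


def keep_loci_by_maf(loci, threshold=0.01):
    """Keep loci whose MAF is strictly greater than the threshold."""
    kept = {}
    for name, genotypes in loci.items():
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf > threshold:
            kept[name] = genotypes
    return kept


# Made-up loci (hypothetical names, not from the DArT report).
toy_loci = {
    "snp_a": [0, 0, 1, 2],     # alternate allele frequency 3/8
    "snp_b": [0] * 100,        # monomorphic, MAF = 0
    "snp_c": [2, 2, 2, 1],     # alternate allele frequency 7/8, MAF = 1/8
}
kept_loci = keep_loci_by_maf(toy_loci)
```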
Access information
Other publicly accessible locations of the data:
- Bridges Data Repository, DOI: https://doi.org/10.26180/25467973
Data was derived from the following sources:
- Pavlova, A., Pearce, L., Sturgiss, F., Lake, E., Sunnucks, P., & Lintermans, M. (2024). Immediate genetic augmentation and enhanced habitat connectivity are required to secure the future of an iconic endangered freshwater fish population. Evolutionary Applications 17, e70019, https://doi.org/10.1111/eva.70019
This repository contains genetic and phenotypic data for the southeastern Australian freshwater fish Macquarie perch, analyzed in Pavlova A, Pearce L, Sturgiss F, Lake E, Sunnucks P, & Lintermans M. (2024). Immediate genetic augmentation and enhanced habitat connectivity are required to secure the future of an iconic endangered freshwater fish population. Evolutionary Applications, doi: 10.1111/eva.70019
Genetic data (SNP genotypes from reduced-representation sequencing data) were obtained by Diversity Arrays Technology (DArT). File Report_DMacq23-8576_6_moreOrders_SNP_mapping_2.csv is the original DArT genotype data file for Macquarie perch (Macquaria australasica), an output of the DS14 pipeline by Diversity Arrays Technology. Genotypes are for individuals from the Murrumbidgee River and Cataract Reservoir populations (the only ones used during SNP genotyping).
These data were filtered and analyzed in R using script 2.Analyses_of_DArT_genotypes.R. The script filters genotypes to remove loci with mean read depth <6 or >50, loci with reproducibility <95%, loci missing in >10% of individuals, individuals with >20% of missing data, and loci with heterozygosity significantly higher than 0.5 in either Murrumbidgee (analysed as a whole) or Cataract Reservoir. One random SNP per DArT-tag was retained, to reduce the number of loci that are highly non-independent through very close physical linkage. The final complete dataset of 3,447 biallelic SNPs scored for 375 individuals (328 Murrumbidgee, 47 Cataract Reservoir) and its subsets were used for assessment of genetic diversity, population structure and effective population sizes. A subset of data comprising adult individuals was analyzed for population structure, FST and isolation-by-distance, after retaining a single SNP locus within a 50-Kb fragment.
Apart from the genotype file, three additional files used as input by this script are provided: Murrumbidgee_CataractDam_covariate.age.cohort.growth.csv, a covariate file with additional information for each genotyped individual (column descriptions are given in the R script); Dgeo.csv, a matrix of river distances (in km) for isolation-by-distance analysis; and Dgen.csv, a matrix of FST values for isolation-by-distance analysis, produced by SambaR during analyses of adults.
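The numeric quality thresholds listed above can be expressed as simple pass/fail checks. The Python sketch below shows only those thresholds (read depth, reproducibility, and missingness per locus and per individual); it does not cover the heterozygosity test, the one-SNP-per-tag step or the 50-Kb thinning, and it does not reproduce the order in which the R script applies its filters. All function names, field names and values are hypothetical.

```python
def locus_passes(mean_read_depth, reproducibility, missing_fraction):
    """True if a locus survives the depth, reproducibility and missingness filters."""
    return (
        6 <= mean_read_depth <= 50       # drop mean read depth <6 or >50
        and reproducibility >= 0.95      # drop reproducibility <95%
        and missing_fraction <= 0.10     # drop loci missing in >10% of individuals
    )


def individual_passes(missing_fraction):
    """True if an individual has no more than 20% missing genotypes."""
    return missing_fraction <= 0.20


# Made-up per-locus summaries (hypothetical names and values).
toy_summaries = {
    "locus_ok": {"depth": 18.0, "repro": 0.99, "missing": 0.02},
    "locus_low_depth": {"depth": 4.5, "repro": 1.00, "missing": 0.00},
    "locus_high_depth": {"depth": 75.0, "repro": 0.98, "missing": 0.01},
    "locus_poor_repro": {"depth": 20.0, "repro": 0.90, "missing": 0.03},
    "locus_sparse": {"depth": 15.0, "repro": 0.99, "missing": 0.25},
}
kept = [name for name, s in toy_summaries.items()
        if locus_passes(s["depth"], s["repro"], s["missing"])]
```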
All results of genetic and phenotypic analyses for each individual are summarized in MurrCat_data_results.csv.