Eastern bettong (Bettongia gaimardi) reintroduced to Mulligan's Flat Woodland Sanctuary and Tidbinbilla Nature Reserve: DArT SNPs + individual information
Data files
May 01, 2023 version files 9.58 MB
- Brockett_et_al_B_gaimardi_Readme.txt
- idpop_bettong_allsource.xlsx
- idpop_bettong_cohortbypop.csv
- SNP_mapping_1.csv
Abstract
Incorporating genetic data into conservation programmes improves management outcomes, but the impact of different sample-grouping methods on genetic diversity analyses is poorly understood. To address this, the multi-source reintroduction of the eastern bettong (Bettongia gaimardi) was used as a long-term case study to investigate how sampling regimes may affect common genetic metrics, and hence management decisions. The dataset comprised 5307 SNPs sequenced across 263 individuals. Samples included 45 founders from five genetically distinct Tasmanian source regions, and 218 of their descendants captured during annual monitoring at Mulligan’s Flat Woodland Sanctuary (MFWS; 121 samples across eight generations) and Tidbinbilla Nature Reserve (TNR; 97 samples across nine generations). The most management-informative sampling regime was found to be generational cohorts, which provided detailed long-term trends in genetic diversity. When these generation-specific trends were not investigated, recent changes in population genetics were masked, and the resulting management recommendations would have been less appropriate. The results also highlighted the importance of treating establishment and persistence as separate phases of a multi-source reintroduction. The establishment phase (useful for informing early adaptive management) should consist of no fewer than two generations, and continue until admixture is achieved (admixture defined here as >80% of individuals possessing >60% of source genotypes, with no one source composing >70% of the genotype of >20% of individuals). This ensures that the persistence phase analyses of population trends remain minimally biased. Based on this case study, we recommend that emphasis be given to the value of generationally specific analyses, and that conservation programmes collect DNA samples throughout the establishment and persistence phases rather than only when analysis is imminent.
We also recommend that population genetic analyses for multi-source reintroductions consider whether admixture has been achieved when calculating descriptive genetic metrics.
Methods
Genotyping
Following the reintroduction monitoring protocol, adults were marked by PIT microchip at first capture and an ear biopsy was taken (Manning et al., 2019). DNA was extracted from these ear clips using a modified proteinase-K digestion and salting-out method (Miller et al., 1988) with ethanol precipitation. Plates were prepared according to DArT specifications (https://www.diversityarrays.com/orderinstructions/). SNP data were obtained using DArT’s Bettong DArTseq 1.0, an optimised complexity reduction technique using the restriction enzymes PstI and SphI in a digestion/ligation protocol modified from Kilian et al. (2012). The sequences produced were processed using DArT’s proprietary analytical pipelines.
Data Filtration
The initial 11,097 candidate SNPs returned by DArT were filtered in R v3.6.2 (R Core Team, 2019), using the dartR package (Grueber and Georges, 2019). Firstly, a frequency plot was constructed, showing that missingness in the data was locus-based. Therefore, filtration began by applying a locus call-rate threshold of 0.95, ensuring that each retained locus was called in at least 95% of samples. Loci were then filtered on reproducibility using the repavg function, with minimum acceptable repeatability set at 95%.
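The locus call-rate step described above can be illustrated with a short standalone sketch. This is not the dartR implementation used for the dataset; it is a plain Python illustration that assumes genotypes are coded 0, 1 or 2 with None for a missing call. The function names and the toy data are hypothetical.

```python
# Illustrative only: rows are individuals, columns are loci.
# Genotypes are coded 0/1/2 (assumed coding); None marks a missing call.

def locus_call_rates(genotypes):
    """Return the fraction of individuals with a non-missing call at each locus."""
    n_individuals = len(genotypes)
    n_loci = len(genotypes[0])
    return [
        sum(row[j] is not None for row in genotypes) / n_individuals
        for j in range(n_loci)
    ]


def filter_loci_by_call_rate(genotypes, threshold=0.95):
    """Keep loci whose call rate is at least `threshold`.

    Returns the filtered matrix and the indices of the retained loci.
    """
    rates = locus_call_rates(genotypes)
    kept = [j for j, rate in enumerate(rates) if rate >= threshold]
    filtered = [[row[j] for j in kept] for row in genotypes]
    return filtered, kept


# Made-up data: 40 individuals, 3 loci.
toy_calls = [[0, 1, 2] for _ in range(40)]
toy_calls[0][1] = None            # locus 1: 1 missing call  -> call rate 0.975
for i in range(4):
    toy_calls[i][2] = None        # locus 2: 4 missing calls -> call rate 0.90

filtered_calls, kept_loci = filter_loci_by_call_rate(toy_calls, threshold=0.95)
print(kept_loci)  # [0, 1]
```

In this toy example the third locus falls below the 0.95 threshold and is dropped, while the two better-covered loci are retained.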
As the selection of minor allele frequency (MAF) thresholds can influence downstream analyses, with distinct clusters more difficult to detect under more stringent MAF cut-offs (Linck and Battey, 2019), a range of MAF values (0.01, 0.02, 0.05) was tested. Since the inclusion of rarer alleles can reveal smaller-scale population structure (De la Cruz and Raska, 2014; Tabangin et al., 2009), and Tabangin et al. (2009) found that rare alleles do not give significantly more false positives than common alleles, the lowest MAF threshold that produced biologically meaningful results was selected (MAF 0.02, Supplemental Material 1).
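To make the threshold comparison concrete, the sketch below computes a minor allele frequency per locus and counts how many loci survive each candidate cut-off. It is a hedged illustration in plain Python, not the dartR routine used by the authors; the 0/1/2 genotype coding, the function names and the toy data are assumptions.

```python
# Illustrative only: rows are individuals, columns are loci.
# Genotypes count copies of the alternate allele (0/1/2); None is a missing call.

def minor_allele_frequency(calls):
    """MAF for one locus, ignoring missing calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))  # two alleles per diploid individual
    return min(alt_freq, 1 - alt_freq)


def loci_retained_per_threshold(genotypes, thresholds=(0.01, 0.02, 0.05)):
    """Count loci with MAF >= each candidate threshold."""
    n_loci = len(genotypes[0])
    mafs = [
        minor_allele_frequency([row[j] for row in genotypes])
        for j in range(n_loci)
    ]
    return {t: sum(maf >= t for maf in mafs) for t in thresholds}


# Made-up data: 40 individuals, 3 loci, all homozygous reference to start with.
toy_maf = [[0, 0, 0] for _ in range(40)]
toy_maf[0][0] = 1                 # locus 0: 1 heterozygote   -> MAF 1/80  = 0.0125
for i in range(3):
    toy_maf[i][1] = 1             # locus 1: 3 heterozygotes  -> MAF 3/80  = 0.0375
for i in range(10):
    toy_maf[i][2] = 1             # locus 2: 10 heterozygotes -> MAF 10/80 = 0.125

retained = loci_retained_per_threshold(toy_maf)
print(retained)  # {0.01: 3, 0.02: 2, 0.05: 1}
```

The counts show the trade-off discussed above: each stricter cut-off discards more of the rare-allele loci that may carry fine-scale structure.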
A function to set the minimum allowable sequence depth to 10 was applied, and an allele depth ratio maximum of two was implemented to remove SNPs where one allele was sequenced at much greater depth than the other, because such SNPs may be prone to null alleles. Secondary SNPs were then filtered using the inbuilt dartR function gl.filter.secondaries, with the method set to “best”. In tags where multiple SNPs are present, this retains the SNP with the highest Polymorphic Information Content (Shafer et al., 2015). Here, 86 secondaries were removed. Tests for Hardy-Weinberg Equilibrium were run using gl.filter.hwe, identifying no issues. Individuals were then filtered using a call-rate of 0.90, removing 7 samples. Finally, any monomorphic loci created by other filtering processes were removed. Though Schmidt et al. (2021) caution that removing monomorphs can bias heterozygosity estimates, the sample sizes here are not particularly small and SNPs are only compared within the dataset. Assessments of filtering strategy on heterozygosity showed minimal differences in MFWS when SNPs were called on MFWS compared to the whole dataset (sensu Schmidt et al., 2021), and so the common method of filtering SNPs on the whole dataset was retained. The resultant dataset contained 5307 SNPs across 263 individuals.
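The last two steps (individual call-rate filtering followed by removal of monomorphic loci) can be sketched as follows. This is a minimal standalone illustration in plain Python under an assumed 0/1/2 genotype coding, not the dartR code used for the dataset; the function names and the toy data are hypothetical. It shows why the monomorph check comes last: removing an individual can leave a previously variable locus with only one allele.

```python
# Illustrative only: rows are individuals, columns are loci.
# Genotypes count copies of the alternate allele (0/1/2); None is a missing call.

def filter_individuals_by_call_rate(genotypes, threshold=0.90):
    """Keep individuals whose fraction of non-missing calls is at least `threshold`."""
    kept = [
        i for i, row in enumerate(genotypes)
        if sum(g is not None for g in row) / len(row) >= threshold
    ]
    return [genotypes[i] for i in kept], kept


def drop_monomorphic_loci(genotypes):
    """Remove loci at which only one allele is observed among the remaining calls."""
    n_loci = len(genotypes[0])
    kept = []
    for j in range(n_loci):
        observed = [row[j] for row in genotypes if row[j] is not None]
        alt_count = sum(observed)
        # Polymorphic only if both the reference and alternate allele are present.
        if observed and 0 < alt_count < 2 * len(observed):
            kept.append(j)
    return [[row[j] for j in kept] for row in genotypes], kept


# Made-up data: 3 individuals, 4 loci.
toy_final = [
    [0, 1, 2, 0],
    [0, 0, 1, 0],
    [0, None, None, 2],   # call rate 0.5, so this individual is removed
]

by_individual, kept_individuals = filter_individuals_by_call_rate(toy_final, threshold=0.90)
final_matrix, kept_final_loci = drop_monomorphic_loci(by_individual)
print(kept_individuals)  # [0, 1]
print(kept_final_loci)   # [1, 2]
```

Locus 3 is variable in the full toy matrix but becomes monomorphic once the poorly genotyped individual is dropped, so it is removed together with locus 0, which was invariant from the start.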