Extant and extinct bilby genomes combined with Indigenous knowledge improve conservation of a unique Australian marsupial
Data files
Apr 23, 2024 version files 12.64 MB
- Bilby_03_Supplementary_Text_File-code-final.txt
- Ninu_popgen_data.xlsx
- README.md
Abstract
The Ninu (Greater bilby, Macrotis lagotis) is a desert-dwelling, culturally and ecologically important marsupial. In collaboration with Indigenous rangers and conservation managers, we generated the first Ninu chromosome-level genome assembly (3.66 Gbp) and genome sequences for the extinct Yallara (Lesser bilby, Macrotis leucura). We developed and tested a scat SNP panel, based on our genomic datasets, to inform current and future conservation actions, to undertake future ecological assessments, and improve our understanding of Ninu genetic diversity in managed and wild populations. We also assessed the beneficial impact of targeted conservation actions, like translocations, in the contemporary metapopulation (N=363 Ninu). Resequenced genomes (temperate Ninu=6; semi-arid Ninu=6; Yallara=4) revealed two major population crashes during global cooling events for both species and differences in Ninu genes involved in anatomical and metabolic pathway adaptations to aridity. Despite their 45-year long captive history, Ninu have fewer long runs of homozygosity than other larger mammals, which may be attributable to their boom-bust life-history. We also investigated the unique Ninu biology using 12 tissue transcriptomes revealing expression of all 115 conserved eutherian chorioallantoic placentation genes in the uterus; an XY1Y2 sex chromosome system generated by fusion of the X with a large telocentric autosome; and expansions in olfactory receptor genes. Together, we demonstrate the holistic value of genomics in improving key conservation management actions, understanding unique biological traits, and developing tools for Indigenous rangers to monitor remote wild populations.
README: Extant and extinct bilby genomes combined with Indigenous knowledge improve conservation of a unique Australian marsupial
https://doi.org/10.5061/dryad.gtht76htz
This accession contains the code used to select the MassARRAY SNPs and the DArTseq genotype data for the Ninu (Greater bilby, Macrotis lagotis) associated with (1) the scat genotyping MassARRAY design and (2) the population genetic analysis described in Hogg et al.
Description of the data and file structure
The text file "Bilby_03_Supplementary_Text_File-code-final" contains R code in R Markdown format used to design the SNP panel for the custom MassARRAY scat genotyping assay. The code is annotated throughout and includes the following steps:
- Extracting allele depth and genotype depth
- Filtering on genotyping quality
- Removing replicate samples and lower quality samples
- SNP selection as detailed in the annotated code. SNP selection included selecting only one SNP per locus; sequence length >= 50 bp; SNP position in read 25:45; genotype quality score > 20; minimum average read depth per locus = 10; maximum average read depth per locus = 200; genotyping rate per locus >= 90%; genotyping rate per individual >= 80%; coverage difference between reference and alternate allele between 20% and 80%; reproducibility between technical replicates > 95%; paralogs (removing sequence similarity >= 25%); Hardy-Weinberg equilibrium and linkage disequilibrium threshold of 0.5. For the SNP panel designed for individual identification, SNPs were also filtered for heterozygosity between 0.2 and 0.6 and a minor allele frequency > 0.3.
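The threshold-based part of the SNP selection above can be illustrated with a minimal Python sketch. This is not the authors' R code: the field names (`genotype_quality`, `mean_depth`, `call_rate`, `heterozygosity`, `maf`) and the example records are hypothetical, only a subset of the listed filters is shown, and the heterozygosity range is assumed to be inclusive.

```python
# Thresholds quoted from the list above; everything else is hypothetical.
MIN_GENOTYPE_QUALITY = 20                  # genotype quality score > 20
MIN_MEAN_DEPTH, MAX_MEAN_DEPTH = 10, 200   # average read depth per locus
MIN_LOCUS_CALL_RATE = 0.90                 # genotyping rate per locus >= 90%
HET_RANGE = (0.2, 0.6)                     # assumed inclusive at both ends
MIN_MAF_IDENTIFICATION = 0.3               # minor allele frequency > 0.3


def passes_general_filters(snp: dict) -> bool:
    """Apply the quality, depth and call-rate thresholds to one SNP record."""
    return (
        snp["genotype_quality"] > MIN_GENOTYPE_QUALITY
        and MIN_MEAN_DEPTH <= snp["mean_depth"] <= MAX_MEAN_DEPTH
        and snp["call_rate"] >= MIN_LOCUS_CALL_RATE
    )


def passes_identification_filters(snp: dict) -> bool:
    """Add the extra filters used for the individual-identification panel."""
    low, high = HET_RANGE
    return (
        passes_general_filters(snp)
        and low <= snp["heterozygosity"] <= high
        and snp["maf"] > MIN_MAF_IDENTIFICATION
    )


# Hypothetical per-SNP summaries, for illustration only.
candidates = [
    {"id": "snp_a", "genotype_quality": 35, "mean_depth": 42.0,
     "call_rate": 0.97, "heterozygosity": 0.45, "maf": 0.41},
    {"id": "snp_b", "genotype_quality": 35, "mean_depth": 250.0,
     "call_rate": 0.97, "heterozygosity": 0.45, "maf": 0.41},  # depth too high
    {"id": "snp_c", "genotype_quality": 30, "mean_depth": 20.0,
     "call_rate": 0.95, "heterozygosity": 0.15, "maf": 0.10},  # low diversity
]

general = [s["id"] for s in candidates if passes_general_filters(s)]
identification = [s["id"] for s in candidates if passes_identification_filters(s)]
print(general)         # ['snp_a', 'snp_c']
print(identification)  # ['snp_a']
```

The remaining criteria (one SNP per locus, position in read, paralog removal, Hardy-Weinberg equilibrium and linkage disequilibrium) need sequence or genotype-level data and are documented in the annotated R code itself.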
The Excel workbook contains four sheets. For details of SNP calling and filtering, see the supplementary material of the paper.
- README contains some additional detail about the datasets
- SNP data contains the DArTseq genotypes at the n=9906 SNPs used in the population genetic analysis of n=363 Ninu. Data is formatted as one row per SNP and one column per sample. Genotypes are coded as 0/0 = homozygous for the reference allele (REF column), 0/1 = heterozygous, 1/1 = homozygous for the alternate allele (ALT column), NA = missing data. Data is aligned to the Ninu genome generated in this study (v1.9); locus positions are indicated by the CHROM, POS, and ID columns.
- MassARRAY SNPs contains the DArTseq genotypes at the n=35 SNPs selected for the MassARRAY scat genotyping assay. Data is formatted as one row per SNP and one column per sample. Genotypes are coded as 0 = reference allele homozygote (REF column), 1 = heterozygote, 2 = alternate allele homozygote (ALT column), NA = missing data. Data is aligned to an earlier version of the Ninu genome generated in this study (v1.4.3); locus positions are indicated by the CHROM, POS, and ID columns. Additional columns provide information on the quality and informativeness of each SNP.
- Sample key contains the list of Ninu sample IDs and their associated population metadata. Metadata includes the sample source population (for Ninu bred at source sites), founder population (for Ninu that were translocated), and offspring population (for Ninu that were born at translocated sites).
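Because the two genotype sheets use different codings (0/0, 0/1, 1/1 versus 0, 1, 2), a small conversion step may be useful when combining them. The following Python sketch shows one way to map the "SNP data" coding onto alternate-allele counts; the function name and the example row are hypothetical and not part of the deposited files.

```python
from typing import Optional

# "SNP data" sheet coding -> alternate-allele count, which matches the
# 0/1/2 coding described for the "MassARRAY SNPs" sheet.
GENOTYPE_TO_ALT_COUNT = {"0/0": 0, "0/1": 1, "1/1": 2}


def to_alt_count(genotype: Optional[str]) -> Optional[int]:
    """Return the number of alternate alleles, or None for missing data."""
    if genotype is None:
        return None
    code = genotype.strip()
    if code in ("", "NA"):
        return None
    if code not in GENOTYPE_TO_ALT_COUNT:
        raise ValueError(f"unexpected genotype code: {genotype!r}")
    return GENOTYPE_TO_ALT_COUNT[code]


# Hypothetical genotypes for one SNP across four samples.
row = ["0/0", "0/1", "1/1", "NA"]
counts = [to_alt_count(g) for g in row]
print(counts)  # [0, 1, 2, None]
```

Raising on unknown codes, instead of silently treating them as missing, makes formatting problems in an exported sheet visible early.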
Sharing/Access information
The reference Ninu genome, resequenced Ninu and Yallara genomes, and transcriptomes are available on NCBI under PRJNA1049866 and PRJNA1049868 in addition to the Australasian Genomes Open Data Store:
Methods
There are two datasets included in the Excel workbook:
- A set of SNPs (n=9906) generated using DArTseq, a form of reduced representation sequencing, and used in the population genetic analyses of the Ninu (n=363 samples). See Supplementary Note 2.4 for details of SNP calling and filtering. Briefly, reads were cleaned and aligned to the Ninu reference genome generated in this study (v1.9). Variants were called with Stacks and filtered to retain SNPs with a minor allele frequency of ≥0.01, minimum average allelic depth of 2.5× per allele, allelic coverage difference ≤80%, call rate ≥70%, locus heterozygosity ≤90%, and reproducibility between technical replicates ≥90%; putatively sex-linked SNPs were removed.
- A set of SNPs (n=35) generated using DArTseq, a form of reduced representation sequencing, and selected for use in the MassARRAY Ninu scat genotyping assay. The DArTseq reads were mapped to an earlier version of the Ninu reference genome generated in this study (v1.4.3). See Supplementary Note 2.5 for details of the MassARRAY SNP Panel Design.
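Three of the per-SNP statistics named above (call rate, minor allele frequency and locus heterozygosity) can be recomputed from the genotype table. The Python sketch below assumes genotypes have already been converted to alternate-allele counts (0, 1, 2, or None for missing) and that heterozygosity means the observed fraction of heterozygous calls. It is an illustration only, not the filtering pipeline used in the study, and the example genotypes are hypothetical.

```python
from typing import List, Optional

Genotypes = List[Optional[int]]  # alternate-allele counts; None = missing


def call_rate(genotypes: Genotypes) -> float:
    """Fraction of samples with a non-missing genotype."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: Genotypes) -> Optional[float]:
    """Frequency of the rarer allele among called genotypes (diploid)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def observed_heterozygosity(genotypes: Genotypes) -> Optional[float]:
    """Fraction of called genotypes that are heterozygous."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    return sum(g == 1 for g in called) / len(called)


def passes_popgen_thresholds(genotypes: Genotypes) -> bool:
    """Check MAF >= 0.01, call rate >= 70% and heterozygosity <= 90%."""
    maf = minor_allele_frequency(genotypes)
    het = observed_heterozygosity(genotypes)
    if maf is None or het is None:
        return False
    return maf >= 0.01 and call_rate(genotypes) >= 0.70 and het <= 0.90


# Hypothetical genotypes for three SNPs.
variable_snp = [0, 1, 2, None]             # call rate 0.75, MAF 0.5
monomorphic_snp = [0, 0, 0, 0]             # MAF 0.0
sparse_snp = [0, 0, None, None, 1]         # call rate 0.6

print(passes_popgen_thresholds(variable_snp))     # True
print(passes_popgen_thresholds(monomorphic_snp))  # False
print(passes_popgen_thresholds(sparse_snp))       # False
```

The depth, allelic coverage and replicate-reproducibility filters depend on read-level data that are not in the workbook, so they are omitted here.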
The .txt file contains the R code in R Markdown format used to select SNPs for the scat genotyping assay.
The Excel workbook also contains a README sheet with additional information about the coding of genotypes, and a sample key sheet that contains population metadata for each genotyped sample.