Single nucleotide polymorphism genotypes for the Australian blackspot shark and the milk shark in Northern Australian waters
Data files
Nov 07, 2024 version files 169.66 MB
- CARCOA_sample_data.csv (32.89 KB)
- CARCOA_SNP_data_unfiltered.csv (109.46 MB)
- metadata_Dcarcha21-5958.xlsx (27.33 KB)
- metadata_DRHIzn21-5993.xlsx (23.33 KB)
- README.md (5.66 KB)
- Report_DRhizn21-5993_SNP_mapping_2.csv (60.10 MB)
- RHIACU_sample_data.csv (10.16 KB)
Abstract
https://doi.org/10.5061/dryad.7h44j103q
Description of the data and file structure
Sample collection
We took tissue samples from 196 individual *Rhizoprionodon acutus* (Figure 1) and 634 individual *Carcharhinus coatesi* (Figure 2) that were collected in Northern Territory (NT), Australia, waters between May 2018 and November 2019. These sharks were caught as bycatch in commercial trawl fisheries within Australia’s Exclusive Economic Zone (EEZ) and retained for our research use. Australia’s EEZ around the NT encompasses the Timor Sea in the north-west, the Arafura Sea in the north and the Gulf of Carpentaria in the east (Figure 1, Figure 2). For initial analyses of population genetic patterns, we grouped the NT samples into four regions. We included three additional regions (Pilbara, Kimberley and Papua New Guinea [PNG]) for *R. acutus* (Figure 1) and two additional regions (Kimberley and PNG) for *C. coatesi* (Figure 2). These additional samples from Western Australia (WA) and PNG provide broader context for the degree of genetic differentiation and connectivity among populations in NT waters, relative to a wider sample across the regional distribution of these species. The Western Australian samples were provided by Alister Harry (WA Department of Primary Industries and Regional Development), and the PNG samples were provided by Will White (CSIRO).
DNA for single-nucleotide polymorphism (SNP) genotyping was extracted using the DArTSeq protocol through Diversity Arrays P/L (Kilian et al., 2012). We sent tissue samples in 100% ethanol to Diversity Arrays for DNA extraction and genomic library preparation for sequencing. DArTSeq involves an initial step of ‘genome reduction’ to subset a small fraction of the genome of each individual for high-throughput sequencing, followed by bioinformatics analysis to identify DNA sequences containing single nucleotide positions that vary among individuals within and among populations. The DArTSeq protocols for *Carcharhinus* and *Rhizoprionodon* are available for future projects via Diversity Arrays.
The data were used for genetic analyses to infer population structure as described in the published report for the Fisheries Research and Development Corporation associated with this dataset.
Kilian, A., Wenzl, P., Huttner, E., Carling, J., Xia, L., Blois, H., Caig, V., Heller-Uszynska, K., Jaccoud, D., & Hopper, C. (2012). Diversity arrays technology: A generic genome profiling technology on open platforms. In Data production and analysis in population genomics (pp. 67–89). Springer.
Files and variables
Description:
Unfiltered SNP datasets for Rhizoprionodon acutus and Carcharhinus coatesi, together with CSV files of individual sample information, including sampling location. The sample-data columns are: 'id' = individual sample identifier; 'pop' = stratum variable for calculation of genetic diversity metrics, based on broad sampling region; 'lat' = latitude of sampling location (rounded); 'lon' = longitude of sampling location (rounded); and 'Full pop name' = full name of the population stratum used for analysis. The files are in the default format required for filtering and analysis in the dartR package (Gruber et al., 2018) for the R statistical environment (R Core Team, 2024). The two metadata files describe the unfiltered SNP data provided by Diversity Arrays P/L.
Gruber, B., Unmack, P. J., Berry, O. F., & Georges, A. (2018). dartR: An R package to facilitate analysis of SNP data generated from reduced representation genome sequencing. Molecular Ecology Resources, 18(3), 691–699. https://doi.org/10.1111/1755-0998.12745
R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org
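The sample-data columns described above can be checked with a short script before the files are passed to any analysis. The following is a minimal Python sketch using only the standard library. It assumes the CSV headers are spelled exactly as listed ('id', 'pop', 'lat', 'lon', 'Full pop name'), which has not been verified against the files themselves. The function names and the demo rows are hypothetical and are not values from the dataset.

```python
import csv
import io
from collections import Counter

# Column names as given in the file description; exact header spelling
# in the real CSV files is an assumption.
EXPECTED_COLUMNS = ["id", "pop", "lat", "lon", "Full pop name"]


def load_samples(handle):
    """Read sample rows from an open CSV handle and check the expected columns."""
    reader = csv.DictReader(handle)
    present = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in present]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def count_by_pop(rows):
    """Count individuals in each 'pop' stratum."""
    return Counter(row["pop"] for row in rows)


# Hypothetical rows, NOT taken from the dataset; they only make the sketch runnable.
DEMO_CSV = """id,pop,lat,lon,Full pop name
S001,POP_A,-12,131,Hypothetical region A
S002,POP_A,-12,131,Hypothetical region A
S003,POP_B,-11,136,Hypothetical region B
"""

demo_rows = load_samples(io.StringIO(DEMO_CSV))
demo_counts = count_by_pop(demo_rows)
print(dict(demo_counts))
```

To apply the same check to the real files, open RHIACU_sample_data.csv or CARCOA_sample_data.csv with `open(path, newline="")` and pass the handle to `load_samples`. If the headers are spelled as assumed, the per-stratum totals should sum to the number of individuals in each sample file.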
Code/software
Do, C., Waples, R. S., Peel, D., Macbeth, G. M., Tillett, B. J., & Ovenden, J. R. (2014). NeEstimator v2: Re-implementation of software for the estimation of contemporary effective population size (Ne) from genetic data. Molecular Ecology Resources, 14(1), 209–214. https://doi.org/10.1111/1755-0998.12157
Gruber, B., Unmack, P. J., Berry, O. F., & Georges, A. (2018). dartR: An R package to facilitate analysis of SNP data generated from reduced representation genome sequencing. Molecular Ecology Resources, 18(3), 691–699. https://doi.org/10.1111/1755-0998.12745
Jombart, T., & Ahmed, I. (2011). adegenet 1.3-1: New tools for the analysis of genome-wide SNP data. Bioinformatics, 27(21), 3070–3071. https://doi.org/10.1093/bioinformatics/btr521
Peakall, R., & Smouse, P. E. (2012). GenAlEx 6.5: Genetic analysis in Excel. Population genetic software for teaching and research—an update. Bioinformatics, 28(19), 2537–2539. https://doi.org/10.1093/bioinformatics/bts460
Pritchard, J. K., Stephens, M., & Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics, 155(2), 945–959. https://doi.org/10.1093/genetics/155.2.945
R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org
The two datasets presented here include single nucleotide polymorphism (SNP) data for the milk shark (Rhizoprionodon acutus) and the Australian blackspot shark (Carcharhinus coatesi) generated by Diversity Arrays Pty Ltd using the DArTSeq method. DArTSeq (Kilian et al., 2012) involves an initial step of ‘genome reduction’ to subset a small fraction of the genome of each individual for high-throughput sequencing, followed by bioinformatics analysis to identify DNA sequences containing single nucleotide positions that vary among individuals within and among populations. The DArTSeq protocols for Carcharhinus and Rhizoprionodon are available for future projects via Diversity Arrays.
Using the DArTSeq protocol, 196 individual R. acutus and 634 individual C. coatesi were genotyped. The samples came from sharks collected in Northern Territory (Australia) waters between May 2018 and November 2019, and from three additional regions: the Pilbara and Kimberley regions of Western Australia, and Papua New Guinea.
Sampling locations of individuals and full unfiltered SNP genotypes are included in the dataset.