Genotyping-by-sequencing Single-nucleotide Polymorphism Dataset for Corynorhinus rafinesquii (CORA) and Myotis austroriparius (MYAU)
Data files
Aug 23, 2024 version, files total 88.36 GB
AH52MHBGX5_1_1_fastq.txt.gz (5.60 GB)
AH52MHBGX5_2_1_fastq.txt.gz (5.45 GB)
AH52MHBGX5_3_1_fastq.txt.gz (5.74 GB)
AH52MHBGX5_4_1_fastq.txt.gz (5.57 GB)
AHYLLJBGX5_1_1_fastq.txt.gz (8.60 GB)
AHYLLJBGX5_2_1_fastq.txt.gz (8.58 GB)
AHYLLJBGX5_3_1_fastq.txt.gz (8.60 GB)
AHYLLJBGX5_4_1_fastq.txt.gz (8.58 GB)
Bat_combined_keyfile_11_17_18_Data_Submission.txt (133.85 KB)
README.md (3.59 KB)
Undetermined_S0_L001_R1_001.fastq.gz (7.90 GB)
Undetermined_S0_L002_R1_001.fastq.gz (7.78 GB)
Undetermined_S0_L003_R1_001.fastq.gz (7.99 GB)
Undetermined_S0_L004_R1_001.fastq.gz (7.96 GB)
Abstract
Understanding underlying genetic structure is essential for the conservation and management of rare or uncommon species because it is important to protect their evolutionary potential and adaptability by preserving genetic diversity. Southeastern Myotis (Myotis austroriparius or MYAU) is an uncommon bat species that ranges across much of the southeastern United States. At the state level, MYAU is regarded as endangered or a Species of Greatest Conservation Need across nearly all its distribution. The overall objective of this study was to examine the genetic structure and genetic diversity of MYAU by determining levels of subpopulation connectivity across its range. We collected, sequenced, and analyzed tissue samples from 376 individuals from 38 sites, 11 states, and 8 ecoregions using genotyping-by-sequencing (GBS). We used Sanger sequencing to sequence a portion of the mtDNA control region from 472 tissue samples from 42 sites, 12 states, and 8 ecoregions. GBS results indicated that MYAU has a single, panmictic population with little genetic structure and should be managed as such. Results from mtDNA indicated higher levels of genetic structure, likely due to low effective population size, some level of sex-biased dispersal, and increased mutation rates, but not enough to consider separate management units or clades. Genetic diversity estimates were low to moderate. Results from this study can be used to inform and improve long-term protection and management protocols for MYAU. Researchers and managers should preserve gene flow and ensure subpopulations remain connected by maintaining forest corridors and protecting natural and artificial roosts for MYAU in order to prevent future population segregation.
README: Genotyping-by-sequencing SNP Dataset for Corynorhinus rafinesquii (CORA) and Myotis austroriparius (MYAU)
https://doi.org/10.5061/dryad.stqjq2c5x
These SNP data are from three Illumina sequencing runs of Corynorhinus rafinesquii (CORA) and Myotis austroriparius (MYAU) tissue samples using genotyping-by-sequencing (GBS). All details of the project and sequencing methods can be found in the published manuscripts. We are publishing two separate manuscripts: one for MYAU and one for CORA.
Publication 1 Title: Range-wide Population Genetic Structure and Genetic Diversity of Southeastern Myotis (Myotis austroriparius)
Publication 2 Title: Pending (The CORA manuscript is in preparation. We ask that you please refrain from using the CORA sequences at this time. We will update this once the manuscript has been published.)
Description of the data and file structure
This is a raw sequencing dataset from three sequencing runs on an Illumina NextSeq 500. It is the original dataset, with no editing or filtering. The dataset was later edited and filtered for downstream analyses, so if you wish to use it, please consider the following notes on samples that were edited or removed, and why:
Samples 0811_02 and 0811_03 are duplicated as 0811-02 and 0811-03, respectively. The duplicates were not combined because the names differ by an underscore versus a dash. 0811_02 and 0811-02 are the same sample, as are 0811_03 and 0811-03.
I removed sample NC_E7 because its location is unknown and its species is uncertain. It is very likely MYAU, but the collector did not label it properly. AR_D5 was removed because we later found out that it was a different species (misidentified by the collector). I also removed samples from sites 48, 49, and 50 because the site numbers were mixed up. I think I have since fixed that, but I am not 100% sure. If site is not a concern, then those samples are likely fine because the state and ecoregion should still be correct. The site numbers are at the end of the file names. If you would like specific data on sites, please reach out to the primary author.
For MYAU, I removed site 73 (sample A68, which is the only sample from that site) because all loci were "NA", and it is unclear why the sample was ever retained. I later determined it was CORA rather than MYAU as originally labeled.
For CORA, I removed the single sample from Georgia because it was the only one collected in that state. I also removed one SNP (TP756944) that was not genotyped in Texas. Sample CORA_OK_351 was removed because it is likely a misidentified species (misidentified by the collector).
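The notes above could be applied as a sample-list cleaning step along the following lines. This is only a sketch and not the code used for the published analyses. It assumes each sample has already been split into (species, state, sample code, site) fields, that NC_E7 and AR_D5 mean state plus sample code, that CORA_OK_351 means species, state, and sample code, and that Georgia appears as "GA". None of these readings is confirmed by the keyfile description, and the function names are hypothetical. The "XX" state codes and the "B2" and "Z1" sample codes in the example are placeholders, not real samples.

```python
def normalize_code(code: str) -> str:
    """Treat dash and underscore spellings (0811-02, 0811_02) as one code."""
    return code.strip().replace("-", "_")


# Sites whose numbering may have been mixed up (dropping them is optional).
UNCERTAIN_SITES = {"48", "49", "50"}
# ASSUMPTION: NC_E7 and AR_D5 are read as (state, sample code).
REMOVED_STATE_CODE = {("NC", "E7"), ("AR", "D5")}


def keep_sample(species, state, code, site, drop_uncertain_sites=True):
    """Return False for samples the notes describe as removed."""
    code = normalize_code(code)
    if drop_uncertain_sites and site in UNCERTAIN_SITES:
        return False
    if (state, code) in REMOVED_STATE_CODE:
        return False
    if species == "MYAU" and site == "73":  # sample A68, all loci NA
        return False
    if species == "CORA" and state == "GA":  # ASSUMPTION: Georgia is "GA"
        return False
    if (species, state, code) == ("CORA", "OK", "351"):
        return False
    return True


def clean(records, drop_uncertain_sites=True):
    """Drop removed samples and list each dash/underscore duplicate once."""
    seen = set()
    kept = []
    for species, state, code, site in records:
        key = (species, state, normalize_code(code), site)
        if key in seen:
            continue
        if not keep_sample(*key, drop_uncertain_sites=drop_uncertain_sites):
            continue
        seen.add(key)
        kept.append(key)
    return kept


# Placeholder records for illustration only.
records = [
    ("MYAU", "XX", "0811_02", "1"),
    ("MYAU", "XX", "0811-02", "1"),
    ("MYAU", "NC", "E7", "2"),
    ("CORA", "GA", "Z1", "3"),
    ("MYAU", "XX", "B2", "49"),
]
print(clean(records))
```

Note that this only deduplicates entries in a sample list. Merging the reads of 0811_02 with 0811-02 (and 0811_03 with 0811-03) would still have to happen in whatever pipeline processes the FASTQ files. Replacing every dash could also merge other codes that legitimately differ by a dash, so it is worth checking the keyfile first.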
I have also included a keyfile with barcodes for each sample across all three sequencing runs. Each full sample name includes the species (MYAU or CORA), the U.S. state abbreviation, the unique alphanumeric sample code (created by the authors), and the site number (assigned by the authors), in that order.
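As an illustration of the naming scheme, a full sample name could be split into its four fields roughly as follows. This sketch assumes the fields are separated by underscores, which the README does not state explicitly. Because a sample code may itself contain an underscore (for example 0811_02), everything between the state and the final field is treated as the sample code. The function name, the "XX" state, and the site number in the example are hypothetical.

```python
from typing import NamedTuple


class SampleName(NamedTuple):
    species: str
    state: str
    code: str
    site: str


def parse_sample_name(full_name: str, sep: str = "_") -> SampleName:
    """Split species, state, sample code, and site (ASSUMED separator: '_')."""
    parts = full_name.strip().split(sep)
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 fields in {full_name!r}")
    # First two fields and the last field are fixed; the rest is the code.
    species, state, *code_parts, site = parts
    if species not in ("MYAU", "CORA"):
        raise ValueError(f"unexpected species prefix in {full_name!r}")
    return SampleName(species, state, sep.join(code_parts), site)


# Hypothetical name: "XX" and "99" are placeholders, not real values.
print(parse_sample_name("MYAU_XX_0811_02_99"))
```

If the keyfile turns out to use a different separator, only the `sep` argument needs to change.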
Each sequencing run comprises four files, one for each of the four lanes used during the run. Sequencing run AH27M2BGX9 (see the Flowcell column in the keyfile) is named "Undetermined" in the uploaded files (the four Undetermined_S0_L00*_R1_001.fastq.gz files). The other two sequencing runs are AH52MHBGX5 and AHYLLJBGX5.
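To match each uploaded file to the Flowcell column of the keyfile, the file names can be mapped to a (flowcell, lane) pair. The sketch below assumes that the first number after the flowcell ID is the lane (it is the only part that varies from 1 to 4) and that L001 to L004 in the Undetermined files are lanes 1 to 4. The function name is hypothetical.

```python
import re

# Per the README, the "Undetermined" files belong to this run.
UNDETERMINED_FLOWCELL = "AH27M2BGX9"

# ASSUMPTION: <flowcell>_<lane>_1_fastq.txt.gz, with lane in 1..4.
_FLOWCELL_FILE = re.compile(
    r"^(?P<flowcell>[A-Z0-9]+)_(?P<lane>[1-4])_1_fastq\.txt\.gz$"
)
# ASSUMPTION: L001..L004 encode lanes 1..4.
_UNDETERMINED_FILE = re.compile(
    r"^Undetermined_S0_L00(?P<lane>[1-4])_R1_001\.fastq\.gz$"
)


def run_and_lane(filename: str) -> tuple[str, int]:
    """Return (flowcell, lane) for one of the uploaded FASTQ file names."""
    match = _UNDETERMINED_FILE.match(filename)
    if match:
        return UNDETERMINED_FLOWCELL, int(match.group("lane"))
    match = _FLOWCELL_FILE.match(filename)
    if match:
        return match.group("flowcell"), int(match.group("lane"))
    raise ValueError(f"unrecognized FASTQ file name: {filename!r}")


print(run_and_lane("AH52MHBGX5_3_1_fastq.txt.gz"))
print(run_and_lane("Undetermined_S0_L004_R1_001.fastq.gz"))
```

The returned pair can then be compared against the flowcell and lane values in the keyfile to select the barcodes that apply to a given file.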
Sharing/Access information
If you have specific questions about the dataset, such as requests for more specific metadata, please reach out to the corresponding author.
Code/Software
Please see the manuscript for a description of software used.
Methods
Please see the published manuscript for a detailed description of the methods.