SNPs markers of Trypoxylus dichotomus developed by using specific‐locus amplified fragment sequencing (SLAF‐seq) techniques
You, Chongjuan (2020), SNPs markers of Trypoxylus dichotomus developed by using specific‐locus amplified fragment sequencing (SLAF‐seq) techniques, Dryad, Dataset, https://doi.org/10.5061/dryad.qv9s4mwcz
The Japanese rhinoceros beetle Trypoxylus dichotomus is one of the largest beetle species in the world and is commonly used in traditional Chinese medicine. Ten subspecies of T. dichotomus and a related Trypoxylus species (T. kanamorii) have been described throughout Asia, but their taxonomic delimitations remain problematic. To clarify the taxonomy and the degree of genetic differentiation of Trypoxylus populations, we investigated the genetic structure, genetic variability, and phylogeography of 53 specimens of Trypoxylus species from 44 locations in five Asian countries (China, Japan, Korea, Thailand, and Myanmar). Using specific‐locus amplified fragment sequencing (SLAF‐seq) techniques, we developed 330,799 SLAFs from 114.16 M reads, in turn yielding 46,939 high-resolution single nucleotide polymorphisms (SNPs) for genotyping.
For SLAF-seq analysis, DNA concentration was quantified using a NanoDrop-2000 spectrophotometer, and all DNA samples were diluted to 50 ng/μL. SLAF library construction was carried out following Sun et al. (2013) with minor modifications. To obtain evenly distributed SLAF tags and to avoid repetitive SLAF tags for maximum SLAF-seq efficiency, simulated restriction enzyme digestion was carried out in silico, using the reference genome of Dendroctonus ponderosae (https://www.ncbi.nlm.nih.gov/genome/?term=Dendroctonus+ponderosae) to predict enzyme digestion. Genomic DNA was then digested using the RsaI-HaeIII restriction enzyme combination. DNA fragments of 264–364 bp were selected as SLAFs and prepared for paired-end sequencing on the Illumina HiSeq 2500 sequencing platform (Illumina, Inc.; San Diego, CA) at Biomarker Technologies Corporation (Beijing, China).
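The in silico digestion and size selection described above can be illustrated with a minimal Python sketch. It is not the pipeline used to produce this dataset. The recognition sites (RsaI: GT^AC, HaeIII: GG^CC, both blunt cutters) are standard, but the function names, the single-string input, and the assumption that the 264–364 bp window applies directly to fragment length (with no adapter sequence) are illustrative assumptions.

```python
import re

# Recognition site and cut offset (bases into the site) for each enzyme.
# RsaI cuts GT^AC and HaeIII cuts GG^CC; both leave blunt ends.
ENZYMES = {"RsaI": ("GTAC", 2), "HaeIII": ("GGCC", 2)}


def cut_positions(seq, enzymes=ENZYMES):
    """Return sorted cut coordinates for all enzymes on one strand."""
    seq = seq.upper()
    cuts = set()
    for site, offset in enzymes.values():
        # A lookahead pattern reports overlapping sites as well.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def digest(seq, enzymes=ENZYMES):
    """Split seq into the fragments produced by a complete digestion."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def select_slaf_candidates(fragments, min_len=264, max_len=364):
    """Keep fragments inside the size window (assumed to be fragment length)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


if __name__ == "__main__":
    # Toy sequence, not real genome data: one RsaI site and one HaeIII site.
    toy = "A" * 10 + "GTAC" + "T" * 300 + "GGCC" + "A" * 5
    fragments = digest(toy)
    print([len(f) for f in fragments])
    print(len(select_slaf_candidates(fragments)))
```

On a real reference genome the same functions would be applied per scaffold, and the enzyme combination and size window would be tuned until the predicted fragments are evenly distributed and contain few repeats.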
Raw paired-end reads were clustered based on sequence similarity. Sequences with over 90% identity were grouped in one SLAF tag, and SLAFs with low-depth coverage were filtered out (Sun et al. 2013; Huang et al. 2016; Wang et al. 2019). Only groups with higher depth and four tags or fewer were identified as high-quality SLAFs, with SLAFs possessing two, three, or four tags identified as polymorphic. In this study, depth was 17.63× on average, and a total of 1,374,985 high‐quality unique SLAF tags were obtained, with 330,799 of those tags considered polymorphic.
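The tag-count rule above can be written as a short Python sketch, offered as an illustration only. The source does not state the depth cutoff, so `min_depth` is a hypothetical placeholder. The function name, the category labels, and the use of summed tag depth as the depth of a group are also assumptions.

```python
def classify_slaf(tag_depths, min_depth=5):
    """Classify one SLAF group from the read depth of each distinct tag.

    tag_depths: list with one read-depth value per tag in the group.
    min_depth:  hypothetical cutoff; the actual value is not given in the source.
    """
    if not tag_depths or sum(tag_depths) < min_depth:
        return "low_depth"            # filtered out before classification
    n_tags = len(tag_depths)
    if n_tags > 4:
        return "too_many_tags"        # more than four tags: not high quality
    if n_tags >= 2:
        return "polymorphic"          # two, three, or four tags
    return "non_polymorphic"          # a single tag


if __name__ == "__main__":
    # Toy depth values, not taken from the dataset.
    for depths in ([20], [12, 9], [6, 5, 4, 3, 2], [1, 1]):
        print(depths, classify_slaf(depths))
```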
Development of SNP markers was based on a reference sequence with very high depth in each SLAF tag. SAMtools and GATK were used for mapping and SNP calling (McKenna et al. 2010; Li et al. 2009; Wang et al. 2019). A total of 46,939 SNPs with a minor allele frequency (MAF) of ≥ 0.05 and an integrity score of ≥ 80% were employed in downstream analyses.
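The final filtering step can be sketched in Python as follows. This is a minimal illustration, not the authors' code. It assumes biallelic SNPs coded as the number of alternate alleles per diploid sample (0, 1, or 2, with `None` for a missing call), and it assumes that "integrity" means the fraction of samples with a called genotype. All function names are hypothetical.

```python
def integrity(genotypes):
    """Fraction of samples with a called genotype (assumed meaning of integrity)."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF of a biallelic SNP from alternate-allele counts (0, 1, 2, or None)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))   # diploid: two alleles per sample
    return min(alt_freq, 1 - alt_freq)


def passes_filters(genotypes, min_maf=0.05, min_integrity=0.80):
    """Apply the MAF >= 0.05 and integrity >= 80% thresholds from the text."""
    return (integrity(genotypes) >= min_integrity
            and minor_allele_frequency(genotypes) >= min_maf)


if __name__ == "__main__":
    # Toy genotype vectors, not taken from the dataset.
    print(passes_filters([0, 0, 1, 2, None]))      # 80% called, MAF 0.375
    print(passes_filters([0] * 9 + [None]))        # monomorphic among called samples
    print(passes_filters([1, None, None, 0, 0]))   # only 60% called
```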
Fundamental Research Funds for the Central Universities, Award: 2018ZY23