Raw data of samples of the Korean endemic shrub Abeliophyllum distichum in fastq format
Data files
Oct 01, 2024 version files (71.42 GB)
- Abeliophyllum_distichum_demultiplexed_fastq_files.zip (71.42 GB)
- README.md (1.07 KB)
Abstract
White forsythia, Abeliophyllum distichum Nakai (Oleaceae), is the only member of its genus and an endangered shrub endemic to the Korean Peninsula, where its distribution is limited and fragmented. Due to its endemicity and distribution status, several populations of the species have been designated as Korean natural monuments (IUCN Category III protected areas). More recently, the species’ genomic variation and structure and the demographic history of divergence of its genetic groups were investigated. In our current work, we use SNP and RAD loci datasets to examine the species’ genetic-landscape pattern, interpopulation connectivity, the history of population size change in each genetic group, and the species’ future distribution. Here, we provide the raw genomic data derived from high-throughput sequencing of reduced representation libraries of 135 A. distichum individuals across nine natural populations of the species (15 samples per population, including those from five natural monument habitats). The libraries were prepared using the GBS protocol and demultiplexed using the Stacks 2 software to generate a dataset of fastq files for each sample. These datasets have been employed in our current study, with the running title “Population connectivity and size reductions in the Anthropocene: the consequence of landscapes and historical bottlenecks in white forsythia fragmented habitats”, submitted to the BMC Journal of Ecology and Evolution.
README: Raw data of samples of the Korean endemic shrub Abeliophyllum distichum in fastq format
DOI: 10.5061/dryad.69p8cz9b2
This README file explains the dataset.
This folder contains demultiplexed fastq files of 135 Abeliophyllum distichum samples across nine natural populations (15 per population). The files are the output of the ‘process_radtags’ program (in Stacks 2) on Illumina-sequenced paired-end reads, discarding those with low-quality scores (-q 10).
Each sample has a set of four fastq files: sample_XXX.1.fq and sample_XXX.2.fq are the files used in further analyses, and sample_XXX.rem.1.fq and sample_XXX.rem.2.fq are the remainder files, which were discarded after a quality check. Each sample name starts with a population code (two capital letters), followed, for populations with natural monument status, by additional information (NM identification), and ends with the sample information (numbers). Please see Ong et al. (2023) (DOI: 10.1002/ece3.10792) for more precise population information.
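The four-file layout described above can be checked after unpacking the archive. The following is a minimal sketch, assuming the files end in exactly the suffixes named in this README; the function names, the role labels, and the sample names in the usage example are hypothetical and are not part of the dataset.

```python
from collections import defaultdict

# Suffixes named in the README. The ".rem" suffixes come first because a name
# ending in ".rem.1.fq" also ends in ".1.fq".
SUFFIXES = {
    ".rem.1.fq": "remainder_1",
    ".rem.2.fq": "remainder_2",
    ".1.fq": "paired_1",
    ".2.fq": "paired_2",
}


def group_by_sample(filenames):
    """Map each sample name to a dict of {role: filename}."""
    samples = defaultdict(dict)
    for name in filenames:
        for suffix, role in SUFFIXES.items():
            if name.endswith(suffix):
                samples[name[: -len(suffix)]][role] = name
                break  # stop at the first (longest) matching suffix
    return dict(samples)


def incomplete_samples(samples):
    """Return the sorted names of samples that lack any of the four files."""
    return sorted(s for s, files in samples.items() if len(files) != len(SUFFIXES))
```

For example, passing the names from `os.listdir()` on the unpacked folder to `group_by_sample` and then calling `incomplete_samples` on the result should return an empty list if every sample has all four files.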
Methods
We provide the raw genomic data derived from high-throughput sequencing of reduced representation libraries of 135 A. distichum individuals across nine natural populations of the species (15 samples per population, including those from five natural monument habitats). The libraries were prepared using the GBS protocol and demultiplexed using the Stacks 2 software to generate a dataset of fastq files for each sample.
The files are the output of the ‘process_radtags’ program (in Stacks 2) after processing paired-end reads output from Illumina sequencing. Each sample has a set of four files: two for the single-end reads and two for the paired-end reads. Reads with low-quality scores (-q 10) were discarded. According to the Stacks 2 manual (Rochette et al., 2019) and protocol (Rivera-Colon and Catchen, 2022), the ‘process_radtags’ program keeps the reads in phase; hence, the first read in the sample_XXX.1.fq file is the mate of the first read in the sample_XXX.2.fq file. When one read in a pair is discarded due to low quality or a missing restriction enzyme cut site, the remaining read is considered a remainder read and is output into the sample_XXX.rem.1.fq file if the paired-end read was discarded, or the sample_XXX.rem.2.fq file if the single-end read was discarded (Rochette et al., 2019).
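The in-phase property can be spot-checked by reading the two paired files in lockstep and comparing read identifiers. The sketch below is illustrative only. It assumes uncompressed, well-formed four-line FASTQ records, and it assumes that mate identifiers differ at most by a trailing /1 or /2 or by a comment after whitespace; the actual header layout in these files should be confirmed before relying on it. The function names are hypothetical.

```python
from itertools import zip_longest


def read_ids(path):
    """Yield the identifier from each FASTQ header (every fourth line)."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 0:
                # Drop the leading '@' and any comment after whitespace.
                ident = line[1:].split()[0]
                # Assumption: mates may be marked with a trailing /1 or /2.
                if ident.endswith(("/1", "/2")):
                    ident = ident[:-2]
                yield ident


def first_phase_mismatch(path_1, path_2):
    """Return the 0-based index of the first record whose identifiers differ.

    Returns None when both files hold the same identifiers in the same order.
    A difference in record count is reported at the first unmatched record.
    """
    pairs = zip_longest(read_ids(path_1), read_ids(path_2))
    for n, (id_1, id_2) in enumerate(pairs):
        if id_1 != id_2:
            return n
    return None
```

Calling `first_phase_mismatch("sample_XXX.1.fq", "sample_XXX.2.fq")` on a sample's paired files should return `None` if the reads are in phase. The remainder files are not expected to pair with each other, since each holds reads whose mates were discarded.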