Data from: Substantial differences in bias between single-digest and double-digest RAD-seq libraries: a case study

Cite this dataset

Flanagan, Sarah P.; Jones, Adam G. (2018). Data from: Substantial differences in bias between single-digest and double-digest RAD-seq libraries: a case study [Dataset]. Dryad. https://doi.org/10.5061/dryad.qf916

Abstract

The trade‐offs of using single‐digest vs. double‐digest restriction site‐associated DNA sequencing (RAD‐seq) protocols have been widely discussed. However, no direct empirical comparisons of the two methods have been conducted. Here, we sampled a single population of Gulf pipefish (Syngnathus scovelli) and genotyped 444 individuals using RAD‐seq. Sixty individuals were subjected to single‐digest RAD‐seq (sdRAD‐seq), and the remaining 384 individuals were genotyped using a double‐digest RAD‐seq (ddRAD‐seq) protocol. We analysed the resulting Illumina sequencing data and compared the two genotyping methods when reads were analysed either together or separately. Coverage statistics, observed heterozygosity, and allele frequencies differed significantly between the two protocols, as did the results of selection components analysis. We also performed an in silico digestion of the Gulf pipefish genome and modelled five major sources of bias: PCR duplicates, polymorphic restriction sites, shearing bias, asymmetric sampling (i.e., genotyping fewer individuals with sdRAD‐seq than with ddRAD‐seq) and higher major allele frequencies. This combination of approaches allowed us to determine that polymorphic restriction sites, an asymmetric sampling scheme, mean allele frequencies and to some extent PCR duplicates all contribute to different estimates of allele frequencies between samples genotyped using sdRAD‐seq versus ddRAD‐seq. Our finding that sdRAD‐seq and ddRAD‐seq can result in different allele frequencies has implications for comparisons across studies and techniques that endeavour to identify genomewide signatures of evolutionary processes in natural populations.

Funding

National Science Foundation, Award: DEB-1119261, DEB-1401688, DGE-1252521, DBI-1300426

Location

United States
Texas
Gulf of Mexico
Corpus Christi