Data from: Genotype-free estimation of allele frequencies reduces bias and improves demographic inference from RADSeq data
Warmuth, Vera; Ellegren, Hans; Warmuth, Vera M. (2019), Data from: Genotype-free estimation of allele frequencies reduces bias and improves demographic inference from RADSeq data, Dryad, Dataset, https://doi.org/10.5061/dryad.1sd556t
Restriction-site associated DNA sequencing (RADSeq) facilitates rapid generation of thousands of genetic markers at relatively low cost; however, several sources of error specific to RADSeq methods often lead to biased estimates of allele frequencies and thereby to erroneous population genetic inference. Estimating the distribution of sample allele frequencies without calling genotypes has been shown to improve population inference from whole genome sequencing (WGS) data, but the ability of this approach to account for RADSeq-specific biases remains unexplored. Here we assess to what extent genotype-free methods of allele frequency estimation affect demographic inference from empirical RADSeq data. Using the well-studied pied flycatcher (Ficedula hypoleuca) as a study system, we compare allele frequency estimation and demographic inference from WGS data with that from RADSeq data matched for samples, using both genotype-based and genotype-free methods. The demographic history of pied flycatchers as inferred from RADSeq data was highly congruent with that inferred from WGS data when allele frequencies were estimated directly from the read data. In contrast, when allele frequencies were derived from called genotypes, RADSeq-based estimates of most model parameters fell outside the 95% confidence interval (CI) of estimates derived from WGS data. Notably, more stringent filtering of genotypes tended to increase the discrepancy between parameter estimates from WGS and RADSeq data. The results from this study demonstrate the ability of genotype-free methods to improve demographic inference based on the allele frequency spectrum (AFS) from RADSeq data and highlight the need to account for uncertainty in next-generation sequencing (NGS) data regardless of sequencing method.
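
The contrast between genotype-based and genotype-free allele frequency estimation can be illustrated with a toy sketch for a single biallelic site. This is not the pipeline used for the dataset, which the description above does not specify. It assumes each individual is summarised by three hypothetical genotype likelihoods (for 0, 1 or 2 copies of the alternate allele) and that genotypes follow Hardy-Weinberg proportions. The function names `called_genotype_frequency` and `em_allele_frequency` are illustrative only.

```python
def called_genotype_frequency(likelihoods):
    """Genotype-based estimate: call the most likely genotype per individual,
    then count alternate alleles among the calls."""
    calls = [max(range(3), key=lambda g: gl[g]) for gl in likelihoods]
    return sum(calls) / (2 * len(likelihoods))


def em_allele_frequency(likelihoods, start=0.5, tol=1e-8, max_iter=200):
    """Genotype-free estimate: expectation-maximisation over genotype
    likelihoods, assuming Hardy-Weinberg proportions (an assumption of
    this sketch, not a statement about the authors' method)."""
    f = start
    for _ in range(max_iter):
        # Keep f strictly inside (0, 1) so no genotype prior is exactly zero.
        f = min(max(f, 1e-9), 1 - 1e-9)
        prior = ((1 - f) ** 2, 2 * f * (1 - f), f ** 2)
        expected_alt = 0.0
        for gl in likelihoods:
            # Posterior weight of each genotype, up to normalisation.
            weights = [gl[g] * prior[g] for g in range(3)]
            total = sum(weights)
            # Expected number of alternate alleles for this individual.
            expected_alt += (weights[1] + 2 * weights[2]) / total
        new_f = expected_alt / (2 * len(likelihoods))
        if abs(new_f - f) < tol:
            return new_f
        f = new_f
    return f


# Hypothetical likelihoods: two confident homozygotes for the alternate
# allele and two low-coverage individuals whose data barely favour 0 copies.
example = [
    (0.0, 0.0, 1.0),
    (0.0, 0.0, 1.0),
    (0.55, 0.45, 0.0),
    (0.55, 0.45, 0.0),
]
called = called_genotype_frequency(example)
genotype_free = em_allele_frequency(example)
```

In this made-up example the hard calls discard the substantial probability that the two ambiguous individuals are heterozygous, so the genotype-based estimate is lower than the likelihood-based one. When every likelihood triple is unambiguous, the two estimates coincide.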