Data from: The impact of library preparation protocols on the accuracy of allele frequency estimates in Pool-Seq data
Cite this dataset
Kofler, Robert; Nolte, Viola; Schlötterer, Christian (2015). Data from: The impact of library preparation protocols on the accuracy of allele frequency estimates in Pool-Seq data [Dataset]. Dryad. https://doi.org/10.5061/dryad.p31j8
Sequencing pools of individuals (Pool-Seq) is a cost-effective method for obtaining genome-wide allele frequency estimates. Given the importance of meta-analyses that combine data sets, we determined the influence of different genomic library preparation protocols on the consistency of allele frequency estimates. We found that typically no more than 1% of the variation in allele frequency estimates could be attributed to differences in library preparation. Read length also had only a minor effect on the consistency of allele frequency estimates. By far the most pronounced influence could be attributed to sequence coverage: increasing the coverage from 30- to 50-fold improved the consistency of allele frequency estimates by at least 27%. We conclude that Pool-Seq data can be easily combined across different library preparation methods, but sufficient sequence coverage is key to reliable results.
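
To give some intuition for why coverage dominates, the following minimal sketch computes the sampling error of an allele frequency estimate under a simple binomial model, in which each read is treated as an independent draw from the pool. This model is an assumption made here for illustration and is not the analysis used for this dataset; it ignores pool size, sequencing errors, and library effects, so it is not expected to reproduce the 27% figure reported above. The function name `binomial_se` and the example frequency of 0.5 are hypothetical.

```python
import math


def binomial_se(freq: float, coverage: int) -> float:
    """Standard error of an allele frequency estimated from `coverage` reads.

    Assumes reads are independent binomial draws (an idealization that
    ignores pool size, sequencing error, and library preparation effects).
    """
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie in [0, 1]")
    return math.sqrt(freq * (1.0 - freq) / coverage)


if __name__ == "__main__":
    # Compare the two coverage levels mentioned in the abstract
    # for an illustrative true allele frequency of 0.5.
    for coverage in (30, 50):
        print(f"coverage {coverage}x: SE = {binomial_se(0.5, coverage):.4f}")
```

Under this idealized model the standard error scales with the inverse square root of the coverage, so estimates from two libraries sequenced to higher coverage are expected to agree more closely regardless of how the libraries were prepared.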