Data from: Constraints on the FST–heterozygosity outlier approach

Flanagan, Sarah P.; Jones, Adam G.

Published May 04, 2017 on Dryad. https://doi.org/10.5061/dryad.785bn

Data files

Version of May 04, 2017 (total size 131.37 MB):

code_availability.txt (173 B)
fdist2_output.zip (42.22 MB)
numerical_analysis_figs3-5.zip (42.81 MB)
numerical_analysis_randomsample_lositan.zip (38.15 MB)
numerical_analysis_randomsample.zip (8.19 MB)

Abstract

The FST-heterozygosity outlier approach has been a popular method for identifying loci under balancing and positive selection since Beaumont and Nichols first proposed it in 1996 and recommended its use for studies sampling a large number of independent populations (at least 10). Since then, their program FDIST2 and a user-friendly program optimized for large datasets, LOSITAN, have been used widely in the population genetics literature, often without the requisite number of samples. We observed empirical datasets whose distributions could not be reconciled with the confidence intervals generated by the null coalescent island model. Here, we use forward-in-time simulations to investigate circumstances under which the FST-heterozygosity outlier approach performs poorly for next-generation single-nucleotide polymorphism (SNP) datasets. Our results show that samples involving few independent populations, particularly when migration rates are low, result in distributions of the FST-heterozygosity relationship that are not described by the null model implemented in LOSITAN. In addition, even under favorable conditions LOSITAN rarely provides confidence intervals that precisely fit SNP data, making the associated p-values only roughly valid at best. We present an alternative method, implemented in a new R package named fsthet, which uses the raw empirical data to generate smoothed outlier plots for the FST-heterozygosity relationship.
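The abstract's idea of deriving outlier thresholds from the raw empirical data, rather than from a simulated null model, can be illustrated with a minimal sketch. This is not the fsthet implementation or its API: the function names `quantile` and `flag_outliers`, the equal-count binning by heterozygosity, and the default values of `n_bins` and `alpha` are all assumptions chosen for illustration.

```python
def quantile(sorted_vals, q):
    """Quantile of a pre-sorted list, by linear interpolation between ranks."""
    pos = q * (len(sorted_vals) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] + frac * (sorted_vals[upper] - sorted_vals[lower])


def flag_outliers(het, fst, n_bins=10, alpha=0.05):
    """Flag loci whose FST lies outside the central (1 - alpha) range
    of the loci sharing their heterozygosity bin.

    het, fst: per-locus heterozygosity and FST values (same length).
    n_bins and alpha are illustrative defaults, not values from the paper.
    """
    n = len(het)
    # Sort locus indices by heterozygosity, then cut into equal-count bins.
    order = sorted(range(n), key=lambda i: het[i])
    flags = [False] * n
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if len(idx) < 2:
            continue  # too few loci to estimate quantiles
        vals = sorted(fst[i] for i in idx)
        low = quantile(vals, alpha / 2)
        high = quantile(vals, 1 - alpha / 2)
        for i in idx:
            if fst[i] < low or fst[i] > high:
                flags[i] = True
    return flags
```

Calling `flag_outliers(het, fst)` returns one Boolean per locus. Because the thresholds are quantiles of the observed data, a fraction of loci close to `alpha` tends to be flagged in each bin whenever the FST values are not heavily tied, so the flags are best read as candidates for follow-up rather than as formal tests of selection.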

