Data from: A comparison of single-sample estimators of effective population sizes from genetic marker data
Data files
Jul 08, 2016 version files 1.45 MB
- Ne.zip
Abstract
In molecular ecology and conservation genetics studies, effective population size (Ne) is an important parameter that is increasingly estimated from a single sample of individuals taken at random from a population and genotyped at a number of marker loci. Several estimators have been developed, based on the information on linkage disequilibrium (LD), heterozygote excess (HE), molecular coancestry (MC) and sibship frequency (SF) in marker data. The most popular is the LD estimator, because it is more accurate than the HE and MC estimators and is simpler to calculate than the SF estimator. However, little is known about the accuracy of the LD estimator relative to that of the SF estimator, or about the robustness of all single-sample estimators when simplifying assumptions (e.g. random mating, no linkage, no genotyping errors) are violated. This study fills these gaps. It uses extensive simulations to compare the biases and accuracies of the four estimators across different population properties (e.g. bottlenecks, nonrandom mating, haplodiploidy), marker properties (e.g. linkage, polymorphism) and sample properties (e.g. numbers of individuals and markers), and to compare their robustness when marker data are imperfect (with allelic dropouts). The simulations show that the SF estimator is more accurate, has a much wider application scope (e.g. suitable for nonrandom mating such as selfing, haplodiploid species, dominant markers) and is more robust (e.g. to the presence of linkage and genotyping errors of markers) than the other estimators. An empirical data set from a Yellowstone grizzly bear population was analysed to demonstrate the use of the SF estimator in practice.
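To make the idea behind an LD-based estimate concrete, here is a minimal sketch. It assumes the standard textbook approximation E[r²] ≈ 1/(3Ne) + 1/S for unlinked loci under random mating, where r² is the mean squared correlation of alleles between pairs of loci and S is the number of sampled individuals. The function name and its arguments are hypothetical, no bias correction is applied, and this is not necessarily the procedure used in the study or contained in Ne.zip.

```python
import math


def ld_ne_estimate(mean_r2: float, sample_size: int) -> float:
    """Rough point estimate of Ne from mean r^2 among unlinked loci.

    Assumption (not taken from the data set): E[r^2] ~= 1/(3*Ne) + 1/S,
    with no bias correction and no adjustment for mating system.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # Remove the part of r^2 expected from finite sampling alone.
    drift_r2 = mean_r2 - 1.0 / sample_size
    if drift_r2 <= 0:
        # No LD signal beyond sampling noise: the estimate is unbounded.
        return math.inf
    return 1.0 / (3.0 * drift_r2)


if __name__ == "__main__":
    # Illustrative numbers only, not values from the study.
    print(ld_ne_estimate(mean_r2=0.03, sample_size=50))
```

When the observed r² does not exceed 1/S, the sketch returns infinity, which is one common way of reporting that the sample carries no detectable drift signal.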