Parentage studies and family reconstructions have become increasingly popular for investigating a range of evolutionary, ecological and behavioral processes in natural populations. However, a number of different assignment methods have emerged in common use, and the accuracy of each may differ in relation to the number of loci examined, allelic diversity, incomplete sampling of all candidate parents, and the presence of genotyping errors. Here we use simulated data to examine how these factors affect the accuracy with which three popular parentage inference methods (COLONY, FaMoz and an exclusion-Bayes’ theorem approach by Christie et al. (2010a)) resolve true parent-offspring pairs. Our findings demonstrate that accuracy increases with the number and diversity of loci. These were clearly the most important factors in obtaining accurate assignments, explaining 75-90% of the variance in overall accuracy across 60 simulated scenarios. Furthermore, the proportion of candidate parents sampled had a small but significant impact on the susceptibility of each method to either false positive or false negative assignments. Within the range of values simulated, COLONY outperformed FaMoz, which outperformed the exclusion-Bayes’ theorem method. However, with 20 or more highly polymorphic loci, all methods could be applied with confidence. Our results show that for parentage inference in natural populations, careful consideration of the number and quality of markers will increase the accuracy of assignments and mitigate the effects of incomplete sampling of parental populations.
N500_adults_20percent
Simulated data - N500 low diversity population with 20% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N500_adults_40percent
Simulated data - N500 low diversity population with 40% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N500_adults_60percent
Simulated data - N500 low diversity population with 60% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N500_adults_100percent
Simulated data - N500 low diversity population with 100% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N1000_adults_20percent
Simulated data - N1000 high diversity population with 20% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N1000_adults_40percent
Simulated data - N1000 high diversity population with 40% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N1000_adults_60percent
Simulated data - N1000 high diversity population with 60% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N1000_adults_80percent
Simulated data - N1000 high diversity population with 80% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N1000_adults_100percent
Simulated data - N1000 high diversity population with 100% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N500_juveniles_0.1percenterror
Simulated data - N500 low diversity offspring with a 0.1% genotyping error rate. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N500_juveniles_1.0percenterror
Simulated data - N500 low diversity offspring with a 1.0% genotyping error rate. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N1000_juveniles_0.1percenterror
Simulated data - N1000 high diversity offspring with a 0.1% genotyping error rate. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
N1000_juveniles_1.0percenterror
Simulated data - N1000 high diversity offspring with a 1.0% genotyping error rate. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
Methods results
Results of parentage analyses using simulated data - Results from COLONY, FaMoz and the exclusion-Bayes' theorem approach of Christie et al. (Mol. Ecol. Res. 2010, 10, 115-128) for resolving true parent-offspring pairs under 60 simulated scenarios that incrementally vary the number of loci, allelic diversity, adult sample size and genotyping error.
Supp.Mat. R-Scripts
Supplementary R script for processing the software outputs of COLONY, FaMoz and the exclusion-Bayes' theorem approach of Christie et al. (Mol. Ecol. Res. 2010, 10, 115-128). The script also includes details of the Generalized Linear Model used to assess the relative variance attributed to the different factors affecting the accuracy of assignments.
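
The simulated data descriptions above note that the first 10 and 15 loci were subset from the 20-locus genotypes. A minimal Python sketch of that kind of subsetting follows. It is illustrative only and is not the authors' script: it assumes a table with one individual ID column followed by two allele columns per locus, and the function name `subset_first_loci` and the column names are hypothetical. The actual file layout may differ.

```python
from typing import List, Tuple


def subset_first_loci(
    header: List[str],
    rows: List[List[str]],
    n_loci: int,
    id_cols: int = 1,         # assumption: one leading ID column
    cols_per_locus: int = 2,  # assumption: two allele columns per diploid locus
) -> Tuple[List[str], List[List[str]]]:
    """Keep the ID column(s) plus the allele columns of the first n_loci loci."""
    keep = id_cols + n_loci * cols_per_locus
    if keep > len(header):
        raise ValueError("requested more loci than the table contains")
    return header[:keep], [row[:keep] for row in rows]


# Hypothetical 20-locus table with two individuals (made-up allele sizes).
header = ["id"] + [f"L{i}_{a}" for i in range(1, 21) for a in ("a", "b")]
rows = [["ind1"] + ["100"] * 40, ["ind2"] + ["102"] * 40]

h10, r10 = subset_first_loci(header, rows, 10)
h15, r15 = subset_first_loci(header, rows, 15)
print(len(h10), len(h15))  # 21 31
```

Slicing by position keeps the locus order of the 20-locus file, so the 10-locus set is nested within the 15-locus set, as the descriptions imply.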