Parentage studies and family reconstructions have become increasingly popular for investigating a range of evolutionary, ecological and behavioral processes in natural populations. However, a number of different assignment methods have emerged in common use, and the accuracy of each may differ in relation to the number of loci examined, allelic diversity, incomplete sampling of all candidate parents, and the presence of genotyping errors. Here we use simulated data to examine how these factors affect the accuracy with which three popular parentage inference methods (COLONY, FaMoz and an exclusion-Bayes’ theorem approach by Christie et al. (2010a)) resolve true parent-offspring pairs. Our findings demonstrate that accuracy increases with the number and diversity of loci. These were clearly the most important factors in obtaining accurate assignments, explaining 75-90% of the variance in overall accuracy across 60 simulated scenarios. Furthermore, the proportion of candidate parents sampled had a small but significant impact on the susceptibility of each method to either false positive or false negative assignments. Within the range of values simulated, COLONY outperformed FaMoz, which outperformed the exclusion-Bayes’ theorem method. However, with 20 or more highly polymorphic loci, all methods could be applied with confidence. Our results show that for parentage inference in natural populations, careful consideration of the number and quality of markers will increase the accuracy of assignments and mitigate the effects of incomplete sampling of parental populations.

N500_adults_20percent

Simulated data - N500 low diversity population with 20% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N500_adults_40percent

Simulated data - N500 low diversity population with 40% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N500_adults_60percent

Simulated data - N500 low diversity population with 60% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

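N500_adults_80percent

Simulated data - N500 low diversity population with 80% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.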

N500_adults_100percent

Simulated data - N500 low diversity population with 100% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N1000_adults_20percent

Simulated data - N1000 high diversity population with 20% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N1000_adults_40percent

Simulated data - N1000 high diversity population with 40% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N1000_adults_60percent

Simulated data - N1000 high diversity population with 60% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N1000_adults_80percent

Simulated data - N1000 high diversity population with 80% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N1000_adults_100percent

Simulated data - N1000 high diversity population with 100% of adults sampled and 20 microsatellite loci. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N500_juveniles_0.1percenterror

Simulated data - N500 low diversity offspring with 0.1% genotyping error rates. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N500_juveniles_1.0percenterror

Simulated data - N500 low diversity offspring with 1.0% genotyping error rates. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N1000_juveniles_0.1percenterror

Simulated data - N1000 high diversity offspring with 0.1% genotyping error rates. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.

N1000_juveniles_1.0percenterror

Simulated data - N1000 high diversity offspring with 1.0% genotyping error rates. The first 10 and 15 loci were subset for analyses with 10 and 15 loci.
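Each description above notes that the first 10 and 15 loci were subset from the 20-locus files. The sketch below shows one way such a subset could be taken in Python. It is not the authors' code, and the layout is an assumption: one leading ID column followed by two allele columns per locus. The function name `subset_loci` and the placeholder allele values are hypothetical.

```python
def subset_loci(genotype_row, n_loci, n_id_cols=1, alleles_per_locus=2):
    """Keep the ID column(s) plus the first n_loci loci of one genotype row.

    Assumed layout (not confirmed by the source): ID column(s) first,
    then alleles_per_locus columns for each locus, in locus order.
    """
    n_keep = n_id_cols + n_loci * alleles_per_locus
    if len(genotype_row) < n_keep:
        raise ValueError("row has fewer loci than requested")
    return genotype_row[:n_keep]


# Hypothetical individual typed at 20 loci; allele sizes are placeholders.
row = ["ind001"] + [100 + i for i in range(40)]

ten = subset_loci(row, 10)      # ID + first 10 loci
fifteen = subset_loci(row, 15)  # ID + first 15 loci
print(len(ten), len(fifteen))   # 21 31
```

If the real files use a different layout (for example one column per locus, or extra metadata columns), `n_id_cols` and `alleles_per_locus` would need to be adjusted accordingly.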

Methods results

Results of parentage analyses using simulated data - Results from COLONY, FaMoz and an exclusion-Bayes' theorem approach by Christie et al. (Mol. Ecol. Res. 2010, 10, 115-128) to resolve true parent-offspring pairs under 60 simulated scenarios that incrementally vary the number of loci, allelic diversity, adult sample size and genotyping error.
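The factor levels named in the data files are consistent with the 60 scenarios mentioned here if they are fully crossed: 2 populations x 5 adult sampling proportions x 3 locus counts x 2 genotyping error rates = 60. The full crossing is an assumption made for illustration; the source states only the total and the individual levels. A short Python sketch of that enumeration:

```python
from itertools import product

# Factor levels taken from the file names and descriptions.
populations = ["N500", "N1000"]          # low and high diversity populations
adults_sampled = [20, 40, 60, 80, 100]   # percent of adults sampled
n_loci = [10, 15, 20]                    # number of microsatellite loci
error_rates = [0.1, 1.0]                 # percent genotyping error

# Assumed design: every combination of the four factors.
scenarios = list(product(populations, adults_sampled, n_loci, error_rates))
print(len(scenarios))  # 60
```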

Supp.Mat. R-Scripts

Supplementary R-script for processing software outputs of COLONY, FaMoz and an exclusion-Bayes' theorem approach by Christie et al. (Mol. Ecol. Res. 2010, 10, 115-128). The script also includes details of the Generalized Linear Model used to assess the relative variance attributed to the different factors affecting the accuracy of assignments.
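Because the data are simulated, the true parent-offspring pairs are known, so each program's output can be scored by comparing assigned pairs with true pairs. The sketch below illustrates that comparison in Python. It is not the supplementary R script: the function name `score_assignments`, the pair representation and the toy identifiers are all hypothetical, and the definitions of false positive (assigned but not true) and false negative (true but not assigned) are assumed here rather than taken from the source.

```python
def score_assignments(assigned_pairs, true_pairs):
    """Count correct, false positive and false negative assignments.

    Each pair is an (offspring_id, parent_id) tuple. Definitions are
    assumptions for illustration, not the authors' stated metrics.
    """
    assigned = set(assigned_pairs)
    truth = set(true_pairs)
    return {
        "correct": len(assigned & truth),         # assigned and true
        "false_positive": len(assigned - truth),  # assigned but not true
        "false_negative": len(truth - assigned),  # true but not assigned
    }


# Toy identifiers, invented purely to exercise the function.
assigned = [("off1", "adA"), ("off2", "adB"), ("off3", "adC")]
truth = [("off1", "adA"), ("off2", "adD"), ("off4", "adE")]

scores = score_assignments(assigned, truth)
print(scores)
```

Parsing the actual COLONY or FaMoz output files into such pairs is left out, since their formats are not described in this record.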

Data from: Relative accuracy of three common methods of parentage analysis in natural populations

