Major inconsistencies of inferred population genetic structure estimated in a large set of domestic horse breeds using microsatellites
Funk, Stephan et al. (2020), Major inconsistencies of inferred population genetic structure estimated in a large set of domestic horse breeds using microsatellites, Dryad, Dataset, https://doi.org/10.5061/dryad.tmpg4f4vh
STRUCTURE remains the most widely applied tool for recovering the true, but unknown, population structure from observed microsatellite data or other genetic markers. About 30% of STRUCTURE-based studies could not be reproduced (Gilbert et al., 2012). Here we use a large data set of 2323 horses from 93 domestic breeds plus the Przewalski horse, typed at 15 microsatellite markers, to evaluate how program settings, in particular the so far insufficiently evaluated number of replicates, affect the estimation of the optimal number of population clusters, Kopt, that best describes the observed data. Domestic horses are well suited as a test case because the history of many breeds is extensively documented and extensive phylogenetic analyses are available. Different methods based on different genetic assumptions and statistical procedures (DAPC, FLOCK, PCoA and STRUCTURE with different run scenarios) all revealed the general, broad-scale relationships among the breeds, which largely reflect known breed histories, but diverged considerably in how they characterized small-scale patterns. STRUCTURE failed to consistently identify Kopt using the most widespread approach, the ΔK method, despite very large numbers of MCMC iterations (3,000,000) and replicates (100). The interpretation of breed structure over increasing values of K, without assuming a Kopt, was consistent with known breed histories. The over-reliance on Kopt should be replaced by a qualitative description of clustering over increasing K, which is scientifically more honest and has the advantage of being much faster and less computationally intensive, as lower numbers of MCMC iterations and replicates suffice for stable results. Very large data sets are highly challenging for cluster analyses, especially when populations with complex genetic histories are investigated.
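For readers unfamiliar with the ΔK method mentioned above, the following is a minimal sketch of how the statistic is commonly computed from replicate STRUCTURE runs, following the usual reading of Evanno et al. (2005): the absolute second-order difference of the mean ln P(D) across K, divided by the standard deviation of ln P(D) at K. The function name `delta_k` and the input layout (a dictionary mapping each K to its replicate ln P(D) values) are assumptions made for illustration; this is not the code used to produce this data set.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Sketch of the Delta-K statistic for each interior K.

    lnp_by_k: dict mapping K (int) to a list of ln P(D) values,
              one per replicate run (hypothetical input layout).
    Returns a dict mapping K to Delta-K. K values without both
    neighbours (K-1 and K+1), with fewer than two replicates, or
    with zero variance across replicates are skipped.
    """
    # Only K values with at least two replicates have a defined sd.
    usable = {k: v for k, v in lnp_by_k.items() if len(v) >= 2}
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in usable.items()}

    result = {}
    for k in sorted(usable):
        if k - 1 not in means or k + 1 not in means:
            continue  # second-order difference is undefined at the ends
        if sds[k] == 0:
            continue  # avoid division by zero when replicates agree exactly
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sds[k]
    return result
```

Because the denominator is the between-replicate standard deviation, the statistic depends directly on how many replicates are run and how well they converge, which is one reason the number of replicates matters for any Kopt estimate derived this way.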
Samples were collected during long-term studies on horse genetics. All horses were typed at 15 autosomal microsatellite markers distributed on 14 chromosomes, taken from marker panels that are recommended for diversity studies by ISAG-FAO and the International Society for Animal Genetics.