Dataset for: Guidelines for standardising the application of discriminant analysis of principal components to genotype data
Data files
Sep 15, 2022 version files 2.41 GB
Data.zip (2.41 GB)
README.pdf (207.42 KB)
Abstract
Data and scripts required to replicate the analyses in Thia (2022) "Guidelines for standardising the application of discriminant analysis of principal components to genotype data" in Molecular Ecology.
This study aimed to address methodological misunderstandings and misuse of the DAPC method in population genetics. The analyses illustrate that, for genotype data comprising k effective populations, only k−1 PC axes describe population structure and are biologically informative. These PC axes are the only suitable axes for modelling the among-population differences with a DA. Using many more than k−1 PC axes decreases the biological relevance of the final DA solution and can lead to misinterpretation of population structure.
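The k−1 limit can be illustrated with a minimal sketch. It is not taken from the deposited scripts. It relies on a general linear-algebra fact: once k population means are centred on their grand mean, they sum to zero and so span at most k−1 dimensions. The allele frequencies, the equal weighting of populations, and the function names (n_pcs_to_retain, matrix_rank, centred_population_means) are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def n_pcs_to_retain(k_effective):
    """Return k - 1, the number of PC axes to retain for k effective populations."""
    if k_effective < 2:
        raise ValueError("need at least two effective populations")
    return k_effective - 1


def matrix_rank(rows):
    """Rank of a matrix by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        if rank == len(m):
            break
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def centred_population_means(allele_freqs):
    """Subtract the across-population mean at each locus (equal weights assumed)."""
    k = len(allele_freqs)
    n_loci = len(allele_freqs[0])
    grand = [sum(pop[j] for pop in allele_freqs) / k for j in range(n_loci)]
    return [[pop[j] - grand[j] for j in range(n_loci)] for pop in allele_freqs]


# Hypothetical allele frequencies (in tenths) for k = 3 populations at 4 loci.
tenths = [
    [1, 5, 9, 2],
    [2, 5, 3, 8],
    [9, 1, 4, 5],
]
freqs = [[Fraction(x, 10) for x in pop] for pop in tenths]

k = len(freqs)
rank_raw = matrix_rank(freqs)
rank_among = matrix_rank(centred_population_means(freqs))

print("populations (k):", k)
print("rank of raw population means:", rank_raw)
print("rank of centred population means:", rank_among)
print("PC axes to retain (k - 1):", n_pcs_to_retain(k))
```

In this toy case the centred means have rank 2 for k = 3, which matches the k−1 axes recommended above. Real genotype data add within-population noise on further axes, which is why extra PC axes carry no among-population signal.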
Metapopulations with different migration rates and levels of genetic differentiation were simulated using fastsimcoal v2.7.
Simulated individuals were imported into R v4.1.2 for downstream analysis.
See the README for details on setting up the computing environment and the working directory, and on executing the scripts.