Data from: An evaluation of a novel estimator of linkage disequilibrium
Data files
Apr 24, 2013 version, files: 2.51 MB
Cattle SNP data_Heredity.txt (2.51 MB)
Abstract
The analysis of systems involving many loci is important in population and quantitative genetics. An important problem is the study of linkage disequilibrium (LD), a concept relevant in genome-enabled prediction of quantitative traits and in exploration of marker-phenotype associations. This article introduces a new estimator of an LD parameter (ρ^2) that is much easier to compute than a maximum likelihood (or Bayesian) estimate of a tetrachoric correlation. We examined the conjecture that the sampling distribution of the estimator of ρ^2 could be less frequency-dependent than that of the estimator of r^2, a widely employed metric for assessing LD. This was done via an empirical evaluation of LD in 806 Holstein-Friesian cattle using 771 SNP markers, and of HapMap III data on 21,991 SNPs (chromosome 3) observed in 88 unrelated individuals from Tuscany. Also, 1,600 haplotypes over a region of 1 Mb simulated under the coalescent were used to estimate LD using the two measures. Subsequently, a simulation study compared the new estimator with that of r^2 using several scenarios of LD and allelic frequencies. From these studies it is concluded that ρ^2 provides a useful metric for the study of LD, since the distribution of its estimator is less frequency-dependent than that of the standard estimator of r^2.
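As background for the frequency dependence mentioned in the abstract, the following is a minimal sketch of the standard r^2 measure for two biallelic loci, computed from known haplotype and allele frequencies. It does not implement the article's ρ^2 estimator, which is not defined in this record. The function names (`r_squared`, `max_r_squared`) and the example frequencies are hypothetical and chosen only for illustration.

```python
def r_squared(p_ab, p_a, p_b):
    """Standard r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    (illustrative helper, not code from the article)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def max_r_squared(p_a, p_b):
    """Largest r^2 attainable for the given allele frequencies."""
    # The haplotype frequency p_ab is constrained by the allele frequencies.
    p_ab_low = max(0.0, p_a + p_b - 1.0)
    p_ab_high = min(p_a, p_b)
    return max(r_squared(p_ab_low, p_a, p_b),
               r_squared(p_ab_high, p_a, p_b))


if __name__ == "__main__":
    # Equal allele frequencies allow r^2 to reach its upper limit of 1,
    # whereas unequal frequencies cap it well below 1.
    print(max_r_squared(0.5, 0.5))
    print(max_r_squared(0.5, 0.1))
```

With allele frequencies of 0.5 and 0.1, the attainable r^2 is capped at 1/9, which illustrates why r^2 values at loci with different allele frequencies are hard to compare directly.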