Dryad

Data from: Genotyping-by-sequencing for estimating relatedness in non-model organisms: avoiding the trap of precise bias

Cite this dataset

Attard, Catherine R. M.; Beheregaray, Luciano B.; Moller, Luciana M. (2017). Data from: Genotyping-by-sequencing for estimating relatedness in non-model organisms: avoiding the trap of precise bias [Dataset]. Dryad. https://doi.org/10.5061/dryad.t8ph5

Abstract

There has been remarkably little attention to using the high resolution provided by genotyping-by-sequencing (i.e. RADseq and similar methods) datasets for assessing relatedness in wildlife populations. A major hurdle is the genotyping error, especially allelic dropout, often found in this type of dataset that could lead to downward-biased, yet precise, estimates of relatedness. Here we assess the applicability of genotyping-by-sequencing datasets for relatedness inferences given their relatively high genotyping error rates. Individuals of known relatedness were simulated under genotyping error, allelic dropout, and missing data scenarios based on an empirical ddRAD dataset, and their true relatedness was compared to that estimated by seven relatedness estimators. We found that an estimator chosen through such analyses can circumvent the influence of genotyping error, with the estimator of Ritland (1996) shown to be unaffected by allelic dropout and to be the most accurate when there is genotyping error. We also found that the choice of estimator should not rely solely on the strength of correlation between estimated and true relatedness as a strong correlation does not necessarily mean estimates are close to true relatedness. We also demonstrated how even a large SNP dataset with genotyping error (allelic dropout or otherwise) or missing data still performs better than a perfectly genotyped microsatellite dataset of tens of markers. The simulation-based approach used here can be easily implemented by others on their own genotyping-by-sequencing datasets to confirm the most appropriate and powerful estimator for their dataset.
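
The point that a strong correlation does not guarantee accurate estimates can be illustrated with a toy calculation. The sketch below is not the simulation pipeline used for this dataset and does not implement any of the seven estimators. It assumes a hypothetical estimator whose output is the true relatedness shrunk by a constant factor (`shrink`, an illustrative stand-in for downward bias such as that caused by allelic dropout) plus a small amount of noise. All names and parameter values are assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def simulate_biased_estimates(true_r, shrink=0.6, noise_sd=0.02, seed=0):
    """Hypothetical estimator: true relatedness scaled down, plus small noise.

    `shrink` and `noise_sd` are illustrative assumptions, not values
    taken from the dataset or the associated study.
    """
    rng = random.Random(seed)
    return [shrink * r + rng.gauss(0.0, noise_sd) for r in true_r]


# Pairs of known relatedness: unrelated (0), half-siblings (0.25),
# and full siblings or parent-offspring (0.5), 100 pairs of each.
true_r = [0.0, 0.25, 0.5] * 100
est_r = simulate_biased_estimates(true_r)

correlation = pearson(true_r, est_r)
mean_bias = sum(e - t for e, t in zip(est_r, true_r)) / len(true_r)
rmse = math.sqrt(sum((e - t) ** 2 for e, t in zip(est_r, true_r)) / len(true_r))

print(f"correlation = {correlation:.3f}")
print(f"mean bias   = {mean_bias:.3f}")
print(f"RMSE        = {rmse:.3f}")
```

Under these assumptions the estimates track the true values almost perfectly in rank and spacing, so the correlation is high, yet they are systematically too low. This is why bias or root-mean-square error against known relatedness is worth checking alongside correlation when comparing estimators.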

Usage notes

Location

Australia