Data from: Microhaplotypes provide increased power from short-read DNA sequences for relationship inference

Cite this dataset

Baetscher, Diana S. et al. (2017). Data from: Microhaplotypes provide increased power from short-read DNA sequences for relationship inference [Dataset]. Dryad. https://doi.org/10.5061/dryad.5863d

Abstract

The accelerating rate at which DNA sequence data is now generated by high-throughput sequencing instruments provides both opportunities and challenges for population genetic and ecological investigations of animals and plants. We show here how the common practice of calling genotypes from a single SNP per sequenced region ignores substantial additional information in the phased short-read sequences that are provided by high-throughput sequencing instruments. We target sequenced regions with multiple SNPs in kelp rockfish (Sebastes atrovirens) to determine “microhaplotypes” and then call these microhaplotypes as alleles at each locus. We then demonstrate how these multi-allelic marker data from 96 such loci dramatically increase power for relationship inference. The microhaplotype approach decreases false positive rates by several orders of magnitude, relative to calling bi-allelic SNPs, for two challenging analytical procedures, full sibling and single parent-offspring pair identification. The advent of phased short-read DNA sequence data, in conjunction with emerging analytical tools for their analysis, promises to improve efficiency by reducing the number of loci necessary for a particular level of statistical confidence, thereby lowering the cost of data collection and reducing the degree of physical linkage amongst markers used for relationship estimation. Such advances will facilitate collaborative research and management for migratory and other widespread species.
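The idea of calling a microhaplotype as a single allele can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `call_microhaplotypes`, the toy reads, and the SNP offsets are all hypothetical, and real data would also need alignment, base-quality filtering, and read-depth thresholds. The sketch only shows that because each short read is phased, the bases at several SNP positions in the same read can be joined into one multi-allelic allele.

```python
from collections import Counter

def call_microhaplotypes(reads, snp_positions):
    """Count microhaplotypes: the bases at every SNP position, joined per read.

    Hypothetical helper for illustration only; assumes reads are already
    aligned so that the same index refers to the same site in every read.
    """
    haplotypes = Counter()
    for read in reads:
        # Skip reads that do not cover every SNP position.
        if len(read) <= max(snp_positions):
            continue
        haplotypes["".join(read[p] for p in snp_positions)] += 1
    return haplotypes

# Hypothetical reads from one locus in one diploid individual.
reads = ["ACGTAC", "ACGTAC", "ATGTGC", "ATGTGC", "ATGTGC"]
snp_positions = [1, 4]  # hypothetical 0-based offsets of two SNPs

counts = call_microhaplotypes(reads, snp_positions)

# Naive genotype call (an assumption of this sketch): the two most
# frequent microhaplotypes are taken as the individual's two alleles.
genotype = sorted(h for h, _ in counts.most_common(2))
print(counts, genotype)
```

With two bi-allelic SNPs in one read, up to four distinct microhaplotype alleles are possible at the locus, whereas calling a single SNP yields at most two. That larger allele count per locus is the source of the added power described in the abstract.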

Usage notes

Funding

National Science Foundation, Award: 1260693

Location

California