Data from: RAD sequencing, genotyping error estimation and de novo assembly optimization for population genetic inference

Cite this dataset

Mastretta-Yanes, Alicia et al. (2014). Data from: RAD sequencing, genotyping error estimation and de novo assembly optimization for population genetic inference [Dataset]. Dryad. https://doi.org/10.5061/dryad.g52m3

Abstract

Restriction site-associated DNA sequencing (RADseq) provides researchers with the ability to record genetic polymorphism across thousands of loci for non-model organisms, potentially revolutionising the field of molecular ecology. However, as with other genotyping methods, RADseq is prone to a number of sources of error that may have consequential effects for population genetic inferences, and these have received only limited attention in terms of the estimation and reporting of genotyping error rates. Here we use individual sample replicates, under the expectation of identical genotypes, to quantify genotyping error in the absence of a reference genome. We then use sample replicates to (1) optimize de novo assembly parameters within the program Stacks, by minimizing error and maximizing the retrieval of informative loci; and (2) quantify error rates for loci, alleles and SNPs. As an empirical example, we use a double-digest RAD dataset of a non-model plant species, Berberis alpina, collected from high-altitude mountains in Mexico.
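
The replicate-based idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function name, the input layout (one dictionary per replicate mapping locus ID to a genotype tuple) and the two rate definitions are assumptions made for illustration: locus error is taken as the share of loci recovered in only one replicate, and allele error as the share of shared loci whose genotypes disagree.

```python
def replicate_error_rates(rep_a, rep_b):
    """Compare genotype calls from two replicates of the same sample.

    Hypothetical input layout: each argument maps a locus ID to a genotype
    (a tuple of alleles); loci that were not recovered are simply absent.
    """
    loci_a, loci_b = set(rep_a), set(rep_b)
    shared = loci_a & loci_b
    all_loci = loci_a | loci_b

    # Assumed locus-level error: loci found in only one replicate of the pair.
    locus_error = len(loci_a ^ loci_b) / len(all_loci) if all_loci else 0.0

    # Assumed allele-level error: shared loci with differing genotypes
    # (allele order within a genotype is ignored).
    mismatches = sum(
        1 for locus in shared if sorted(rep_a[locus]) != sorted(rep_b[locus])
    )
    allele_error = mismatches / len(shared) if shared else 0.0
    return locus_error, allele_error


# Toy replicate pair; the values are invented for the example.
rep_a = {"loc1": ("A", "A"), "loc2": ("A", "G"), "loc3": ("C", "T")}
rep_b = {"loc1": ("A", "A"), "loc2": ("G", "G"), "loc4": ("T", "T")}
locus_error, allele_error = replicate_error_rates(rep_a, rep_b)
```

Because replicates are expected to carry identical genotypes, any disagreement counted here is attributed to error rather than biology. Under these assumed definitions, rates like these could be compared across assembly parameter settings.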

Usage notes

Location

Cerro Zamorano
Ajusco
Sierra Madre Occidental
La Malinche
Cerro Tlaloc
Nevado de Toluca
Faja Volcanica Transmexicana
Iztaccihuatl
Trans Mexican Volcanic Belt
Mexico
Cerro San Andres
Cofre de Perote
Transmexican Volcanic Belt