Dryad

Supplementary materials: RADseq phylogenetics in two frog clades

Cite this dataset

Chambers, E. Anne et al. (2023). Supplementary materials: RADseq phylogenetics in two frog clades [Dataset]. Dryad. https://doi.org/10.5061/dryad.fbg79cnsp

Abstract

Restriction-site associated DNA sequencing (RADseq) has become an accessible way to obtain genome-wide data in the form of single nucleotide polymorphisms (SNPs) for phylogenetic inference. Nonetheless, how differences in RADseq methods influence phylogenetic estimation is poorly understood because most comparisons have relied on conceptual predictions rather than empirical tests. We examine how differences in ddRAD and 2bRAD data influence phylogenetic estimation in two non-model frog groups. We compare the impact of method choice on phylogenetic information, missing data, and allelic dropout, considering different sequencing depths. Given that researchers must balance input (funding, time) with output (amount and quality of data), we also provide comparisons of laboratory effort, computational time, monetary costs, and the repeatability of library preparation and sequencing. Both 2bRAD and ddRAD methods estimated well-supported trees, even at low sequencing depths, and had comparable amounts of missing data, patterns of allelic dropout, and phylogenetic signal. Compared to ddRAD, 2bRAD produced more repeatable datasets, had simpler laboratory protocols, and had an overall faster bioinformatics assembly. However, many fewer parsimony-informative sites per SNP were obtained from 2bRAD data when using native pipelines, highlighting a need for further investigation into the effects of each pipeline on resulting datasets. Our study underscores the importance of using empirical datasets to compare RADseq methods, including their expected results and theoretical performance, before undertaking costly experiments.

Methods

Samples: 5 species of dendrobatid frog (n=12; two biological replicate samples) and 5 species of ranid frog (n=12; two biological replicate samples); total number of samples = 24

Sequencing: 2bRAD and ddRAD for all samples (fastq files available on SRA)

Bioinformatics assembly: iPyrad (for both ddRAD and 2bRAD) and Matz Lab native pipeline (for both ddRAD and 2bRAD) 

Analyses: RAxML tree reconstruction, calculation of shared sites (unambiguous synapomorphies), missing data, read depth, repeatability (based on biological replicate samples)
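
As a rough illustration of two of the summary statistics mentioned in this record (per-sample missing data and parsimony-informative sites), the following minimal Python sketch computes both from a small SNP alignment. It is not the authors' pipeline: the function names, the toy alignment, and the treatment of `N`, `-`, and `?` as missing data are assumptions made for illustration, and IUPAC ambiguity codes are simply ignored when counting states.

```python
from collections import Counter

# Assumption: these characters mark missing data in the alignment.
MISSING = set("N-?")


def missing_fraction(seq):
    """Fraction of sites in one sample's sequence that are missing."""
    if not seq:
        return 0.0
    return sum(base.upper() in MISSING for base in seq) / len(seq)


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two samples.

    Only unambiguous bases (A, C, G, T) are counted; missing data and
    ambiguity codes are ignored (a simplification).
    """
    counts = Counter(b.upper() for b in column if b.upper() in "ACGT")
    return sum(1 for c in counts.values() if c >= 2) >= 2


def summarize(alignment):
    """Return (per-sample missing fractions, parsimony-informative site count).

    `alignment` maps sample names to equal-length sequences.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    missing = {name: missing_fraction(seq) for name, seq in alignment.items()}
    columns = zip(*alignment.values())  # iterate site by site
    n_pis = sum(is_parsimony_informative(col) for col in columns)
    return missing, n_pis


# Hypothetical toy alignment: four samples, five SNP sites.
toy = {
    "sample_A": "ACGTN",
    "sample_B": "ACGAN",
    "sample_C": "ATGA-",
    "sample_D": "ATGTA",
}
missing, n_pis = summarize(toy)
print(missing)
print(n_pis)
```

In the toy alignment, only the second and fourth sites have two states that each appear in at least two samples, so those are the sites counted as parsimony-informative. Dividing that count by the total number of SNPs would give a per-SNP rate comparable in spirit to the one discussed in the abstract.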

Funding

National Science Foundation, Award: 1556967