Dryad

Capturing single-copy nuclear genes, organellar genomes, and nuclear ribosomal DNA from deep genome skimming data for plant phylogenetics: A case study in Vitaceae

Data files

Jul 23, 2021 version (files: 490.08 MB)

Abstract

With decreasing sequencing costs and the growing availability of newly developed bioinformatics pipelines, next-generation sequencing (NGS) has revolutionized plant systematics in recent years. Genome skimming has been widely used to obtain the high-copy fractions of genomes, including plastomes, mitochondrial DNA (mtDNA), and nuclear ribosomal DNA (nrDNA). In this study, we used simulations to evaluate the optimal (minimum) sequencing depth and the performance of genome skimming for recovering single-copy nuclear genes (SCNs): we subsampled genome resequencing data in silico to generate 10 datasets with different sequencing coverage. We tested the performance of four datasets obtained from genome skimming (plastome, nrDNA, mtDNA, and SCNs) in phylogenetic analyses of the Vitis clade at the genus level and of Vitaceae at the family level. Our results showed that the optimal minimum sequencing depth for high-quality SCN assembly via genome skimming was about 10× coverage. We show that deep genome skimming (DGS), which requires no bait synthesis or enrichment experiments and benefits from very low sequencing costs, is as effective for capturing large datasets of SCNs as the widely used Hyb-Seq approach, while also capturing plastomes, mtDNA, and entire nrDNA repeats. DGS may serve as an efficient and economical alternative to, and may even be superior to, the popular target enrichment/Hyb-Seq approach.
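The in silico subsampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not name the tool they used, and the function names `units_for_coverage` and `subsample` are hypothetical. The sketch assumes the usual relation coverage = (number of reads × read length) / genome size, and treats a read pair as a single unit so that mates stay together.

```python
import math
import random


def units_for_coverage(genome_size_bp, read_length_bp, target_coverage, paired=True):
    """Reads (or read pairs, if paired) needed to reach a target coverage.

    Uses coverage = N * L / G, where a paired-end unit contributes 2 * L bases.
    """
    bases_per_unit = read_length_bp * (2 if paired else 1)
    return math.ceil(target_coverage * genome_size_bp / bases_per_unit)


def subsample(units, n_keep, seed=0):
    """Randomly keep n_keep units without replacement, preserving input order.

    `units` is any in-memory list (for example, read pairs already parsed
    from FASTQ). A fixed seed makes the subsample reproducible.
    """
    if n_keep >= len(units):
        return list(units)
    rng = random.Random(seed)
    kept_indices = sorted(rng.sample(range(len(units)), n_keep))
    return [units[i] for i in kept_indices]


if __name__ == "__main__":
    # Illustrative values only (assumptions, not taken from the study):
    # a 500 Mb genome, 150 bp paired-end reads, 10x target coverage.
    n_pairs = units_for_coverage(500_000_000, 150, 10)
    print(f"read pairs needed for 10x: {n_pairs}")
```

With these illustrative inputs, roughly 16.7 million read pairs correspond to 10× coverage. For real FASTQ files, a streaming subsampler is preferable to loading all reads into memory.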