Data from: Reconstructing phylogeny from reduced-representation genome sequencing data without assembly or alignment

Fan, Huan; Ives, Anthony R.; Surget-Groba, Yann

Published May 30, 2018 on Dryad. https://doi.org/10.5061/dryad.r0hq0

Data files

May 30, 2018 version files (9.02 KB total)

Figure5b.tre (1.18 KB)
Figure5c.tre (1.15 KB)
Figure5d.tre (1.22 KB)
Viburnum_k15_n2.tre (1.88 KB)
Viburnum_ks15_n2_sba.tre (3.60 KB)
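
As a rough aid for inspecting the tree files, the sketch below pulls the tip (taxon) labels out of a tree string. It assumes the .tre files are plain Newick text with unquoted labels, which this record does not confirm; if they are NEXUS files, a dedicated phylogenetics library would be a better choice. The function names `newick_leaf_names` and `load_leaf_names` are hypothetical helpers introduced here for illustration.

```python
import re
from pathlib import Path


def newick_leaf_names(newick: str) -> list[str]:
    """Return tip labels from a Newick string (assumes unquoted labels)."""
    # A tip label directly follows '(' or ',' and ends at ':', ',', ')' or ';'.
    # Labels that follow ')' are internal-node labels or support values,
    # so they are deliberately not matched.
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


def load_leaf_names(path: str) -> list[str]:
    """Read a tree file (assumed to be Newick) and return its tip labels."""
    return newick_leaf_names(Path(path).read_text())
```

With the files downloaded to the working directory, `load_leaf_names("Figure5b.tre")` would list the taxa in that tree, provided the Newick assumption holds.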

Abstract

Reduced-representation genome sequencing, such as RADseq, aids the analysis of genomes by reducing the quantity of data, thereby lowering both sequencing costs and computational burdens. RADseq was initially designed for studying genetic variation across genomes at the population level, but has also proved to be suitable for interspecific phylogeny reconstruction. RADseq data pose challenges for standard phylogenomic methods, however, due to incomplete coverage of the genome and large amounts of missing data. Alignment-free methods are both efficient and accurate for phylogenetic reconstructions with whole genomes and are especially practical for non-model organisms; nonetheless, alignment-free methods have not been applied to reduced-representation sequencing data. Here, we test a full-genome assembly and alignment-free method, AAF, on RADseq data and propose two procedures for read selection that remove reads from restriction sites not found in the taxa being compared. We validate these methods using both simulations and real datasets. Read selection improved the accuracy of phylogenetic reconstruction in every simulated scenario and in both real datasets, making AAF as good as or better than a comparable alignment-based method, even though AAF had a much lower computational burden. We also investigated the sources of missing data in RADseq and their effects on phylogeny reconstruction using AAF. The AAF pipeline modified for RADseq or other reduced-representation sequencing data, phyloRAD, is available on GitHub (https://github.com/fanhuan/phyloRAD).
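
To make the alignment-free idea concrete, the sketch below computes a distance between two samples from the k-mers their reads share, with no assembly or alignment step. It is an illustration only, not the phyloRAD code: the formula used, d = -(1/k) ln(n_shared / n_total) with n_total taken as the smaller of the two k-mer counts, is one common k-mer-sharing distance and is assumed here rather than taken from this record. The sketch also omits reverse complements, k-mer frequency filtering, and the read-selection procedures the abstract proposes. All function names are hypothetical.

```python
import math


def kmers(seq: str, k: int) -> set[str]:
    """All distinct substrings of length k in one read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sample_kmers(reads: list[str], k: int) -> set[str]:
    """Union of the k-mers found in any read of a sample."""
    return set().union(*(kmers(r, k) for r in reads))


def shared_kmer_distance(reads_a: list[str], reads_b: list[str], k: int) -> float:
    """Distance from k-mer presence/absence (assumed formula, see lead-in)."""
    set_a = sample_kmers(reads_a, k)
    set_b = sample_kmers(reads_b, k)
    n_shared = len(set_a & set_b)
    n_total = min(len(set_a), len(set_b))
    if n_shared == 0:
        # No shared k-mers: the distance is undefined, reported as infinite.
        return math.inf
    return -math.log(n_shared / n_total) / k
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method. In this picture, reads from restriction sites present in only one of the two samples contribute unshared k-mers and inflate the distance, which suggests why removing such reads before counting could matter.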

Usage notes

Figure5b

Figure5c

Figure5d

Figure S3a

Figure S3b
