Paralogs and off-target sequences improve phylogenetic resolution in a densely-sampled study of the breadfruit genus (Artocarpus, Moraceae)
Data files
Feb 11, 2021 version files 189.30 MB
- Appendix_1_and_2.pdf
- exon_aligned_trimmed.zip
- exon_raw_unaligned.zip
- Figs_S1-S25.zip
- README_nametable.txt
- supercontig_aligned_trimmed.zip
- supercontig_raw_unaligned.zip
- supermatrices.zip
- Tables_S1-S4.zip
- trees_for_dryad.zip
Abstract
We present a 517-gene phylogenetic framework for the breadfruit genus Artocarpus (ca. 70 spp., Moraceae), making use of silica-dried leaves from recent fieldwork and herbarium specimens (some up to 106 years old) to achieve 96% taxon sampling. We explore issues relating to assembly, paralogous loci, partitions, and analysis method to reconstruct a phylogeny that is robust to variation in data and available tools. While codon partitioning did not result in any substantial topological differences, the inclusion of flanking non-coding sequence in analyses significantly increased the resolution of gene trees. We also found that increasing the size of datasets increased convergence between analysis methods but did not reduce gene tree conflict. We optimized the HybPiper targeted-enrichment sequence assembly pipeline for short sequences derived from degraded DNA extracted from museum specimens. While the subgenera of Artocarpus were monophyletic, revision is required at finer scales, particularly with respect to widespread species. We expect our results to provide a basis for further studies in Artocarpus and provide guidelines for future analyses of datasets based on target enrichment data, particularly those using sequences from both fresh and museum material, counseling careful attention to the potential of off-target sequences to improve resolution.