Data from: Robustness of Felsenstein’s versus transfer bootstrap supports with respect to taxon sampling
Data files
Aug 08, 2023 version files 22.26 GB
- Appendix-Online.pdf
- Data.tar.gz
- README.md
- Results.tar.gz
- Supplementary-Information.pdf
Abstract
The bootstrap method is based on resampling sequence alignments and re-estimating trees. Felsenstein’s bootstrap proportions (FBP) are the most common approach to assess the reliability and robustness of sequence-based phylogenies. However, when increasing taxon sampling (i.e., the number of sequences) to hundreds or thousands of taxa, FBP tends to return low supports for deep branches. The Transfer Bootstrap Expectation (TBE) has recently been suggested as an alternative to FBP. TBE is measured using a continuous transfer index in [0,1] for each bootstrap tree, instead of the binary {0,1} index used in FBP to measure the presence/absence of the branch of interest. TBE has been shown to yield higher and more informative supports, while inducing a very low number of falsely supported branches. Nonetheless, it has been argued that TBE must be used with care due to sampling issues, especially in datasets with a high number of closely related taxa. In this study, we conduct multiple experiments by varying taxon sampling and comparing FBP and TBE support values at different phylogenetic depths, using empirical datasets. Our results show that the main critique of TBE stands in extreme cases with shallow branches and highly unbalanced sampling among clades, but that TBE is still robust in most cases, while FBP is inescapably negatively impacted by high taxon sampling. We suggest guidelines and good practices in TBE (and FBP) computation and interpretation.
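To make the contrast between the binary FBP index and the continuous transfer index concrete, the following is a minimal sketch and not the code used in this study. It assumes each tree is given as a collection of bipartitions (one side of each split, as a set of taxon names), and that the transfer index of a branch is the minimum number of taxa to move, divided by p − 1, where p is the size of the smaller side of the reference branch. All names (`transfer_distance`, `branch_supports`, the toy taxa) are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Tuple

Split = FrozenSet[str]  # one side of a bipartition, as a set of taxon names


def transfer_distance(ref: Split, other: Split, n_taxa: int) -> int:
    """Number of taxa to move so that `other` matches `ref` (either orientation)."""
    d = len(ref ^ other)        # Hamming distance for one orientation
    return min(d, n_taxa - d)   # the complement orientation may be closer


def branch_supports(
    ref: Split, boot_trees: Sequence[Sequence[Split]], taxa: FrozenSet[str]
) -> Tuple[float, float]:
    """Return (FBP, TBE) for the reference branch `ref` over the bootstrap trees."""
    n = len(taxa)
    p = min(len(ref), n - len(ref))  # size of the smaller side of the branch
    if p < 2:
        raise ValueError("supports are only defined for internal branches")
    hits = 0
    index_sum = 0.0
    for splits in boot_trees:
        delta = min((transfer_distance(ref, s, n) for s in splits), default=p - 1)
        # A leaf branch is always at distance p - 1, so the distance is capped there.
        delta = min(delta, p - 1)
        hits += delta == 0               # FBP: binary presence/absence
        index_sum += delta / (p - 1)     # TBE: continuous index in [0, 1]
    return hits / len(boot_trees), 1.0 - index_sum / len(boot_trees)


# Toy example (assumed data): six taxa, reference branch ABC | DEF.
taxa = frozenset("ABCDEF")
ref = frozenset("ABC")
boot_trees: List[List[Split]] = [
    [frozenset("ABC"), frozenset("AB")],  # branch present
    [frozenset("AB"), frozenset("DE")],   # one taxon away
    [frozenset("AD"), frozenset("BE")],   # no close match
    [frozenset("ABC"), frozenset("EF")],  # branch present
]
fbp, tbe = branch_supports(ref, boot_trees, taxa)
```

In this toy case the second bootstrap tree contributes nothing to FBP but still contributes partially to TBE, which is why TBE can be higher than FBP for the same branch.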
Usage notes
Online supplementary material
Contains two files and two folders:
(i) Supplementary information (pdf) with supplementary figures and tables.
(ii) Appendix Online (pdf) with a description of command-line options.
(iii) Data/: reference/bootstrap trees and alignments for each experiment.
(iv) Results/: FBP/TBE support results and R scripts to reproduce the figures.
See README for more details.