Supplementary Materials include results of simulation experiments to investigate the impact of phylogenetic regression with model violations.
Data files
May 06, 2024 version files 3.60 MB
Abstract
Modern comparative biology owes much to phylogenetic regression. At its conception, this technique sparked a revolution that armed biologists with phylogenetic comparative methods (PCMs) for disentangling evolutionary correlations from those arising from hierarchical phylogenetic relationships. Over the past few decades, the phylogenetic regression framework has become a paradigm of modern comparative biology that has been widely embraced as a remedy for shared ancestry. However, recent evidence has sown doubt over the efficacy of phylogenetic regression, and PCMs more generally, with the suggestion that many of these methods fail to provide an adequate defense against unreplicated evolution, the primary justification for using them in the first place. Importantly, some of the most compelling examples of biological innovation in nature result from abrupt lineage-specific evolutionary shifts, which current regression models are largely ill-equipped to deal with. Here we explore a solution to this problem by applying robust linear regression to comparative trait data. We formally introduce robust phylogenetic regression to the PCM toolkit with linear estimators that are less sensitive to model violations than the standard least-squares estimator, while still retaining high power to detect true trait associations. Our analyses also highlight an ingenuity of the original algorithm for phylogenetic regression based on independent contrasts, whereby robust estimators are particularly effective. Collectively, we find that robust estimators hold promise for improving tests of trait associations and offer a path forward in scenarios where classical approaches may fail. Our study joins recent arguments for increased vigilance against unreplicated evolution and a better understanding of evolutionary model performance in challenging yet biologically important settings.
README: Supplementary Material and figures showing results of simulation experiments and phylogenetic case studies
https://doi.org/10.5061/dryad.xpnvx0kn1
Supplementary Figures include results of simulation experiments to investigate the impact of phylogenetic regression with mismatched trees.
Supplementary Details associated with manuscript "Robust Phylogenetic Regression"
Brief Description of Supplementary Figures (see documents for further information)
SUPPLEMENTARY FIGURE 1. Diagnostic plots depicting studentized residuals as a function of leverage for the L2 estimator applied to the example datasets shown in Figure 2 with PIC (a) and PGLS (b) regression.
SUPPLEMENTARY FIGURE 2. Histograms illustrating distributions of RPKM values across species (rows) and tissues (columns) for males (left column within tissue) and females (right column within tissue) from the Brawand et al. (2011) dataset.
SUPPLEMENTARY FIGURE 3. Investigating the impacts of evolutionary model violations for uncorrelated traits simulated under a shift model with variance .
SUPPLEMENTARY FIGURE 4. Investigating the impacts of evolutionary model violations for uncorrelated traits simulated under a shift model with variance .
SUPPLEMENTARY FIGURE 5. Investigating the impacts of the degree of true trait association for correlated traits simulated with true trait covariance .
SUPPLEMENTARY FIGURE 6. Investigating the impacts of the degree of true trait association for correlated traits simulated with true trait covariance .
SUPPLEMENTARY FIGURE 7. Phylogenetic regression in complex settings that include the joint scenario of a shift and nonzero trait covariance with 256 species.
SUPPLEMENTARY FIGURE 8. Exploring robust regression for predicting female expression as a function of male expression in three tissues from nine species.
SUPPLEMENTARY FIGURE 9. Alluvial plots showing relative fractions of genes with significant (*P* ≤ 0.05) and non-significant (*P* > 0.05) correlated expression in female and male brain, heart, and kidney tissues from nine species.
SUPPLEMENTARY FIGURE 10. Applying robust phylogenetic regression to muscle fiber data from 22 lizard species.