Phylogenetic comparative methods explore the relationships between quantitative traits while adjusting for shared evolutionary history. This adjustment often occurs through a Brownian diffusion process along the branches of the phylogeny that generates model residuals or the traits themselves. For high-dimensional traits, inferring all pairwise correlations within the multivariate diffusion becomes limiting. To circumvent this problem, we propose phylogenetic factor analysis (PFA), which assumes that a small, unknown number of independent evolutionary factors arise along the phylogeny and that these factors generate clusters of dependent traits. Set in a Bayesian framework, PFA provides measures of uncertainty on the factor number and groupings, combines both continuous and discrete traits, integrates over missing measurements and incorporates phylogenetic uncertainty with the help of molecular sequences. We develop Gibbs samplers based on dynamic programming to estimate the PFA posterior distribution over three-fold faster than for multivariate diffusion, and a further order of magnitude more efficiently in the presence of latent traits. We further propose a novel marginal likelihood estimator for previously impractical models with discrete data and find that PFA also provides a better fit than multivariate diffusion for evolutionary questions in columbine flower development, placental reproduction transitions and triggerfish fin morphometry.
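The generative idea summarized above can be illustrated with a minimal simulation sketch. It is not the implementation used for the analysis files listed below: the four-taxon tree covariance, the loadings matrix, the residual standard deviation and the function names are all hypothetical values chosen for illustration. It assumes each factor evolves independently by Brownian diffusion on a fixed tree, and that traits are a linear combination of the factors plus independent noise.

```python
import numpy as np

def tree_covariance():
    # Hypothetical 4-taxon tree ((A,B),(C,D)) of height 1.0 (assumption):
    # entry (i, j) is the branch length shared by tips i and j from the root,
    # which is the tip covariance implied by Brownian diffusion on the tree.
    return np.array([
        [1.0, 0.6, 0.0, 0.0],
        [0.6, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.3],
        [0.0, 0.0, 0.3, 1.0],
    ])

def simulate_pfa(tree_cov, loadings, residual_sd, rng):
    """Draw factors and traits at the tips under a simple factor model."""
    n_taxa = tree_cov.shape[0]
    n_factors, n_traits = loadings.shape
    # Each column is one factor; columns are independent of each other,
    # while rows (taxa) are correlated according to the tree.
    chol = np.linalg.cholesky(tree_cov)
    factors = chol @ rng.standard_normal((n_taxa, n_factors))
    # Traits are factors mapped through the loadings, plus independent noise.
    noise = residual_sd * rng.standard_normal((n_taxa, n_traits))
    traits = factors @ loadings + noise
    return factors, traits

# Hypothetical loadings: factor 1 drives traits 0-1, factor 2 drives traits 2-4.
loadings = np.array([
    [1.0, 0.8, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5, -0.7],
])

rng = np.random.default_rng(0)
factors, traits = simulate_pfa(tree_covariance(), loadings, 0.1, rng)

# Trait covariance contributed by the factors: nonzero only within a cluster.
factor_trait_cov = loadings.T @ loadings
```

In this sketch the zero blocks of `factor_trait_cov` correspond to pairs of traits loaded on different factors, which is the sense in which a few independent factors produce clusters of dependent traits without estimating every pairwise correlation directly.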
AquilegiaDiffusion
Runs the columbine flower example under the latent multivariate Brownian diffusion model.
AquilegiaFac2
Runs the columbine flower example under the phylogenetic factor analysis model for 2 factors.
AquilegiaDiffusionMLE
Runs the marginal likelihood estimator for the columbine flower example under the latent multivariate Brownian diffusion model.
AquilegiaFacMLE1
Runs the marginal likelihood estimator for the columbine flower example under the phylogenetic factor analysis model for 1 factor.
AquilegiaFacMLE2
Runs the marginal likelihood estimator for the columbine flower example under the phylogenetic factor analysis model for 2 factors.
AquilegiaFacMLE3
Runs the marginal likelihood estimator for the columbine flower example under the phylogenetic factor analysis model for 3 factors.
AquilegiaSpeedDiffusion
Runs the columbine flower example under the latent multivariate Brownian diffusion model. This file is designed to measure a speed comparison between models.
AquilegiaSpeedFac2
Runs the columbine flower example under the phylogenetic factor analysis model with 2 factors. This file is designed to measure a speed comparison between models.
PoeciliidaeFac2
Runs the Poeciliidae example under the phylogenetic factor analysis model with 2 factors.
PoeciliidaeFac3
Runs the Poeciliidae example under the phylogenetic factor analysis model with 3 factors.
PoeciliidaeFac4
Runs the Poeciliidae example under the phylogenetic factor analysis model with 4 factors.
PoeciliidaeFacMLE2
Runs the marginal likelihood estimator for the Poeciliidae example under the phylogenetic factor analysis model with 2 factors.
PoeciliidaeFacMLE3
Runs the marginal likelihood estimator for the Poeciliidae example under the phylogenetic factor analysis model with 3 factors.
PoeciliidaeFacMLE4
Runs the marginal likelihood estimator for the Poeciliidae example under the phylogenetic factor analysis model with 4 factors.
PoeciliidaeFacMLE5
Runs the marginal likelihood estimator for the Poeciliidae example under the phylogenetic factor analysis model with 5 factors.
PoeciliidaeDiffusionMLE
Runs the marginal likelihood estimator for the Poeciliidae example under the latent multivariate Brownian diffusion model.
PoeciliidaeDiffusionSpeedTest
Runs the Poeciliidae example under the latent multivariate Brownian diffusion model. This file is designed to measure a speed comparison between models.
PoeciliidaeFac3SpeedTest
Runs the Poeciliidae example under the phylogenetic factor analysis model for 3 factors. This file is designed to measure a speed comparison between models.
PoeciliidaeFac4speedTest
Runs the Poeciliidae example under the phylogenetic factor analysis model for 4 factors. This file is designed to measure a speed comparison between models.
TriggerfishFac5
Runs the triggerfish example under the phylogenetic factor analysis model for 5 factors.
TriggerfishDiffusionMLE
Runs the marginal likelihood estimator for the triggerfish example under the multivariate Brownian diffusion model.
TriggerfishFacMLE4
Runs the marginal likelihood estimator for the triggerfish example under the phylogenetic factor analysis model for 4 factors.
TriggerfishFacMLE5
Runs the marginal likelihood estimator for the triggerfish example under the phylogenetic factor analysis model for 5 factors.
TriggerfishFacMLE6
Runs the marginal likelihood estimator for the triggerfish example under the phylogenetic factor analysis model for 6 factors.
TriggerfishDiffusionSpeed
Runs the triggerfish example under the multivariate Brownian diffusion model. This file is designed to measure a speed comparison between models.
TriggerfishFac5Speed
Runs the triggerfish example under the phylogenetic factor analysis model for 5 factors. This file is designed to measure a speed comparison between models.
Supplementary figures and tables
Supplementary figures and tables for the paper "Phylogenetic Factor Analysis" that were not included in the main text.
supplementary_material_revised.pdf