Skip to main content
Dryad

Data from: A generalized K statistic for estimating phylogenetic signal from shape and other high-dimensional multivariate data

Cite this dataset

Adams, Dean C. (2014). Data from: A generalized K statistic for estimating phylogenetic signal from shape and other high-dimensional multivariate data [Dataset]. Dryad. https://doi.org/10.5061/dryad.8fc86

Abstract

Phylogenetic signal is the tendency for closely related species to display similar trait values due to their common ancestry. Several methods have been developed for quantifying phylogenetic signal in univariate traits and for sets of traits treated simultaneously, and the statistical properties of these approaches have been extensively studied. However, methods for assessing phylogenetic signal in high-dimensional multivariate traits like shape are less well developed, and their statistical performance is not well characterized. In this article, I describe a generalization of the K statistic of Blomberg et al. (2003) that is useful for quantifying and evaluating phylogenetic signal in highly-dimensional multivariate data. The method (Kmult) is found from the equivalency between statistical methods based on covariance matrices and those based on distance matrices. Using computer simulations based on Brownian motion, I demonstrate that the expected value of Kmult remains at 1.0 as trait variation among species is increased or decreased, and as the number of trait dimensions is increased. By contrast, estimates of phylogenetic signal found with a squared-change parsimony procedure for multivariate data change with increasing trait variation among species and with increasing numbers of trait dimensions, confounding biological interpretations. I also evaluate the statistical performance of hypothesis testing procedures based on Kmult and find that the method displays appropriate Type I error and high statistical power for detecting phylogenetic signal in high-dimensional data. Statistical properties of Kmult were consistent for simulations using bifurcating and random phylogenies, for simulations using different numbers of species, for simulations that varied the number of trait dimensions, and for different underlying models of trait covariance structure. Overall these findings demonstrate that Kmult provides a useful means of evaluating phylogenetic signal in high-dimensional multivariate traits. Finally, I illustrate the utility of the new approach by evaluating the strength of phylogenetic signal for head shape in a lineage of Plethodon salamanders.

Usage notes