Data from: Graphs in phylogenetic comparative analysis: Anscombe’s quartet revisited
Revell, Liam J., Del Rosario University, University of Massachusetts Boston
Schliep, Klaus, University of Massachusetts Boston
Valderrama, Eugenio, Programa de Biología, Universidad del Rosario, Bogotá, Colombia
Richardson, James E., Del Rosario University, Royal Botanic Garden Edinburgh
Published Jul 13, 2019 on Dryad.
Cite this dataset
Revell, Liam J.; Schliep, Klaus; Valderrama, Eugenio; Richardson, James E. (2019). Data from: Graphs in phylogenetic comparative analysis: Anscombe’s quartet revisited [Dataset]. Dryad. https://doi.org/10.5061/dryad.11340t4
1. In 1973 the statistician Francis Anscombe used a clever set of bivariate datasets (now known as Anscombe’s quartet) to illustrate the importance of graphing data as a component of statistical analyses. In his example, each of the four datasets yielded identical regression coefficients and model fits, and yet when visualized revealed strikingly different patterns of covariation between x and y.
2. Phylogenetic comparative methods are statistical methods too, yet visualizing the data and phylogeny in a sensible way that would permit us to detect unexpected patterns or unanticipated deviations from model assumptions is not a routine component of phylogenetic comparative analyses.
3. Here, we use a quartet of phylogenetic datasets to illustrate that the same estimated parameters and model fits can be obtained from data that were generated using markedly different procedures – including pure Brownian motion evolution and randomly selected data uncorrelated with the tree. Just as in the case of Anscombe’s quartet, when graphed the differences between the four datasets are quickly revealed.
4. The intent of this article is to help build the general case that phylogenetic comparative methods are statistical methods and consequently that graphing or visualization should invariably be included as an essential step in our standard data analytical pipelines.
5. Phylogenies are complex data structures and thus visualizing data on trees in a meaningful and useful way is a challenging endeavor. We recommend that the development of graphical methods for simultaneously visualizing data and tree should continue to be an important goal in phylogenetic comparative biology.
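As a small illustration of the point in item 1, the sketch below fits an ordinary least-squares line to each of Anscombe's four classic (non-phylogenetic) datasets and prints the estimates. It is not part of the deposited R workflow. The helper `ols` and the names `QUARTET` and `FITS` are hypothetical, introduced only for this example, and the data values are the commonly reproduced ones from Anscombe's 1973 paper.

```python
from statistics import mean

def ols(x, y):
    """Return (slope, intercept, Pearson r) of a least-squares fit of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r

# Datasets I-III share the same x values; dataset IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

QUARTET = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Fit each dataset separately and report the estimates side by side.
FITS = {name: ols(x, y) for name, (x, y) in QUARTET.items()}
for name, (slope, intercept, r) in FITS.items():
    print(f"{name}: slope={slope:.2f} intercept={intercept:.2f} r={r:.2f}")
```

The four fits agree to about two decimal places, so the numerical summaries alone cannot tell the datasets apart. Only a scatterplot of each dataset shows the curvature in II, the outlier in III, and the single high-leverage point in IV.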
R markdown file containing the fully reproducible workflow of the study.
Built reproducible workflow containing all analyses of the study including (low quality versions of) all figures & tables.
National Science Foundation, Award: DEB-1350474 and DBI-1759940