Empirical data for: Extending phylogenetic regression models for comparing within-species patterns across the Tree of Life
Data files
Oct 07, 2024 version files (1.29 MB)
- pupfishGM.filtered.Rdata (1.29 MB)
- README.md (1.79 KB)
Abstract
Biologists use phylogenetic comparative methods to characterize macroevolutionary trends of phenotypic change across species. Within-species variation can complicate such investigations, and procedures incorporating nonstructured (random) intraspecific variation have been developed. However, intraspecific variation can also be structured (sex-specific differences, allometric trends), and biologists often wish to compare such intraspecific patterns across taxa. Unfortunately, current analytical approaches cannot interrogate within-species patterns while simultaneously accounting for phylogenetic nonindependence. Thus, deciphering how intraspecific trends evolve remains a challenge. We introduce an extended phylogenetic generalized least squares (E-PGLS) procedure which facilitates comparisons of within-species patterns across species while accounting for phylogenetic nonindependence. Our method uses an expanded phylogenetic covariance matrix, a hierarchical linear model, and permutation methods to obtain empirical sampling distributions and effect sizes for model effects that can evaluate differences in intraspecific trends across species for both univariate and multivariate data while conditioning them on the phylogeny. The method has appropriate statistical properties and obtains evolutionary covariance estimates that reflect those from existing approaches for nonstructured (random) intraspecific variation. Additionally, E-PGLS can detect differences in structured intraspecific patterns across species when such trends are present. Thus, E-PGLS extends the reach of phylogenetic comparative methods into the intraspecific comparative realm, by providing the ability to evaluate within-species trends across species while simultaneously accounting for shared evolutionary history.
General Information
Author Information
Corresponding Investigator
Name: Dean C. Adams
Institution: Iowa State University, Ames, Iowa
Co-investigator 1
Name: Michael L. Collyer
Institution: Chatham University, Pittsburgh, Pennsylvania
Recommended citation for this dataset: Adams and Collyer (2024), Data from: Extended phylogenetic regression models for comparing within-species patterns across the tree of life. Dryad Digital Repository.
DATA & FILE OVERVIEW
- Description of dataset
A total of 12 landmarks and 74 semilandmarks were digitized from lateral images of adult pupfish using the `R` digitizing package `StereoMorph`, version 1.6.5. The 74 semilandmarks were sampled from Bézier curves digitized on the images, rather than being digitized as individual points.
File List:
File 1 Name: pupfishGM.filtered.Rdata
File 1 Description: Rdata file containing all landmark data and specimen information
File 2 Name: PupFishExample-DRYAD.R (hosted on Zenodo)
File 2 Description: R script for obtaining the phylogeny from Fishtree and performing all analyses.
DATA-SPECIFIC INFORMATION FOR: pupfishGM.filtered.Rdata
The file contains a list ‘gdf2’ with the following components:
- coords: Procrustes-aligned landmark coordinates for 969 specimens (12 landmarks + 74 semilandmarks optimized using minimum bending energy)
- Csize: the centroid size for each specimen
- Sex: a factor designating the sex of each specimen
- Species: a factor designating the species to which each specimen belongs
- Population: a factor designating the population from which each specimen was obtained
- Sample: a factor designating the sample for each specimen
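As a quick sanity check after downloading, the component list above can be compared against what the file actually restores. The following is a minimal sketch, not part of the authors' script: the path `rdata_path` and the helper `missing_components()` are hypothetical names introduced here, and the code assumes the file restores an object called `gdf2` as described.

```r
# Hypothetical path: adjust to wherever the file was downloaded.
rdata_path <- "pupfishGM.filtered.Rdata"

# Component names as documented in this README.
expected <- c("coords", "Csize", "Sex", "Species", "Population", "Sample")

# Hypothetical helper: returns the documented components that are absent
# from a list (character(0) when nothing is missing).
missing_components <- function(x, expected) {
  setdiff(expected, names(x))
}

if (file.exists(rdata_path)) {
  restored <- load(rdata_path)  # load() returns the names of restored objects
  print(restored)
  if (exists("gdf2")) {
    print(missing_components(gdf2, expected))
  }
}
```

If the second printed value is `character(0)`, every documented component is present in the loaded list.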
Specimens were obtained from museum collections (the University of Michigan Museum of Zoology, UMMZ, and the Museum of Southwestern Biology, MSB) containing a large number of males and females for each species. A total of 969 individuals representing both males and females from each of the seven species were used in this example. The sample sizes ranged from 75 to 239 individuals per species (with a mean of 138 individuals per species). An average of 69 individuals were measured for each Species × Sex combination (range: 35-122).
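The sample-size figures quoted above can be recomputed from the `Species` and `Sex` factors. The sketch below is illustrative only: `count_summary()` is a hypothetical helper, and the toy factors are made-up values used to show the calculation, not the pupfish data.

```r
# Hypothetical helper: summarise counts per species and per
# species-by-sex combination from two factors of equal length.
count_summary <- function(species, sex) {
  per_species <- as.vector(table(species))
  per_cell <- as.vector(table(species, sex))
  list(
    n_total = length(species),
    species_range = range(per_species),
    species_mean = mean(per_species),
    cell_range = range(per_cell),
    cell_mean = mean(per_cell)
  )
}

# Toy factors (not the real data): 5 specimens of sp1, 7 of sp2.
toy_species <- factor(rep(c("sp1", "sp2"), times = c(5, 7)))
toy_sex <- factor(c("F", "F", "M", "M", "M",
                    "F", "F", "F", "F", "M", "M", "M"))
toy <- count_summary(toy_species, toy_sex)
```

With the real file loaded, the analogous call would be `count_summary(gdf2$Species, gdf2$Sex)`, assuming those components are factors of equal length as documented.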
All specimens were imaged using either a Nikon D90 digital camera (MSB) or a Panasonic DC-GX850 digital camera (UMMZ). On each image, the locations of 12 landmarks and 74 semilandmarks were digitized using the `R` package `StereoMorph`, version 1.6.5. The 74 semilandmarks were sampled from Bézier curves digitized on the images, rather than being digitized as individual points. Geometric morphometric methods were then used to obtain body shape information. Specifically, generalized Procrustes analysis was performed to remove non-shape variation, with semilandmarks slid to minimize bending energy. Next, the resulting Procrustes residuals were subjected to a principal components analysis, and the first 40 PCs (which together represented 99.4% of the total shape variation) were retained and treated as the set of shape variables for all subsequent statistical analyses.
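The dimension-reduction step described above (a PCA of the aligned coordinates, keeping the leading PCs) can be sketched in base R. This is an illustration under stated assumptions, not the authors' code: `retain_pcs()` is a hypothetical helper, the coordinates are assumed to be stored as a landmarks × dimensions × specimens array, and the toy array is random noise rather than Procrustes-aligned shape data.

```r
# Hypothetical helper: PCA of a p x k x n coordinate array, returning the
# first n_pcs score columns and the proportion of variance they capture.
retain_pcs <- function(coords, n_pcs) {
  stopifnot(length(dim(coords)) == 3)
  # Flatten each specimen's p x k matrix into one row (x1, y1, x2, y2, ...).
  shape_mat <- t(apply(coords, 3, function(spec) as.vector(t(spec))))
  pca <- prcomp(shape_mat)  # centers columns by default, no scaling
  var_prop <- pca$sdev^2 / sum(pca$sdev^2)
  list(
    scores = pca$x[, seq_len(n_pcs), drop = FALSE],
    cum_var = sum(var_prop[seq_len(n_pcs)])
  )
}

# Toy array (random values): 5 landmarks, 2 dimensions, 30 specimens.
set.seed(1)
toy_coords <- array(rnorm(5 * 2 * 30), dim = c(5, 2, 30))
toy_pcs <- retain_pcs(toy_coords, n_pcs = 3)
dim(toy_pcs$scores)  # 30 rows (specimens) by 3 columns (retained PCs)
```

Applied to the real data, the analogous call would be `retain_pcs(gdf2$coords, n_pcs = 40)`, provided `coords` has the array layout assumed here; `cum_var` could then be compared with the 99.4% reported above.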
All digitizing and data analyses were performed in R using the packages geomorph, RRPP, and StereoMorph. The fish phylogeny was obtained using the R package fishtree.