The importance of evolutionary timelines when explaining the evolution of parental care strategies
Data files
Dec 05, 2024 version files: 694.79 MB
- code_and_data.zip (694.73 MB)
- Raw_Data_Supplemental_File.xlsx (54.62 KB)
- README.md (13.38 KB)
Abstract
Comparative research on the evolution of parental care has followed a general trend in recent years, with researchers gathering data on clutch size or egg size and correlating these traits with ecological variables across a phylogeny. The goal of these studies is to shed light on how and why certain life history strategies evolve. However, results vary across studies, and we rarely have results explaining why the observed pattern occurred, leaving us with further hypotheses to test. By using a combination of comparative methods, we provide an explanation of how such patterns emerge based on the evolutionary timeline of the investment in eggs and life-history traits; this combination also allowed us to pinpoint why the pattern occurred. We do so with data on freshwater crayfish, which are ideal for such investigations because they exhibit a diversity in body size, post-ovipositional care strategies (associated with burrowing), and pre-ovipositional investment in eggs. Specifically, we tested whether a strong dependence on burrows is related to pre-ovipositional investment in eggs (i.e., larger eggs or more eggs).
We found no correlation between burrowing and the size or number of eggs crayfish lay; instead, body size was the best predictor of the number of eggs (but not the size of eggs) that each species lays. Interestingly, our analysis suggests that crayfish ancestors had a small clutch size, relatively large eggs, and a weak connection to burrows. Thus, the shift to heavily relying on burrows appeared after this lineage had already evolved relatively large eggs, which gives insights into the colonization of freshwater by an astacidean ancestor. While other studies show that the connection between life history and egg investment is not straightforward, our study provides a clear evolutionary timeline of the interplay between the evolution of pre-ovipositional parental care and life history strategies. Furthermore, our work showcases how merging multiple phylogenetically informed approaches can disentangle the origin and evolution of life history traits.
README: The evolution of burrowing, parental care, and egg investment in freshwater crayfish:
Authors: Zackary A. Graham, Zachary Loughman, Alexandre V. Palaoro
Contact about code and analyses: alexandre.palaoro@gmail.com
This README is divided into three parts: the file structure, the code, and the dataset.
File structure:
We ran three types of analyses, each contained in a separate folder inside the ZIP file.
Two analyses use reversible-jump MCMC to test the magnitude and location of shifts in evolutionary optima across the phylogeny. Because these are Bayesian MCMC implementations, each of the two analyses has two types of code: scripts that run on a cluster to generate the MCMC chains and the stepping-stone analysis, and a script that parses the information from those chains. All files required to run these analyses are in the "clutch_size" and "egg_size" folders. In each, you will find the parsing code and every script used on the cluster, plus three subfolders: "data", "figures", and "results_mcmc". "data" contains the data files and the phylogeny. "figures" receives the figures made in R. "results_mcmc" contains the chains and stepping-stone results generated by the Bayesian analyses run on the cluster. We provide all the resulting files so that you do not have to rerun the analyses; you can load the chains and proceed as you wish.
The third type of analysis comprises a phylogenetic linear regression and ancestral state reconstructions. Its folder contains "data" and "figures" folders that serve the same function as before, and two scripts: "tree_vis_file.R" and "pgls-run.R". The first is the code we used to build most of the phylogeny figures in the paper. The second is the code for all the ancestral state reconstructions and PGLS analyses we present in the paper and supplementary files.
Code:
The code was written to be run on a cluster and then parsed on a standard computer. All scripts whose names start with "eggsize_", "stepstone_", or "rjmcmc_" were run on the Palmetto cluster (Clemson University) to generate the MCMC chains or to perform the stepping-stone procedure that calculates the marginal likelihood of the models. We provide all files generated on the cluster in the "results_mcmc" folder. Thus, you can run the parsing code to load those objects and chains and perform any data treatment you need.
The only exception is the "pgls" folder, which contains (for the most part) non-Bayesian analyses. These can be run on any computer and do not necessarily require the MCMC chains. However, one of the ancestral reconstructions used a Bayesian approximation; for that one, we also provide the RDS files of the chains.
If you want to run the chains yourself, use any of the "rjmcmc_XX.R" scripts. That script will run all the chains for you. The nomenclature is as follows:
- 11 = a global model with no shifts in either intercept or slope.
- N1 = a model where lineages can shift their intercept optimum, but not slope.
- NN = a model where lineages can shift both their intercept and slope.
- These three models allow the lineages to vary their optimum, and we test how many shifts there were (and how significant they were).
- C2 = a model where we use an ancestral reconstruction of their burrowing status to show where shifts can occur. In this model, lineages vary in their intercept, but not slope. Here, their burrowing status follows a combination of morphology and traditional classification.
- F2 = a model where we use an ancestral reconstruction of their burrowing status to show where shifts can occur. In this model, lineages vary in their intercept, but not slope. Here, their burrowing status follows a morphology classification.
- T2 = a model where we use an ancestral reconstruction of their burrowing status to show where shifts can occur. In this model, lineages vary in their intercept, but not slope. Here, their burrowing status follows the traditional classification.
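As a quick reference, the nomenclature above can be written as a small lookup table. The analyses themselves are R scripts; the Python sketch below is only an illustration. The helper name `rjmcmc_script` is hypothetical, and it builds file names from the "rjmcmc_XX.R" pattern without checking that the files exist.

```python
# Model codes and their meaning, copied from the nomenclature list above.
MODELS = {
    "11": "global model: no shifts in intercept or slope",
    "N1": "lineages can shift intercept, but not slope",
    "NN": "lineages can shift both intercept and slope",
    "C2": "shifts mapped by burrowing status (combined classification); intercept varies",
    "F2": "shifts mapped by burrowing status (morphological classification); intercept varies",
    "T2": "shifts mapped by burrowing status (traditional classification); intercept varies",
}


def rjmcmc_script(code):
    """Return the script name for a model code, following the "rjmcmc_XX.R" pattern.

    Hypothetical helper: it only formats the name and does not touch the disk.
    """
    if code not in MODELS:
        raise ValueError(f"unknown model code: {code!r}")
    return f"rjmcmc_{code}.R"


if __name__ == "__main__":
    for code, description in MODELS.items():
        print(f"{rjmcmc_script(code)}: {description}")
```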
If you want to run the stepping-stone procedure on the chains to calculate the marginal likelihood, please run the "stepstone-XX.R" code. It follows the same nomenclature as the previous files.
Lastly, if you just want to look at the results, run "parsing_code.R". Just make sure to keep the same file structure as provided here.
To run the PGLS analysis, use the "pgls-run.R" code; it does everything related to that analysis. "tree_vis_file.R" simply builds the phylogeny shown in Figure 2.
Dataset:
We are uploading two data sheets. The first, called "Raw_Data_Supplemental_File.xlsx", contains the raw data of everything we collected from the literature. The second is embedded in the "data" folder of each analysis and is named "LH_data_reorder.csv".
The raw data contains every piece of information we found for species, with the citation to back it up. We are adding the metadata for this file below.
The second data sheet, used for data analysis, contains the average values for each species of crayfish for which we found information. All variables are contained in the same file. These data were used to run all analyses in the manuscript. Body length, egg diameter, and number of eggs were all collected from the literature. If we found more than one value for egg number, we averaged the observations. If we only had a range of values, we took the mid-point of the range. "NA" cells represent species for which we found no information, while "null" cells in the notes column mean there are no notes for that species.
Again, to find this file, navigate to the "data" folder contained in any of the analysis folders (such as the "clutch_size" folder).
The phylogenetic tree was taken from:
Stern, D. B., J. Breinholt, C. Pedraza-Lara, M. López-Mejía, C. L. Owen, H. Bracken-Grissom, J. W. Fetzner, and K. A. Crandall. 2017. Phylogenetic evidence from freshwater crayfishes that cave adaptation is not an evolutionary dead-end. Evolution (N. Y). 71:2522–2532.
METADATA OF Raw_Data_Supplemental_File.xlsx
Columns are variables; rows are the individual sources from which we gathered relevant life history data. Empty cells (NA) denote cases where a source did not provide the information, as some sources report only part of the life history data we were interested in.
- COLUMN A: species - the crayfish species
- COLUMN B: family - the family the crayfish belongs to, based on the phylogeny
- COLUMN C: n - the sample size behind the reported life history measurements. Some are NA because the exact sample size was not reported.
- COLUMN D: body.size.average - the average value of body length obtained from the literature. Can be two types of lengths: carapace length or post-orbital carapace length. Unit: cm
- COLUMN E: body.size.minimum - the minimum value of body length obtained from the literature. Can be two types of lengths: carapace length or post-orbital carapace length. Unit: cm
- COLUMN F: body.size.maximum - the maximum value of body length obtained from the literature. Can be two types of lengths: carapace length or post-orbital carapace length. Unit: cm
- COLUMN G: body.size.metric - type of body length measurement. Either carapace length (cl) or post-orbital carapace length (ocl).
- COLUMN H: cltuch.average - the average value of clutch size obtained from the literature.
- COLUMN I: clutch.minimum - the minimum value of clutch size obtained from the literature.
- COLUMN J: clutch.maximum - the maximum value of clutch size obtained from the literature.
- COLUMN K: egg.size.average.mm - the average value of egg size obtained from the literature. Unit: mm
- COLUMN L: egg.size.minimum.mm - the minimum value of egg size obtained from the literature. Unit: mm
- COLUMN M: egg.size.maximum.mm - the maximum value of egg size obtained from the literature. Unit: mm
- COLUMN N: body.size.average.from.literature - whether our final body size measurement came directly from the literature source (1) or from the mid-point of the minimum and maximum values reported in the literature (0).
- COLUMN O: body.size.average.from.min.max - whether our final body size measurement came from the mid-point of the minimum and maximum values reported in the literature (1) or directly from the literature source (0).
- COLUMN P: clutch.average.from.literature - whether our final clutch size measurement came directly from the literature source (1) or from the mid-point of the minimum and maximum values reported in the literature (0).
- COLUMN Q: clutch.average.from.min.max - whether our final clutch size measurement came from the mid-point of the minimum and maximum values reported in the literature (1) or directly from the literature source (0).
- COLUMN R: egg.size.average.from.literature - whether our final egg size measurement came directly from the literature source (1) or from the mid-point of the minimum and maximum values reported in the literature (0).
- COLUMN S: egg.size.average.from.min.max - whether our final egg size measurement came from the mid-point of the minimum and maximum values reported in the literature (1) or directly from the literature source (0).
- COLUMN T: primary.data.citation - the source from which we collected the information in columns C-M.
- COLUMN U: primary.data.title - the title of the source from which we collected the information in columns C-M.
- COLUMN V: source - where we found the citation and life history information, whether a specific paper, a previous life history database (i.e., bloomer_database, moore_database; detailed in paper), or a Google Scholar search.
- COLUMN W: body.size.citation - If data on a species' body size was collected from a source other than the primary.data.citation source (which may have had life history information but not body size information), we report it here.
- COLUMN X: body.size.title - The full title of the paper in column W (if mentioned).
- COLUMN Y: webplotdigitizer - whether body size/egg size/egg number data was collected from a figure using WebPlotDigitizer; 1 = yes, 0 = no.
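The flag columns N to S record how each final value was obtained. A minimal Python sketch of that rule follows, assuming that a reported average is used when available and the mid-point of the minimum and maximum otherwise. The function name `final_value` and the example numbers are hypothetical; the authors' own processing was done in R.

```python
def final_value(average, minimum, maximum):
    """Return (value, from_literature, from_min_max) for one trait in one source.

    Assumption: a reported average takes priority; otherwise the mid-point of
    the reported minimum and maximum is used. None stands for an NA cell.
    """
    if average is not None:
        return average, 1, 0
    if minimum is not None and maximum is not None:
        return (minimum + maximum) / 2, 0, 1
    # Nothing usable was reported. How the flags are coded in this case is not
    # stated in the metadata, so None is returned instead of guessing.
    return None, None, None


if __name__ == "__main__":
    # Made-up numbers, not taken from the dataset.
    print(final_value(3.2, 2.5, 4.1))  # a reported average is used as-is
    print(final_value(None, 40, 60))   # only a range: the mid-point is used
```

In this sketch the two flags are always complementary, which matches how each pair of columns (for example N and O) is described.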
METADATA OF LH_data_reorder.csv
Columns are variables; rows are species. We obtained mean values by averaging the observations available for each species. Empty cells (NA) denote species for which we did not find the information (mainly for the egg diameter).
- COLUMN A: species - the crayfish species
- COLUMN B: genus - the genus the species belongs to, based on the phylogeny
- COLUMN C: family - the family the crayfish belongs to, based on the phylogeny
- COLUMN D: body.size.avg - the average value of body length obtained from the literature. Can be two types of lengths: carapace length or post-orbital carapace length. Unit: cm
- COLUMN E: bs.metric - type of body length measurement. Either carapace length (cl) or post-orbital carapace length (ocl)
- COLUMN F: fecundity.avg - mean number of eggs attached to the abdomen of a female crayfish
- COLUMN G: egg.diam.avg - mean diameter of the eggs attached to the abdomen of a female crayfish. Unit: mm
- COLUMN H: notes - any notes about the body size, fecundity or egg diameter data collected from the literature. If the cell contains "null", it means that there are no notes for this species
- COLUMN I: burrow.lough - Morphological classification of burrowing status. Levels: Burrower, Non-burrower
- COLUMN J: traditional.burrowing.classification - Traditional classification of burrowing status. Levels: Primary burrower, Secondary burrower, Tertiary burrower
- COLUMN K: combined.burrowing.classification - Classification that mixes the morphological and traditional classifications. Levels: Semi-terrestrial burrower, Open-water burrower
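The column layout above is enough to sketch a small loader that treats "NA" as missing, treats "null" notes as empty, and checks the burrowing levels. This is a Python illustration only (the analyses use R). The function name `read_life_history` is hypothetical, the sample rows are made-up placeholders, and the exact spelling of the levels inside the file is an assumption based on the metadata.

```python
import csv
import io

NUMERIC = ("body.size.avg", "fecundity.avg", "egg.diam.avg")

# Levels as listed in the metadata; their exact spelling in the file is assumed.
LEVELS = {
    "burrow.lough": {"Burrower", "Non-burrower"},
    "traditional.burrowing.classification": {
        "Primary burrower", "Secondary burrower", "Tertiary burrower"},
    "combined.burrowing.classification": {
        "Semi-terrestrial burrower", "Open-water burrower"},
}


def read_life_history(handle):
    """Parse rows, turning "NA" into None and "null" notes into an empty string."""
    rows = []
    for row in csv.DictReader(handle):
        for column in NUMERIC:
            row[column] = None if row[column] == "NA" else float(row[column])
        if row["notes"] == "null":
            row["notes"] = ""
        for column, allowed in LEVELS.items():
            if row[column] != "NA" and row[column] not in allowed:
                raise ValueError(f"{row['species']}: unexpected {column} {row[column]!r}")
        rows.append(row)
    return rows


# Two placeholder rows with made-up values, only to exercise the parser.
SAMPLE = """species,genus,family,body.size.avg,bs.metric,fecundity.avg,egg.diam.avg,notes,burrow.lough,traditional.burrowing.classification,combined.burrowing.classification
Genus_a species_a,Genus_a,Family_a,3.5,cl,120,NA,null,Burrower,Primary burrower,Semi-terrestrial burrower
Genus_b species_b,Genus_b,Family_b,4.0,ocl,85,2.1,placeholder note,Non-burrower,Tertiary burrower,Open-water burrower
"""

if __name__ == "__main__":
    for record in read_life_history(io.StringIO(SAMPLE)):
        print(record["species"], record["fecundity.avg"], record["egg.diam.avg"])
```

To try it on the real file, pass an open handle to "LH_data_reorder.csv" from one of the "data" folders instead of the sample text. If the file spells a level differently, the check will raise an error rather than silently accept it.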
The code was run in R software v4.2.2.
Packages used:
bayou 2.0 - v2.2.0 - https://github.com/uyedaj/bayou
tidyr - v2.6.2
ape - v5.7-1
geiger - v2.0.10
phytools - v1.5-1
nlme - v3.1.160
scales - v1.2.1
gplots - v3.1.3
phylolm - v2.6.2
plotrix - v3.8-2
foreach - v1.5.2
doParallel - v1.0.17
Acknowledgment
We thank the Palmetto cluster at Clemson University for the allotment of computer time. This material is based on work supported by the National Science Foundation under Grant Nos. MRI# 2024205, MRI# 1725573, and CRI# 2010270.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Sharing/access Information
The file structure and files can be seen and downloaded from:
DRYAD: 10.5061/dryad.9kd51c5qt
ZENODO: https://doi.org/10.5281/zenodo.7915922
GITHUB: https://github.com/alexandrepalaoro/crayfish-egg