Mapping the correlations and gaps in studies of complex life histories
Data files
Jan 23, 2023 version files 1.33 MB
- FINAL_DATASET.xlsx
- README.docx
- README.md
- search_articles.xlsx
- simpsons.csv
- species_list.csv
Feb 26, 2024 version files 1.32 MB
- FINAL_DATASET.xlsx
- README.md
- search_articles.xlsx
- simpsons.csv
- species_list.csv
Abstract
For species with complex life histories, phenotypic correlations between life-history stages constrain both ecological and evolutionary trajectories. Studies that seek to understand correlations across life history differ greatly in their experimental approach: some follow individuals (“individual longitudinal”), while others follow cohorts (“cohort longitudinal”). Cohort longitudinal studies risk confounding results through Simpson’s Paradox, where correlations observed at the cohort level do not match those at the individual level. Individual longitudinal studies are laborious in comparison, but provide a more reliable test of correlations across life-history stages. Our understanding of the prevalence, strength and direction of phenotypic correlations depends on the approaches that we use, but the relative representation of different approaches is unknown. Using marine invertebrates as a model group, we used a formal, systematic literature map to screen 17,000+ papers studying complex life histories, and characterized the study type (i.e. cohort longitudinal, individual longitudinal or single stage), as well as other factors. Of 3,315 experiments from 1,716 articles, most (67.4%) focused on a single stage, and just 1.7% used an individual longitudinal approach. While life-history stages have been studied extensively, we suggest that the field prioritize individual longitudinal studies to understand the phenotypic correlations among stages.
README
Mapping the correlations and gaps in studies of complex life histories
This study is a literature map of marine invertebrate life histories. We have collated published articles and summarise their experimental methods as a means to evaluate the state of our field.
This document includes a description of all data files:
(1) FINAL_DATASET.xlsx--all data collected from over 3,000 published experiments in the literature
*light red shading is data summarised from main data source (within each tab these are labeled as Tables)
*light blue shading is data used in manuscript, tables or figures
(2) species_list.csv--species names given in each article were checked against the GBIF database to find species synonyms (i.e. cases where a single species was referred to by more than one name in the literature)
(3) simpsons.csv--used to generate Figure 1, a depiction of Simpson's Paradox, a central theme of the manuscript
(4) search_articles.xlsx--all articles accepted and rejected through literature mapping process
(1) FINAL_DATASET.xlsx
Tabs
all data
- Phylum
- Class
- Original species name
- this is the species name found in the article
- Gbif species name
- this is the species name found in the GBIF database; it may match the original species name, or differ if the species has a synonym
- Columns E to J
- 'y' (stage present in experiment)/ '-' (stage absent in experiment)
- F0_adult
- Embryo
- Larvae
- Metamorphosis
- Post-metamorphosis/juvenile
- F1_adult
- Lab_field
- where experiment was conducted
- L = lab
- F = field
- Study type
- S = Single stage (measure one stage)
- CL = Cohort longitudinal (measure more than one stage, replication level is cohort)
- IL = Individual longitudinal (measure more than one stage, replication level is individual)
- Dev_mode
- Development mode
- P = planktotrophic (long-lived, feeding larvae)
- L = lecithotrophic (short-lived, non-feeding larvae)
- D = direct development (crawl-away juveniles)
- Trait_measured
- traits measured in experiment (Table A2 in Appendix lists accepted traits for this map)
- Reference
- Collected
- Primary reviewer that collected data
- Search (see Methods, Appendix Table A4)
- 1 = conducted in 2014
- 2 = conducted in 2014
- 3 = conducted in 2015
- 4 = conducted in 2021
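The coding scheme of the 'all data' tab can be unpacked programmatically. The following is a minimal sketch in Python, assuming one row of the tab has already been read into a dictionary keyed by the column names listed in this README. The exact header spellings in the spreadsheet, the helper `decode_row` and the example row are illustrative assumptions, not part of the dataset.

```python
# Stage columns E to J, in life-history order (names as listed in this README;
# the exact spreadsheet headers are an assumption).
STAGES = ["F0_adult", "Embryo", "Larvae", "Metamorphosis",
          "Post-metamorphosis/juvenile", "F1_adult"]

LAB_FIELD = {"L": "lab", "F": "field"}
STUDY_TYPE = {"S": "single stage",
              "CL": "cohort longitudinal",
              "IL": "individual longitudinal"}
DEV_MODE = {"P": "planktotrophic",
            "L": "lecithotrophic",
            "D": "direct development"}


def decode_row(row):
    """Translate the codes of a single experiment into readable values."""
    return {
        # A stage counts as measured when its cell holds 'y'.
        "stages": [s for s in STAGES if row.get(s, "-").strip().lower() == "y"],
        # Codes not listed in the README fall back to 'unknown'.
        "setting": LAB_FIELD.get(row["Lab_field"].strip(), "unknown"),
        "study_type": STUDY_TYPE.get(row["Study type"].strip(), "unknown"),
        "dev_mode": DEV_MODE.get(row["Dev_mode"].strip(), "unknown"),
    }


# Hypothetical row, invented for illustration only.
example = {"F0_adult": "-", "Embryo": "y", "Larvae": "y", "Metamorphosis": "-",
           "Post-metamorphosis/juvenile": "-", "F1_adult": "-",
           "Lab_field": "L", "Study type": "CL", "Dev_mode": "P"}
decoded = decode_row(example)
print(decoded)
```

Note that `L` means "lab" in Lab_field but "lecithotrophic" in Dev_mode, which is why the sketch keeps a separate lookup table per column.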
study type
- Data used for Figure 2B
- Number of studies for each study type
field vs. lab
- Number of studies conducted in field vs laboratory
single stage
- Data used for Figure 3A
- number of single-stage studies that measured each stage
cohort longitudinal--counts for this experimental method
- Table 1--light gray diagonal used in Figure 3B
- Diagonal is number of studies that begin with each stage.
- Rest of row is the number of studies that also measure other stages
- Table 2--stages measured in each study
- 'y' (stage present in experiment)/ '-' (stage absent in experiment)
- Table 3--summary of Table 2
- number of studies that measured each stage combination. Middle is counts for every combination of stage transition
- Table 4--Data used for Figure 4A
- From Table 2, the number of studies that measure sequential stages
- rows show studies that start at each stage and measure stages sequentially after
- Table 5--Data used for Figure 4B
- From Table 3, number of studies that measure stages that are not sequential
individual longitudinal--counts for this experimental method
- Table 1--light gray diagonal used in Figure 3C
- Diagonal is number of studies that begin with each stage.
- Rest of row is the number of studies that also measure other stages
- Table 2--stages measured in each study
- 'y' (stage present in experiment)/ '-' (stage absent in experiment)
- Table 3--summary of Table 2
- number of studies that measured each stage combination. Middle is counts for every combination of stage transition
- Table 4--Data used for Figure 4C
- From Table 2, the number of studies that measure sequential stages
- rows show studies that start at each stage and measure stages sequentially after
- Table 5--Data used for Figure 4D
- From Table 3, number of studies that measure stages that are not sequential
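Tables 1, 4 and 5 of the cohort longitudinal and individual longitudinal tabs are derived from the 'y'/'-' stage flags in Table 2. A minimal sketch of that derivation follows, under one reading of this README: a study "begins with" the earliest stage it measures, and its stages are "sequential" when they form an unbroken run in life-history order. Both the interpretation and the function names are assumptions made for illustration, and the example flags are invented.

```python
# Life-history order of the six stage columns (names as listed in this README).
STAGES = ["F0_adult", "Embryo", "Larvae", "Metamorphosis",
          "Post-metamorphosis/juvenile", "F1_adult"]


def first_stage(flags):
    """Return the earliest measured stage, or None if no stage is flagged."""
    for stage, flag in zip(STAGES, flags):
        if flag == "y":
            return stage
    return None


def is_sequential(flags):
    """True when two or more stages are measured and none is skipped
    between the first and the last measured stage (assumed definition)."""
    idx = [i for i, flag in enumerate(flags) if flag == "y"]
    return len(idx) >= 2 and idx[-1] - idx[0] == len(idx) - 1


# Hypothetical studies, invented for illustration only.
embryo_to_metamorphosis = ["-", "y", "y", "y", "-", "-"]
adult_and_larvae_only = ["y", "-", "y", "-", "-", "-"]

print(first_stage(embryo_to_metamorphosis), is_sequential(embryo_to_metamorphosis))
print(first_stage(adult_and_larvae_only), is_sequential(adult_and_larvae_only))
```

Under this reading, the first study would be counted among the sequential studies (Table 4) in the row for Embryo, and the second among the non-sequential studies (Table 5).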
direct development--counts for species with direct development only (Appendix)
- Table 1--Data used for Figure A2B
- Number of studies for each study type
- Table 2--Data used for Figure A3A
- number of single-stage studies that measured each stage
- Tables 3-7--cohort longitudinal counts
- Table 3--light gray diagonal used in Figure A3B
- Diagonal is number of studies that begin with each stage.
- Rest of row is the number of studies that also measure other stages
- Table 4--stages measured in each study
- 'y' (stage present in experiment)/ '-' (stage absent in experiment)
- Table 5--summary of Table 4
- number of studies that measured each stage combination. Middle is counts for every combination of stage transition
- Table 6--Data used for Figure A4A
- From Table 4, the number of studies that measure sequential stages
- rows show studies that start at each stage and measure stages sequentially after
- Table 7--Data used for Figure A4B
- From Table 5, number of studies that measure stages that are not sequential
- Tables 8-11--individual longitudinal counts
- Table 8--light gray diagonal used in Figure A3B
- Diagonal is number of studies that begin with each stage.
- Rest of row is the number of studies that also measure other stages
- Table 9--stages measured in each study
- 'y' (stage present in experiment)/ '-' (stage absent in experiment)
- Table 10--summary of Table 9
- number of studies that measured each stage combination. Middle is counts for every combination of stage transition
- Table 11--Data used for Figure A4C
- From Table 9, the number of studies that measure sequential stages
- rows show studies that start at each stage and measure stages sequentially after
taxonomy--data used for Figure 5
- Counts and percentages for phyla represented in study (Figure 5A) and ten most common species (Figure 5B)
development mode
- Table 1) Development mode by number of experiments
- A single study can have more than one experiment for one species (this map has 3315 experiments)
- Table 2) Development mode and study type by number of experiments
- Table 3) Development mode by unique combinations of development mode, species and reference 'by study'
- An article with multiple experiments on one species may inflate frequencies of development mode.
- When accounting for articles that include multiple experiments on one species, there are 2944 unique studies
- Table 4) Compare this map to Marshall et al. (2012), The biogeography of marine invertebrate life histories. Annual Review of Ecology, Evolution, and Systematics 43: 97-114.
- Table 5) Table 4 as percent
- Percent misrepresentation is the percent difference between this map and the Marshall et al. (2012) estimates
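The difference between counting development mode "by experiment" and "by study" can be sketched as follows. This is a minimal illustration in Python, assuming each experiment is reduced to a (development mode, species, reference) triple; the records are hypothetical placeholders, not rows from FINAL_DATASET.xlsx.

```python
from collections import Counter

# Hypothetical experiment records: (development mode, species, reference).
experiments = [
    ("P", "Species A", "Reference 1"),
    ("P", "Species A", "Reference 1"),  # second experiment, same species and article
    ("L", "Species B", "Reference 1"),
    ("D", "Species C", "Reference 2"),
]

# By experiment: every record counts, so repeated experiments inflate a mode.
by_experiment = Counter(mode for mode, _, _ in experiments)

# By study: keep each unique (mode, species, reference) combination once.
unique_studies = set(experiments)
by_study = Counter(mode for mode, _, _ in unique_studies)

print(len(experiments), dict(by_experiment))
print(len(unique_studies), dict(by_study))
```

In this toy example, four experiments collapse to three unique studies, and the planktotrophic count drops from two to one.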
resolved species
- Original species name--species name taken from published study
- Gbif species name--species name matched with each original name, using the GBIF database
- Columns C and D of 'all data' tab
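Given the two columns of this tab, species that appear under more than one name in the literature can be found by grouping original names under their GBIF name. The sketch below is a minimal Python illustration of that grouping; the name pairs are invented placeholders, and the lookup against GBIF itself (done in the authors' R file) is not reproduced here.

```python
from collections import defaultdict

# Hypothetical (original name, GBIF name) pairs; not real taxa.
pairs = [
    ("Examplea alpha", "Examplea alpha"),
    ("Oldgenus alpha", "Examplea alpha"),  # synonym resolved to the same GBIF name
    ("Examplea beta", "Examplea beta"),
]

# Collect every original name reported for each GBIF name.
names_by_species = defaultdict(set)
for original, gbif in pairs:
    names_by_species[gbif].add(original)

# Keep only species that were referred to by more than one name.
synonymised = {gbif: sorted(names)
               for gbif, names in names_by_species.items()
               if len(names) > 1}
print(synonymised)
```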
(2) species_list.csv
Used to search GBIF database (see R file)
- Phylum
- Class
- Species_name
(3) simpsons.csv
Used to generate Figure 1 main text
- Line--points/lines shown in different colours in panels b and c; labelled as Cohorts 1, 2 and 3 in legend
- x--embryo size values
- y--larval size values
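To show the kind of pattern a file with this layout can depict, the sketch below computes the correlation between embryo size and larval size within each cohort and across cohort means. It is a minimal Python illustration that assumes the column headers are `Line`, `x` and `y`; the values are made up to produce a clear reversal and are not taken from simpsons.csv.

```python
import csv
import io
from collections import defaultdict
from math import sqrt

# Made-up values in the assumed layout of simpsons.csv (Line, x, y).
TEXT = """Line,x,y
1,1,3
1,2,2
1,3,1
2,4,6
2,5,5
2,6,4
3,7,9
3,8,8
3,9,7
"""


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Group embryo sizes (x) and larval sizes (y) by cohort (Line).
cohorts = defaultdict(lambda: ([], []))
for row in csv.DictReader(io.StringIO(TEXT)):
    xs, ys = cohorts[row["Line"]]
    xs.append(float(row["x"]))
    ys.append(float(row["y"]))

# Individual level: correlation among individuals within each cohort.
within = {line: pearson(xs, ys) for line, (xs, ys) in cohorts.items()}

# Cohort level: correlation among cohort means.
mean_x = [sum(xs) / len(xs) for xs, _ in cohorts.values()]
mean_y = [sum(ys) / len(ys) for _, ys in cohorts.values()]
cohort_level = pearson(mean_x, mean_y)

print(within)
print(cohort_level)
```

In this toy data the correlation is negative within every cohort but positive across cohort means, which is the mismatch between cohort-level and individual-level correlations that Simpson's Paradox describes.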
(4) search_articles.xlsx
- List of articles accepted for literature map and articles rejected
- Search (Year)--corresponds with searches 1-4 described in methods
- Article--full citation