Investigating historical drivers of latitudinal gradients in polyploid plant biogeography: A multi-clade perspective

Hagen, Eric 1; Vasconcelos, Thais 2; Boyko, James 2; Beaulieu, Jeremy 3

Published May 30, 2024 on Dryad. https://doi.org/10.5061/dryad.hx3ffbgnm

Data files

May 30, 2024 version files (88.13 MB):

CE_out.zip (77.24 KB)
gbif_points.zip (5.15 MB)
Paleoclim.zip (82.82 MB)
ploidy.zip (19.83 KB)
README.md (7.36 KB)
trees.zip (47.95 KB)

Abstract

Premise of the Study

The proportion of polyploid plants in a community increases with latitude, and different hypotheses have been proposed about which factors drive this pattern. Here, we aim to understand the historical causes of the latitudinal polyploidy gradient using a combination of ancestral state reconstruction methods. Specifically, we assess whether (1) polyploidization enables movement to higher latitudes (i.e., polyploidization precedes occurrences in higher latitudes) or (2) higher latitudes facilitate polyploidization (i.e., occurrence in higher latitudes precedes polyploidization).

Methods

We reconstruct the ploidy states and ancestral niches of 1,032 angiosperm species at four paleoclimatic time slices ranging from 3.3 million years ago to the present, comprising taxa from four well-represented clades: Onagraceae, Primulaceae, Solanum (Solanaceae), and Pooideae (Poaceae). We use ancestral niche reconstruction models alongside a customized discrete character evolution model to allow reconstruction of states at specific time slices. Patterns of latitudinal movement are reconstructed and compared in relation to inferred ploidy shifts.

Key Results

We find that no single hypothesis applies equally well across all analyzed clades. While significant differences in median latitudinal occurrence were detected in the largest clade, Poaceae, no significant differences were detected in latitudinal movement in any clade.

Conclusions

Our preliminary study is the first to attempt to connect ploidy changes to continuous latitudinal movement, but we cannot favor one hypothesis over another. Given that patterns seem to be clade-specific, a larger number of clades must be analyzed in future studies for generalities to be drawn.

Description of the data and file structure

Contained within the

The dataset also contains six code files, meant to be run in order using inputs provided in this repository (order indicated by the beginnings of R file names, 01 through 06). 00_utility_functions.R contains functions necessary to execute the code in other files. The other code files are as follows:

01_assembling_ploidy_data.R: Assembles clean ploidy data from the raw files (included in this dataset).
02_organizing_files.R: Assembles occurrence-point and ploidy datasets for each of the four clades included in our study; also prunes each clade's phylogeny to the taxa that have both ploidy data and sufficient occurrence data.
03_present_day_sprich.R: Assembles a species richness raster map from occurrence points.
04_running_machuruku.R: Runs machuruku range reconstructions at the paleoclimatic time slices, based on the climatic variables that characterize the distribution of each lineage included in the input phylogeny.
05_running_machuruku_part2.R: Runs modified corHMM reconstructions of ploidy at each paleoclimatic time slice; also runs machuruku reconstructions for individual taxa so that the median latitude and longitude can be extracted from each individual reconstructed range.
06_making_plots.R: Creates the plots included in our manuscript.
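
The run order implied by the file-name prefixes can be recovered programmatically. The following is a minimal Python sketch, not part of the repository (whose scripts are written in R); the helper name run_order is hypothetical, and the only thing taken from this README is the list of script names.

```python
import re

def run_order(filenames):
    """Return the R scripts that carry a two-digit prefix, sorted by that prefix."""
    numbered = [f for f in filenames if re.match(r"^\d{2}_.+\.R$", f)]
    return sorted(numbered, key=lambda f: int(f[:2]))

# Script names as listed in this README, deliberately shuffled.
scripts = [
    "06_making_plots.R",
    "02_organizing_files.R",
    "00_utility_functions.R",
    "05_running_machuruku_part2.R",
    "01_assembling_ploidy_data.R",
    "04_running_machuruku.R",
    "03_present_day_sprich.R",
]

ordered = run_order(scripts)
for name in ordered:
    print(name)
```

Note that 00_utility_functions.R sorts first because it only defines helper functions; the analysis steps themselves are 01 through 06.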

The folder “CE_out” contains 113 raw genus-level ploidy data CSV files, with one file per genus (e.g., AegilopsPloidy.csv, AgropyronPloidy.csv, etc.). Each genus falls within one of the four clades of interest in our manuscript. Ploidy inferences used in our manuscript come from the "Ploidy inference" column.
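
As an illustration of how one genus-level file might be read, here is a minimal Python sketch (the repository's own cleaning step is the R script 01_assembling_ploidy_data.R, which this does not reproduce). Only the "Ploidy inference" column name and the "<Genus>Ploidy.csv" naming pattern come from this README; the "Taxon" column, the taxon labels, and the cell values in the demo are hypothetical placeholders.

```python
import csv
import io

PLOIDY_COLUMN = "Ploidy inference"  # column name stated in this README

def genus_from_filename(filename):
    """'AegilopsPloidy.csv' -> 'Aegilops' (pattern taken from the README examples)."""
    suffix = "Ploidy.csv"
    if not filename.endswith(suffix):
        raise ValueError(f"unexpected file name: {filename}")
    return filename[: -len(suffix)]

def read_ploidy(handle, taxon_column="Taxon"):
    """Return (taxon, ploidy inference) pairs from one genus-level CSV.

    taxon_column is an assumption; check the real header before use.
    """
    return [(row[taxon_column], row[PLOIDY_COLUMN]) for row in csv.DictReader(handle)]

# Hypothetical in-memory stand-in for a real file such as AegilopsPloidy.csv.
demo = io.StringIO(
    "Taxon,Ploidy inference\n"
    "Aegilops sp. A,example_state_1\n"
    "Aegilops sp. B,example_state_2\n"
)

records = read_ploidy(demo)
genus = genus_from_filename("AegilopsPloidy.csv")
print(genus, records)
```

To read a real file, replace the io.StringIO object with open(path, newline="") and adjust taxon_column to match the actual header.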

The folder "gbif_points" contains four CSV files (one for each of our four clades of interest) containing occurrence points downloaded from the Global Biodiversity Information Facility (GBIF). Each row represents a single occurrence point, with column entries for latitude and longitude.
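
A simple summary that can be computed from such a table is the median latitude of each species' occurrence points. The Python sketch below is an illustration only, not the authors' workflow (which extracts medians from reconstructed ranges in R); the column names "species", "latitude", and "longitude" and all demo rows are assumptions.

```python
import csv
import io
from collections import defaultdict
from statistics import median

def median_latitude_by_species(handle, species_col="species", lat_col="latitude"):
    """Group occurrence rows by species and return each group's median latitude."""
    latitudes = defaultdict(list)
    for row in csv.DictReader(handle):
        latitudes[row[species_col]].append(float(row[lat_col]))
    return {species: median(values) for species, values in latitudes.items()}

# Hypothetical occurrence points; one row per point, as described above.
demo = io.StringIO(
    "species,latitude,longitude\n"
    "sp_a,10.0,-70.0\n"
    "sp_a,20.0,-71.0\n"
    "sp_a,40.0,-72.0\n"
    "sp_b,-5.0,30.0\n"
    "sp_b,-15.0,31.0\n"
)

result = median_latitude_by_species(demo)
print(result)
```

The median is used here rather than the mean because it is less sensitive to a few outlying occurrence records.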

The folder "Paleoclim" contains files needed for reconstructing climatic data at various time slices. There is one folder for each time slice, as follows: 1_cur_CHELSA_V1_2B_r10m (present-day), 2_LGM_chelsa_v1_2B_r10m (the Last Glacial Maximum c. 21 thousand years ago [ka]), 3_LIG_v1_10m (Last Interglacial c. 130 ka), 4_787ka_MIS19_v1_r10m (Marine Isotope Stage 19 c. 787 ka), 5_3.205Ma_mPWP_v1_r10m (mid-Pliocene Warm Period c. 3.205 Ma), and 6_3.3Ma_M2_v1_r10m (Marine Isotope Stage M2 c. 3.3 Ma).
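
For scripting against these folders, the folder names and their approximate ages can be kept in a small lookup table. This is a minimal Python sketch under the assumption that ages are expressed in thousands of years (ka); the folder names and ages are copied from the description above, while the variable and function names are hypothetical.

```python
# Approximate age of each time slice in thousands of years (ka),
# keyed by the folder names listed in this README.
TIME_SLICES_KA = {
    "1_cur_CHELSA_V1_2B_r10m": 0.0,      # present-day
    "2_LGM_chelsa_v1_2B_r10m": 21.0,     # Last Glacial Maximum
    "3_LIG_v1_10m": 130.0,               # Last Interglacial
    "4_787ka_MIS19_v1_r10m": 787.0,      # Marine Isotope Stage 19
    "5_3.205Ma_mPWP_v1_r10m": 3205.0,    # mid-Pliocene Warm Period
    "6_3.3Ma_M2_v1_r10m": 3300.0,        # Marine Isotope Stage M2
}

def slices_oldest_first():
    """Return the folder names ordered from the oldest slice to the present."""
    return sorted(TIME_SLICES_KA, key=TIME_SLICES_KA.get, reverse=True)

for folder in slices_oldest_first():
    print(f"{folder}: {TIME_SLICES_KA[folder]:g} ka")
```

The leading digit in each folder name already orders the slices from youngest (1) to oldest (6), so sorting by name and sorting by age give the same sequence.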

The folder "ploidy" contains four CSV files (one for each of our four clades of interest) containing ploidy inferences for individual species. This is a cleaned-up version of the single-genus files contained within the folder "CE_out."

The folder "Trees" contains four .tre files (one for each of our four clades of interest) containing molecular phylogenies of each clade.

Sharing/access information

Data derived from other sources are listed below:

Ploidy data (Rice et al. 2019; “The global biogeography of polyploid plants”)
GBIF data (GBIF.org, 23 July 2021, GBIF Occurrence Downloads): https://doi.org/10.15468/dl.pw2qns; https://doi.org/10.15468/dl.yesy2v; https://doi.org/10.15468/dl.vqm9q3; https://doi.org/10.15468/dl.gqu424; https://doi.org/10.15468/dl.3ucjgk; https://doi.org/10.15468/dl.78shpr; https://doi.org/10.15468/dl.f9pq57
Onagraceae phylogeny (Freyman and Höhna 2019; “Stochastic character mapping of state-dependent diversification reveals the tempo of evolutionary decline in self-compatible Onagraceae lineages”)
Primulaceae phylogeny (De Vos et al. 2014; “Small and ugly? Phylogenetic analyses of the ‘selfing syndrome’ reveal complex evolutionary fates of monomorphic primrose flowers”)
Solanaceae phylogeny (Särkinen et al. 2013; “A phylogenetic framework for evolutionary study of the nightshades (Solanaceae): a dated 1000-tip tree”)
Poaceae phylogeny (Spriggs et al. 2014; “C4 Photosynthesis Promoted Species Diversification during the Miocene Grassland Expansion”)
PaleoClim data from the LIG (Last Interglacial, c. 130 ka), MIS19 (Marine Isotope Stage 19, c. 787 ka), mPWP (mid-Pliocene Warm Period, c. 3.205 Ma), and M2 (Marine Isotope Stage M2, c. 3.3 Ma), all using the spatial resolution of 10 arc-minutes (Brown et al. 2018; “PaleoClim, high spatial resolution paleoclimate surfaces for global land areas”)

Code/Software

To run the code files, you will need the following packages (versions used during the production of this dataset are also provided):
1. raster (3.6.26)
2. ape (5.6.2)
3. taxize (0.9.100)
4. rgbif (3.7.3)
5. maptools (1.1.4)
6. sp (1.5.0)
7. rgeos (0.5.9)
8. rworldmap (1.3.6)
9. data.table (1.14.2)
10. terra (1.7.55)
11. dismo (1.3.9)
12. usdm (1.1.18)
13. phytools (1.2.0)
14. stringr (1.5.0)
15. stringi (1.7.8)
16. machuruku (1.8.3)
17. corHMM (2.8)
18. geiger (2.0.10)
19. parallel (4.2.1)
20. MASS (7.3.58.1)
21. Peacock.test (1.0)
22. dispRity (1.7.0)
23. paleotree (3.4.5)
24. dplyr (1.1.3)
25. castor (1.7.3)
