Predicting global intraspecific trait variation of grasses
Data files
Mar 20, 2025 version files 1.09 GB
- Code_for_Ecography_1_Random_forest_models_for_six_traits.R (26.99 KB)
- Code_for_Ecography_2_Range_models.R (9.43 KB)
- Code_for_Ecography_3_RF_to_rangemaps.R (9.11 KB)
- Code_for_Ecography_4_Map_script.R (5.46 KB)
- Code_for_Ecography_5_Map_of_species_richness_and_dtrait_(Brody).R (18.35 KB)
- Code_for_Ecography_6_loop_for_species_specific_maps.R (4.62 KB)
- data_files_for_ecography.zip (1.88 MB)
- output_maps-20250313T182354Z-001.zip (1.08 GB)
- README.md (1.03 KB)
Mar 20, 2025 version files 4.54 GB
- Code_for_Ecography_1_Random_forest_models_for_six_traits.R (26.99 KB)
- Code_for_Ecography_2_Range_models.R (9.43 KB)
- Code_for_Ecography_3_RF_to_rangemaps.R (9.11 KB)
- Code_for_Ecography_4_Map_script.R (5.46 KB)
- Code_for_Ecography_5_Map_of_species_richness_and_dtrait_(Brody).R (18.35 KB)
- Code_for_Ecography_6_loop_for_species_specific_maps.R (4.62 KB)
- data_files_for_ecography_UPDATED_03.20.25.zip (3.25 MB)
- output_maps-20250313T182354Z-001.zip (1.08 GB)
- rasterstacks.zip (3.45 GB)
- README.md (1.54 KB)
Abstract
Plant traits are important for understanding community assembly and ecosystem processes, yet our understanding of intraspecific trait variation (ITV) is limited. This gap in our knowledge is partially because collecting trait data across a species’ entire range is impractical, let alone across the ranges of multiple species within a plant family. Using machine learning techniques to predict spatial ITV is an attractive and cost-effective alternative to sampling across a species range, although this has not been applied beyond regional scales. We compiled a trait database of over 1,000 grass species (family: Poaceae), encompassing six key functional traits: specific leaf area (SLA), leaf dry matter content (LDMC), plant height, leaf area, leaf nitrogen (Nmass) and leaf phosphorus content (Pmass). Using a random forest machine learning approach, we predicted local trait values within species' ranges considering climate, soil type, phylogeny, lifespan, and photosynthetic pathway as influential factors. An iterative random forest modeling technique incorporated correlations between traits, resulting in improved model performance (observed vs. predicted R² range of 0.72 - 0.91). Our models also highlight the importance of climate in predicting trait variation. For a subset of species (n = 860), we projected trait predictions across their known distribution, informed by expert maps from Kew Botanical Gardens, to create global maps of ITV for grasses. Such maps have the potential to inform conservation efforts and predictions of grazing and fire dynamics in grasslands worldwide. Overall, our research demonstrates the value and ecological applications of predicting plant traits.
Here you can find the data and code required to run the analyses described in the manuscript “Predicting global intraspecific trait variation of grasses”.
The redacted dataset is provided as “Data for random forestsredacted_v2.csv” in the zipped data files folder. Note that this dataset does not include any trait data from TRY (https://www.try-db.org/TryWeb/Home.php), per their data sharing agreement.
To fully understand the dataset and different column titles, refer to the Metadata in the same folder.
The analysis comprises six R scripts, which should be run in the order given by their names. Start with “Code for Ecography 1…”. This script reads in the dataset and produces additional data files, which are saved to your working directory and read by the later scripts. After running this first script, proceed to “Code for Ecography 2…” and so on.
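As a minimal sketch of the run order described above, the scripts can be sorted by the number in their file names and sourced one after another. The helper `order_scripts` is hypothetical and not part of the deposited code, and the sketch assumes that the six scripts and the unzipped data files sit in the current working directory.

```r
# Hypothetical helper (not part of the deposited code): sort the analysis
# scripts by the step number embedded in their file names.
order_scripts <- function(files) {
  # Keep only files that follow the "Code_for_Ecography_<n>_..." pattern.
  pattern <- "^Code_for_Ecography_([0-9]+)_.*\\.R$"
  scripts <- files[grepl(pattern, files)]
  # Extract the step number and sort numerically rather than alphabetically.
  step <- as.integer(sub(pattern, "\\1", scripts))
  scripts[order(step)]
}

# Assumption: scripts and data are in the current working directory.
scripts <- order_scripts(list.files(pattern = "\\.R$"))
for (s in scripts) {
  message("Running ", s)
  source(s)  # each script writes files that the later scripts read
}
```

Running the scripts one at a time by hand is equivalent. The loop only makes the order explicit, and it stops at the first script that raises an error.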
Also included are the 860 maps of intraspecific trait variation produced with the code.
CHANGE LOG
March 20, 2025: Added the raster stacks used to make the species-specific trait maps. Also revised the redacted dataset to include rows with TRY data, in which trait records are now entered as NA. This retains the species information and the climate/soil data for each location while honoring the TRY agreement not to share the actual data. If you have permission from TRY, you may use the source information, species information, and coordinates to fill in the rows of redacted data.
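For users who do have permission from TRY, the sketch below shows one possible way to fill the NA trait records by matching on species and coordinates. Everything in it is an assumption made for illustration: the column names (`species`, `latitude`, `longitude`, `SLA`), the toy values, and the helper `fill_redacted` are invented, and the real column titles should be taken from the Metadata.

```r
# Toy stand-in for the redacted dataset (hypothetical columns, invented values).
redacted <- data.frame(
  species   = c("Poa pratensis", "Poa pratensis", "Festuca rubra"),
  latitude  = c(40.1, 52.3, 47.0),
  longitude = c(-105.2, 5.1, 8.5),
  SLA       = c(21.4, NA, NA)
)

# Toy stand-in for records obtained directly from TRY (invented values).
try_records <- data.frame(
  species   = "Poa pratensis",
  latitude  = 52.3,
  longitude = 5.1,
  SLA       = 25.0
)

# Hypothetical helper: fill NA values of one trait from matching TRY records.
fill_redacted <- function(redacted, try_records, keys, trait) {
  # Build one matching key per row from the identifying columns.
  # Assumes coordinates are recorded identically in both tables.
  key_red <- do.call(paste, c(redacted[keys], sep = "|"))
  key_try <- do.call(paste, c(try_records[keys], sep = "|"))
  # Position of the first matching TRY record for each row (NA if none).
  hit <- match(key_red, key_try)
  # Overwrite only values that are NA and have a matching TRY record.
  fill <- is.na(redacted[[trait]]) & !is.na(hit)
  redacted[[trait]][fill] <- try_records[[trait]][hit[fill]]
  redacted
}

filled <- fill_redacted(redacted, try_records,
                        keys = c("species", "latitude", "longitude"),
                        trait = "SLA")
```

Rows without a matching TRY record keep their NA, and existing non-NA values are never overwritten. If one location has several TRY records, `match` uses only the first, so such cases would need to be handled separately.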