Chemical properties of foliar metabolomes represent a key axis of functional trait variation in forests of the tropical Andes
Data files
Nov 13, 2025 version files 50.72 MB
- Chadwick_ProcB_10.1098_rspb.2025.1721_DatasetS1.csv (230.45 KB)
- Chadwick_ProcB_10.1098_rspb.2025.1721_DatasetS2.csv (4.32 KB)
- Chadwick_ProcB_10.1098_rspb.2025.1721_DatasetS3.tre (1.25 MB)
- Chadwick_ProcB_10.1098_rspb.2025.1721_DatasetS4.csv (49.14 MB)
- Chadwick_ProcB_10.1098_rspb.2025.1721.R (75.62 KB)
- DisciplineSpecificMetadata.json (8.50 KB)
- README.md (23.07 KB)
Abstract
Plants interact with their environment through diverse specialized metabolites that protect them from abiotic stressors like drought or radiation and biotic stressors like herbivores or pathogens. However, few studies have considered the chemical properties of metabolites as a potential axis of functional trait variation along environmental gradients. Here, we examined how the chemical properties of foliar metabolomes, such as mean aromaticity, hydrophobicity, and polarity, as well as commonly used morphological traits, vary with climate and elevation among 16 forest plots in the tropical Andes of Bolivia. We found that chemical properties were weakly related to morphological traits among tree species, yet both varied significantly with climate and elevation. In particular, abundance-weighted mean hydrophobicity decreased, and polar surface area increased with elevation and in colder and drier climates. Additionally, co-occurring species showed increasing chemical similarity with elevation for the most-aromatic and most-polar metabolites. These results suggest that abiotic stress associated with colder, drier climates and solar radiation acts as a filter for metabolome chemical properties. This contrasts with chemical dissimilarity observed at lower elevations, which is likely driven by pressure from host-specialized enemies in warmer, wetter climates. Our results introduce the possibility that chemical defenses may be constrained by abiotic stressors. Morphological traits and foliar metabolome chemical properties for each species-by-plot are reported in Dataset S1. Community-weighted mean values are reported in Dataset S2. The structural similarities among 20,571 metabolites are reported as a Qemistree dendrogram in .tre phylogeny format as Dataset S3. Masses, molecular formulae, predicted structures, classifications, and chemical properties and sample-level abundances for 20,571 unique metabolites are provided in Dataset S4.
Dataset DOI: 10.5061/dryad.2rbnzs83c
Description of the data and file structure
Files and variables
File: Chadwick_ProcB_10.1098_rspb.2025.1721_DatasetS1.csv
Description: Metadata file that provides nomenclature and trait data for every species-by-plot (i.e. a species that occurs in n plots is represented n times), including mean metabolome chemical properties, morphological trait values, and abundance-weighted chemical similarity to co-occurring species with respect to metabolites that represent the upper or lower quartile of metabolites for particular chemical properties. NA indicates data not available in the Madidi Project morphological traits dataset.
Variables
- sampleCode: Name of the LC-MS data file
- speciesCode: Six-character species code or morpho-species code
- Plot: Madidi Project forest plot
- Species_binomial: Species Latin binomial
- Genus: Botanical genus
- Family: Botanical family
- Abundance: Number of individuals in the forest plot
- nAtomP: The number of atoms in the largest aromatic, conjugated pi system, which is high in pigments and light-absorbing molecules that function to protect plants from ultraviolet radiation.
- ALogP: Hydrophobicity measured as the log-ratio of solubility in octanol versus water; may be related to a plant-defense spectrum defined by unsaturated, aromatic, and nonpolar metabolites versus saturated and polar metabolites; negatively correlated with polarity and hence desiccation resistance.
- TopoPSA: Topological polar surface area, the sum of the surface area of polar atoms in square Ångstroms (Å²), which may contribute to desiccation resistance, but is negatively correlated with passive transport through cell membranes.
- Fsp3: The fraction of carbon atoms with sp3-hybridized orbitals (i.e. only single bonds) to the total number of carbon atoms in the molecule, which is positively correlated with melting point, solubility, and the likelihood of a compound to exhibit bioactivity in pharmaceutical assays.
- MW: Molecular weight in Daltons (Da); greatest in peptides and lignans, may be related to leaf longevity because peptides and lignans function as long-lived physical and storage structures.
- SLA: Specific leaf area; a leaf economics-resource capture trait associated with high photosynthetic rates, high relative growth rates, low carbon investment in lignin or tannins, and resource-rich environments. NA indicates data not available in the Madidi Project morphological traits dataset.
- LeafArea: Leaf area; associated with a resource-acquisitive life history strategy. NA indicates data not available in the Madidi Project morphological traits dataset.
- LeafThickness: Leaf thickness; associated with tolerance to disturbance and nutrient stress and a resource-conservative life history strategy. NA indicates data not available in the Madidi Project morphological traits dataset.
- TwigBarkThickness_Relative: Twig bark thickness relative to stem diameter; associated with protection from attack by pathogens and herbivores. NA indicates data not available in the Madidi Project morphological traits dataset.
- TwigSpecDens: Twig specific density; associated with low relative growth rates, high survival, and resistance to pathogens, herbivores or physical damage. NA indicates data not available in the Madidi Project morphological traits dataset.
- nAtomP_75: Abundance-weighted CSCS chemical similarity to co-occurring tree species with respect to the upper quartile of metabolites with respect to nAtomP
- ALogP_25: Abundance-weighted CSCS chemical similarity to co-occurring tree species with respect to the lower quartile of metabolites with respect to ALogP
- TopoPSA_75: Abundance-weighted CSCS chemical similarity to co-occurring tree species with respect to the upper quartile of metabolites with respect to TopoPSA
- TopoPSA_25: Abundance-weighted CSCS chemical similarity to co-occurring tree species with respect to the lower quartile of metabolites with respect to TopoPSA
- Fsp3_75: Abundance-weighted CSCS chemical similarity to co-occurring tree species with respect to the upper quartile of metabolites with respect to Fsp3
- Elevation: Elevation of the forest plot in which the species occurs in meters (m)
- Plot_simp: Simplified name of the forest plot
File: Chadwick_ProcB_10.1098_rspb.2025.1721_DatasetS2.csv
Description: Data table of variables defined for each of the 16 Madidi forest plots used in the study, including elevation, climate, phylogenetic similarity, abundance-weighted mean chemical properties, and abundance-weighted mean chemical similarity of co-occurring species with respect to suites of compounds that represent upper or lower quartiles of metabolites with respect to particular chemical properties.
Variables
- Plot: Madidi Project forest plot
- Elevation: Elevation of the forest plot in meters (m).
- Climate_PC1: Position of the forest plot on the first principal component in a principal component analysis (PCA); greater values reflect greater temperature and precipitation.
- Climate_PC2: Position of the forest plot on the second principal component in a principal component analysis (PCA); greater values reflect precipitation and temperature seasonality.
- invSimpson: Species diversity represented as the inverse Simpson index.
- nAtomP: Community-weighted mean nAtomP, the number of atoms in the largest aromatic, conjugated pi system, which is high in pigments and light-absorbing molecules that function to protect plants from ultraviolet radiation.
- ALogP: Community-weighted mean ALogP, hydrophobicity measured as the log-ratio of solubility in octanol versus water; may be related to a plant-defense spectrum defined by unsaturated, aromatic, and nonpolar metabolites versus saturated and polar metabolites; negatively correlated with polarity and hence desiccation resistance.
- TopoPSA: Community-weighted mean TopoPSA, topological polar surface area, the sum of the surface area of polar atoms in square Ångstroms (Å²), which may contribute to desiccation resistance, but is negatively correlated with passive transport through cell membranes.
- Fsp3: Community-weighted mean Fsp3, the fraction of carbon atoms with sp3-hybridized orbitals (i.e. only single bonds) to the total number of carbon atoms in the molecule, which is positively correlated with melting point, solubility, and the likelihood of a compound to exhibit bioactivity in pharmaceutical assays.
- MW: Community-weighted mean molecular weight in Daltons (Da); greatest in peptides and lignans, may be related to leaf longevity because peptides and lignans function as long-lived physical and storage structures.
- SLA: Community-weighted mean specific leaf area; a leaf economics-resource capture trait associated with high photosynthetic rates, high relative growth rates, low carbon investment in lignin or tannins, and resource-rich environments.
- LeafArea: Community-weighted mean leaf area; associated with a resource-acquisitive life history strategy.
- LeafThickness: Community-weighted mean leaf thickness; associated with tolerance to disturbance and nutrient stress and a resource-conservative life history strategy.
- TwigBarkThickness_Relative: Community-weighted mean twig bark thickness relative to stem diameter; associated with protection from attack by pathogens and herbivores.
- TwigSpecDens: Community-weighted mean twig specific density; associated with low relative growth rates, high survival, and resistance to pathogens, herbivores or physical damage.
- PhyloPCoA1: Position of the forest plot on the first axis of a principal coordinates analysis that reflects the phylogenetic similarity among the 16 forest plots
- PhyloPCoA2: Position of the forest plot on the second axis of a principal coordinates analysis that reflects the phylogenetic similarity among the 16 forest plots
- nAtomP_75: Community (abundance)-weighted mean CSCS chemical similarity of species to other co-occurring tree species with respect to metabolites in the upper quartile of nAtomP
- ALogP_25: Community (abundance)-weighted mean CSCS chemical similarity of species to other co-occurring tree species with respect to metabolites in the lower quartile of ALogP
- TopoPSA_75: Community (abundance)-weighted mean CSCS chemical similarity of species to other co-occurring tree species with respect to metabolites in the upper quartile of TopoPSA
- TopoPSA_25: Community (abundance)-weighted mean CSCS chemical similarity of species to other co-occurring tree species with respect to metabolites in the lower quartile of TopoPSA
- Fsp3_75: Community (abundance)-weighted mean CSCS chemical similarity of species to other co-occurring tree species with respect to metabolites in the upper quartile of Fsp3
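The community-weighted means listed above are abundance-weighted averages of species-level values. The following is a minimal Python sketch of that kind of calculation using the Dataset S1 column names (Plot, Abundance, and a trait column); it is not the authors' R code. The function name, the handling of NA values (rows with NA are skipped), and the demo values are assumptions made for illustration only.

```python
import csv
import io
from collections import defaultdict


def community_weighted_means(rows, trait, weight="Abundance", group="Plot"):
    """Return {plot: abundance-weighted mean of `trait`}, skipping NA values."""
    num = defaultdict(float)  # sum of weight * trait per plot
    den = defaultdict(float)  # sum of weights per plot
    for row in rows:
        value = row[trait]
        if value in ("", "NA"):
            continue  # assumption: species without a trait value are ignored
        w = float(row[weight])
        num[row[group]] += w * float(value)
        den[row[group]] += w
    return {plot: num[plot] / den[plot] for plot in num}


# Made-up rows that only borrow the Dataset S1 column names.
demo = io.StringIO(
    "Plot,speciesCode,Abundance,ALogP\n"
    "PlotA,SPEC01,10,2.0\n"
    "PlotA,SPEC02,30,4.0\n"
    "PlotB,SPEC01,5,NA\n"
    "PlotB,SPEC03,5,1.0\n"
)
cwm = community_weighted_means(csv.DictReader(demo), "ALogP")
print(cwm)
```

To apply the same function to the real file, pass `csv.DictReader(open(path, newline=""))` in place of the demo table, where `path` points to the downloaded Dataset S1.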
File: Chadwick_ProcB_10.1098_rspb.2025.1721_DatasetS3.tre
Description: Qemistree dendrogram that reflects the structural similarity among metabolites. Tip labels correspond to LC-MS feature IDs in Dataset S4.
File: Chadwick_ProcB_10.1098_rspb.2025.1721_DatasetS4.csv
Description: Metabolome mastertable that contains metabolite annotations, including predicted structures, classifications, chemical properties, and abundance in each sample. NAs indicate missing annotations due to insufficient data.
Variables
- id: Metabolite feature ID number generated by Qemistree; matches tip labels in the Qemistree dendrogram.
- X.featureID: Metabolite feature ID number generated by MZmine.
- csi_smiles: Predicted structure generated by Sirius.
- table_number: Dataset input number for Qemistree
- smiles: Predicted structure generated by Sirius.
- structure_source: Sirius module used to predict metabolite chemical structure.
- kingdom: Classyfire chemotaxonomical classification at the kingdom level.
- superclass: Classyfire chemotaxonomical classification at the superclass level.
- class: Classyfire chemotaxonomical classification at the class level.
- subclass: Classyfire chemotaxonomical classification at the subclass level.
- direct_parent: Classyfire chemotaxonomical classification at the most-specific, direct-parent level.
- class_results: NPClassifier chemotaxonomical classification at the class level.
- superclass_results: NPClassifier chemotaxonomical classification at the superclass level.
- pathway_results: NPClassifier chemotaxonomical classification at the biosynthetic pathway level.
- isglycoside: NPClassifier indicator, whether the metabolite is a glycoside (contains a sugar moiety).
- custom: Classification used by the authors.
- MW: Molecular weight in Daltons (Da); greatest in peptides and lignans, may be related to leaf longevity because peptides and lignans function as long-lived physical and storage structures.
- nAtom: Number of atoms.
- naAromAtom: Number of aromatic atoms
- nAtomLC: Number of atoms in the longest chain.
- nAtomP: The number of atoms in the largest aromatic, conjugated pi system, which is high in pigments and light-absorbing molecules that function to protect plants from ultraviolet radiation.
- nB: Number of bonds
- nAromBond: Number of aromatic bonds
- nRotB: Number of rotatable bonds.
- XLogP: Octanol/water partition coefficient calculated using a modified atom-additive model summing atomic contributions and correcting for intramolecular interactions; positive indicates affinity to octanol (i.e. nonpolar, hydrophobic), negative to water (i.e. polar, hydrophilic).
- ALogP: Octanol/water partition coefficient calculated using a modified atom-additive model summing atom type contributions based on focal atom and bond characteristics; positive indicates affinity to octanol (i.e. nonpolar, hydrophobic), negative to water (i.e. polar, hydrophilic); may be related to a plant-defense spectrum defined by unsaturated, aromatic, and nonpolar metabolites versus saturated and polar metabolites; negatively correlated with polarity and hence desiccation resistance.
- MLogP: Octanol/water partition coefficient calculated using a simple equation dependent on the number of C atoms and number of hetero atoms; larger indicates affinity to octanol (i.e. nonpolar, hydrophobic), smaller to water (i.e. polar, hydrophilic).
- TopoPSA: Topological polar surface area, the sum of the surface area of polar atoms in square Ångstroms (Å²), which may contribute to desiccation resistance, but is negatively correlated with passive transport through cell membranes.
- tpsaEfficiency: Molecular weight-specific topological polar surface area, the sum of the surface area of polar atoms in square Ångstroms divided by the molecular weight in Daltons (Å² Da⁻¹), which may contribute to desiccation resistance, but is negatively correlated with passive transport through cell membranes.
- HybRatio: Fraction of sp3 to sp2 carbon atoms; proxy for bond saturation and three-dimensional topological complexity, which is positively correlated with melting point, solubility, and the likelihood of a compound to exhibit bioactivity in pharmaceutical assays.
- Fsp3: The fraction of carbon atoms with sp3-hybridized orbitals (i.e. only single bonds) to the total number of carbon atoms in the molecule, which is positively correlated with melting point, solubility, and the likelihood of a compound to exhibit bioactivity in pharmaceutical assays.
- FMF: Ratio between size of molecular framework (ring atoms plus linkers) and size of metabolite; complexity measure positively correlated with promiscuity (number of protein targets >50% inhibited) at values above 0.65.
- ECCEN: Eccentric connectivity index, the distance-cum-adjacency topological descriptor; higher for longer chains with less branching; correlated with size and physicochemical properties such as boiling point.
- WPATH: Wiener path number, a topological descriptor of molecular branching that can differentiate structural isomers; correlated positively with size and boiling point.
- WPOL: Wiener polarity number, a variant of the Wiener path number calculated using vertices (C atoms) at distance 3; correlated positively with size and boiling point.
- nHBDon: Number of hydrogen-bond donors (e.g. OH, NH, formal charge ≥ 0).
- nHBAcc: Number of hydrogen-bond acceptors (e.g. O/N, formal charge ≤ 0, non-ether O, non-adjacent ON)
- PC1: Position on the first principal component of a principal component analysis on metabolite chemical properties.
- PC2: Position on the second principal component of a principal component analysis on metabolite chemical properties.
- PC3: Position on the third principal component of a principal component analysis on metabolite chemical properties.
- PC4: Position on the fourth principal component of a principal component analysis on metabolite chemical properties.
- PC5: Position on the fifth principal component of a principal component analysis on metabolite chemical properties.
- MDP0209.mzXML.Peak.area: Area under the curve of the chromatographic peak representing the quantity of the metabolite in sample MDP0209.mzXML; all subsequent columns display quantification data for pools representing every unique tree species-by-forest plot combination (906 species-by-plot).
- MDP0213.mzXML.Peak.area:
- MDP0206.mzXML.Peak.area:
- MDP0221.mzXML.Peak.area:
- MDP0211.mzXML.Peak.area:
- MDP0215.mzXML.Peak.area:
- MDP0212.mzXML.Peak.area:
- MDP0204.mzXML.Peak.area:
- MDP0220.mzXML.Peak.area:
- MDP0208.mzXML.Peak.area:
- MDP0214.mzXML.Peak.area:
- MDP0223.mzXML.Peak.area:
- MDP0207.mzXML.Peak.area:
- MDP0224.mzXML.Peak.area:
- MDP0219.mzXML.Peak.area:
- MDP0222.mzXML.Peak.area:
- MDP0210.mzXML.Peak.area:
- MDP0205.mzXML.Peak.area:
- MDP0216.mzXML.Peak.area:
- MDP0203.mzXML.Peak.area:
- MDP0218.mzXML.Peak.area:
- MDP0217.mzXML.Peak.area:
- MDP0587.mzXML.Peak.area:
- MDP0585.mzXML.Peak.area:
- MDP0572.mzXML.Peak.area:
- MDP0586.mzXML.Peak.area:
- MDP0571.mzXML.Peak.area:
- MDP0554.mzXML.Peak.area:
- MDP0580.mzXML.Peak.area:
- MDP0596.mzXML.Peak.area:
- MDP0553.mzXML.Peak.area:
- MDP0561.mzXML.Peak.area:
- MDP0576.mzXML.Peak.area:
- MDP0582.mzXML.Peak.area:
- MDP0594.mzXML.Peak.area:
- MDP0545.mzXML.Peak.area:
- MDP0573.mzXML.Peak.area:
- MDP0574.mzXML.Peak.area:
- MDP0568.mzXML.Peak.area:
- MDP0579.mzXML.Peak.area:
- MDP0548.mzXML.Peak.area:
- MDP0583.mzXML.Peak.area:
- MDP0599.mzXML.Peak.area:
- MDP0534.mzXML.Peak.area:
- MDP0584.mzXML.Peak.area:
- MDP0563.mzXML.Peak.area:
- MDP0529.mzXML.Peak.area:
- MDP0528.mzXML.Peak.area:
- MDP0593.mzXML.Peak.area:
- MDP0567.mzXML.Peak.area:
- MDP0549.mzXML.Peak.area:
- MDP0543.mzXML.Peak.area:
- MDP0564.mzXML.Peak.area:
- MDP0535.mzXML.Peak.area:
- MDP0537.mzXML.Peak.area:
- MDP0551.mzXML.Peak.area:
- MDP0559.mzXML.Peak.area:
- MDP0532.mzXML.Peak.area:
- MDP0542.mzXML.Peak.area:
- MDP0546.mzXML.Peak.area:
- MDP0547.mzXML.Peak.area:
- MDP0569.mzXML.Peak.area:
- MDP0565.mzXML.Peak.area:
- MDP0541.mzXML.Peak.area:
- MDP0555.mzXML.Peak.area:
- MDP0556.mzXML.Peak.area:
- MDP0558.mzXML.Peak.area:
- MDP0544.mzXML.Peak.area:
- MDP0591.mzXML.Peak.area:
- MDP0539.mzXML.Peak.area:
- MDP0581.mzXML.Peak.area:
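Because Dataset S4 stores one chemical-property value per metabolite and one peak-area column per sample, a per-sample summary can be obtained by weighting each property by peak area. The sketch below shows one plausible way to do this in Python; it is an illustration, not the authors' procedure. The function name, the decision to skip NA rows, and the demo values are assumptions; only the column-name pattern (`.Peak.area` suffix) and the `id` and `ALogP` columns come from the variable list above.

```python
import csv
import io


def weighted_mean_property(rows, prop, sample_col):
    """Peak-area-weighted mean of `prop` for one sample column; NA rows are skipped."""
    num = den = 0.0
    for row in rows:
        value, area = row[prop], row[sample_col]
        if value in ("", "NA") or area in ("", "NA"):
            continue  # assumption: unannotated metabolites do not contribute
        area = float(area)
        num += area * float(value)
        den += area
    return num / den if den > 0 else None


# Made-up rows that only borrow Dataset S4 column names.
demo = io.StringIO(
    "id,ALogP,MDP0209.mzXML.Peak.area\n"
    "1,1.0,100\n"
    "2,3.0,300\n"
    "3,NA,50\n"
    "4,5.0,0\n"
)
rows = list(csv.DictReader(demo))
# Sample columns are recognized by their ".Peak.area" suffix.
sample_cols = [c for c in rows[0] if c.endswith(".Peak.area")]
result = weighted_mean_property(rows, "ALogP", sample_cols[0])
print(sample_cols[0], result)
```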
File: DisciplineSpecificMetadata.json
Description:
The DisciplineSpecificMetadata.json file contains parameter values for experimental and instrumental protocols used in liquid chromatography-mass spectrometry (LC-MS) data collection.
Code/software
File: Chadwick_ProcB_10.1098_rspb.2025.1721.R
This file contains the R code used for Chadwick, Sierra E., David Henderson, Dale L. Forrister, Leslie Cayola, Alfredo F. Fuentes, Belén Alvestegui, Nathan Muchhala, J. Sebastián Tello, Martin Volf, Jonathan A. Myers, and Brian E. Sedio. Proceedings B (https://doi.org/10.1098/rspb.2025.1721), including code for data manipulation, statistical analyses, and data visualization/figure creation. R version 4.4.1 on macOS 15.5 was used.
The R code assumes that the four supplementary data files are located in a folder with the path: ~/Documents/Madidi_Project
R package versions used in the file:
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
# other attached packages:
# [1] V.PhyloMaker2_0.1.0 plotly_4.10.4 adiv_2.2.1 phylosignal_1.3.1 picante_1.8.2
# [6] nlme_3.1-167 phytools_2.4-4 maps_3.4.2.1 ape_5.8-1 corrplot_0.95
# [11] ggrepel_0.9.6 ggplot2_3.5.1 vegan_2.6-10 lattice_0.22-6 permute_0.9-7
# loaded via a namespace (and not attached):
# [1] DBI_1.2.3 mnormt_2.1.1 deldir_2.0-4 phangorn_2.12.1
# [5] rlang_1.1.5 magrittr_2.0.3 ade4_1.7-22 compiler_4.4.1
# [9] mgcv_1.9-1 png_0.1-8 vctrs_0.6.5 reshape2_1.4.4
# [13] combinat_0.0-8 quadprog_1.5-8 stringr_1.5.1 pkgconfig_2.0.3
# [17] crayon_1.5.3 fastmap_1.2.0 promises_1.3.2 rmarkdown_2.29
# [21] purrr_1.0.2 xfun_0.52 seqinr_4.2-36 clusterGeneration_1.3.8
# [25] jsonlite_1.8.9 progress_1.2.3 later_1.4.1 adegenet_2.1.10
# [29] uuid_1.2-1 jpeg_0.1-10 parallel_4.4.1 prettyunits_1.2.0
# [33] cluster_2.1.8 R6_2.5.1 stringi_1.8.4 RColorBrewer_1.1-3
# [37] boot_1.3-31 numDeriv_2016.8-1.1 Rcpp_1.0.14 iterators_1.0.14
# [41] knitr_1.49 optimParallel_1.0-2 base64enc_0.1-3 adephylo_1.1-16
# [45] httpuv_1.6.15 Matrix_1.7-2 adegraphics_1.0-21 splines_4.4.1
# [49] igraph_2.1.4 tidyselect_1.2.1 yaml_2.3.10 phylobase_0.8.12
# [53] doParallel_1.0.17 codetools_0.2-20 tibble_3.2.1 plyr_1.8.9
# [57] shiny_1.10.0 withr_3.0.2 coda_0.19-4.1 evaluate_1.0.3
# [61] xml2_1.3.6 lpSolve_5.6.23 pillar_1.10.1 KernSmooth_2.23-26
# [65] foreach_1.5.2 generics_0.1.3 sp_2.2-0 hms_1.1.3
# [69] munsell_0.5.1 scales_1.3.0 xtable_1.8-4 rncl_0.8.7
# [73] glue_1.8.0 lazyeval_0.2.2 scatterplot3d_0.3-44 tools_4.4.1
# [77] interp_1.1-6 data.table_1.16.4 rgl_1.3.17 XML_3.99-0.18
# [81] fastmatch_1.1-6 grid_4.4.1 tidyr_1.3.1 RNeXML_2.4.11
# [85] crosstalk_1.2.1 latticeExtra_0.6-30 colorspace_2.1-1 cli_3.6.3
# [89] DEoptim_2.2-8 expm_1.0-0 viridisLite_0.4.2 dplyr_1.1.4
# [93] gtable_0.3.6 digest_0.6.37 farver_2.1.2 htmlwidgets_1.6.4
# [97] htmltools_0.5.8.1 lifecycle_1.0.4 httr_1.4.7 mime_0.12
# [101] MASS_7.3-64
Access information
Other publicly accessible locations of the data:
- The raw data from the Madidi Project are stored and managed in Tropicos (https://tropicos.org/home), the botanical database of the Missouri Botanical Garden. The data for each forest plot can be accessed via the Madidi Project Plot Search page (http://tropicos.org/PlotSearch.aspx?projectid=20).
- The raw LC-MS spectra were deposited as a public MassIVE dataset on the Global Natural Products Social (GNPS) Molecular Networking server (https://massive.ucsd.edu/ProteoSAFe/dataset_files.jsp?task=a4a891a0f3b3467fb790a9c335783205#%7B%22table_sort_history%22%3A%22main.collection_asc%22%7D) with FTP download link (ftp://massive.ucsd.edu/MSV000090549) and doi (https://doi.org/10.25345/C52R3P21H).
Forest plot data: The Madidi Project
Floristic data were collected as part of the Madidi Project (www.mobot.org/madidi), a collaboration of more than two decades between the Herbario Nacional de Bolivia and the Missouri Botanical Garden to document the flora of the Madidi region in the Andes of Bolivia (35). The region features wide variation in plant communities over an extreme elevational gradient, from lowland rainforests located at 200 m above sea level (a.s.l.) to alpine environments above the tree line at 6,000 m a.s.l. (48). The Madidi Project includes 50 1-ha permanent forest plots ranging in elevation from 212 m to 3334 m a.s.l. We selected 16 1-hectare (ha) permanent plots in which leaves were sampled for chemical analysis and which represent broad variation in elevation (662-3324 m a.s.l.), climate, and tree species richness (17-137 species per 1-ha plot). The 16 plots include three seasonally dry, low-elevation forest plots and 13 moist, montane forest plots (28). Abundant genera include: Miconia (Melastomataceae), Sloanea (Elaeocarpaceae), and Ocotea (Lauraceae) in the low-elevation moist plots; Weinmannia (Cunoniaceae), Hedyosmum (Chloranthaceae), and Clethra (Clethraceae) in the high-elevation (>2500 m) plots; and Weinmannia, Hedyosmum, and Clethra in the seasonally dry plots. Tree species richness declines with elevation among the 13 moist forest plots, whereas the three seasonally dry lowland plots display low species richness (28). In each 1-ha plot, all free-standing woody plants with a diameter at breast height of ≥ 10 cm were mapped, measured, and identified to a botanically valid species or morphospecies.
Morphological Functional Traits
Protocols for morphological functional trait data collection are described in detail in the Madidi Project manual (www.mobot.org/madidi). We selected five morphological leaf and stem traits that reflect a species' position on a tradeoff axis from conservative traits associated with defense and survival to acquisitive traits associated with fast growth (Box 1). Leaf area and specific leaf area (SLA; area per unit mass) are associated with a resource-acquisitive life history strategy; leaf thickness, bark thickness, and twig specific density are associated with a resource-conservative life history strategy (49). Morphological traits for each species-by-plot are reported in Dataset S1. Community-weighted mean trait values are reported in Dataset S2.
Chemical Analysis
We collected leaf samples from 473 tree species representing 906 unique species-by-plot. Within each forest plot, we collected leaf samples from 62-90% of the species in the plot (mean = 80% of the species per plot; (28)). Leaves of up to five individual trees per species per plot were collected between 2010 and 2019 and dried with silica gel upon collection in the field. Leaf samples were extracted for untargeted metabolomics analysis following Sedio et al. (31). Briefly, 50 mg of dried leaf tissue was ground to a fine powder and 10 mg weighed for extraction in 1800 µL 90:10 methanol:water pH 5 overnight at 4 °C. Extracts of up to five individuals per species per plot were pooled to create 906 pools representing unique species-by-plot.
All individual extracts and species pools were filtered and analyzed using ultra-high performance liquid chromatography-heated electrospray ionization-tandem mass spectrometry (UHPLC-HESI-MS/MS) using a Thermo Fisher Scientific (Waltham, MA, USA) Vanquish UHPLC with a C18 column and a Thermo QExactive quadrupole-orbitrap MS. Separation of metabolites by UHPLC was followed by HESI ionization in positive mode using full scan MS1 and data-dependent acquisition of MS2. Detailed instrumental methods are described by Sedio et al. (31). Spectra were deposited as a public MassIVE dataset on the Global Natural Products Social (GNPS) Molecular Networking server (doi:10.25345/C52R3P21H).
Raw spectra were centroided and processed for peak detection, peak alignment, and filtering using MZmine 2 (50). Aligned chromatograms were used to create a feature-based molecular network (FBMN; (51)) using GNPS (52). The structural similarities of all metabolites as represented in the resulting network were used to create a dendrogram using the software Qemistree (53), which is reported in Dataset S3. Metabolites were annotated by predicting molecular formulae using Sirius (54), predicting molecular structures using CSI:FingerID (55) and classifying compounds using Canopus (56) according to the organic chemical taxonomy scheme of ClassyFire (57) and according to biosynthetic origins using NPClassifier (58). For a comparison of intra- and inter-specific variation for selected species-rich high- and low-elevation genera, see (28).
To calculate chemical properties of metabolites, we used the highest-confidence molecular structure predicted by CSI:FingerID, represented as a SMILES text string, to query the Chemistry Development Kit (CDK; (59)) using the R package ‘rcdk’ (60). The CDK library includes 51 variables that describe chemical and physical properties, but Walker et al. (27) found that many of these are highly correlated and hence represent five major axes of variation. A correlation matrix of 21 chemical properties for metabolites in our data closely matched that of Walker et al. (27). Hence, like Walker et al. (27), we chose one property to represent each of the five major dimensions of variation (Box 1). Molecular formulae, predicted structures, classifications, chemical properties, and sample-level abundances for 20,571 unique metabolites are provided in Dataset S4. Foliar metabolome chemical properties for each species-by-plot are reported in Dataset S1. Community-weighted mean values are reported in Dataset S2.
We calculated the chemical structural-compositional similarity (CSCS) of species, which accounts for the structural similarity of unique metabolites (30). We calculated CSCS with respect to metabolites in the upper and/or lower quartile of nAtomP, ALogP, TopoPSA, and Fsp3, respectively, for the species co-occurring in each of the 16 forest plots.
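The quartile step described above can be illustrated with a short Python sketch that splits metabolites into lower- and upper-quartile subsets for one chemical property. This covers only the subsetting, not the CSCS calculation itself, and it is not the authors' code. The function name, the inclusive quartile definition, and the rule that values equal to a cut point belong to the subset are assumptions; the source does not state how ties or cut points were handled.

```python
from statistics import quantiles


def quartile_subsets(prop_by_id):
    """Return (lower, upper) sets of metabolite IDs at or beyond the quartile cuts."""
    values = sorted(prop_by_id.values())
    # Assumption: "inclusive" quartiles; other quantile definitions shift the cuts slightly.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    lower = {i for i, v in prop_by_id.items() if v <= q1}
    upper = {i for i, v in prop_by_id.items() if v >= q3}
    return lower, upper


# Hypothetical metabolite IDs mapped to a property value (e.g. TopoPSA).
demo = {f"m{k}": float(k) for k in range(1, 10)}
lower, upper = quartile_subsets(demo)
print(sorted(lower), sorted(upper))
```

A similarity measure such as CSCS would then be computed using only the metabolites in `lower` or `upper`.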
Climate Data
We selected four climatic variables to represent variation among the 16 forest plots in temperature, precipitation, and seasonality. Annual mean temperature and annual range in temperature were derived from WorldClim Version 2.1 (61). Annual precipitation and precipitation seasonality, calculated as the ratio of the standard deviation to the mean precipitation of each month, were derived from the Tropical Rainfall Measuring Mission (TRMM), a regional database that provides greater accuracy in precipitation measurements relative to WorldClim in the Bolivian Andes (28). We scaled and centered the four variables and carried out a principal components analysis, of which the first principal component represented 71.2% of the variation and was clearly interpretable as a gradient from cold, dry environments (values < 0) to warm, wet environments (values > 0; (28)). Elevation and position on climate PC1 for each of the 16 forest plots are reported in Dataset S2.
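Two of the steps above are simple enough to sketch in Python: precipitation seasonality as the ratio of the standard deviation to the mean of monthly precipitation, and scaling and centering a variable before the principal components analysis. This is an illustration under stated assumptions, not the authors' R code. The function names and demo values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the source does not say which form was used.

```python
from statistics import mean, stdev


def precipitation_seasonality(monthly_mm):
    """Coefficient of variation of monthly precipitation (SD / mean)."""
    return stdev(monthly_mm) / mean(monthly_mm)


def scale_and_center(values):
    """Subtract the mean and divide by the sample SD (z-scores)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Made-up monthly precipitation totals (mm) for two hypothetical plots.
aseasonal = [100.0] * 12
seasonal = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0,
            190.0, 190.0, 190.0, 190.0, 190.0, 190.0]
cv_aseasonal = precipitation_seasonality(aseasonal)
cv_seasonal = precipitation_seasonality(seasonal)

z = scale_and_center([1.0, 2.0, 3.0])
print(cv_aseasonal, cv_seasonal, z)
```

The scaled variables would then be passed to a principal components analysis; that step is omitted here because it needs a linear algebra library.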
Discipline-Specific Metadata
The DisciplineSpecificMetadata.json file contains parameter values for experimental and instrumental protocols used in liquid chromatography-mass spectrometry (LC-MS) data collection. These methods are also reported in Sedio et al. (2021), "Chemical similarity of co-occurring trees decreases with precipitation and temperature in North American forests", Front. Ecol. Evol. 9:679638, doi: 10.3389/fevo.2021.679638.
