Data from: Realm specific adaptations lead to contrasting global patterns in submerged and emergent aquatic plant height
Data files
Apr 07, 2026 version files 502.65 KB
- Appendix_1.xlsx (28.52 KB)
- Appendix_4.xlsx (27.24 KB)
- Data_S1.xlsx (79.70 KB)
- Data_S2-2.xlsx (9.74 KB)
- Data_S2.xlsx (35.67 KB)
- Data_S3.xlsx (41.36 KB)
- Data_S4.xlsx (12.28 KB)
- Data_S5.xlsx (164.79 KB)
- Data_S6.xlsx (74.30 KB)
- plots_and_model_calculating_codes.R (22.45 KB)
- README.md (6.58 KB)
Abstract
Aim: This study aims to evaluate whether the trait-environment relationships known for terrestrial plant height extend to freshwater macrophytes, and to identify the primary environmental factors influencing global height patterns in emergent and fully submerged aquatic plants.
Location: Global.
Time Period: Species occurrence records and height measurements compiled from 1931 to 2022.
Major Taxa Studied: Freshwater plant species.
Methods: We compiled a global dataset of maximum plant height records for 1,735 aquatic plant species, categorized by life form (partially emergent vs. submerged). Using generalized additive models, we tested how plant height varied along latitudinal gradients and how it was related to environmental predictors including temperature, relative inorganic carbon supply, and water depth (indicated by habitat availability). We explicitly analyzed whether these relationships differed between the two life forms, and validated these patterns using site-scale species and environmental data from northern temperate lakes and streams.
Results: Species with emergent growth increase in height with warmer temperature. In contrast, fully submerged species exhibit increased height with higher inorganic carbon availability and colder climates. At the local scale, submerged plant height was positively associated with relative inorganic carbon supply in lakes and streams of the northern temperate zone.
Main Conclusions: The relationships between environmental variables and plant height differ between partially emergent and completely submerged life forms, consistent with realm-specific patterns in trait-environment correlations. These differing patterns highlight that trait-environment coupling in freshwater systems may not parallel that in terrestrial ecosystems and suggest that life form mediates species’ responses to environmental variation.
Dataset DOI: 10.5061/dryad.m37pvmdcv
We compiled a global dataset of maximum plant height for 1,735 aquatic plant species, categorized by life form (emergent vs. submerged). These data were aggregated to 10° × 10° grid cells and combined with environmental variables, including climate and inorganic carbon availability. Analyses were conducted using generalized additive models (GAMs). In addition, a site-specific dataset from 587 northern hemisphere lakes and streams was used to validate key relationships.
All variables, units, and data structures are described below to facilitate reuse.
File: Appendix_1.xlsx
Description: Sources of plant trait data
Sheet 1: References
- List of literature sources used for plant height and life form data
Sheet 2: Databases
- Global trait databases used (e.g. TRY)
File: Appendix_4.xlsx
Description: Calculation of relative inorganic carbon supply index (rICSI)
Sheet 1: CO2 supply index (CO2SI; mol m⁻¹ s⁻¹)
Sheet 2: bicarbonate supply index (HCO3SI; mol m⁻¹ s⁻¹)
- Provides formulas used to calculate ICSI and rICSI
- rICSI = ICSI / temperature-dependent carbon demand
- Units: mol m⁻¹ s⁻¹
Note: 1. Diffusion coefficient of CO2 was based on engineeringtoolbox.com (R2 = 1).
2. Diffusion coefficient of bicarbonate was calculated based on the method of Zeebe (2011).
Zeebe, R. E. (2011). On the molecular diffusion coefficients of dissolved CO2, HCO3-, and CO32-and their dependence on isotopic mass. Geochimica et Cosmochimica Acta, 75(9), 2483-2498.
File: Data_S1.xlsx
Description: Grid cell plant height summaries (variables and units are described only the first time they appear)
Sheet 1: Grid cell height of aquatic plants
- Aquatic_plant_height (cm): mean maximum plant height across all species in each grid cell
- Longitude (decimal degrees)
- Latitude (decimal degrees)
Sheet 2: Grid cell height of emergent plants
- Emergent_height (cm): mean maximum height of emergent species
Sheet 3: Grid cell height of submerged plants
- Submerged_height (cm): mean maximum height of submerged species
Sheet 4: Grid cell height and species richness of emergent and submerged plants
- S and E refer to submerged and emergent grid cell plant height, respectively
- Species_number: number of species in the grid cell
- SSpecies_number: number of submerged species in the grid cell
- ESpecies_number: number of emergent species in the grid cell
Sheet 5: Dataset for ridge plot in Figure 1
- Plant_height (cm): mean maximum plant height across all species in each grid cell
- Life_form: emergent or submerged
- Level: 20-degree latitudinal intervals from the equator to the poles.
File: Data_S2.xlsx
Description: Grid cell plant height with environmental factors. Environmental variables were derived as follows:
Temperature and precipitation variables were condensed using principal component analysis (PCA), and the first principal component (PC1) was extracted as a proxy for temperature and precipitation, respectively. Water availability was characterized using global surface water data (Pekel et al. 2016), where the proportion of seasonal water relative to total water area (seasonal water ratio) was used as an indicator of shallow versus deeper aquatic habitats within each grid cell.
Variables:
- Longitude, Latitude (decimal degrees)
- E: emergent grid cell height (cm)
- S: submerged grid cell height (cm)
- ESpecies_number: number of emergent species
- SSpecies_number: number of submerged species
- Species_number: number of aquatic plant species
- Temperature: first principal component representing temperature-related variation (unitless)
- Precipitation: first principal component representing precipitation-related variation (unitless)
- Seasonal_water_ratio (unitless): proportion of seasonal water area relative to total water area in each grid cell
- rICSI (mol m⁻¹ s⁻¹): relative inorganic carbon supply index
File: Data_S2-2.xlsx
Description: Summary of GAM model outputs used for Figure 2. This file contains model estimates describing the relationships between environmental predictors and plant height across life forms.
Variables:
- Predictor variables used in GAMs (e.g., Temp_PC1, rICSI, Seasonal_water_ratio)
- Std.: standard error associated with the estimate
- Life_form: plant life form category (emergent or submerged)
File: Data_S3.xlsx
Description: Site-specific dataset (northern temperate lakes and streams)
Variables:
- Submerged plant site height (cm)
- Temperature: first principal component representing temperature-related variation (unitless)
- rICSI (mol m⁻¹ s⁻¹): relative inorganic carbon supply index
- Longitude, Latitude (decimal degrees)
File: Data_S4.xlsx
Description: Species-level trait dataset
Variables:
- Species name
- Photosynthetic type (CO₂-user / bicarbonate-user)
- Plant_height (cm) : maximum plant height of each species
File: Data_S5.xlsx
Description: Plant height records and species-level averages
Sheet 1: all records of plant height (cm)
- Value (cm): plant height
Sheet 2: mean height (cm)
- LifeForm: life forms of aquatic plants (E-emergent; S-Submerged; F-floating-leaved; FF-floating rooted)
- PlantHeight (cm): the average plant height of each species
File: Data_S6.xlsx
Description: Environmental variables per grid cell. Temperature variables are expressed in °C, while precipitation variables are expressed in mm. Temperature seasonality represents the standard deviation of monthly temperature (scaled), isothermality is expressed as a percentage, and precipitation seasonality represents the coefficient of variation.
Variables:
- Longitude, Latitude (decimal degrees)
- Climate variables derived from WorldClim (e.g., temperature of warmest quarter, °C)
File: plots_and_model_calculating_codes.R
Description: R scripts used for data processing, statistical analyses (GAMs), and figure generation. Annotations are provided throughout the script through 1) library loading, 2) dataset loading and cleaning, 3) analyses, and 4) figure creation.
Code/software: All analyses were conducted in R (version 4.4.3), using packages including mgcv, spdep, and ggplot2.
2.1 | Data compilation
To study global patterns of aquatic plant height, we searched for height data for all known aquatic plants occurring in freshwater (3,696 species; summarized by Murphy et al. (2019)). We collated height information for roughly half of the world’s aquatic plant species, i.e., 1,735 species (1,206 emergent plants, 411 submerged plants, 128 floating plants). The data comprised 6,186 records of height and life form information for aquatic plants from 147 references and 120 online databases (including 810 records from the TRY database (Kattge et al. 2020); Supporting Information Appendix 1), accessed from 22 December 2021 to 20 January 2022. Each species is represented by an average of 3.57 records (range 1 to 163). Where multiple records were available, we selected the maximum reported height to represent the species’ potential height under favorable conditions. Although we aimed to use vegetative height, this distinction was not consistently provided in some original sources (e.g. TRY). Therefore, our maximum height values may sometimes include reproductive structures. We acknowledge this as a limitation of our dataset; however, we believe it does not introduce systematic bias across life forms, and the broad patterns we report remain robust. For species with unknown life form (54 species), we categorized life form (submerged vs. emergent) by visual inspection of species photographs, based on the location and types of leaves. Species were classified as emergent when more than half of their leaves were positioned above the water surface, and as submerged when more than half of their leaves were entirely below the water surface. For a subset of the submerged species with recorded plant height, the inorganic carbon source used in photosynthesis was known to be either strictly CO2 (40 species) or both bicarbonate and CO2 (39 species, Data S4; Iversen et al. 2019), allowing a division into obligate CO2-users and bicarbonate-users. This set of species was used to test whether height differed between the two inorganic carbon strategies.
We combined the collected height data with the presence/absence information for each species in 10° × 10° (latitude × longitude) grid cells at global scale extracted from Murphy et al. (2019), who compiled a global, grid‐based distribution dataset for aquatic macrophytes by synthesizing information from regional floras, monographs, published literature, and databases. Using the maximum height value recorded for each species, we calculated the average maximum height of the plant assemblage in each grid cell across the globe (hereafter grid cell plant height), which effectively estimates the average maximum height of the species with ranges overlapping a given grid cell. Grid cell plant height was calculated as the arithmetic mean of the maximum heights of the species in each assemblage. This metric captures the community’s average potential for growth and competitive performance, serving as a useful proxy in assessing plant ecological strategy (Westoby et al. 2002; Osada et al. 2014; de Vries et al. 2024). This resulted in a dataset of 237 grid cells spanning all vegetated land areas of the earth (Data S1). To quantify the patterns for the different life forms of aquatic plants, we separated the dataset into emergent species (occurring in 237 grid cells) and submerged species (occurring in 233 grid cells; Data S1). Floating plants were excluded from the analysis because their height is to a large degree structured by local variations in water depth (floating rooted), or because they are often very small (free-floating, e.g., Wolffia arrhiza: 0.1 cm).
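The aggregation described above can be illustrated with a minimal Python sketch (the original analysis was carried out in R). The species names, heights, and grid cell identifiers are hypothetical placeholders, not values from the dataset, and the function names are introduced here for illustration only.

```python
from statistics import mean

# Hypothetical height records (species, height in cm); illustrative values only.
records = [
    ("Species A", 120.0), ("Species A", 150.0),
    ("Species B", 40.0),
    ("Species C", 200.0), ("Species C", 180.0),
]

# Hypothetical presence data: grid cell id -> species recorded in that cell.
presence = {
    "cell_1": ["Species A", "Species B"],
    "cell_2": ["Species B", "Species C"],
}

def species_max_height(records):
    """Return the maximum recorded height for each species."""
    best = {}
    for species, height in records:
        best[species] = max(height, best.get(species, height))
    return best

def grid_cell_height(presence, max_height):
    """Arithmetic mean of species' maximum heights in each grid cell."""
    result = {}
    for cell, species_list in presence.items():
        heights = [max_height[s] for s in species_list if s in max_height]
        if heights:  # skip cells where no species has height data
            result[cell] = mean(heights)
    return result

max_height = species_max_height(records)
cell_height = grid_cell_height(presence, max_height)
```

Running the same function on only the emergent or only the submerged species would give the life-form-specific grid cell heights.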
We conducted an additional analysis using available intraspecific height data to verify whether maximum height values reliably reflect a species’ overall trait profile and to identify any potential biases from using extreme values. Specifically, we applied linear regression on log-transformed species height data to examine the relationship between maximum and average heights. The results showed a strong correlation between the true maximum recorded heights and the simulated species height values derived from repeated random draws (Pearson R = 0.98, p < 0.001). The mean absolute error of 0.56 log units between log-transformed simulated and true maximum heights further supported the robustness of our maximum height metric (Fig. S1).
2.2 | Environmental conditions
Climatic variables related to temperature and precipitation (see Appendix 2) were extracted from the WorldClim database (http://worldclim.org/version2), using data at 10-arc min resolution (Hijmans et al. 2005). We condensed the temperature and precipitation variables using a separate PCA for each set of variables and extracted the first axis (PC1) as a proxy of temperature and precipitation, respectively. The first axis from the temperature PCA (PC1temp) correlated strongly with most variables related to average temperature conditions, such as the mean temperature in the warmest quarter (R = 0.94, Fig. S2), and likewise the precipitation axis (PC1precip) was strongly related to annual precipitation (R = 0.99, Fig. S2).
As an indicator of the availability of shallow and deeper aquatic habitats, we collected data on the amount of seasonal (shallow) and permanent (deeper) water cover per grid cell from Pekel et al. (2016). We used the fraction of the total water covered area comprised by seasonal water (seasonal water ratio) to estimate dominance of either shallow or deeper freshwater habitats within each grid cell.
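As a minimal Python sketch of this indicator, the ratio can be computed from the two water cover classes. It assumes that total water area is the sum of seasonal and permanent water area; the function name and the handling of cells without surface water are illustrative choices, not taken from the original script.

```python
def seasonal_water_ratio(seasonal_area, permanent_area):
    """Fraction of total water area that is seasonal.

    Assumes total water area = seasonal + permanent (same area units).
    Returns None for a grid cell with no mapped surface water.
    """
    total = seasonal_area + permanent_area
    if total == 0:
        return None
    return seasonal_area / total

# Hypothetical grid cell: 25 area units seasonal, 75 area units permanent.
example_ratio = seasonal_water_ratio(25.0, 75.0)
```

Values near 1 would indicate dominance of seasonal (shallow) habitats, and values near 0 dominance of permanent (deeper) habitats.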
As the photosynthesis and growth of submerged plants can be limited by inorganic carbon supply due to the low diffusivity of CO2 in water (Maberly & Gontero 2018), we also collected the inorganic carbon concentration for each grid cell. For CO2, we estimated the average concentration in the grid cells using the global dataset of pCO2 (µatm; lakes and streams reported separately) from Raymond et al. (2013). For each grid cell, we extracted mapped pCO2 for lakes and streams and computed a weighted mean, where each pCO2 value was weighted by the relative area of its corresponding waterbody within each COSCAT region (Meybeck et al. 2006; Raymond et al. 2013), using ArcGIS Pro (version 2.7.0). For bicarbonate, we used the global bicarbonate dataset (HCO3-; mM) provided by Iversen et al. (2019) and took the mean value for each grid cell.
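The area weighting can be sketched in Python as follows. The original extraction was done in ArcGIS Pro, so this shows only the arithmetic; the pCO2 values and area fractions are hypothetical.

```python
def weighted_mean_pco2(values):
    """Area-weighted mean pCO2.

    values: list of (pco2_uatm, relative_area) pairs, e.g. one pair for
    lakes and one for streams overlapping a grid cell.
    """
    total_weight = sum(weight for _, weight in values)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(pco2 * weight for pco2, weight in values) / total_weight

# Hypothetical cell: lakes at 1000 µatm covering 25% of the water area,
# streams at 3000 µatm covering 75%.
example_pco2 = weighted_mean_pco2([(1000.0, 0.25), (3000.0, 0.75)])
```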
To understand the role of the supply of inorganic carbon (CO2 and HCO3-) to submerged plants, we estimated a supply index using a framework equivalent to that used for estimating oxygen limitation in aquatic ectotherms (oxygen supply index, Verberk et al. (2011)). Modified from the oxygen supply index equations (see Appendix 3), we calculated the bicarbonate supply index (HCO3SI; mol m⁻¹ s⁻¹) and the CO2 supply index (CO2SI; mol m⁻¹ s⁻¹), incorporating both the concentration of each compound and its temperature-dependent diffusivity. The sum of the two indices estimates the total supply of inorganic carbon (inorganic carbon supply index; ICSI; mol m⁻¹ s⁻¹). As the amount of sequestered carbon used for basic metabolism is expected to depend on temperature (Gillooly et al. 2001), we calculated the relative supply of inorganic carbon as the ICSI divided by the temperature-dependent carbon demand, as suggested by Verberk et al. (2011) (relative inorganic carbon supply index, rICSI; mol m⁻¹ s⁻¹):
$$\mathrm{rICSI} = \frac{\mathrm{ICSI}}{Q_{10}^{\Delta T / 10}}$$
where Q10 is the temperature coefficient (using a standard value of 2.0; Rasmusson et al. 2019) and ΔT is the difference between the grid cell temperature (temperature in the warmest quarter) and the average temperature in the entire dataset (℃). Environmental variables were matched with grid cell plant height data using grid cell information and compiled in Data S2.
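A minimal Python sketch of the rICSI calculation follows. It assumes that the temperature-dependent carbon demand takes the standard Q10 form, Q10^(ΔT/10), which is an interpretation of the definitions above rather than a formula copied from Appendix 4. The function name and the input values are hypothetical.

```python
def relative_icsi(co2si, hco3si, temp_c, mean_temp_c, q10=2.0):
    """Relative inorganic carbon supply index (rICSI).

    co2si, hco3si: CO2 and bicarbonate supply indices (mol m^-1 s^-1).
    temp_c: grid cell temperature of the warmest quarter (deg C).
    mean_temp_c: mean of that temperature across the whole dataset (deg C).
    Assumption: carbon demand scales as q10 ** (delta_T / 10).
    """
    icsi = co2si + hco3si                 # total inorganic carbon supply
    delta_t = temp_c - mean_temp_c        # deviation from dataset mean
    demand = q10 ** (delta_t / 10.0)      # relative temperature-driven demand
    return icsi / demand

# Hypothetical cell 10 deg C warmer than the dataset mean: demand doubles,
# so the relative supply is half of ICSI.
example_ricsi = relative_icsi(1.0, 1.0, temp_c=20.0, mean_temp_c=10.0)
```

Under this form, a cell at the dataset mean temperature has rICSI equal to ICSI, and warmer cells have proportionally lower relative supply.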
As a supplement to the pattern of grid cell plant height of submerged species at the global scale, we also analyzed a finer-scale dataset with site-specific records of plant species and measurements of CO2 and bicarbonate from 963 sites across the Northern Hemisphere (lakes and streams from Europe and North America (Iversen et al. 2019)). Species occurrences were compiled from regional aquatic vegetation monitoring programs and previously published floristic surveys. Based on the extracted dataset, species richness per site ranged from 5 to 52 species, with a mean of 10 species per site.
We analyzed the relationships between the average maximum plant height (abbreviated as site plant height) and two environmental variables calculated at site level: relative inorganic carbon supply and a temperature variable (PC1site temp) constructed in the same way as for the global data. After removing sites with incomplete data (e.g., unknown site names or missing species height records), the final dataset comprised 539 sites (Data S3).
2.3 | Statistical analysis
To account for phylogenetic constraints on plant height, we tested whether the maximum plant height contained a phylogenetic signal by estimating Blomberg’s K with the function ‘phylosignal’ in the R package picante (Blomberg et al. 2003). The phylogenetic relationships were obtained using phylomatic-awk-1.1.0 (Zanne et al. 2014). We extracted phylogenetic information for the selected species (1644 species matched) from the backbone tree in Zanne et al. (2014), which was constructed using multi-gene molecular and fossil data. Our analysis indicated a very low phylogenetic signal in aquatic plant height (Blomberg’s K = 0.026), with a significant deviation from expectations under Brownian motion (p < 0.001). This suggests that plant height is not strongly structured by shared ancestry and may be influenced by other ecological or evolutionary factors (Blomberg et al. 2003). As a result, phylogeny was not taken into account in subsequent analyses.
To assess if there was a latitudinal pattern in aquatic plant height (including non-linear relationships), we first used Generalized Additive Models (GAM, mgcv package in R) to model the effect of latitude on grid cell plant height, with latitude as a smoothed term (Joswig et al. 2022). This model was compared to a constant null model using an F-test. Non-significance of the smoother term would indicate a random pattern between latitude and plant height. GAMs were selected because they allow flexible non-linear responses, which are expected along broad biogeographic gradients where biological patterns vary non-linearly with environmental and climatic drivers. Subsequently, we tested for potential divergent patterns in emergent and submerged plants by introducing plant life form and its interaction with latitude as independent variables in the model. To assess latitudinal patterns in plant height while accounting for spatial dependence, we fitted a generalized additive model (GAM) including a two-dimensional spatial smoother (s(Longitude, Latitude, bs = "sos")) together with an interaction between absolute latitude and life form. This approach controls for spatial autocorrelation in the data while allowing the latitudinal effect to differ between emergent and submerged species. To assess whether height variability differed between life forms at the spatial scale of analysis, we compared the variance in plant height between emergent and submerged species using Levene’s test for homogeneity of variance.
To examine whether emergent and submerged species differed in their responses to environmental gradients, we included interaction terms between life form and each environmental predictor in a GAM with the linear effects of seasonal water ratio, PC1temp, PC1precip and rICSI as explanatory variables. To account for variation in sampling robustness among grid cells, each grid cell was weighted by species richness in the model. We also verified that the results were robust to omitting the weighting scheme, with parameter estimates and significance remaining largely unchanged. To account for spatial autocorrelation, we included longitude and latitude in the model as a spatial spline. Due to a strong correlation between PC1precip and rICSI (Pearson's correlation coefficient > 0.7), we chose to retain rICSI, which we consider ecologically more important, and removed PC1precip from the model. rICSI was log-transformed to meet the assumption of a log-linear effect on plant performance (Iversen et al. 2019), and all variables were subsequently normalized as Z-scores so that parameter estimates were on the same scale, which eases interpretation. The parameter estimate of each interaction term represents the difference in regression slopes between emergent and submerged species for the variable of interest. Significance was based on the 95% confidence interval of estimated parameter means.
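The predictor preparation can be sketched in Python as follows. The text does not specify the base of the log transformation or the standard deviation used for Z-scores, so this sketch assumes the natural logarithm and the sample standard deviation (which is what R's `scale()` uses). The rICSI values are hypothetical.

```python
import math
from statistics import mean, stdev

def zscore(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    centre = mean(values)
    spread = stdev(values)  # sample SD (n - 1 denominator); an assumption
    return [(v - centre) / spread for v in values]

# Hypothetical rICSI values for three grid cells.
ricsi = [0.1, 1.0, 10.0]

# Natural log is assumed here; the source only says "log-transformed".
log_ricsi = [math.log(v) for v in ricsi]
z_ricsi = zscore(log_ricsi)
```

Because the three example values are evenly spaced on the log scale, the standardized values come out at approximately -1, 0, and 1.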
Site plant height data (containing only submerged plants) were analyzed with a GAM that included a spatial spline (latitude and longitude), using rICSI and temperature (PC1site temp) as explanatory variables.
A Welch t-test was performed to test the height difference between the two inorganic carbon use strategies. All statistical analyses were performed in R software (version 4.1.3).
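For readers unfamiliar with the Welch test, the following Python sketch computes its test statistic and the Welch-Satterthwaite degrees of freedom using only the standard library. It is not the original R code, it does not compute a p-value, and the two height samples are hypothetical.

```python
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, the two groups are not assumed to share
    a common variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / (se2_a + se2_b) ** 0.5
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df

# Hypothetical maximum heights (cm) for two carbon-use groups.
co2_users = [10.0, 12.0, 14.0]
bicarbonate_users = [20.0, 22.0, 24.0, 26.0]
t_stat, dof = welch_t(co2_users, bicarbonate_users)
```

A p-value would follow from comparing `t_stat` with a t distribution with `dof` degrees of freedom, which R's `t.test` reports directly.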
