Models for Rapid estimates of leaf litter chemistry using reflectance spectroscopy
Data files
Apr 22, 2024 version files 69.73 MB
- Models.zip
- README.md
Apr 23, 2024 version files 69.73 MB
- Models.zip
- README.md
Abstract
Measuring the chemical traits of leaf litter is important for understanding plants’ roles in nutrient cycles, including through nutrient resorption and litter decomposition, but conventional leaf trait measurements are often destructive and labor-intensive. Here, we develop and evaluate the performance of partial least-squares regression (PLSR) models that use reflectance spectra of intact or ground leaves to estimate leaf litter traits, including carbon and nitrogen concentration, carbon fractions, and leaf mass per area (LMA). Our analyses included more than 300 samples of senesced foliage from 11 species of temperate trees, including needleleaf and broadleaf species. Across all samples, we could predict each trait with moderate-to-high accuracy from both intact-leaf litter spectra (validation R2 = 0.543-0.941; %RMSE = 7.49-18.5) and ground-leaf litter spectra (validation R2 = 0.491-0.946; %RMSE = 7.00-19.5). Notably, intact-leaf spectra yielded better predictions of LMA. Our results support the feasibility of building models to estimate multiple chemical traits from leaf litter of a range of species. In particular, the success of intact-leaf spectral models allows non-destructive trait estimation in a matter of seconds, which could enable researchers to measure the same leaves over time in studies of nutrient resorption.
README: Models for rapid estimates of leaf litter chemistry using reflectance spectroscopy
This folder contains the partial least-squares regression (PLSR) model coefficients to accompany the paper "Rapid estimates of leaf litter chemistry using reflectance spectroscopy" by Kothari et al. (2024), published in the Canadian Journal of Forest Research. An open version is on bioRxiv, DOI: 10.1101/2023.11.27.568939.
Each set of models for a given trait comprises a .csv file containing 200 models (rows) × coefficients (columns). The 200 models are derived from a jackknife analysis described in the paper. To generate trait estimates using a model set, you can apply them to the data (see below) and use the mean or the full distribution of estimates.
Spectral data preparation
Your spectral data should ideally be processed as described in the paper linked above. At a minimum, the data must be resampled to 1 nm intervals, continuous across the 400-2400 nm range, or across 400-1000 nm (VNIR) or 1300-2400 nm (SWIR) for the restricted-range intact-leaf models. The data should be a matrix in the format samples × wavelengths.
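As a minimal sketch of the resampling requirement, the code below uses base R's approx() to interpolate each spectrum linearly onto a 1 nm grid. The names resample.1nm, raw.wavelengths, and raw.spectra are hypothetical, and the data are simulated. Linear interpolation is an assumption made for illustration and may not match the processing described in the paper.

```r
## hypothetical helper: linearly interpolate each spectrum (row) onto a 1 nm grid
resample.1nm <- function(spec.matrix, wavelengths, new.bands = 400:2400) {
  if (ncol(spec.matrix) != length(wavelengths)) {
    stop("number of columns must match number of wavelengths")
  }
  if (min(wavelengths) > min(new.bands) || max(wavelengths) < max(new.bands)) {
    stop("input wavelengths do not cover the requested range")
  }
  ## apply() returns bands x samples, so transpose back to samples x bands
  out <- t(apply(spec.matrix, 1, function(x) {
    approx(x = wavelengths, y = x, xout = new.bands)$y
  }))
  colnames(out) <- as.character(new.bands)
  rownames(out) <- rownames(spec.matrix)
  out
}

## simulated example (not real data): 3 samples measured every 1.4 nm
set.seed(1)
raw.wavelengths <- seq(350, 2500, by = 1.4)
raw.spectra <- matrix(runif(3 * length(raw.wavelengths)), nrow = 3)
test.spectra <- resample.1nm(raw.spectra, raw.wavelengths)
dim(test.spectra) ## 3 samples x 2001 bands
```

For the restricted-range models, the same helper could be called with new.bands = 400:1000 or new.bands = 1300:2400.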
Kinds of models
Here, I provide four different sets of models:
- A set of models using intact leaf litter spectra from 400-2400 nm to predict LMA and chemical traits (folder Intact)
- A set of models using ground leaf litter spectra from 400-2400 nm to predict LMA and mass-based chemical traits (folder Ground)
- A set of models using intact litter spectra from 400-1000 nm to predict LMA and chemical traits (folder IntactVNIR)
- A set of models using intact litter spectra from 1300-2400 nm to predict LMA and chemical traits (folder IntactSWIR)
The file names should indicate which traits correspond to which files. Abbreviations are as follows:
- LMA: leaf mass per area
- sol: soluble cell contents
- hemi: hemicellulose and bound proteins
- recalc: cellulose, lignin, and other recalcitrant material
- Nmass: nitrogen per unit mass
- Cmass: carbon per unit mass
- Narea: nitrogen per unit area
- Carea: carbon per unit area
Code to build the models is available on GitHub (see Related Works section for link) or as an archived version on Zenodo (DOI: 10.5281/zenodo.10969388). You can find more details about model performance in the paper. Note that models vary tremendously in their performance depending on the trait and the type of tissue whose spectra are used to predict it. Please exercise caution in using models without some sort of validation using conventional trait measurements.
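One way to carry out such a validation, sketched under stated assumptions: if you have conventional measurements for a subset of samples, you can compare them with the model-based estimates. The helper validate.estimates and the example values are hypothetical. Here R2 is computed as the squared Pearson correlation and %RMSE as RMSE relative to the range of measured values; both are assumptions, so check the paper for the exact definitions it uses.

```r
## hypothetical helper: compare model-based estimates with measured values
validate.estimates <- function(measured, estimated) {
  ## drop samples with a missing measurement or estimate
  ok <- complete.cases(measured, estimated)
  measured <- measured[ok]
  estimated <- estimated[ok]
  rmse <- sqrt(mean((measured - estimated)^2))
  c(R2 = cor(measured, estimated)^2,
    RMSE = rmse,
    ## assumption: %RMSE expressed relative to the range of measured values
    percent.RMSE = 100 * rmse / diff(range(measured)))
}

## simulated example values (not real data)
measured.LMA <- c(60, 85, 110, 140, 95)
estimated.LMA <- c(66, 80, 118, 131, 99)
validation.stats <- validate.estimates(measured.LMA, estimated.LMA)
print(validation.stats)
```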
Application
If you have a spectral dataset, you can generate model-based trait estimates by modifying the code below. Call the model trait.model (200 models × 2002 coefficients) and the spectral dataset test.spectra (n samples × 2001 wavelength bands), both arrays. Our models have intercepts, hence the extra coefficient (2002 = 1 + 2001). (For the VNIR and SWIR models, there are correspondingly fewer coefficients.) We can generate estimates with the following code:
## define a function using a tiny bit of matrix algebra
## to apply the coefficients
apply.coefs<-function(coef.matrix,val.spec,intercept=TRUE){
  ## with an intercept, the coefficients have one more column than the spectra
  if(ncol(coef.matrix)!=ncol(val.spec)+intercept){
    stop("spectral matrix has incorrect dimensions")
  }
  if(intercept){
    ## the first column holds the intercepts, one per model
    pred.matrix<-t(t(as.matrix(val.spec) %*% t(coef.matrix[,-1]))+coef.matrix[,1])
  } else {
    pred.matrix<-as.matrix(val.spec) %*% t(coef.matrix)
  }
  return(pred.matrix) ## samples x models
}

## for example; adjust the folder and file name to the model set you need
trait.model<-read.csv("Intact/LMA.csv")
trait.estimates<-apply.coefs(coef.matrix = trait.model,
                             val.spec = test.spectra)
mean.estimates<-rowMeans(trait.estimates)
The object trait.estimates is a samples × models (200) array whose row means constitute the average trait estimate for each sample. Units depend on the kind of trait: mass-based chemical traits are all expressed as percentages of total dry mass (%), while area-based traits and LMA are all expressed in g per m-squared.
Make sure to use models appropriate to the kind of data: intact-leaf models for intact-leaf spectra, and so on. Otherwise, the trait estimates will be very inaccurate.
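To use the full distribution of estimates rather than only the mean, one option is to summarize the 200 estimates per sample with quantiles. The sketch below uses a simulated stand-in for trait.estimates, and the helper summarize.estimates is hypothetical. The interval it reports describes variation among the 200 models only.

```r
## simulated stand-in for trait.estimates (not real data): 4 samples x 200 models
set.seed(42)
trait.estimates <- matrix(rnorm(4 * 200, mean = 100, sd = 5), nrow = 4)

## hypothetical helper: mean and central 95% interval of the estimates per sample
summarize.estimates <- function(est.matrix, probs = c(0.025, 0.975)) {
  ## quantiles across models for each sample (row)
  q <- t(apply(est.matrix, 1, quantile, probs = probs))
  data.frame(mean = rowMeans(est.matrix),
             lower = q[, 1],
             upper = q[, 2])
}

est.summary <- summarize.estimates(trait.estimates)
print(est.summary)
```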
Associated data products
The spectral and trait data used to train and test these models can be found on ECOSIS (link in Related Works section) or with a DOI on Dryad (this submission).
I usually use the CRAN-hosted package spectrolab v. 0.0.18 (working in R 4.2.1) to handle spectral data. In this package, the class spectra allows users to attach and retrieve metadata from spectral data using the function meta(). Below, you can find an example script that reads a .csv file, like our archived data, and turns it into an R spectra object. (Alternatively, the columns in the .csv file corresponding to wavelength bands could be converted into a matrix without the intermediate step of creating a spectra-class object.)
library(spectrolab)
spec_df<-read.csv("mydata.csv")
name_var<-1 ## index for the column that contains sample names
meta_vars<-2:20 ## adjust as needed: indices for columns that contain metadata (including traits)
band_names<-400:2400 ## wavelengths of spectral bands corresponding to remaining columns
## you can also use the as_spectra command, but it's a bit more finicky
## with data frames because the column names of bands must contain a letter
spec<-spectra(value = spec_df[,-c(name_var,meta_vars)],
              bands = band_names,
              names = spec_df[,name_var],
              meta = spec_df[,meta_vars])
test.spectra<-as.matrix(spec) ## this matrix can be used in apply.coefs() above
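If you plan to use the restricted-range intact-leaf models, the spectral matrix has to be cut down to the matching bands first. The sketch below shows one way to do this in base R, assuming the column names of the matrix are the wavelengths in nm. The helper select.bands is hypothetical and the data are simulated.

```r
## hypothetical helper: keep only the bands within a wavelength range,
## assuming the column names of the matrix are the wavelengths in nm
select.bands <- function(spec.matrix, min.wl, max.wl) {
  if (is.null(colnames(spec.matrix))) {
    stop("matrix needs column names giving the wavelengths")
  }
  wl <- suppressWarnings(as.numeric(colnames(spec.matrix)))
  if (anyNA(wl)) {
    stop("column names must be numeric wavelengths")
  }
  spec.matrix[, wl >= min.wl & wl <= max.wl, drop = FALSE]
}

## simulated example (not real data): 2 samples x 2001 bands (400-2400 nm)
full.spectra <- matrix(runif(2 * 2001), nrow = 2,
                       dimnames = list(c("a", "b"), as.character(400:2400)))
vnir.spectra <- select.bands(full.spectra, 400, 1000)  ## 601 bands
swir.spectra <- select.bands(full.spectra, 1300, 2400) ## 1101 bands
```

With intercepts included, the matching coefficient files would then be expected to have 602 and 1102 columns, which is what the dimension check in apply.coefs() tests.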
Maintenance and questions
Please contact Shan Kothari at shan.kothari [at] umontreal [dot] ca or quercusacerifolia [at] gmail [dot] com with any questions.
Methods
This repository contains coefficients for partial least-squares regression models trained to predict leaf litter traits from spectra of intact or ground leaves. Models for intact leaves were trained either on the full spectrum (400-2400 nm) or subsets (visible and near-infrared, 400-1000 nm; short-wave infrared, 1300-2400 nm).