Data from: Toward spatio-temporal models to support national-scale forest carbon monitoring and reporting

Shannon, Elliot1; Finley, Andrew1; May, Paul2; Domke, Grant3; Andersen, Hans-Erik3; Gaines, George3; Banerjee, Sudipto4

Published Mar 12, 2025 on Dryad. https://doi.org/10.5061/dryad.4mw6m90kf

Data files

Mar 12, 2025 version files 924.09 MB

Abstract

National forest inventory (NFI) programs provide vital information on forest parameters’ status, trend, and change. Most NFI designs and estimation methods are tailored to estimate status over large areas but are not well suited to estimate trend and change, especially over small spatial areas and/or over short time periods (e.g., annual estimates). Fine-scale space-time indexed estimates are critical to a variety of environmental, ecological, and economic monitoring efforts. In the United States, for example, NFI data are used to estimate forest carbon status, trend, and change to support national, state, and local user group needs. Increasingly, these users seek finer spatial and temporal scale estimates to evaluate existing land use policies and management practices, and to inform future activities. Here we propose a spatio-temporal Bayesian small area estimation modeling framework that delivers statistically valid estimates with complete uncertainty quantification for status, trend, and change. The framework accommodates a variety of space and time dependency structures, and we detail model configurations for different settings. The proposed framework is used to quantify forest carbon dynamics at an annual, county-level resolution across a 14-year period for the contiguous United States. Also, using an analysis of simulated data, we compare the proposed framework with traditional NFI estimators and offer computationally efficient algorithms, software, and data to reproduce results for benchmarking.

Here we provide the simulated and FIA datasets along with R code to fit candidate models and produce model output summaries.

Simulated datasets

conus_pop_plots.tar.gz is the compressed conus_pop_plots.csv file that holds plot-level population values. Columns are:

state: State name.
county: County name.
fips: State and county code.
ha: County area (hectares).
poly_id: Unique county code.
pop_id: Unique plot code.
tcc_2008 - tcc_2021: NLCD tree canopy cover (%) by year.
n_2008 - n_2021: Number of plots in the county by year.
carbon_2008 - carbon_2021: Simulated carbon values (Mg/ha) by year.
geometry: Simple feature point coordinates.
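
The range notation above (e.g., tcc_2008 - tcc_2021) stands for one column per year. As a minimal sketch, assuming the column names follow the exact prefix_year pattern with one column for each year from 2008 through 2021, the names can be generated as follows. The helper year_columns is hypothetical and is not part of the distributed code.

```python
def year_columns(prefix, first_year=2008, last_year=2021):
    """Return per-year column names, e.g. tcc_2008, ..., tcc_2021 (inclusive).

    Assumption: columns are named <prefix>_<year>, one per year.
    """
    return [f"{prefix}_{year}" for year in range(first_year, last_year + 1)]


# Column names for the three yearly variables described above.
tcc_cols = year_columns("tcc")
n_cols = year_columns("n")
carbon_cols = year_columns("carbon")
```

Each list then holds 14 names, which can be used to select the yearly columns after reading conus_pop_plots.csv.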

conus_pop_counties.csv is a file that holds county-level population values computed from conus_pop_plots.csv. Columns are:

state: State name.
county: County name.
fips: State and county code.
ha: County area (hectares).
poly_id: Unique county code.
n_plots - n_2021: Number of plots observed in the actual FIA data by year.
carbon_2008 - carbon_2021: Simulated carbon values (Mg/ha) averaged over plot values by year.
tcc_2008 - tcc_2021: NLCD tree canopy cover (%) averaged over plot values by year.
geometry: Simple feature polygon geometry.

conus_rep_1_counties.csv is a file that holds county-level direct estimates computed using the first replicate's sample data drawn from conus_pop_plots.csv. Columns are:

poly.id: Unique county code.
year: Year.
state: State name.
county: County name.
fips: State and county code.
ha: County area (hectares).
n: Sample size.
hat.mu: Direct estimate of the mean for carbon (Mg/ha).
hat.sigma.sq: Standard error of hat.mu.
tcc: County-level mean NLCD tree canopy cover (%).
mu.true: Population carbon (Mg/ha) which matches carbon_2008 - carbon_2021 in conus_pop_counties.csv.
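
To illustrate what a county-level direct estimate and its standard error look like, here is a minimal sketch that assumes simple random sampling of plots within a county and year. The estimator actually used to build this file is defined in the paper and may differ. The function direct_estimate and the plot values are made up for illustration.

```python
import math
import statistics


def direct_estimate(values):
    """Sample mean and standard error of the mean.

    Assumption: plots are a simple random sample within the county-year.
    Returns (None, None) with no plots and (mean, None) with one plot,
    because the sample variance is undefined in that case.
    """
    n = len(values)
    if n == 0:
        return None, None
    mean = statistics.fmean(values)
    if n < 2:
        return mean, None
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, se


# Made-up carbon values (Mg/ha) for the plots sampled in one county-year.
plots = [40.0, 55.0, 61.0, 48.0]
hat_mu, se = direct_estimate(plots)
```

Comparing such an estimate with mu.true for the same county and year gives the estimation error for that replicate.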

FIA datasets

conus_counties.geojson is a spatial polygon file that holds county-level values. Columns include:

state: State name.
county: County name.
fips: State and county code.
tcc_2008 - tcc_2021: County-level NLCD tree canopy cover (%) by year.
hat_mu_2008 - hat_mu_2021: Direct estimate of the mean for carbon (Mg/ha) from observed FIA plot data by year.
hat_sigma_sq_2008 - hat_sigma_sq_2021: Standard error of hat_mu_2008 - hat_mu_2021 by year.
n_2008 - n_2021: Sample size by year.
geometry: Simple feature polygon geometry.

conus_carbon_county_data.csv is a file that holds annual county-level direct estimates from observed FIA plot data. Rows are ordered by year (2008 - 2021) within county, following the stacking described in Section S4 of the paper. County order matches that of conus_counties.geojson. Columns are:

n: Sample size.
hat.mu: Direct estimate of the mean for carbon (Mg/ha).
hat.sigma.sq: Standard error of the mean.
tcc: County-level NLCD tree canopy cover (%).
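
Because this file has no county or year column, rows must be matched to counties by position. The sketch below shows the index arithmetic implied by the year-within-county ordering. It assumes every county contributes exactly 14 rows (one per year, including rows with missing values) and uses 0-based indices. The function names are hypothetical.

```python
FIRST_YEAR = 2008
N_YEARS = 14  # 2008 through 2021


def row_index(county_idx, year):
    """0-based row for a county (0-based position in the geojson) and a year."""
    return county_idx * N_YEARS + (year - FIRST_YEAR)


def county_year(row):
    """Inverse mapping: 0-based row -> (0-based county position, year)."""
    county_idx, offset = divmod(row, N_YEARS)
    return county_idx, FIRST_YEAR + offset


# The first county occupies rows 0-13, the second rows 14-27, and so on.
first_row_second_county = row_index(1, 2008)
last_row_second_county = row_index(1, 2021)
```

For example, row 27 maps back to the second county (index 1) in 2021.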

conus_car_D_matrix.txt is a file that holds the diagonal values for the diagonal matrix D defined in Section 2.2. Ordering matches county order in conus_counties.geojson.

conus_car_W_matrix.txt is a file that holds the values for the matrix W defined in Section 2.2. Ordering matches county order in conus_counties.geojson.
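
As a rough illustration of how D and W are typically combined, the sketch below builds a conditional autoregressive (CAR) precision matrix of the common form D - rho * W for a toy three-county map. This assumes W is a binary adjacency matrix and D holds the number of neighbors of each county. The exact definitions are given in Section 2.2 of the paper and should be checked there. The toy matrix, the value of rho, and the function car_precision are illustrative only.

```python
# Toy adjacency for three counties in a row: 0 - 1 - 2.
W = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Assumption: the diagonal of D is the neighbor count (row sum of W).
d = [sum(row) for row in W]


def car_precision(W, d, rho):
    """Return D - rho * W as a nested list (unscaled CAR precision)."""
    n = len(W)
    return [
        [(d[i] if i == j else 0.0) - rho * W[i][j] for j in range(n)]
        for i in range(n)
    ]


Q = car_precision(W, d, rho=0.9)
```

For the full set of counties a sparse representation would be preferable to nested lists.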

Missing values

NA values in the simulated and FIA data represent missing values (i.e., no data collected for the given county and year).
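
When reading the CSV files outside R, the literal string NA needs to be converted explicitly. The following minimal sketch uses a small made-up table with the same column names as conus_carbon_county_data.csv. The values and the helper parse_value are for illustration only.

```python
import csv
import io


def parse_value(text):
    """Convert a CSV field to float, mapping the string 'NA' to None."""
    return None if text == "NA" else float(text)


# Made-up rows; the second row mimics a county-year with no data collected.
sample = "n,hat.mu,hat.sigma.sq,tcc\n5,51.0,4.5,38.2\nNA,NA,NA,40.1\n"

rows = [
    {name: parse_value(value) for name, value in record.items()}
    for record in csv.DictReader(io.StringIO(sample))
]

# Keep only county-years that have a direct estimate.
observed = [row for row in rows if row["hat.mu"] is not None]
```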

Code

Files fullmodel.R, submodel_1.R, and submodel_2.R implement the MCMC samplers for the Full model, Submodel 1, and Submodel 2 defined in Section 2.2 and described in Section S4.

mkREX.cpp and mkREX.R contain code that efficiently creates sparse design matrices; they are used by fullmodel.R, submodel_1.R, and submodel_2.R. With R build tools installed, mkREX.cpp can be built into a .so or .dll shared library using R CMD SHLIB mkREX.cpp.

Running code

For illustration, fullmodel.R, submodel_1.R, and submodel_2.R are set up to run one chain, save the resulting MCMC samples, and create the summaries used for tables and figures in the paper. By setting the type variable in these files, you can fit the models to either the simulated or the actual FIA data. To run, first create a shared library from mkREX.cpp. The code is currently set up to load mkREX.so; on Windows, adjust it so that mkREX.dll is created and loaded instead. Then create a directory called "samples" to receive the resulting MCMC samples. The R Markdown files Simulated_summaries_and_figures.Rmd and FIA_summaries_and_figures.Rmd can be used to summarize the MCMC samples for the first replicate's data and the actual FIA data, respectively.

Sharing/Access information

Data were derived from the following sources:

Housman, I., Schleeweis, K., Heyer, J., Ruefenacht, B., Bender, S., Megown, K., Goetz, W., and Bogle, S. (2023). National land cover database tree canopy cover methods v2021.4.611. GTAC-10268-RPT1. Salt Lake City, UT: U.S. Department of Agriculture, Forest Service, Geospatial Technology and Applications Center.

USDA Forest Service (2019). Forest Inventory and Analysis database. St. Paul, MN: USDA Forest Service, Northern Research Station. https://doi.org/10.2737/RDS-2001-FIADB