Estimating global GPP from the plant functional type perspective using a machine learning approach
Data files
Mar 28, 2023 version files 13.06 GB
- ECGC_GPP_monthly_005_1999.nc
- ECGC_GPP_monthly_005_2000.nc
- ECGC_GPP_monthly_005_2001.nc
- ECGC_GPP_monthly_005_2002.nc
- ECGC_GPP_monthly_005_2003.nc
- ECGC_GPP_monthly_005_2004.nc
- ECGC_GPP_monthly_005_2005.nc
- ECGC_GPP_monthly_005_2006.nc
- ECGC_GPP_monthly_005_2007.nc
- ECGC_GPP_monthly_005_2008.nc
- ECGC_GPP_monthly_005_2009.nc
- ECGC_GPP_monthly_005_2010.nc
- ECGC_GPP_monthly_005_2011.nc
- ECGC_GPP_monthly_005_2012.nc
- ECGC_GPP_monthly_005_2013.nc
- ECGC_GPP_monthly_005_2014.nc
- ECGC_GPP_monthly_005_2015.nc
- ECGC_GPP_monthly_005_2016.nc
- ECGC_GPP_monthly_005_2017.nc
- ECGC_GPP_monthly_005_2018.nc
- ECGC_GPP_monthly_005_2019.nc
- README.md
Abstract
Long-term monitoring of gross primary production (GPP) is crucial for assessing the carbon cycle of terrestrial ecosystems. In this study, a widely used machine learning model, Random Forest (RF), was used to reconstruct a global GPP dataset named ECGC_GPP. The model distinguished nine plant functional types (PFTs), including C3 and C4 crops, and was trained on eddy fluxes, meteorological variables, and leaf area index (LAI). Based on ERA5_Land and the corrected GEOV2 data, a global monthly GPP dataset at 0.05-degree resolution was estimated for 1999 to 2019. The RF model explained 74.81% of the monthly variation of GPP in the testing dataset, of which the average contribution of LAI reached 41.73%. Mean annual GPP during 1999–2019 was 117.14 ± 1.51 Pg C yr⁻¹ (mean ± standard deviation), with an upward trend of 0.21 Pg C yr⁻² (p < 0.01). Using the PFT classification reduced the underestimation of cropland GPP. ECGC_GPP therefore provides reasonable global spatial patterns and long-term trends of annual GPP.
Methods
We harmonized the ERA5_Land and corrected GEOV2 datasets to a common 0.05-degree spatial resolution and monthly time step. The meteorological and remote sensing datasets were then classified by the eight PFTs so that GPP could be estimated separately for each PFT. Because C3 and C4 crops differ significantly, we trained separate site-level models for CRO_C3 and CRO_C4. Cropland (CRO) cells are a mixture of CRO_C3 and CRO_C4, so both trained models were applied to every CRO cell, and their outputs were multiplied by the respective proportions of C3 and C4 crops to generate the final CRO GPP estimate. This design is intended to reduce the underestimation of GPP over CRO_C4-dominated regions. In this way, we generated a global monthly GPP dataset at 0.05-degree resolution (ECGC_GPP) for 1999 to 2019.
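The cropland step can be illustrated with a minimal sketch. It assumes that the two proportion-weighted model outputs are summed to give the cell value, which the description implies but does not state outright. The function name `blend_cropland_gpp` and its arguments are hypothetical and are not taken from the dataset or its code.

```python
def blend_cropland_gpp(gpp_c3, gpp_c4, frac_c3, frac_c4):
    """Combine C3 and C4 crop model outputs for one cropland (CRO) cell.

    gpp_c3, gpp_c4   : GPP predicted by the CRO_C3 and CRO_C4 models
                       (same units, same month, same cell).
    frac_c3, frac_c4 : proportions of C3 and C4 crops in the cell.

    Assumption: the final CRO value is the sum of the two
    proportion-weighted predictions.
    """
    if frac_c3 < 0 or frac_c4 < 0:
        raise ValueError("crop proportions must be non-negative")
    return gpp_c3 * frac_c3 + gpp_c4 * frac_c4


# Illustrative numbers only, not values from the dataset:
# a cell with 75% C3 crops and 25% C4 crops.
cell_gpp = blend_cropland_gpp(100.0, 200.0, 0.75, 0.25)
print(cell_gpp)  # 125.0
```

With this weighting, a cell dominated by C4 crops takes most of its value from the CRO_C4 model, which is the mechanism the text credits for reducing underestimation in those regions.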
Usage notes
The ECGC_GPP dataset is stored in NetCDF (.nc) format, with one file per year, and can be opened using MATLAB or Python.
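As a starting point in Python, the sketch below builds the yearly file names listed under Data files and opens one of them. It assumes the third-party `xarray` package (with a NetCDF backend such as `netCDF4`) is installed. The helper names `ecgc_filename` and `open_ecgc_year` are hypothetical, and the variable and dimension names inside the files are not documented here, so the example only prints the dataset summary for inspection.

```python
from pathlib import Path

FIRST_YEAR, LAST_YEAR = 1999, 2019  # period covered by the file list


def ecgc_filename(year):
    """Return the file name for one year, following the listed pattern."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"year must be in {FIRST_YEAR}..{LAST_YEAR}")
    return f"ECGC_GPP_monthly_005_{year}.nc"


def open_ecgc_year(data_dir, year):
    """Open one yearly file with xarray (assumed to be installed)."""
    import xarray as xr  # imported lazily; third-party dependency

    return xr.open_dataset(Path(data_dir) / ecgc_filename(year))


# Usage (the directory is a placeholder for your download location):
# ds = open_ecgc_year("path/to/ECGC_GPP", 2005)
# print(ds)  # lists the variables, dimensions, and attributes in the file
```

Printing the dataset first is the safest way to find the actual GPP variable name, units, and any scale factor before doing calculations.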