Global land use change and its impact on greenhouse gas emissions
Data files
Dec 06, 2024 version files (8.78 MB)
- algorithm.zip (8.77 MB)
- README.md (4.39 KB)
Abstract
We synthesized 29 years of global historical data from the Food and Agriculture Organization of the United Nations (FAO) and World Bank and summarized global land use change and its implication for global GHG emissions. The land use types include artificial surface (i.e., any type of land with a predominant human-made structure), cropland, pasture (including both natural and cultivated), barren land, and forest. The goal was to combine empirical analysis, through structural equation modeling, with predictive modeling using deep learning, to understand and forecast the impact of land use decisions on GHG emissions. More specifically, we first established and validated causal relationships between areas of different land use types and global GHG emissions. This was achieved through structural equation modeling using the historical dataset consisting of 33,234 data points from 1992 to 2020. Then, we employed a deep learning approach to leverage the extensive historical data across various land use types, from the lowest to the highest GHG emitting land, to predict potential future GHG emissions under different land use scenarios from 2021 to 2050. By estimating GHG emissions for various future land use scenarios, our study intended to offer a projection approach that could assist in planning effective climate change mitigation strategies. These projections are important for developing strategies that balance sustainability with climate change mitigation.
README: Global land use change and its impact on greenhouse gas emissions
https://doi.org/10.5061/dryad.4j0zpc8n3
Description of the data and file structure
Historical data for the period of 1992–2020 were obtained from the Food and Agriculture Organization of the United Nations (FAO) and the World Bank. The data were available by country and year. We compiled land use (artificial surface, cropland, pasture, barren land, forest) areas and greenhouse gas (GHG) emissions for 191 countries and territories, capturing 96–97% of global GHG emissions. The variables are defined as follows:
- Artificial surface: any type of area with a predominantly human-made structure.
- Cropland: land used for the cultivation of crops.
- Pasture: land used permanently, i.e., five years or more, to grow herbaceous forage crops naturally or through cultivation, e.g., natural prairie and grasslands or cultivated grazing land.
- Barren land: land dominated by natural abiotic surfaces, e.g., bare soil, sand, rock, with natural vegetation cover of less than 2%.
- Forest: land spanning more than 0.5 ha with trees higher than 5 m and a canopy cover of more than 10%, excluding land that is predominantly under agricultural or urban land use.
- GHG: CO2 totals excluding short-cycle biomass burning, e.g., agricultural waste and savanna burning, but including other biomass burning, e.g., forest fires, post-burn decay, peat fires, and decay of drained peatlands, plus all anthropogenic N2O, CH4, and fluorinated gas totals.
Land use areas are reported in billion ha, and GHG emissions in GtCO2eq.
Model Architecture
The model used for the analysis is a recurrent neural network (RNN) based on a long short-term memory (LSTM) architecture. It comprises four LSTM layers, with the first two layers consisting of 32 units each, followed by two layers with 16 units each. Each LSTM layer uses a ReLU activation function and includes a 20% dropout rate to reduce overfitting. The final layers include a dense (fully connected) layer with 16 units and ReLU activation, followed by an output layer with a single neuron to predict the target variable.
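The layer stack described above can be sketched in Keras as follows. This is a minimal illustration, not the code shipped in algorithm.zip: the function name `build_lstm_model`, the input shape (`timesteps`, `n_features`), and applying the 20% dropout through the `dropout` argument of each LSTM layer (rather than separate Dropout layers) are all assumptions. The optimizer and loss are not stated in this README, so the model is left uncompiled.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_lstm_model(timesteps=1, n_features=5):
    """Four stacked LSTM layers (32, 32, 16, 16) plus a dense head.

    timesteps and n_features are hypothetical defaults; the real input
    shape depends on how input.csv is windowed in the notebook.
    """
    return keras.Sequential([
        keras.Input(shape=(timesteps, n_features)),
        # The first three LSTM layers return full sequences so that the
        # next LSTM layer receives a 3-D (batch, time, units) tensor.
        layers.LSTM(32, activation="relu", dropout=0.2, return_sequences=True),
        layers.LSTM(32, activation="relu", dropout=0.2, return_sequences=True),
        layers.LSTM(16, activation="relu", dropout=0.2, return_sequences=True),
        # The last LSTM layer returns only its final state.
        layers.LSTM(16, activation="relu", dropout=0.2),
        layers.Dense(16, activation="relu"),
        # Single output neuron for the predicted GHG value.
        layers.Dense(1),
    ])
```

Calling `build_lstm_model(timesteps=3, n_features=5).summary()` prints the layer shapes and parameter counts for that assumed input shape.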
Training Procedure
A stratified 10-fold cross-validation approach was employed to train and validate the model across the dataset. The data were split into training and testing subsets for each fold. The model was trained for a maximum of 200 epochs with a batch size of 16, using early stopping and learning rate reduction callbacks to prevent overfitting and optimize convergence. The final prediction was obtained from an ensemble of 10 models, each being the best-performing model from one cross-validation fold. Aggregating predictions from multiple models is intended to make the final prediction more robust than that of any single model.
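The fold assignment and ensemble aggregation can be illustrated with a short NumPy sketch. Two points are assumptions rather than details confirmed by this README: strata are formed by quantile-binning the continuous target, and the ensemble prediction is the plain mean of the fold models' predictions. The function names `stratified_fold_ids` and `ensemble_mean` are hypothetical.

```python
import numpy as np


def stratified_fold_ids(y, n_splits=10, n_bins=10, seed=0):
    """Assign each sample a fold id so that every fold spans the range of y."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    # Assumption: strata are quantile bins of the continuous target.
    edges = np.quantile(y, np.linspace(0, 1, n_bins + 1)[1:-1])
    strata = np.digitize(y, edges)
    fold_ids = np.empty(len(y), dtype=int)
    for s in np.unique(strata):
        # Shuffle the samples of this stratum, then deal them out to
        # the folds in round-robin order.
        idx = rng.permutation(np.flatnonzero(strata == s))
        fold_ids[idx] = np.arange(len(idx)) % n_splits
    return fold_ids


def ensemble_mean(fold_predictions):
    """Average the predictions of the per-fold models (assumed aggregation)."""
    return np.mean(np.stack(fold_predictions, axis=0), axis=0)
```

For fold `k`, the samples with `fold_ids == k` form the test subset and the remaining samples form the training subset. After training, the predictions of the ten saved models would be passed to `ensemble_mean` as a list of arrays.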
Software and Libraries
The model was implemented using the TensorFlow and Keras libraries in Python. The required libraries are listed at the beginning of the script.
Files and variables
File: algorithm.zip
Description:
input.csv is the input dataset for modeling [The unit for land use areas is billion ha (Data source: FAO), and GtCO2eq for greenhouse gas emissions (Data source: World Bank)];
predicting model.ipynb is the Python code for the deep learning model;
best_model folds 1-10.keras are the trained deep learning models;
future_ghg predictions sets 1 and 2.csv are the predicted values (land use areas are the hypothesized future scenarios; GHG is the predicted value under each scenario; the units are the same as in input.csv);
loss function plot.jpg and observed vs predicted.jpg are the visualizations of the model performance;
SEM.ipynb is the Python code for structural equation modeling;
results.csv and stats.csv are the output of the structural equation modeling;
SEM.jpg is the visualization of the structural equation model.
Code/software
Structural equation modeling was conducted using the Structural Equation Models Optimization in Python (semopy; Python 3.12) to quantify the effects of land use on greenhouse gas (GHG) emissions.
We used the Long Short-Term Memory (LSTM) based Recurrent Neural Network (RNN) as the algorithm for modeling (Keras in TensorFlow; Python 3.12).
Access information
Data was derived from the following sources:
- Food and Agriculture Organization of the United Nations (FAO)
- World Bank
Global total greenhouse gas emission data were downloaded from World Bank (https://data.worldbank.org/indicator?tab=all) on May 3, 2024.
FAOSTAT Land use/land cover data: https://openknowledge.fao.org/items/6c17080e-6c5d-47ae-a982-609c882bd4e7
FAO, 2024. FAOSTAT Land Use dataset, available at https://www.fao.org/faostat/en/#data/RL and https://www.fao.org/faostat/en/#data/LC. FAO, Rome, Italy. Downloaded on May 5, 2024.
Acknowledgment
I thank World Bank for the publicly available database. I thank FAO for the publicly available database and Dr. Francesco Tubiello for guidance on the use of FAO data.
Methods
Historical data for the period of 1992–2020 were obtained from the Food and Agriculture Organization of the United Nations (FAO) and the World Bank. The data were available by country and year. We compiled land use (artificial surface, cropland, pasture, barren land, forest) areas and greenhouse gas (GHG) emissions for 191 countries and territories.
Structural equation modeling was conducted using the Structural Equation Models Optimization in Python (semopy; Python 3.12) to quantify the effects of land use on GHG emissions.
We used the Long Short-Term Memory (LSTM) based Recurrent Neural Network (RNN) as the algorithm for modeling (Keras in TensorFlow; Python 3.12). The historical data were used for model training and testing. Based on the data-driven relationships learned from the historical data, we predicted future GHG emissions from 2021 to 2050 under two hypothesized scenarios, using future land use areas as predictors.