Data and Code from: Sheltered or suppressed? Tree regeneration in unmanaged European forests
Data files
Aug 07, 2023 version files (2.29 MB)
- data.zip (2.23 MB)
- README.md (5.78 KB)
- scripts.zip (60.30 KB)
Jun 30, 2026 version files (2.29 MB)
- CHANGELOG_correction.md (1.62 KB)
- data_corrected.zip (2.22 MB)
- make-public-data.R (3.98 KB)
- README.md (11.92 KB)
- scripts_corrected.zip (54.71 KB)
Abstract
Tree regeneration is a key demographic process influencing long-term forest dynamics. It is driven by climate, disturbances, biotic factors, and their interactions. Thus, predictions of tree regeneration are challenging due to complex feedbacks along the wide climatic gradients covered by most tree species. We developed statistical models based on >50,000 recruitment events observed for 24 tree species in an extensive permanent plot network (6,540 plots from 299 unmanaged European temperate, boreal, and subalpine forests) covering a wide climatic gradient.
This repository stores the CSV files with response and explanatory variables and the R scripts needed to run the analysis and create the figures presented in the accompanying article, "Sheltered or suppressed? Tree regeneration in unmanaged European forests", published in Journal of Ecology in 2023.
Our study shows that competition between trees decreases systematically towards climatic stress, but that this decrease depends on species' tolerance to climatic stress and shade. These findings explain within- and between-species differences in tree recruitment patterns in European temperate forests. Moreover, our findings imply that projections of forest dynamics along wide climatic gradients and under climate change must accommodate both competition and positive interactions, as they strongly affect rates of community turnover.
This repository contains the processed data and analysis code for the article "Sheltered or suppressed? Tree regeneration in unmanaged European forests" (Journal of Ecology). The data files are the inputs to scripts/fit_models.R, which fits the recruitment (regeneration) models and produces the figures and the coefficient tables reported in the paper.
Version note: data correction
This version replaces the processed data files to correct an anonymisation error in the original deposit. When the full data were anonymised for publication, the plot identifiers (unique_plot_id2) in site_data_unique_plot_id2.csv were renumbered using a different indexing than the stand-variable tables. Because the DBH 7 cm and DBH 10 cm analyses are based on different plot subsets (701 and 865 plots respectively, DBH 7 cm being a subset of DBH 10 cm), the shared site_data file ended up keyed to the DBH 10 cm plot numbering, so the seasonal climate predictors (dds, swb) were attached to the wrong plots in the DBH 7 cm analysis. As a consequence, re-fitting the models from the originally deposited data did not reproduce the published recruitment count-model coefficients (Appendix Table A4) or the response-curve figures (Figs 3–4).
The analysis code and the published results are unaffected — only the deposited public data were incorrect. The corrected files use a single, consistent plot/site/institute identifier mapping across both DBH thresholds and the climate data. Re-running scripts/fit_models.R on the corrected data exactly reproduces the published coefficients, selected model types and figures for both DBH thresholds (verified for all 23 species). The data-anonymisation script (scripts/data_preparation/make-public-data.R) is now included and has been corrected accordingly.
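The identifier mapping can be spot-checked before fitting. The following is a minimal sketch in base R, not the authors' code: it uses two small invented tables in place of site_data_unique_plot_id2.csv and a stand-variable table, and a hypothetical helper `join_site_data()` that stops if a plot ID is duplicated in the site table or missing from it. A check of this kind only catches missing or duplicated keys; on its own it cannot reveal predictors attached to the wrong plot.

```r
# Invented stand-ins for the deposited tables (values are illustrative only).
site <- data.frame(
  unique_plot_id2 = c(1L, 2L, 3L),
  dds = c(1500, 1720, 1310),
  swb = c(40, -15, 95)
)
stand <- data.frame(
  species = c("Fagus sylvatica", "Picea abies", "Fagus sylvatica"),
  unique_plot_id2 = c(1L, 2L, 3L),
  r.trees = c(0L, 3L, 1L)
)

# Hypothetical helper: attach dds and swb to each stand row by plot ID,
# stopping if the key is duplicated in `site` or absent from it.
join_site_data <- function(stand, site, key = "unique_plot_id2") {
  if (anyDuplicated(site[[key]]) > 0) {
    stop("duplicated plot IDs in the site table")
  }
  unmatched <- setdiff(stand[[key]], site[[key]])
  if (length(unmatched) > 0) {
    stop("plot IDs without site data: ", paste(unmatched, collapse = ", "))
  }
  merge(stand, site, by = key, all.x = TRUE, sort = FALSE)
}

joined <- join_site_data(stand, site)
print(joined)
```

With the real files, `site` and `stand` would instead be read from data/processed_data/, for example with `read.csv()`.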
Files
This version of the dataset contains the files below. The first-version archives (data.zip, scripts.zip) have been replaced by data_corrected.zip and scripts_corrected.zip; the originals remain available under the first version of this dataset on Dryad.
| File | Status | Contents |
|---|---|---|
| data_corrected.zip | Corrected (use this) | The data/ folder: corrected processed data plus the raw inputs needed to run the analysis. |
| scripts_corrected.zip | Corrected (use this) | The scripts/ folder: analysis and data-preparation code (including the corrected make-public-data.R). |
| make-public-data.R | Documentation | The corrected data-anonymisation script, provided as a standalone file for convenience (identical to the copy inside scripts_corrected.zip). |
| CHANGELOG_correction.md | Documentation | Short description of the data correction made in this version. |
Data (data_corrected.zip)
| File | Description |
|---|---|
| data/processed_data/site_data_unique_plot_id2.csv | Environmental predictors (degree-day sum and site water balance) per plot. Input to scripts/fit_models.R. |
| data/processed_data/stand_variables_dbh_7.csv, data/processed_data/stand_variables_dbh_10.csv | Leaf area index (LAI_tot), other stand predictors, and the number of recruited trees per inventory period and species. The dbh_7 / dbh_10 suffix denotes a measurement threshold of 7 cm or 10 cm diameter at breast height (DBH), respectively. Input to scripts/fit_models.R. |
| data/raw_data/ | Files needed to produce the figures of the manuscript. |
| data/raw_data/expectations.csv | Encodes the Stress Gradient Hypothesis expectations illustrated in Figure 1. Each row is a species strategy under high/low leaf area index. stol = stress tolerance (0 = high, 1 = low); comp = competitive/shade tolerance (0 = low shade tolerance, 1 = high); fav = relative regeneration intensity under favorable conditions (left-hand y position); stress = relative regeneration intensity under stressful conditions (right-hand y position); lai = leaf area index for which the strategy is specified; pannel = panel of Figure 1 in which the line is drawn. |
| data/raw_data/lhs_species.csv | Life-history / tolerance annotations used only for plotting (life-history symbols). Not used in model fitting. |
| data/raw_data/interpretation/icons/ | Icons (processed with GIMP) used in the figures. Not needed for the statistical analysis, but required to run scripts/fit_models.R without error. |
Scripts (scripts_corrected.zip)
| File | Description |
|---|---|
| scripts/fit_models.R | Fits the recruitment models, performs model selection, and creates the figures and coefficient tables in the paper. |
| scripts/functions/my_theme.R | ggplot2 theme used by the figures. |
| scripts/data_preparation/ | Scripts used to derive the model variables from source data. The upstream raw inputs include very large files from CHELSA (https://chelsa-climate.org), the ISRIC Data Hub (https://data.isric.org), and EU-DEM (https://land.copernicus.eu/imagery-in-situ/eu-dem/eu-dem-v1.1); their origin is described in scripts/data_preparation/site_level_data. |
| scripts/data_preparation/make-public-data.R | Anonymises the full processed data into the deposited public data (corrected version; see the Version note above). |
Variable descriptions
site_data_unique_plot_id2.csv
- unique_plot_id2: anonymised ID of the forest sampling plot (integer).
- dds: seasonal degree-day sum (DDS), the integral under the temperature curve ignoring temperatures below 5.5 °C (Allen 1976; Fischlin et al. 1995). Unit: °C.
- swb: site water balance (SWB; cf. Speich 2019); the amount of water that can be stored in the soil. Calculation detailed in Appendix A1 of the article. Unit: mm.
stand_variables_dbh_7.csv, stand_variables_dbh_10.csv
The dbh_7 / dbh_10 suffix denotes a DBH measurement threshold of 7 cm or 10 cm respectively.
- species: tree species (scientific name).
- unique_plot_id2: anonymised ID of the forest sampling plot (integer; consistent with site_data_unique_plot_id2.csv).
- site_id: anonymised ID of the site a plot belongs to (integer).
- institute_id: anonymised ID of the data-providing institute (integer).
- year: anonymised index of the forest inventory period within a plot (integer; 1 = first inventory, 2 = second, …).
- period_length: number of years between two consecutive forest inventories. Used as an offset in the models.
- plot_area2: plot area in ha. Used as an offset in the models.
- r.trees: number of recruited trees, i.e. trees of the species that surpassed the DBH measurement threshold (7 or 10 cm) during the inventory period.
- gFolA: species' one-sided foliage area per ha, derived as in Bugmann (1994).
- LAI_tot: sum of gFolA over all species (total leaf area index).
Empty cells indicate missing values (NA); e.g. period_length is empty for the first inventory of each plot, and those rows are excluded from model fitting.
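To illustrate how the missing values and the two offset variables might enter a count model, here is a minimal sketch on invented rows. Several things in it are assumptions rather than the authors' method: the exposure is taken as plot area times period length on the log scale, the models are Poisson GLMs from base R standing in for the negative-binomial glmmTMB models named in the Software section, and the candidate formulas are made up for the example. The actual specification is in scripts/fit_models.R.

```r
# Invented example rows using the column names described above.
stand <- data.frame(
  unique_plot_id2 = 1:8,
  year = c(1L, 2L, 1L, 2L, 2L, 3L, 2L, 3L),
  period_length = c(NA, 10, NA, 8, 12, 10, 9, 11),
  plot_area2 = c(0.05, 0.05, 0.1, 0.1, 0.05, 0.05, 0.1, 0.1),
  r.trees = c(0L, 2L, 0L, 5L, 1L, 0L, 4L, 3L),
  LAI_tot = c(6.1, 5.8, 3.2, 2.9, 6.5, 7.0, 3.5, 4.1)
)

# First inventories have no preceding period, so period_length is NA; drop them.
model_data <- stand[!is.na(stand$period_length), ]

# Assumed exposure: plot area (ha) times period length (years), on the log scale.
model_data$log_exposure <- log(model_data$period_length * model_data$plot_area2)

# Poisson GLMs as a stand-in for the negative-binomial models (assumption).
fit_null <- glm(r.trees ~ 1 + offset(log_exposure),
                family = poisson, data = model_data)
fit_lai <- glm(r.trees ~ LAI_tot + offset(log_exposure),
               family = poisson, data = model_data)

# Compare the two candidates by BIC (lower is better).
bic <- c(null = BIC(fit_null), lai = BIC(fit_lai))
print(bic)
best <- names(bic)[which.min(bic)]
```

Using a log-scale offset means the model describes recruitment per hectare and year, so plots of different size and inventories of different length remain comparable.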
Software
The analysis was run in R. The core model fitting uses the packages
data.table and glmmTMB (negative-binomial GLMs, nbinom2); the
reported estimates are stable across glmmTMB versions. Figure generation
additionally uses ggplot2, ggthemes, cowplot, lemon, magrittr,
colorspace, ggrepel, pROC, grid, png, and ragg. The full list of
required packages is given at the top of scripts/fit_models.R.
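Before running the analysis, it may help to confirm that the packages listed above are installed. This is a small sketch using base R only; `missing_packages()` is a hypothetical helper, and the package vector simply repeats the names given in this section (the list at the top of scripts/fit_models.R remains the authoritative one).

```r
# Packages named in the Software section of this README.
required <- c("data.table", "glmmTMB", "ggplot2", "ggthemes", "cowplot",
              "lemon", "magrittr", "colorspace", "ggrepel", "pROC",
              "grid", "png", "ragg")

# Hypothetical helper: return the subset of `pkgs` whose namespace
# cannot be loaded from the current library paths.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
}
```

Any names reported can then be installed with `install.packages()`.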
Reproducing the analysis
- Extract data_corrected.zip and scripts_corrected.zip into the same working directory (yielding data/ and scripts/ side by side).
- Set the working directory to that folder and run scripts/fit_models.R.
The script reads the files in data/processed_data/ and data/raw_data/, fits the binomial and count recruitment models for each species, selects the best model by BIC, and writes the coefficient tables and figures.
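The steps above can be sketched in R as follows. This is an illustration under stated assumptions, not part of the deposited code: `workdir` is a hypothetical folder into which both archives have been downloaded, `check_layout()` is a hypothetical helper, and the archives are assumed to unpack into top-level data/ and scripts/ folders as described above.

```r
# Hypothetical working directory; replace with the folder holding the archives.
workdir <- file.path(tempdir(), "regeneration_analysis")
dir.create(workdir, showWarnings = FALSE, recursive = TRUE)

# Extract both archives side by side if they are present in workdir.
for (archive in c("data_corrected.zip", "scripts_corrected.zip")) {
  archive_path <- file.path(workdir, archive)
  if (file.exists(archive_path)) utils::unzip(archive_path, exdir = workdir)
}

# Paths that this README says the analysis script relies on.
expected_paths <- c("data/processed_data", "data/raw_data", "scripts/fit_models.R")

# Hypothetical helper: report which expected paths exist below `workdir`.
check_layout <- function(workdir) {
  present <- file.exists(file.path(workdir, expected_paths))
  names(present) <- expected_paths
  present
}

status <- check_layout(workdir)
print(status)

# Run the analysis only when everything is in place.
if (all(status)) {
  old_wd <- setwd(workdir)
  source("scripts/fit_models.R")
  setwd(old_wd)
}
```

Checking the layout first gives a clearer message than a failed file read part-way through the script.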
References
Allen, J. C. (1976). A modified sine wave method for calculating degree days. Environmental Entomology, 5(3), 388–396. https://doi.org/10.1093/ee/5.3.388
Bugmann, H. (1994). On the ecology of mountainous forests in a changing climate: A simulation study [PhD Thesis].
Fischlin, A., Bugmann, H., & Gyalistras, D. (1995). Sensitivity of a forest ecosystem model to climate parametrization schemes. Environmental Pollution, 87(3), 267–282. https://doi.org/10.1016/0269-7491(94)P4158-K
Speich, M. J. R. (2019). Quantifying and modeling water availability in temperate forests: A review of drought and aridity indices. iForest, 12(1), 1–16. https://doi.org/10.3832/ifor2934-011
These data are the processed model inputs for the article "Sheltered or suppressed? Tree regeneration in unmanaged European forests" (Journal of Ecology).
Recruitment (the number of trees surpassing a 7 cm or 10 cm diameter-at-breast-height threshold) and stand variables (e.g. total leaf area index) were derived per forest sampling plot and inventory period from an extensive permanent-plot network of unmanaged European forests. The environmental predictors are:
- dds — seasonal degree-day sum (Allen 1976; Fischlin et al. 1995)
- swb — site water balance (Speich 2019), calculated as described in Appendix A1 of the article
Plot, site, institute and inventory-period identifiers were anonymised (replaced by integer IDs) for publication. Full variable definitions, software requirements, and reproduction steps are given in the README.
Note (this version): the processed data were corrected for a plot-identifier misalignment present in the first deposit, which attached the climate predictors to the wrong plots in the DBH 7 cm analysis. See the README and the change log for details. The analysis code and the published results are unaffected.
Full file descriptions, variable definitions, software requirements, and reproduction steps are provided in the README.
Use the corrected files:
data_corrected.zip — processed model inputs (data/processed_data/) and the raw inputs used to create the figures (data/raw_data/).
scripts_corrected.zip — the analysis code (scripts/fit_models.R and supporting scripts).
The first-version archives (data.zip, scripts.zip) have been replaced by the corrected files above and remain available under the first version of this dataset on Dryad.
To reproduce the analysis: extract data_corrected.zip and scripts_corrected.zip into the same working directory and run scripts/fit_models.R.
Changes after Aug 7, 2023:
29.06.2026 Version update: correction of a plot-identifier misalignment in the processed data.
This version replaces the processed data files (data/processed_data/) to correct an anonymisation error in the original deposit. When the full data were anonymised for publication, the plot identifiers (unique_plot_id2) in site_data_unique_plot_id2.csv were renumbered using a different indexing than the stand-variable tables. Because the DBH 7 cm and DBH 10 cm analyses are based on different plot subsets (701 and 865 plots, respectively), the shared site_data file was keyed to the DBH 10 cm plot numbering, so the seasonal climate predictors (dds, swb) were matched to the wrong plots in the DBH 7 cm recruitment analysis. As a consequence, re-fitting the models from the originally deposited data did not reproduce the published recruitment count-model coefficients (Appendix Table A4) or the response-curve figures (Figs 3–4).
The analysis code and the published results are unaffected; only the deposited public data were incorrect. The corrected files (data_corrected.zip) use a single, consistent plot/site/institute identifier mapping across both DBH thresholds and the climate data. Re-running scripts/fit_models.R on the corrected data exactly reproduces the published coefficients, selected model types and figures for both DBH thresholds (verified for all 23 species). scripts_corrected.zip additionally includes the corrected data-anonymisation script (data_preparation/make-public-data.R).
- Käber, Yannek et al. (2022), Sheltered or suppressed? Tree regeneration in unmanaged European forests, Posted-content, https://doi.org/10.22541/au.166748406.68292738/v1
