# Centennial recovery of recent human-disturbed forests

## Cite this dataset

Rodríguez-Uña, Asun; Cruz-Alonso, Verónica; López-López, José A.; Moreno-Mateos, David (2024). Centennial recovery of recent human-disturbed forests [Dataset]. Dryad. https://doi.org/10.5061/dryad.rv15dv4h8

## Abstract

International commitments are challenging countries to restore their degraded lands, particularly forests. These commitments require global assessments of recovery timescales and trajectories of different forest attributes to inform restoration strategies. We use a meta-chronosequence approach including 125 forest chronosequences to reconstruct the past (c. 300 years), and model future recovery trajectories of forests recovering from agriculture and logging impacts. We found recovering forests significantly differed from undisturbed ones after 150 years and projected that difference to remain for up to 218 or 494 years for ecosystem attributes like nitrogen stocks or species similarity, respectively. These conservative estimates, however, do not capture the complexity of forest ecosystems. A centennial recovery of forests requires strategic, unprecedented planning to deliver a restored world.

## README: Centennial recovery of recent human-disturbed forests

https://doi.org/10.5061/dryad.rv15dv4h8

This dataset includes the information compiled in a meta-analysis about global long-term recovery trajectories of forest ecosystems, in terms of their biodiversity (i.e., organism abundance, species diversity, and Morisita-Horn species similarity) and biogeochemical functions (i.e., cycling of carbon, nitrogen stock, and phosphorus stock); and the response ratios computed to estimate the recovery completeness of each of these metrics.

### Description of the data and file structure

Data are provided in a single .csv data file. These are the definitions of all the columns:

Columns related to each selected article in the meta-analysis:

*code*: code given to each selected article in the meta-analysis

*citation*: an abbreviated version of each selected article's citation

Columns related to each forest chronosequence:

*chrono*: number to identify each chronosequence in each article (this value is typically "1", as there was usually only one chronosequence per article)

*latitude*: latitude (decimal degrees)

*longitude*: longitude (decimal degrees)

*arid*: Lang aridity index

*disturb_category*: main anthropogenic disturbance occurring prior to forest recovery

*disturb_specific*: specific anthropogenic disturbance occurring prior to forest recovery

*restoration_activity*: type of restoration strategy (passive/active)

*n*: number of subplots (i.e., replicates within forest plots) measured in each chronosequence. It was used to estimate the study precision.

*plot_size*: subplot area (m²). It was used to estimate the study precision.

Columns related to each recovery trajectory inside each chronosequence:

*response_variable*: the specific variable measured along a recovery trajectory (as it was literally called in the primary article)

*n_variable*: the code used to differentiate the response variables from the same chronosequence

*metric_type*: the recovery metric (organism abundance, species diversity, Morisita-Horn species similarity, cycling of carbon, nitrogen stock, or phosphorus stock)

*life_form*: the type of life form characterized with each response variable related to organism abundance, species diversity and Morisita-Horn species similarity.

Columns related to each data-point inside each recovery trajectory:

*age*: time since forest recovery started (years)

*y*: the value of the response ratio, computed as ln (*X*res/*X*ref), where *X*res is the value of each response variable at each age and *X*ref is the reference value of the response variable in the reference forest.

*age1*: time since forest recovery started + 1 (years)

*age_log*: natural logarithm [time since forest recovery started] (years)

*age1_log*: natural logarithm [time since forest recovery started + 1] (years)

*age_sqrt*: square root [time since forest recovery started] (years)

*age1_sqrt*: square root [time since forest recovery started + 1] (years)
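
The relationship between *age*, the derived time columns, and *y* can be illustrated with a short Python sketch. It is not part of the dataset or of the authors' R code; the function names are hypothetical, and returning `None` for the logarithm of age 0 is an assumption made only for this example.

```python
import math

def derive_age_columns(age):
    """Return the derived time columns for one data point (age in years)."""
    age1 = age + 1
    return {
        "age": age,
        "age1": age1,
        # ln(0) is undefined; using None here is an assumption of this sketch
        "age_log": math.log(age) if age > 0 else None,
        "age1_log": math.log(age1),
        "age_sqrt": math.sqrt(age),
        "age1_sqrt": math.sqrt(age1),
    }

def percent_of_reference(y):
    """Back-transform y = ln(X_res / X_ref) to a percentage of the reference value."""
    return 100.0 * math.exp(y)

row = derive_age_columns(24)
print(row["age1"], row["age1_sqrt"])         # 25 5.0
print(round(percent_of_reference(-0.5), 1))  # 60.7
```

A value of *y* = 0 corresponds to 100% of the reference value, and negative values indicate that the recovering forest is below the reference.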

### Code/Software

The R code fits the meta-analytic models and predicts the time forests need to recover their biodiversity (i.e., organism abundance, species diversity and Morisita-Horn species similarity) and functions (i.e., cycling of carbon, nitrogen stock and phosphorus stock). It also estimates the effect of aridity, disturbance category, restoration strategy and life form on their recovery.

## Methods

__Database construction__

We collected data from 16,873 plots from 125 chronosequences of recovering forest ecosystems in 110 published primary studies. From these chronosequences, we extracted 641 recovery trajectories, i.e., field-based quantitative measurements of ecosystem integrity repeated through time, reported in the tables, figures, and text of the selected studies. The trajectories covered six recovery metrics (organism abundance, species diversity, species similarity, carbon cycling, nitrogen stock, and phosphorus stock), two restoration strategies (passive and active), three disturbance types [agriculture (including abandoned croplands and pastures), logging and mining], and a climatic metric (i.e., aridity index). Each trajectory included at least two data points, defined as the value of the ecosystem metric at different times since recovery started (hereafter, recovery time). Data points with the same recovery time were averaged (n = 72, in 21 studies).

We used response ratios (RRs) to estimate the recovery completeness, i.e., the effect sizes between reference and recovering systems. We computed the RR for each data point along the trajectory as ln(*X*_{res}/*X*_{ref}), where *X*_{res} is the value of the ecosystem metric at a certain recovery time and *X*_{ref} is the reference value of the same metric in the reference forest. Effect sizes of the meta-analysis were weighted by study precision, which was estimated as the product of the number of subplots and their area, assuming that a higher sampling effort would imply a higher precision. For abundance, diversity and similarity, we fitted fixed-effects models, with weights only accounting for within-study variability; whereas for biogeochemical functions, we assumed random-effects meta-analytic models, accounting for both between- and within-study variability.
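
As an illustration of the data preparation described above, the following Python sketch averages data points that share a recovery time, computes the response ratio, and derives the precision weight. It is not the authors' R code. The function names and numbers are hypothetical, and averaging the metric values before taking the logarithm is an assumption, since the text does not state at which stage the averaging was done.

```python
import math
from collections import defaultdict

def response_ratios(points, x_ref):
    """points: list of (recovery_time, value) pairs for one trajectory.

    Values sharing a recovery time are averaged first (assumption), then
    the response ratio ln(X_res / X_ref) is computed for each time.
    """
    by_time = defaultdict(list)
    for t, x in points:
        by_time[t].append(x)
    return {
        t: math.log((sum(xs) / len(xs)) / x_ref)
        for t, xs in sorted(by_time.items())
    }

def precision_weight(n_subplots, plot_size_m2):
    """Study precision as the product of subplot number and subplot area."""
    return n_subplots * plot_size_m2

# Made-up trajectory with two measurements at 10 years of recovery.
points = [(10, 20.0), (10, 30.0), (40, 50.0)]
rr = response_ratios(points, x_ref=100.0)
weight = precision_weight(n_subplots=5, plot_size_m2=400.0)
print(rr)
print(weight)
```

In this made-up case, the two values at 10 years average to 25, a quarter of the reference, so the response ratio at that time is ln(0.25).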

__Statistical analysis__

To estimate the trajectory of forest recovery over time, we fitted a separate linear mixed model (LMM) for the RR of each recovery metric. We included the recovery time as a fixed factor and as a random slope, and the trajectory identity as a random intercept, enabling a different slope and intercept for each trajectory. As the recovery process over time may result in a wide range of trajectories, from linear to more saturating shapes, we considered three functions of the recovery time variable: one linear and two decelerating trends [ln(recovery time + 1) and √(recovery time)]. For each recovery metric, we then selected the option that best fitted the data according to the minimum AICc. The models for the recovery of similarity were fitted using the Morisita-Horn index, as the Pearson correlation test indicated that it was correlated with the Jaccard and Bray-Curtis indices. Their absolute values were square root transformed to meet the assumptions of general linear models and then multiplied by -1 to facilitate interpretation.
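
The AICc-based choice among the three time functions can be sketched in Python as follows. This is a deliberate simplification and not the authors' R code: it uses ordinary least squares on a single made-up trajectory instead of a linear mixed model with random slopes and intercepts, and all names and data are hypothetical.

```python
import math

def ols_fit(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, residual sum of squares)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def aicc(rss, n, k):
    """Small-sample corrected AIC for a Gaussian least-squares fit with k parameters."""
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)

# The three candidate functions of recovery time named in the text.
TRANSFORMS = {
    "linear": lambda t: t,
    "ln(t+1)": lambda t: math.log(t + 1),
    "sqrt(t)": lambda t: math.sqrt(t),
}

# Made-up data: a saturating trajectory with small deterministic noise.
ages = [1, 5, 10, 20, 40, 80, 160]
noise = [0.02, -0.03, 0.01, 0.02, -0.02, 0.01, -0.01]
rr = [-2.0 + 0.4 * math.log(t + 1) + e for t, e in zip(ages, noise)]

scores = {}
for name, f in TRANSFORMS.items():
    x = [f(t) for t in ages]
    _, _, rss = ols_fit(x, rr)
    # k = 3: intercept, slope, and residual variance
    scores[name] = aicc(rss, n=len(ages), k=3)

best = min(scores, key=scores.get)
print(best)
```

Because the made-up data were generated from a ln(t + 1) trend, that function gets the lowest AICc here. With real data, the comparison would be made between the fitted mixed models.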

Using the resulting LMMs, we predicted the RR after 73, 146, and 219 years of recovery [i.e., one, two, and three times the global life expectancy in 2019]. We then predicted the time needed for forest ecosystems to recover to 90% of reference values for each trajectory and recovery metric and calculated the median by metric. Also using the resulting LMMs, we predicted the RR after 50 and 100 years of recovery for each metric and trajectory (1) to assess whether the recovery completeness depends on the metric and (2) to understand the main explanatory variables underlying the recovery process for each metric. We fitted linear models (LM) to analyse the difference in the RR after 50 years and after 100 years of recovery among recovery metrics. The models had the recovery metric and the intercepts of the LMMs for each trajectory as fixed factors. The latter was included to account for the effect of the initial state of degradation when recovery started. We then fitted a separate LM for the effect of each explanatory variable studied (i.e., aridity, disturbance category, restoration strategy or life form) on the RR predictions after 50 and 100 years of all recovery metrics together, and then for each recovery metric individually. In all cases, the intercept of the LMMs for each trajectory was also included as a fixed factor to account for the effect of the initial state of degradation when recovery started. For the models fitted for the disturbance category and the life form, we excluded the categories with <1% of the values (i.e., “mining” for disturbance and “bird” for life form) and those mixing information from several categories (i.e., “agriculture and logging” for disturbance and “woody and non-woody” for life form).
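
The prediction of recovery time can be illustrated by inverting a fitted trajectory of the form RR = intercept + slope × f(recovery time). The Python sketch below is not the authors' R code. It assumes that the recovering value starts below the reference and that "90% of reference values" corresponds to RR = ln(0.9) on the untransformed response-ratio scale; for the transformed similarity RRs the threshold would need the same transformation. Function names, coefficients, and transform labels are hypothetical.

```python
import math
import statistics

def time_to_recover(intercept, slope, transform, fraction=0.9):
    """Years until intercept + slope * f(t) reaches ln(fraction).

    transform is one of "linear", "log1p" [ln(t + 1)] or "sqrt".
    Returns 0.0 if the threshold is already met at t = 0 and
    math.inf if the trajectory never reaches it.
    """
    target = math.log(fraction)
    if intercept >= target:
        return 0.0
    if slope <= 0:
        return math.inf
    z = (target - intercept) / slope  # required value of f(t)
    if transform == "linear":
        return z
    if transform == "log1p":
        return math.exp(z) - 1
    if transform == "sqrt":
        return z ** 2
    raise ValueError(f"unknown transform: {transform}")

# Made-up per-trajectory coefficients for one recovery metric.
trajectories = [(-1.0, 0.010), (-0.6, 0.004), (-1.5, 0.012)]
times = [time_to_recover(a, b, "linear") for a, b in trajectories]
median_time = statistics.median(times)
print([round(t, 1) for t in times], round(median_time, 1))
```

Taking the median across trajectories, as described above, keeps a few very slow or non-recovering trajectories from dominating the summary for a metric.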

## Funding

Fundación Tatiana Pérez de Guzmán el Bueno

Agencia Estatal de Investigación, Award: 2018-2022 MDM-2017-0714

Real Colegio Complutense