
Landscape structure, predictability of forest regeneration trajectories, and recovery rate on secondary forests

Cite this dataset

Rito, Kátia F. et al. (2022). Landscape structure, predictability of forest regeneration trajectories, and recovery rate on secondary forests [Dataset]. Dryad.


Abandonment of agricultural lands promotes the global expansion of secondary forests, which are critical for preserving biodiversity and ecosystem functions and services. Such roles largely depend, however, on two essential successional attributes, trajectory and recovery rate, which are expected to depend on landscape-scale forest cover in non-linear ways. This dataset is a synthesis of 22 independent databases from studies of woody plant species recovery, compiled as part of the research project entitled "Impacts of landscape structure on secondary tropical forest regeneration". This work aimed to understand the effect of landscape-level disturbance on forest regeneration, specifically through the predictability of trajectories and the recovery rate of these forests.

Using a multiscale approach and a large vegetation dataset (843 plots, 3511 tree species) from 22 secondary forest chronosequences distributed across the Neotropics, we show that successional trajectories of woody plant species richness, stem density, and basal area are less predictable in landscapes (4-km radius) with intermediate (40-60%) forest cover than in landscapes with high (>60%) forest cover. This supports theory suggesting that high spatial and environmental heterogeneity in intermediately deforested landscapes can increase the variation in key ecological factors for forest recovery (e.g. seed dispersal, seedling recruitment), increasing the uncertainty of successional trajectories. Regarding the recovery rate, only that of species richness is positively related to forest cover, and only in relatively small (1-km radius) landscapes. These findings highlight the importance of using a spatially explicit landscape approach in restoration initiatives and suggest that these initiatives can be more effective in more forested landscapes, especially if implemented across spatial extents of 1-4 km radius.


We compiled 22 independent databases from studies of woody plant species recovery across five Neotropical countries. Each study included plots established in secondary forest stands of different ages, forming a chronosequence. We used taxonomic species richness, density of individuals, and total basal area per plot to evaluate the successional trajectories and recovery rate of vegetation structure. We calculated the extrapolated values of species richness at the maximum sample coverage (Cn = 1.0), following the protocols proposed by Chao & Jost (2012). To assess the predictability of successional trajectories, we related each community attribute (species richness, density of individuals, and basal area) to stand age for each chronosequence (n = 22) using generalized additive models (GAMs), and derived the adjusted R² (R²adj) value of each model. Because R²adj represents the fraction of the variance in the dependent variable that is explained by the independent variable (Crawley 2012), this parameter can be used as a proxy for the predictability of the relationship between each vegetation attribute and stand age. To assess the recovery rate of successional trajectories, we extracted the predicted values of the GAM relating each vegetation attribute to stand age at the fixed ages of 15 and 20 years of succession. Then, we calculated the recovery rate through the equation: (predicted value at 15 years − predicted value at 20 years) / 5, where 5 corresponds to the age interval in years. This measure was established under the assumption that five years is a short interval of recovery, over which recovery behaves approximately linearly.
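The two summary statistics described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, and `n_params` is a stand-in for the model's degrees of freedom (GAM software usually reports an effective, non-integer value and may define R²adj slightly differently). The subtraction order in `recovery_rate` is kept exactly as written in the text.

```python
def adjusted_r2(observed, fitted, n_params):
    """Textbook adjusted R-squared; n_params is an assumed stand-in for
    the (effective) degrees of freedom of the fitted model."""
    n = len(observed)
    y_bar = sum(observed) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)


def recovery_rate(pred_15, pred_20, interval_years=5.0):
    """Difference between the model predictions at 15 and 20 years,
    divided by the age interval. Operand order follows the text as written."""
    return (pred_15 - pred_20) / interval_years


# Hypothetical values for one chronosequence and one vegetation attribute.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2_adj = adjusted_r2(observed, fitted, n_params=1)
rate = recovery_rate(pred_15=30.0, pred_20=40.0)
```

With this operand order, an attribute that increases between 15 and 20 years yields a negative value, so the sign should be interpreted accordingly when reusing the data.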

Imagery selection and pre-processing

For each site we defined a landscape of 10-km radius around the centroid of the set of plots from each study. We chose this radius to standardize landscape size and enable an adequate analysis of landscape-level forest cover for the chronosequence stands in all databases. We obtained Landsat ETM+ and Landsat 8 satellite imagery with 30-m spatial resolution in the multispectral bands from the United States Geological Survey (USGS) database. Images were selected based on the location of the landscape of interest, the year of vegetation inventories of each study, and cloudiness. For databases containing data collected in different years, we selected imagery corresponding to the median year of the study and containing less than 10% cloudiness. During pre-processing of images, we corrected for cloud cover by creating both a cloud and a cloud shadow mask using the Cloud Masking tool and fmask function in QGIS 2.18.14 software (Quantum GIS Development Team 2009), as recommended for Landsat TM/ETM+/OLI/TIRS images (Zhu & Woodcock 2012). For each image in a landscape, we conducted panchromatic and spectral image fusion (i.e. a pansharpened composite) to improve the spatial resolution of the Landsat image using the IHS (Intensity Hue Saturation) method (Zhang 2002). Because the Scan Line Corrector (SLC) of the Landsat ETM+ sensor failed in May 2003, some images have wedge-shaped gaps on each side, resulting in the loss of ca. 22% of information. To correct this, we applied the Gapfill tool in the ENVI 4.7 program (ENVI 2008), following the filling technique developed by Scaramuzza et al. (2004). This technique fills gaps in a Landsat image with data from another image and applies a linear transformation to adjust the fill data based on the standard deviation and mean values of each band of each scene (Scaramuzza et al. 2004).
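The linear adjustment behind this kind of gap filling can be sketched in a few lines of Python. This is an illustration only, not the ENVI Gapfill tool: it assumes a single gain and bias per band, estimated from pixels valid in both scenes, and the function name and the use of `None` for gap pixels are hypothetical conventions.

```python
from statistics import mean, pstdev


def gap_fill_band(target, fill):
    """Fill gap pixels (None) in `target` with values from `fill`,
    rescaled so that the fill scene matches the target band's mean and
    standard deviation. Both inputs are flat lists of equal length."""
    pairs = [(t, f) for t, f in zip(target, fill)
             if t is not None and f is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two pixels valid in both scenes")
    t_vals = [t for t, _ in pairs]
    f_vals = [f for _, f in pairs]
    f_std = pstdev(f_vals)
    # Gain matches the spread of the two scenes; bias matches their means.
    gain = pstdev(t_vals) / f_std if f_std > 0 else 1.0
    bias = mean(t_vals) - gain * mean(f_vals)
    filled = []
    for t, f in zip(target, fill):
        if t is not None:
            filled.append(t)                # keep original valid pixels
        elif f is not None:
            filled.append(gain * f + bias)  # adjusted fill value
        else:
            filled.append(None)             # gap in both scenes remains
    return filled


# Hypothetical digital numbers for one band; the third pixel is a gap.
target_band = [10.0, 20.0, None, 40.0]
fill_band = [5.0, 10.0, 15.0, 20.0]
filled_band = gap_fill_band(target_band, fill_band)
```

Pixels that are missing in both scenes stay empty, which is why more than one fill scene may be needed in practice.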

Image classification and estimation of percent forest cover

We carried out a supervised classification of images based on training and validation data. We considered three categories of land cover in the classification: native forest cover, agricultural lands, and other land covers (e.g. water, human settlements). Forest cover included both old-growth and late successional second-growth forests because vegetation structure in the latter forest type is quite similar to that of old-growth forests (Poorter et al. 2016; Rozendaal et al. 2019). First, we selected regions of interest on the raster layer based on expert knowledge (i.e. polygons with land cover information) to serve as training data; unknown pixels were then classified by comparing their digital values with the training data (Canty 2007). To this end, we used the Support Vector Machine (SVM), a non-parametric method for non-linear data that belongs to the kernel class of algorithms (Foody & Mathur 2004; Tamma et al. 2013). Overall satellite image classification accuracy was relatively high (>85%). To reduce the salt and pepper effect, we applied post-classification Majority/Minority analysis. Next, we used the classified vectors to estimate the percent forest cover within each study landscape, using ten differently sized buffers, ranging from 1 to 10 km in radius, at 1 km intervals (corresponding to landscapes of 314.1 to 31,415.6 ha). We calculated forest cover for each buffer using the Dinamica EGO 4 program. All classified vectors were sent to the authors of each database for revision and approval before estimation of the percent forest cover.
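The buffer-based estimate can be illustrated with a small Python sketch. It does not reproduce the Dinamica EGO workflow: it assumes classified pixels are available as (x, y, class) tuples in metres relative to a projected origin, and the class labels and function names are hypothetical.

```python
import math

FOREST = "forest"  # hypothetical label for the native forest class


def landscape_area_ha(radius_km):
    """Area of a circular landscape in hectares (1 km^2 = 100 ha)."""
    return math.pi * radius_km ** 2 * 100.0


def percent_forest_cover(pixels, center, radius_m):
    """Share of classified pixels labelled as forest within a circular
    buffer. `pixels` is an iterable of (x_m, y_m, land_cover_class)."""
    cx, cy = center
    inside = [cls for x, y, cls in pixels
              if math.hypot(x - cx, y - cy) <= radius_m]
    if not inside:
        return None  # no classified pixels fall inside the buffer
    return 100.0 * sum(1 for cls in inside if cls == FOREST) / len(inside)


def cover_by_buffer(pixels, center, radii_km=range(1, 11)):
    """Percent forest cover for each buffer radius from 1 to 10 km."""
    return {r: percent_forest_cover(pixels, center, r * 1000.0)
            for r in radii_km}


# Tiny hypothetical landscape centred on the plot centroid at (0, 0).
pixels = [
    (0.0, 0.0, "forest"),
    (500.0, 0.0, "agriculture"),
    (1500.0, 0.0, "forest"),
    (0.0, 2500.0, "other"),
]
cover = cover_by_buffer(pixels, center=(0.0, 0.0))
```

Because each larger buffer contains all smaller ones, the estimates at neighbouring radii are not independent, which is worth keeping in mind when comparing scales.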


Universidad Nacional Autónoma de México