Data from: Seeing the wood despite the trees: exploring the impact of human disturbance on plant diversity, community structure, and standing biomass in fragmented high Andean forests

Cite this dataset

Calbi, Mariasole (2021). Data from: Seeing the wood despite the trees: exploring the impact of human disturbance on plant diversity, community structure, and standing biomass in fragmented high Andean forests [Dataset]. Dryad.


High Andean forests harbor remarkably high biodiversity and play a key role in providing vital ecosystem services for neighboring cities and settlements. However, they are among the most fragmented and threatened ecosystems in the Neotropics. To preserve their unique biodiversity, a deeper understanding of the effects of anthropogenic perturbations on them is urgently needed. Here, we characterized the plant communities of high Andean forest remnants in the hinterland of Bogotá in 32 plots of 0.04 ha each. We assessed the woody vegetation and sampled the understory and epiphytic cover. We gathered data on compositional and structural parameters and compiled a broad array of variables related to anthropogenic disturbance, ranging from local to landscape-wide metrics. We also assessed phylogenetic and functional diversity. In a first step of the analysis, we employed non-metric multidimensional scaling (NMDS) to select meaningful variables. We then performed partial redundancy analysis (pRDA) and fitted generalized linear models (GLMs) to test how the selected environmental and anthropogenic variables affect the composition, diversity, and above-ground biomass of these forests. The woody vegetation and understory communities we identified were characterized by differences in elevation, temperature, and relative humidity, but were also related to different levels of human influence. We found that increasing human-related disturbance resulted in lower phylogenetic diversity and phylogenetic clustering of the woody vegetation, as well as in lower above-ground biomass (AGB) values. In the understory, disturbance was associated with higher diversity, together with higher phylogenetic dispersion. The most relevant disturbance predictors identified here were edge effect, proximity of cattle, minimum fragment age, and median patch size. Interestingly, AGB was efficiently predicted by the proportion of late successional species. We therefore recommend the use of AGB and abundance of late successional species as indicators of human disturbance on high Andean forests.


Tree and shrub layer sampling: for every woody plant with basal diameter >5 cm (measured at 5 cm from the ground – DAH: Diameter at ‘ankle’ height), we recorded its DAH, DBH and visually estimated tree height. Plant material was collected and identified with the available literature, by comparison with herbarium specimens, digitized specimens available online, or with additional help from local experts. Specimens were deposited in the herbarium of the Jardín Botánico de Bogotá José Celestino Mutis (JBB); high resolution digital specimen images can be provided upon request.

Understory layer sampling: in each 20 x 20 m plot, eight 1 x 1 m quadrats (with marked 10 cm sub-grids) were placed randomly. All vascular plants, including tree seedlings, were recorded, and mean height and total cover (the sum of the cover of all individuals) were measured for every species in each quadrat. Plant material was collected and identified with the available literature, by comparison with herbarium specimens, digitized specimens available online, or with additional help from local experts. When available, fertile material was collected and deposited in the JBB.

Macro-environmental variables: for each plot, macro-environmental variables were compiled from different sources in QGIS. Altitude, slope, and aspect (northness and eastness) were derived from an ASTER digital elevation model. Mean annual precipitation and mean and maximum temperature data for the period 1981–2010 were obtained from the IDEAM meteorological station closest to each plot. Mean population density was extracted in two buffers (radius 1 km and 5 km) around the plots from the WorldPop database for South America at 1 ha resolution.
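The source does not state how aspect was converted to northness and eastness. A minimal sketch, assuming the common convention that aspect is given in degrees clockwise from north, with northness as the cosine and eastness as the sine of the aspect angle (the function name `aspect_components` is hypothetical):

```python
import math


def aspect_components(aspect_deg):
    """Return (northness, eastness) for an aspect angle.

    Assumption: aspect is in degrees, measured clockwise from north,
    so a north-facing slope has aspect 0 and an east-facing slope 90.
    """
    rad = math.radians(aspect_deg)
    northness = math.cos(rad)  # +1 faces north, -1 faces south
    eastness = math.sin(rad)   # +1 faces east, -1 faces west
    return northness, eastness
```

Splitting the circular aspect variable into two linear components avoids the discontinuity between 359 and 0 degrees when aspect is used as a predictor.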

Functional diversity: three leaf functional traits (specific leaf area: SLA; leaf thickness: LT; and leaf dry matter content: LDMC) were measured for each tree species. The traits used to estimate functional diversity were SLA, LDMC, LT, wood density (WD), maximum recorded height in the plots, and life form (tree or shrub). Functional divergence, functional dispersion, functional richness, functional evenness, and Rao’s quadratic entropy (FDiv, FDis, FRic, FEve, and Rao’s Q, respectively) were computed using the R package FD.

Landscape metrics: a Landsat 8 raster was downloaded from the US Geological Survey and processed in QGIS with the SCP plugin to obtain a land cover map. Landscape metrics refer to the size, shape, configuration, number, and position of land use patches within a landscape and were obtained for the forest class within a 1000 m diameter buffer zone around the plots with the LecoS QGIS plugin. Additionally, forest fragments were manually vectorized and their area was calculated on a Bing aerial map obtained through the OpenLayers QGIS plugin. Distance to the closest road was calculated with the NNJoin QGIS plugin on a shapefile downloaded from the DANE website. The type of the closest road (main, secondary, or track) was also noted. Distances to the closest houses or tracks were manually measured on the map. Presence or absence of cattle or active cultivated fields in different buffers (0 m, 50 m, 100 m, or 500 m radius) was surveyed in the field. Minimum age of the forest cover of each plot was estimated through the visual analysis of 43 aerial pictures of the plot locations acquired from the IGAC (Instituto Geográfico Agustín Codazzi, Bogotá).

Community composition and structural variables: tree layer species were classified as late successional slow-growing, early successional fast-growing, exotic, or ‘other’. Additionally, understory exotic species cover was calculated. The number of species and the relative proportion of individuals (in the case of trees) or the percent cover (in the case of the understory) of exotic species were used as indicators of disturbance vs. conservation. Variance of tree DBH and height was also computed across all trees within each plot, together with the overall number of tree individuals, stems, and stems per tree, and the percentage of large trees (DBH >30 cm). Mean understory height and cover were calculated, as well as mean epiphyte cover. The Gini coefficient was calculated in each plot for stem basal areas with the gini function in the R package reldist.
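To illustrate what a Gini coefficient of stem basal areas measures, here is a minimal Python sketch. It is not the reldist implementation used for the dataset and may differ from it in details such as weighting. The function names are hypothetical, and the choice of which diameter (DAH or DBH) feeds the basal area is an assumption left to the caller:

```python
import math


def basal_area_cm2(diameter_cm):
    """Cross-sectional area of a stem, treating it as a circle."""
    return math.pi * (diameter_cm / 2.0) ** 2


def gini(values):
    """Unweighted Gini coefficient of non-negative values.

    0 means all stems have the same basal area; values close to 1
    mean that a few stems hold most of the basal area.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive value")
    # Rank-based formula on the sorted values (ranks start at 1).
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Hypothetical stem diameters (cm) for a single plot.
plot_diameters = [6.0, 7.5, 8.0, 12.0, 35.0]
plot_gini = gini([basal_area_cm2(d) for d in plot_diameters])
```

A plot dominated by one large stem, as in the hypothetical diameters above, yields a higher coefficient than a plot of evenly sized stems.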

Taxonomic and phylogenetic diversity: alpha-diversity indices (Shannon’s diversity, Simpson’s and Pielou’s evenness) were computed for each plot with the R package vegan. Phylogenetic community structure was assessed on the basis of a published angiosperm supertree. Phylogenetic diversity (PD), mean pairwise distance (MPD), mean nearest taxon distance (MNTD), and their standardized counterparts (sesPD, sesMPD, and sesMNTD) were calculated for both trees and understory with the R package picante. Moreover, abundance-weighted MPD and MNTD were calculated to account for differences in species abundance. The standardized PD metrics express the difference between the observed and average values in units of standard deviation (sd).
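The indices named above can be sketched in a few lines of Python. This is an illustration, not the vegan or picante code used for the dataset. It assumes natural logarithms for Shannon’s index, the 1 - sum(p^2) form of Simpson’s index (the source does not say which form was used), and that the "average value" in the standardized metrics is the mean of a set of null-model values with the sample standard deviation as the unit. All function names are hypothetical:

```python
import math
import statistics


def shannon(counts):
    """Shannon diversity H' = -sum(p * ln p) over species with count > 0."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson index in the 1 - sum(p^2) form (assumed here)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def pielou(counts):
    """Pielou's evenness J = H' / ln(S); undefined for fewer than 2 species."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return float("nan")
    return shannon(counts) / math.log(richness)


def standardized_effect_size(observed, null_values):
    """(observed - mean of null values) / sample sd of null values."""
    null_mean = statistics.mean(null_values)
    null_sd = statistics.stdev(null_values)
    return (observed - null_mean) / null_sd
```

Under these assumptions, a negative standardized value indicates that the observed metric is lower than the null expectation, which for MPD or MNTD is usually read as phylogenetic clustering.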

Above-ground biomass: above-ground tree biomass was calculated with the R package BIOMASS. Field measurements of DBH contained less than 5% missing data, so missing values were imputed with the R package mice. To compensate for missing height measurements, a regional diameter-height model was built in BIOMASS. Error propagation was carried out using the AGBmonteCarlo function. Wood density error (errWD) was obtained with the getWoodDensity function, which provides prior values for the uncertainty of wood density based on the mean sd at the species, genus, and family levels of taxa having at least 10 wood density values in the Global Wood Density database. Height error (errH) was calculated as the RSE resulting from the local height-diameter models, and the diameter measurement error propagation (Dpropag) was set to "chave2004", which assigns a large error to 5 percent of the measurements and a smaller error to the remaining 95 percent of the trees. Mean stand above-ground biomass (AGB) and its 95% credibility interval following error propagation were calculated with the following equation: $\mathrm{AGB} = 0.0673 \times (\mathrm{WD} \times H \times D^{2})^{0.976}$, where AGB = above-ground biomass [kg], WD = wood density [g/cm³], H = height [m], and D = DBH [cm]. Mean AGB per tree was calculated by dividing the total AGB value of each plot by the number of tree individuals.
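As a worked illustration of the equation above, the following Python sketch evaluates AGB for single trees and derives the mean AGB per tree for a plot. It only reproduces the point estimate; the imputation, height modelling, and Monte Carlo error propagation done in R are not covered. The function names and the example tree values are hypothetical:

```python
def agb_kg(wood_density, height_m, dbh_cm):
    """AGB [kg] = 0.0673 * (WD * H * D^2)^0.976.

    WD in g/cm3, H in m, D (DBH) in cm, as defined in the text.
    """
    return 0.0673 * (wood_density * height_m * dbh_cm ** 2) ** 0.976


def mean_agb_per_tree(trees):
    """Total plot AGB divided by the number of tree individuals.

    `trees` is a list of (wood_density, height_m, dbh_cm) tuples,
    one tuple per tree individual.
    """
    if not trees:
        raise ValueError("plot has no trees")
    total = sum(agb_kg(wd, h, d) for wd, h, d in trees)
    return total / len(trees)


# Hypothetical plot with three trees: (WD, H, DBH).
example_plot = [(0.60, 15.0, 20.0), (0.55, 8.0, 10.0), (0.70, 22.0, 35.0)]
example_mean = mean_agb_per_tree(example_plot)
```

Because the exponent 0.976 is slightly below 1, AGB grows a little less than proportionally with the compound variable WD × H × D².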

Usage notes

The tree sampling data provided here cover only the 12 plots established by Calbi, M. For the complementary dataset of the Rastrojos project, please refer to: 

For details on the units of measurement used and the complete methodology for calculating the variables, please refer to the article: "Seeing the wood despite the trees: exploring the impact of human disturbance on plant diversity, community structure, and standing biomass in fragmented high Andean forests".


Federal Ministry of Education and Research, Award: ColBioDiv - 01DN17006