Skip to main content

Conservation of woody species in China under future climate and land-cover changes

Cite this dataset

Peng, Shijia et al. (2021). Conservation of woody species in China under future climate and land-cover changes [Dataset]. Dryad.


  1. Climate and land-cover changes are major threats to biodiversity, and their impacts are expected to intensify in the future. Protected areas (PAs) are crucial for biodiversity conservation. However, their effectiveness under future climate and land-cover changes remains to be evaluated. Moreover, the impacts of climate and land-cover changes on multi-dimensions of biodiversity are rarely considered when expanding PAs.
  2. Using distributions of 8732 woody species in China and species distribution models, we identified species that will be threatened by future climate and land-cover changes (i.e. species with significant projected loss of suitable habitats by the 2070s) under different dispersal scenarios. We then estimated the geographical patterns in species richness (SR) and phylogenetic diversity (PD) of these threatened species, evaluated the effectiveness (i.e. the changes in SR and PD) of Chinese PAs, and identified conservation priorities for future PA expansion.
  3. Approximately 12-38% of woody species will be threatened under different scenarios. These species tend to be clustered in the tree of life, and their SR and PD show consistent spatial patterns, being highest at low latitudes. PAs currently protect 90% of these threatened species. However, their SR and PD of threatened species within PAs will decrease by 30-40% by the 2070s, which reduces the PA effectiveness, especially for PAs at low elevations and those with low topographic heterogeneity and high natural vegetation loss.
  4. The conservation priorities identified from the SR and PD of the threatened species are mainly in mountains in southern China, the Yunnan-Guizhou Plateau, and Taiwan Island. PA expansion and ecological corridors in these regions are needed to conserve these threatened species.
  5. Synthesis and applications. We present a systematic study of the impacts of future climate and land-cover changes on the conservation status of woody species and PA effectiveness in China. Our results suggest that future climate and land-cover changes will reduce PA effectiveness, and the spatial prioritization of biodiversity conservation should consider the influences of future global changes on biodiversity. These results shed new light on the conservation priorities for the post-2020 expansion of PAs in China.


Species distribution data

The county-level distributions of woody species in China were compiled from the Atlas of Woody Plants in China: Distribution and Climate, which contains a total of 11405 native woody species (Fang et al., 2011; Wang et al., 2011). We substantially updated the database using more than 40 volumes of national and regional floras published from 2009-2018 (see Appendix S1), and available specimen records in China ( To further improve the accuracy of species distributions, the elevational range and habitat types of each species were compiled and used to refine their distributions within each county. To eliminate the bias of unequal areas on the estimates of species diversity, we transformed species distributions into 20 × 20 km2 grid cells with the Albers equal-area projection. After removing the incomplete grid cells with area less than 200 km2 within national borders and coasts, a total of 23718 grid cells (ca. 94%) remained. Previous studies suggested that the performance of SDMs depends on the sample size of the species distributions. Therefore, we excluded species with fewer than 20 distribution records from our analyses (Thuiller et al., 2011). Finally, 8732 species were included for the subsequent analyses.

Climate and soil data

The current climatic data during the period 1960-1990 with a spatial resolution of 2.5' (ca. 5×5 km at the equator) were obtained from the WorldClim (ver. 1.4; In total, 19 variables were obtained, which were divided into three groups: 1) temperature 2) precipitation and 3) climate seasonality. We calculated the Pearson’s correlation coefficients among all variables, and they were highly correlated with each other within groups (r > 0.7), but not among groups. For each group, we chose those with high ecological relevance to woody plant distributions following previous studies (Thuiller et al., 2011; Zhang et al., 2017). Finally, 6 climate variables were chosen: mean temperature of the warmest and the coldest quarters, temperature seasonality, precipitation of the wettest and the driest quarters, and precipitation seasonality. The climatic data of each grid cell were calculated as the average of all data points within it.

Climate data for the 2070s under three climate scenarios (i.e. RCP 2.6, RCP 6.0 and RCP 8.5) were also obtained from WorldClim. We chose 2070 as a critical point since a global assessment show that nearly 1/3 species will go extinct by 2070 (Román-Palacios & Wiens, 2020). Here, we used the data projected by the BCC-CSM1-1.1 model, which could better reflect the magnitude of future climate change in China compared with other models (Xin et al., 2013).

We also included four commonly-used soil variables reflecting physical (i.e. sand-clay ratio, bulk density) and chemical properties (soil water pH and organic carbon concentration). Soil data were obtained from the SoilGrids 250m database (, and the value for each 20×20 km grid cell was calculated as the average of all data points within it. We used data for the top soil (0-15cm depth), and assumed soil indices were constant over time.   

Land cover data

Current and future land-cover data with a spatial resolution of 1×1 km2 were obtained from Li et al. (2017). This dataset includes the following layers: forests, shrublands and grasslands (including savanna-like grasslands), barren, farmland, urban and water. The first three layers reflect the coverage of natural land-cover types, while farmland and urban reflect human-related ones. Decreases in natural land-cover types and increases in human-related ones could significantly influence biodiversity patterns (de Chazal & Rounsevell, 2009). We calculated the proportion of each land-cover type within each grid cell for the current and future periods, and adopted each as an independent predictor in the SDMs. We excluded water because they represent habitats not suitable for terrestrial plants. It is noteworthy that the data of future land cover were projected under earlier emission scenarios (i.e. A1B, A2, B1 and B2). We matched three of these emission scenarios, i.e. B1, A1B and A2, with RCP2.6, RCP6.0 and RCP8.5, respectively (IPCC, 2014). Finally, six bioclimatic variables, four soil variables and five land-cover variables were selected for SDM analyses.

Protected area data

The PA data in China were obtained from Zhang et al. (2015). We only incorporated national and provincial PAs as most of prefectural and county-level PAs are inefficiently managed due to inadequate funding, staff and equipment (Quan et al., 2009). We further excluded another 30 PAs located in the "mangrove biome", which are normally considered as marine PAs. Finally, our database included 1161 PAs. We rasterized these PA polygons at the same resolution as the species distribution data via layer intersection. A grid cell was identified as a "protected cell" if the percentage of its area covered by PA polygons was higher than a threshold. Here, three thresholds (10%, 30% and 50%) were used, and preliminary analyses suggest that the results based on these thresholds were highly consistent. Therefore, we only presented the results based on the threshold of 30% following the post-2020 conservation target (i.e. a global 30% coverage).

To explore the drivers of the effectiveness of PAs under climate and land-cover changes, we evaluated the impacts of five PA characteristics: 1) PA area; 2) topographic heterogeneity; 3) elevation; 4) human footprint and 5) the percentage of natural vegetation change. The area of each PA was obtained from Zhang et al. (2015). Human footprint represents the cumulative impacts of human pressures on environments, and was calculated using eight variables: human population density, buildings, electric infrastructure, roads, railways, navigable waterways, cropland, and pasture (Venter et al., 2016). Elevational range within each PA was estimated as the difference between the highest and lowest elevation and was used to represent the topographic heterogeneity (Stein & Kreft, 2015). Elevation data with a spatial resolution of 30″were obtained from the U.S. Geological Survey ( The natural vegetation here included forests, shrublands and grasslands (including savanna-like grasslands) (Li et al., 2017).

Species distribution models (SDMs)

Five SDM algorithms were used, including classification tree analysis (CTA), the generalized linear model (GLM), the generalized boosting model (GBM), random forest (RF) and maximum entropy (MaxEnt). These SDMs are widely used in previous studies (e.g. Zhang et al. 2017). For each species, the distribution data were randomly partitioned into training (80%) and testing (20%) data, which were used to calibrate and test the SDMs, respectively (Franklin & Miller, 2010). We repeated this process 10 times for each species, and each SDM, and evaluated the model performance based on true skill statistics (TSS) (Allouche et al., 2006). Models with TSS > 0.5 were used to project species distributions under current and future conditions (Thuiller et al., 2009). As different SDM algorithms have different performance, an ensemble method was used to generate the predicted suitability of each species. Specifically, we used the median of all model predictions. We then further converted the predicted habitat suitability of species into binary maps with the threshold maximizing the TSS (Zhang et al., 2017). SDMs often tend to overestimate the current and future potential ranges of species since dispersal limitations are not considered. Following previous studies (e.g. Kremen et al., 2008), we used a 200 km buffer minimum convex polygon (MCP) around the current distribution range of a species to clip its predicted ranges.

Dispersal affects species' ability to track favorable habitats under global changes, and should be considered in the evaluations of global change impacts on species distributions (Acevedo et al., 2020). Here, we assumed three dispersal scenarios: 1) no dispersal; 2) full dispersal; and 3) 20 km/decade dispersal beyond predicted current ranges (Chen et al., 2011). The third scenario was conducted similarly using a 200 km (i.e. 10 decade * 20km/decade) buffer MCP around the predicted current distribution range of each species. All analyses were performed using the biomod2 package in R (R Core Team, 2019).

Identification of threatened species

We identified species that would be threatened in the future under climate and dispersal scenarios based on the relative loss of their suitable areas (LSA). Following similar logic to criterion A3 of the IUCN Red List (IUCN, 2012), we classified all species into five categories: 1) extinct (EX, LSA = 100%); 2) critically endangered (CE, 80%≤LSA < 100%); 3) endangered (EN, 50% <= LSA < 80%); 4) vulnerable (VU, 30% <= LSA < 50%) and 5) least concerned (LC, LSA < 30%). Species with LSA≥30% (i.e. EX+CE+EN+VU) were recognized as threatened species.

Phylogenetic signal and PD of threatened species

A dated genus-level phylogeny of seed plants in China was obtained from Lu et al. (2018). To obtain the species-level phylogeny covering all 8732 woody species included in our study, we first inserted all the studied species into the genus-level phylogeny as basal polytomies of their corresponding genera, and then randomly resolved the polytomies within each genus using the birth-death bifurcation model. We repeated this process 100 times, and generated 100 random phylogenetic trees for the subsequent analyses.

To evaluate whether threatened species were randomly distributed across the phylogeny, we tested the phylogenetic signal in the conservation status of species using D-statistic calculated by the "phylo.d" function of the caper package (Fritz & Purvis, 2010). Here, threatened species were coded as "1" and non-threatened as "0". D-statistic normally ranges from 0 under the Brownian evolution to 1 under random distribution of threatened species across the phylogeny. The significance of D-statistic was tested by comparing the observed D-statistic with the frequency distributions of the simulated values under 1) phylogenetic randomness and 2) Brownian evolution, which was generated from 999 randomizations for each phylogeny.

The current and future geographical patterns in SR and PD of threatened species were estimated. Here, Faith’s PD was used (Faith, 1992). PD was calculated using each of the 100 randomly-resolved phylogenies, and the average across these phylogenies was finally used. Since Faith's PD is normally positively correlated with SR, we conducted a generalized additive model (GAM) between PD and SR, and extracted its residuals (hereafter termed residual PD) to represent the net contribution of phylogenetic relationships to PD after SR is accounted for. Positive (or negative) residual PD suggests higher (or lower) PD than expected given the level of SR (Thuiller et al., 2011).

Protected area network analyses

To assess the effectiveness of the PAs under future climate and land-cover changes, we first estimated the total number and the percentage of PD of all threatened species protected by the current PA network. Second, we calculated the changes in SR (∆SR) and PD (∆PD) of the threatened species within each PA, respectively. The frequency distributions of ∆SR or ∆PD were separately estimated for PAs with ∆SR or ∆PD>0 (group I) and ∆SR or ∆PD<0 (group II). The PAs with ∆SR or ∆PD > the 50% quantile of group I was considered to be "effective", while those with ∆SR or ∆PD< the 50% quantile of group II was considered to be "ineffective". The remaining PAs of both groups, together with those with ∆SR or ∆PD=0 were considered to be "stable". The five PA characteristics were compared between the effective and ineffective PAs using the Kruskal-Wallis test. Moreover, to evaluate the relative importance of the five PA characteristics in explaining PA effectiveness, we conducted Random Forest analysis with PD or SR changes as the response variable, and the five PA variables as the predictors.

To identify the conservation priorities under future climate and land-cover changes, we identified the grid cells 1) not covered by the current PAs, and 2) with the highest SR or PD in the future. Following the "top 5% richness algorithm" (Grenyer et al., 2006; Pollock et al., 2017), the top 5% of the grid cells with the highest SR or PD were considered as areas of conservation priorities (Priority5%). As additional PAs covering 15% of land area in China are required to achieve the post-2020 conservation target, we also used a 15% cut-off. To facilitate the conservation planning, we further overlaid the Priority5% with the county boundaries in China. A county should be given high conservation priorities in the future if: 1) more than 50% of its area was covered by Priority5%; or 2) the sum of the overlapping area was larger than the median area of all counties (i.e. 2081 km2).


National Natural Science Foundation of China, Award: #31988102

National Natural Science Foundation of China, Award: #32125026

National Natural Science Foundation of China, Award: #3210130080

Chinese Academy of Sciences Strategic Priority Research Program, Award: #XDB31000000

Ministry of Science and Technology of the People's Republic of China, Award: #2017YFA0605101

Ministry of Science and Technology of the People's Republic of China, Award: #2018YFA0606104

Chinese Academy of Sciences Strategic Priority Research Program, Award: #XDB31000000