Dryad

Environmental heterogeneity explains contrasting plant species richness between the South African Cape and southwestern Australia

Cite this dataset

van Mazijk, Ruan; Cramer, Michael D.; Verboom, G. Anthony (2021). Environmental heterogeneity explains contrasting plant species richness between the South African Cape and southwestern Australia [Dataset]. Dryad. https://doi.org/10.5061/dryad.8w9ghx3m8

Abstract

Aim: To assess whether a difference in species richness per unit area between two mediterranean-type biodiversity hotspots is explained by differences in environmental heterogeneity.

Location: The Greater Cape Floristic Region, South Africa (GCFR) and Southwest Australian Floristic Region (SWAFR).

Taxon: Vascular plants (tracheophytes).

Methods: Comparable, geospatially explicit environmental and species occurrence data were obtained for both regions and used to generate environmental heterogeneity and species richness raster layers. Heterogeneity in multiple environmental variables and species richness per unit area were compared between the two regions at a range of spatial scales. At each scale, richness was also regressed against these individual axes and against a major axis of heterogeneity, derived by principal component analysis (PCA).

Results: The GCFR is generally more environmentally heterogeneous and species-rich than the SWAFR. Species richness per unit area is significantly related to the major axis of heterogeneity across both regions, the latter describing ca. 38-50% of overall heterogeneity, the slope of this relationship differing between the two regions only at the finest spatial scale. Multivariate regressions, and regressions against the first axes of the PCAs (PC1), revealed variations in the dependence of species richness on environmental heterogeneity between the two regions.

Main conclusions: Notwithstanding some region-specific effects, we present evidence of a common positive relationship between floristic richness and environmental heterogeneity across the GCFR and SWAFR. This is dependent on spatial scale, being strongest at the coarsest level of sampling. The generally greater richness per unit area of the GCFR compared to the SWAFR is thus explained by the former’s generally greater environmental heterogeneity and is concordant with its greater levels of floristic turnover.

Methods

Preamble

Comparable, geospatially explicit environmental and vascular plant species' occurrence data were obtained for both regions (the Greater Cape Floristic Region [GCFR] and Southwest Australian Floristic Region [SWAFR]) and used to generate environmental heterogeneity and species richness raster layers. Heterogeneity in multiple environmental variables and species richness per unit area were compared between the two regions at a range of spatial scales. At each scale, richness was also regressed against these individual axes and against a major axis of heterogeneity, derived by principal component analysis (PCA).

Species occurrence data cleaning

[Taken from Supporting Information]

Firstly, we retained only records identified to the species level, and ignored intraspecific taxa. This resulted in the retention of 14,147 and 8,912 unique species names for the GCFR and SWAFR, respectively. The R package “taxize” (Chamberlain et al., 2016) was then used to query each species name against two major taxonomic databases, the Global Name Resolver (GNR) and the Taxonomic Name Resolution Service (TNRS; Boyle et al., 2013). Where either or both databases returned a match for a name, the name was retained; where not, it was excluded. Although the number of species thus excluded is high (GCFR: 692; SWAFR: 1,171), the geographically random distribution of the records associated with these names suggests that exclusion of these names will not significantly influence spatial patterns of species richness.

In order to ensure that no species were listed under multiple synonyms, the retained names were then queried against Tropicos and the Integrated Taxonomic Information System (ITIS) for known synonyms, again using “taxize”. We then removed all records of species identified as non-native, using lists of invasive plants for South Africa and Australia from the IUCN’s Global Invasive Species Database (http://www.iucngisd.org/gisd/).

Finally, we removed species with fewer than five collection records in total, in order to exclude collections with potentially low-confidence identifications. This, together with the exclusion of occurrence data originating from coastal pixels at the 0.05° resolution, brought the total number of species down to 9,419 in the GCFR and 6,696 in the SWAFR.
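The final filtering step can be sketched in a few lines of Python. This is only an illustration of the "fewer than five records" rule, not the authors' R code: the record layout (species name, longitude, latitude) and the species names are hypothetical placeholders.

```python
from collections import Counter

def drop_rare_species(records, min_records=5):
    """Keep only records of species with at least `min_records` occurrences.

    `records` is a list of (species_name, longitude, latitude) tuples; this
    layout is a hypothetical stand-in for the real occurrence table.
    """
    counts = Counter(species for species, _lon, _lat in records)
    return [rec for rec in records if counts[rec[0]] >= min_records]

# Toy data: "Species A" has five records, "Species B" only two.
records = [("Species A", 18.5 + i * 0.01, -34.0) for i in range(5)]
records += [("Species B", 19.0, -33.9), ("Species B", 19.1, -33.9)]

kept = drop_rare_species(records)
kept_species = sorted({species for species, _lon, _lat in kept})
```

In this toy case only the records of "Species A" survive the filter.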

Species richness data

[Taken from main text]

To compare vascular plant species richness between the GCFR and SWAFR, geospatially explicit occurrence records of tracheophytes from within the borders of each region were obtained from the Global Biodiversity Information Facility (GBIF; Table S1). Occurrence data were cleaned using the “taxize” package (Chamberlain et al., 2016) for R (R Core Team, 2019), which was also used for all other analyses (see Supporting Information). Despite spatial variability in collection effort in both regions, we used raw species counts to estimate species richness at the quarter-degree square (QDS) scale, on the basis that rarefaction techniques severely distort known richness patterns when applied to the South African flora (Cramer & Verboom, 2016). The final numbers of unique species thus identified as occurring in the GCFR and SWAFR were 9,419 and 6,696, respectively. To ensure that species richness was compared and decomposed (see below) across equally sized area units, we included only squares comprising four constituent sub-squares (e.g. four QDS in a half-degree square, HDS). While this resulted in the loss of several coastal squares, which is unfortunate because the coastal floras of both the GCFR and SWAFR are rich in endemic taxa, it is not expected to introduce any systematic biases. Overall, we retained 362 of ca. 449 QDS in the GCFR and 624 of ca. 737 in the SWAFR (ca. 81% and 85% sampling, respectively).
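The rule of keeping only squares with all four constituent sub-squares can be illustrated as follows. This is a minimal sketch that assumes each sub-square is identified by the longitude and latitude of its lower-left corner; the function names are hypothetical and the authors' own implementation may differ. The last two lines simply re-derive the sampling percentages quoted above.

```python
import math
from collections import defaultdict

def parent_cell(lon, lat, size):
    """Lower-left corner of the `size`-degree square containing a point."""
    return (math.floor(lon / size) * size, math.floor(lat / size) * size)

def complete_parents(sub_cells, sub_size):
    """Parent squares (side 2 * sub_size) for which all four sub-squares are present.

    `sub_cells` holds the lower-left (lon, lat) corners of the sub-squares.
    """
    children = defaultdict(set)
    for lon, lat in sub_cells:
        # Use the sub-square centre so that corner coordinates map to the right parent.
        centre_lon, centre_lat = lon + sub_size / 2, lat + sub_size / 2
        children[parent_cell(centre_lon, centre_lat, 2 * sub_size)].add((lon, lat))
    return sorted(parent for parent, kids in children.items() if len(kids) == 4)

# Toy QDS lattice: one complete HDS and one HDS missing a (e.g. coastal) QDS.
qds_cells = [
    (18.0, -34.0), (18.25, -34.0), (18.0, -33.75), (18.25, -33.75),
    (18.5, -34.0), (18.75, -34.0), (18.5, -33.75),
]
complete_hds = complete_parents(qds_cells, sub_size=0.25)

# Sampling percentages from the retained and approximate total QDS counts.
gcfr_pct = round(100 * 362 / 449)
swafr_pct = round(100 * 624 / 737)
```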

The cleaned species occurrence records were collated into QDS, HDS and DS (sensu Larsen et al., 2009). In addition, following the additive decomposition (Veech et al., 2002) of Whittaker’s (1960) γ-diversity, we decomposed the species richness of each HDS (S_HDS) and DS (S_DS) into its α (“plot” richness) and β (turnover) components. Here, α is the average species richness of the four constituent squares in each HDS and DS, respectively, and T_QDS and T_HDS represent the residual (i.e. turnover-based) β richness, determined as γ − α.
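As a worked illustration of the additive decomposition, the sketch below computes γ, α and β for one larger square from the species sets of its four constituent squares. The function name and the toy species labels are hypothetical.

```python
def additive_partition(sub_square_species):
    """Split the richness of a square into alpha and beta components.

    `sub_square_species` is a list of sets, one per constituent square, each
    holding the species recorded there. Returns (gamma, alpha, beta), where
    beta is the residual gamma - alpha.
    """
    gamma = len(set().union(*sub_square_species))  # richness of the larger square
    alpha = sum(len(s) for s in sub_square_species) / len(sub_square_species)  # mean sub-square richness
    beta = gamma - alpha  # turnover-based residual
    return gamma, alpha, beta

# Four QDS nested within one HDS (toy species labels).
qds_species = [{"a", "b", "c"}, {"b", "c"}, {"c", "d"}, {"e"}]
gamma, alpha, beta = additive_partition(qds_species)
```

Here the HDS holds five species in total (γ = 5), its QDS hold two species on average (α = 2), and the remaining three (β = 3) are attributed to turnover among the QDS.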

Environmental heterogeneity data

[Taken from main text]

Georeferenced environmental data and vascular plant species occurrence data sources used in this study. Data were acquired for the GCFR and SWAFR, with the temporal extent of data products used described where applicable.

| Dataset(s) | Source | Temporal extent | Citation(s) |
|---|---|---|---|
| Plant species occurrences | GBIF | | GBIF (2017a,b) |
| Elevation | SRTM (v2.0) | | Farr et al. (2007) |
| NDVI, Surface T | MODIS (v006) | Feb. 2000 to Apr. 2017 | NASA (2017a,b) |
| MAP, PDQ | CHIRPS (v2.0) | Jan. 1981 to Feb. 2017 | Funk et al. (2015) |
| CEC, clay, soil C, pH | SoilGrids250m | | Hengl et al. (2017) |

Abbreviations are as follows: NDVI, normalized difference vegetation index; T, temperature; MAP, mean annual precipitation; PDQ, precipitation in the driest quarter; CEC, cation exchange capacity; C, carbon.

To compare environmental heterogeneity between the GCFR and SWAFR, we acquired a suite of nine geospatially-explicit environmental variables [table above] in the form of raster layers to represent topographic (elevation), climatic (surface temperature, T; mean annual precipitation, MAP; precipitation in the driest quarter, PDQ), edaphic (clay content; soil carbon, C; pH; cation exchange capacity, CEC) and vegetational gradients (normalized difference vegetation index, NDVI). Wherever possible, we made use of remote-sensing-derived layers that are comparable between the two regions. As far as possible (see Supporting Information), these variables were selected to represent environmental axes which are considered regionally important and independent (Figures S1–3). Soil variables were summarized as depth-interval weighted averages and climatic and spectral variables as annual means using the “raster” package for R (Hijmans, 2016). All layers were then projected to a common coordinate reference system (WGS84) using the “rgdal” package (Bivand et al., 2017) and resampled bilinearly to 0.05° resolution.

In order to quantify heterogeneity in these environmental variables, we developed an index that would account for the spatial configuration of environmental conditions. Making use of raster data, this employs nested squares at various spatial scales (see Supporting Information). We quantified the environmental heterogeneity of a given square (i.e. 0.10°×0.10°-, QDS-, HDS- and DS-scale) as the variance of the environmental conditions of its four sub-squares (i.e. 0.05°×0.05°-, eighth-degree square-, QDS- and HDS-scale). Since our index measures within-square heterogeneity at each spatial scale, it can be related directly to species richness at the QDS-, HDS- and DS-scales.
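The index can be sketched for a single environmental layer as the variance of each non-overlapping 2×2 block of cells. This is an illustration only: the grid is a plain list of rows rather than a raster object, the elevation values are invented, and the use of the sample variance (n − 1 denominator) is an assumption, since the text above does not say which variance estimator was used.

```python
from statistics import variance

def block_heterogeneity(grid):
    """Variance of each non-overlapping 2x2 block of a raster-like grid.

    `grid` is a list of equal-length rows with an even number of rows and
    columns. The sample variance (n - 1 denominator) is an assumption here.
    """
    out = []
    for r in range(0, len(grid), 2):
        row = []
        for c in range(0, len(grid[r]), 2):
            # The four sub-squares nested within one larger square.
            block = [grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1]]
            row.append(variance(block))
        out.append(row)
    return out

# Toy elevation layer: a flat 2x2 block next to a topographically varied one.
elevation = [
    [100, 100, 10, 20],
    [100, 100, 30, 40],
]
heterogeneity = block_heterogeneity(elevation)
```

The flat block has zero heterogeneity, whereas the varied block has a variance of 500/3, so the index separates the two squares even though neither value says anything about mean elevation.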

We used principal component analysis (PCA), applied to the nine environmental variables across both regions, to extract a major axis of environmental heterogeneity. For this purpose, the layers describing heterogeneity in the nine environmental variables at each spatial scale were first log10-transformed to ensure normality. A separate PCA was then run at each of the four spatial scales. The first axis (PC1) from each represents a major axis of heterogeneity across the nine environmental heterogeneity variables considered (see Figure S4).
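The extraction of PC1 can be sketched without external libraries using power iteration. Several things here are assumptions rather than details from the study: the variables are standardised (i.e. the PCA is run on the correlation matrix), only three variables with invented heterogeneity values are used instead of nine, and the function name is hypothetical.

```python
import math

def first_principal_component(columns, iterations=500):
    """Return (loadings, scores, explained_fraction) for PC1.

    `columns` is a list of variables, each a list of values per grid cell.
    Variables are centred and scaled to unit variance (an assumption), and
    the leading eigenvector is found by power iteration.
    """
    z = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (len(col) - 1))
        z.append([(x - mean) / sd for x in col])
    p, n = len(z), len(z[0])
    # Correlation matrix of the standardised variables.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(p)]
            for i in range(p)]
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p))
    scores = [sum(v[i] * z[i][k] for i in range(p)) for k in range(n)]
    return v, scores, eigenvalue / p

# Invented heterogeneity (variance) values for five grid cells.
heterogeneity = {
    "elevation": [10.0, 100.0, 1000.0, 10000.0, 50.0],
    "MAP": [2.0, 8.0, 150.0, 900.0, 20.0],
    "pH": [0.02, 0.05, 0.8, 12.0, 0.3],
}
logged = [[math.log10(x) for x in values] for values in heterogeneity.values()]
loadings, scores, explained = first_principal_component(logged)
```

The `scores` list plays the role of the PC1 heterogeneity axis against which richness would be regressed, and `explained` is the fraction of total (standardised) variance captured by that axis.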

Usage notes

[Taken from README]

Be sure to un-zip all the .zip-files, as these contain shape-files needed as inputs for parts of the analysis.

CSV-files

Inputs

  • `cleaned-species-occ_GCFR.csv`
  • `cleaned-species-occ_SWAFR.csv`

Cleaned and filtered vascular plant species (= Tracheophyta) occurrence data for the GCFR and SWAFR, derived from GBIF. Note that these files include occurrences of taxa with <5 occurrences in either region (unlike the shape-file `species_occ2.zip`, below).

Citations:

  • GBIF.org (24 July 2017) GBIF Occurrence Download. DOI: https://doi.org/10.15468/dl.n6u6n0. URL: https://www.gbif.org/occurrence/download/0005227-170714134226665.
  • GBIF.org (24 July 2017) GBIF Occurrence Download. DOI: https://doi.org/10.15468/dl.46okua. URL: https://www.gbif.org/occurrence/download/0005227-170714134226665.

Outputs

Analyses' results:

  • `comparing-residuals-w-and-wo-outliers_F-tests.csv`
  • `comparing-residuals-w-and-wo-outliers.csv`
  • `list-outlier-squares.csv`
  • `multivariate-model-ANOVAs.csv`
  • `multivariate-model-results_refit.csv`
  • `multivariate-model-results.csv`

The summaries and results from univariate models at various spatial scales have file-names of the form:

`<scale>_richness_univariate_model_results.csv`

E.g.: `QDS_richness_univariate_model_results.csv`

Outputs based on raster-files:

These contain the species richness and environmental heterogeneity data in GCFR and SWAFR grid cells. They have file-names of the form:

`species-richness_<scale>.csv`

E.g.: `species-richness_QDS.csv`

`heterogeneity_<scale>.csv`

E.g.: `heterogeneity_QDS.csv`

Shape-files (inputs only)

Be sure to un-zip all these .zip-files, as these contain the shape-files needed as inputs for parts of the analysis.

The boundaries of the two regions used here:

  • `GCFR_border_buffered.zip`
  • `SWAFR_border_buffered.zip`

Various spatial scales' grid-cell lattices, with file-names of the form:

`Larsen_grid_<scale>.zip`

E.g.: `Larsen_grid_QDS.zip`

Lastly, `species_occ2.zip` contains the cleaned and filtered vascular plant species (= Tracheophyta) occurrence data, additionally filtered to only contain occurrences of taxa with ≥5 occurrences in either region, as a shape-file.
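For readers working outside R, the un-zipping step can be scripted with Python's standard library. This is a minimal sketch: extracting each archive into a folder named after it is a design choice made here, not something the README prescribes, and `data/` is a hypothetical download directory.

```python
import zipfile
from pathlib import Path

def unzip_all(data_dir):
    """Extract every .zip archive in `data_dir` into a folder named after the archive."""
    extracted = []
    for archive in sorted(Path(data_dir).glob("*.zip")):
        target = archive.with_suffix("")  # e.g. Larsen_grid_QDS.zip -> Larsen_grid_QDS/
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted

# Example call (assumes the dataset was downloaded to a hypothetical "data" folder):
# unzip_all("data")
```

If the analysis scripts expect the shape-files alongside the archives rather than in sub-folders, `target` can be replaced with `archive.parent`.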

Raster-files

Note, the suffixes of file-names here denote the spatial scale, as follows:

  • QDS = quarter-degree square resolution
  • HDS = half-degree square resolution
  • DS  = degree square resolution

Inputs

Raster versions of the grid-cell lattice shape-files above, with file-names of the form:

`Larsen_grid_<scale>_ras.tif`

E.g.: `Larsen_grid_QDS_ras.tif`

Absolute environmental variables (0.05°×0.05° resolution) in each region separately, with file-names of the form:

`<region>_<variable>.tif`

E.g.: `GCFR_CEC.tif`, `SWAFR_CEC.tif`

Outputs

Raster-form species richness data, with file-names of the form:

`species-richness_<scale>.tif`

E.g.: `species-richness_QDS.tif`

`mean-<sub-cell scale>-richness_<scale>.tif`

E.g.: `mean-QDS-richness_HDS.tif`

Environmental heterogeneity layers based on the absolute environmental variables above (derived as the variance of sub-grid-cell values within a grid-cell). The file-names are of the form:

`heterogeneity-<variable>_<scale>.tif`

E.g.: `heterogeneity-CEC_QDS.tif`

And finally, raster-form residuals from PC1-based univariate models and multivariate models of species richness at various spatial scales. File-names are of the form:

`MV-residual_<scale>.tif`

E.g.: `MV-residual_QDS.tif`

`PC1-residual_<scale>.tif`

E.g.: `PC1-residual_QDS.tif`
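The output naming patterns above lend themselves to a small helper that builds the expected raster file-names for a given scale. This is a sketch under assumptions: the helper is hypothetical, `CEC` is the only variable token confirmed by the examples above, and `mean-HDS-richness_DS.tif` is inferred from the pattern rather than listed explicitly. It only constructs names and does not check which files the dataset actually contains.

```python
# Sub-cell scale nested inside each coarser scale (QDS within HDS, HDS within DS).
SUB_SCALE = {"HDS": "QDS", "DS": "HDS"}

def output_raster_names(scale, variables=("CEC",)):
    """Build output raster file-names for one spatial scale from the patterns above."""
    names = [
        f"species-richness_{scale}.tif",
        f"MV-residual_{scale}.tif",
        f"PC1-residual_{scale}.tif",
    ]
    if scale in SUB_SCALE:
        # Mean richness of the sub-cells within each cell at this scale.
        names.append(f"mean-{SUB_SCALE[scale]}-richness_{scale}.tif")
    names.extend(f"heterogeneity-{variable}_{scale}.tif" for variable in variables)
    return names

qds_names = output_raster_names("QDS")
hds_names = output_raster_names("HDS")
```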
 

Funding

National Research Foundation (South Africa)

South African Association of Botanists
