Aim: The African Guineo-Congolian (GC) region is a global biodiversity hotspot with high species endemism, bioclimatic heterogeneity, complex landscape features, and multiple biogeographic barriers. Bioclimatic and geographic variables influence global patterns of species richness and endemism, but their relative importance varies across taxa and regions and is poorly understood for many faunas. We characterized patterns of richness and turnover in endemic amphibians of the GC biodiversity hotspot and evaluated the relative roles of geographic distance and bioclimatic variables in predicting turnover.

Location: West and Central Africa.

Major taxa studied: Amphibians

Methods: We compiled species-occurrence records via field sampling, online databases, and taxonomic literature. Our study used 1205 unique georeferenced records of 222 amphibian species endemic to the GC region. Patterns of species richness were mapped onto a grid with a spatial resolution of 0.5° × 0.5°. We estimated weighted endemism and tested whether endemism was higher than the expected species richness (randomization test). We quantified species turnover using generalized dissimilarity modelling to evaluate the processes underlying observed patterns of species richness in GC endemic amphibians. We explored bioregionalization using agglomerative hierarchical clustering based on the unweighted pair group method with arithmetic averages.

Results: We identified six areas within the lower GC region – forests in Southern Nigeria, Cameroon, Equatorial Guinea, Gabon, Republic of Congo, and Democratic Republic of Congo – as having high species richness of endemic amphibians. The randomization test returned four major areas of significant weighted endemism: Nigeria-Cameroon mountains, forest regions of the Democratic Republic of Congo, Cote d’Ivoire, and Ghana. Our analysis revealed five bioregions for amphibian endemism, four of which were located within the lower Guineo-Congolian forest. Species turnover was strongly related to the geographic distance between grid cells; contributing bioclimatic variables included precipitation of the warmest quarter, mean temperature of the wettest quarter, and mean diurnal temperature range.

Main conclusions: Our results indicate that geographic distance between grid cells is the primary determinant of turnover in GC endemic amphibians, with secondary but significant effects of rainfall- and temperature-related variables. Our results identify key areas of endemic amphibian richness that could be prioritized for conservation actions.

Distributional data

To generate a comprehensive checklist of endemic amphibians of the GC forest, we searched the Amphibiaweb database (https://amphibiaweb.org/; last accessed date: 1 April 2021) and the Amphibian Species of the World database (https://amphibiansoftheworld.amnh.org; last accessed date: 1 April 2021) to confirm the geographic distribution and taxonomic citations of amphibian species. We further searched the International Union for Conservation of Nature (IUCN) Red List database (https://www.iucnredlist.org) to cross-validate each species' geographic range and IUCN Red List status. We considered extant (resident) species in the GC forest for this study. Records with inaccurate/incomplete geographical location points were eliminated. Each species record was double-checked to match the currently recognized species distributions to produce an endemic amphibian species inventory for GC.

After this process, our initial data comprised 310 endemic amphibian species belonging to 41 genera and 17 families. Of these, 305 species were anurans and 5 were caecilians. The most species-rich anuran families were Arthroleptidae (86 species), Hyperoliidae (81 species), Phrynobatrachidae (49 species), Bufonidae (28 species), and Pipidae (16 species). Other families were represented by ≤ 9 species. According to the IUCN Red List, 122 (39%) of the endemic species are considered Least Concern, 78 (25%) are Data Deficient, and 110 (36%) are threatened to varying degrees: Near Threatened (6%), Vulnerable (8%), Endangered (14%) and Critically Endangered (8%).

For the geographic occurrence of each species, we consulted the Global Biodiversity Information Facility (GBIF; https://www.gbif.org; last accessed date: 17 April 2021). We manually performed quality evaluation of the dataset to identify uncertainty and mistakes in the geographic or taxonomic information before further analyses. Each record was reviewed for species identification; we included only individuals with ‘accepted’ status in the ‘taxonomicstatus’ field in GBIF and excluded those with ‘doubtful’ taxonomic status. We also eliminated records with inaccurate geographic information, records that could not be georeferenced, duplicate records, and records outside the GC region. We assembled other species records from published literature and online portals (the University of California’s Museum of Vertebrate Zoology; https://mvz.berkeley.edu; last accessed date: 21 April 2021). We restricted records to those collected between 1980 and the date of access. Lastly, we conducted field surveys to cover an underrepresented area in Nigeria.

Our initial dataset included 7536 valid georeferenced records. We could not obtain georeferenced records for 88 species due to a lack of records in either online databases or published primary literature, which reduced our dataset to 222 species, representing 72% of the endemic amphibian species currently known from the GC region. Of the 7536 valid records, 7476 were from searched databases and primary literature, and 60 additional georeferenced records were from our field surveys in an underrepresented area in Nigeria. Our initial datasets included >500 georeferenced records each from Cameroon, Gabon, the Republic of Congo, Ghana, Equatorial Guinea, and the Democratic Republic of Congo. We obtained 11-499 occurrence records each from Liberia, Nigeria, Guinea, Cote d'Ivoire, Benin, and Sierra Leone. Burkina Faso, Guinea, the Central African Republic, and Togo each had <10 georeferenced records for endemic amphibians.

We performed data cleaning and quality control of the records using the R package CoordinateCleaner (Zizka et al., 2019). This automatically filters common erroneous coordinates in public databases, such as those assigned to the sea, capital cities, or biodiversity institutions (Maldonado et al., 2015). To avoid potential spatial autocorrelation among occurrence records or uneven sampling, we thinned the records by removing localities within a 15 km radius of one another using the package spThin, (Aiello-Lammens et al., 2015), allowing only a single record per pixel (grid cell size of 0.5° × 0.5°). This helps reduce the biased selection of variables or model coefficients (Cruz et al., 2014). After removing duplicates of georeferenced records, we retained 1205 unique records of 222 endemic species for further analysis

Environmental data

We downloaded 19 bioclimatic variables from the CHELSA project (Climatologies at High Resolution for the Earth’s Land Surface Areas; http://chelsa-climate.org/; Karger et al., 2017) at 30 arc-seconds resolution corresponding to a spatial resolution of 1 km at the equator. We also downloaded a digital elevation model (DEM) from https://opentopography.org/about (Krishnan et al., 2011). We then aggregated all rasters to a 0.1° × 0.1° resolution and resampled the DEM to match the spatial resolution of the bioclimatic rasters. We also calculated the slope from the DEM. The largest difference in elevation between each site and its eight nearest neighbors was used to calculate relief roughness (also referred to as terrain ruggedness; Riley et al., 1999). All variables were cropped to a 1-degree buffer around all species records to ensure that the background of the environmental data for the species distribution modelling is not restricted to the inferred species range but also encompasses potentially relevant points lying outside the set of species distribution records. We prepared all data using the R package raster.

We estimated the Variance Inflation Factor (VIF) to measure the degree of multicollinearity among all variables. We removed variables exceeding a VIF threshold of 10 using the packages USDM (Naimi et al., 2014) and retained those without multicollinearity for further analyses. The retained variables included relief roughness and eight bioclimatic variables (Mean Diurnal Range; Maximum Temperature of Warmest Month; Mean Temperature of Wettest and Coldest Quarters; Precipitation of Wettest, Driest, and Warmest Quarters; and Precipitation Seasonality).

Species ranges

Given the recommended minimum sample size of 3 for species range estimation and modelling (van Proosdij et al., 2016), we excluded 58 species with fewer than three unique records, leaving 164 species. Although we acknowledge that a minimum sample size of three is small, we consider it justifiable because (a) it makes our analyses more inclusive and representative of the endemic amphibian fauna; (b) endemic species often occur at low abundance in restricted ranges with specific habitat requirements; (c) 71% of retained species had >10 records and 29% had >3 records; and (d) previous studies show that range-size estimation based on a minimum of 3 samples can nonetheless provide valuable predictions (Hernandez et al., 2006; Pearson et al., 2006; Van Proosdij et al., 2016; Deb et al., 2017; Zizka et al., 2020; Bharti et al., 2021; Ye et al., 2021). For the 164 species with ≥ 3 records, we used a two-tier approach to estimate range sizes depending on the total number of records per species. For species with 3–10 unique records (n = 47), which are likely to be rare and range-restricted, we used the minimum convex polygon (MCP) method (Mohr, 1947; Bekoff & Mech, 1984) to estimate range sizes using a half-degree buffer to limit over-prediction (Whitfield, 1984). We used MCP because previous studies (Kazmaier et al., 2002; Row & Blouin-Demers, 2006) have supported the use of this method for estimating the range size of reptiles and amphibians in general and for range-restricted amphibians in particular (Watson et al., 2003; Miaud and Sanuy, 2005; Blomquist & Hunter, 2009; Boenke, 2011). For species with >10 unique records (n = 117), which are likely to be widespread, we estimated MCPs using a one-degree buffer to ensure that ecologically relevant background points (pseudo-absences) are included in the species distribution modelling. Thereafter, we removed unsuitable areas within polygons following Förderer et al. (2018). To do this, we used species distribution modelling to overlay the MCPs and estimate the final species ranges.

We used the Maxent algorithm for species distribution modelling (Phillips et al., 2006). Using the ENMevaluate function from the R package ENMeval (Muscarella et al., 2014), we performed model selection with testing parameters such as feature class combinations (linear, quadratic, product, and threshold). Regularization multipliers were selected in Maxent using ENMevaluate. Using cross-validation (k = 10), we evaluated models and selected those with second-order Akaike Information Criterion difference (∆AICc) < 2 and Area Under the Receiver Operator Curve (AUC; a measure of the diagnostic test accuracy) > 0.75. We considered AUC values > 0.75 as useful for modelling based on the recommendations of Elith (2002) and the results of previous studies on amphibians (e.g., Urbina-Cardona & Loyola, 2008; Milanovich et al., 2010; García et al., 2014; Barrett et al., 2014). To delineate the extent of the projected data within the polygons for each species, we generated ~1000 background points. To select background points in each grid cell, we generated a sampling-bias layer based on historical sampling for African amphibians estimated by Farooq et al. (2021). In this way, more ecologically relevant background points were selected for the SDM, thus fairly reflecting the ecological variability and sampling history of the study region. Finally, we used a threshold based on maximum specificity and sensitivity to produce binary predictions (presence-absence), which we converted into range maps for each species (Nenzen & Araujo, 2011).

Species richness and endemism

To evaluate biogeographic patterns of endemic amphibian species richness across the GC region (question 1), we quantified species richness and endemism using two approaches of Crisp et al. (2001) and Linder (2001). Species richness was calculated as the number of endemic species in a grid cell (0.5° × 0.5°). We mapped species richness at a coarse resolution to reduce potential bias in sampling effort and enable a better understanding and visualization of richness patterns (Graham and Hijmans, 2006). Endemism was measured as weighted endemism, where the proportion of endemics is inversely weighted by range size. The endemism value for a cell equals the sum of these weights for all species in the cell. To test whether weighted endemism was higher than the expected species richness, we produced replicate random draws from the species pool based on the observed species richness (i.e., the same number of species) and the actual species frequencies (the more frequent a species, the more likely it is to be drawn). For estimations of weighted endemism, we used the functions “weighted endemism” and “endemism.null.test” from the R package biomapME (Guerin et al., 2015). The distribution of the resulting set of null endemism scores was compared to the observed endemism. Subsequently, grid cells were mapped as higher or lower than expected (based on significance testing and comparison to null quantiles) to produce a binary-weighted endemism map. Finally, we performed sensitivity analyses in relation to the mapping strategy and for the presence of species with <3 unique records per grid cell. To do this, we calculated species richness and endemism with the following datasets: (1) without SDMs to remove unsuitable habitats, and (2) without species with <3 unique records per grid cell.

Generalized Dissimilarity Modelling (GDM)

To evaluate how much of the species turnover can be attributed to the independent and combined influences of geographic distance and bioclimatic variables (question 2), we used Generalized Dissimilarity Modelling (GDM). This method is widely used to identify the contributions of different factors to explaining beta diversity patterns (e.g., Blois et al. 2013; Fitzpatrick et al. 2013; Capinha et al. 2015). We focused only on turnover (reflecting replacement), to the exclusion of nestedness (reflecting species loss), in light of multiple studies indicating that turnover dominates amphibian beta diversity in regions below the 37^th parallel north, which encompasses all of Africa (Baselga et al., 2012; Jiménez-Robles et al., 2017; Azevedo et al., 2021); however, we acknowledge that a comprehensive understanding of beta diversity requires evaluating both turnover and nestedness (Baselga, 2010) and suggest this as an objective for future study on the endemic amphibians of the GC region.

We calculated species turnover among grid cells with >5 species, thereby minimizing possible effects of nestedness that might arise when species in locations with very low species richness are subsets of species in areas with relatively high species richness (Wright & Reeves, 1992; Ulrich & Gotelli, 2007). We built GDM models using all possible combinations of bioclimatic variables (after removing those deemed to be colinear, as described above). We then consecutively excluded from the full model the bioclimatic variable with the smallest contribution. In each iteration, we calculated the relative variable importance (percentage of explained deviance between a model with and without a variable) and significance (P-value < 0.05) through matrix permutations for each variable (n = 1,000). According to Wagner & Fortin (2015), model selection using AIC is not applicable to analyses of regression matrices such as GDM. Thus, we used a comparable approach that involves selection of the model with the highest explained deviance (Azevedo et al., 2021). This approach retained only important bioclimatic variables after more than 160 successive permutations (following the 0.16 optimization level suggested by Heinze et al., 2018). All analyses were done in R using the package betapart (Baselga, 2010; Baselga et al., 2018).

We used I-splines to identify the most important bioclimatic variables that could account for the nonlinear monotonic relationships between variables and species turnover (i.e., partial ecological distance). We produced the models using the R-package gdm (Ferrier et al., 2007). We then used GDMs to predict species turnover across the entire study area (background of bioclimatic variables). The results were clustered with the unweighted pair group method using arithmetic averages (UPGMA) to define bioregions with similar species composition. UPGMA is a frequently used agglomerative hierarchical clustering approach with good performance in clustering dissimilarity for bioregionalization studies (Kreft & Jetz, 2010).

Patterns of species richness and turnover in endemic amphibians of the Guineo-Congolian rainforest

Data files

Abstract

Distributional data

Environmental data

Species ranges

Species richness and endemism

Generalized Dissimilarity Modelling (GDM)

Patterns of species richness and turnover in endemic amphibians of the Guineo-Congolian rainforest

Data files

Abstract

Methods

Distributional data

Environmental data

Species ranges

Species richness and endemism

Generalized Dissimilarity Modelling (GDM)

Works referencing this dataset