Data for: Beyond Protected Areas: The importance of mixed-use landscapes for the conservation of Sumatran elephants (Elephas maximus sumatranus)
Data files
Sep 08, 2023 version files 135.88 MB
- ElephantConnectivity.tif
- ElephantsCircuitscape.R
- GPSPts_50km.tif
- PresencePts_50km.tif
- README.md
- SumatranElephantHabitatSuitability.tif
Abstract
Elephants were once widely distributed across the Indonesian island of Sumatra but now exist in small, isolated populations. Using the best data currently available on elephant occurrence, we mapped suitable habitat for Sumatran elephants (Elephas maximus sumatranus). We used direct sightings and indirect observations of elephant signs, as well as six remotely-sensed proxies of surface ruggedness, vegetation productivity and structure, and human land use and disturbance, to model habitat suitability in a Google Earth Engine (GEE) environment. We validated the habitat suitability prediction using an independent geolocation dataset collected from global positioning system (GPS) collars fitted on elephants, and also assessed the functional connectivity between known elephant population ranges by deriving a resistance surface for Circuitscape from this prediction. Thirty-two percent (135,646 km²) of Sumatra’s land area was predicted to be suitable habitat, with 43 patches of suitable habitat located across Sumatra. Areas with high connectivity were concentrated in the Riau and North Sumatra provinces. Though our analysis highlights the need to improve the quality of data collected on Sumatran elephants, it suggests that more suitable habitat remains on Sumatra than is used by known populations. Targeted habitat conservation, especially of the suitable habitat in and around the Lamno, Balai Raja, Tesso Tenggara, Tesso Utara, Bukit Tigapuluh, Seblat, Padang Sugihan, and Bukit Barisan Selatan ranges, may improve the long-term viability of this critically endangered subspecies.
Methods
Data Collection & Cleaning
We obtained 2,952 observations of elephant occurrence on the island of Sumatra from the Indonesian Ministry of Environment and Forestry, including the Natural Resource Conservation Agencies of South Sumatra and Jambi provinces (see Table S1 in Supporting Information). Observations were either direct sightings or indirect observations of elephant signs (i.e., dung, scrapings, footprints, trails). All observations were collected between 2011 and 2020 during elephant population surveys, park ranger patrols, or investigations of human-elephant conflict. Using high-resolution satellite imagery in Google Earth Engine (GEE), we removed potentially erroneous occurrence observations that were located on top of buildings or in large water bodies. Our remaining occurrence dataset consisted of 2,916 georeferenced locations.
Asian elephants are a generalist species and occur across a wide variety of ecosystems and habitats, from grassland savannahs to mangroves and tropical forests (Leimgruber et al., 2011; Lin et al., 2008; Neupane, Kwon, Risch, Williams, & Johnson, 2019; Varma, 2008). Habitat selection within specific ecosystems may be driven by forage availability, topographic factors, and the presence of humans (Calabrese et al., 2017; Moßbrucker, Fleming, Imron, Pudyatmoko, & Sumardi, 2016b). To understand how habitat availability influences Sumatran elephant distribution, we integrated six remotely-sensed proxies of surface ruggedness, vegetation productivity and structure, and human land use and disturbance (Table 2).
We derived surface ruggedness from 30 m Shuttle Radar Topography Mission (SRTM) elevation data (Farr et al., 2007), calculating the standard deviation of elevation within a 500 m radius buffer. Pixel neighborhoods with large standard deviations are rougher (i.e., have steeper slopes) than neighborhoods with small standard deviations (i.e., flat areas). We used Normalized Difference Vegetation Index (NDVI) data derived from the Moderate-Resolution Imaging Spectroradiometer (MODIS; MOD13Q1, 250 m spatial resolution) as a proxy for primary productivity (Pettorelli et al., 2011). To characterize habitat structure, we used C-band and L-band synthetic aperture radar data (Shimada et al., 2014). In forests, these layers respectively denote the structure of the canopy and understory (Berninger, Lohberger, Stängel, & Siegert, 2018; Omar, Misman, & Kassim, 2017; Thapa, Shimada, Watanabe, Motohka, & Shiraishi, 2013). We also incorporated the location of oil palm plantations, inclusive of year of establishment (Danylo et al., 2021; 30 m spatial resolution). We used the number of years since each pixel had been converted from forest to oil palm as the predictor variable, assuming that selection by elephants changes throughout the economic lifespan of the oil palm plantation (Evans, Goossens, & Asner, 2017). All predictor variables were resampled to 250 m resolution using the default nearest-neighbor method in GEE. To evaluate potential collinearity among environmental predictors, we estimated the Spearman correlation at 5,000 random locations across Sumatra and calculated the Variance Inflation Factor (VIF) of the predictor layers using the R package usdm (Naimi, 2017). We found no strong correlation among the covariates incorporated in the analyses (< 0.7 for all pairwise correlations; VIF < 3 for all covariates).
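The collinearity screen described above can be sketched in a few lines. This is only an illustration in Python with NumPy, not the usdm workflow used for the dataset: the function names, the random stand-in data, and the simple tie handling in the rank step are all assumptions. It relies on the standard identity that the VIF of each variable equals the corresponding diagonal entry of the inverse Pearson correlation matrix.

```python
import numpy as np

def spearman_matrix(X):
    """Pairwise Spearman correlation: Pearson correlation of column-wise ranks."""
    # argsort of argsort gives 0-based ranks. Ties are not averaged here,
    # which is an assumption that only suits continuous predictor values.
    ranks = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)

def vif(X):
    """VIF per column, taken from the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

rng = np.random.default_rng(0)
# Hypothetical stand-in for six predictors sampled at 5,000 random locations.
X = rng.normal(size=(5000, 6))

rho = spearman_matrix(X)
off_diag = np.abs(rho[~np.eye(X.shape[1], dtype=bool)])
vifs = vif(X)

# Screening rule from the text: all pairwise correlations < 0.7 and all VIF < 3.
passes = bool(off_diag.max() < 0.7 and vifs.max() < 3)
```

With real data, X would hold the predictor values extracted at the sampled locations, one column per predictor.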
Model Development
To construct our elephant habitat suitability model, we implemented a workflow in Google Earth Engine (Gorelick et al., 2017) modified from Crego et al. (2022). We chose Google Earth Engine to implement the species distribution model (SDM) because of the easy accessibility of the desired raster products, the availability of the desired algorithms, its high computing capacity, and the easy reproducibility of our modeling framework (see shared code). Google Earth Engine uses a parallel computing system that reduces computation time (Gorelick et al., 2017; Tamiminia et al., 2020).
To reduce the potential bias of clustered presence locations, we randomly thinned the observational data to one location per square kilometer (n = 1,167) (Boria, Olson, Goodman, & Anderson, 2014; Fourcade, Engler, Rödder, & Secondi, 2014; Veloz, 2009). In our modeling framework, we used random forest classifiers and a repeated (10-fold) spatial block cross-validation approach (Roberts et al., 2017; Valavi et al., 2019). From the different machine learning methods available in GEE, we chose random forests because they are known to perform well compared with other classifier algorithms (Crimmins, Dobrowski, & Mynsberge, 2013; Hao, Elith, Guillera-Arroita, & Lahoz-Monfort, 2019; Hao, Elith, Lahoz-Monfort, & Guillera-Arroita, 2020; Valavi, Guillera‐Arroita, Lahoz‐Monfort, & Elith, 2022). For the cross-validation, we defined 50 x 50 km blocks that were randomly split 10 times, with 70% used for model training and 30% for model validation, to ensure spatial independence between training and validation datasets. We created blocks across the entire island of Sumatra because elephants once ranged across its entire extent (Jackson, 1990). At each model iteration, a set of presence points from the training block set was selected, and an equal number of pseudo-absences was created randomly within the area of these blocks but at distances > 1 km from any occurrence point. Similarly, within the validation set of blocks, we created a number of random pseudo-absences equal to the number of presences. We used these balanced datasets (i.e., the same number of presences and pseudo-absences) for model fitting and model validation at each iteration because the performance of random forest has been shown to decrease when using imbalanced datasets (Evans et al., 2011; Barbet-Massin et al., 2012; Sillero et al., 2021). Each random forest was run with 500 trees.
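As a rough illustration of the thinning and spatial block split described above, the following Python sketch uses NumPy only. It is not the shared GEE code: the function names are hypothetical, coordinates are assumed to be in a projected system with metre units, the grid origin is assumed to be (0, 0), and only blocks that contain points are split.

```python
import numpy as np

def thin_to_grid(xy, cell_m=1000.0, seed=0):
    """Keep one randomly chosen point per square grid cell of side cell_m."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(xy))            # shuffle so the kept point is random
    cells = np.floor(xy[order] / cell_m).astype(np.int64)
    _, first = np.unique(cells, axis=0, return_index=True)
    return xy[np.sort(order[first])]

def split_blocks(xy, block_m=50_000.0, train_frac=0.7, seed=0):
    """Assign points to square blocks, then split whole blocks into train/validation."""
    rng = np.random.default_rng(seed)
    blocks = np.floor(xy / block_m).astype(np.int64)
    uniq, block_id = np.unique(blocks, axis=0, return_inverse=True)
    block_id = block_id.ravel()
    n_train = int(round(train_frac * len(uniq)))
    train_blocks = rng.permutation(len(uniq))[:n_train]
    is_train = np.isin(block_id, train_blocks)  # True where the point's block is a training block
    return is_train, block_id

rng = np.random.default_rng(1)
# Hypothetical presence coordinates in metres over a 200 x 200 km extent.
pts = rng.uniform(0, 200_000, size=(3000, 2))
thinned = thin_to_grid(pts)

# Ten random 70/30 block splits, one per model iteration.
splits = [split_blocks(thinned, seed=i)[0] for i in range(10)]
```

Splitting whole blocks rather than individual points is what keeps training and validation locations spatially separated.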
We calculated the relative importance of each predictor variable as the averaged proportional contribution of each band, as indicated by the Gini index calculated by each random forest classifier at each model iteration. We made 10 separate model predictions, one for each model iteration, and then averaged the habitat suitability index (HSI) of each pixel to obtain a final habitat suitability index map.
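The two averaging steps above are simple enough to show directly. The sketch below is a hedged Python illustration with hypothetical function names and toy inputs. It assumes raw importance scores arrive as one row per model iteration and predictions as a stack of equally sized rasters.

```python
import numpy as np

def mean_relative_importance(raw_importance):
    """raw_importance: (n_iterations, n_predictors) array of raw importance scores."""
    raw = np.asarray(raw_importance, dtype=float)
    proportional = raw / raw.sum(axis=1, keepdims=True)  # each iteration sums to 1
    return proportional.mean(axis=0)                     # average across iterations

def mean_suitability(predictions):
    """predictions: (n_iterations, rows, cols) stack of per-iteration suitability maps."""
    return np.asarray(predictions, dtype=float).mean(axis=0)

# Toy example: two iterations, two predictors.
importance = mean_relative_importance([[2.0, 2.0], [1.0, 3.0]])

# Toy example: two 2 x 2 prediction rasters averaged pixel by pixel.
hsi = mean_suitability([[[0.2, 0.4], [0.6, 0.8]],
                        [[0.4, 0.4], [0.2, 1.0]]])
```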
Model Validation
To assess model accuracy, we calculated Area Under the Precision-Recall Curve (AUC-PR), sensitivity (the true positive rate), and specificity (the true negative rate) for each model iteration (Fielding & Bell, 1997; Sofaer, Hoeting, & Jarnevich, 2019). To calculate sensitivity and specificity, we used the averaged threshold value that maximized the sum of the sensitivity and specificity among the 10 model iterations (Liu, Newell, & White, 2016). We also applied this threshold to create a binary potential distribution map across the island. Additionally, we validated the habitat suitability prediction using an independent geolocation dataset for Sumatran elephants. The geolocation dataset was collected from global positioning system (GPS) collars fitted on elephants by a variety of organizations in conjunction with the Indonesian Ministry of Environment and Forestry (see Tables S2 and S3 in Supporting Information). Many of these data lacked critical supporting information (i.e., date, time, dilution of precision). To reduce potential biases from inaccurate or duplicated geolocations, we relied on data summaries for model validation. We produced a histogram to examine the habitat suitability predicted by our model at the GPS point locations, reasoning that the model would be a good representation of elephant-suitable habitat if collared elephants consistently used the predicted suitable habitat (i.e., pixels with HSI above the threshold value). We also calculated the sensitivity for this GPS point dataset using the averaged threshold value.
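The threshold selection described above can be illustrated with a short Python sketch. The function name, the grid of candidate thresholds, and the toy scores are assumptions made for illustration; the original analysis was run in GEE.

```python
import numpy as np

def best_threshold(y_true, score, thresholds=np.linspace(0, 1, 101)):
    """Return (threshold, sensitivity, specificity) maximising sensitivity + specificity."""
    y_true = np.asarray(y_true, dtype=bool)
    score = np.asarray(score, dtype=float)
    best = None
    for t in thresholds:
        pred = score >= t                      # predicted suitable at this threshold
        sens = np.mean(pred[y_true])           # true positive rate
        spec = np.mean(~pred[~y_true])         # true negative rate
        if best is None or sens + spec > best[1] + best[2]:
            best = (float(t), float(sens), float(spec))
    return best

# Toy validation set: three presences and three pseudo-absences.
y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
t, sens, spec = best_threshold(y, s)

# Sensitivity of a hypothetical GPS dataset at that threshold:
hsi_at_gps = np.array([0.95, 0.6, 0.3, 0.8])
gps_sensitivity = float(np.mean(hsi_at_gps >= t))
```

In the workflow described here, the threshold would first be averaged over the 10 model iterations before being applied to the final map and to the GPS locations.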
Assessing the Patchiness of Suitable Habitat & its Relative Distribution In and Out of Protected Areas
We used ArcGIS Pro (version 2.8.0, Environmental Systems Research Institute 2021) to summarize the percent of predicted suitable habitat inside and outside Sumatra’s protected areas. We considered national parks, wildlife reserves, natural reserves, grand forest parks, natural tourism parks, and grand gardens receiving conservation status by the Indonesian government as protected areas. We used the Region Group tool in ArcGIS Pro to map patches of suitable habitat based on the potential distribution map. We defined a patch as all orthogonally contiguous raster cells of suitable habitat with a total area greater than 275 km², which is the minimum area-corrected autocorrelated kernel density estimate (AKDEC) home range size for elephants in Bukit Tigapuluh, Sumatra (Moßbrucker et al., 2016a).
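The patch definition above (orthogonal contiguity plus a minimum area) can be mimicked outside ArcGIS Pro. The Python sketch below is an illustrative stand-in for the Region Group step, not a reproduction of it: the function name is hypothetical, and the default cell area assumes the binary map is on the 250 m grid, so each cell covers 0.0625 km² and 275 km² corresponds to 4,400 cells.

```python
from collections import deque
import numpy as np

def suitable_patches(binary, pixel_km2=0.0625, min_area_km2=275.0):
    """Label 4-connected patches of True cells whose area exceeds min_area_km2."""
    binary = np.asarray(binary, dtype=bool)
    rows, cols = binary.shape
    labels = np.zeros(binary.shape, dtype=np.int32)  # 0 = unvisited or unsuitable
    n_patches = 0
    for r0, c0 in zip(*np.nonzero(binary)):
        if labels[r0, c0] != 0:
            continue
        labels[r0, c0] = -1                          # -1 = visited, not yet accepted
        queue, cells = deque([(r0, c0)]), [(r0, c0)]
        while queue:
            r, c = queue.popleft()
            # Orthogonal neighbours only: up, down, left, right.
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= rr < rows and 0 <= cc < cols
                        and binary[rr, cc] and labels[rr, cc] == 0):
                    labels[rr, cc] = -1
                    queue.append((rr, cc))
                    cells.append((rr, cc))
        if len(cells) * pixel_km2 > min_area_km2:
            n_patches += 1
            rs, cs = zip(*cells)
            labels[list(rs), list(cs)] = n_patches
    labels[labels < 0] = 0                           # drop patches below the area cutoff
    return labels, n_patches

# Toy grid with 1 km2 cells and a 3 km2 cutoff: only the 2 x 2 block qualifies.
grid = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 1],
                 [0, 0, 1, 0],
                 [0, 0, 0, 1]])
labels, n = suitable_patches(grid, pixel_km2=1.0, min_area_km2=3.0)
```

Cells that touch only at a corner, such as the three isolated cells in the toy grid, are treated as separate patches under this rule.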
Modeling Landscape Connectivity from Known Elephant Populations
To assess the functional connectivity between known elephant population ranges, we used Circuitscape 4.0 (Anantharaman, Hall, Shah, & Edelman, 2019). Circuitscape uses circuit theory to model connectivity and considers the landscape as a resistance surface in which animals move as random walkers without a complete knowledge of the landscape (McRae, Shah, & Mohapatra, 2013). The resulting resistance and conductance values are proportional, showcasing the relative probability of movement through different pathways on the landscape (Shah & McRae, 2008). We derived our resistance surface from our habitat suitability prediction, inverting the suitability values and scaling between 1 and 100 (i.e., minimum to maximum resistance). We used the program’s pairwise mode that considers each known population as an electrical node and runs a theoretical current between each pair of nodes. A cumulative connectivity map was created by adding the currents generated between every node pair. We identified corridors in the top 15% of cumulative current as areas of high connectivity (Theron, Pryke, & Samways, 2022).
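For readers who want to see the two raster operations in this step, the following Python sketch shows one way to derive a resistance surface and flag high-connectivity cells. It is an assumption-laden illustration: the function names are hypothetical, the rescaling is assumed to be linear, and "top 15%" is interpreted as cells at or above the 85th percentile of cumulative current. The Circuitscape run itself is not reproduced here.

```python
import numpy as np

def suitability_to_resistance(hsi, r_min=1.0, r_max=100.0):
    """Invert suitability and rescale linearly so resistance spans r_min..r_max."""
    hsi = np.asarray(hsi, dtype=float)
    lo, hi = np.nanmin(hsi), np.nanmax(hsi)
    inverted = (hi - hsi) / (hi - lo)       # 0 at most suitable, 1 at least suitable
    return r_min + inverted * (r_max - r_min)

def high_connectivity(current, top_frac=0.15):
    """Boolean mask of cells in the top fraction of cumulative current."""
    current = np.asarray(current, dtype=float)
    cutoff = np.nanquantile(current, 1.0 - top_frac)
    return current >= cutoff

# Toy suitability values: most suitable cell gets resistance 1, least gets 100.
resistance = suitability_to_resistance([0.0, 0.5, 1.0])

# Toy cumulative current map with values 0..99.
mask = high_connectivity(np.arange(100.0))
```

The resistance array would be written to a raster and passed to Circuitscape in pairwise mode; the mask would then be computed from the cumulative current map that Circuitscape returns.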