
Finding what you don’t know: testing SDMs methods for poorly-known species

Cite this dataset

Radomski, Tom et al. (2022). Finding what you don’t know: testing SDMs methods for poorly-known species [Dataset]. Dryad.


Aim: A limitation of species distribution models (SDMs) is that species with low sample sizes are difficult to model. Yet it is often important to know the habitat associations of poorly known species to guide conservation efforts. Techniques have been proposed for modeling species’ distributions from few records, but their performance relative to one another has not been compared. Because these models are built and evaluated with small datasets, sampling error could cause severely biased sampling in environmental space. As a result, SDMs are likely to underpredict geographic distributions given small sample sizes. We perform the first comparison of methods explicitly promoted or developed for predicting the geographic ranges of species with very low sample sizes.

Location: North Carolina, USA

Taxon: South Mountains Gray-cheeked Salamander (Plethodon meridianus)

Methods: Using the sparse, existing georeferenced records of P. meridianus, we built SDMs using a range of methods that previous researchers have argued should work for low sample sizes. We then tested each SDM’s ability to accurately predict independent survey data that were not georeferenced prior to our study. We compared SDMs using omission error and AUC.
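As a rough illustration of the two evaluation metrics named above, the following minimal Python sketch computes an omission error rate and a rank-based AUC from model suitability scores. The scores, the 0.35 threshold, the use of background sites as the comparison set, and the function names are all assumptions made for illustration; the abstract does not state which threshold or AUC formulation the authors used.

```python
def omission_rate(presence_scores, threshold):
    """Fraction of known-presence sites that the model scores below the threshold."""
    if not presence_scores:
        raise ValueError("need at least one presence record")
    missed = sum(1 for s in presence_scores if s < threshold)
    return missed / len(presence_scores)


def auc(presence_scores, background_scores):
    """Rank-based AUC: chance that a presence site outscores a background site.

    Ties count as half a win.
    """
    if not presence_scores or not background_scores:
        raise ValueError("need at least one score in each group")
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical suitability scores (not taken from the dataset).
survey_presences = [0.82, 0.64, 0.31, 0.12]   # independent survey records
background = [0.05, 0.20, 0.40, 0.10, 0.70]   # randomly sampled background sites

print(omission_rate(survey_presences, threshold=0.35))
print(auc(survey_presences, background))
```

A high omission rate means the model misses many independently surveyed populations, which is the failure mode this study is most concerned with for small samples.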

Results: Roughly half of the models successfully predicted survey records in the range interior, and all models had high omission error rates in the range exterior. In both the range interior and exterior, the ‘ensemble of small models’ technique produced SDMs with high omission error rates. Spatial filtering had negligible impact on model performance. Most, but not all, models outperformed predictions using distance from known populations. Using one of the best-performing methods, we developed an improved range map of P. meridianus.
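The distance-based comparison mentioned above can be pictured as a simple null predictor: a site is ranked as more likely to be occupied the closer it lies to any known record. The sketch below is one possible way to express that idea, assuming great-circle (haversine) distance and made-up coordinates; the abstract does not describe how the authors computed their distance baseline.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_to_nearest_record_km(site, known_records):
    """Distance from a site to the closest known occurrence record."""
    if not known_records:
        raise ValueError("need at least one known record")
    return min(haversine_km(site, rec) for rec in known_records)


# Hypothetical coordinates (not real localities from the dataset).
known_records = [(35.60, -81.60), (35.55, -81.70)]
candidate_sites = [(35.58, -81.65), (36.60, -81.60)]

# Negate the distance so that higher values mean "more likely occupied",
# which lets the baseline be scored with the same metrics as an SDM.
baseline_scores = [-distance_to_nearest_record_km(s, known_records) for s in candidate_sites]
print(baseline_scores)
```

An SDM that cannot outperform this kind of baseline adds little beyond what the known localities already imply.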

Main Conclusions: Geographically peripheral populations were difficult to predict for all SDMs, though some methods were clearly inferior for our dataset. We recommend that when sample sizes are low, researchers use Maxent with species-specific model settings.