
Data from: Occurrence-habitat mismatching and niche truncation when modelling distributions affected by anthropogenic range contractions

Cite this dataset

Pang, Sean E. H. (2022). Data from: Occurrence-habitat mismatching and niche truncation when modelling distributions affected by anthropogenic range contractions [Dataset]. Dryad.


Aims: Human-induced pressures such as deforestation cause anthropogenic range contractions (ARCs). Such contractions present dynamic distributions that may engender data misrepresentations within species distribution models. The temporal bias of occurrence data—where occurrences represent distributions before (past bias) or after (recent bias) ARCs—underpins these data misrepresentations. Occurrence-habitat mismatching results when occurrences sampled before contractions are modelled with contemporary anthropogenic variables; niche truncation results when occurrences sampled after contractions are modelled without anthropogenic variables. Our understanding of their independent and interactive effects on model performance remains incomplete but is vital for developing good modelling protocols. Through a virtual ecologist approach, we demonstrate how these data misrepresentations manifest and investigate their effects on model performance.

Location: Virtual Southeast Asia

Methods: Using 100 virtual species, we simulated ARCs with 100-year land-use data and generated temporally biased (past, recent) occurrence datasets. We modelled datasets with and without a contemporary land-use variable (conventional modelling protocols) and with a temporally dynamic land-use variable. We evaluated each model’s ability to predict historical and contemporary distributions.
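As a rough illustration of how temporally biased datasets and occurrence-habitat mismatching arise, the toy sketch below samples occurrences from a one-dimensional landscape before and after a contraction. It is not the authors' simulation code: the landscape, the function names (`simulate_occurrences`, `mismatch_fraction`) and the rule that a cell is occupied when it is both environmentally suitable and forested are all assumptions made for illustration.

```python
import random

def simulate_occurrences(suitable, forest_past, forest_now, n, bias, seed=0):
    """Sample up to n occupied cells from the past or the recent distribution.

    Assumption: a cell is occupied when it is environmentally suitable
    AND forested at the time of sampling.
    """
    rng = random.Random(seed)
    forest = forest_past if bias == "past" else forest_now
    occupied = [i for i, (s, f) in enumerate(zip(suitable, forest)) if s and f]
    return rng.sample(occupied, min(n, len(occupied)))

def mismatch_fraction(occurrences, forest_now):
    """Share of occurrences located in cells that have no forest today."""
    if not occurrences:
        return 0.0
    return sum(1 for i in occurrences if not forest_now[i]) / len(occurrences)

# Hypothetical 10-cell landscape: 8 suitable cells, all forested in the past,
# only the first 4 still forested today (a 50% range contraction).
suitable = [True] * 8 + [False] * 2
forest_past = [True] * 10
forest_now = [True] * 4 + [False] * 6

past_occ = simulate_occurrences(suitable, forest_past, forest_now, 8, "past")
recent_occ = simulate_occurrences(suitable, forest_past, forest_now, 8, "recent")

print(mismatch_fraction(past_occ, forest_now))    # 0.5: half the past records sit in lost habitat
print(mismatch_fraction(recent_occ, forest_now))  # 0.0: recent records all match current habitat
print(len(recent_occ) / sum(suitable))            # 0.5: recent records cover half the suitable cells
```

In this toy case the past-biased dataset shows mismatching when paired with the contemporary forest layer, while the recent-biased dataset covers only half of the environmentally suitable cells, which is the truncation a model without a land-use variable would inherit.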

Results: Greater ARC resulted in greater occurrence-habitat mismatching for datasets with past bias and greater niche truncation for datasets with recent bias. Occurrence-habitat mismatching prevented models with the contemporary land-use variable from predicting anthropogenic-related absences, causing overpredictions of contemporary distributions. Although niche truncation caused underpredictions of historical distributions (environmentally suitable habitats), incorporating the contemporary land-use variable resolved these underpredictions, even when mismatching occurred. Models with the temporally dynamic land-use variable consistently outperformed models without.

Main conclusions: We showed how these data misrepresentations can degrade model performance, undermining the use of species distribution models for empirical research and conservation science. Given the ubiquity of anthropogenic range contractions, these data misrepresentations are likely inherent to most datasets. Therefore, we present a three-step strategy for handling data misrepresentations: maximise the temporal range of anthropogenic predictors, exclude mismatched occurrences, and test for residual data misrepresentations.
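The second step of the strategy, excluding mismatched occurrences, could be sketched as a simple filter. This is a minimal sketch under stated assumptions, not the procedure used in the study: the record layout (`cell`, `year`), the helper name `split_mismatched` and the rule that an occurrence is mismatched when its cell has no habitat in the contemporary layer are all hypothetical.

```python
def split_mismatched(records, habitat_now):
    """Separate occurrence records into matched and mismatched sets.

    Assumption: a record is 'mismatched' when the cell it falls in
    has no habitat in the contemporary land-use layer.
    """
    kept, dropped = [], []
    for rec in records:
        if habitat_now[rec["cell"]]:
            kept.append(rec)
        else:
            dropped.append(rec)
    return kept, dropped

# Hypothetical records: two sampled before the contraction, one after.
records = [
    {"cell": 0, "year": 1950},
    {"cell": 5, "year": 1950},
    {"cell": 2, "year": 2015},
]
# Hypothetical contemporary habitat layer: cells 4 and 5 have been cleared.
habitat_now = [True, True, True, True, False, False]

kept, dropped = split_mismatched(records, habitat_now)
print([r["cell"] for r in kept])     # [0, 2]
print([r["cell"] for r in dropped])  # [5]
```

Returning the dropped records rather than discarding them keeps them available for the third step, testing for residual data misrepresentations.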



Ministry of Education, Singapore, Award: AcRF Tier 1 Grant to ELW