Performance results from species distribution models considering historical occurrences and variables of varying persistency

Cite this dataset

Bracken, Jason (2021). Performance results from species distribution models considering historical occurrences and variables of varying persistency [Dataset]. Dryad.


Occurrence data used to build species distribution models often include historical records from locations where the species no longer exists. When these records are paired with contemporary environmental values that no longer represent the conditions the species experienced, the model creates false associations that hurt predictive performance. The extent of this mismatch increases with the number of historical occurrences and with the inclusion of environmental variables that are prone to change over time. Indeed, the mismatch between historical occurrence data and contemporary environmental variables is a common dilemma when modeling rare or cryptic species, especially those of conservation concern that were once more abundant. Herein, we assess (1) the impact of historical occurrences on model performance across three sets of environmental variables of increasing persistency, and (2) the performance of models built using selected-historical occurrences from locations that showed evidence of limited environmental change over time. We test these concepts on federally listed flatwoods salamanders, reflecting real-world conservation management efforts. We predicted that, compared to other occurrence sets, (1) historical occurrences would perform best with environmental variables that were more persistent, (2) recent occurrences would perform best when the environmental variables were more impersistent, and (3) our selected-historical occurrences would perform best with a combination of persistent and impersistent variables. When evaluated by correct predictions, our results showed the expected inversion in the performance of recent and historical occurrence models across environmental variables of increasing persistency. However, the inversion was not seen in AUC performance, in which models built with historical occurrences outperformed those built with recent occurrences across all variable sets.
Selected-historical occurrences did not notably improve performance over all-historical occurrences in any metric or variable set. To maximize utility and performance, modelers could acknowledge the potential tradeoffs of including historical occurrences and consider the number and age of the recent and historical occurrences available, the persistency of the environmental variables considered, and how their conservation goals are reflected in model design and evaluation, particularly with respect to sensitivity vs. specificity. Our study lends support to the inclusion of historical occurrences, with the potential exception of models built on mostly impersistent variables when sensitivity is the highest priority.
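
The abstract contrasts threshold-dependent evaluation (correct predictions, sensitivity, specificity) with AUC, which does not depend on a threshold. The following is a minimal Python sketch of how these metrics can diverge. It is not the authors' evaluation code: the function names, the toy labels and scores, and the thresholds are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: toy data, hypothetical function names.
# Labels: 1 = species present, 0 = species absent.
# Scores: model-predicted habitat suitability in [0, 1].

def sensitivity_specificity(y_true, scores, threshold):
    """Return (sensitivity, specificity) at a given suitability threshold."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        predicted_present = s >= threshold
        if y == 1 and predicted_present:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_present:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one presence and one absence")
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random presence outscores a random
    absence (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy evaluation set (assumed values, not from the dataset).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

strict = sensitivity_specificity(y_true, scores, threshold=0.5)
lenient = sensitivity_specificity(y_true, scores, threshold=0.35)
overall = auc(y_true, scores)
```

Lowering the threshold in this toy case raises sensitivity while the AUC stays the same, which is one way a ranking of models by correct predictions can differ from a ranking by AUC.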