Data from: Evaluating presence-only species distribution models with discrimination accuracy is uninformative for many applications
Warren, Dan L.; Matzke, Nicholas; Iglesias, Teresa (2020), Data from: Evaluating presence-only species distribution models with discrimination accuracy is uninformative for many applications, Dryad, Dataset, https://doi.org/10.5061/dryad.6ft55k9
Aim: Species distribution models are used across evolution, ecology, conservation, and epidemiology to make critical decisions and study biological phenomena, often in cases where experimental approaches are intractable. Choices regarding optimal models, methods, and data are typically made based on discrimination accuracy: a model’s ability to predict subsets of species occurrence data that were withheld during model construction. However, empirical applications of these models often involve making biological inferences based on continuous estimates of relative habitat suitability as a function of environmental predictor variables. We term the reliability of these biological inferences “functional accuracy.” We explore the link between discrimination accuracy and functional accuracy.
Methods: Using a simulation approach we investigate whether models that make good predictions of species distributions correctly infer the underlying relationship between environmental predictors and the suitability of habitat.
Results: We demonstrate that discrimination accuracy is only informative when models are simple and similar in structure to the true niche, or when data partitioning is geographically structured. However, the utility of discrimination accuracy for selecting models with high functional accuracy was low in all cases.
Main conclusions: These results suggest that many empirical studies and decisions are based on criteria that are unrelated to models’ usefulness for their intended purpose. We argue that empirical modeling studies need to place significantly more emphasis on biological insight into the plausibility of models, and that the current approach of maximizing discrimination accuracy at the expense of other considerations is detrimental to both the empirical and methodological literature in this active field. Finally, we argue that future development of the field must include an increased emphasis on simulation; methodological studies based on ability to predict withheld occurrence data may be largely uninformative about best practices for applications where interpretation of models relies on estimating ecological processes, and will unduly penalize more biologically informative modeling approaches.
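The distinction between the two kinds of accuracy described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or dataset. It assumes a single environmental variable, a hypothetical Gaussian "true niche", and two hand-written candidate suitability functions rather than fitted models. Discrimination accuracy is approximated as AUC on withheld presences versus random background points, and functional accuracy as the Spearman rank correlation between a candidate's suitability and the true suitability. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def auc(presence_scores, background_scores):
    """Discrimination accuracy via the Mann-Whitney rank-sum formula."""
    n_pos, n_neg = len(presence_scores), len(background_scores)
    r = ranks(list(presence_scores) + list(background_scores))
    rank_sum = sum(r[:n_pos])
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Functional accuracy proxy: rank correlation with the true suitability."""
    return pearson(ranks(a), ranks(b))


def true_suitability(x):
    """Hypothetical true niche: Gaussian response centred on x = 0.5."""
    return math.exp(-((x - 0.5) ** 2) / (2 * 0.15 ** 2))


# Hypothetical candidate models (hand-written here, not fitted to data).
CANDIDATES = {
    "unimodal": lambda x: math.exp(-((x - 0.5) ** 2) / (2 * 0.3 ** 2)),
    "linear": lambda x: x,
}


def simulate(seed=0, n_sites=2000, n_background=500):
    """Sample sites, occupy them in proportion to suitability, withhold 25%."""
    rng = random.Random(seed)
    env = [rng.random() for _ in range(n_sites)]
    presences = [x for x in env if rng.random() < true_suitability(x)]
    rng.shuffle(presences)
    withheld = presences[: len(presences) // 4]
    background = [rng.random() for _ in range(n_background)]
    return env, withheld, background


def evaluate(seed=0):
    """Return {model name: (discrimination accuracy, functional accuracy)}."""
    env, withheld, background = simulate(seed)
    truth = [true_suitability(x) for x in env]
    results = {}
    for name, model in CANDIDATES.items():
        disc = auc([model(x) for x in withheld], [model(x) for x in background])
        func = spearman([model(x) for x in env], truth)
        results[name] = (disc, func)
    return results


if __name__ == "__main__":
    for name, (disc, func) in evaluate().items():
        print(f"{name}: AUC={disc:.3f}, Spearman vs truth={func:.3f}")
```

In this toy setting the two scores are computed independently, so a candidate's AUC can be compared directly with how well it recovers the true response curve. A fuller study along the lines of the abstract would fit models to the training presences and vary model complexity and the partitioning scheme, which this sketch does not attempt.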