Dryad

Data from: Cross-validation matters in species distribution models: a case study with goatfish species

Data files

Sep 05, 2024 version files: 187.97 MB

Abstract

In an era of ongoing biodiversity loss, it is critical to map biodiversity patterns in space and time to better inform conservation and management. Species distribution models (SDMs) are widely applied in many types of biodiversity assessment. Cross-validation is a prevalent approach for assessing the discrimination capacity of a target SDM algorithm and determining its optimal parameters. Several alternative cross-validation methods exist; however, the influence of the chosen cross-validation method on SDM performance and predictions remains unresolved. Here, we tested the performance of random versus spatial cross-validation methods for SDMs, using goatfishes (Actinopteri: Syngnathiformes: Mullidae), which are recognized as indicator species for coastal waters, as a case study. The random and spatial cross-validation methods resulted in different optimal model parameterizations in 57 out of 60 modeled species. Predictive performance differed significantly between the two methods, and they yielded different projections of both the present-day spatial distributions and the future distribution patterns of goatfishes under climate change exposure. Despite these disparities in species distributions, both approaches consistently identified the Indo-Australian Archipelago as the hotspot of goatfish species richness and as the area most vulnerable to climate change. Our findings highlight that the choice of cross-validation method is an overlooked source of uncertainty in SDM studies and deserves special attention. Meanwhile, the consistency in richness predictions highlights the usefulness of SDMs in marine conservation.
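To illustrate the distinction the abstract draws, the following is a minimal Python sketch of how fold assignment differs between random and spatial cross-validation. It is not the authors' workflow: the abstract does not specify how spatial folds were built, so the grid-block scheme, the function names `random_folds` and `spatial_block_folds`, and the 5-degree block size are all assumptions made for illustration.

```python
import math
import random


def random_folds(n_points, k, seed=0):
    """Random CV: spread records evenly over k folds, ignoring location."""
    rng = random.Random(seed)
    folds = [i % k for i in range(n_points)]
    rng.shuffle(folds)
    return folds


def block_id(lon, lat, block_deg):
    """Grid cell (hypothetical blocking scheme) containing a coordinate."""
    return (math.floor(lon / block_deg), math.floor(lat / block_deg))


def spatial_block_folds(coords, k, block_deg=5.0, seed=0):
    """Spatial CV: all records in the same grid block share one fold."""
    rng = random.Random(seed)
    blocks = sorted({block_id(lon, lat, block_deg) for lon, lat in coords})
    rng.shuffle(blocks)
    # Round-robin assignment of shuffled blocks to folds.
    block_to_fold = {b: i % k for i, b in enumerate(blocks)}
    return [block_to_fold[block_id(lon, lat, block_deg)] for lon, lat in coords]


if __name__ == "__main__":
    # Made-up (longitude, latitude) records, for demonstration only.
    coords = [(120.1, -5.2), (120.9, -5.8), (121.5, -6.0),
              (150.0, -20.0), (151.0, -21.0), (-60.0, 15.0)]
    print("random :", random_folds(len(coords), k=3))
    print("spatial:", spatial_block_folds(coords, k=3))
```

Under random assignment, neighboring records can land in different folds, so a model may be evaluated on points very close to its training data. Under block assignment, nearby records stay together, which makes the held-out fold more spatially independent of the training folds.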