Data for: A new threshold selection method for species distribution models with presence-only data: extracting the mutation point of the P/E curve by threshold regression
Data files
Mar 21, 2024 version files 119.51 KB
- Panda_Presence_data.xls
- README.md
- Suitable_habitat.tif
Abstract
Selecting thresholds to convert the continuous predictions of species distribution models into binary outputs is critical for many real-world applications and model assessments. Prevalent threshold selection methods for presence-only data require either unverified pseudo-absence data or subjective decisions by researchers. This study proposes a new method, Boyce-Threshold Quantile Regression (BTQR), to determine thresholds objectively without pseudo-absence data. After reviewing relevant articles, we find that a mutation point is a typical shape feature of the predicted-to-expected (P/E) curve. Analysis based on source-sink theory suggests that this mutation point may represent a transition in habitat types and can serve as an appropriate threshold. Threshold regression is introduced to locate the mutation point accurately.
To validate the effectiveness of BTQR, we used four virtual species of varying prevalence and a real species with reliable distribution data. Six different species distribution models were employed to generate continuous suitability predictions. BTQR and nine other traditional methods transformed these continuous outputs into binary results. Comparative experiments show that BTQR has advantages in terms of accuracy, applicability, and consistency over the existing methods.
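As a rough illustration of the idea described above, the following Python sketch builds a P/E curve from suitability scores and searches for a break point in it. It is not the authors' BTQR implementation: a two-segment ordinary least-squares fit stands in for threshold quantile regression, and the function names (`pe_curve`, `find_break`), the equal-width binning, and the assumption that suitability scores lie in [0, 1] are all illustrative choices.

```python
def _bin_fractions(scores, n_bins):
    """Fraction of scores falling in each equal-width bin on [0, 1]."""
    counts = [0] * n_bins
    for s in scores:
        # A score of exactly 1.0 is placed in the last bin.
        counts[min(int(s * n_bins), n_bins - 1)] += 1
    return [c / len(scores) for c in counts]


def pe_curve(presence_scores, background_scores, n_bins=20):
    """Return (bin midpoints, P/E ratios); bins with no background are skipped."""
    p = _bin_fractions(presence_scores, n_bins)
    e = _bin_fractions(background_scores, n_bins)
    mids, ratios = [], []
    for i in range(n_bins):
        if e[i] > 0:  # the ratio is undefined when the expected fraction is zero
            mids.append((i + 0.5) / n_bins)
            ratios.append(p[i] / e[i])
    return mids, ratios


def _sse_linear(xs, ys):
    """Sum of squared residuals of a least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def find_break(xs, ys, min_pts=3):
    """Grid-search the split that minimises the total SSE of two separate lines.

    Simplified stand-in (assumption): ordinary least squares on each side,
    not the threshold quantile regression named in the abstract.
    """
    if len(xs) < 2 * min_pts:
        raise ValueError("not enough points to fit two segments")
    best_sse, best_break = None, None
    for k in range(min_pts, len(xs) - min_pts + 1):
        sse = _sse_linear(xs[:k], ys[:k]) + _sse_linear(xs[k:], ys[k:])
        if best_sse is None or sse < best_sse:
            # Report the break halfway between the two neighbouring points.
            best_sse, best_break = sse, (xs[k - 1] + xs[k]) / 2
    return best_break
```

Calling `find_break(*pe_curve(presence_scores, background_scores))` on model outputs would return one candidate threshold. The result depends on the number of bins and on `min_pts`, so it should be read as a sketch of the mechanics rather than a reproduction of the published results.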
README: Distribution data of giant pandas recorded in the fourth giant panda survey
The dataset includes 910 presence records and the suitable habitat map of giant pandas.
Description of the Data and file structure
The dataset is organized into two files:
*'Panda_Presence_data.xls': the presence records of giant pandas. The location of each presence record is given as latitude and longitude. The coordinates have a precision of 0.1 decimal degrees, which is coarse enough that releasing them should not negatively affect the species.
*'Suitable_habitat.tif': the suitable habitat map of giant pandas, where areas assigned a value of 1 are suitable habitat.
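To give a sense of what a precision of 0.1 decimal degrees means on the ground, the short Python sketch below converts one coordinate step into approximate distances. It assumes a spherical Earth with a mean radius of 6371 km, and the latitude of 31 degrees N is an illustrative value, not one taken from the dataset.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def cell_size_km(precision_deg, latitude_deg):
    """Approximate north-south and east-west extent (km) of one coordinate step."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    north_south = precision_deg * km_per_degree
    # Meridians converge towards the poles, so east-west distance shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# 31 degrees N is an illustrative latitude, not a value read from the data files.
ns, ew = cell_size_km(0.1, 31.0)
print(f"0.1 degree is about {ns:.1f} km north-south and {ew:.1f} km east-west")
```

Under these assumptions each recorded location is resolved only to roughly 11 km north-south, which is the sense in which the coordinates are coarse.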
Sharing/access Information
Data was derived from the following sources:
Jiang, C., Wang, H. & Gu, X. 2015. Giant Pandas in Sichuan: Report of the Fourth Giant Panda Survey in Sichuan. Beijing, Sichuan Science and Technology Press. Considering that the data used in this study were already publicly released in 2015, we believe that uploading them will not have any negative impact on the conservation of giant pandas.
Methods
The data in this dataset comes from a published book: Jiang, C., Wang, H. & Gu, X. 2015. Giant Pandas in Sichuan: Report of the Fourth Giant Panda Survey in Sichuan. Beijing, Sichuan Science and Technology Press. We digitised the accompanying maps provided in this book using the geographic alignment tool of ArcGIS to obtain 910 occurrence records and the distribution of suitable habitat for giant pandas. Considering that the data used in this study were already publicly released in 2015, we believe that uploading them will not have any negative impact on the conservation of giant pandas.