Data from: A global database of butterfly species native distributions
Data files
Aug 06, 2024 version files 7.85 GB
- ButterflyMaps.zip
- Data_S1.zip
- Model_test_statistics.csv
- README.md
Abstract
Butterflies are a diverse group of insects that play key ecosystem roles, including pollination and, in their larval form, herbivory. Despite their importance, comprehensive global distribution data for butterfly species are lacking, which has hindered large-scale research in ecology, evolutionary biology, and conservation at regional and global scales. Here, I use an integrative workflow that combines occurrence records, alpha hull polygons, species’ dispersal capacity, natural habitat, and environmental variables within a species distribution modeling framework to generate species-level native distributions for butterflies at a global scale for the contemporary period. The database releases native range maps for 10,372 extant butterfly species at a spatial grain of 5 arcmin (~10 km at the equator). This database has the potential to enable unprecedented large-scale analyses in ecology, biogeography, and conservation of butterflies. The maps use the WGS84 coordinate reference system (EPSG:4326) and are stored as vector polygons in GeoPackage format for maximum compression, allowing easy data manipulation on a standard computer. I additionally provide a spatial raster for each species. All maps and R scripts are open access and available for download at Dryad, and follow the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. By making these data available to the scientific community, I aim to advance the sharing of biological data and stimulate more comprehensive research in ecology, biogeography, and conservation of butterflies.
README: A global database of butterfly species native distributions
Data description within the folder "ButterflyMaps.zip":
All scripts and code necessary to repeat the analyses described here are available in the R package phyloregion.
This folder includes all the data for the Daru paper.
The folder is structured as follows:
\ButterflyMaps (contains all the input data files)
\GreenMaps\CODES (contains all the R scripts that were generated and used in this project)
The full folder structure is below:
+---ButterflyMaps.zip
|   +---Model_test_statistics.csv
|   +---polygons
|   +---rasters
|   +---raw_rasters
+---Data_S1.zip
|   +---Data S1.docx
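Since the release is several gigabytes in size, it can be useful to inspect an archive's layout without extracting it. The following Python sketch counts files per top-level folder using only the standard library. The function name `count_entries_by_folder` is illustrative, and the exact paths inside the archive are an assumption based on the structure listed above; there may, for instance, be an extra enclosing folder.

```python
import zipfile
from collections import Counter


def count_entries_by_folder(archive, depth=1):
    """Count files in a zip archive, grouped by their first `depth` path components.

    `archive` may be a path or a binary file-like object.
    """
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip explicit directory entries
            parts = info.filename.split("/")
            # Files nested shallower than `depth` folders are grouped under "."
            key = "/".join(parts[:depth]) if len(parts) > depth else "."
            counts[key] += 1
    return dict(counts)
```

Calling `count_entries_by_folder("ButterflyMaps.zip")` on the downloaded archive would return a dictionary mapping folder names to file counts; passing `depth=2` groups one level deeper if the archive turns out to have an enclosing folder.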
- ButterflyMaps.zip
  - Model_test_statistics.csv: a file with the values of three model test statistics quantifying the predictive accuracy of the models for each species: area under the curve (AUC), True Skill Statistic (TSS), and Boyce Index. Cells containing "NA" indicate missing data.
  - polygons: a folder with individual species polygons in GeoPackage format. Each species polygon has two fields: binomial, giving the species binomial name, and origin, giving the native status of the species (1 for native species). The current dataset considers only native distributions.
  - rasters: a folder with individual modeled spatial rasters in GeoTIFF format at a grain of 5 arcmin (~10 km at the equator).
  - raw_rasters: a folder with raster layers in GeoTIFF format at a grain of 5 arcmin (~10 km at the equator) containing the raw suitability scores, provided so that alternative approaches to thresholding the predictions can be explored.
- Data_S1.zip
  - Data S1.docx: a file in Word format with the ODMAP (Overview, Data, Model, Assessment and Prediction) protocol for reporting species distribution models.
  - phyloregion_1.0.9.tar.gz: the R package used to run the analyses.
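As a starting point for working with Model_test_statistics.csv, the sketch below parses the table with Python's standard library, converts "NA" cells to `None`, and keeps species whose models meet a minimum score. The column names (`species`, `AUC`, `TSS`, `Boyce`) and the 0.5 cut-off are assumptions for illustration; check the file's actual header before use.

```python
import csv


def read_model_statistics(lines, na_token="NA"):
    """Parse CSV lines into dicts; numeric cells become float, NA or empty cells become None."""
    rows = []
    for record in csv.DictReader(lines):
        parsed = {}
        for key, value in record.items():
            if key is None:
                continue  # ignore surplus cells beyond the header
            if value is None or value.strip() in ("", na_token):
                parsed[key] = None
                continue
            try:
                parsed[key] = float(value)
            except ValueError:
                parsed[key] = value  # non-numeric field, e.g. a species name
        rows.append(parsed)
    return rows


def filter_by_minimum(rows, column, minimum):
    """Keep rows whose value in `column` is numeric and at least `minimum`."""
    return [
        row for row in rows
        if isinstance(row.get(column), float) and row[column] >= minimum
    ]


# Invented rows for demonstration only; they are not values from the dataset.
demo = [
    "species,AUC,TSS,Boyce",
    "Genus alpha,0.90,0.70,0.80",
    "Genus beta,0.80,NA,0.60",
]
rows = read_model_statistics(demo)
print([row["species"] for row in filter_by_minimum(rows, "TSS", 0.5)])
# prints ['Genus alpha']
```

For the real file, the same functions can be applied to an open file handle, for example `with open("Model_test_statistics.csv", newline="") as fh: rows = read_model_statistics(fh)`.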
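The raw_rasters folder is provided so that users can apply their own thresholds to the suitability scores. The sketch below shows one way this could look in plain Python, operating on a grid of scores that has already been read into nested lists (in practice via a raster library of your choice). It is not the thresholding used to produce the maps in this dataset: the 10% omission rule is just one commonly used option, and the function names, the use of `None` for no-data cells, and the score scale are all assumptions.

```python
def omission_threshold(presence_scores, omission=0.10):
    """Return the score that excludes roughly the lowest `omission` share of presence scores."""
    ordered = sorted(presence_scores)
    if not ordered:
        raise ValueError("presence_scores must not be empty")
    # Index of the first score retained after dropping the lowest share.
    index = min(int(omission * len(ordered)), len(ordered) - 1)
    return ordered[index]


def binarize(grid, threshold):
    """Map each score to 1 (score >= threshold) or 0; None (no-data) cells stay None."""
    return [
        [None if cell is None else int(cell >= threshold) for cell in row]
        for row in grid
    ]
```

Here `presence_scores` would be the suitability values extracted at a species' occurrence cells, and `grid` the full suitability raster for that species. Swapping `omission_threshold` for a different rule changes only how `threshold` is chosen; `binarize` stays the same.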