Urbanization drives habitat suitability of the invasive Cuban Knight Anole (Anolis equestris) in Florida, USA
Data files
Oct 22, 2025 version files (52.31 KB total)
- independent_survey_data.csv (5.69 KB)
- README.md (6.05 KB)
- ThreatenedInvertDataObscured.csv (40.57 KB)
Abstract
This dataset supports the study Urbanization Drives Habitat Suitability of the Invasive Cuban Knight Anole (Anolis equestris) in Florida, which evaluated climatic and anthropogenic drivers of habitat suitability in both the species’ native range (Cuba) and invasive range (Florida). We developed replicated species distribution models (SDMs) using eight algorithms and ten independent pseudo-absence sets (1:1 PA:presence ratio), with regional stratification and 10-fold cross-validation (100 runs per algorithm). Predictors included climate, topography, vegetation indices, human population density, urban land cover, and a standardized iNaturalist observer-effort layer. An ensemble modeling framework was applied at both the algorithm-specific and global levels. Model performance was assessed with True Skill Statistic (TSS) and Boyce Index (BI) calculated separately for each region. The best-performing model, a Random Forest ensemble, was validated with an independent long-term monitoring dataset from South Florida. Results show that urbanization variables had a stronger influence in Florida, while climate and vegetation played larger roles in Cuba. Predicted high-suitability areas for A. equestris overlapped with occurrence points of three protected invertebrates: Florida tree snail (Liguus fasciatus), Schaus’ swallowtail (Papilio aristodemus ponceanus), and Miami tiger beetle (Cicindelidia floridana). This package contains cleaned occurrence records, environmental layers generated for the analysis, habitat suitability rasters for both regions, model performance summaries, invertebrate overlap results, and the R script used to reproduce the workflow.
This dataset supports the study published in Ecology and Evolution (Romer et al. 2025; https://doi.org/10.1002/ece3.72334), which evaluated climatic and anthropogenic drivers of habitat suitability for the invasive Cuban knight anole (Anolis equestris) across its native (Cuba) and invasive (Florida, USA) ranges.
Replicated species-distribution models (SDMs) were constructed using eight algorithms and ten independent pseudo-absence sets, with regional stratification and 10-fold cross-validation.
A Random Forest ensemble showed the highest performance and was validated with a 15-year independent monitoring dataset from South Florida.
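The replication design described above (ten pseudo-absence sets at a 1:1 ratio, each evaluated with 10-fold cross-validation) can be sketched as follows. This is an illustrative Python sketch, not the authors' R workflow; the function names, the size of the background pool, and the seeds are assumptions, and the regional stratification step is omitted for brevity.

```python
import random

def sample_pseudo_absences(background, n_presences, n_sets=10, seed=0):
    """Draw n_sets independent pseudo-absence sets, each as large as the presence set (1:1)."""
    rng = random.Random(seed)
    return [rng.sample(background, n_presences) for _ in range(n_sets)]

def k_fold_indices(n_records, k=10, seed=0):
    """Shuffle record indices and split them into k folds of near-equal size."""
    rng = random.Random(seed)
    idx = list(range(n_records))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Hypothetical inputs: 50 presence records and 500 candidate background cells.
presences = [("presence", i) for i in range(50)]
background = [("background", i) for i in range(500)]

pa_sets = sample_pseudo_absences(background, len(presences))

runs = []
for set_id, pa in enumerate(pa_sets):
    records = presences + pa
    for fold_id, test_idx in enumerate(k_fold_indices(len(records), seed=set_id)):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(records)) if i not in held_out]
        # A model would be fitted on train_idx and scored on test_idx here.
        runs.append((set_id, fold_id, train_idx, test_idx))

print(len(runs))  # 10 pseudo-absence sets x 10 folds = 100 runs
```

Each algorithm would be fitted once per entry in `runs`, which is where the figure of 100 runs per algorithm comes from.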
Descriptions
ECE-2025-02-00429.R
- Type: R script
- Purpose: Implements the full SDM workflow, including data import, preprocessing, model training (eight algorithms × 10 replicates), ensemble generation, evaluation, and figure export.
- Key packages: biomod2, terra, sf, randomForest, ggplot2, glmmTMB, emmeans.
- Output: Performance metrics, variable-importance plots, and suitability rasters.
independent_survey_data.csv
- Type: Presence/absence dataset (2008–2023).
- Source: Long-term monitoring effort for invasive ectotherms in South Florida.
- Columns:
- SiteID: Survey location code.
- Latitude, Longitude: decimal degrees (WGS 84).
- Presence: 1 = present, 0 = absent.
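As a quick check of the column layout listed above, the file can be read and validated with the Python standard library. This is a minimal sketch: the three sample rows are hypothetical stand-ins, and the header spelling is assumed to match the column names given here exactly.

```python
import csv
import io

# Hypothetical rows that mimic the documented column layout. To read the real
# file, replace the StringIO object with
# open("independent_survey_data.csv", newline="").
sample = io.StringIO(
    "SiteID,Latitude,Longitude,Presence\n"
    "S01,25.70,-80.30,1\n"
    "S02,25.90,-80.20,0\n"
    "S03,26.10,-80.15,1\n"
)

rows = []
for rec in csv.DictReader(sample):
    lat, lon = float(rec["Latitude"]), float(rec["Longitude"])
    presence = int(rec["Presence"])
    # Sanity checks against the documented value ranges.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range for site {rec['SiteID']}")
    if presence not in (0, 1):
        raise ValueError(f"Presence must be 0 or 1 for site {rec['SiteID']}")
    rows.append((rec["SiteID"], lat, lon, presence))

# Share of surveyed sites where the species was recorded.
prevalence = sum(r[3] for r in rows) / len(rows)
print(f"{len(rows)} sites, prevalence = {prevalence:.2f}")
```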
ThreatenedInvertDataObscured.csv
- Type: Occurrence data for three protected invertebrates (Liguus fasciatus, Papilio aristodemus ponceanus, Cicindelidia floridana).
- Coordinates: Randomly displaced ≤ 10 km to protect sensitive sites.
- Columns:
- Species: Scientific name.
- Latitude, Longitude: decimal degrees (WGS 84).
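- Source: Originating agency (FWC, Florida Fish and Wildlife Conservation Commission, or FDEP, Florida Department of Environmental Protection).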
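The README does not state how the coordinates were displaced. The sketch below shows one common way to obscure a point by a random bearing and a random distance of at most 10 km, using the spherical destination-point formula. The function names, the Earth-radius constant, and the example coordinates are all assumptions for illustration, not the authors' procedure.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)

def displace(lat, lon, max_km=10.0, rng=random):
    """Move a point along a random bearing by a random distance <= max_km."""
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # The square root spreads points evenly over the disk's area
    # instead of clustering them near the original location.
    dist_km = max_km * math.sqrt(rng.random())
    delta = dist_km / EARTH_RADIUS_KM  # angular distance in radians
    lat1, lon1 = math.radians(lat), math.radians(lon)
    lat2 = math.asin(math.sin(lat1) * math.cos(delta)
                     + math.cos(lat1) * math.sin(delta) * math.cos(bearing))
    lon2 = lon1 + math.atan2(math.sin(bearing) * math.sin(delta) * math.cos(lat1),
                             math.cos(delta) - math.sin(lat1) * math.sin(lat2))
    return math.degrees(lat2), math.degrees(lon2)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical original location in South Florida.
origin = (25.5, -80.4)
rng = random.Random(42)
moved = [displace(*origin, rng=rng) for _ in range(1000)]
max_shift = max(haversine_km(*origin, *pt) for pt in moved)
print(max_shift <= 10.0 + 1e-6)
```

Because the published coordinates can be off by up to 10 km, suitability values extracted at these points from a roughly 1 km raster should be read as approximate.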
Algo_Selection.png
- Content: Comparison of algorithm-level and ensemble-level performance using True Skill Statistic (TSS) and Boyce Index (BI).
Variable_Importance.png
- Content: Regional variable-importance scores for the final Random Forest ensemble.
Cuba_Map.jpg and Florida_Map.jpg
- Content: Habitat-suitability rasters (0–1 scale) for each region.
- Resolution: 30 arc-seconds (~1 km).
invert_cosuitability_annotated_plot.png
- Content: Predicted suitability of A. equestris at locations of the three threatened invertebrates.
SFig1.jpg – SFig4.jpg
- Content: Individual exports of supplementary figures (e.g., correlation matrix, diagnostics).
Supplementary_Figures_ECE-2025-02-00429.pdf
- Content: Combined supplementary figures.
Supplementary_Tables_ECE-2025-02-00429.xlsx
- Sheets:
- Model_Performance: TSS and BI for all algorithms × regions.
- Variable_Importance: Scaled importance scores per predictor.
- Predictor_Correlations: Pairwise Pearson r values (|r| > 0.7 flagged).
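The collinearity flag used in the Predictor_Correlations sheet can be reproduced in principle by computing Pearson's r for every pair of predictors and marking pairs with |r| > 0.7. The sketch below is a minimal pure-Python illustration; the predictor values are hypothetical, and the helper name `pearson_r` is not taken from the authors' script.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical predictor values sampled at five locations.
predictors = {
    "BIO1": [24.1, 24.5, 25.0, 25.3, 25.9],
    "Elevation": [30.0, 22.0, 15.0, 9.0, 2.0],
    "NDVI": [0.61, 0.35, 0.72, 0.40, 0.55],
}

THRESHOLD = 0.7  # pairs above this absolute correlation are flagged

flagged = []
for a, b in combinations(predictors, 2):
    r = pearson_r(predictors[a], predictors[b])
    if abs(r) > THRESHOLD:
        flagged.append((a, b, round(r, 3)))

print(flagged)
```

With these made-up values only the BIO1 and Elevation pair exceeds the threshold, so in practice one of the two would be a candidate for removal before model fitting.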
Key Variables
- Latitude / Longitude: decimal degrees (WGS 84)
- Presence: 1 = present, 0 = absent
- TSS, BI: unitless model-performance metrics (range −1 to 1; values near 1 indicate better performance)
- Suitability: predicted habitat suitability (0–1 continuous scale)
- BIO1 – BIO19: WorldClim bioclimatic predictors (°C or mm; some, such as isothermality and precipitation seasonality, are unitless indices)
- Elevation: meters above sea level (m)
- NDVI, EVI: vegetation indices (unitless, −1 to 1)
- Settlement Model Grid, Built-up Surface, Population Density: urbanization metrics (GHSL; fraction or people per km²)
- iNaturalist Effort: log-transformed observation density, ln(records per km² + 1)
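Two of the variables above follow simple formulas that a short example makes concrete. TSS is conventionally defined as sensitivity + specificity − 1, computed from binarized predictions, and the effort layer is described as ln(records per km² + 1). The sketch below uses hypothetical observations and predictions; the function names are illustrative and do not come from the authors' script.

```python
import math

def true_skill_statistic(observed, predicted):
    """TSS = sensitivity + specificity - 1, for 0/1 observations and predictions."""
    tp = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    tn = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)
    fp = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 1)
    fn = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # share of presences predicted present
    specificity = tn / (tn + fp)  # share of absences predicted absent
    return sensitivity + specificity - 1

def effort_index(records_per_km2):
    """ln(records per km^2 + 1); log1p keeps precision for small densities."""
    return math.log1p(records_per_km2)

# Hypothetical survey outcomes and thresholded model predictions.
observed = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]

tss = true_skill_statistic(observed, predicted)
print(tss)                # sensitivity 0.75 + specificity 0.75 - 1 = 0.5
print(effort_index(0.0))  # cells with no records map to 0.0
```

Adding 1 before taking the logarithm keeps cells with zero observations defined and maps them to an effort value of 0.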
Key Information Sources
Environmental and ancillary data were obtained from publicly available repositories:
- WorldClim v2.1 Bioclimatic Variables: https://www.worldclim.org/data/worldclim21.html
- Global Multi-resolution Terrain Elevation Data (GMTED2010): https://topotools.cr.usgs.gov/gmted/
- Global Human Settlement Layers (GHSL): https://ghsl.jrc.ec.europa.eu/
- MODIS EVI and NDVI Products: https://modis.gsfc.nasa.gov/data/dataprod/
- Florida Administrative Boundaries (FL DOT): https://hub.arcgis.com/datasets/519e0a0ed5984bedba53695e1f56c1ee
- Cuba Administrative Boundaries (OCHA): https://data.humdata.org/dataset/cod-ab-cub
- GBIF Occurrence Records for Anolis equestris: https://www.gbif.org/species/2458613
- iNaturalist Observation Data (Phylum Chordata): https://www.inaturalist.org/
All listed sources are open-access and compatible with CC0. Derived layers (e.g., scaled rasters, effort surfaces) were authored by the dataset creators and are likewise released under CC0.
Code / Software
R (version 4.3.1 or later) is required to run ECE-2025-02-00429.R.
The script is fully annotated and self-contained, and covers the following stages:
- Library loading
- Data import and cleaning
- Model fitting and ensemble construction
- Validation and evaluation
- Figure and table export
Hardware and OS used: Windows 11, Intel i7 CPU, 32 GB RAM.
Licensing and Attribution
- Data: Released under CC0 Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/).
- Creative works and figures (Zenodo supplements): Licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/).
- All data were generated by the authors or derived from sources permitting unrestricted reuse; no copyrighted or restricted materials are included.
