Data and code from: Extrapolation strategy matters when transferring ecological niche models to non-analog environments: New visualization tools for informed decisions
Data files
May 26, 2026 version files (68.53 MB)
- data_and_code.zip (68.52 MB)
- README.md (6.15 KB)
Abstract
This dataset and associated code repository provide a reproducible workflow and novel visualization tools for assessing and managing extrapolation in ecological niche models (ENMs) transferred to non-analog environments. The repository contains custom R functions, Quarto analysis scripts, and tabular evaluation metrics (.csv) used to model the distribution of the black-eared mouse (Peromyscus melanotis) in Mexico, projecting from present conditions to the Last Glacial Maximum (~21 kya).
Dataset DOI: 10.5061/dryad.ksn02v7m0
Description of the data and file structure
Dataset Overview
Title: Data and code from: Extrapolation strategy matters when transferring ecological niche models to non-analog environments: new visualization tools for informed decisions
Authors: Gonzalo E. Pinilla-Buitrago, Jamie M. Kass, Robert P. Anderson
Corresponding Author: Gonzalo E. Pinilla-Buitrago, gepinillab@gmail.com
Date of Data Collection/Creation: 2026
Geographic Location: Mexico
Data and Code Availability Note
Important regarding occurrence data: The raw occurrence records for the black-eared mouse (Peromyscus melanotis) used in this case study (Pm_occs.csv) are not included in this repository due to copyright restrictions held by the Journal of Mammalogy.
- To obtain this occurrence data, please refer to: García-Mendoza, D. F., López-González, C., Hortelano-Moncada, Y., López-Wilchis, R., & Ortega, J. (2018). Geographic cranial variation in Peromyscus melanotis (Rodentia: Cricetidae) is related to primary productivity. Journal of Mammalogy, 99(4), 898–905. https://doi.org/10.1093/jmammal/gyy062
- For reproducibility: To run the scripts provided in this repository, the occurrence data must be formatted as a CSV file named Pm_occs.csv and placed in the data/occs/ directory. The scripts strictly require the columns longitude and latitude (in decimal degrees, WGS84). A formatted file can also be obtained by emailing gepinillab@gmail.com.
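Before running the scripts, the format requirements above can be checked with a few lines of base R. This is a minimal sketch: check_occs() is a hypothetical helper that is not part of the repository, and the example coordinates are invented for illustration.

```r
# Hypothetical helper (not part of the repository): checks that an occurrence
# table has the columns and value ranges described above.
check_occs <- function(occs) {
  required <- c("longitude", "latitude")
  missing_cols <- setdiff(required, names(occs))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  if (!is.numeric(occs$longitude) || !is.numeric(occs$latitude)) {
    stop("longitude and latitude must be numeric (decimal degrees).")
  }
  in_range <- !is.na(occs$longitude) & !is.na(occs$latitude) &
    abs(occs$longitude) <= 180 & abs(occs$latitude) <= 90
  if (!all(in_range)) {
    stop(sum(!in_range), " record(s) have missing or out-of-range coordinates.")
  }
  invisible(TRUE)
}

# Made-up coordinates for illustration only; these are not real records.
example_occs <- data.frame(longitude = c(-99.1, -98.6), latitude = c(19.3, 19.1))
check_occs(example_occs)

# With the real file in place, the same check would be:
# occs <- read.csv("data/occs/Pm_occs.csv")
# check_occs(occs)
```

The check only covers column names, types, and coordinate ranges; it cannot confirm that the coordinates are actually referenced to WGS84.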
Files and variables
The repository is structured into three main directories inside data_and_code.zip: code/, R/ (functions), and output/. Environmental data (data/raster/bios/) is generated directly via the provided Quarto scripts.
1. code/ (Analysis Scripts)
- 01_download_bios.qmd: Quarto script to download and format CHELSA bioclimatic variables for the Present (1981-2010, v2.1) and Past (Last Glacial Maximum, PMIP3, MIROC-ESM and IPSL-CM5A-LR). It downloads the following variables:
  - bio05: Max Temperature of Warmest Month
  - bio06: Min Temperature of Coldest Month
  - bio13: Precipitation of Wettest Month
  - bio14: Precipitation of Driest Month
- 02_model-training-and-transfer.qmd: Quarto script outlining the main analytical workflow. It partitions data (block method), trains Maxent models using ENMeval, generates response curves and density plots, and projects models into geographic space under different extrapolation scenarios (clamping, unconstrained, and custom).
- 03_extrapolation.R: R script to calculate environmental similarity metrics (MOP and MESS) between calibration and transfer environments using the smop R package.
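The difference between the clamping and unconstrained scenarios named above can be illustrated with a toy example in base R. This is only a sketch under stated assumptions: toy_response() uses made-up coefficients and is not a fitted Maxent model, train_range is a hypothetical training range, and the custom scenario used in the scripts is not reproduced here.

```r
# Toy response curve with made-up coefficients; this is not a fitted Maxent model.
toy_response <- function(x) plogis(-4 + 0.25 * x)

# Range of the predictor in the (hypothetical) training data.
train_range <- c(10, 25)

# Clamping: values outside the training range are reset to the nearest limit
# before prediction, so the response stays flat beyond that limit.
clamp <- function(x, limits) pmin(pmax(x, limits[1]), limits[2])

# Transfer values below, inside, and above the training range.
transfer_values <- c(5, 10, 18, 25, 30)

comparison <- data.frame(
  value         = transfer_values,
  unconstrained = toy_response(transfer_values),
  clamped       = toy_response(clamp(transfer_values, train_range))
)
print(round(comparison, 3))
```

Inside the training range the two columns agree. Outside it, the clamped prediction is held at the value reached at the range limit, while the unconstrained prediction keeps following the fitted trend.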
2. output/ (Model Outputs and Tables)
- models/Pm_eval-table.csv: Evaluation metrics for the Maxent models generated by ENMeval. Contains 19 variables including:
  - fc: Feature class
  - rm: Regularization multiplier
  - auc.train, cbi.train: Training metrics
  - auc.val.avg, or.10p.avg, AICc, delta.AICc: Validation and complexity metrics
- models/Pm_models.rds (optional; present only if it was uploaded): R object containing the final tuned Maxent models.
- extrapolation/: Contains generated .tif rasters for Mobility-Oriented Parity (MOP) and Multivariate Environmental Similarity Surfaces (MESS) for the present and past scenarios (mop_present.tif, mess_miroc.tif, etc.).
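To show what a MESS value expresses, the sketch below computes it for a table of points in base R, using the commonly used per-variable similarity definition and taking the minimum across variables. It is an illustration only: it is not the smop implementation used in 03_extrapolation.R, it works on data frames instead of rasters, MOP is not covered, and the reference and transfer values are invented.

```r
# Similarity of transfer values p to the reference values ref for one variable.
# f is the percentage of reference values strictly smaller than each p.
mess_one_variable <- function(p, ref) {
  ref_min <- min(ref)
  ref_max <- max(ref)
  f <- vapply(p, function(v) 100 * mean(ref < v), numeric(1))
  ifelse(
    f == 0, 100 * (p - ref_min) / (ref_max - ref_min),
    ifelse(
      f <= 50, 2 * f,
      ifelse(f < 100, 2 * (100 - f), 100 * (ref_max - p) / (ref_max - ref_min))
    )
  )
}

# MESS is the minimum per-variable similarity at each transfer point.
mess <- function(transfer, reference) {
  vars <- names(reference)
  sims <- lapply(vars, function(v) mess_one_variable(transfer[[v]], reference[[v]]))
  do.call(pmin, sims)
}

# Invented values for illustration only (not CHELSA data).
reference <- data.frame(bio05 = c(20, 22, 24, 26, 28),
                        bio13 = c(100, 150, 200, 250, 300))
transfer  <- data.frame(bio05 = c(24, 30),
                        bio13 = c(200, 200))

mess_values <- mess(transfer, reference)
print(mess_values)
```

The first transfer point lies inside the reference range for both variables and gets a positive value. The second has a bio05 value above the reference maximum, so its MESS is negative, which is how non-analog conditions are flagged.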
Code/software
R/ (Custom Visualization Functions) - Shared in Zenodo
- plot_curve.R: Contains evalplot.curve(), a custom function to plot response curves for Maxent models, visualizing clamped vs. unconstrained tails.
- plot_density.R: Contains evalplot.density(), a custom function to plot the density of environmental variables across training and transfer spaces, highlighting non-analog conditions.
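The idea behind the density comparison can be sketched in base R without the custom functions. The example below does not use or reproduce evalplot.density(); it relies on simulated values and a simple range-based rule for non-analog conditions, both of which are assumptions made for illustration.

```r
set.seed(1)
# Simulated values for illustration only (not CHELSA data).
train_vals    <- rnorm(500, mean = 22, sd = 2)
transfer_vals <- rnorm(500, mean = 18, sd = 2)

# Transfer values outside the training range are treated as non-analog here.
train_range     <- range(train_vals)
non_analog      <- transfer_vals < train_range[1] | transfer_vals > train_range[2]
prop_non_analog <- mean(non_analog)

# Kernel density estimates on a shared grid so the two curves are comparable.
grid_limits <- range(c(train_vals, transfer_vals))
d_train     <- density(train_vals,    from = grid_limits[1], to = grid_limits[2])
d_transfer  <- density(transfer_vals, from = grid_limits[1], to = grid_limits[2])

# Solid line: training; dashed line: transfer; grey lines: training range limits.
plot(d_train, main = "Training vs. transfer", xlab = "Predictor value",
     ylim = range(c(d_train$y, d_transfer$y)))
lines(d_transfer, lty = 2)
abline(v = train_range, col = "grey50")
```

The part of the dashed curve that falls outside the grey lines corresponds to conditions the model never saw during training, and prop_non_analog gives the share of transfer values in that zone.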
Software and Hardware Requirements
- R: Version 4.3.0 or higher.
- Java: Required to run maxent.jar (v3.4.4) via the dismo package.
- Key R Packages:
- Spatial processing: terra (v1.8-70), sf (v1.0-21).
- Modeling & Extrapolation: dismo (v1.3-14), ENMeval (v2.0.5.2), smop (v0.0.2).
- Data manipulation and visualization: dplyr, tidyr, readr, ggplot2, gridExtra, tidyterra.
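A quick way to compare a local setup against the versions listed above is sketched below. check_packages() is a hypothetical helper, not part of the repository. The README does not state whether other package versions also work, so the helper only reports differences and does not enforce them.

```r
# Versions as listed in this README.
required_versions <- c(terra = "1.8-70", sf = "1.0-21", dismo = "1.3-14",
                       ENMeval = "2.0.5.2", smop = "0.0.2")

# Hypothetical helper: reports installed versions next to the listed ones.
check_packages <- function(required) {
  installed <- vapply(names(required), function(pkg) {
    if (nzchar(system.file(package = pkg))) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_  # package not installed
    }
  }, character(1))
  matches <- vapply(seq_along(required), function(k) {
    !is.na(installed[k]) &&
      package_version(installed[k]) == package_version(required[k])
  }, logical(1))
  data.frame(package = names(required), listed = unname(required),
             installed = unname(installed), matches = matches,
             row.names = NULL)
}

r_ok       <- getRversion() >= "4.3.0"      # R 4.3.0 or higher
java_found <- nzchar(Sys.which("java"))     # Java is needed for maxent.jar
status     <- check_packages(required_versions)

print(c(r_ok = r_ok, java_found = java_found))
print(status)
```

Packages that are not installed appear with NA in the installed column. The Java check only confirms that a java executable is on the PATH, not that maxent.jar is available to dismo.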
Access information
Other publicly accessible locations of the data:
- García-Mendoza, D. F., López-González, C., Hortelano-Moncada, Y., López-Wilchis, R., & Ortega, J. (2018). Geographic cranial variation in Peromyscus melanotis (Rodentia: Cricetidae) is related to primary productivity. Journal of Mammalogy, 99(4), 898–905. https://doi.org/10.1093/jmammal/gyy062
Data was derived from the following sources:
- Karger, D. N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R. W., Zimmermann, N. E., Linder, H. P., & Kessler, M. (2017). Climatologies at high resolution for the earth’s land surface areas. Scientific Data, 4(1), 170122.
- Karger, D. N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R. W., Zimmermann, N. E., Linder, H. P., & Kessler, M. (2021). Dataset: Climatologies at high resolution for the earth’s land surface areas [Dataset]. EnviDat. https://doi.org/10.16904/envidat.228.v2.1
Considerations: Please note that the raw occurrence coordinates for P. melanotis are omitted from this repository due to copyright restrictions held by the Journal of Mammalogy; instructions for obtaining these occurrences are detailed in this README to ensure full reproducibility. Furthermore, because the custom visualization scripts interact with the ENMeval package, the R code is distributed under a GNU General Public License (GPL-3), while all data tables and model outputs are dedicated to the public domain (CC0 1.0).
