Guidelines

This file contains the data from Muylaert et al. 2022, Present and future distribution of bat hosts of sarbecoviruses: implications for conservation and public health, Proceedings of the Royal Society B, along with guidelines for using the workflow.

Please read the following guidelines:

Folder structure

Please find the content description for each folder below:

Workflow table

The workflow table follows Zurell et al. 2020. A standard protocol for reporting species distribution models. Ecography 43, 1261–1277.

Section Subsection Element Value
Overview Authorship Study title Present and future distribution of bat hosts of sarbecoviruses: implications for conservation and public health.
Overview Authorship Author names Renata L. Muylaert; Tigga Kingston; Jinhong Luo; Maurício Humberto Vancine; Nikolas Galli; Colin J. Carlson; Reju Sam John; Maria Cristina Rulli; David T. S. Hayman.
Overview Authorship Contact
Overview Authorship Study link https://doi.org/10.1098/rspb.2022.0397
Overview Model objective Model objective Forecasting and transfer.
Overview Model objective Target output Continuous occurrence probabilities and binary maps of potential presence.
Overview Focal Taxon Focal Taxon Bat hosts of sarbecoviruses.
Overview Location Location World.
Overview Scale of Analysis Spatial extent -30, 160, -30, 70 (xmin, xmax, ymin, ymax)
Overview Scale of Analysis Spatial resolution 0.25 decimal degrees (dd).
Overview Scale of Analysis Temporal extent Near-current and Future (2021-2100).
Overview Scale of Analysis Temporal resolution Near-current, 2021-2040, 2041-2060, 2061-2080, 2081-2100.
Overview Scale of Analysis Boundary Terrestrial areas of the world.
Overview Biodiversity data Observation type Human observation of occurrences.
Overview Biodiversity data Response data type Presence.
Overview Predictors Predictor types Bioclimatic; karst; forest cover.
Overview Hypotheses Hypotheses Implications for conservation and public health, evaluated through species distribution change in response to climate, karst, and forest cover.
Overview Assumptions Model assumptions Bats occur within the bioregions where they were detected and around their highest density of occurrence points (MSDMs). Bat distribution is driven by bioclimatic covariates, karst and native forest cover. Accessibility bias partially drives observed occurrences. Sampling bias is minimized by filtering, spatial thinning and a minimum number of occurrences for inclusion (N = 40).
Overview Algorithms Modelling techniques Maxent through the ENMTML R package.
Overview Algorithms Model complexity The following six covariates were used: bio 1, bio 4, bio 12, bio 15, karstm, and primf (tif files).
Overview Algorithms Model averaging True skill statistic-weighted (TSS-weighted) averaging.
Overview Workflow Model workflow ENMTML workflow.
Overview Software Software R 4.
Overview Software Code availability https://github.com/renatamuy/dynamic
Overview Software Data availability Dryad.
Data Biodiversity data Taxon names
Aselliscus stoliczkanus

Hipposideros armiger

Hipposideros galeritus

Hipposideros larvatus

Hipposideros pomona (gentilis)

Hipposideros pratti

Hipposideros ruber

Miniopterus schreibersii

Chaerephon plicatus

Tadarida teniotis

Rhinolophus acuminatus

Rhinolophus affinis

Rhinolophus blasii

Rhinolophus blythi

Rhinolophus cornutus

Rhinolophus creaghi

Rhinolophus euryale

Rhinolophus ferrumequinum

Rhinolophus hipposideros

Rhinolophus luctus

Rhinolophus macrotis

Rhinolophus malayanus

Rhinolophus marshalli

Rhinolophus mehelyi

Rhinolophus monoceros

Rhinolophus pearsonii

Rhinolophus rex

Rhinolophus shameli

Rhinolophus siamensis

Rhinolophus sinicus

Rhinolophus stheno

Rhinolophus thomasi

Nyctalus leisleri

Plecotus auritus
Data Biodiversity data Taxonomic reference system Wilson D, Mittermeier R, editors. Handbook of the Mammals of the World. Barcelona: Springer; 2019.
Data Biodiversity data Ecological level assemblage-level, species-level.
Data Biodiversity data Data sources DarkCideS v1, Global Biodiversity Information Facility (GBIF), Berkeley Ecoinformatics Engine (Ecoengine), VertNet, Integrated Digitized Biocollections (iDigBio), iNaturalist, OBIS, and data compiled for previous publications (Rulli et al. 2020; Luo et al. 2013).
Data Biodiversity data Sampling design ENMTML workflow.
Data Biodiversity data Clipping Terrestrial areas of the world.
Data Biodiversity data Scaling None.
Data Biodiversity data Cleaning Temporal range 1970-2020. Records were cleaned with the CoordinateCleaner R package; only species with at least 40 occurrence points were included.
Data Biodiversity data Absence data None.
Data Biodiversity data Background data pres_abs_ratio = 1
Data Biodiversity data Errors and biases Sampling rate estimates through the sampbias R package.
Data Data partitioning Training data 75:25 training:test.
Data Data partitioning Validation data 75:25 training:test.
Data Data partitioning Test data Ratio of 75:25 training:test cross-validation splits with 10 repeats.
Data Predictor variables Predictor variables Bioclimatic variables, Karst composite layer, Primary forest cover.
Data Predictor variables Data sources Table S3.
Data Predictor variables Spatial extent -30, 160, -30, 70 (xmin, xmax, ymin, ymax)
Data Predictor variables Spatial resolution 0.25 dd.
Data Predictor variables Coordinate reference system WGS84.
Data Predictor variables Temporal extent Bioclimatic variables cover 1970-2000 for near-current conditions. Future projection periods: 2021-2040, 2041-2060, 2061-2080, 2081-2100.
Data Predictor variables Temporal resolution Future projection periods: 2021-2040, 2041-2060, 2061-2080, 2081-2100.
Data Predictor variables Data processing Covariates resampled to 0.25 dd.
Data Predictor variables Errors and biases Assessed via sampbias R package.
Data Predictor variables Dimension reduction None.
Data Transfer data Data sources
Data Transfer data Spatial extent World.
Data Transfer data Spatial resolution 0.25 dd
Data Transfer data Temporal extent 1970-present
Data Transfer data Temporal resolution Yearly
Data Transfer data Models and scenarios Future bioclimatic data downloaded from WorldClim (CMIP6).
Data Transfer data Data processing Future-occurrence predictions were made for each species and then ensembled per period per GCM and SSP.
Data Transfer data Quantification of Novelty NA
Model Variable pre-selection Variable pre-selection Relevance for our conceptual model of important native habitats for the selected species.
Model Multicollinearity Multicollinearity All bioclimatic covariates, karst layer and forest layer were pre-selected and then filtered after correlation analysis (0.7 cutoff value).
Model Model settings Model settings (fitting) ‘MXS’ and ‘MXD’ algorithms.
Model Model settings Model settings (extrapolation) Extrapolations over near-current accessible areas assuming MSDM ‘OBR’ for the present.
Model Model estimates Coefficients NA
Model Model estimates Parameter uncertainty NA
Model Model estimates Variable importance Correlative.
Model Model selection - model averaging - ensembles Model selection NA
Model Model selection - model averaging - ensembles Model averaging NA
Model Model selection - model averaging - ensembles Model ensembles Weighted averaging of the algorithms through TSS.
Model Analysis and Correction of non-independence Spatial autocorrelation NA
Model Analysis and Correction of non-independence Temporal autocorrelation NA
Model Analysis and Correction of non-independence Nested data NA
Model Threshold selection Threshold selection We used the sensitivity‐specificity sum maximisation (max TSS) approach to select the optimal suitability threshold.
Assessment Performance statistics Performance on training data NA
Assessment Performance statistics Performance on validation data NA
Assessment Performance statistics Performance on test data True skill statistic (TSS).
Assessment Plausibility check Response shapes NA
Assessment Plausibility check Expert judgement IUCN range polygons and the Handbook of the Mammals of the World.
Prediction Prediction output Prediction unit Continuous suitability and estimated richness for hotspot inference (sum of final binary maps).
Prediction Prediction output Post-processing Area calculation through the raster R package.
Prediction Uncertainty quantification Algorithmic uncertainty Ensemble over two algorithms and 10 repeats.
Prediction Uncertainty quantification Input data uncertainty Sampling-bias-adjusted map in Figure 2. Two SSPs and two GCMs for future scenarios.
Prediction Uncertainty quantification Parameter uncertainty Table S2 for parameters used in sampbias.
Prediction Uncertainty quantification Scenario uncertainty SSP2-4.5 and SSP5-8.5 scenario evaluation.
Prediction Uncertainty quantification Novel environments NA
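
Illustrative sketch of the TSS-based steps

The table refers to the true skill statistic (TSS) in three places: as the test statistic, as the weight for averaging the two algorithms, and as the criterion for the binary threshold (max TSS). The sketch below shows how these quantities relate, in plain Python. It is not the ENMTML implementation and is not part of the published workflow; the function names and the toy suitability scores are hypothetical and chosen only for illustration. It assumes TSS = sensitivity + specificity - 1 and that a cell counts as predicted present when its score is at or above the threshold.

```python
def tss(presence_scores, background_scores, threshold):
    """True skill statistic (sensitivity + specificity - 1) at one threshold."""
    # Sensitivity: share of presence records scored at or above the threshold.
    sensitivity = sum(s >= threshold for s in presence_scores) / len(presence_scores)
    # Specificity: share of background records scored below the threshold.
    specificity = sum(s < threshold for s in background_scores) / len(background_scores)
    return sensitivity + specificity - 1.0


def max_tss_threshold(presence_scores, background_scores):
    """Return (threshold, TSS) for the observed score that maximises TSS."""
    candidates = sorted(set(presence_scores) | set(background_scores))
    best = max(candidates, key=lambda t: tss(presence_scores, background_scores, t))
    return best, tss(presence_scores, background_scores, best)


def tss_weighted_mean(suitabilities, tss_values):
    """TSS-weighted mean of per-algorithm suitabilities for a single cell."""
    total = sum(tss_values)
    return sum(s * w for s, w in zip(suitabilities, tss_values)) / total


if __name__ == "__main__":
    # Hypothetical suitability scores at presence and background points.
    presence = [0.9, 0.8, 0.7, 0.6, 0.3]
    background = [0.5, 0.4, 0.2, 0.1, 0.1]
    threshold, best_tss = max_tss_threshold(presence, background)
    print("threshold:", threshold, "TSS:", round(best_tss, 3))

    # Hypothetical suitabilities from two algorithms for one cell,
    # weighted by each algorithm's (hypothetical) TSS.
    ensemble = tss_weighted_mean([0.6, 0.8], [0.5, 0.75])
    print("ensemble suitability:", round(ensemble, 3))
    print("predicted present:", ensemble >= threshold)
```

Applied per species and summed over the resulting binary maps, the same thresholding step gives the estimated richness listed under Prediction unit.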