Data from: A pattern-oriented simulation for forecasting species spread through time and space: A case study on an ecosystem engineer on the move
Data files
Jan 22, 2025 version files (11.38 MB)
- aoa.tif (134.91 KB)
- fine_scale_data.csv (14.89 KB)
- future_hbs.tif (2.16 MB)
- hbs_map.tif (2.76 MB)
- README.md (13.96 KB)
- suly_sdm_input.csv (6.21 MB)
- suly_sdm_performance.csv (84.09 KB)
Abstract
Modelling the spread of introduced ecosystem engineers is a conservation priority due to their potential to cause irreversible ecosystem-level changes. While existing models predict potential distributions and spread capacities, new approaches that simulate the trajectory of a species’ spread over time are needed. We developed novel simulations that predict spatial and temporal spread, capturing both continuous diffusion-dispersal and occasional long-distance leaps. We focused on the introduced population of Superb Lyrebird (Menura novaehollandiae) in Tasmania, Australia. Initially introduced as an insurance population, lyrebirds have become novel bioturbators, spreading across key natural areas and becoming "unwanted but challenging to eradicate". Using multi-scale ecological data, our research (1) identified broad and fine-scale correlates of lyrebird occupation and (2) developed a spread simulation guided by a pattern-oriented framework. This occurrence-based modelling framework is useful when demographic data are scarce. We found that the cool, wet forests of western Tasmania with open understories offer well-connected habitats for lyrebird foraging and nesting. By 2023, lyrebirds had reached quasi-equilibrium within a core range in southern Tasmania, and were expanding northwest, with the frontier reaching the western coast. Our model forecasts that by 2085, lyrebirds will have spread widely across suitable regions of western Tasmania. By pinpointing current and future areas of lyrebird occupation, we provide land managers with targeted locations for monitoring the effects of their expansion. Further, our Area of Applicability (AOA) analysis identified regions where environmental variables deviate from the training data, guiding future data collection to improve model certainty. Our findings offer an evidence-based approach for future monitoring and provide a framework for understanding the dynamics of other range-expanding species with invasive potential.
README: A pattern-oriented simulation for forecasting species spread through time and space: A case study on an ecosystem engineer on the move
https://doi.org/10.5061/dryad.xsj3tx9rg
Description of the data and file structure
This dataset supports the study on habitat suitability and spread modelling of the Superb Lyrebird (Menura novaehollandiae) in Tasmania. The study applies a sequential framework:
- Fine-scale habitat modelling using camera-trap data to identify vegetation features influencing lyrebird activity.
- Broad-scale species distribution modelling (SDM) using citizen science and camera-trap records combined with environmental predictors.
- Spread simulation to predict lyrebird expansion over time using a stochastic grid-cell approach.
The dataset includes fine-scale vegetation structure, occurrence records, environmental predictors, SDM performance outputs, and simulation scripts.
Files and variables
File: fine_scale_data.csv
Description:
This file contains fine-scale habitat structure data collected from 210 camera sites in Tasmania. The data were used to analyse the relationship between vegetation structure and lyrebird activity.
Variables:
- camera: Unique identifier for each camera station.
- longitude: Longitude of the camera station (WGS84).
- latitude: Latitude of the camera station (WGS84).
- independent_observations: Number of independent lyrebird detections at each camera station.
- camera_opt_time: Camera operational time in days.
- litter: Classification of litter density as dense (1) or sparse (0).
- logs: Classification of log abundance as dense (1) or sparse (0).
- grass: Classification of grass density as dense (1) or sparse (0).
- herb_understorey: Classification of herbaceous understorey density as dense (1) or sparse (0).
- woody_understorey: Classification of woody understorey density as dense (1) or sparse (0).
- tree_density: Classification of tree density as dense (1) or sparse (0).
*Note on coordinate privacy: To protect sensitive camera locations, all site coordinates (latitude and longitude) have been rounded to two decimal places (~1 km precision). This coarsening protects site privacy while preserving the spatial context for ecological analyses.
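As a minimal sketch of how these columns might be combined, the snippet below converts raw detection counts into a rate per 100 camera-days, so that stations with different operational times can be compared. It is written in Python with the standard library only (the author's scripts are in R), the sample rows are invented, and the function name `detection_rates` and the 100-day scaling are assumptions rather than part of the dataset or the published analysis.

```python
import csv
import io

# Invented rows that follow the column layout of fine_scale_data.csv.
SAMPLE = """camera,longitude,latitude,independent_observations,camera_opt_time,litter,logs,grass,herb_understorey,woody_understorey,tree_density
C001,146.52,-42.91,12,60,1,1,0,0,0,1
C002,146.60,-42.88,0,45,0,0,1,1,1,0
C003,146.47,-43.02,3,30,1,0,0,0,1,1
"""


def detection_rates(handle, per_days=100):
    """Return {camera: detections per `per_days` operational days}."""
    rates = {}
    for row in csv.DictReader(handle):
        days = float(row["camera_opt_time"])
        if days <= 0:
            continue  # skip cameras with no recorded operational time
        detections = int(row["independent_observations"])
        rates[row["camera"]] = per_days * detections / days
    return rates


rates = detection_rates(io.StringIO(SAMPLE))
for camera, rate in rates.items():
    print(f"{camera}: {rate:.1f} detections per 100 camera-days")
```

To run this on the real file, replace `io.StringIO(SAMPLE)` with an open file handle for fine_scale_data.csv.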
File: suly_sdm_performance.csv
Description:
This file includes performance metrics for the Species Distribution Models (SDMs) used to predict lyrebird habitat suitability.
Variables:
- lon: Longitude of the prediction point (WGS84).
- lat: Latitude of the prediction point (WGS84).
- ibra_ID: Identifier for the Interim Biogeographic Regionalisation for Australia (IBRA) sub-region.
- occ: Observed species occurrence (1 = presence, 0 = absence).
- prediction: Predicted suitability score (0–1) for lyrebirds.
- predicted_class: Binary classification of predicted presence (1 = suitable, 0 = unsuitable).
*Note on Coordinate Privacy: Longitude (lon) and Latitude (lat) were rounded to two decimal places to maintain privacy of sensitive locations.
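The occ and predicted_class columns can be cross-tabulated to summarise classification performance. The sketch below (Python, standard library only) computes sensitivity, specificity, and the True Skill Statistic from paired 0/1 labels. The example labels are invented, and the choice of these particular metrics is an assumption; the README does not state which metrics the authors reported.

```python
def confusion_metrics(observed, predicted):
    """Sensitivity, specificity and TSS from paired 0/1 labels."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # true presences
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # true absences
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # false presences
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # false absences
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "tss": sensitivity + specificity - 1,
    }


# Invented labels standing in for the occ and predicted_class columns.
occ = [1, 1, 1, 0, 0, 0, 0, 1]
predicted_class = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = confusion_metrics(occ, predicted_class)
print(metrics)
```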
File: aoa.tif
Description:
This raster file represents the Area of Applicability (AOA) for the SDM predictions, highlighting areas in Tasmania where the model is reliable based on similarity to training data.
Metadata:
- Coordinate Reference System (CRS): EPSG:4326 (WGS84).
- Resolution: 0.01° (~1 km² pixel).
- Values:
  - 1 = Within the Area of Applicability (reliable predictions).
  - 0 = Outside the Area of Applicability (potential extrapolation errors).
File: suly_sdm_input.csv
Description:
This file contains species occurrence data and environmental predictors used to build broad-scale Species Distribution Models (SDMs). All continuous predictors were z-transformed before modelling. Vegetation and land use variables were aggregated into four major categories for each.
Variables:
- ibra_ID: Identifier for the IBRA region where the data point is located.
- lon: Longitude of the data point (WGS84).
- lat: Latitude of the data point (WGS84).
- occ: Binary species occurrence (1 = presence, 0 = absence).
- meanDiurnRange: Mean diurnal temperature range (z-transformed).
- isothermality: Ratio of diurnal temperature range to annual temperature range (z-transformed).
- meanTempColdQ: Mean temperature during the coldest quarter (z-transformed).
- meanTempWarmQ: Mean temperature during the warmest quarter (z-transformed).
- precipSeason: Coefficient of variation in monthly precipitation (z-transformed).
- tempSeason: Coefficient of variation in monthly temperature (z-transformed).
- precipWarmQ: Total precipitation during the warmest quarter (z-transformed).
- precipColdQ: Total precipitation during the coldest quarter (z-transformed).
- annPrecip: Total annual precipitation (z-transformed).
- Elevation: Elevation in meters above mean sea level (z-transformed).
- FPAR: Fraction of Photosynthetically Active Radiation absorbed by vegetation (z-transformed).
- veg_4class: Generalised vegetation type:
  - 1 = Rainforests
  - 2 = Wet forests
  - 3 = Dry woodland
  - 4 = Other (e.g., grasslands)
- landuse_4class: Generalised land-use type:
  - 1 = Protected
  - 2 = Modified native
  - 3 = Softwood Plantation
  - 4 = Farmland/Settlement
- Dist.FW: Distance to freshwater sources (meters, z-transformed).
- Dist.Road: Distance to the nearest road (meters, z-transformed).
- Top.Roughness: Topographic roughness index (z-transformed).
- Soil.Class: Soil richness based on type and porosity (categorical).
- Fire.Freq: Average fire frequency in the area (categorical).
- split_cv: Cross-validation group assignment for training and testing.
*Note on Coordinate Privacy: Coordinates (lon, lat) have been rounded to two decimal places. The resulting coordinate resolution (~1 km) protects potentially sensitive locality data while preserving sufficient accuracy for broad-scale SDM.
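For readers unfamiliar with the z-transformation applied to the continuous predictors, each value is centred on the variable's mean and divided by its standard deviation, z = (x - mean) / sd. The sketch below illustrates this in Python on invented rainfall values. Using the sample standard deviation (n - 1 denominator) is an assumption; the README does not say which form was used.

```python
import statistics


def z_transform(values):
    """Centre values on their mean and scale by their standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1); an assumption
    if sd == 0:
        raise ValueError("cannot z-transform a constant variable")
    return [(v - mean) / sd for v in values]


# Invented annual precipitation values in millimetres.
ann_precip_mm = [620.0, 1450.0, 2380.0, 980.0, 1710.0]
z = z_transform(ann_precip_mm)
print([round(v, 3) for v in z])
```

After the transformation the variable has mean 0 and standard deviation 1, which puts predictors measured in different units on a common scale.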
File: hbs_map.tif
Description:
This raster file represents the predicted habitat suitability map for lyrebirds across their range.
Metadata:
- Coordinate Reference System (CRS): EPSG:4326 (WGS84).
- Resolution: 0.01° (~1 km² pixel).
- Values: Habitat suitability scores ranging from 0 (unsuitable) to 1 (highly suitable).
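Because aoa.tif and hbs_map.tif share the same CRS and resolution, one possible use is to blank out suitability scores that fall outside the Area of Applicability. The sketch below shows the cell-wise logic in Python, with small nested lists standing in for the two rasters. The grid values are invented and the function name `mask_by_aoa` is hypothetical; on the actual GeoTIFFs a raster library would do this step, and the sketch assumes the two grids are already aligned.

```python
def mask_by_aoa(suitability, aoa, nodata=None):
    """Keep suitability where aoa == 1; return `nodata` elsewhere."""
    if len(suitability) != len(aoa) or any(
        len(s_row) != len(a_row) for s_row, a_row in zip(suitability, aoa)
    ):
        raise ValueError("grids must have identical dimensions")
    return [
        [s if a == 1 else nodata for s, a in zip(s_row, a_row)]
        for s_row, a_row in zip(suitability, aoa)
    ]


# Invented 3 x 3 grids standing in for hbs_map.tif and aoa.tif.
hbs = [
    [0.10, 0.55, 0.80],
    [0.30, 0.92, 0.75],
    [0.05, 0.40, 0.60],
]
aoa = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
masked = mask_by_aoa(hbs, aoa)
for row in masked:
    print(row)
```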
File: future_hbs.tif
Description: This raster file represents the projected habitat suitability for Superb Lyrebirds (Menura novaehollandiae) in 2085 under future climate scenarios. The projections are based on mean outputs from three Global Climate Models (GCMs) under Representative Concentration Pathway (RCP) scenario 4.5. These models were selected for their reliability in downscaled climatic projections. Only bioclimatic variables identified through Forward Feature Selection (FFS) were used in the modelling. Future land-use and vegetation change data were not included in these projections due to the unavailability of corresponding datasets.
Metadata:
- Source Data: Climatic projections from GFDL-CM21, MRI-CGCM232A, and UKMO-HADCM3 accessed via EcoCommons.
- Resolution: 0.01° (~1 km² pixel).
- Coordinate Reference System (CRS): EPSG:4326 (WGS84).
- Values:
  - Range: 0 (unsuitable) to 1 (highly suitable).
  - Interpretation: Habitat suitability scores derived from the random forest model for climatic conditions projected in 2085.
Code/software
1. functions_spread_ecography.R
This file contains the core functions necessary for the spread simulation. Each function plays a specific role in the modelling process, including managing diffusion, leap dynamics, and validating predictions.
Key Functions
- Simulation Functions:
  - simulate_spread: Models yearly species spread using habitat suitability, diffusion, and leap probabilities.
  - perform_diffusion_matrix: Simulates local diffusion dynamics.
  - leap_and_colonise_matrix: Simulates long-distance dispersal events.
  - create_template_matrix: Prepares matrices to define cells marked for diffusion or leaping.
  - calculate_establishment_probs: Calculates probabilities for species establishment based on leap distance.
- Validation Metrics:
  - first_species_arrival: Identifies the first year of colonisation for each grid cell.
  - Binary entropy loss: Quantifies the accuracy of predictions by comparing them with observed data.
- Parallel Processing:
  - predict_spread_parallel: Executes simulations across multiple CPU cores.
  - run_parallel_simulations: Coordinates and averages parallel simulations.
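To give a feel for how a stochastic grid-cell spread model of this kind can work, here is a deliberately simplified Python sketch. It is not a port of functions_spread_ecography.R. The function name `toy_spread`, the 8-neighbour diffusion rule, the exponential distance decay for leap establishment, and the parameters `max_leap` and `leap_scale` are all assumptions made for illustration.

```python
import math
import random


def toy_spread(suitability, occupied, years, spread_prob, leap_prob,
               max_leap=5, leap_scale=3.0, seed=None):
    """Toy yearly spread on a grid; returns (final occupancy, first arrival year)."""
    rng = random.Random(seed)
    nrow, ncol = len(suitability), len(suitability[0])
    occ = [row[:] for row in occupied]
    first = [[0 if occ[r][c] else None for c in range(ncol)] for r in range(nrow)]
    for year in range(1, years + 1):
        new = [row[:] for row in occ]
        for r in range(nrow):
            for c in range(ncol):
                if not occ[r][c]:
                    continue
                # Local diffusion: try to colonise each of the 8 neighbours.
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (dr or dc) and 0 <= rr < nrow and 0 <= cc < ncol:
                            if rng.random() < spread_prob * suitability[rr][cc]:
                                new[rr][cc] = True
                # Occasional long-distance leap to a random nearby cell.
                if rng.random() < leap_prob:
                    dr = rng.randint(-max_leap, max_leap)
                    dc = rng.randint(-max_leap, max_leap)
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < nrow and 0 <= cc < ncol:
                        # Assumed rule: establishment decays with leap distance.
                        decay = math.exp(-math.hypot(dr, dc) / leap_scale)
                        if rng.random() < suitability[rr][cc] * decay:
                            new[rr][cc] = True
        # Record the first year each cell became occupied.
        for r in range(nrow):
            for c in range(ncol):
                if new[r][c] and first[r][c] is None:
                    first[r][c] = year
        occ = new
    return occ, first


# Invented 8 x 8 landscape: column 0 is unsuitable, the rest moderately suitable.
suitability = [[0.0 if c == 0 else 0.8 for c in range(8)] for r in range(8)]
occupied = [[False] * 8 for _ in range(8)]
occupied[4][4] = True  # single introduction site
final, first_arrival = toy_spread(suitability, occupied, years=10,
                                  spread_prob=0.3, leap_prob=0.05, seed=42)
print(sum(cell for row in final for cell in row), "cells occupied after 10 years")
```

Cells with zero suitability can never be colonised under these rules, and the returned first-arrival grid plays a role analogous to the first-year-of-colonisation metric listed above.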
Inputs and Outputs
- Inputs: Rasters for initial species presence, habitat suitability, and validation targets.
- Outputs: Yearly raster predictions of species spread and performance metrics such as binary entropy loss.
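The README does not give the formula for the binary entropy loss. Assuming it is the usual binary cross-entropy between observed presence/absence and predicted occupancy probability, a minimal Python sketch looks like this. The clipping constant `eps` and the example values are assumptions for illustration.

```python
import math


def binary_cross_entropy(observed, predicted, eps=1e-12):
    """Mean binary cross-entropy; lower values mean closer agreement."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(observed, predicted):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(observed)


# Invented cell values: observed occupancy and simulated occupancy probability.
observed_cells = [1, 1, 0, 0]
good_prediction = [0.9, 0.8, 0.1, 0.2]
poor_prediction = [0.4, 0.3, 0.7, 0.6]
print(binary_cross_entropy(observed_cells, good_prediction))
print(binary_cross_entropy(observed_cells, poor_prediction))
```

A parameter set whose averaged simulations assign high probability to cells that are observed as occupied, and low probability elsewhere, receives a smaller loss.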
2. simulate_spread_ecography.R
This file provides the main workflow to execute the simulation, perform parameter calibration, and visualise results.
Workflow Description
- Data Loading:
  - Loads input rasters (suly_spread_input.tif).
  - Defines simulation parameters such as spread probability, leap probability, and habitat suitability.
- Parameter Calibration:
  - Uses Latin Hypercube Sampling (LHS) to sample parameter space.
  - Refines parameters through Approximate Bayesian Computation (ABC).
- Simulation Execution:
  - Models the historical spread (1930–2023) and future projections (2023–2085).
- Results and Validation:
  - Outputs raster maps for yearly spread and calculates performance metrics.
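The calibration step in the workflow above can be pictured as two stages: draw parameter sets that cover the parameter space evenly, then keep the sets whose simulations best match the observed pattern. The Python sketch below shows one simple way to do this, using a hand-written Latin hypercube sampler and plain rejection ABC. It does not reproduce the R lhs or abc packages, the parameter bounds are invented, and `toy_distance` is a placeholder for a real comparison between simulated and observed spread.

```python
import random


def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples parameter sets with one draw per stratum per parameter."""
    columns = {}
    for name, (low, high) in bounds.items():
        # One uniform draw inside each of n equal-width strata, then shuffle
        # so strata are paired at random across parameters.
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(points)
        columns[name] = [low + u * (high - low) for u in points]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]


def abc_rejection(samples, distance, keep_fraction=0.1):
    """Keep the fraction of parameter sets with the smallest distance."""
    ranked = sorted(samples, key=distance)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


def toy_distance(params):
    """Placeholder: a real workflow would simulate spread and score it."""
    return abs(params["spread_prob"] - 0.3) + 10 * abs(params["leap_prob"] - 0.02)


rng = random.Random(123)
bounds = {"spread_prob": (0.0, 1.0), "leap_prob": (0.0, 0.1)}  # invented ranges
samples = latin_hypercube(200, bounds, rng)
accepted = abc_rejection(samples, toy_distance, keep_fraction=0.1)
mean_spread = sum(s["spread_prob"] for s in accepted) / len(accepted)
print(f"{len(accepted)} sets accepted, mean spread_prob = {mean_spread:.2f}")
```

The accepted sets approximate a posterior distribution over the parameters; rejection is the simplest ABC variant and may differ from the one used in the published analysis.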
Software Requirements
The scripts require R (version 4.0 or later) and the following R packages:
- terra (for raster data handling and visualisation)
- lhs (to implement Latin Hypercube Sampling)
- abc (for Approximate Bayesian Computation)
- doParallel (to enable parallel processing)
- grDevices (for generating colour palettes)
Notes on Software Setup
- Ensure all required packages are installed using install.packages().
- The scripts are designed to run efficiently on multi-core systems for large raster datasets. Adjust n_cores in the script for your system's specifications.
Inputs and Outputs
Input Files
- suly_spread_input.tif: A raster stack containing:
  - Initial species distribution.
  - Habitat suitability map.
  - Observed species distribution (validation target).
Outputs
- Rasters showing predicted spread patterns (1930–2085).
- Performance metrics, including binary entropy loss, for each parameter set.
Usage Instructions
- Setup:
  - Place functions_spread_ecography.R and simulate_spread_ecography.R in the same directory.
  - Load the required R packages.
- Execution:
  - Open simulate_spread_ecography.R in R.
  - Adjust parameters such as nsim, intro_year, and final_year as needed.
  - Run the script to execute simulations and generate outputs.
- Output Visualisation:
  - Use the provided code within the script to visualise spread predictions and evaluate model performance.
Access information
Species Occurrence Data
Species occurrence records were sourced from the Atlas of Living Australia (ALA) and include records for the Superb Lyrebird (Menura novaehollandiae) as well as all other land birds used to generate effort-controlled pseudo-absences.
DOI Links for Species Data:
- Superb Lyrebird Occurrences: doi.org/10.26197/ala.4744a2de-99ec-4cfb-9a0d-2a52d0f1dd5e (accessed 06 April 2023)
- All Bird Occurrences New South Wales: doi.org/10.26197/ala.4a361a2e-a9ea-4044-b75a-0f777bcc3b96
- All Bird Occurrences Queensland: doi.org/10.26197/ala.3d110109-1285-4518-8b0a-9e63d8bb1c3f
- All Bird Occurrences Victoria: doi.org/10.26197/ala.adf7dc7a-41d0-4fc3-8190-79fc03f4a31f
- All Bird Occurrences Tasmania and Australian Capital Territory: doi.org/10.26197/ala.cf2a97b2-42aa-4fee-b0bb-cd6c884467a2
Environmental Predictor Data
The predictor data were compiled from multiple publicly accessible sources:
- Climatic Variables: EcoCommons (ecocommons.org.au)
- Elevation: Department of Agricultural Resources (data.daff.gov.au)
- Vegetation Type: National Vegetation Information System (environment.gov.au)
- Land-Use Type: Catchment Scale Land Use of Australia (agriculture.gov.au)
- Distance to Freshwater and Roads: Created by the host research group using publicly available shapefiles.
- Terrain Roughness: Created by the host research group using elevation data (data.daff.gov.au)
- Soil Richness: Australian Soil Resource Information System (asris.csiro.au)
- Fire Frequency: AusCover (data.auscover.org.au)
IBRA Regions and Subregions Shapefiles
The Interim Biogeographic Regionalisation for Australia (IBRA) shapefiles, including both regions and subregions, were obtained from the Australian Department of Climate Change, Energy, the Environment and Water (DCCEEW):
https://www.dcceew.gov.au/environment/land/nrs/science/ibra
Methods
This dataset integrates fine- and broad-scale ecological data collected between 1970 and 2023. Fine-scale data were obtained from 210 camera trap sites in Tasmania, recording vegetation structure and lyrebird detections. Broad-scale data were compiled from citizen science records via the Atlas of Living Australia, combined with environmental predictors such as climatic variables, elevation, and land-use data.
Fine-scale habitat data were derived from camera trap detections, with vegetation classified into dense or sparse categories using a 3×3 grid overlay method. Broad-scale occurrence records were filtered for spatial and temporal accuracy, and pseudo-absence data were generated based on effort-controlled absence criteria.
Predictor variables for Species Distribution Models (SDMs) were z-transformed, and categorical variables (e.g., vegetation, land-use types) were aggregated into broader classes. Spread simulations used a stochastic grid-cell model with parameters calibrated via a pattern-oriented framework using Approximate Bayesian Computation.
For full details on data collection and processing, please refer to the associated publication: https://doi.org/10.1111/ecog.07597.