Seki, Takeharu1; Hirota, Mitsuru1; Kenta, Tanaka1

Research facility: University of Tsukuba

Published Mar 19, 2026 on Dryad. https://doi.org/10.5061/dryad.ksn02v7hj

Abstract

This repository houses the simulation code and data for "A Bias-robust Framework for Quantifying Community Responses to the Climate Change Using the Occurrence Data." It contains two simulation scripts. The first (SimulationCode.R) generates distribution data for a pseudo-biological community whose range shifts due to climate warming, and then draws numerous rounds of biased samples from that distribution. The CCDM (Community Change Detection Model) is then applied to the resulting biased occurrence data to evaluate the rate of thermophilization. The second (SimulationCode(Sensitivity_Analysis).R) repeats the species distribution generation and sampling under different bias conditions to assess the robustness of CCDM across various scenarios.

SimulationCode.R

Generation of true community data: We constructed a fictional community of 100 species whose ranges shifted toward cooler regions without delay. The optimal LTI of each species at Year 50 (the midpoint of the simulation) was drawn from a uniform distribution between −10 °C and 10 °C. The optimal LTI changes from year to year, whereas the STI (also referred to as the climatic niche) remains constant over time. Warming was represented as a 0.01 °C annual decrease in each species' optimal LTI. Because the LTI is a representative temperature index for a site, the LTI of a given site remains constant even under warming. Decreasing the optimal LTI by 0.01 °C/year therefore represents a range shift toward colder regions at a rate of 0.01 °C/year, not a change in climatic niche. Each year, 100,000 individuals were generated and randomly assigned to one of the 100 species. The LTI of each individual's location was drawn from a normal distribution with a mean equal to that species' optimal LTI for that year and a standard deviation of 3 °C. The only locational information recorded for each individual is therefore its LTI. This produced a true community dataset of 10,000,000 individuals (100 years × 100,000 individuals/year).
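The generation step above can be sketched as follows. This is a minimal Python illustration using only the standard library, not the repository's R code. The function name, the tuple layout (year, species, LTI), and the numbering of years from 1 to 100 are assumptions made for the example.

```python
import random

def generate_true_community(n_species=100, n_years=100, n_per_year=100_000,
                            warming_rate=0.01, niche_sd=3.0, seed=0):
    """Yield (year, species, lti) tuples for a simulated community.

    Hypothetical helper: names and year indexing (1..n_years) are assumptions.
    """
    rng = random.Random(seed)
    mid_year = n_years // 2  # Year 50 when n_years is 100
    # Optimal LTI of each species at the midpoint year, uniform on [-10, 10].
    optimum_mid = [rng.uniform(-10.0, 10.0) for _ in range(n_species)]
    for year in range(1, n_years + 1):
        for _ in range(n_per_year):
            species = rng.randrange(n_species)  # random species assignment
            # The optimum drops by `warming_rate` per year and equals
            # optimum_mid[species] at the midpoint year.
            optimum = optimum_mid[species] - warming_rate * (year - mid_year)
            # The individual's location is recorded only as an LTI value.
            yield year, species, rng.gauss(optimum, niche_sd)
```

Because the function is a generator, the full 10,000,000 records never need to be held in memory at once; passing smaller values for `n_per_year` gives a quick test run.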
Generation of occurrence data with sampling bias: To simulate observations within boundaries that do not encompass species' entire distributions, we created “truncated community data” by excluding individuals in the top and bottom 5 % tails of the LTI distribution (10 % of records in total) from the community distribution data. We sampled from the truncated community data under two scenarios designed to demonstrate STVOE: “Bias toward Colder”, in which the mean LTI of sampling locations shifted annually toward colder regions, and “Bias toward Warmer”, in which it shifted toward warmer regions. The centroid LTI of sampling shifted linearly from 1 °C to −1 °C (Bias toward Colder) or from −1 °C to 1 °C (Bias toward Warmer) over the 100 years. Each individual in the truncated community data was assigned a sampling weight equal to the probability density of a normal distribution with a mean equal to the centroid LTI for that individual's year and a standard deviation of 5 °C. For each dataset, a total of 10,000 records were extracted by weighted sampling to generate occurrence data. This process was repeated 1000 times for each scenario (2 scenarios × 10,000 records/dataset × 1000 datasets).
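A minimal Python sketch of the truncation and weighted sampling described above is given here for illustration; it is not the repository's R code. Two points are assumptions because the description does not specify them: the tails are cut from the pooled LTI values of all records, and records are drawn with replacement. All function names are hypothetical.

```python
import math
import random

def truncate_tails(ltis, tail=0.05):
    """Return (low, high) LTI bounds after cutting `tail` of records from each end."""
    ordered = sorted(ltis)
    k = int(len(ordered) * tail)
    return ordered[k], ordered[len(ordered) - k - 1]

def centroid_lti(year, n_years=100, start=1.0, end=-1.0):
    """Centroid of sampling effort, moving linearly from `start` (year 1) to `end`."""
    return start + (end - start) * (year - 1) / (n_years - 1)

def normal_pdf(x, mean, sd):
    """Probability density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def sample_occurrences(records, n_records, start, end, n_years=100,
                       effort_sd=5.0, tail=0.05, seed=0):
    """Draw biased occurrence records from (year, species, lti) tuples."""
    rng = random.Random(seed)
    # Assumption: truncation bounds come from the pooled LTI values.
    low, high = truncate_tails([lti for _, _, lti in records], tail)
    kept = [r for r in records if low <= r[2] <= high]
    # Weight = normal density around that year's sampling centroid.
    weights = [normal_pdf(lti, centroid_lti(year, n_years, start, end), effort_sd)
               for year, _, lti in kept]
    # Assumption: sampling with replacement (random.choices).
    return rng.choices(kept, weights=weights, k=n_records)
```

Under these assumptions, “Bias toward Colder” corresponds to `start=1.0, end=-1.0` and “Bias toward Warmer” to `start=-1.0, end=1.0`.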
Regression analysis and evaluation: For each occurrence dataset, the STI of each species was estimated by averaging the LTIs of records collected in the first 20 years. Multiple regression analysis was then applied to the dataset, followed by correction using the resulting coefficients. The difference between the estimated thermophilization rate and the true simulated warming rate (0.01 °C/year) was evaluated using Cohen's d effect size.
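The STI estimation and the effect-size comparison can be sketched in Python as below. The regression model itself is not reproduced, since its form is not given in this description. The one-sample form of Cohen's d used here (mean difference divided by the sample standard deviation) is an assumption, as are the function names and the (year, species, LTI) record layout.

```python
from collections import defaultdict
from statistics import mean, stdev

def estimate_sti(occurrences, baseline_years=20):
    """Mean LTI per species over records from the first `baseline_years` years."""
    by_species = defaultdict(list)
    for year, species, lti in occurrences:
        if year <= baseline_years:
            by_species[species].append(lti)
    return {species: mean(values) for species, values in by_species.items()}

def cohens_d_one_sample(estimates, true_value):
    """Assumed one-sample Cohen's d: (mean estimate - true value) / sample SD."""
    return (mean(estimates) - true_value) / stdev(estimates)
```

Here `estimates` would hold one estimated thermophilization rate per simulated occurrence dataset, and `true_value` would be 0.01 (°C/year).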

SimulationCode(Sensitivity_Analysis).R

To evaluate the validity of the CCDM under different bias conditions, we conducted sensitivity analyses by varying the magnitudes of STVOE and the truncation effect. STVOE was simulated by varying the difference between the centroid LTI of sampling effort in the first and final years from −4 °C (+2 °C to −2 °C) to +4 °C (−2 °C to +2 °C) in 0.2 °C increments. Identical centroid LTIs in the initial and final years indicate no STVOE; however, a spatial bias remained in that case because the observations were still concentrated around LTI = 0 °C. For each STVOE setting, the truncation effect was simulated by removing 0 %–7.5 % from each tail of the species distribution (i.e., 0 %–15 % in total) in 0.75 % increments. In total, 231 bias scenarios were generated from the 21 STVOE levels and 11 truncation levels. Each combination was sampled 100 times. For every simulated occurrence dataset, CCDM was applied, and the estimated values, both uncorrected (c1) and bias-corrected (c1/c2), were compared with the true warming rate (0.01 °C/year).
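The scenario grid can be enumerated with a short Python sketch. It assumes that the 0.2 °C step applies to the first-year centroid (from +2 °C down to −2 °C) and that the final-year centroid is its mirror image; this reading reproduces the 21 STVOE levels stated above. The variable names are hypothetical.

```python
from itertools import product

# Assumed reading: first-year centroid steps from +2.0 to -2.0 in 0.2 °C
# increments, and the final-year centroid is the mirror image.
stvoe_levels = [((10 - i) / 5, (i - 10) / 5) for i in range(21)]

# Percentage removed from each tail: 0, 0.75, ..., 7.5.
truncation_levels = [0.75 * j for j in range(11)]

# Every combination of STVOE level and truncation level.
scenarios = list(product(stvoe_levels, truncation_levels))

print(len(stvoe_levels), len(truncation_levels), len(scenarios))
```

Each entry of `scenarios` pairs a (first-year, final-year) centroid with a per-tail truncation percentage; the level with both centroids at 0 °C is the no-STVOE case.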

Data and code from: A bias-robust framework for quantifying community responses to the climate change using the occurrence data

Data files

Abstract

Description of this repository

Description of the data and file structure

File: SimulationCode.R

File: SimulationCode(Sensitivity_Analysis).R

File: 01_GeneratedDistributionData.csv

Variables

File: 02_ExtractedBiasedOccurrenceData.zip

Variables

Reference

Code/software

Access information

Methods