Data and code from: Integrating microclimate to understand vector development and disease patterns: Challenges and lessons from plague in Madagascar's Central Highlands
Data files
Feb 05, 2026 version files 10.32 GB
- 01_code_microclimf_4_submission.zip (2.53 GB)
- 02_temp_2_VDI.zip (82.02 MB)
- 03_code_VDI_and_null_calculation_4_submission.zip (7.71 GB)
- README.md (17.28 KB)
Abstract
This dataset contains code and data supporting the generation of microclimate-driven estimates of plague vector development across the Central Highlands of Madagascar. It comprises three zipped archives. The first archive includes scripts, spatial input datasets, and intermediate outputs required to construct a mechanistic microclimate model of below-ground temperature at 0.15 m depth for summer 2023 using the microclimf modelling framework. Input data include categorical soil class rasters, a digital elevation model, MODIS-derived land-cover rasters, and hourly ERA5 climate data, and the archive produces hourly microclimate temperature rasters.
The second archive contains scripts and microclimate outputs used to convert hourly below-ground temperatures into annual Vector Development Index (VDI) rasters for two flea species (Synopsyllus fonquerniei and Xenopsylla cheopis). The third archive contains scripts and data for calculating temporal slopes of VDI, plotting VDI gradients, and conducting null analyses that compare observed pre-case VDI gradients with temporally permuted null case series. Dummy human plague case datasets (RDS and CSV formats) are provided to enable execution of the workflow without releasing sensitive surveillance records.
All analyses are implemented in the R programming environment, with package dependencies specified within scripts and installed automatically via the pacman package. No personally identifiable information is included. Original human plague surveillance data are curated by the Plague Unit of the Institut Pasteur de Madagascar and may be requested separately. The dataset supports reproduction of the full analytical workflow, adaptation of the modelling framework to other regions or vector systems, and reuse of scripts for microclimate modelling, vector development estimation, and null-model based gradient testing. Data are released under a public domain waiver.
Description of the data and file structure
Introduction
The dataset and scripts in this repository allow a dummy subset of the analysis from the paper “Integrating microclimate to understand vector development and disease patterns: challenges and lessons from plague in Madagascar's Central Highlands” to be run. Much of the analysis for this work was completed on an institutional high-performance computing system; a subset of the code and data that should be runnable on a personal computer is therefore included here in the electronic supplementary material.
Required R Packages
The following packages are needed for data handling, geospatial processing, and microclimate modelling:
- CRAN / standard packages
  pacman, remotes, devtools, lutz, terra, abind, tictoc, stringr, ncdf4, tidyverse, tidyterra, ggplot2, sf, rgdal, lubridate, stats, ggpubr, grid, plyr, ggspatial, spatstat, ggtext, ggbreak, ggrepel, scales
- GitHub-hosted microclimate packages
  These are installed from the relevant repositories:
  - climvars: https://github.com/ilyamaclean/climvars
  - microclimf: https://github.com/ilyamaclean/microclimf
  - micropoint: https://github.com/ilyamaclean/micropoint
  - microctools: https://github.com/ilyamaclean/microctools
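As a minimal sketch of how the dependencies above might be checked before running anything (the archived scripts handle installation themselves through pacman), the helper below reports which packages are not yet available. The names `missing_pkgs` and `install_missing` are hypothetical and are not part of the archives, and only a few of the CRAN packages are listed for brevity.

```r
# Hypothetical pre-flight check; not part of the archived scripts.
cran_pkgs <- c("pacman", "remotes", "terra", "ncdf4", "sf", "lubridate")

# GitHub repositories as listed in this README (owner/repo form).
github_repos <- c(
  climvars    = "ilyamaclean/climvars",
  microclimf  = "ilyamaclean/microclimf",
  micropoint  = "ilyamaclean/micropoint",
  microctools = "ilyamaclean/microctools"
)

# Return the subset of `pkgs` that cannot be loaded from the current library.
missing_pkgs <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

# Install whatever is missing. This needs network access, so it is only
# defined here and never called automatically.
install_missing <- function() {
  need_cran <- missing_pkgs(cran_pkgs)
  if (length(need_cran) > 0) install.packages(need_cran)
  need_gh <- missing_pkgs(names(github_repos))
  for (pkg in need_gh) remotes::install_github(github_repos[[pkg]])
  invisible(c(need_cran, need_gh))
}

# Report what is currently missing on this machine.
print(missing_pkgs(c(cran_pkgs, names(github_repos))))
```

Calling `install_missing()` interactively would then install only the packages reported as missing.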
Microclimate analysis
The analysis is split into three separate zip files, which, when expanded, contain the scripts and data to run the analysis. All required packages are listed within the functions and are installed and loaded through the package "pacman", which is itself installed on line 7 of the “microclimf_test_script.R” script. The first file, "01_code_microclimf_4_submission.zip", contains the script and data for running the microclimate model (microclimf; see https://doi.org/10.32942/X2BD17). The user should open the R project file "non_interactive_micorclimf.Rproj", which sets the working directory so that all functions are sourced correctly. Running “microclimf_test_script.R” will construct a microclimate model across this region for the summer of 2023.
Data
- Soil data (iSDA Africa Soil Database)
  Categorical raster of soil class (e.g., clay-loam, sandy-loam). Units: none. Used to assign soil thermal and hydraulic properties for below-ground temperature modelling.
  - "./data/tile_dat/0.5x0.5/soil_high_id...........tif"
- Digital Elevation Model (DEM)
  Raster of elevation in metres above sea level (m). Used to derive slope, aspect, terrain shading, and topographic wetness.
  - "./data/highland_dat/MDG_msk_alt/MDG_highland.tif"
- Climate data (ERA5, hourly)
  Time series or gridded data including air temperature (°C), relative humidity (%) or dewpoint (°C), wind speed (m s⁻¹), surface pressure (Pa or hPa), downward shortwave radiation (W m⁻²), downward longwave radiation (W m⁻² or derived), and precipitation (mm). Represents above-canopy macroclimate forcing.
  - "./data/spatial_data/era5_............nc"
- Land cover/vegetation (MODIS Land Cover)
  Categorical land-cover raster (e.g., forest, shrub, cropland, grassland). Converted using veg_from_hab_local (a local version with vegetation height variables marginally altered) to vegetation parameters including canopy height (m), plant area index (m² m⁻²), leaf reflectance (0–1), and leaf transmittance (0–1).
  - "./data/highland_dat/microclim_format/lc_high_id_............tif"
Files and structure
All of the zipped files follow the same basic structure: a results directory, in which all outputs are stored; a data directory, from which the input variables are drawn; and a script directory, which contains the numbered script files that run the processing together with a functions directory from which all functions are sourced.
├── data
│   ├── highland_dat
│   │   ├── MDG_msk_alt
│   │   └── microclim_format
│   ├── spatial_data
│   │   └── subset
│   └── tile_dat
│       └── 0.5x0.5
├── results
│   └── runmicro_biga_sensitivity
│       └── 2023_-0.15
│           └── microut
└── script
    ├── 01_Functions
    └── microclimf_local
All data are loaded from the data directory and its subdirectories as detailed above.
The script is within the script folder, with functions and local versions of microclimf functions in the 01_Functions and microclimf_local directories, respectively.
The script produces a single year of microclimate output (2023) at a depth of -0.15 m, which is written as tiled RDS files to the "/results/runmicro_biga_sensitivity/2023_-0.15/microut" folder.
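The tiled outputs can then be inspected from R. The following is a minimal sketch, assuming only that the tiles are ordinary `.rds` files in the output folder; the helper name `list_micro_tiles` is hypothetical, and because the internal structure of each tile is not documented here, the usage example inspects it with `str()` rather than assuming a layout.

```r
# Hypothetical helper: list the tiled RDS outputs written by the microclimate
# script. The default path is relative to the project root.
list_micro_tiles <- function(
    out_dir = file.path("results", "runmicro_biga_sensitivity",
                        "2023_-0.15", "microut")) {
  if (!dir.exists(out_dir)) {
    stop("Output folder not found: ", out_dir,
         " (run the script from the project root first)")
  }
  sort(list.files(out_dir, pattern = "\\.rds$",
                  full.names = TRUE, ignore.case = TRUE))
}

# Usage, once the model has been run:
# tiles <- list_micro_tiles()
# tile  <- readRDS(tiles[1])
# str(tile, max.level = 1)
```

Failing early with a clear message when the folder is absent makes a wrong working directory easier to spot than an empty file list would.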
Vector development calculation
The second file, "02_temp_2_VDI.zip", contains the scripts for transforming the microclimate data into a Vector Development Index (VDI), an estimate of the total number of complete development cycles possible in the selected time period (one year) for the vector species Synopsyllus fonquerniei and Xenopsylla cheopis. The user should again run all scripts through the provided project "PDI_calculation.Rproj" in numerical order.
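To make the idea of an index of development cycles concrete, the sketch below sums an hourly development rate over a temperature series. It is an illustration under stated assumptions, not the method used in the archived scripts: the linear rate function and its parameters (`t_min`, `t_max`, `days_at_opt`) are hypothetical placeholders and are not fitted values for either flea species.

```r
# Hypothetical development rate (cycles per day) as a function of
# temperature (deg C): linear between t_min and t_max, zero outside.
# All parameter values are placeholders for illustration only.
dev_rate_per_day <- function(temp_c, t_min = 10, t_max = 30, days_at_opt = 20) {
  rate <- (temp_c - t_min) / (t_max - t_min) / days_at_opt
  rate[is.na(temp_c) | temp_c < t_min | temp_c > t_max] <- 0
  rate
}

# Accumulate hourly rates into a cumulative number of development cycles.
vdi_from_hourly <- function(temp_hourly, rate_fun = dev_rate_per_day) {
  sum(rate_fun(temp_hourly) / 24)
}

# Toy input: one year of hourly soil temperature held at a constant 30 deg C.
temp_hourly <- rep(30, 365 * 24)
print(vdi_from_hourly(temp_hourly))
```

Passing the rate function as an argument keeps the accumulation step separate from the species-specific thermal response, so a different response curve can be swapped in without changing the summation.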
This script is set to only calculate the VDI over a subset of the data, to save time while demonstrating functionality. If the user would like to complete the analysis in full, line 26 in "01_microclim_2_PDI.R" (micro_out_hourly_files <- micro_out_hourly_files[1:10]) should be commented out.
The input data for this project are the output from the previous project, stored in "01_code_microclimf_4_submission/results/runmicro_biga_sensitivity/2023_-0.15/microut/".
Running "01_From_microclimate_to_PDI.R" will run the conversion.
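One way to avoid editing the script each time is to control the subset with a flag. The sketch below is an assumption about how this could be done and is not the archived code; `select_files`, `run_full`, and `n_demo` are hypothetical names. Using `head()` rather than `[1:10]` also avoids introducing `NA` entries when fewer than ten files are present.

```r
# Hypothetical switch between the demonstration subset and the full analysis.
select_files <- function(files, run_full = FALSE, n_demo = 10) {
  if (run_full) files else head(files, n_demo)
}

# Toy file list standing in for the hourly microclimate tiles.
micro_out_hourly_files <- sprintf("tile_%02d.rds", 1:25)

# Demonstration subset versus the full set.
print(length(select_files(micro_out_hourly_files)))
print(length(select_files(micro_out_hourly_files, run_full = TRUE)))
```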
Files and structure
There is no data directory in this archive, as the input data are instead read from the output of "01_code_microclimf_4_submission.zip". This means that the relative location of the files is important, and we recommend extracting all archives into the same folder. Results are output as "tif" files, and scripts and functions are both contained in the script directory.
├── results
│   ├── pdi_rasts
│   └── temp_rasts
└── script
    └── functions
Vector Development Index gradient and nulls
The third file, "03_code_VDI_and_null_calculation_4_submission.zip", contains the scripts for plotting the VDI gradients and running a null analysis that compares the VDI gradients calculated prior to plague cases with those of null cases.
Data
This analysis is completed using the annual VDI (PDI) rasters as produced in 02_temp_2_VDI with daily layers; these are stored as tiffs.
The only other input is the DUMMY plague case data, stored as both RDS and CSV files ("./results/dummy_plague_spatial_14_20.rds" and "./data/case_data/case_date_14_20.csv", respectively).
Script
01_PDI_prep.R - Loads and combines the plague case data (dummy) with the VDI (initially labelled as PDI) data.
02_PDI_gradient_plot.R - Plots the VDI gradients as displayed in Figure 4.
03_PDI_summary.R - Summarises and generates boxplots across both species and time periods.
04_null_gradient.R - Calculates and plots the distribution of null gradients across all regions and cases in comparison to "real" case data.
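The logic of comparing a pre-case gradient with a null distribution can be sketched as follows. This is a simplified illustration and not the implementation in 04_null_gradient.R: it assumes a single daily VDI series, draws null dates uniformly at random (no temporal structure retained), and uses a one-sided empirical p-value. The function names, the `lag_days` parameter, and the toy series are all hypothetical.

```r
# Slope of daily VDI over the `lag_days` days immediately before `case_day`.
pre_case_slope <- function(vdi_daily, case_day, lag_days) {
  idx <- seq(case_day - lag_days, case_day - 1)
  stopifnot(min(idx) >= 1, max(idx) <= length(vdi_daily))
  unname(coef(lm(vdi_daily[idx] ~ idx))[2])
}

# Null distribution: slopes before randomly drawn dates.
null_slopes <- function(vdi_daily, lag_days, n_null = 999) {
  candidates <- seq(lag_days + 1, length(vdi_daily))
  days <- candidates[sample.int(length(candidates), n_null, replace = TRUE)]
  vapply(days, function(d) pre_case_slope(vdi_daily, d, lag_days), numeric(1))
}

# One-sided empirical p-value: share of null slopes at least as steep as the
# observed slope, with the usual +1 correction.
empirical_p <- function(observed, null) {
  (1 + sum(null >= observed)) / (1 + length(null))
}

set.seed(1)
vdi_daily <- cumsum(runif(365))  # toy cumulative series, not real data
obs  <- pre_case_slope(vdi_daily, case_day = 200, lag_days = 30)
null <- null_slopes(vdi_daily, lag_days = 30, n_null = 199)
print(empirical_p(obs, null))
```

A variant that maintains temporal structure would restrict the candidate dates, for example to the same time of year as the observed case, before drawing the null sample.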
Files and structure
This directory produces all of the data displayed in the figures and integrates data from the previous two directories. The script, data, and results structure of the other archives is maintained.
Script
script/
└── functions
All scripts and functions are contained in the script directory, and the scripts should be run in numerical order as described above.
Data
data/
├── DEM
├── MDG_msk_alt
├── Modis_LC
├── case_date
├── commune
├── district
├── pdi_rasts
└── spatial
The data folder contains all data for analysis in this directory.
- DEM - Digital elevation model masked to the extent of the Madagascan Highlands, "tif" format
- MDG_msk_alt - Digital elevation model further masked to only the Madagascan Central Highlands, "tif" format
- Modis_LC - MODIS-derived land cover for the Madagascan Highlands across all years of analysis, "tif" format. Further, "mad_all_df.rds" is an RDS file containing all land cover layers as a list.
- case_data - DUMMY plague case data, at the district and commune administrative level (18_20 and 14_20 respectively), "csv" format.
- commune - Shape file of the commune administrative boundaries, "shp" format
- district - Shape files of the district administrative boundaries, "shp" format
- pdi_rasts - The vector development index (previously PDI) files for both vector species, using both corrected and uncorrected microclimate temperature values, all in "tif" format.
- spatial - Shape files of all administrative levels across Madagascar, "shp" format.
Results
results/
├── p_values
│   ├── 14_20
│   │   ├── best_p
│   │   └── crctd
│   │       └── best_p
│   └── 18_20
│       ├── best_p
│       ├── crctd
│       │   └── best_p
│       └── temp_maint
├── pdi_null
│   ├── 14_20
│   │   ├── no_structure
│   │   │   ├── s_fonq
│   │   │   │   ├── 1
│   │   │   │   ├── 2
│   │   │   │   ├── 3
│   │   │   │   ├── 6
│   │   │   │   └── crctd
│   │   │   │       ├── 1
│   │   │   │       ├── 2
│   │   │   │       ├── 3
│   │   │   │       └── 6
│   │   │   └── x_cheop
│   │   │       ├── 1
│   │   │       ├── 2
│   │   │       ├── 3
│   │   │       ├── 6
│   │   │       └── crctd
│   │   │           ├── 1
│   │   │           ├── 2
│   │   │           ├── 3
│   │   │           └── 6
│   │   └── temporal_structure_maintained
│   │       ├── s_fonq
│   │       │   ├── 1
│   │       │   ├── 2
│   │       │   ├── 3
│   │       │   ├── 6
│   │       │   └── crctd
│   │       │       ├── 1
│   │       │       ├── 2
│   │       │       ├── 3
│   │       │       └── 6
│   │       └── x_cheop
│   │           ├── 1
│   │           ├── 2
│   │           ├── 3
│   │           ├── 6
│   │           └── crctd
│   │               ├── 1
│   │               ├── 2
│   │               ├── 3
│   │               └── 6
│   └── 18_20
│       ├── no_structure
│       │   ├── s_fonq
│       │   │   ├── 1
│       │   │   ├── 2
│       │   │   ├── 3
│       │   │   ├── 6
│       │   │   └── crctd
│       │   │       ├── 1
│       │   │       ├── 2
│       │   │       ├── 3
│       │   │       └── 6
│       │   └── x_cheop
│       │       ├── 1
│       │       ├── 2
│       │       ├── 3
│       │       ├── 6
│       │       └── crctd
│       │           ├── 1
│       │           ├── 2
│       │           ├── 3
│       │           └── 6
│       └── temporal_structure_maintained
│           ├── s_fonq
│           │   ├── 1
│           │   ├── 2
│           │   ├── 3
│           │   ├── 6
│           │   └── crctd
│           │       ├── 1
│           │       ├── 2
│           │       ├── 3
│           │       └── 6
│           └── x_cheop
│               ├── 1
│               ├── 2
│               ├── 3
│               ├── 6
│               └── crctd
│                   ├── 1
│                   ├── 2
│                   ├── 3
│                   └── 6
├── pdi_summary
│   ├── significance_plots
│   │   └── csv
│   └── xl
├── pdi_temporal
│   ├── 14_20
│   ├── 18_20
│   ├── full_14_20
│   ├── full_18_20
│   └── plots
│       ├── 14_20
│       │   ├── 1
│       │   ├── 2
│       │   ├── 3
│       │   ├── 6
│       │   └── crctd
│       │       ├── 1
│       │       ├── 2
│       │       ├── 3
│       │       └── 6
│       └── 18_20
│           ├── 1
│           ├── 2
│           ├── 3
│           ├── 6
│           └── crctd
│               ├── 1
│               ├── 2
│               ├── 3
│               └── 6
└── significance_locs
The results folder holds all Vector Development Index related outputs (Figures 2-5).
- pdi_temporal - Contains the calculated VDI gradients across all lag, temperature correction, administrative scale, and species permutations, all in "csv" format. Also contains all gradient plots, as displayed in Figure 4.
- pdi_summary - Contains the boxplots of cumulative VDI for each species at both administrative levels.
- pdi_null - Contains the null histogram plots across all lag, temperature correction, administrative scale, and species permutations.
- p_values - Contains the extracted p-values of the DUMMY plague cases in comparison to the null plague cases across all lag, temperature correction, administrative scale, and species permutations. Also contains the p-value plots displayed in Figure 5.
The results folder further contains:
- Plague spatial dummy data ("RDS")
"./results/dummy_plague_spatial_14_20.rds"
"./results/dummy_plague_spatial_18_20.rds"
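The leaf folders of the pdi_null tree shown above follow a regular pattern, which can be enumerated programmatically. The sketch below rebuilds those paths from the folder names in the tree; treating the numeric folders (1, 2, 3, 6) as the lag permutations is an assumption based on the descriptions above, and the variable names are hypothetical.

```r
# Enumerate the leaf folders of results/pdi_null as laid out in the tree.
combos <- expand.grid(
  lag        = c("1", "2", "3", "6"),   # assumed to be the lag permutations
  correction = c("", "crctd"),          # "" = uncorrected temperatures
  species    = c("s_fonq", "x_cheop"),
  structure  = c("no_structure", "temporal_structure_maintained"),
  period     = c("14_20", "18_20"),
  stringsAsFactors = FALSE
)

# Corrected outputs sit one level deeper, inside a "crctd" folder.
null_dirs <- with(combos, ifelse(
  correction == "",
  file.path("results", "pdi_null", period, structure, species, lag),
  file.path("results", "pdi_null", period, structure, species, correction, lag)
))

print(length(null_dirs))
print(head(null_dirs, 3))
```

A vector like `null_dirs` can be passed to `dir.exists()` to confirm that an extracted archive is complete.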
Files and variables
File: 01_code_microclimf_4_submission.zip
Description: Script and data for constructing the microclimate model at a depth of -0.15 m across the Madagascan Central Highlands for summer 2023 using microclimf. Includes the R project file (non_interactive_micorclimf.Rproj), microclimate model scripts, and input spatial datasets (soil class raster, DEM, MODIS land cover raster, and ERA5 climate data). Running “microclimf_test_script.R” generates hourly microclimate outputs.
Key contents:
- Soil raster (categorical soil class; GeoTIFF)
- DEM raster (elevation in metres; GeoTIFF)
- Land-cover raster (MODIS-derived categorical classes; GeoTIFF)
- ERA5 climate inputs (NetCDF)
- Microclimate output rasters (hourly below-ground temperature at -0.15 m; GeoTIFF)
File: 02_temp_2_VDI.zip
Description: Script and data for converting hourly microclimate outputs into annual Vector Development Index (VDI) rasters for Synopsyllus fonquerniei and Xenopsylla cheopis. Uses microclimate outputs generated in 01_code_microclimf_4_submission. By default, the script processes a subset of files to demonstrate functionality.
Key contents:
- R project file (PDI_calculation.Rproj)
- Script: 01_From_microclimate_to_PDI.R
- Annual VDI rasters (daily layers stored as GeoTIFF)
File: 03_code_VDI_and_null_calculation_4_submission.zip
Description: Script and data for plotting VDI gradients, summarising VDI patterns across species and time periods, and conducting null analyses comparing observed pre-case VDI gradients to temporally permuted null cases.
Key contents:
- Annual VDI rasters produced in 02_temp_2_VDI
- Dummy human plague case data:
  - ./results/dummy_plague_spatial_14_20.rds
  - ./data/case_data/case_date_14_20.csv
- Scripts:
  - 01_PDI_prep.R
  - 02_PDI_gradient_plot.R
  - 03_PDI_summary.R
  - 04_null_gradient.R
Code/software
All analysis was completed in the R programming environment; packages are specified and loaded within the scripts.
Access information
Other publicly accessible locations of the data:
- The human plague data analysed in this study have been replaced with a dummy version. The original data are curated by the Plague Unit of the Institut Pasteur de Madagascar. These data can be made available upon reasonable request at peste@pasteur.mg.
Data was derived from the following sources:
