Predicting current and future Aedes vexans occurrence in the Netherlands

Dellar, Martha; Streng, Kiki; van Bodegom, Peter; Ibáñez-Justicia, Adolfo

Published Sep 24, 2024 on Dryad. https://doi.org/10.5061/dryad.tb2rbp08d

Data files

Sep 24, 2024 version files 17.68 MB

correlations.csv: 2.87 KB
futureLandUse.zip: 53.70 KB
h2oModels.zip: 7.16 MB
rasterTemplate.asc: 802.07 KB
README.md: 4.70 KB
results.zip: 6.35 MB
SSP1.zip: 1.65 MB
SSP5.zip: 1.66 MB

Abstract

We have created predictions of the current and future occurrence of Aedes vexans (Meigen, 1830) mosquitoes in the Netherlands. Aedes vexans can transmit many different pathogens, including western and eastern equine encephalitis viruses, Tahyna virus, West Nile virus and Rift Valley fever virus. However, a lack of occurrence maps, especially at a local scale, has hampered accurate disease modelling. We used extensive occurrence data collected by the Netherlands Centre for Monitoring of Vectors to train models using AutoML. We made future predictions for 2050 using a combination of climate scenarios (from the Dutch Meteorological Organisation) and socio-economic scenarios (the Dutch One Health SSPs). We made predictions for individual days rather than for the season as a whole, allowing us to consider future changes in seasonal dynamics. This is the first time a seasonal model for Aedes vexans has been developed and the first time future predictions have been made for this species at a national scale.

We provide the h2o models and the data to make future predictions, as well as the code used to generate these models, evaluate model uncertainty and make predictions.

Description of the data and file structure

Mosquito data

See supplementary materials on Zenodo.

Models

The h2o models are available in h2oModels.zip. Within this folder are 4 sub-folders: ratio1, ratio2, ratio5 and ratio10. These names refer to the presence:absence ratio in the training data used to generate the models. Within each of these sub-folders are 10 more sub-folders, each containing an h2o model file. These models can only be read using h2o version 3.42.0.2.
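To check that the archive has been unpacked as described, the folder layout can be walked with a few lines of code. The following is a minimal Python sketch, not part of the deposited R scripts; the root path `h2oModels` and the function name `list_model_files` are assumptions made for illustration, and the names of the inner sub-folders and model files are not assumed at all.

```python
from pathlib import Path

# Ratio sub-folder names as listed in this README.
RATIOS = ("ratio1", "ratio2", "ratio5", "ratio10")


def list_model_files(root):
    """Map each ratio sub-folder to the sorted files found anywhere beneath it.

    A missing ratio sub-folder maps to an empty list rather than raising.
    """
    root = Path(root)
    found = {}
    for ratio in RATIOS:
        ratio_dir = root / ratio
        if ratio_dir.is_dir():
            found[ratio] = sorted(p for p in ratio_dir.rglob("*") if p.is_file())
        else:
            found[ratio] = []
    return found


if __name__ == "__main__":
    # "h2oModels" is a hypothetical location for the unzipped archive.
    for ratio, files in list_model_files("h2oModels").items():
        print(ratio, len(files), "file(s)")
```

According to the description above, each ratio should report 10 model files once the archive is fully unpacked.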

The most effective ratio was found to be 1:2; the models for this ratio are found in the ‘ratio2’ sub-folder. Our final model was the ensemble of the 10 models contained in this sub-folder.
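As an illustration of what an ensemble of 10 models can mean in practice, the sketch below averages per-model occurrence probabilities cell by cell. This README does not state how the ensemble members were combined, so the unweighted mean is an assumption, as are the function name `ensemble_mean` and the input layout (one list of probabilities per model). It is written in Python for brevity and is not taken from the deposited R scripts.

```python
from statistics import fmean


def ensemble_mean(per_model_probs):
    """Average occurrence probabilities across models, one value per cell.

    per_model_probs: list of equal-length lists, one list per model.
    ASSUMPTION: the ensemble is an unweighted mean of its members.
    """
    if not per_model_probs:
        raise ValueError("need predictions from at least one model")
    n_cells = len(per_model_probs[0])
    if any(len(probs) != n_cells for probs in per_model_probs):
        raise ValueError("all models must predict the same set of cells")
    # zip(*...) regroups the values so each tuple holds one cell's predictions.
    return [fmean(cell_values) for cell_values in zip(*per_model_probs)]


if __name__ == "__main__":
    # Three hypothetical models predicting two grid cells.
    print(ensemble_mean([[0.2, 0.8], [0.4, 0.6], [0.6, 0.7]]))
```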

Future data

Raster files containing data for the future (2050) non-climate predictors are found in SSP1.zip and SSP5.zip, one archive per socio-economic scenario. These files are used by the futureScenario.R script listed below. Each archive contains the following predictors:

agriAreas.asc: Percentage of agricultural land cover in 1km gridsquare
artificial.asc: Percentage of artificial land cover in 1km gridsquare
bulk_density.asc: Soil bulk density (tonnes per cubic metre)
clay.asc: Soil clay content (percentage)
distanceNature.asc: Distance to the nearest nature area in metres. Nature areas are Natura2000 areas, national parks and Natuurnetwerk Nederland areas
floodrisk.asc: Flood risk, calculated as 0.1 × (max depth of ‘1 in 10 year’ flood event) + 0.01 × (max depth of ‘1 in 100 year’ flood event) + 0.001 × (max depth of ‘1 in 1000 year’ flood event) + 0.00001 × (max depth of ‘1 in 100,000 year’ flood event)
permWater.asc: Percentage of permanent water in 1km gridsquare
permWet.asc: Percentage of permanent wetland in 1km gridsquare
shrubs.asc: Percentage shrub cover, includes shrubs between 1m and 2.5m tall, in 1km gridsquare
surfaceSalinity.asc: Chloride concentration in surface water (mg/l)
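The flood risk formula given for floodrisk.asc is a weighted sum in which each weight equals the reciprocal of the flood's return period (1/10, 1/100, 1/1000, 1/100,000). A minimal Python sketch of that arithmetic follows; the function name `flood_risk` and the dictionary input are illustrative assumptions, and because the README does not state the depth units, the example depths are hypothetical.

```python
# Weights from the README formula, keyed by return period in years.
# Each weight is the reciprocal of its return period.
FLOOD_WEIGHTS = {10: 0.1, 100: 0.01, 1000: 0.001, 100_000: 0.00001}


def flood_risk(max_depth_by_return_period):
    """Weighted sum of maximum flood depths for one grid cell.

    max_depth_by_return_period: dict mapping return period (years) to the
    maximum depth of that flood event (units as in the source data).
    """
    missing = set(FLOOD_WEIGHTS) - set(max_depth_by_return_period)
    if missing:
        raise KeyError(f"missing return periods: {sorted(missing)}")
    return sum(
        weight * max_depth_by_return_period[period]
        for period, weight in FLOOD_WEIGHTS.items()
    )


if __name__ == "__main__":
    # Hypothetical depths for a single cell.
    print(flood_risk({10: 0.0, 100: 0.5, 1000: 1.2, 100_000: 3.0}))
```

With these weights, frequent shallow floods contribute far more to the index than rare deep ones.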

The data for the climate predictors can be downloaded from https://klimaatscenarios-data.knmi.nl/downloads.

In addition, the future land use maps which were used to create the future non-climate predictors are contained in futureLandUse.zip. These are raster files with land uses coded as follows: 1 - urban, 2 - pasture, 3 - crops, 4 - forest, 5 - non-forest nature.
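The land use codes above can be turned into per-class percentages, which is the general form of predictors such as agriAreas.asc and artificial.asc. The Python sketch below shows one way to do this for a flat list of cell codes. It is not the procedure used to build the deposited predictors: the function name `land_use_shares` and the NODATA value of -9999 are assumptions.

```python
from collections import Counter

# Land use codes as documented for futureLandUse.zip.
LAND_USE = {1: "urban", 2: "pasture", 3: "crops", 4: "forest", 5: "non-forest nature"}


def land_use_shares(codes, nodata=-9999):
    """Percentage of valid cells falling in each land use class.

    ASSUMPTION: cells equal to `nodata` are excluded from the total.
    """
    valid = [code for code in codes if code != nodata]
    unknown = set(valid) - set(LAND_USE)
    if unknown:
        raise ValueError(f"unexpected land use codes: {sorted(unknown)}")
    counts = Counter(valid)
    total = len(valid)
    return {
        name: (100.0 * counts[code] / total if total else 0.0)
        for code, name in LAND_USE.items()
    }


if __name__ == "__main__":
    # Hypothetical block of cells: two urban, one pasture, one forest, one NODATA.
    print(land_use_shares([1, 1, 2, 4, -9999]))
```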

Results

The results.zip file contains the following raster files:

the predicted mean, minimum and maximum occurrence probability for each scenario
the standard deviation of the occurrence probability predictions across the 30-year period for each scenario
the 95% confidence interval of the model predictions made over the 2022 mosquito season, given both as absolute uncertainty and as uncertainty expressed as a percentage of occurrence probability
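To make the two uncertainty measures concrete, the Python sketch below derives a 95% interval from repeated predictions for one grid cell and expresses its width both absolutely and as a percentage. The exact definitions used for the deposited rasters are in the accompanying paper and uncertainty.R, not in this README, so everything here is an assumption: a percentile interval (2.5th to 97.5th percentile), absolute uncertainty taken as the interval width, and percentage uncertainty taken as that width relative to the mean prediction.

```python
from statistics import fmean, quantiles


def uncertainty_summary(bootstrap_probs):
    """Summarise the spread of repeated predictions for one grid cell.

    bootstrap_probs: occurrence probabilities, one per bootstrap replicate.
    ASSUMPTIONS: percentile interval; absolute uncertainty = interval width;
    percentage uncertainty = width / mean prediction * 100.
    """
    # n=40 yields cut points every 2.5%; the first and last bound the 95% interval.
    cuts = quantiles(bootstrap_probs, n=40, method="inclusive")
    lower, upper = cuts[0], cuts[-1]
    width = upper - lower
    mean = fmean(bootstrap_probs)
    percent = 100.0 * width / mean if mean > 0 else float("nan")
    return {"lower": lower, "upper": upper, "absolute": width, "percent": percent}


if __name__ == "__main__":
    # Hypothetical replicate predictions spread evenly between 0 and 1.
    print(uncertainty_summary([i / 100 for i in range(101)]))
```

Under these definitions, a cell with a low mean occurrence probability can show a small absolute uncertainty but a large percentage uncertainty, which is one reason to look at both.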

Other

The correlations.csv file contains the correlations between the different predictor variables we considered for this study.

The Data_summary.docx file provides figures summarising the Aedes vexans data used in this study.

The figSM1uncertainty**process.png file shows the process used for calculating the model uncertainty.

We also provide the file rasterTemplate.asc. This is a blank raster file showing the grid we have used for all our work in this study. This is a 1 km grid with CRS EPSG:28992.
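Files with the .asc extension are normally ESRI ASCII grids, which begin with a short header of key/value lines (ncols, nrows, xllcorner, yllcorner, cellsize and, optionally, NODATA_value). The Python sketch below parses such a header and computes the centre of a cell. It assumes the template uses the corner-based keys (xllcorner and yllcorner rather than xllcenter and yllcenter); the function names and the example header values are hypothetical and are not read from rasterTemplate.asc.

```python
def read_ascii_grid_header(lines):
    """Parse the leading key/value lines of an ESRI ASCII grid."""
    header = {}
    for line in lines:
        parts = line.split()
        # Header lines are "<name> <value>"; stop at the first data row.
        if len(parts) != 2 or not parts[0][0].isalpha():
            break
        header[parts[0].lower()] = float(parts[1])
    for key in ("ncols", "nrows"):
        header[key] = int(header[key])
    return header


def cell_centre(header, row, col):
    """Centre (x, y) of a cell; row 0 is the top row of the file.

    ASSUMPTION: the header gives the lower-left corner, not the cell centre.
    """
    size = header["cellsize"]
    x = header["xllcorner"] + (col + 0.5) * size
    y = header["yllcorner"] + (header["nrows"] - row - 0.5) * size
    return x, y


if __name__ == "__main__":
    # Hypothetical header for a tiny 1 km grid.
    example = [
        "ncols 3", "nrows 2", "xllcorner 0", "yllcorner 300000",
        "cellsize 1000", "NODATA_value -9999", "1 2 3", "4 5 6",
    ]
    hdr = read_ascii_grid_header(example)
    print(hdr["ncols"], hdr["nrows"], cell_centre(hdr, 0, 0))
```

For real use, replace the example list with the lines of an opened file, for instance `open("rasterTemplate.asc")`.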

Code/Software

All coding was performed in R v4.3.1. We used the h2o package v3.42.0.2.

Available scripts:

modelTraining.R
- This runs the AutoML process for all the training datasets, makes predictions based on the model comparison 2 dataset (see accompanying paper for details) and records the variable importance for each individual model
modelSelection.R
- This calculates which presence:absence ratio is optimal
modelValidation.R
- This compares the final model with the validation dataset, including calculating a confusion matrix and the balanced accuracy
variableImportance.R
- Calculates the variable importance for the model ensemble based on the training data with presence:absence ratio 1:2 and makes a graph
uncertainty.R
- Finds the uncertainty in the model derivation process using bootstrap sampling. This also plots the results.
futureScenario.R
- Makes future occurrence predictions for a given scenario (in this case the scenario RCP2.6/SSP1 - dry, but it is easily adapted to other scenarios)
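For readers unfamiliar with the metric reported by modelValidation.R, balanced accuracy is the mean of sensitivity (true positive rate) and specificity (true negative rate), both of which come from the confusion matrix. The Python sketch below shows the calculation; it is not a translation of the R script, the function names are illustrative, and the 0.5 classification threshold is an assumption rather than the threshold used in the study.

```python
def confusion_counts(observed, predicted_prob, threshold=0.5):
    """Count true/false positives and negatives for binary presence data.

    observed: iterable of 1 (presence) or 0 (absence).
    ASSUMPTION: a probability >= threshold is classed as presence.
    """
    tp = fn = tn = fp = 0
    for obs, prob in zip(observed, predicted_prob):
        predicted = 1 if prob >= threshold else 0
        if obs == 1 and predicted == 1:
            tp += 1
        elif obs == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp, fn, tn, fp


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    # Hypothetical validation data: 2 presences and 4 absences.
    counts = confusion_counts([1, 1, 0, 0, 0, 0], [0.9, 0.2, 0.1, 0.3, 0.7, 0.4])
    print(counts, balanced_accuracy(*counts))
```

Balanced accuracy is a common choice when presences and absences are unequal in number, because plain accuracy would be dominated by the larger class.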
