Dryad

A machine learning approach to integrating genetic and ecological data in tsetse flies (Glossina pallidipes) for spatially explicit vector control planning

Abstract

Introduction - Control of vector populations is an effective strategy for reducing vector-borne disease transmission, but it requires knowledge of habitat use and connectivity. Our goal was to improve this knowledge for the tsetse species Glossina pallidipes, a vector of animal African trypanosomiasis, a wasting disease of livestock that represents a serious socioeconomic burden across sub-Saharan Africa.

Methods and Results - We used random forest regression to (i) build and integrate models of G. pallidipes habitat suitability and genetic connectivity across Kenya and northern Tanzania, and (ii) provide novel vector control recommendations. Inputs for the models included field-survey records from 349 trap locations, genetic data from 11 microsatellite loci from 659 flies across 29 sampling sites, and remotely sensed environmental data. The suitability and connectivity models explained approximately 80% and 67% of the variance in the occurrence and genetic data, respectively, and exhibited high accuracy based on cross-validation. A bivariate map of the two model outputs showed that suitability and connectivity vary independently across the landscape, which informs vector control recommendations. Post-hoc analyses showed spatial variation in the correlations between the most important environmental predictors from our models and each response variable (i.e., suitability and connectivity), as well as heterogeneity in the expected future climatic change of these predictors.

Discussion - The bivariate map suggests that vector control is most likely to be successful in the Lake Victoria basin, and supports the previous recommendation that most of eastern Kenya should be managed as a single unit. We further recommend that future monitoring efforts focus on tracking potential changes in vector presence and dispersal around the Serengeti and the Lake Victoria basin, based on projected local climatic shifts. The strong performance of the spatial models suggests that our integrative methodology could be used to understand future impacts of climate change in this and other vector systems.
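To illustrate how a bivariate map can combine two model outputs, the following is a minimal sketch rather than the authors' procedure. It assumes that suitability and connectivity have each been rescaled to [0, 1] and that each is split into three equal-width classes; the break points, the class labels, the function names, and the example grid cells are all hypothetical.

```python
from bisect import bisect_right


def bin_index(value, breaks):
    """Return the index of the bin that `value` falls into (0 = lowest bin)."""
    return bisect_right(breaks, value)


def bivariate_class(suitability, connectivity, breaks=(1 / 3, 2 / 3)):
    """Combine two scores in [0, 1] into one bivariate class label.

    The equal-width breaks are an assumption made for illustration only.
    """
    labels = ("low", "medium", "high")
    s = labels[bin_index(suitability, breaks)]
    c = labels[bin_index(connectivity, breaks)]
    return f"{s} suitability / {c} connectivity"


# Hypothetical grid cells: (suitability, connectivity), both scaled to [0, 1].
cells = [(0.9, 0.8), (0.9, 0.1), (0.2, 0.7), (0.5, 0.5)]
classes = [bivariate_class(s, c) for s, c in cells]

for cell, label in zip(cells, classes):
    print(cell, "->", label)
```

Because the two scores are classified separately before being combined, a cell can fall into any of the nine combinations, which is what allows such a map to show the two quantities varying independently.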