Generalising an outbreak cluster detection method for two groups: An application to rabies
Data files
Oct 30, 2025 version files (33.27 KB)
- case_dists.csv (19.11 KB)
- case_sp_vect.csv (3.30 KB)
- case_time_diffs.csv (2.92 KB)
- dates_uncertainty.csv (1.71 KB)
- distance_uncertainty.csv (4.22 KB)
- README.md (2.02 KB)
Abstract
Identifying linked cases of an infectious disease can improve our understanding of its epidemiology by distinguishing sustained local transmission from frequent introductions with little onward transmission. This evidence can, in turn, inform decisions on the most appropriate interventions. Identifying linked cases depends on knowledge of the relevant epidemiological distributions and reporting probabilities. However, for multi-host pathogens, quantitative differences between hosts may need to be considered, and existing methods do not incorporate them.
In this study, an existing graph-based approach to detecting outbreak clusters was extended to allow for group-specific reporting probabilities and epidemiological distributions, and to assess the level and importance of assortative mixing. This method was applied to data on probable animal rabies cases in south-east Tanzania, where wildlife comprised over 40% of detected animal rabies cases.
Group-specific differences (in reporting probabilities and epidemiological distributions) and the level of assortative mixing had a marked impact on the size and composition of identified clusters. The scenario most compatible with the data involved higher reporting probabilities for cases in domestic animals than in wildlife, no difference in mean transmission distance between domestic animals and wildlife, and substantial assortative mixing with frequent inter-species transmission.
The method described here could be applied to other multi-host systems, or to single-host systems with multiple groups (such as age classes), in which reporting probabilities, distributional parameters and/or levels of mixing differ between groups. This would allow more accurate characterisation of transmission dynamics, which would in turn facilitate the implementation of more effective interventions.
https://doi.org/10.5061/dryad.931zcrjvq
Description of the data and file structure
This repository provides the data used within the paper entitled 'Generalising an outbreak cluster detection method for multiple groups: An application to rabies'. The data are used within the section entitled 'Application to rabies data'.
The data reflect probable animal rabies cases that occurred between January 2011 and July 2019 within the 13 districts of Lindi and Mtwara regions of south-east Tanzania.
Each file has the same number of rows and each row represents the same rabies case in all files.
During data collection, there was some uncertainty around the dates and locations of the animal rabies cases. This uncertainty is recorded in the files dates_uncertainty.csv and distance_uncertainty.csv.
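Because the files share the same number of rows and the same case order, they can be combined by row position rather than by an ID column. The following Python sketch illustrates one way to do this and to check that the row counts agree. It is not part of the published code: the helper names `read_rows` and `align_cases` are hypothetical, the values are made up, and it assumes each CSV has a header row. Only the column names UTM.Easting, UTM.Northing, lower_uncert and upper_uncert come from this README.

```python
import csv
import io


def read_rows(handle):
    """Read an open CSV file object into a list of dicts, one per row."""
    return list(csv.DictReader(handle))


def align_cases(tables):
    """Merge row-aligned tables into one record per case.

    `tables` maps a short label to a list of row dicts. Row i of every
    table is taken to describe the same case, so all tables must have
    the same number of rows.
    """
    lengths = {name: len(rows) for name, rows in tables.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Row counts differ between files: {lengths}")
    n_cases = next(iter(lengths.values()), 0)
    cases = []
    for i in range(n_cases):
        record = {"case_index": i}
        for name, rows in tables.items():
            for column, value in rows[i].items():
                # Prefix each column with its table label to avoid clashes.
                record[f"{name}.{column}"] = value
        cases.append(record)
    return cases


# Made-up stand-ins for two of the files (illustrative values only).
# With the real data, pass open("case_dists.csv", newline="") and so on.
dists = io.StringIO("UTM.Easting,UTM.Northing\n500000,8900000\n500300,8900400\n")
uncert = io.StringIO("lower_uncert,upper_uncert\n0,100\n50,500\n")

cases = align_cases({"dists": read_rows(dists), "uncert": read_rows(uncert)})
```

Values are kept as strings here; convert them to numbers as needed once the actual column contents have been inspected.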
Files and variables
File: case_dists.csv
Description: Location data for each of the rabies cases
Variables
- UTM.Easting
- UTM.Northing
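UTM eastings and northings are projected coordinates in metres, so the straight-line distance between two cases can be computed directly with the Euclidean formula. The sketch below shows this for a handful of made-up coordinates. It is illustrative only and is not taken from the published code: the function name `pairwise_distances` is hypothetical, and the calculation assumes all cases are expressed in the same UTM zone.

```python
import math


def pairwise_distances(eastings, northings):
    """Return an n x n symmetric matrix of straight-line distances in metres."""
    if len(eastings) != len(northings):
        raise ValueError("eastings and northings must have the same length")
    n = len(eastings)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(eastings[i] - eastings[j], northings[i] - northings[j])
            dist[i][j] = d
            dist[j][i] = d
    return dist


# Made-up coordinates (not from the dataset), assumed to share one UTM zone.
eastings = [500000.0, 500300.0, 503000.0]
northings = [8900000.0, 8900400.0, 8904000.0]

dist = pairwise_distances(eastings, northings)
```

The nested loop is quadratic in the number of cases, which is unlikely to matter for files of this size.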
File: case_sp_vect.csv
Description: Data on the species group for each of the rabies cases
File: case_time_diffs.csv
Description: Day on which each rabies case occurred, expressed as the number of days since the start of the study period
File: dates_uncertainty.csv
Description: Uncertainty (in days) around the dates of the rabies cases
File: distance_uncertainty.csv
Description: Uncertainty around the location data for the rabies cases (in m)
Variables
- lower_uncert
- upper_uncert
Code/software
https://github.com/sarahhayes/vimesMulti_for_paper
https://doi.org/10.5281/zenodo.17478478
Access information
Other publicly accessible locations of the data:
