
Data for: Calibration of individual-based models to epidemiological data: a systematic review


Hazelbag, C. Marijn et al. (2020), Data for: Calibration of individual-based models to epidemiological data: a systematic review, v3, Dryad, Dataset,


Calibrating or fitting an individual-based model (IBM) to data is a crucial step in model development. We performed a systematic review to provide an overview of calibration methods used in IBMs modelling infectious disease spread. We included articles if models stored individual-specific information and calibration involved running the model and comparing model output to population-level targets expressed as summary statistics. For each included article, the dataset contains information on model calibration methods, including the parameter-search strategy, the goodness-of-fit measure, acceptance criteria, and stopping rules. The dataset also contains information on contextual variables for model calibration, such as target statistics and parameters.


Two reviewers independently collected information for each included article using a prospectively developed form. Data were collected verbatim and later categorised. The dataset contains this categorised information, accompanied by verbatim information where relevant. Where the reviewers disagreed on a classification, we revisited the article and discussed the disagreement until we reached agreement. As a result, the category recorded for a calibration method may be based on this later review of the article, at which stage no extra verbatim information was collected.

We collected information on the number of calibrated parameters, the number of fixed parameters, and the number of targets. We noted how each count was reported in the article (i.e. the number was explicitly provided, could be deduced from text or figures, was provided incompletely, or was not provided). Users of the data need to take into account a limitation of these counts: they often had to be deduced from the article, a process that is prone to error. We advise against further analyses using these numbers.

Please refer to the associated paper for a complete description of the methodology.

Usage Notes

The dataset is self-contained. Missing values occur in two places. First, in nested variables: a nested variable contains extra information only if the previous column/variable has a certain level, and is missing otherwise. Second, in variables with an "optional" nature.
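When working with the nested variables, it can help to separate values that are missing because the nested variable does not apply from values that are missing even though the previous variable has a level that calls for extra information. The following is a minimal sketch of that check, not part of the dataset documentation. The column names (article_id, parent_var, nested_var), the level name, and the sample rows are all hypothetical placeholders; the actual names and levels must be taken from the dataset itself.

```python
import csv
import io

# Hypothetical extract: these column names, levels, and rows are
# placeholders, not the real contents of the dataset.
SAMPLE = """article_id,parent_var,nested_var
A1,level_with_detail,extra information
A2,other_level,
A3,level_with_detail,
"""

# Assumption: the parent levels for which the nested variable
# is expected to contain extra information.
LEVELS_WITH_NESTED = {"level_with_detail"}


def classify_nested(row, parent="parent_var", nested="nested_var"):
    """Label a nested value as present, not applicable, or missing."""
    value = (row.get(nested) or "").strip()
    if value:
        return "present"
    if row.get(parent) in LEVELS_WITH_NESTED:
        # The parent level calls for extra information, but none was recorded.
        return "missing"
    # The parent level does not trigger the nested variable.
    return "not applicable"


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
labels = {row["article_id"]: classify_nested(row) for row in rows}
```

To use this on the real file, replace the sample text with the opened dataset file and set the column names and LEVELS_WITH_NESTED according to the variables of interest.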