Aggregate TB diagnosis and treatment data, Uganda 2017 and 2019


White, Elizabeth et al. (2022), Aggregate TB diagnosis and treatment data, Uganda 2017 and 2019, Dryad, Dataset,


To accelerate tuberculosis (TB) control and elimination, reliable data are needed to improve the quality of TB care. Using data from 32 health facilities in 2017 and 2019, we assessed agreement between a surveillance dataset routinely collected for Uganda’s national TB program and a high-fidelity dataset collected from the same source documents for a research study. We compared six measurements: 1) smear-positive and 2) GeneXpert-positive diagnoses, 3) bacteriologically confirmed and 4) clinically diagnosed treatment initiations, and the number of people initiating TB treatment who were also 5) living with HIV or 6) taking antiretroviral therapy. We measured agreement as the average difference between the two methods, expressed as the average ratio of the surveillance counts to the research data counts, its 95% limits of agreement (LOA), and the concordance correlation coefficient (CCC). We used linear mixed models to investigate whether agreement changed over time or was associated with facility characteristics. We found good overall agreement, with some variation in the expected facility-level agreement, for the number of smear-positive diagnoses (average ratio [95% LOA]: 1.04 [0.38-2.82]; CCC: 0.78), bacteriologically confirmed treatment initiations (1.07 [0.67-1.70]; 0.82), and people living with HIV (1.11 [0.51-2.41]; 0.82). Agreement was poor for Xpert positives, with surveillance data undercounting relative to research data (0.45 [0.099-2.07]; 0.36). Although surveillance data overcounted relative to research data for clinically diagnosed treatment initiations (1.52 [0.71-3.26]) and the number of people taking antiretroviral therapy (1.71 [0.71-4.12]), their agreement as assessed by CCC was not poor (0.82 and 0.62, respectively). The average agreement was similar across study years for all six measurements, but facility-level agreement varied from year to year and was not explained by facility characteristics.
In conclusion, the agreement of TB surveillance data with high-fidelity research data was highly variable across measurements and facilities. To advance the use of routine TB data as a quality improvement tool, future research should elucidate and address reasons for variability in its quality.


This dataset contains counts of TB diagnoses and treatment initiations, as well as HIV diagnosis and treatment counts among TB patients, at the health facility level from two data sources: surveillance data and research data.

Surveillance data: In Uganda, all hospitals and health centers that treat TB patients report data on a quarterly basis. A trained staff member at each facility reviews handwritten TB laboratory and treatment registers quarterly and records aggregated counts of diagnosis and treatment data on a standardized paper reporting form within pre-specified strata. Aggregate data are entered into a DHIS2 database at the Ministry of Health. This dataset includes annual, facility-level data on diagnoses and treatment initiations from the quarterly reports in DHIS2.

Research data: In the XPEL-TB study, trained facility staff photographed handwritten TB laboratory and treatment registers, uploaded them to a secure server, entered the data into a patient-level database, and conducted quality assurance to ensure accuracy, including resolving missing data and other discrepancies with health facility staff. To make direct comparisons to the surveillance data, this dataset includes aggregated patient data by year and health center using the same strata reported in the DHIS2 system.
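The aggregation step described above can be sketched as follows. This is only an illustration of collapsing patient-level records into facility-by-year counts; the field names (facility, year, measure) and the toy records are hypothetical and do not reflect the actual XPEL-TB database schema or values.

```python
from collections import Counter


def aggregate_counts(records):
    """Count patient-level records per (facility, year, measure) stratum.

    The keys "facility", "year", and "measure" are hypothetical field names
    chosen for this sketch, not the study's real variable names.
    """
    counts = Counter()
    for rec in records:
        counts[(rec["facility"], rec["year"], rec["measure"])] += 1
    return counts


# Toy records, invented purely for illustration (not study data).
toy_records = [
    {"facility": "A", "year": 2017, "measure": "smear_positive"},
    {"facility": "A", "year": 2017, "measure": "smear_positive"},
    {"facility": "A", "year": 2019, "measure": "xpert_positive"},
    {"facility": "B", "year": 2017, "measure": "smear_positive"},
]

toy_counts = aggregate_counts(toy_records)
for stratum, n in sorted(toy_counts.items()):
    print(stratum, n)
```

Counting within the same strata as the surveillance reports is what makes the two sources directly comparable at the facility-year level.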

Usage Notes

The data are contained in a .csv file that can be opened with most statistical software packages. Our analysis was conducted in R (version 4.1.2), but other programs may be used.
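For readers working outside R, the following Python sketch shows one way the agreement measures named in the abstract (average ratio, 95% limits of agreement, and concordance correlation coefficient) might be computed from paired facility-level counts. It is not the authors' analysis code. It assumes that the ratio and its limits are computed on the log scale, that the CCC is Lin's coefficient computed on log counts, and that pairs containing a zero count are dropped; none of these choices is confirmed by this page. The column names surveillance_count and research_count are hypothetical, so check them against the actual .csv header.

```python
import csv
import math
import statistics


def read_paired_counts(file_obj, surv_col="surveillance_count",
                       res_col="research_count"):
    """Read two count columns from an open CSV file.

    The default column names are hypothetical placeholders.
    """
    surveillance, research = [], []
    for row in csv.DictReader(file_obj):
        surveillance.append(float(row[surv_col]))
        research.append(float(row[res_col]))
    return surveillance, research


def agreement_summary(surveillance, research):
    """Return (average ratio, (lower LOA, upper LOA), CCC) for paired counts.

    Assumption: all three measures are computed on the log scale, and pairs
    with a zero count are skipped because their log ratio is undefined.
    At least two usable pairs are required.
    """
    pairs = [(s, r) for s, r in zip(surveillance, research) if s > 0 and r > 0]
    log_ratios = [math.log(s / r) for s, r in pairs]
    mean_lr = statistics.mean(log_ratios)
    sd_lr = statistics.stdev(log_ratios)

    # Back-transform the mean log ratio and its limits to the ratio scale.
    ratio = math.exp(mean_lr)
    loa = (math.exp(mean_lr - 1.96 * sd_lr), math.exp(mean_lr + 1.96 * sd_lr))

    # Lin's concordance correlation coefficient on log counts.
    x = [math.log(s) for s, _ in pairs]
    y = [math.log(r) for _, r in pairs]
    mx, my = statistics.mean(x), statistics.mean(y)
    n = len(pairs)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    ccc = 2 * sxy / (sxx + syy + (mx - my) ** 2)
    return ratio, loa, ccc
```

A call might look like `with open(path, newline="") as f: s, r = read_paired_counts(f)` followed by `agreement_summary(s, r)`, where `path` points to the downloaded file. An average ratio above 1 indicates that surveillance counts exceed research counts on average, and a ratio below 1 indicates the reverse.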


National Heart, Lung, and Blood Institute