The COVID-19 situation in Brazil is complex due to large differences in the shape and size of regional epidemics. Here we tested monthly blood donation samples for IgG antibodies from March 2020 to March 2021 in eight of Brazil’s most populous cities. The inferred attack rate of SARS-CoV-2 adjusted for seroreversion in December 2020, before the Gamma VOC was dominant, ranged from 19.3% (95% CrI 17.5% – 21.2%) in Curitiba to 75.0% (95% CrI 70.8% – 80.3%) in Manaus. Seroprevalence was consistently lower in women and in donors older than 55 years. The age-specific infection fatality rate (IFR) differed between cities and consistently increased with age. The infection hospitalisation rate (IHR) increased significantly during the Gamma-dominated second wave in Manaus, suggesting increased morbidity of the Gamma VOC compared to previous variants circulating in Manaus. The higher disease penetrance associated with the health system’s collapse increased the overall IFR by a minimum factor of 2.91 (95% CrI 2.43 – 3.53). These results highlight the utility of blood donor serosurveillance to track epidemic maturity and demonstrate demographic and spatial heterogeneity in SARS-CoV-2 spread.


We tested 97,950 blood donation samples for anti-SARS-CoV-2 IgG antibodies using the anti-N Abbott chemiluminescent microparticle immunoassay (CMIA). Tests were performed from March 2020 to March 2021 in eight Brazilian capitals: São Paulo, Manaus, Belo Horizonte, Curitiba, Fortaleza, Recife, Rio de Janeiro and Salvador.

We also tested blood samples from convalescent plasma donors to estimate the sensitivity of the assay. To estimate test specificity, we tested blood donation samples from São Paulo collected in February 2020, before the beginning of the SARS-CoV-2 epidemic in Brazil. To estimate the time-to-seroreversion distribution (used to correct for antibody waning), we also tested samples from repeat blood donors.
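To illustrate how sensitivity and specificity estimates feed into a prevalence correction, here is a minimal sketch of the classical Rogan-Gladen estimator. This is a textbook adjustment, not the model used in the manuscript, and it does not account for seroreversion. The function name and the numeric values in the comments are hypothetical and chosen only for illustration.

```python
def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Adjust an apparent (raw) seroprevalence for imperfect test accuracy.

    All three arguments are proportions between 0 and 1.
    This is the classical Rogan-Gladen estimator, shown for illustration
    only; it ignores antibody waning (seroreversion).
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    adjusted = (apparent_prevalence + specificity - 1.0) / denominator
    # Clamp to the valid range, since sampling noise can push the
    # estimate slightly below 0 or above 1.
    return min(1.0, max(0.0, adjusted))


# Hypothetical inputs (not taken from the datasets): 20% of samples test
# positive, with an assumed sensitivity of 0.90 and specificity of 1.00.
example = rogan_gladen(0.20, 0.90, 1.00)
```

With a perfect specificity the adjustment reduces to dividing the apparent prevalence by the sensitivity, which is why a less sensitive assay leads to a larger upward correction.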

Please see the "Methods" section of the manuscript for more detailed information on this dataset.

Usage Notes

This repository contains four datasets:

1) Bloodbank.csv: The longitudinal cohort containing the tested blood samples used to estimate the seroprevalence in the eight cities.
2) repeat_blood_donors.csv: The cohort of repeat blood donors used to estimate the probability distribution of the time-to-seroreversion.
3) convalescent_plasma_longitudinal_roche.csv: Convalescent plasma donors used to estimate the sensitivity of the assay.
4) prepandemic_cohort.csv: The pre-pandemic blood donor cohort, containing samples tested in February 2020 in São Paulo.

In all files, each row represents a tested blood sample. Information such as exact age, education level and declared race was removed to ensure that the data are anonymized. For the same reason, dates of sample collection were replaced with the corresponding week numbers, and, in the convalescent plasma donors dataset, the date of symptom onset was replaced with the time interval between the date of onset and the date of sample collection.

See data_dictionary.pdf for the data dictionary.
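As a starting point for working with the files, the following is a minimal sketch that lists the column names and the number of tested samples (rows) in each dataset. It uses only the Python standard library. The comma delimiter, the UTF-8 encoding and the assumption that all files sit in one directory are guesses, and the helper names are hypothetical; column names are read from the files rather than assumed, so consult data_dictionary.pdf for their meaning.

```python
import csv
from pathlib import Path

# File names as listed in this README.
DATASETS = [
    "Bloodbank.csv",
    "repeat_blood_donors.csv",
    "convalescent_plasma_longitudinal_roche.csv",
    "prepandemic_cohort.csv",
]


def summarize(path):
    """Return (column names, number of non-empty data rows) for one CSV file."""
    # Assumption: comma-delimited, UTF-8 encoded, first row is the header.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        n_samples = sum(1 for row in reader if row)
    return header, n_samples


def summarize_all(data_dir="."):
    """Summarize each listed dataset found in data_dir; missing files are skipped."""
    summaries = {}
    for name in DATASETS:
        path = Path(data_dir) / name
        if path.is_file():
            summaries[name] = summarize(path)
    return summaries
```

Calling summarize_all() from the directory holding the files returns a dictionary keyed by file name. Because each row is one tested sample, the row count is the number of samples in that dataset.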


