Coronavirus prevalence in Brazilian Amazon and Sao Paulo city

Cite this dataset

Salomon, Tassila et al. (2020). Coronavirus prevalence in Brazilian Amazon and Sao Paulo city [Dataset]. Dryad.


SARS-CoV-2 spread rapidly in the Brazilian Amazon. Mortality was elevated, despite the young population, with the health services and cemeteries overwhelmed. The attack rate in this region is an estimate of the final epidemic size in an unmitigated epidemic. Here we show that by June, one month after the epidemic peak in Manaus, capital of the Amazonas state, 44% of the population had detectable IgG antibodies. This equates to a cumulative incidence of 52% after correcting for the false-negative rate of the test. Further correcting for the effect of antibody waning we estimate that the final attack rate was 66%. This is higher than seen in other settings, but lower than the predicted final size for an unmitigated epidemic in a homogeneously mixed population. This discrepancy may be accounted for by population structure as well as some limited physical distancing and non-pharmaceutical measures adopted in the city.


Selection of blood samples for serology testing

Both the FPS and HEMOAM blood centers routinely store residual blood samples for six months after donation. To cover a period starting from the introduction of SARS-CoV-2 in both cities, we retrieved stored samples covering February to May in São Paulo and February to June in Manaus, at which point testing capacity became available. In subsequent months, blood samples were prospectively selected for testing. The monthly target was to test 1,000 samples at each study site. However, because of problems with purchasing the kits, supply chain issues, and the limited period of test validity, some months were under the target and others over it (the latter to avoid wasting kits that were about to expire). We aimed to include donations starting from the second week of each month.

Part of the remit of the wider project is to develop a system to prospectively select blood donation samples, based on the donor's residential address, so as to capture a spatially representative sample of each participating city. For example, FPS receives blood donations from people living across the whole greater metropolitan region of São Paulo. The spatial distribution of donors does not follow the population density, with some areas over-represented and others under-represented. We used residential zip codes (recorded routinely at FPS) to select only individuals living within the city of São Paulo. We then divided the city into 32 regions (subprefeituras) and used their projected population sizes for 2020 to define sampling weights, such that the number of donors selected in any given subprefeitura was proportional to its population size. We piloted this approach in São Paulo and have developed an information system to operationalize this process at the participating center. However, at the time of data collection the system was not implemented in HEMOAM, and therefore it was not possible to use this sampling strategy there. As such, we simply tested consecutive blood donations, beginning from the second week of each month until the target was reached.
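The population-proportional selection described for São Paulo can be illustrated with a minimal sketch. The function name, the largest-remainder rounding rule, and the region populations below are assumptions made for illustration; they are not taken from the study's information system.

```python
def allocate_quota(populations, monthly_target):
    """Split a monthly testing target across regions in proportion to population.

    Integer quotas are rounded with the largest-remainder method (an assumption
    of this sketch) so that they always sum to the target.
    """
    total = sum(populations.values())
    exact = {r: monthly_target * p / total for r, p in populations.items()}
    quota = {r: int(x) for r, x in exact.items()}
    leftover = monthly_target - sum(quota.values())
    # Give the remaining slots to the regions with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda r: exact[r] - quota[r], reverse=True)
    for r in by_remainder[:leftover]:
        quota[r] += 1
    return quota


# Hypothetical populations for three regions (not real subprefeitura data).
example_populations = {"region_a": 500_000, "region_b": 300_000, "region_c": 200_000}
quotas = allocate_quota(example_populations, 1000)
```

With these made-up inputs the target of 1,000 samples splits 500/300/200. In practice the quota for a region can only be met if enough donors from that region donate in the month.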

Quantifying antibody waning and rate of seroreversion

We sought to quantify the rate of decline of the anti-nucleocapsid IgG antibody that is detected by the Abbott CMIA. We tested paired serum samples from our cohort of convalescent plasma donors (described above). We calculated the rate of signal decay as the difference in log2 S/C between the first and second time points divided by the number of days between the two visits. We used simple linear regression to determine the mean slope and 95% CI.
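As a rough illustration of the signal-decay calculation, the sketch below computes the per-donor change in log2 S/C per day and summarizes it. It is a simplification: it averages per-donor slopes and uses a normal-approximation interval rather than the linear regression described above, the sign convention (second visit minus first) is an assumption, and the paired readings are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def decay_slope(sc_first, sc_second, days_between):
    """Change in log2(S/C) per day between two visits (negative means waning)."""
    return (math.log2(sc_second) - math.log2(sc_first)) / days_between


def mean_slope_ci(slopes, level=0.95):
    """Mean slope with a normal-approximation CI (an assumption of this sketch)."""
    m = mean(slopes)
    se = stdev(slopes) / math.sqrt(len(slopes))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return m, m - z * se, m + z * se


# Hypothetical paired readings: (S/C at visit 1, S/C at visit 2, days between).
pairs = [(8.0, 4.0, 50), (6.0, 3.0, 40), (4.0, 2.0, 100)]
slopes = [decay_slope(first, second, days) for first, second, days in pairs]
mean_slope, ci_low, ci_high = mean_slope_ci(slopes)

# On the log2 scale, a slope of -1/h per day corresponds to a half-life of h days.
half_life_days = -1 / mean_slope
```

Working on the log2 scale makes the slope directly interpretable: each unit of decline is one halving of the S/C signal.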

Analysis of seroprevalence data

Using the manufacturer's threshold of 1.4 S/C to define a positive result, we first calculated the monthly crude prevalence of anti-SARS-CoV-2 antibodies as the number of positive samples divided by the total number of samples tested. The 95% confidence intervals (CI) were calculated by the exact binomial method. We then re-weighted the estimates for age and sex to account for the different demographic make-up of blood donors compared to the underlying populations of São Paulo and Manaus (Fig. S4). Because only people aged between 16 and 70 years are eligible to donate blood, the re-weighting was based on the projected populations in the two cities in this age range only. The population projections for 2020 are available from ( We further adjusted these estimates for the sensitivity and specificity of the assay using the Rogan and Gladen method.

As a sensitivity analysis, we took two approaches to account for the effect of seroreversion through time. First, the manufacturer's threshold of 1.4 optimizes specificity but misses many true cases in which the S/C level is in the range of 0.4 – 1.4 (see ref and main text). In addition, individuals with waning antibody levels would be expected to fall initially into this range. Therefore, we present the results using an alternative threshold of 0.4 to define a positive result and adjust for the resultant loss in specificity. Second, we corrected the prevalence with a model-based method assuming that the probability of seroreversion for a given patient decays exponentially with time. In the model-based method, only the months between March and August were considered. The measured prevalence used as input for this method was obtained using the manufacturer's threshold of 1.4, with the correction for test specificity (99.9%) and sensitivity (84%) and the normalization by age and sex both applied. Confidence intervals were calculated through bootstrapping, assuming a beta distribution for the input measured prevalence. It is worth noting that, even though this model is limited by the exponential decay assumption, assuming distributions with more degrees of freedom may lead to overfitting due to the small number of samples of 9[7]. Finally, the fitted values of the two model parameters must be interpreted as parameters of this model, and not as estimates of the actual decay rate and seroreversion probability, as they may absorb the effect of variables that are not taken into account by this model.
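The first two adjustment steps (re-weighting by age and sex, then the Rogan and Gladen correction) can be sketched as follows. The function names and the strata are hypothetical, the clipping to the range 0 to 1 is a choice of this sketch, and the seroreversion model is not reproduced because its exact form is not given here. The sensitivity (84%) and specificity (99.9%) are the values quoted above.

```python
def reweighted_prevalence(strata):
    """Post-stratified prevalence.

    `strata` maps a label to (positives, tested, population_share), where the
    population shares of the eligible age range sum to 1.
    """
    return sum(share * positives / tested
               for positives, tested, share in strata.values())


def rogan_gladen(observed, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence, clipped to [0, 1]."""
    adjusted = (observed + specificity - 1) / (sensitivity + specificity - 1)
    return min(1.0, max(0.0, adjusted))


# Hypothetical age-sex strata: (positives, tested, share of target population).
strata = {
    "women_16_39": (40, 100, 0.30),
    "women_40_70": (30, 100, 0.20),
    "men_16_39": (50, 100, 0.30),
    "men_40_70": (20, 100, 0.20),
}
weighted = reweighted_prevalence(strata)

# An observed prevalence of 44% with the stated assay performance.
corrected = rogan_gladen(0.44, sensitivity=0.84, specificity=0.999)
```

With an observed prevalence of 0.44, the correction gives (0.44 + 0.999 - 1) / (0.84 + 0.999 - 1), roughly 0.52, which is in line with the 44% and 52% figures quoted in the abstract for Manaus in June.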

Infection fatality ratio

We calculated the global infection fatality ratio in Manaus and São Paulo. The total number of infections was estimated as the product of the population size in each city and the antibody prevalence in June (re-weighted and adjusted for sensitivity and specificity). The number of deaths was taken from the SIVEP-Gripe system, and we used both confirmed COVID-19 deaths and deaths due to severe acute respiratory syndrome of unknown cause. The latter category likely represents COVID-19 cases in which access to diagnostic testing was limited, and including it more closely approximates the excess mortality. We calculated age-specific infection fatality ratios by assuming equal prevalence across all age groups.
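The arithmetic behind the infection fatality ratio is simple enough to show directly. The sketch below uses entirely hypothetical population, prevalence, and death counts; only the formula (deaths divided by population times prevalence) and the equal-prevalence assumption for age groups follow the description above.

```python
def infection_fatality_ratio(deaths, population, prevalence):
    """Deaths divided by the estimated number of infections."""
    infections = population * prevalence
    return deaths / infections


# Hypothetical city-level inputs (not figures from the study).
overall_ifr = infection_fatality_ratio(deaths=3_000, population=2_000_000,
                                       prevalence=0.5)

# Age-specific ratios, applying the same prevalence to every age group.
# Each entry is (deaths, population) for a hypothetical age band.
age_groups = {"16_39": (100, 1_000_000), "40_70": (2_900, 1_000_000)}
age_specific_ifr = {
    band: infection_fatality_ratio(deaths, population, prevalence=0.5)
    for band, (deaths, population) in age_groups.items()
}
```

Because prevalence sits in the denominator, any upward correction to prevalence (for test sensitivity or antibody waning) lowers the resulting ratio proportionally.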

Effective reproduction number

We calculated the effective reproduction number for São Paulo and Manaus using the renewal method [9], with the serial interval as estimated by Ferguson (2020) [10]. Calculations were made using daily severe acute respiratory syndrome cases with PCR-confirmed COVID-19 in the SIVEP-Gripe system. Region-specific delays between the PCR result release and the date of symptom onset were accounted for using the technique proposed by Lawless (1994).
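A bare-bones version of a renewal-equation estimate is sketched below to show the idea: each day's incidence is divided by the serial-interval-weighted sum of incidence on preceding days. This is a simplified point estimate under stated assumptions, not the implementation used for the dataset; it applies no smoothing and no reporting-delay correction, and the serial interval weights and case counts are hypothetical.

```python
def effective_r(incidence, serial_interval_pmf):
    """Point estimates R_t = I_t / sum_{s>=1} w_s * I_{t-s}.

    `serial_interval_pmf[s - 1]` is the probability w_s that the serial
    interval is s days. Days without a full history, or with zero infection
    pressure, are returned as None.
    """
    window = len(serial_interval_pmf)
    estimates = []
    for t, cases in enumerate(incidence):
        if t < window:
            estimates.append(None)
            continue
        pressure = sum(w * incidence[t - s]
                       for s, w in enumerate(serial_interval_pmf, start=1))
        estimates.append(cases / pressure if pressure > 0 else None)
    return estimates


# Hypothetical three-day serial interval distribution and daily case counts.
weights = [0.2, 0.5, 0.3]
daily_cases = [10, 10, 10, 10, 20]
r_t = effective_r(daily_cases, weights)
```

With flat incidence the estimate is 1, and the doubling on the last day of this toy series gives an estimate of 2. Real case series are noisy, so estimates of this kind are normally computed over a smoothing window and after adjusting for reporting delays.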


Medical Research Council, Award: MR/S0195/1

São Paulo Research Foundation, Award: 18/14389-0

Itau Unibanco Todos pela Saude
