Data and code from: Investigating the Yanomami malaria outbreak: Gold mining and malaria
Data files
Nov 03, 2025 version files (171.46 KB total)
- data_panel_2023.csv (120.86 KB)
- GEE_Code_.rtf (13.17 KB)
- README.md (2.47 KB)
- Yanomami_Code_Clean.R (34.96 KB)
Abstract
The Yanomami, an Indigenous group from the Amazon, confront multifaceted challenges endangering their health and cultural integrity. Of immediate concern is the humanitarian crisis caused by surges in malaria amid increasing illegal gold mining in their territory. Leveraging satellite imagery and panel regression analyses, we quantified the effect of land use changes on malaria incidence on their land (2016-2023). We observed a ~300 % increase in malaria cases during this period, associated with increases in illegal gold mining. An increase of one standard deviation in gold mining is associated with a 20-46 % rise in malaria incidence one to two years later. We found that changes in forest areas significantly affect malaria rates: for every one standard deviation increase in the perimeter of forest edges, malaria cases rise by 55 %. Our findings highlight the major impact of illegal gold mining and the resulting fragmentation of forests on the high malaria burden experienced by the Yanomami.
Dataset DOI: 10.5061/dryad.0p2ngf2dc
Description of the data and file structure
In December 2023, we obtained annual data on malaria cases in Brazil between 2003 and 2023 from the Brazilian Ministry of Health database (SIVEP-Malaria, Malaria em áreas indígenas table). Malaria data in the Yanomami territory is reported at the polo base level (i.e., health district subdivisions) and typically associated with a coordinate point indicating the likely site of infection (hereafter, ‘infection sites’). These infection sites were used as spatial units in our analyses. We selected the variables: year, infection site (i.e., coordinate point location associated with diagnosed malaria cases), parasite species, population, and polo base.
Forest and mining cover data were obtained through the Google Earth Engine platform using the MapBiomas Brazil 9.0 database. Climate data (i.e., annual temperature and precipitation) were obtained from the Climatic Research Unit (CRU) database and are not included in this deposit because they cannot be redistributed under the terms of the Creative Commons Zero (CC0) waiver. The attached file "GEE_Code_.rtf" contains all code required to reproduce our spatial analyses.
Statistical analyses were performed in R version 4.3 using panel regression models. The attached file "Yanomami_Code_Clean.R" contains all code required to reproduce our statistical analyses. The data are provided in CSV format.
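The abstract reports effects per one standard deviation of a predictor, one to two years later. The following is a minimal Python sketch of how such predictors could be prepared from the panel. It is not taken from "Yanomami_Code_Clean.R"; the helper names `zscore` and `lag_by_site` and the toy values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def zscore(values):
    """Rescale values to mean 0 and sample standard deviation 1 (assumed convention)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def lag_by_site(rows, column, k):
    """For each row, return the value of `column` k years earlier at the same site.

    Returns None where the site has no record for that earlier year.
    """
    lookup = {(r["ID"], r["year"]): r[column] for r in rows}
    return [lookup.get((r["ID"], r["year"] - k)) for r in rows]


# Toy rows with made-up numbers, for illustration only.
rows = [
    {"ID": 1, "year": 2016, "mean_mining": 0.0},
    {"ID": 1, "year": 2017, "mean_mining": 1.0},
    {"ID": 1, "year": 2018, "mean_mining": 2.0},
]

# Standardise mining cover, then attach it to each row.
mining_z = zscore([r["mean_mining"] for r in rows])
for row, z in zip(rows, mining_z):
    row["mining_z"] = z

# One-year lag of the standardised predictor, matched within site.
mining_z_lag1 = lag_by_site(rows, "mining_z", 1)
```

Matching on the (ID, year) pair rather than shifting by row position keeps the lag correct even if a site is missing a year.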
Files and variables
File: data_panel_2023.csv
Description:
Variables
- ID: infection site ID
- polobase: polo base (Indigenous health subunit)
- lat: latitude of the infection site
- lon: longitude of the infection site
- year: year of observation
- row_code: number associated with each infection site
- mean_forest: mean percentage of forest cover
- mean_mining: mean percentage of mining cover
- deforestation: annual change in forest cover percentage
- mining_changes: annual change in mining cover percentage
- dist_edge: length of forest edge perimeter
- n_malaria: number of malaria cases
- n_falciparum: number of P. falciparum cases
- pop: population
NA values appear for deforestation and mining_changes in 2003 because these variables represent year-to-year changes, and 2003 was the first year of data collection, leaving no prior year for comparison.
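The origin of these NA values can be illustrated with a short Python sketch. It assumes the change is computed as the current year minus the previous year within each infection site; the sign convention, the helper name `annual_change`, and the toy values are assumptions, not taken from the deposited code.

```python
def annual_change(rows, column):
    """Year-to-year difference in `column` within each site.

    Returns None (the NA case) when the site has no record for the previous year,
    which is always true for the first year of the series.
    """
    lookup = {(r["ID"], r["year"]): r[column] for r in rows}
    changes = []
    for r in rows:
        previous = lookup.get((r["ID"], r["year"] - 1))
        changes.append(None if previous is None else r[column] - previous)
    return changes


# Toy rows with made-up forest cover percentages, for illustration only.
rows = [
    {"ID": 1, "year": 2003, "mean_forest": 90.0},
    {"ID": 1, "year": 2004, "mean_forest": 88.5},
    {"ID": 1, "year": 2005, "mean_forest": 88.0},
]

forest_change = annual_change(rows, "mean_forest")
```

Here the 2003 entry of `forest_change` is None because no 2002 record exists to compare against.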
Data
In December 2023, we obtained annual data on malaria cases in Brazil between January 2003 and September 2023 from the Brazilian Ministry of Health database provided by the Special Secretariat for Indigenous Health (SIVEP-SESAI Malaria, Malaria em áreas indígenas table) and filtered the data to include only cases reported at the Yanomami Special Indigenous Sanitary District. Malaria data in the Yanomami territory are reported at the polo base level (i.e., health district subdivisions) and are typically associated with a coordinate point indicating the likely site of infection (hereafter, ‘infection sites’). We selected the variables: year, infection site (i.e., coordinate point location associated with diagnosed malaria cases), parasite species, population, and polo base. In total, 28 polo bases reported malaria in 64 distinct infection sites. We filtered the malaria data (~480,000 malaria records) to include only cases among the Yanomami people (~120,000 malaria records).
The population size at each polo base (i.e., Indigenous health division), available from 2010 to 2023, was used to calculate malaria incidence. Because the polo base population serves as the denominator for every infection site within it, this approach might deflate incidence in polo bases containing multiple infection sites. It nevertheless allows assessment of changes in incidence over time while maintaining the highest resolution on focal land use change, which is the primary aim of this research. For the years before 2010, population estimates from 2010 were used to calculate incidence.
Forest and mining cover data were obtained through the Google Earth Engine platform using the MapBiomas Brazil 9.0 database (https://brasil.mapbiomas.org). Climate data (i.e., annual mean temperature and total precipitation) were obtained from the Climatic Research Unit (CRU) database.
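The incidence calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the toy population figures, and the scaling to cases per 1,000 people are all hypothetical, since the text does not state the multiplier used.

```python
def population_for_year(pop_by_year, year, first_year=2010):
    """Return the polo base population for `year`.

    Years before `first_year` fall back to the `first_year` estimate,
    mirroring the rule described in the text for years before 2010.
    """
    return pop_by_year[max(year, first_year)]


def incidence(cases, population, per=1000):
    """Cases per `per` people (the per-1,000 scaling is an assumption)."""
    return per * cases / population


# Made-up population estimates for one polo base, for illustration only.
pop_by_year = {2010: 500, 2011: 520}

pop_2005 = population_for_year(pop_by_year, 2005)  # falls back to the 2010 estimate
inc_2005 = incidence(cases=50, population=pop_2005)
```

Because the same polo base population is used for every infection site in that polo base, site-level incidence computed this way is a lower bound when a polo base contains several sites.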
