Evidence for a role of Anopheles stephensi in the spread of drug- and diagnosis-resistant malaria in Africa
Data files
Mar 17, 2024 version files 293.15 KB
- dd_cc_individual_data.csv
- HH_level_entomology_data.csv
- Mosquito_data_field_lab.csv
- README.md
- Secondary_data_dd.csv
- Secondary_data_DHIS2.csv
Abstract
Anopheles stephensi, an Asian malaria vector, continues to expand across Africa. The vector is now firmly established in urban settings in the Horn of Africa. Its presence in areas where malaria resurged suggested a possible role in causing malaria outbreaks. Here, using a prospective case–control design, we investigated the role of An. stephensi in transmission following a malaria outbreak in Dire Dawa, Ethiopia in April–July 2022. Screening contacts of patients with malaria and febrile controls revealed spatial clustering of Plasmodium falciparum infections around patients with malaria in strong association with the presence of An. stephensi in the household vicinity. Plasmodium sporozoites were detected in these mosquitoes. This outbreak involved clonal propagation of parasites with molecular signatures of artemisinin and diagnostic resistance. To our knowledge, this study provides the strongest evidence so far for a role of An. stephensi in driving an urban malaria outbreak in Africa, highlighting the major public health threat posed by this fast-spreading mosquito.
README: Data from: Evidence for a role of Anopheles stephensi in the spread of drug- and diagnosis-resistant malaria in Africa
This data set contains individual-level data from the case-control study, malaria trend data from the health system, laboratory results, and entomological survey data. It consists of five CSV files (dd_cc_individual_data, HH_level_entomology_data, Secondary_data_dd, Secondary_data_DHIS2 and Mosquito_data_field_lab). Two of the CSV files (dd_cc_individual_data and HH_level_entomology_data) contain data collected during the study period, from April to July 2022. The dd_cc_individual_data file contains data collected from the study participants on their infection status (as measured by rapid diagnostic tests (RDT), microscopy, and quantitative species-specific polymerase chain reaction (qPCR) targeting the 18S small subunit rRNA), sociodemographic characteristics, intervention utilization, and malaria predisposing factors. The HH_level_entomology_data file contains data from the adult and larval mosquito surveys. The Mosquito_data_field_lab file contains individual adult mosquito data, including date, method, and place of collection, and abdominal status. Two further CSV files were obtained from secondary sources (Secondary_data_dd and Secondary_data_DHIS2). The Secondary_data_dd file contains hand-collected data from 34 private and public health facilities (January 2019-May 2022). The Secondary_data_DHIS2 file was collected from the national District Health Information Software 2 (DHIS2) repository for Dire Dawa between January 2013 and May 2022.
Date of data collection: April - July 2022
Geographic location of data collection: Dire Dawa, Ethiopia.
Information about funding sources that supported the collection of the data: Bill and Melinda Gates Foundation (INV-005898); Wellcome Trust (102348); AHRI NORAD-SIDA Core Fund; President's Malaria Initiative/USAID
Methodological Information
Description of methods used for collection/generation of data:
- Individual level data - We conducted a prospective case-control study to identify risk factors associated with the sudden rise in P. falciparum malaria in Dire Dawa city. Goro Health Center and the Dire Dawa University (DDU) students' clinic were selected based on their malaria reports in the years prior to the start of the study. We recruited 101 confirmed malaria cases plus 235 case-household members, and 189 controls without malaria who attended the same clinic within 72 hours plus 427 control-household members. The index cases and controls were followed to their homes, their household members were tested for malaria, their households/dormitories were screened for adult Anopheles mosquitoes, and their neighborhoods (within a 100-meter radius) were surveyed for larvae/pupae. The results demonstrated that household members of index cases and of controls had different levels of exposure. Living close to An. stephensi-positive sites (defined as larvae within a 100 m radius of the household and/or adult mosquitoes) and to water bodies (natural or artificial) were identified as risk factors, strongly associated with being a member of a case household/dormitory.
- Entomology data - Entomological surveys of households and dormitories for adult and immature Anopheles mosquitoes were conducted. Immature stages of Anopheles mosquitoes were surveyed within a 100-meter radius of the index and control houses/dormitories, targeting both artificial water storage containers and natural habitats. Our data show that An. stephensi is abundant in both artificial and natural aquatic habitats in the driest months of the year. An. stephensi larvae and/or adult mosquitoes were detected more often near the index cases than near controls.
- Secondary data - Clinical malaria incidence data were collected from public and private health facilities (n=34) and showed a 12-fold increase in malaria incidence in Dire Dawa during the dry months (January-May) of 2022 compared to 2019. We also analyzed the DHIS2 data from 2015 to 2022 to examine the trend in cases and the Plasmodium species composition over the years in relation to the detection of An. stephensi in Dire Dawa. An. stephensi was first detected in Dire Dawa in 2018. The proportion of infections due to P. falciparum increased substantially during the years prior to the outbreak, which finally resulted in a purely P. falciparum outbreak.
- Mosquito data - This data set contains individual adult mosquito data, including date, method, and place of collection, and abdominal status. Also included in this file are lab results on blood meal source identification and on the infection status of mosquitoes, as determined by a circumsporozoite protein (CSP) bead-based multiplex assay and a confirmatory PCR test targeting 18S.
- Genomics - Multiplexed amplicon sequencing was performed on qPCR-positive samples with reagents and a protocol described previously (Tessema, S. K. et al. J. Infect. Dis., 2020), targeting high-diversity microhaplotype targets (n=162), polymorphisms associated with drug resistance, and targets in and adjacent to pfhrp2 and pfhrp3 to assess for gene deletion. Amplified libraries were sequenced on a NextSeq 2000 or a MiniSeq instrument using 150PE reads with 10% PhiX.
Description of the data and file structure
dd_cc_individual_data csv file description:
- idindividual - Individual ID
- idhh - Household ID
- casecat - Case category based on expert microscopy (index, index family, control, or control family)
- casexp - Classification Index vs Control (Case = index case plus household/dormitory members or Control = control plus household/dormitory members)
- sex - Sex of an individual (Female or Male)
- anymalpos - RDT and/or Microscopy and/or qPCR positivity
- agecat - Age of participant (<5 years, 5-15 years and > 15 years)
- travel - Did you travel away from home in the last one month? (Yes or No)
- waterbodytype - Type of the water body (River, stream, pond, stagnant water)
- livestockpres - Does this household own any livestock, herds, other farm animals, or poultry? (Yes or No)
- larvaepos - Larvae positivity around the household (HH) (Yes or No)
- adultpos - Adult Anopheles presence (indoor/outdoor separately) (presence or absence)
- adultpos_2 - Adult Anopheles presence (indoor/outdoor combined) (presence or absence)
- stephpos - An. stephensi positivity (larvae/adult/indoor/outdoor) (presence or absence)
- spray - Usage of aerosol insecticide sprays (Yes or No)
- site - Study site in Dire Dawa (City or University)
- pfqden - Pf18S parasite density per microliter
- pfqpos - Pf18S positivity (Positive or Negative)
- eave - Presence of eave (open or closed)
- repellent - Usage of repellents (Yes or No)
- irs - Has your household been sprayed with insecticide in the last 12 months? (Yes or No)
- llininhouse - Long lasting insecticide-treated nets use in house (Yes or No)
- distance_Artif_con_cat - Distance from artificial container (>100m or <=100m)
- micpos - Microscopy result (Positive or Negative)
- rdtpos - RDT result (Positive or Negative)
- malpos - RDT and/or Microscopy positivity (Positive or Negative)
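
The composite columns malpos and anymalpos are defined above in terms of rdtpos, micpos and pfqpos, so they can be recomputed as a consistency check. The following is a minimal sketch, not the authors' code. It assumes results are stored as the strings Positive/Negative (stated above for rdtpos, micpos and pfqpos; the coding of anymalpos is not stated), and the sample rows are invented for illustration.

```python
import csv
import io

# Invented rows that reuse the documented column names; not real study data.
SAMPLE = """idindividual,rdtpos,micpos,pfqpos
1,Positive,Negative,Positive
2,Negative,Negative,Positive
3,Negative,Negative,Negative
"""

def is_pos(value):
    """Treat any capitalisation of 'Positive' as a positive result (assumed coding)."""
    return value.strip().lower() == "positive"

def derive_flags(row):
    """Recompute (malpos, anymalpos) from the three test columns.

    malpos    = RDT and/or microscopy positive
    anymalpos = RDT and/or microscopy and/or qPCR positive
    """
    malpos = is_pos(row["rdtpos"]) or is_pos(row["micpos"])
    anymalpos = malpos or is_pos(row["pfqpos"])
    return malpos, anymalpos

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
flags = {row["idindividual"]: derive_flags(row) for row in rows}
```

To run this against the real file, replace `io.StringIO(SAMPLE)` with an open handle on dd_cc_individual_data.csv and compare the recomputed flags with the stored columns.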
HH_level_entomology_data csv file description:
- idhh - Household ID
- stephpos - Positive for any stage of stephensi (positive or negative)
- stephlarv - Stephensi larvae detected (Positive or Negative)
- stephadult - Adult stephensi detected (Positive or Negative)
- site - Study site in Dire Dawa (City or University)
- casexp - Classification Index vs Control (case or Control)
- waterbody - Water body presence in the neighborhood (Yes or No)
- waterbodytype - Type of the water body (River, stream, pond, and stagnant water)
- livestockpres - Does this household own any livestock, herds, other farm animals, or poultry? (Yes or No)
- interventiontype - Type of intervention used (Long lasting insecticide-treated nets (LLIN), LLIN + repellent/spray, repellent/spray)
- dist_river - Distance from river in meter
- dist_artificial_cont - Distance from artificial container in meter
- adultspp - Adult mosquito species detected
- anophspp - Anopheles species detected
- stephadultnum - An. stephensi adult density
- stephnumindoor - An. stephensi adult density indoor
- stephnumoutdoor - An. stephensi adult density outdoor
- aedespos - Aedes mosquito positivity
- aedesnumindoor - Number of Aedes mosquitoes detected indoor.
- aedesnumoutdoor - Number of Aedes mosquitoes detected outdoor.
- hab_presence_withinhh - Habitat presence within household
- larvaepos - Larvae positivity around the HH
- steph_larv_num - Number of An. stephensi larvae detected.
- gamb_larv_num - Number of An. gambiae larvae detected.
- turk_larv_num - Number of An. turkhudi larvae detected.
- pret_larv_num - Number of An. pretoriensis larvae detected
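
Two relationships between the columns above can be illustrated in code: the 100 m cut-off used for distance_Artif_con_cat in the individual-level file, and "positive for any stage" as the union of larval and adult detection. This is a hedged sketch; whether the authors derived these columns exactly this way is an assumption, and the function names are hypothetical.

```python
def distance_category(dist_m):
    """Bin a distance in meters into the two labels listed for distance_Artif_con_cat."""
    return "<=100m" if dist_m <= 100 else ">100m"

def any_stage_positive(stephlarv, stephadult):
    """Return 'Positive' if An. stephensi larvae or adults were detected (assumed rule)."""
    detected = {stephlarv.strip().lower(), stephadult.strip().lower()}
    return "Positive" if "positive" in detected else "Negative"

# Hypothetical household: container 42.5 m away, larvae found, no adults caught.
example = (distance_category(42.5), any_stage_positive("Positive", "Negative"))
```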
Secondary_data_dd csv file description:
- Month - Month of data collection
- total_tested - Total number suspected cases tested during the period
- positive_case - Number of positive cases out of total tested during the period
- year - Year of data collection
Secondary_data_DHIS2 csv file description:
- year - Year of data collection
- P_falciparum - Number of P. falciparum parasite positive cases
- P_vivax - Number of P. vivax parasite positive cases
- Total - Total number of positive infections (P. falciparum plus P. vivax)
- proportion_Pf - Proportion of P. falciparum out of total infection
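
The proportion_Pf column follows from the two species counts. A minimal sketch is given below, assuming the value is stored as a fraction between 0 and 1 (the file may instead store a percentage; this README does not say) and using invented counts.

```python
def proportion_pf(p_falciparum, p_vivax):
    """Share of P. falciparum among all positive infections (Pf plus Pv).

    Returns None when there are no positive infections, to avoid dividing by zero.
    """
    total = p_falciparum + p_vivax
    if total == 0:
        return None
    return p_falciparum / total

# Invented counts for illustration only.
share = proportion_pf(30, 10)
```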
Mosquito_data_field_lab csv file description:
- ID - Mosquito ID
- Date - Date of collection
- Method - Method of collection
- Place - Place of collection
- Species - Mosquito species
- Abdomen - Abdominal status
- Bloodmeal - Bloodmeal source identification
- Net_MFI1 - Net MFI, first run (pf)
- Net_MFI2 - Net MFI, rerun (pf_2)
- Pf - Pf circumsporozoite protein (CSP) result - first experiment
- pv210 - Pv210 CSP result - first experiment
- pv247 - Pv247 CSP result - first experiment
- pf_2 - Pf CSP result - rerun
- pv210_2 - Pv210 CSP result - rerun
- pv247_2 - Pv247 CSP result - rerun
- 18SPCR - 18S PCR result of head and thoraces
Sequence data were uploaded to NCBI under BioProject accession number PRJNA962166.
Sharing/access Information
The data repository and R scripts are available at https://github.com/legessealamerie/DD-Stephensi and https://github.com/EPPIcenter/mad4hatter.
Methods
Data collection: Data on the socio-demographic, epidemiological, intervention, and travel history were collected verbally using pre-tested questionnaires which were uploaded to mobile tablets using REDCap tools. The entomological survey data and intervention availability were scored by the study data collectors. Malaria case incidence data (from January 2019 to May 2022) were collected from the records of both private and public health facilities (n=34).
Data processing: Data collection tools (mobile application version 5.20.11) were prepared and managed using REDCap electronic data capture tools hosted at AHRI. CSV files exported from REDCap were analyzed using STATA 17 (StataCorp., TX, USA), RStudio v.2022.12.0.353 (Posit, 2023), QGIS v.3.22.16 (QGIS Development Team, 2023. QGIS Geographic Information System. Open Source Geospatial Foundation Project), and GraphPad Prism 5.03 (Graph Pad Software Inc., CA, USA). All statistical analyses were performed in RStudio with packages lme4 (generalized linear mixed models) and dcifer.
Bioinformatic analysis: FASTQ files from multiplexed amplicon sequencing of P. falciparum were subjected to filtering, demultiplexing and allele inference using a Nextflow-based pipeline (https://github.com/EPPIcenter/mad4hatter). We used cutadapt to demultiplex reads for each locus based on the locus primer sequences (no mismatches or indels allowed) and to filter reads by length (100 base pairs) and quality (default NextSeq quality trimming). We used dada2 to infer variants and remove chimeras. Reads were truncated at bases with a PHRED quality score of less than 5. The leftmost base was trimmed, and trimmed reads of less than 75 base pairs were filtered out. Default values were used for all other parameters. We then aligned alleles to their reference sequence and filtered out reads with low alignment. We masked homopolymers and tandem repeats to avoid false positives.
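
The per-read quality rules described above (truncate at PHRED below 5, trim the leftmost base, discard reads shorter than 75 base pairs) can be sketched on a single read. This is an illustration only and not the mad4hatter or dada2 implementation. The function name is hypothetical, and the order in which the three steps are applied is an assumption taken from the order of the sentences above.

```python
def truncate_trim_filter(seq, quals, min_q=5, min_len=75):
    """Apply the three per-read rules to one read.

    seq   : nucleotide string
    quals : list of integer PHRED scores, one per base
    Returns (seq, quals) after processing, or None if the read is discarded.
    """
    # 1. Truncate at the first base whose quality is below min_q.
    cut = len(seq)
    for i, q in enumerate(quals):
        if q < min_q:
            cut = i
            break
    seq, quals = seq[:cut], quals[:cut]

    # 2. Trim the leftmost base.
    seq, quals = seq[1:], quals[1:]

    # 3. Discard reads shorter than min_len after trimming.
    if len(seq) < min_len:
        return None
    return seq, quals

# Invented reads: 80 high-quality bases, and the same read with one bad base.
good = truncate_trim_filter("A" * 80, [30] * 80)
bad_quals = [30] * 80
bad_quals[50] = 2
bad = truncate_trim_filter("A" * 80, bad_quals)
```

The first read keeps 79 bases. The second is cut at position 50, falls below 75 bases after trimming, and is dropped.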
Epidemiological analysis: We used standard case-control analyses to examine the association between risk factors and malaria infection. These analyses yield point estimates and confidence intervals for the odds ratio (OR), along with significance levels based on the chi-squared test. Continuous variables were presented as median and interquartile range (IQR). Tests of association between two categorical variables were performed using contingency tables. The Mann-Kendall test was used to test for monotonic (increasing or decreasing) trends in malaria cases using the secondary data obtained from the private and public health facilities in the city and at DDU.
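
The two statistics named above can be sketched with the Python standard library for readers who want to reproduce them outside R. This is not the authors' code. The confidence interval method (Wald interval on the log odds ratio), the uncorrected Pearson chi-squared test, and the normal approximation for the Mann-Kendall test are assumptions, since the text does not state which variants were used. The counts and series below are invented.

```python
import math
from collections import Counter

def odds_ratio_2x2(a, b, c, d, z=1.96):
    """Odds ratio for a 2x2 table, with Wald 95% CI and Pearson chi-squared p-value.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls (all cells must be non-zero).
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-squared, 1 df
    return odds_ratio, lower, upper, chi2, p_value

def mann_kendall(values):
    """Mann-Kendall trend test: returns (S, Z, two-sided p) via normal approximation."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    # Variance of S with the standard correction for tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p_value

# Invented inputs for illustration only.
or_result = odds_ratio_2x2(20, 10, 10, 20)
mk_result = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8])
```

The normal approximation for Mann-Kendall is usually recommended only for series of about ten or more observations; for shorter series an exact test is preferable.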
Usage notes
CSV files exported from REDCap were analyzed using STATA 17 (StataCorp., TX, USA), RStudio v.2022.12.0.353, QGIS v.3.22.16 (QGIS Development Team, 2023. QGIS Geographic Information System. Open Source Geospatial Foundation Project), and GraphPad Prism 5.03 (Graph Pad Software Inc., CA, USA). All statistical analyses were performed in R software (4.12) with packages lme4 (generalized linear mixed models) and dcifer. Amplicon sequencing data were processed using cutadapt (v4.1) and DADA2 (v3.16).