Impacts of improved cookstove interventions on personal exposure to carbon monoxide and particulate matter in Zambia
Data files
May 02, 2025 version files (1.49 MB total)
- CEEEZ_B_Primary.csv (36.66 KB)
- CEEEZ_Primary.csv (49.77 KB)
- Kabwe_1_B_Primary.csv (15.94 KB)
- Kabwe_1_Primary.csv (21.26 KB)
- Kabwe_2_B_Primary.csv (16.71 KB)
- Kabwe_2_Primary.csv (22.39 KB)
- README.md (30.44 KB)
- SupaMoto_B_Primary.csv (36.57 KB)
- SupaMoto_Primary.csv (49.58 KB)
- ZCCS_2019_2021_combined_EM_CSV.csv (164.65 KB)
- ZCCS_2019_2021_Cross_Sectional_All_EM_CSV.csv (391.74 KB)
- ZCCS_2019_CO_hourly_CSV.csv (154.70 KB)
- ZCCS_2019_CO_SIperiods_7_20_2.csv (34.60 KB)
- ZCCS_2019_EM_temp_hourly_CSV.csv (70.80 KB)
- ZCCS_2019_ODK_HH_char_CSV.csv (113.95 KB)
- ZCCS_2019_PM_hourly_CSV.csv (53.01 KB)
- ZCCS_2021_CO_hourly_CSV.csv (99.43 KB)
- ZCCS_2021_CO_SIperiods_7_20_2.csv (18.01 KB)
- ZCCS_2021_EM_temp_hourly_CSV.csv (51.11 KB)
- ZCCS_2021_PM_hourly_CSV.csv (53.31 KB)
- ZCCS_2021_PM_SIperiods_7_20_2.csv (2.50 KB)
Abstract
Eighty-four percent of sub-Saharan African households rely on polluting fuels (e.g., wood, charcoal) for cooking, leading to high levels of household air pollution (HAP). While switching to modern fuels/stoves could decrease HAP levels, these are not always available or affordable. Improved biomass cookstoves could provide an intermediate step supporting transitions from traditional biomass to clean burning fuels/stoves. We conducted two stove intervention trials in Lusaka, Zambia using targeted marketing/incentives to motivate participants to use improved biomass stoves, either the Mimi Moto (pellet) or the EcoZoom (charcoal). Before the intervention, 65% of participants exclusively used charcoal, while 27% relied on electricity to some extent for cooking. We measured 24-hour personal exposure to CO (n=747) and PM2.5 (n=90) of primary cooks. We implemented several statistical approaches to estimate the effects of interventions on exposure: household-specific endline minus baseline exposure, rank-sum testing, difference-in-differences analyses, and cross-sectional analyses. We did not find that switching from traditional charcoal stoves to either intervention stove was associated with significantly reduced exposures. However, cooks using electric stoves independent of the intervention did have significantly lower CO exposures than those using traditional charcoal, with greater electric stove use corresponding to greater exposure reductions. Variability in exposure was dominated by seasonal, regional, and neighborhood differences rather than household stove/fuel choices. A focus on HAP exposure from cooking in urban settings is unlikely to yield expected exposure reductions. Policy makers should consider pollution reduction policies/interventions that target ambient air quality in tandem with HAP-mitigating strategies to address the health burden of air pollution.
Dataset DOI: 10.5061/dryad.xd2547drx
Description of the data and file structure
We conducted two stove intervention trials in Lusaka, Zambia using targeted marketing and incentives to motivate traditional biomass stove users to switch to one of two improved biomass stoves, either the pellet burning Mimi Moto or the charcoal burning EcoZoom. We measured 24-hour CO (n=747) and PM2.5 (n=90) exposures of primary cooks. Primary cooks were outfitted with exposure monitors and instructed to wear them for 24 hours, unless sleeping or bathing, when they were instructed to keep the monitors nearby. Data collection also consisted of a structured household questionnaire that included questions about household demographics, cooking practices, household facilities, and economic decision making. The main decision maker (household head) and person most knowledgeable about cooking practices in the household (primary cook) were interviewed. Exposure data was hourly- and daily-averaged for analysis by participant and compared across participants by neighborhood, stove use group, and year/study phase. Additionally, we utilized publicly available ambient PM2.5 data from PurpleAir sensors in the area to compare personal PM2.5 exposure to ambient PM2.5.
This dataset is not intended to stand alone; the associated manuscript is likely required to fully interpret the data variables and delineations. Detailed information regarding stove groups, ambient PM2.5 data, and statistical analyses is included in the manuscript.
Empty data cells (NaNs) are common in this dataset because the exposure monitoring participants were a substantially smaller subset of the larger questionnaire study population. Thus, exposure data, and responses to questions asked only of exposure participants, were not collected for many participants. NaNs are likely present for one of the following reasons:
- Data was not collected for that participant (e.g., no CO data was collected, not an exposure monitoring participant so details about cooking location were not asked)
- Data was thrown out after failing a quality control check (e.g., someone reported cooking for 25 hours a day, someone reported their age as 175 years)
- Participants answered a question in a way such that a follow up question was not needed (e.g., participants who cook outdoors were not asked kitchen dimensions, participants who did not own their home were not asked how long they've owned their home)
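Because missing values are so common, it helps to parse every numeric cell defensively. The following is a minimal sketch using only the Python standard library. The sample rows are made up for illustration and are not real study data, and the assumption that missing cells appear either as empty strings or as the literal text `NaN` should be checked against the actual CSVs.

```python
import csv
import io

# Made-up rows shaped like ZCCS_2021_CO_hourly_CSV.csv (most columns omitted).
# These values are illustrative placeholders, not real measurements.
SAMPLE = """HH ID,Compound,day_avg
1001,Kalingalinga,2.5
1002,Ng'ombe,
1003,Kalingalinga,NaN
"""


def to_float(cell):
    """Return a float, or None when the cell is empty or a NaN marker."""
    text = (cell or "").strip()
    if text == "" or text.lower() == "nan":
        return None
    return float(text)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
values = [to_float(row["day_avg"]) for row in rows]
present = [v for v in values if v is not None]
n_missing = len(values) - len(present)

print(f"{len(present)} value(s) present, {n_missing} missing")
```

To use this with a real file, replace `io.StringIO(SAMPLE)` with an open file handle, for example `open("ZCCS_2021_CO_hourly_CSV.csv", newline="")`.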
List of commonly used acronyms in this dataset and their meaning:
- HH - household
- CO - carbon monoxide
- PM2.5 - particulate matter with aerodynamic diameter < 2.5 micrometers
- SI - stove influenced
- ppm - parts per million
- μgm-3 or μg/m3 - micrograms per cubic meter
- CEEEZ - Centre for Energy, Environment, and Engineering Zambia
- SM - SupaMoto
- MM - Mimi Moto
- Eco - EcoZoom
- VL - VITALITE
Files and variables
File: ZCCS_2019_PM_hourly_CSV.csv
Description: hourly PM2.5 exposure in μgm-3 for all participants in 2019 (baseline)
Variables
- HH ID: household ID
- Compound: neighborhood
- hour_0: hourly average PM2.5 from 00:00-00:59
- hour_1: hourly average PM2.5 from 01:00-01:59
- hour_2: hourly average PM2.5 from 02:00-02:59
- hour_3: hourly average PM2.5 from 03:00-03:59
- hour_4: hourly average PM2.5 from 04:00-04:59
- hour_5: hourly average PM2.5 from 05:00-05:59
- hour_6: hourly average PM2.5 from 06:00-06:59
- hour_7: hourly average PM2.5 from 07:00-07:59
- hour_8: hourly average PM2.5 from 08:00-08:59
- hour_9: hourly average PM2.5 from 09:00-09:59
- hour_10: hourly average PM2.5 from 10:00-10:59
- hour_11: hourly average PM2.5 from 11:00-11:59
- hour_12: hourly average PM2.5 from 12:00-12:59
- hour_13: hourly average PM2.5 from 13:00-13:59
- hour_14: hourly average PM2.5 from 14:00-14:59
- hour_15: hourly average PM2.5 from 15:00-15:59
- hour_16: hourly average PM2.5 from 16:00-16:59
- hour_17: hourly average PM2.5 from 17:00-17:59
- hour_18: hourly average PM2.5 from 18:00-18:59
- hour_19: hourly average PM2.5 from 19:00-19:59
- hour_20: hourly average PM2.5 from 20:00-20:59
- hour_21: hourly average PM2.5 from 21:00-21:59
- hour_22: hourly average PM2.5 from 22:00-22:59
- hour_23: hourly average PM2.5 from 23:00-23:59
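The hourly files store one row per household with 24 `hour_N` columns, so a daily summary can be recomputed from the hourly values. The sketch below averages whichever hours are present for one household. It is an illustration under assumptions: the sample row is made up, the `min_hours` completeness threshold is a hypothetical parameter, and the mean of hourly means may not match how the authors computed their 24-hour averages.

```python
import csv
import io
from statistics import mean

HOUR_COLS = [f"hour_{h}" for h in range(24)]


def daily_mean(row, min_hours=1):
    """Average the non-missing hourly values in one household row.

    Returns None when fewer than `min_hours` hourly values are present.
    `min_hours` is a hypothetical completeness threshold, not a study rule.
    """
    values = []
    for col in HOUR_COLS:
        text = (row.get(col) or "").strip()
        if text and text.lower() != "nan":
            values.append(float(text))
    return mean(values) if len(values) >= min_hours else None


# Made-up row: 11 hours at 10, 11 hours at 30, and 2 missing hours.
header = ["HH ID", "Compound"] + HOUR_COLS
cells = ["1001", "Kalingalinga"] + ["10"] * 11 + ["30"] * 11 + ["", ""]
sample_csv = ",".join(header) + "\n" + ",".join(cells) + "\n"

row = next(csv.DictReader(io.StringIO(sample_csv)))
avg = daily_mean(row)
print(row["HH ID"], avg)
```

Raising `min_hours` (for example to 20) would exclude households with sparse records from any comparison of daily averages.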
File: ZCCS_2021_CO_SIperiods_7_20_2.csv
Description: hourly mask variables for each HHID in 2021, where 0 means the primary cook was not experiencing 'stove-influenced' concentrations during that hour and 1 means they were; SI periods were estimated using CO exposure concentrations, n=7, alpha=20, and beta=2
Variables
- HH ID: household ID
- SI hours sum: total number of hours at 'stove influenced' concentrations
- SI_CO_avg: average CO in ppm during SI periods only
- hour_zero: if hour 00:00-00:59 was stove influenced (1) or not (0)
- hour_one: if hour 01:00-01:59 was stove influenced (1) or not (0)
- hour_two: if hour 02:00-02:59 was stove influenced (1) or not (0)
- hour_three: if hour 03:00-03:59 was stove influenced (1) or not (0)
- hour_four: if hour 04:00-04:59 was stove influenced (1) or not (0)
- hour_five: if hour 05:00-05:59 was stove influenced (1) or not (0)
- hour_six: if hour 06:00-06:59 was stove influenced (1) or not (0)
- hour_seven: if hour 07:00-07:59 was stove influenced (1) or not (0)
- hour_eight: if hour 08:00-08:59 was stove influenced (1) or not (0)
- hour_nine: if hour 09:00-09:59 was stove influenced (1) or not (0)
- hour_ten: if hour 10:00-10:59 was stove influenced (1) or not (0)
- hour_eleven: if hour 11:00-11:59 was stove influenced (1) or not (0)
- hour_twelve: if hour 12:00-12:59 was stove influenced (1) or not (0)
- hour_thirteen: if hour 13:00-13:59 was stove influenced (1) or not (0)
- hour_fourteen: if hour 14:00-14:59 was stove influenced (1) or not (0)
- hour_fifteen: if hour 15:00-15:59 was stove influenced (1) or not (0)
- hour_sixteen: if hour 16:00-16:59 was stove influenced (1) or not (0)
- hour_seventeen: if hour 17:00-17:59 was stove influenced (1) or not (0)
- hour_eighteen: if hour 18:00-18:59 was stove influenced (1) or not (0)
- hour_nineteen: if hour 19:00-19:59 was stove influenced (1) or not (0)
- hour_twenty: if hour 20:00-20:59 was stove influenced (1) or not (0)
- hour_twentyone: if hour 21:00-21:59 was stove influenced (1) or not (0)
- hour_twentytwo: if hour 22:00-22:59 was stove influenced (1) or not (0)
- hour_twentythree: if hour 23:00-23:59 was stove influenced (1) or not (0)
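Note that the SI mask files name their hour columns with words (`hour_zero`, `hour_one`, ...), whereas the hourly exposure files use digits (`hour_0`, `hour_1`, ...). The sketch below aligns the two naming schemes and applies a mask row to an hourly CO row for the same household. It does not reimplement the SI detection procedure (n=7, alpha=20, beta=2); it only applies an existing mask. The input rows are made up, and the result may differ from the published `SI_CO_avg` if that value was derived from finer-resolution data.

```python
from statistics import mean

# Word-named mask columns in hour order, as listed for the SIperiods files.
WORDS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
    "fifteen", "sixteen", "seventeen", "eighteen", "nineteen", "twenty",
    "twentyone", "twentytwo", "twentythree",
]
MASK_COLS = [f"hour_{word}" for word in WORDS]
HOUR_COLS = [f"hour_{h}" for h in range(24)]


def is_missing(text):
    return text == "" or text.lower() == "nan"


def si_summary(mask_row, hourly_row):
    """Return (number of SI hours, mean exposure over those hours)."""
    n_hours = 0
    si_values = []
    for mask_col, hour_col in zip(MASK_COLS, HOUR_COLS):
        flag = (mask_row.get(mask_col) or "").strip()
        if is_missing(flag) or float(flag) != 1.0:
            continue
        n_hours += 1
        value = (hourly_row.get(hour_col) or "").strip()
        if not is_missing(value):
            si_values.append(float(value))
    return n_hours, (mean(si_values) if si_values else None)


# Made-up rows, shaped like what csv.DictReader would return for one HH ID.
mask_row = {"HH ID": "1001", **{col: "0" for col in MASK_COLS}}
for col in ("hour_six", "hour_twelve", "hour_eighteen"):
    mask_row[col] = "1"
hourly_row = {"HH ID": "1001", **{f"hour_{h}": str(float(h)) for h in range(24)}}

n_si_hours, si_avg = si_summary(mask_row, hourly_row)
print(n_si_hours, si_avg)
```

The same function works for the PM2.5 mask files, since they use the same column names.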
File: ZCCS_2021_PM_SIperiods_7_20_2.csv
Description: hourly mask variables for each HHID in 2021, where 0 means the primary cook was not experiencing 'stove-influenced' concentrations during that hour and 1 means they were; SI periods were estimated using PM2.5 exposure concentrations, n=7, alpha=20, and beta=2
Variables
- HH ID: household ID
- SI hours sum: total number of hours at 'stove influenced' concentrations
- SI_PM_avg: average PM2.5 in μgm-3 during SI periods only
- hour_zero: if hour 00:00-00:59 was stove influenced (1) or not (0)
- hour_one: if hour 01:00-01:59 was stove influenced (1) or not (0)
- hour_two: if hour 02:00-02:59 was stove influenced (1) or not (0)
- hour_three: if hour 03:00-03:59 was stove influenced (1) or not (0)
- hour_four: if hour 04:00-04:59 was stove influenced (1) or not (0)
- hour_five: if hour 05:00-05:59 was stove influenced (1) or not (0)
- hour_six: if hour 06:00-06:59 was stove influenced (1) or not (0)
- hour_seven: if hour 07:00-07:59 was stove influenced (1) or not (0)
- hour_eight: if hour 08:00-08:59 was stove influenced (1) or not (0)
- hour_nine: if hour 09:00-09:59 was stove influenced (1) or not (0)
- hour_ten: if hour 10:00-10:59 was stove influenced (1) or not (0)
- hour_eleven: if hour 11:00-11:59 was stove influenced (1) or not (0)
- hour_twelve: if hour 12:00-12:59 was stove influenced (1) or not (0)
- hour_thirteen: if hour 13:00-13:59 was stove influenced (1) or not (0)
- hour_fourteen: if hour 14:00-14:59 was stove influenced (1) or not (0)
- hour_fifteen: if hour 15:00-15:59 was stove influenced (1) or not (0)
- hour_sixteen: if hour 16:00-16:59 was stove influenced (1) or not (0)
- hour_seventeen: if hour 17:00-17:59 was stove influenced (1) or not (0)
- hour_eighteen: if hour 18:00-18:59 was stove influenced (1) or not (0)
- hour_nineteen: if hour 19:00-19:59 was stove influenced (1) or not (0)
- hour_twenty: if hour 20:00-20:59 was stove influenced (1) or not (0)
- hour_twentyone: if hour 21:00-21:59 was stove influenced (1) or not (0)
- hour_twentytwo: if hour 22:00-22:59 was stove influenced (1) or not (0)
- hour_twentythree: if hour 23:00-23:59 was stove influenced (1) or not (0)
File: ZCCS_2019_2021_Cross_Sectional_All_EM_CSV.csv
Description: compilation of all exposure data (24-hour averaged) and questionnaire data where participants for each year are treated as individual entries (as opposed to the same entry as in ZCCS_2019_2021_combined_EM_CSV); used for statistical analyses (Tables 3-4) and Figure 6.
Variables
- HHID: household ID
- Compound: neighborhood
- Phase: baseline (BL) or endline (EL)
- New HHs: whether household was a panel (0) or new household added at endline (1)
- Season: cool or warm
- stove_ind_grp: stove usage group: traditional charcoal, improved charcoal (excluding EcoZoom), charcoal + electric (primary traditional/improved charcoal, secondary electric), electric + charcoal (primary electric, secondary traditional/improved charcoal), electric exclusive, Mimi Moto primary or secondary (either primary or secondary use regardless of other stove), EcoZoom primary or secondary, and other (gas, wood, kerosene)
- elec_mask: whether electricity was used for any cooking (0=no, 1=yes)
- CO: 24 hour average CO in ppm
- PM: 24 hour average PM2.5 in μgm-3
- SI_CO_hours_7_20_2: total number of SI hours from the CO exposure profile and traditional procedure where n=7, alpha=20, and beta=2 (refer to Text S5 for more details)
- SI_CO_7_20_2: average CO exposure in ppm during the SI periods from the traditional procedure where n=7, alpha=20, and beta=2 (refer to Text S5 for more details)
- SI_PM_hours_7_20_2: total number of SI hours from the PM2.5 exposure profile and traditional procedure where n=7, alpha=20, and beta=2 (refer to Text S5 for more details)
- SI_PM_7_20_2: average PM2.5 exposure in μgm-3 during the SI periods from the traditional procedure where n=7, alpha=20, and beta=2 (refer to Text S5 for more details)
- SI_CO_charcHH: average CO exposure in ppm during the SI periods from the charcoal user baseline procedure (refer to Text S5 for more details)
- SI_PM_charcHH: average PM2.5 exposure in μgm-3 during the SI periods from the charcoal user baseline procedure (refer to Text S5 for more details)
- prim_stove: primary stove
- sec_stove: secondary stove
- prim_fuel: primary fuel
- eco_count: number of meals cooked on EcoZoom in past 3 days
- elec_count: number of meals cooked on electric in past 3 days
- gaskero_count: number of meals cooked on gas or kerosene in past 3 days
- imprcharc_count: number of meals cooked on improved charcoal in past 3 days
- tradcharc_count: number of meals cooked on traditional charcoal in past 3 days
- wood_count: number of meals cooked on wood in past 3 days
- mimimoto_count: number of meals cooked on Mimi Moto in past 3 days
- other_count: number of meals cooked on other in past 3 days
- eco_perc: percentage of cooking on EcoZoom
- elec_perc: percentage of cooking on electric
- gaskero_perc: percentage of cooking on gas or kerosene
- imprcharc_perc: percentage of cooking on improved charcoal
- tradcharc_perc: percentage of cooking on traditional charcoal
- wood_perc: percentage of cooking on wood
- mimimoto_perc: percentage of cooking on Mimi Moto
- gap_wall_roof: whether there is a gap between the wall and roof or not; only for those who cook indoors
- openwindows: if there were open windows, meaning the participant vented or did not vent during cooking
- cookloc: cooking location
- cookloc_code: whether cooking was indoors or not indoors
- expenditure_code: annual expenditure quintile
- DiD_Time: difference-in-differences variable for time status; 0=baseline, 1=endline
- DiD_Treat: difference-in-differences variable for treatment status; 0=control, 1=treatment; only for Kalingalinga and Ng'ombe neighborhoods
- DiD_Treat_key: key for difference-in-differences interaction term of time and treatment (e.g., C0 is control at baseline, C1 is control at endline, T0 is treatment at baseline, T1 is treatment at endline)
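The `DiD_Time` and `DiD_Treat` indicators are enough to compute a simple two-by-two difference-in-differences estimate from group means. The sketch below shows that arithmetic on made-up rows. It is an unadjusted illustration only: it is not the authors' model, which per the manuscript description involves additional steps such as propensity score matching, and the numbers here are not study results.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up rows using column names from the cross-sectional file.
# The last row has no DiD_Treat value (household outside the DiD neighborhoods).
SAMPLE = """HHID,DiD_Time,DiD_Treat,CO
1,0,0,2.0
2,0,0,4.0
1,1,0,3.0
2,1,0,5.0
3,0,1,4.0
4,0,1,6.0
3,1,1,4.0
4,1,1,4.0
5,1,,9.0
"""


def is_missing(text):
    return text == "" or text.lower() == "nan"


def did_estimate(rows, outcome="CO"):
    """(treated end - treated base) - (control end - control base)."""
    groups = defaultdict(list)
    for row in rows:
        time = row["DiD_Time"].strip()
        treat = row["DiD_Treat"].strip()
        value = row[outcome].strip()
        if is_missing(time) or is_missing(treat) or is_missing(value):
            continue  # skip rows that cannot be placed in a DiD cell
        groups[(int(float(treat)), int(float(time)))].append(float(value))
    cell_mean = {key: mean(vals) for key, vals in groups.items()}
    treated_change = cell_mean[(1, 1)] - cell_mean[(1, 0)]
    control_change = cell_mean[(0, 1)] - cell_mean[(0, 0)]
    return treated_change - control_change


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
did = did_estimate(rows)
print(f"Unadjusted DiD estimate for CO: {did} ppm")
```

A negative value means the treated group's exposure fell relative to the control group's change over the same period.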
File: ZCCS_2019_CO_SIperiods_7_20_2.csv
Description: hourly mask variables for each HHID in 2019, where 0 means the primary cook was not experiencing 'stove-influenced' concentrations during that hour and 1 means they were; SI periods were estimated using CO exposure concentrations, n=7, alpha=20, and beta=2
Variables
- HH ID: household ID
- SI hours sum: total number of hours at 'stove influenced' concentrations
- SI_CO_avg: average CO in ppm during SI periods only
- hour_zero: if hour 00:00-00:59 was stove influenced (1) or not (0)
- hour_one: if hour 01:00-01:59 was stove influenced (1) or not (0)
- hour_two: if hour 02:00-02:59 was stove influenced (1) or not (0)
- hour_three: if hour 03:00-03:59 was stove influenced (1) or not (0)
- hour_four: if hour 04:00-04:59 was stove influenced (1) or not (0)
- hour_five: if hour 05:00-05:59 was stove influenced (1) or not (0)
- hour_six: if hour 06:00-06:59 was stove influenced (1) or not (0)
- hour_seven: if hour 07:00-07:59 was stove influenced (1) or not (0)
- hour_eight: if hour 08:00-08:59 was stove influenced (1) or not (0)
- hour_nine: if hour 09:00-09:59 was stove influenced (1) or not (0)
- hour_ten: if hour 10:00-10:59 was stove influenced (1) or not (0)
- hour_eleven: if hour 11:00-11:59 was stove influenced (1) or not (0)
- hour_twelve: if hour 12:00-12:59 was stove influenced (1) or not (0)
- hour_thirteen: if hour 13:00-13:59 was stove influenced (1) or not (0)
- hour_fourteen: if hour 14:00-14:59 was stove influenced (1) or not (0)
- hour_fifteen: if hour 15:00-15:59 was stove influenced (1) or not (0)
- hour_sixteen: if hour 16:00-16:59 was stove influenced (1) or not (0)
- hour_seventeen: if hour 17:00-17:59 was stove influenced (1) or not (0)
- hour_eighteen: if hour 18:00-18:59 was stove influenced (1) or not (0)
- hour_nineteen: if hour 19:00-19:59 was stove influenced (1) or not (0)
- hour_twenty: if hour 20:00-20:59 was stove influenced (1) or not (0)
- hour_twentyone: if hour 21:00-21:59 was stove influenced (1) or not (0)
- hour_twentytwo: if hour 22:00-22:59 was stove influenced (1) or not (0)
- hour_twentythree: if hour 23:00-23:59 was stove influenced (1) or not (0)
File: ZCCS_2019_ODK_HH_char_CSV.csv
Description: compilation of household and primary cook characteristics from questionnaires for 2019 participants used in Table 2
Variables
- HH ID: household ID
- Compound: neighborhood
- hhmembercount: range of people living in the household
- cook_ageyears: range of age of primary cook
- cook_highestgrade: highest grade of primary cook. The numeric code key has been excluded to protect anonymity of participants.
- cook_gender: cook gender. The numeric code key has been excluded to protect anonymity of participants.
- type_homeownership: type of homeownership. The numeric code key has been excluded to protect anonymity of participants.
- elecgrid_access: whether household has electric grid access (1) or not (0)
- cook_loc: whether primary cook cooks indoors or not; 1=in, 2=in, 3=out, 4=out, 99=out
- cook_loc_other: if cook_loc=99, then enumerator entered "other" here
- stove_ind_grp: stove usage group: traditional charcoal, improved charcoal (excluding EcoZoom), charcoal + electric (primary traditional/improved charcoal, secondary electric), electric + charcoal (primary electric, secondary traditional/improved charcoal), electric exclusive, Mimi Moto primary or secondary (either primary or secondary use regardless of other stove), EcoZoom primary or secondary, and other (gas, wood, kerosene)
- 24 hour CO avg (ppm): 24 hour average CO in ppm
- 24 hour PM2.5 avg (ug/m^3): 24 hour average PM2.5 in μgm-3
- EM_HH: whether the household was an exposure monitoring household (True) or not (False)
- gap_wall_roof: whether there is a gap between the wall and roof (1) or not (0); only for those who cook indoors
- num_openwindows: number of open windows when cooking; only for those who cook indoors
- SI_hours: total number of SI hours from the CO exposure profile and traditional procedure where n=7, alpha=20, and beta=2 (refer to Text S5 for more details)
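The `cook_loc` codes collapse into two groups according to the key given above (1 and 2 are indoor; 3, 4, and 99 are outdoor). A minimal sketch of that recoding follows. The assumption that codes may be stored as text such as `1` or `1.0` is ours and should be checked against the file.

```python
# Grouping taken from the cook_loc key: 1=in, 2=in, 3=out, 4=out, 99=out.
COOK_LOC_GROUP = {
    1: "indoor",
    2: "indoor",
    3: "outdoor",
    4: "outdoor",
    99: "outdoor",
}


def recode_cook_loc(cell):
    """Map a raw cook_loc cell to 'indoor', 'outdoor', or None if missing."""
    text = (cell or "").strip()
    if text == "" or text.lower() == "nan":
        return None
    # Codes outside the documented key also return None.
    return COOK_LOC_GROUP.get(int(float(text)))


for raw in ["1", "3", "99.0", ""]:
    print(repr(raw), "->", recode_cook_loc(raw))
```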
File: ZCCS_2021_EM_temp_hourly_CSV.csv
Description: hourly temperature recorded by PM2.5 exposure monitors in Celsius for all participants in 2021 (endline)
Variables
- HH ID: household ID
- Compound: neighborhood
- day_avg: average temperature
- hour_0: hourly average temperature from 00:00-00:59
- hour_1: hourly average temperature from 01:00-01:59
- hour_2: hourly average temperature from 02:00-02:59
- hour_3: hourly average temperature from 03:00-03:59
- hour_4: hourly average temperature from 04:00-04:59
- hour_5: hourly average temperature from 05:00-05:59
- hour_6: hourly average temperature from 06:00-06:59
- hour_7: hourly average temperature from 07:00-07:59
- hour_8: hourly average temperature from 08:00-08:59
- hour_9: hourly average temperature from 09:00-09:59
- hour_10: hourly average temperature from 10:00-10:59
- hour_11: hourly average temperature from 11:00-11:59
- hour_12: hourly average temperature from 12:00-12:59
- hour_13: hourly average temperature from 13:00-13:59
- hour_14: hourly average temperature from 14:00-14:59
- hour_15: hourly average temperature from 15:00-15:59
- hour_16: hourly average temperature from 16:00-16:59
- hour_17: hourly average temperature from 17:00-17:59
- hour_18: hourly average temperature from 18:00-18:59
- hour_19: hourly average temperature from 19:00-19:59
- hour_20: hourly average temperature from 20:00-20:59
- hour_21: hourly average temperature from 21:00-21:59
- hour_22: hourly average temperature from 22:00-22:59
- hour_23: hourly average temperature from 23:00-23:59
File: ZCCS_2021_CO_hourly_CSV.csv
Description: hourly CO exposure in ppm for all participants in 2021 (endline)
Variables
- HH ID: household ID
- Compound: neighborhood
- day_avg: average CO
- hour_0: hourly average CO from 00:00-00:59
- hour_1: hourly average CO from 01:00-01:59
- hour_2: hourly average CO from 02:00-02:59
- hour_3: hourly average CO from 03:00-03:59
- hour_4: hourly average CO from 04:00-04:59
- hour_5: hourly average CO from 05:00-05:59
- hour_6: hourly average CO from 06:00-06:59
- hour_7: hourly average CO from 07:00-07:59
- hour_8: hourly average CO from 08:00-08:59
- hour_9: hourly average CO from 09:00-09:59
- hour_10: hourly average CO from 10:00-10:59
- hour_11: hourly average CO from 11:00-11:59
- hour_12: hourly average CO from 12:00-12:59
- hour_13: hourly average CO from 13:00-13:59
- hour_14: hourly average CO from 14:00-14:59
- hour_15: hourly average CO from 15:00-15:59
- hour_16: hourly average CO from 16:00-16:59
- hour_17: hourly average CO from 17:00-17:59
- hour_18: hourly average CO from 18:00-18:59
- hour_19: hourly average CO from 19:00-19:59
- hour_20: hourly average CO from 20:00-20:59
- hour_21: hourly average CO from 21:00-21:59
- hour_22: hourly average CO from 22:00-22:59
- hour_23: hourly average CO from 23:00-23:59
File: ZCCS_2021_PM_hourly_CSV.csv
Description: hourly PM2.5 exposure in μgm-3 for all participants in 2021 (endline)
Variables
- HH ID: household ID
- Compound: neighborhood
- day_avg: average PM2.5
- hour_0: hourly average PM2.5 from 00:00-00:59
- hour_1: hourly average PM2.5 from 01:00-01:59
- hour_2: hourly average PM2.5 from 02:00-02:59
- hour_3: hourly average PM2.5 from 03:00-03:59
- hour_4: hourly average PM2.5 from 04:00-04:59
- hour_5: hourly average PM2.5 from 05:00-05:59
- hour_6: hourly average PM2.5 from 06:00-06:59
- hour_7: hourly average PM2.5 from 07:00-07:59
- hour_8: hourly average PM2.5 from 08:00-08:59
- hour_9: hourly average PM2.5 from 09:00-09:59
- hour_10: hourly average PM2.5 from 10:00-10:59
- hour_11: hourly average PM2.5 from 11:00-11:59
- hour_12: hourly average PM2.5 from 12:00-12:59
- hour_13: hourly average PM2.5 from 13:00-13:59
- hour_14: hourly average PM2.5 from 14:00-14:59
- hour_15: hourly average PM2.5 from 15:00-15:59
- hour_16: hourly average PM2.5 from 16:00-16:59
- hour_17: hourly average PM2.5 from 17:00-17:59
- hour_18: hourly average PM2.5 from 18:00-18:59
- hour_19: hourly average PM2.5 from 19:00-19:59
- hour_20: hourly average PM2.5 from 20:00-20:59
- hour_21: hourly average PM2.5 from 21:00-21:59
- hour_22: hourly average PM2.5 from 22:00-22:59
- hour_23: hourly average PM2.5 from 23:00-23:59
File: ZCCS_2019_CO_hourly_CSV.csv
Description: hourly CO exposure in ppm for all participants in 2019 (baseline)
Variables
- HH ID: household ID
- Compound: neighborhood
- day_avg: average CO
- hour_0: hourly average CO from 00:00-00:59
- hour_1: hourly average CO from 01:00-01:59
- hour_2: hourly average CO from 02:00-02:59
- hour_3: hourly average CO from 03:00-03:59
- hour_4: hourly average CO from 04:00-04:59
- hour_5: hourly average CO from 05:00-05:59
- hour_6: hourly average CO from 06:00-06:59
- hour_7: hourly average CO from 07:00-07:59
- hour_8: hourly average CO from 08:00-08:59
- hour_9: hourly average CO from 09:00-09:59
- hour_10: hourly average CO from 10:00-10:59
- hour_11: hourly average CO from 11:00-11:59
- hour_12: hourly average CO from 12:00-12:59
- hour_13: hourly average CO from 13:00-13:59
- hour_14: hourly average CO from 14:00-14:59
- hour_15: hourly average CO from 15:00-15:59
- hour_16: hourly average CO from 16:00-16:59
- hour_17: hourly average CO from 17:00-17:59
- hour_18: hourly average CO from 18:00-18:59
- hour_19: hourly average CO from 19:00-19:59
- hour_20: hourly average CO from 20:00-20:59
- hour_21: hourly average CO from 21:00-21:59
- hour_22: hourly average CO from 22:00-22:59
- hour_23: hourly average CO from 23:00-23:59
File: ZCCS_2019_EM_temp_hourly_CSV.csv
Description: hourly temperature recorded by PM2.5 exposure monitors in Celsius for all participants in 2019 (baseline)
Variables
- HH ID: household ID
- Compound: neighborhood
- hour_0: hourly average temperature from 00:00-00:59
- hour_1: hourly average temperature from 01:00-01:59
- hour_2: hourly average temperature from 02:00-02:59
- hour_3: hourly average temperature from 03:00-03:59
- hour_4: hourly average temperature from 04:00-04:59
- hour_5: hourly average temperature from 05:00-05:59
- hour_6: hourly average temperature from 06:00-06:59
- hour_7: hourly average temperature from 07:00-07:59
- hour_8: hourly average temperature from 08:00-08:59
- hour_9: hourly average temperature from 09:00-09:59
- hour_10: hourly average temperature from 10:00-10:59
- hour_11: hourly average temperature from 11:00-11:59
- hour_12: hourly average temperature from 12:00-12:59
- hour_13: hourly average temperature from 13:00-13:59
- hour_14: hourly average temperature from 14:00-14:59
- hour_15: hourly average temperature from 15:00-15:59
- hour_16: hourly average temperature from 16:00-16:59
- hour_17: hourly average temperature from 17:00-17:59
- hour_18: hourly average temperature from 18:00-18:59
- hour_19: hourly average temperature from 19:00-19:59
- hour_20: hourly average temperature from 20:00-20:59
- hour_21: hourly average temperature from 21:00-21:59
- hour_22: hourly average temperature from 22:00-22:59
- hour_23: hourly average temperature from 23:00-23:59
File: ZCCS_2019_2021_combined_EM_CSV.csv
Description: baseline (2019) and endline (2021) data matched for each participant; used for Figure 5.
Variables
- HH ID: household ID
- Compound: neighborhood
- new_HH: whether household was a panel (0) or new household added at endline (1)
- base CO: 24 hour average CO in ppm at baseline
- base PM: 24 hour average PM2.5 in μgm-3 at baseline
- end CO: 24 hour average CO in ppm at endline
- end PM: 24 hour average PM2.5 in μgm-3 at endline
- end EM temp: 24 hour average temperature measured by exposure monitor in °C at endline
- TOT_lvl: stove use status (kept using original, started using intervention, etc.)
- prim_stove_base: primary stove used at baseline
- sec_stove_base: secondary stove used at baseline
- stove_ind_base_grp: stove usage group at baseline: traditional charcoal, improved charcoal (excluding EcoZoom), charcoal + electric (primary traditional/improved charcoal, secondary electric), electric + charcoal (primary electric, secondary traditional/improved charcoal), electric exclusive, Mimi Moto primary or secondary (either primary or secondary use regardless of other stove), EcoZoom primary or secondary, and other (gas, wood, kerosene)
- prim_stove_end: primary stove used at endline
- sec_stove_end: secondary stove used at endline
- stove_ind_end_grp: stove usage group at endline: traditional charcoal, improved charcoal (excluding EcoZoom), charcoal + electric (primary traditional/improved charcoal, secondary electric), electric + charcoal (primary electric, secondary traditional/improved charcoal), electric exclusive, Mimi Moto primary or secondary (either primary or secondary use regardless of other stove), EcoZoom primary or secondary, and other (gas, wood, kerosene)
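Since this file pairs baseline and endline values on one row per household, household-specific changes (endline minus baseline) can be computed directly. The sketch below does so for CO, skipping households that lack either measurement, such as new households added at endline. The rows are made up for illustration, and the median is just one possible summary, not necessarily the one reported in the manuscript.

```python
import csv
import io
from statistics import median

# Made-up rows using column names from the combined file.
SAMPLE = """HH ID,new_HH,base CO,end CO
1001,0,4.0,3.0
1002,0,2.0,2.5
1003,1,,5.0
1004,0,6.0,
"""


def parse(cell):
    """Return a float, or None for an empty or NaN cell."""
    text = (cell or "").strip()
    if text == "" or text.lower() == "nan":
        return None
    return float(text)


def paired_differences(rows, pollutant="CO"):
    """Endline minus baseline for households with both measurements."""
    diffs = {}
    for row in rows:
        base = parse(row[f"base {pollutant}"])
        end = parse(row[f"end {pollutant}"])
        if base is None or end is None:
            continue
        diffs[row["HH ID"]] = end - base
    return diffs


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
diffs = paired_differences(rows)
median_change = median(diffs.values())
print(diffs, median_change)
```

Passing `pollutant="PM"` would use the `base PM` and `end PM` columns instead.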
File: CEEEZ_B_Primary.csv
Description: raw data from Lusaka CEEEZ PurpleAir sensor B from October to November 2021.
Variables
- created_at: datetime in UTC
- PM2.5_CF1_ug/m3: ambient PM2.5 measurement in μgm-3
File: CEEEZ_Primary.csv
Description: raw data from Lusaka CEEEZ PurpleAir sensor A from October to November 2021.
Variables
- created_at: datetime in UTC
- PM2.5_CF1_ug/m3: ambient PM2.5 measurement in μgm-3
- Temperature_F: temperature in Fahrenheit
- Humidity_%: relative humidity in %
File: Kabwe_1_B_Primary.csv
Description: raw data from Kabwe 1 PurpleAir sensor B (no longer available) from July to August 2019.
Variables
- created_at: datetime in UTC
- PM2.5_CF1_ug/m3: ambient PM2.5 measurement in μgm-3
File: Kabwe_1_Primary.csv
Description: raw data from Kabwe 1 PurpleAir sensor A (no longer available) from July to August 2019.
Variables
- created_at: datetime in UTC
- PM2.5_CF1_ug/m3: ambient PM2.5 measurement in μgm-3
- Temperature_F: temperature in Fahrenheit
- Humidity_%: relative humidity in %
File: Kabwe_2_B_Primary.csv
Description: raw data from Kabwe 2 PurpleAir sensor B (no longer available) from July to August 2019.
Variables
- created_at: datetime in UTC
- PM2.5_CF1_ug/m3: ambient PM2.5 measurement in μgm-3
File: Kabwe_2_Primary.csv
Description: raw data from Kabwe 2 PurpleAir sensor A (no longer available) from July to August 2019.
Variables
- created_at: datetime in UTC
- PM2.5_CF1_ug/m3: ambient PM2.5 measurement in μgm-3
- Temperature_F: temperature in Fahrenheit
- Humidity_%: relative humidity in %
File: SupaMoto_B_Primary.csv
Description: raw data from Lusaka SupaMoto PurpleAir sensor B from October to November 2021.
Variables
- created_at: datetime in UTC
- PM2.5_CF1_ug/m3: ambient PM2.5 measurement in μgm-3
File: SupaMoto_Primary.csv
Description: raw data from Lusaka SupaMoto PurpleAir sensor A from October to November 2021.
Variables
- created_at: datetime in UTC
- PM2.5_CF1_ug/m3: ambient PM2.5 measurement in μgm-3
- Temperature_F: temperature in Fahrenheit
- Humidity_%: relative humidity in %
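The PurpleAir files report timestamps in UTC and temperature in Fahrenheit, while the exposure files use local clock hours and Celsius. The sketch below converts timestamps to Zambian local time (UTC+2 all year), bins PM2.5 by local hour, averages channels A and B, and converts temperature. Several points are assumptions: the `created_at` text format, the sample rows (made up), and the simple A/B average, which is not the correction procedure applied in the authors' notebook.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

CAT = timezone(timedelta(hours=2), "CAT")  # Central Africa Time, no DST
PM_COL = "PM2.5_CF1_ug/m3"

# Made-up rows; the timestamp format is an assumption about the raw files.
SENSOR_A = """created_at,PM2.5_CF1_ug/m3,Temperature_F,Humidity_%
2021-10-01 22:10:00 UTC,20.0,77.0,30
2021-10-01 22:40:00 UTC,30.0,77.0,30
"""
SENSOR_B = """created_at,PM2.5_CF1_ug/m3
2021-10-01 22:12:00 UTC,24.0
2021-10-01 22:42:00 UTC,30.0
"""


def parse_utc(text, fmt="%Y-%m-%d %H:%M:%S UTC"):
    """Parse a created_at string as a timezone-aware UTC datetime."""
    return datetime.strptime(text.strip(), fmt).replace(tzinfo=timezone.utc)


def hourly_pm(csv_text):
    """Mean PM2.5 per local clock hour for one sensor channel."""
    buckets = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        local = parse_utc(row["created_at"]).astimezone(CAT)
        buckets[local.replace(minute=0, second=0)].append(float(row[PM_COL]))
    return {hour: mean(values) for hour, values in buckets.items()}


def fahrenheit_to_celsius(temp_f):
    return (temp_f - 32.0) * 5.0 / 9.0


hourly_a = hourly_pm(SENSOR_A)
hourly_b = hourly_pm(SENSOR_B)
# Simple A/B average for hours present in both channels (illustrative only).
combined = {
    hour: mean([hourly_a[hour], hourly_b[hour]])
    for hour in hourly_a.keys() & hourly_b.keys()
}

for hour, pm in sorted(combined.items()):
    print(hour.isoformat(), pm)
print(fahrenheit_to_celsius(77.0))
```

Note that 22:00 UTC falls on the next calendar day in local time, which matters when matching ambient data to the `hour_0` through `hour_23` columns of the exposure files.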
Code/software
All data is available in CSVs. All data processing, analysis, and visualization were completed in a Jupyter Notebook (https://zenodo.org/doi/10.5281/zenodo.13245015).
This notebook was built under notebook version 6.0.3 and Python version 3.7.6. It includes all analyses of the raw CSV files and the code for all figures and tables in the manuscript. Last updated Mar 24, 2025. The notebook includes the following analysis sections, with a Table of Contents included for ease of use:
1. loading in various packages for analysis.
2. loading in CSVs containing CO and PM2.5 exposure hourly diurnal averages for baseline (2019) and endline (2021); figures comparing them.
3. loading in CSVs containing cook/household information, 24-hour average CO and PM2.5 for baseline and endline; various figures comparing them.
4. loading in CSVs from 4 PurpleAir sensors; correcting and averaging raw PurpleAir data; comparing exposure and temperature measurements to PurpleAir data.
5. comparing baseline to endline exposure data.
6. applying multiple statistical modeling methods to analyze the impact of the two intervention stoves on cooks’ CO and PM2.5 exposures, including hypothesis testing, difference-in-differences, and cross-sectional generalized least squares modeling.
7. loading in a CSV of household and primary cook characteristics during baseline for participant comparison.
8. loading in CSVs for filter corrections of real-time particle exposure sensors during endline.
9. repeating the statistical modeling methods from section 6 to include stove-influenced hours and to consider average SI CO and average SI PM2.5 as dependent variables in the models.
Code from the following website was referenced for the propensity score matching used for difference-in-differences analyses: https://www.r-bloggers.com/2022/04/propensity-score-matching/
Data:
We conducted two stove intervention trials in Lusaka, Zambia using targeted marketing and incentives to motivate traditional biomass stove users to switch to one of two improved biomass stoves, either the pellet-burning Mimi Moto or the charcoal-burning EcoZoom. Exposure data for this study were collected from primary cooks in Lusaka, Zambia in 2019 and 2021. Ambient data were collected from PurpleAir PM2.5 sensors: at baseline, from sensors that were available online at the time of the study (no longer available); at endline, from sensors installed by researchers on the project. Raw and processed data are included in CSVs. Refer to the descriptions below for what is included in each CSV file. Note that household ID (HHID) is used to differentiate households/primary cooks and to match exposure data to questionnaire responses.
- Hourly exposure data:
- ZCCS_2019_CO_hourly_CSV - hourly CO exposure in ppm for all participants in 2019 where hour_0 = 00:00, hour_1 = 01:00, etc.
- ZCCS_2019_PM_hourly_CSV - hourly PM2.5 exposure in μg/m³ for all participants in 2019 where hour_0 = 00:00, hour_1 = 01:00, etc.
- ZCCS_2019_EM_temp_hourly_CSV - hourly temperature recorded by PM2.5 exposure monitors in °C for all participants in 2019 where hour_0 = 00:00, hour_1 = 01:00, etc.
- ZCCS_2021_CO_hourly_CSV - hourly CO exposure in ppm for all participants in 2021 where hour_0 = 00:00, hour_1 = 01:00, etc.
- ZCCS_2021_PM_hourly_CSV - hourly PM2.5 exposure in μg/m³ for all participants in 2021 where hour_0 = 00:00, hour_1 = 01:00, etc.
- ZCCS_2021_EM_temp_hourly_CSV - hourly temperature recorded by PM2.5 exposure monitors in °C for all participants in 2021 where hour_0 = 00:00, hour_1 = 01:00, etc.
- Processed exposure data, participant questionnaire results:
- ZCCS_2019_ODK_HH_char_CSV - compilation of household and primary cook characteristics from questionnaires for 2019 participants used in Table 2 (e.g., primary cook education level, cook gender)
- ZCCS_2019_2021_combined_EM_CSV - baseline (2019) and endline (2021) data matched for each participant; used for Figure 5.
- ZCCS_2019_2021_Cross_Sectional_All_EM_CSV - compilation of all averaged exposure data and questionnaire data where participants for each year are treated as individual entries (as opposed to the same entry as in ZCCS_2019_2021_combined_EM_CSV); used for statistical analyses (Tables 3-4) and Figure 6.
- PurpleAir raw data files:
- Kabwe_1_Primary - raw data from Kabwe 1 PurpleAir sensor A (no longer available) from July to August 2019.
- Kabwe_1_B_Primary - raw data from Kabwe 1 PurpleAir sensor B (no longer available) from July to August 2019.
- Kabwe_2_Primary - raw data from Kabwe 2 PurpleAir sensor A (no longer available) from July to August 2019.
- Kabwe_2_B_Primary - raw data from Kabwe 2 PurpleAir sensor B (no longer available) from July to August 2019.
- CEEEZ_Primary - raw data from Lusaka CEEEZ PurpleAir sensor A from October to November 2021.
- CEEEZ_B_Primary - raw data from Lusaka CEEEZ PurpleAir sensor B from October to November 2021.
- SupaMoto_Primary - raw data from Lusaka SupaMoto PurpleAir sensor A from October to November 2021.
- SupaMoto_B_Primary - raw data from Lusaka SupaMoto PurpleAir sensor B from October to November 2021.
- Stove-influenced (SI) hours:
- ZCCS_2019_CO_SIperiods_7_20_2 - hourly mask variables for each HHID in 2019, where 0 means the primary cook was not experiencing 'stove-influenced' concentrations during that hour and 1 means they were; SI periods were estimated using CO exposure concentrations, n=7, alpha=20, and beta=2
- ZCCS_2021_CO_SIperiods_7_20_2 - hourly mask variables for each HHID in 2021, where 0 means the primary cook was not experiencing 'stove-influenced' concentrations during that hour and 1 means they were; SI periods were estimated using CO exposure concentrations, n=7, alpha=20, and beta=2
- ZCCS_2021_PM_SIperiods_7_20_2 - hourly mask variables for each HHID in 2021, where 0 means the primary cook was not experiencing 'stove-influenced' concentrations during that hour and 1 means they were; SI periods were estimated using PM2.5 exposure concentrations, n=7, alpha=20, and beta=2
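The hourly exposure files and the SI masks can be combined per household: pairing the hourly values with the 0/1 mask selects the stove-influenced hours. The following sketch, on made-up values, computes a 24-hour mean and an SI-hours mean for one household. It assumes 24 columns named `hour_0` through `hour_23` in both files and that missing hours are blank; the mask column naming is not stated above, so treat it as an assumption. The helper names are hypothetical.

```python
from statistics import mean

HOUR_COLUMNS = [f"hour_{h}" for h in range(24)]

def hourly_values(row):
    """Return the 24 hourly values of one CSV row; blanks become None."""
    return [float(row[c]) if row.get(c) not in (None, "") else None
            for c in HOUR_COLUMNS]

def daily_mean(values):
    """Mean over all hours that have a measurement."""
    present = [v for v in values if v is not None]
    return mean(present) if present else None

def si_mean(values, mask):
    """Mean over hours flagged 1 in the stove-influenced mask."""
    selected = [v for v, m in zip(values, mask)
                if v is not None and m == 1]
    return mean(selected) if selected else None

# Made-up household: 2 ppm background, 10 ppm during two cooking periods.
co_row = {c: "2.0" for c in HOUR_COLUMNS}
mask_row = {c: "0" for c in HOUR_COLUMNS}
for h in (7, 8, 18, 19):
    co_row[f"hour_{h}"] = "10.0"
    mask_row[f"hour_{h}"] = "1"

co = hourly_values(co_row)
mask = [int(float(mask_row[c])) for c in HOUR_COLUMNS]
print(daily_mean(co), si_mean(co, mask))
```

With real files, the rows would come from `csv.DictReader` and would be matched between the exposure file and the mask file on HHID before calling `si_mean`.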
PurpleAir raw data was corrected in the Jupyter Notebook file using the following literature corrections:
- McFarlane, C., Isevulambire, P.K., Lumbuenamo, R.S., Ndinga, A.M.E., Dhammapala, R., Jin, X., McNeill, V.F., Malings, C., Subramanian, R., Westervelt, D.M., 2021a. First Measurements of Ambient PM2.5 in Kinshasa, Democratic Republic of Congo and Brazzaville, Republic of Congo Using Field-calibrated Low-cost Sensors. Aerosol Air Qual. Res. 21, 200619. https://doi.org/10.4209/aaqr.200619
- McFarlane, C., Raheja, G., Malings, C., Appoh, E.K.E., Hughes, A.F., Westervelt, D.M., 2021. Application of Gaussian Mixture Regression for the Correction of Low Cost PM2.5 Monitoring Data in Accra, Ghana. ACS Earth Space Chem. 5, 2268–2279. https://doi.org/10.1021/acsearthspacechem.1c00217
- Barkjohn, K.K., Gantt, B., Clements, A.L., 2021. Development and application of a United States-wide correction for PM2.5 data collected with the PurpleAir sensor. Atmospheric Measurement Techniques 14, 4617–4637. https://doi.org/10.5194/amt-14-4617-2021
- Holder, A.L., Mebust, A.K., Maghran, L.A., McGown, M.R., Stewart, K.E., Vallano, D.M., Elleman, R.A., Baker, K.R., 2020. Field Evaluation of Low-Cost Particulate Matter Sensors for Measuring Wildfire Smoke. Sensors 20, 4796. https://doi.org/10.3390/s20174796
- Magi, B.I., Cupini, C., Francis, J., Green, M., Hauser, C., 2020. Evaluation of PM2.5 measured in an urban setting using a low-cost optical particle counter and a Federal Equivalent Method Beta Attenuation Monitor. Aerosol Science and Technology 54, 147–159. https://doi.org/10.1080/02786826.2019.1619915
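To illustrate the general shape of such a correction, the sketch below averages the A and B sensor readings and applies the US-wide linear correction published by Barkjohn et al. (2021), PM2.5 = 0.524 × PA_cf1 − 0.0862 × RH + 5.75. It is only an example: it does not reproduce which of the corrections listed above the notebook applied to which sensor, nor any A/B agreement screening. The function names are hypothetical and the input values are made up.

```python
def average_channels(pm_a, pm_b):
    """Average the PM2.5 CF=1 readings (ug/m3) from sensors A and B."""
    return (pm_a + pm_b) / 2.0

def barkjohn_correction(pm_cf1, rh_percent):
    """Apply the Barkjohn et al. (2021) US-wide PurpleAir correction.

    pm_cf1: A/B-averaged PM2.5_CF1_ug/m3 value
    rh_percent: relative humidity in % (the Humidity_% column)
    """
    return 0.524 * pm_cf1 - 0.0862 * rh_percent + 5.75

# Made-up readings for one timestamp.
pm_avg = average_channels(48.0, 52.0)
corrected = barkjohn_correction(pm_avg, rh_percent=30.0)
print(round(corrected, 2))
```

The other cited corrections use different functional forms and coefficients, so this function should not be substituted for them.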
Software:
Data processing, analysis, and visualization were completed in a Jupyter Notebook available on GitHub (https://github.com/stephanieparsons14/Impacts-of-improved-cookstove-interventions-on-personal-exposure-to-CO-and-PM-in-Zambia) and preserved in Zenodo (https://zenodo.org/doi/10.5281/zenodo.13245015).
