Data for: A behaviourally-informed chatbot increases vaccination in Argentina
Data files
Aug 01, 2024 version files, 5.59 MB
- 15_07_2024_FINAL_brown2023_analysis_code.Rmd (76.86 KB)
- 20230505_vaccinated_per_day.csv (12.21 KB)
- brown2024_analysis_data.rds (5.50 MB)
- README.md (4.49 KB)
Abstract
This repository contains the dataset used for the paper 'A behaviourally informed chatbot increases vaccination rates in Argentina more than a one-way reminder'. The data originate from administrative databases collected by the Ministry of Health of the Republic of Argentina and the Ministry of Health of the Province of Chaco, Argentina. The original data sources are (1) Nomivac: a complete dataset of all COVID-19 vaccinations across the country; (2) Pasaporte Chaco: the Chaco province's online services phone application; (3) Chaco's 0800 helpline: a database from a phone helpline established by the provincial Ministry of Health to address citizens' queries on COVID-19 vaccinations; and (4) SUMAR: the Argentinian public subsidised healthcare system. The data have been processed to anonymise identity numbers for public availability and to create variables suitable for regression analysis.
The paper evaluates an intervention involving a new chatbot service to test whether two-way interactive messaging outperforms one-way reminders. We conducted a pre-registered randomised controlled trial involving 249,705 participants in Argentina and recorded vaccinations using Ministry of Health data.
The repository includes two data files:
‘brown2024_analysis_data.rds’ contains individual-level data for all trial participants: basic demographic data, COVID-19 vaccination history, assignment to an experimental arm, a summary of chatbot interaction, and outcome measures (binary variables indicating whether the participant received their next COVID-19 vaccine dose within four weeks of entering the trial).
The data have been de-identified. Age is coded into ranges (18-29, 30-49, 50+) and gender is coded as a binary variable (0, 1).
‘20230505_vaccinated_per_day.csv’ records demand for COVID-19 vaccines in Chaco province from February 2021 to December 2022, measured as the daily total of COVID-19 vaccine doses delivered in the province. These data were sourced from Nomivac, a comprehensive dataset of all COVID-19 vaccinations nationwide.
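As an illustration of the age coding described for ‘brown2024_analysis_data.rds’, the following minimal Python sketch maps an age in years to the three published ranges. The function name `age_bin` is hypothetical, and the repository's own analysis code is written in R. The exact rule the authors used is not stated here, so the boundaries are read directly from the range labels, and the rejection of ages under 18 is an assumption.

```python
def age_bin(age_years: int) -> str:
    """Map an age in years to one of the published ranges (18-29, 30-49, 50+)."""
    if age_years < 18:
        # Assumption: the published ranges start at 18, so younger ages are rejected.
        raise ValueError("age below the lowest published range (18-29)")
    if age_years <= 29:
        return "18-29"
    if age_years <= 49:
        return "30-49"
    return "50+"


# Boundary ages, chosen only to show where each range starts and ends.
print([age_bin(a) for a in (18, 29, 30, 49, 50, 83)])
# ['18-29', '18-29', '30-49', '30-49', '50+', '50+']
```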
Description of the data and file structure
1. File “brown2024_analysis_data.rds”
- randomisation_allocation_name: Name of treatment group
- current_dose: Number of doses of COVID-19 vaccine received
- next_dose: Number of the next COVID-19 vaccine dose to be received
- random_number_date: Date of message delivery, assigned at random to individuals in the control group. See materials and methods for details
- outcome_latest_date: Date of most recent dose of COVID-19 vaccine received
- dose_X_date: Date of dose number X of the COVID-19 vaccine
- vaccinated_message_sent: Whether a message was successfully delivered to the individual
- session_current_level: Level of the message flow where the user stopped interacting with the chatbot
- session_current_branch: Branch within the level of the message flow where the user stopped interacting with the chatbot
- session_selected_hc: Option number chosen by the user when asked to select a health centre, out of a list of a maximum of 9 options
- session_reminder_date: The date for which the reminder message was programmed
- session_reminder_time: Time of day chosen by the user to get vaccinated
- session_initial_response: Users’ answer to the initial message sent by the chatbot
- session_reminder_sent: Whether a reminder message was successfully sent to the user
- history_message_code: Type of initial message sent to the user (“control” is the one-way message; “inicial” is the chatbot’s first message)
- history_created: Timestamp of the first message successfully delivered to the user
- history_timestamp_read: Timestamp of the first message read by the user
- message_date: Date on which each individual entered the trial. For individuals in the chatbot and one-way message reminder groups, it is the date that the message was sent. For the control group, it is the date that we randomly pre-assigned each participant to enter the trial (random_number_date). See materials and methods for details
- had_new_dose: Participant received an additional dose of the COVID-19 vaccine after being randomly assigned to the treatment arm
- had_new_dose_after_message: Participant received a vaccine within 4 weeks of entering the trial
- had_new_dose_after_message_2w: Participant received a vaccine within 2 weeks of entering the trial
- had_new_dose_after_message_6w: Participant received a vaccine within 6 weeks of entering the trial
- n_doses_after_entering_trial: Number of doses received within four weeks of entering the trial
- primary_outcome: had_new_dose as a factor variable
- age: Age range of participant
- gender_numeric: Gender of the participant, coded as 1 or 0 for de-identification purposes
- period_since_last_dose: Length of time between the first value of message_date in the whole sample and the last dose date before trial
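
The two-, four- and six-week outcome variables listed above all compare dose dates with message_date. The minimal Python sketch below shows one way such an indicator could be computed. It is not the authors' code (their analysis is in the R Markdown file), the function name `had_dose_within` and the example dates are hypothetical, and the window boundaries (strictly after the entry date, up to and including the last day of the window) are an assumption.

```python
from datetime import date, timedelta


def had_dose_within(message_date, dose_dates, weeks=4):
    """Return 1 if any dose falls within `weeks` weeks after message_date, else 0.

    Assumption: a dose counts if it is strictly after message_date and on or
    before message_date + weeks. Missing dose dates (None) are ignored.
    """
    window_end = message_date + timedelta(weeks=weeks)
    return int(any(
        d is not None and message_date < d <= window_end
        for d in dose_dates
    ))


# Hypothetical participant: entered the trial on 1 Sept 2022, with two earlier
# doses and one dose 19 days after entering.
entered = date(2022, 9, 1)
doses = [date(2021, 6, 10), date(2021, 8, 5), date(2022, 9, 20), None]

for weeks in (2, 4, 6):
    print(weeks, had_dose_within(entered, doses, weeks=weeks))
# 2 0
# 4 1
# 6 1
```

Under these assumptions, a dose 19 days after entry is missed by the two-week window but counted by the four- and six-week windows.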
2. File “20230505_vaccinated_per_day.csv”
- delivery_date: Date of vaccination
- count: Total number of residents of Chaco province who received a COVID-19 dose on a given day
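
Because the file has only these two columns, daily counts can be aggregated with the Python standard library alone. The sketch below sums doses by calendar month. The column names come from the description above, but the date format (ISO `YYYY-MM-DD`) is an assumption, the function name `monthly_totals` is hypothetical, and the three sample rows are invented purely for illustration and are not values from the dataset.

```python
import csv
import io
from collections import defaultdict


def monthly_totals(csv_file):
    """Sum the `count` column by calendar month of `delivery_date`."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Assumption: delivery_date is formatted as YYYY-MM-DD.
        month = row["delivery_date"][:7]
        totals[month] += int(row["count"])
    return dict(totals)


# Invented rows with the documented column names (not real data).
sample = io.StringIO(
    "delivery_date,count\n"
    "2021-02-01,120\n"
    "2021-02-02,80\n"
    "2021-03-01,50\n"
)
print(monthly_totals(sample))
# {'2021-02': 200, '2021-03': 50}

# With the real file, the call would look like:
# with open("20230505_vaccinated_per_day.csv", newline="") as f:
#     totals = monthly_totals(f)
```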
These are administrative data collected by Argentina's Ministry of Health. They comprise four constituent datasets: three phone number databases and Nomivac (briefly explained in the Data Sources section of the Materials and Methods). The data have been anonymised for public availability and processed to generate variables suitable for regression analysis.