Data from: Co-selection of genetic antibiotic resistance in Streptococcus pneumoniae after repeated Azithromycin mass drug administrations in Niger
Data files
Jan 21, 2026 version files (186.26 MB total):
annotated_genomes.tgz (186.25 MB)
genomes_metadata_v1.3.csv (10.93 KB)
README.md (2.91 KB)
Abstract
We performed long-read whole-genome sequencing and phenotypic resistance analysis on Streptococcus pneumoniae isolated from the nasopharynx of Nigerien children from communities treated with either 6 twice-yearly azithromycin distributions or placebo. This dataset contains the annotated genome files for the 122 samples used in the study, as well as the associated metadata.
Dataset DOI: 10.5061/dryad.vt4b8gv5w
Description of the data and file structure
This repository contains the annotated genomes of Streptococcus pneumoniae, as well as associated metadata.
Isolated pneumococcal colonies were subjected to long-read whole-genome sequencing: libraries were prepared with the SMRTbell Prep Kit 3.0 and sequenced on the PacBio Revio platform (Pacific Biosciences of California). PacBio HiFi reads were de novo assembled using Canu (version 2.2) with the ‘-pacbio-hifi’ option, and the assemblies were evaluated for quality using BUSCO (version 5.7.1). Assemblies achieving a completeness score >90% underwent a ‘Comprehensive Genome Analysis’ using BV-BRC online tools (accessed April 2025). BV-BRC-annotated genomes were subsequently screened for mobile elements using ICEscreen (version 1.3.3). Annotated genomes were saved in GenBank format.
Files and variables
File: annotated_genomes.tgz
Description: Compressed archive of the annotated genomes (in GenBank format) used for this study. File names follow the convention Streptococcus_pneumoniae_sample_<lab_ID>_canu_BV-BRC-annotated.gb, where lab_ID is a unique identifier for each sample. To extract the archive, run tar xvzf annotated_genomes.tgz.
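For users who prefer to unpack the archive from a script, the following is a minimal Python sketch using only the standard library. The function names extract_genomes and lab_id_from_filename are hypothetical helpers introduced here for illustration, and the internal directory layout of the archive is an assumption (the search is recursive so that a top-level folder, if present, does not matter).

```python
import re
import tarfile
from pathlib import Path

# File-name convention taken from the description above.
FILENAME_PATTERN = re.compile(
    r"Streptococcus_pneumoniae_sample_(.+)_canu_BV-BRC-annotated\.gb"
)


def extract_genomes(archive_path, dest_dir):
    """Extract the .tgz archive into dest_dir and return the sorted .gb paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        # Use the safer "data" extraction filter where the Python version has it.
        if hasattr(tarfile, "data_filter"):
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
    # Assumption: the archive may or may not contain a top-level folder,
    # so search recursively for GenBank files.
    return sorted(dest.rglob("*.gb"))


def lab_id_from_filename(filename):
    """Return the lab_ID embedded in a genome file name, or None if it does not match."""
    match = FILENAME_PATTERN.fullmatch(Path(filename).name)
    return match.group(1) if match else None
```

A typical call would be extract_genomes("annotated_genomes.tgz", "genomes"), followed by lab_id_from_filename on each returned path to recover the sample identifier.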
File: genomes_metadata_v1.3.csv
Description: A comma-delimited text file containing metadata associated with the genomes.
Variables:
Lab_ID: a unique identifier for each sample
whg_code: a unique, de-identified code for the village the sample originated from (Dosso region of Niger)
group: treatment group (A: azithromycin, P: placebo)
beta_lactam: boolean indicator of genetic beta-lactam resistance status (0: not resistant, 1: resistant)
erythromycin: boolean indicator of genetic erythromycin resistance status (0: not resistant, 1: resistant)
trimethoprim_sulfamethoxazole: boolean indicator of genetic trimethoprim-sulfamethoxazole resistance status (0: not resistant, 1: resistant)
tetracycline: boolean indicator of genetic tetracycline resistance status (0: not resistant, 1: resistant)
erythromycin_arup: boolean indicator of phenotypic erythromycin resistance status (0: not resistant, 1: resistant)
betalactams_arup: boolean indicator of phenotypic beta-lactam resistance status (0: not resistant, 1: resistant)
trimethoprim_sulfamethoxazole_arup: boolean indicator of phenotypic trimethoprim-sulfamethoxazole resistance status (0: not resistant, 1: resistant)
tetracycline_arup: boolean indicator of phenotypic tetracycline resistance status (0: not resistant, 1: resistant)
genbank_filename: the filename of the associated Streptococcus pneumoniae genome in the annotated_genomes.tgz tar archive
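As an illustration of how the genetic and phenotypic columns pair up, the following Python sketch loads the metadata and counts, for each drug class, how many samples have matching genetic and phenotypic calls. It is not the analysis used in the study. The helper names load_metadata and concordance are hypothetical, and the sketch assumes that values are stored literally as 0 or 1 and that any other value (for example an empty cell) should be skipped.

```python
import csv

# Genetic column -> phenotypic column, using the variable names listed above.
PAIRS = {
    "beta_lactam": "betalactams_arup",
    "erythromycin": "erythromycin_arup",
    "trimethoprim_sulfamethoxazole": "trimethoprim_sulfamethoxazole_arup",
    "tetracycline": "tetracycline_arup",
}


def load_metadata(handle):
    """Read the metadata CSV from an open text handle into a list of dicts."""
    return list(csv.DictReader(handle))


def concordance(rows):
    """For each drug class, return (n_agree, n_compared).

    Assumption: values are the strings "0" or "1"; rows where either
    value is anything else are left out of the comparison.
    """
    result = {}
    for genetic, phenotypic in PAIRS.items():
        agree = 0
        compared = 0
        for row in rows:
            g = (row.get(genetic) or "").strip()
            p = (row.get(phenotypic) or "").strip()
            if g in {"0", "1"} and p in {"0", "1"}:
                compared += 1
                if g == p:
                    agree += 1
        result[genetic] = (agree, compared)
    return result
```

To run it on the real file, open genomes_metadata_v1.3.csv with open(path, newline="") and pass the handle to load_metadata, then pass the resulting rows to concordance.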
