Data from: Antimicrobial resistance genes and antibiotic use in chronic lung disease: A bronchoscopy study of the lower airways microbiome
Data files
May 15, 2026 version files 190.45 KB
- ARGdata_BronchoscopyStudy_11May26.csv (172.05 KB)
- README.md (3.81 KB)
- Stata_Syntax_28mar26.txt (14.59 KB)
Abstract
Background: Antimicrobial resistance genes (ARGs) in the respiratory microbiome are poorly characterized. We compared the presence of ARGs in healthy controls with chronic lung disease patients in a cross-sectional study, adjusted for time since antibiotic use.
Methods: Bronchoalveolar lavage was collected from 100 controls, 93 COPD, 13 asthma, 34 sarcoidosis, 12 idiopathic pulmonary fibrosis (IPF) patients, and 11 patients with unclassifiable interstitial lung disease (uILD). Participants had not used antibiotics for 14 days prior to sampling. Shotgun metagenomic sequencing was performed with Illumina NovaSeq. ARGs were identified using the National Database of Antibiotic-Resistant Organisms. Sample reads were normalized to counts per million.
Results: In total, 38% of controls had at least one ARG, compared with 51%, 39%, 65%, and 83% of COPD, asthma, sarcoidosis, and IPF patients, respectively (p=0.01). ARGs against tetracycline (33%) were the most common ARG class, followed by beta-lactam and macrolide resistance (both 26%). In a logistic regression analysis adjusted for sex, age, body composition, smoking, and antibiotic use, the OR (95% CI) for having ARGs in the lower airways was 1.30 (0.70-2.41) in COPD, 1.00 (0.29-3.52) in asthma, 3.52 (1.40-8.83) in sarcoidosis, 6.40 (1.25-32.73) in IPF, and 3.27 (0.76-14.16) in uILD compared with controls. Overall mean (SD) ARG counts per million were 403.8 (537.7) in the 35 subjects who had used antibiotics ≤ 3 months before bronchoscopy, compared with 197.6 (355.9) in the 228 subjects who had not (p=0.02).
Conclusion: The presence of ARGs in the lower airways microbiome was significantly higher in patients with sarcoidosis and IPF than in controls. ARG counts per million were significantly associated with recent antibiotic use.
Dataset DOI: 10.5061/dryad.g1jwstr48
Description of the data and file structure
The file ARGdata_BronchoscopyStudy_11May26.csv contains all necessary data.
There are 289 variables for 263 subjects; each subject has exactly one sample.
Data created by Tomas Eagan on May 11, 2026.
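The stated dimensions (263 subjects, 289 variables) can be checked after download. The following is a minimal sketch using only the Python standard library. It assumes the file is comma-delimited with a single header row, which this description does not state explicitly. The helper name `check_dimensions` and the small in-memory demo table are illustrative only.

```python
import csv
import io

EXPECTED_ROWS = 263  # subjects, as stated in this description
EXPECTED_COLS = 289  # variables, as stated in this description

def check_dimensions(handle):
    """Return (n_rows, n_cols) for a CSV with one header row."""
    reader = csv.reader(handle)
    header = next(reader)
    n_rows = sum(1 for row in reader if row)  # blank lines are skipped
    return n_rows, len(header)

# Tiny in-memory stand-in so the sketch runs without the real file.
demo = io.StringIO("haveARG,diagnosis\n1,2\n0,1\n")
demo_shape = check_dimensions(demo)
print(demo_shape)

# With the real file (assumed to be in the working directory):
# with open("ARGdata_BronchoscopyStudy_11May26.csv", newline="") as f:
#     assert check_dimensions(f) == (EXPECTED_ROWS, EXPECTED_COLS)
```

If the real file reports a different shape, the delimiter assumption is the first thing to check.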
Files and variables
File: ARGdata_BronchoscopyStudy_11May26.csv
Description: Main data table where each row is one BAL sequenced sample, and each column a unique variable
Variables
- A0A081PMP8: All variables from A0A081PMP8 through X2KYR0 are sequences identified by UniProt ID, normalized to counts per million
- SampleReads_total: Numeric; sum of all counts per row
- log_SampleReads_total: Natural log of SampleReads_total, transformed for linear regression analyses
- TETRAclass_sum: Sum of counts per million for sequences assigned to tetracycline class ARGs
- BETA_LACTAMclass_sum: Sum of counts per million for sequences assigned to beta-lactam class ARGs
- MACROLIDEclass_sum: Sum of counts per million for sequences assigned to macrolide class ARGs
- TETRAclass_2cat: Presence (=1)/absence (=0) of any sequence assigned to tetracycline class ARGs
- BETA_LACTAMclass_2cat: Presence (=1)/absence (=0) of any sequence assigned to beta-lactam class ARGs
- MACROLIDEclass_2cat: Presence (=1)/absence (=0) of any sequence assigned to macrolide class ARGs
- haveARG: Any ARGs found, yes (=1) or no (=0)
- log_SampleReads_BETA: Natural log of SampleReads_total_BETA (total counts per million for beta-lactam class ARGs per row, i.e. per subject), transformed for linear regression analyses
- log_SampleReads_TETRA: Natural log of SampleReads_total_TETRA (total counts per million for tetracycline class ARGs per row, i.e. per subject), transformed for linear regression analyses
- log_SampleReads_MACRO: Natural log of SampleReads_total_MACRO (total counts per million for macrolide class ARGs per row, i.e. per subject), transformed for linear regression analyses
- sex: Sex in 0 & 1, not labelled for anonymization
- age_categories: Age in 2 categories (0= < 65 yrs, 1 = > 65 yrs)
- body_comp: Body composition in three categories (1= Normal, 2= Cachectic, 3= Obese)
- current_smoking: Current smoking (0 = former or never, 1 = daily)
- antibiotics_3months: Used antibiotics the last 3 months (0=no, 1=yes)
- antibiotics_1yr: Used antibiotics the last year (0=no, 1=yes)
- antibiotics_5yrs: Used antibiotics the last 5 yrs (0=no, 1=yes)
- cops_ex: Moderate or severe COPD exacerbation the last 12 months (0=no, 1=yes)
- gold_2cat: COPD severity by GOLD stage in two categories (0= I/II, 1= III/IV)
- inhalsteroid: Use of inhaled steroids in COPD patients (no/yes)
- diagnosis: Diagnoses (1=Control, 2=COPD, 3=Asthma, 4=Sarcoidosis, 5=IPF, 6=uILD)
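The derived variables in the list above (class sums, presence/absence flags, and natural-log transforms) follow a simple pattern, sketched below in Python. This is an illustration, not the authors' Stata code. The mapping of UniProt IDs to ARG classes is not given in this description, so `TETRA_IDS` is a hypothetical placeholder. Returning `None` for a zero count is also an assumption, since the description does not say how zeros were handled before the log transform.

```python
import math

# Hypothetical class assignment, for illustration only: this description
# does not list which UniProt ID columns belong to which ARG class.
TETRA_IDS = ["A0A081PMP8", "X2KYR0"]

def class_sum(row, ids):
    """Sum counts per million over the columns assigned to one ARG class."""
    return sum(float(row[i]) for i in ids)

def two_cat(total):
    """Presence (1) / absence (0) coding, as in the *_2cat variables."""
    return 1 if total > 0 else 0

def natural_log(total):
    """Natural log of a total; None for zero counts (an assumption)."""
    return math.log(total) if total > 0 else None

# One made-up row, with values as strings as they would come from a CSV reader.
row = {"A0A081PMP8": "12.5", "X2KYR0": "0"}
tetra_sum = class_sum(row, TETRA_IDS)
print(tetra_sum, two_cat(tetra_sum), natural_log(tetra_sum))
```

Comparing values recomputed this way with the provided `*_sum` and `*_2cat` columns is one way to confirm a class mapping once it is known.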
File: Stata_Syntax_28mar26.txt
Description: Copy of the Stata syntax file used for the analyses presented in the paper, provided for reference only
Variables
- NA
Code/software
Analyses for the published paper were performed in Stata 18, which is commercial software. The exported CSV file provided here can be opened in Excel or R.
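The CSV can also be read with other general-purpose tools. As a hedged sketch, the Python snippet below tabulates the fraction of subjects with at least one ARG per diagnosis group, using the `haveARG` and `diagnosis` columns and the diagnosis codes documented above. The function name `arg_prevalence` and the three-row demo table are illustrative, and a comma-delimited file with a header row is assumed.

```python
import csv
import io
from collections import defaultdict

# Diagnosis codes as documented in the variable list.
DIAGNOSIS = {1: "Control", 2: "COPD", 3: "Asthma",
             4: "Sarcoidosis", 5: "IPF", 6: "uILD"}

def arg_prevalence(handle):
    """Return {diagnosis label: fraction of rows with haveARG == 1}."""
    counts = defaultdict(lambda: [0, 0])  # label -> [with ARG, total]
    for row in csv.DictReader(handle):
        # int(float(...)) tolerates codes exported as "1" or "1.0".
        label = DIAGNOSIS[int(float(row["diagnosis"]))]
        counts[label][0] += int(float(row["haveARG"]))
        counts[label][1] += 1
    return {label: with_arg / total
            for label, (with_arg, total) in counts.items()}

# Made-up demo rows; replace with an open() handle on the real file.
demo = io.StringIO("haveARG,diagnosis\n1,1\n0,1\n1,5\n")
demo_prevalence = arg_prevalence(demo)
print(demo_prevalence)  # {'Control': 0.5, 'IPF': 1.0}
```

These are unadjusted proportions only; the adjusted odds ratios in the abstract come from the logistic regression in the Stata syntax file.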
Access information
Other publicly accessible locations of the data:
- NA
Data was derived from the following sources:
- NA
Human subjects data
All subjects gave informed consent to participate, including publication of the collected data in anonymized form. The data file contains no ID numbers and cannot be linked to the original study subjects. Age is given in 10 year categories, and sex is provided only as a numeric code without labels.
