Data from: Association of chronic obstructive pulmonary disease with risk of lung cancer in individuals aged 40 years and older: A cross-sectional study based on NHANES 2013-2018
Data files
Oct 09, 2024 version, files: 893.03 KB
COPD_lung_cancer_dataset.xlsx (890.45 KB)
README.md (2.59 KB)
Abstract
Background
It remains unclear whether chronic obstructive pulmonary disease (COPD) is an independent risk factor for lung cancer after accounting for confounding factors such as smoking, age, gender, body mass index (BMI), and comorbidities.
Methods
Data from 11,440 participants (≥ 40 years old) in the National Health and Nutrition Examination Survey (NHANES) 2013-2018 were analyzed. Weighted multivariable logistic regression models were used to assess the association between COPD and lung cancer risk. Subgroup analyses were based on age, gender, BMI, and smoking.
Results
This study included 660 COPD patients and 10,780 participants without COPD. The prevalence of lung cancer was significantly higher in COPD patients than in participants without COPD (3.39% vs 0.14%). After adjusting for confounding factors, COPD was associated with a significantly increased risk of lung cancer (OR, 12.24, 95% CI, 4.99-30.06, p < 0.001). This association remained significant in all subgroups, particularly in individuals aged > 65 years (OR, 20.05, 95% CI, 6.85-58.72, p < 0.001), smokers (OR, 19.38, 95% CI, 2.02-185.66, p = 0.010), males (OR, 17.39, 95% CI, 5.28-57.31, p < 0.001), individuals who quit smoking within 10 years (OR, 12.86, 95% CI, 2.59-63.99, p = 0.002), and individuals with a BMI > 25 kg/m2 (OR, 14.56, 95% CI, 3.88-54.69, p < 0.001).
Conclusions
COPD is an independent risk factor for lung cancer, especially in certain subgroups. The combination of COPD and smoking greatly amplifies the lung cancer risk. These findings highlight the importance of early lung cancer screening in COPD patients.
https://doi.org/10.5061/dryad.v15dv425f
Description of the data and file structure
File: COPD_lung_cancer_dataset.xlsx
Description: Variables in the dataset
Variables:
AGEC (%): Age, in two categories: ≤ 65 years old = 1; > 65 years old = 2.
GENDER (%): Gender, in two categories: male = 1; female = 2.
RACE (%): Race, in five categories: Non-Hispanic white = 0; Mexican American = 1; Other Hispanic = 2; Non-Hispanic black = 4; Other Race = 5.
BMIC (%): Body mass index (BMI), in three categories: ≤ 25 kg/m2 = 1; > 25 kg/m2 = 2; Unclassified = 999.
EDUC: Education level, in four categories: < High school = 1; High school = 2; > High school = 3; Unclassified = 999.
INC (%): Annual family income, in three categories: < 20000 USD = 1; ≥ 20000 USD = 2; Unclassified = 999.
ASTHMA (%): Asthma status, in three categories: yes = 1; no = 0; Unclassified = 999.
DIABETES (%): Diabetes status, in three categories: yes = 1; no = 0; Unclassified = 999.
CHF (%): Chronic heart failure (Congestive heart failure) status, in three categories: yes = 1; no = 0; Unclassified = 999.
CHD (%): Coronary heart disease status, in three categories: yes = 1; no = 0; Unclassified = 999.
STROKE (%): Stroke status, in three categories: yes = 1; no = 0; Unclassified = 999.
COPD (%): Chronic obstructive pulmonary disease status, in three categories: yes = 1; no = 0; Unclassified = 999.
LUNGCANCER (%): Lung cancer status, in three categories: yes = 1; no = 0; Unclassified = 999.
SMOKER (%): Smoking status, in three categories: smokers (yes) = 1; non-smokers (no) = 0; Unclassified = 999.
SMOKER QUITTIME (%): Smoking cessation duration, in three categories: ≤ 10 years = 0; > 10 years = 1; Unclassified = 999.
EMPHYSEMA (%): Emphysema status, in three categories: yes = 1; no = 0; Unclassified = 999.
CHRONICBRONCHITIS (%): Chronic bronchitis status, in three categories: yes = 1; no = 0; Unclassified = 999.
WT: Sampling weight (final weight). Because three NHANES cycles were combined, the six-year sample weight (WTMEC6YR) was calculated as WTMEC6YR = WTMEC2YR × 1/3, following NHANES recommendations.
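The coding scheme and the weight rescaling described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Stata code: the function names `decode_category` and `six_year_weight` are hypothetical, and the example row and weight value are made up. Only the 999 "Unclassified" code and the 1/3 factor come from the variable descriptions.

```python
# Hypothetical helpers; only the 999 code and the 1/3 factor come from the README.
UNCLASSIFIED = 999  # code used for "Unclassified" in the categorical variables


def decode_category(value):
    """Return None for the 'Unclassified' code 999, otherwise the code unchanged."""
    return None if value == UNCLASSIFIED else value


def six_year_weight(wtmec2yr, n_cycles=3):
    """Rescale a two-year weight for three combined cycles: WTMEC2YR x 1/3."""
    return wtmec2yr / n_cycles


# Made-up example row: COPD = yes, BMI unclassified, non-smoker.
example_row = {"COPD": 1, "BMIC": 999, "SMOKER": 0}
cleaned_row = {name: decode_category(code) for name, code in example_row.items()}

# Made-up two-year weight, rescaled to a six-year weight.
example_weight = six_year_weight(30000.0)
```

`decode_category` is meant for the categorical columns only. The WT column is a continuous weight and should not be passed through it.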
Code/software
Statistical analysis was performed using Stata software (version 17).
Sharing/Access information
The data analyzed in this study are from the National Health and Nutrition Examination Survey (NHANES). They are publicly available and can be downloaded from the NHANES website: http://www.cdc.gov/nchs/nhanes.htm
All indirectly identifiable categories were anonymized to protect the identity of participants.
We obtained data from the NHANES database website for the three cycles 2013-2014, 2015-2016, and 2017-2018. Data analysis, including baseline characteristic distribution, weighted multivariable logistic regression models, and subgroup analysis, was conducted using Stata/MP 17.0.
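As a rough illustration of how the WT column enters a weighted baseline distribution, the sketch below computes a weighted prevalence as the sum of weights among cases divided by the sum of weights among all classified participants. It is an assumption-laden sketch in Python, not the Stata procedure used for the study: the function name `weighted_prevalence` and the example values are hypothetical, and it gives a point estimate only, without the design-based standard errors a survey analysis would report.

```python
UNCLASSIFIED = 999  # "Unclassified" code from the variable descriptions


def weighted_prevalence(outcomes, weights):
    """Weighted share of outcome == 1, skipping records coded 999.

    outcomes: iterable of 0/1/999 codes (e.g. the LUNGCANCER column)
    weights:  iterable of sampling weights (e.g. the WT column)
    """
    weighted_cases = 0.0
    weighted_total = 0.0
    for outcome, weight in zip(outcomes, weights):
        if outcome == UNCLASSIFIED:
            continue  # unclassified records do not enter the estimate
        weighted_cases += weight * outcome
        weighted_total += weight
    if weighted_total == 0:
        raise ValueError("no classified records with positive total weight")
    return weighted_cases / weighted_total


# Made-up values: one case, two non-cases, one unclassified record.
example_outcomes = [1, 0, 0, 999]
example_weights = [100.0, 300.0, 600.0, 50.0]
example_prevalence = weighted_prevalence(example_outcomes, example_weights)
```

With these made-up values the estimate is 100 / 1000 = 0.1; the unclassified record and its weight are ignored.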