Culture media strongly influences primary human bronchial epithelial cells' transcriptomic response to ozone exposure
Data files
Jul 09, 2025 version files 8.45 MB
-
README.md
10.41 KB
-
TableS1_UNC_O3vsFA.xlsx
2.23 MB
-
TableS2_PNEU_O3vsFA.xlsx
2.22 MB
-
TableS3_FA_PNEUvsUNC.xlsx
2.21 MB
-
TableS4_UNC_enrichGO_Full.xlsx
234.48 KB
-
TableS5_UNC_enrichGO_Selected.xlsx
20.40 KB
-
TableS6_PNEUvsUNC_gseGO.xlsx
1.22 MB
-
TableS7_PNEUvsUNC_gseGO_PNEUonly.xlsx
17.27 KB
-
TableS8_PNEUvsUNC_gseGO_UNConly.xlsx
285.25 KB
Abstract
Exposure to the ambient air pollutant ozone induces acute and chronic respiratory health effects in part by causing inflammation of the airways. Several aspects of the inflammatory response to ozone can be modeled in vitro using primary human bronchial epithelial cells (HBECs) cultured at an air-liquid interface. We tested two commonly used HBEC culture media systems, one proprietary and one non-proprietary, to identify which system yielded the most in vivo-like pro-inflammatory response to acute ozone exposure as indicated by gene expression. Cells from six donors were grown in each culture system in parallel followed by examination of epithelial morphology and cell type proportions prior to ozone exposure. Cultures grown in the proprietary system were notably thicker and contained more ciliated and secretory cells, as well as internal cyst-like structures. The transcriptomic response to acute ozone exposure (0.5 parts per million ozone x 2 hours) was strongly affected by media. HBECs grown in the proprietary system exhibited minimal changes after ozone, with only 7 differentially expressed genes (DEGs). In contrast, HBECs grown in the non-proprietary system exhibited a more dynamic response with 128 DEGs, including hallmark response genes indicative of inflammation (CXCL8) and oxidative stress (HMOX1). Gene set enrichment analysis using the 128 DEGs further corroborated upregulation of oxidative stress and inflammation pathways. In total, our results indicate that the choice of HBEC culture media should be carefully considered to best model the in vivo response to ozone.
Dataset DOI: 10.5061/dryad.qfttdz0t4
Description of the data and file structure
Transcriptomic data for differential expression and gene set enrichment analysis to compare primary bronchial epithelial cell responses to ozone exposure when grown in different culture medias.
Files and variables
File: TableS1_UNC_O3vsFA.xlsx
Description: Full DESeq2 results from ozone vs filtered air-exposed UNC-grown cultures. Missing values indicated with blank cells.
Variables
- “hgnc_symbol” = HGNC approved symbol for the accompanying Ensembl ID.
- “baseMean” = Mean base expression level across all samples after DESeq2 library size normalization.
- “log2FoldChange” = log2 fold change calculated by DESeq2 for the given comparison.
- “lfcSE” = standard error of the log2 fold change values.
- “pvalue” = nominal p value calculated by DESeq2.
- “padj” = the Benjamini-Hochberg adjusted p value.
- “geneID” = the Ensembl gene ID with the Ensembl version suffix.
- “gene_biotype” = gene biotype as listed by Ensembl.
- “DE_status” = one of “Up”, “NS” (non-significant), or “Down”. To be up or down, the gene must pass an adjusted p value threshold of 0.05 and an absolute log2 fold change threshold of 0.58 (equivalent to a 1.5 fold change).
File: TableS2_PNEU_O3vsFA.xlsx
Description: Full DESeq2 results from ozone vs filtered air-exposed PneumaCult-grown cultures. Missing values indicated with blank cells.
Variables
- “hgnc_symbol” = HGNC approved symbol for the accompanying Ensembl ID.
- “baseMean” = Mean base expression level across all samples after DESeq2 library size normalization.
- “log2FoldChange” = log2 fold change calculated by DESeq2 for the given comparison.
- “lfcSE” = standard error of the log2 fold change values.
- “pvalue” = nominal p value calculated by DESeq2.
- “padj” = the Benjamini-Hochberg adjusted p value.
- “geneID” = the Ensembl gene ID with the Ensembl version suffix.
- “gene_biotype” = gene biotype as listed by Ensembl.
- “DE_status” = one of “Up”, “NS” (non-significant), or “Down”. To be up or down, the gene must pass an adjusted p value threshold of 0.05 and an absolute log2 fold change threshold of 0.58 (equivalent to a 1.5 fold change).
File: TableS3_FA_PNEUvsUNC.xlsx
Description: Full DESeq2 results from filtered air exposed cultures, PneumaCult-grown vs UNC-grown. Missing values indicated with blank cells.
Variables
- “hgnc_symbol” = HGNC approved symbol for the accompanying Ensembl ID.
- “baseMean” = Mean base expression level across all samples after DESeq2 library size normalization.
- “log2FoldChange” = log2 fold change calculated by DESeq2 for the given comparison.
- “lfcSE” = standard error of the log2 fold change values.
- “pvalue” = nominal p value calculated by DESeq2.
- “padj” = the Benjamini-Hochberg adjusted p value.
- “geneID” = the Ensembl gene ID with the Ensembl version suffix.
- “gene_biotype” = gene biotype as listed by Ensembl.
- “DE_status” = one of “Up”, “NS” (non-significant), or “Down”. To be up or down, the gene must pass an adjusted p value threshold of 0.05 and an absolute log2 fold change threshold of 0.58 (equivalent to a 1.5 fold change).
File: TableS4_UNC_enrichGO_Full.xlsx
Description: Full enrichGO results from UNC DEGs.
Variables
- “ID” = Gene Ontology ID for the gene set.
- “Description” = Full name of the gene set.
- “GeneRatio” = How many genes in the differentially expressed list that are observed in the gene set versus not observed.
- “BgRatio” = How many genes in the background gene list that are observed in the gene set versus not observed.
- “pvalue” = nominal p value for genes in that gene set being enriched in the differentially expressed list compared to the background list.
- “p.adjust” = Benjamini-Hochberg adjusted p value.
- “qvalue” = enrichGO calculated q value.
- “geneID” = HGNC gene name of genes in differentially expressed list found in the gene set.
- “Count” = count of genes in differentially expressed list found in the gene set.
File: TableS5_UNC_enrichGO_Selected.xlsx
Description: Selected gene sets from table 5 used to select genes for main figure 3C.
Variables
- “ID” = Gene Ontology ID for the gene set.
- “Description” = Full name of the gene set.
- “GeneRatio” = How many genes in the differentially expressed list that are observed in the gene set versus not observed.
- “BgRatio” = How many genes in the background gene list that are observed in the gene set versus not observed.
- “pvalue” = nominal p value for genes in that gene set being enriched in the differentially expressed list compared to the background list.
- “p.adjust” = Benjamini-Hochberg adjusted p value.
- “qvalue” = enrichGO calculated q value.
- “geneID” = HGNC gene name of genes in differentially expressed list found in the gene set.
- “Count” = count of genes in differentially expressed list found in the gene set.
File: TableS6_PNEUvsUNC_gseGO.xlsx
Description: Full gseGO results from filtered air-exposed PneumaCult vs UNC DEGs.
Variables
- “ID” = Gene Ontology ID for the gene set.
- “Description” = Full name of the gene set.
- “setSize” = Size of the gene set.
- “enrichmentScore” = enrichment score as calculated by gseGO, positive values denote activated gene sets (more genes have positive log2 fold changes and are in the top part of the ranked list), negative values denote suppressed gene sets (more genes have negative log2 fold changes and are in the bottom part of the ranked list).
- “NES” = normalized enrichment score.
- “pvalue” = nominal p value for genes in that gene set being enriched in the differentially expressed list compared to the background list.
- “p.adjust” = Benjamini-Hochberg adjusted p value.
- “qvalue” = gseGO calculated q value.
- “rank” = the position in the ranked gene list where the maximum enrichment score was found.
- “leading_edge” = three fields: tags - the percentage of genes in the gene set found in the ranked list before (if positive enrichment score) or after (if negative enrichment score) the peak rank score. list - the percentage of genes included or not included in the gene set found before (if positive enrichment score) or after (if negative enrichment score) the peak rank score. signal - enrichment signal strength, combines the previous two metrics.
- “core_enrichment” = the subset of genes contributing most to the enrichment score.
File: TableS7_PNEUvsUNC_gseGO_PNEUonly.xlsx
Description: Filtered version of table S6 showing gene sets enriched in PneumaCult-grown cultures.
Variables
- “ID” = Gene Ontology ID for the gene set.
- “Description” = Full name of the gene set.
- “setSize” = Size of the gene set.
- “enrichmentScore” = enrichment score as calculated by gseGO, positive values denote activated gene sets (more genes have positive log2 fold changes and are in the top part of the ranked list), negative values denote suppressed gene sets (more genes have negative log2 fold changes and are in the bottom part of the ranked list).
- “NES” = normalized enrichment score.
- “pvalue” = nominal p value for genes in that gene set being enriched in the differentially expressed list compared to the background list.
- “p.adjust” = Benjamini-Hochberg adjusted p value.
- “qvalue” = gseGO calculated q value.
- “rank” = the position in the ranked gene list where the maximum enrichment score was found.
- “leading_edge” = three fields: tags - the percentage of genes in the gene set found in the ranked list before (if positive enrichment score) or after (if negative enrichment score) the peak rank score. list - the percentage of genes included or not included in the gene set found before (if positive enrichment score) or after (if negative enrichment score) the peak rank score. signal - enrichment signal strength, combines the previous two metrics.
- “core_enrichment” = the subset of genes contributing most to the enrichment score.
File: TableS8_PNEUvsUNC_gseGO_UNConly.xlsx
Description: Filtered version of table 6 showing gene sets enriched in UNC-grown cultures.
Variables
- “ID” = Gene Ontology ID for the gene set.
- “Description” = Full name of the gene set.
- “setSize” = Size of the gene set.
- “enrichmentScore” = enrichment score as calculated by gseGO, positive values denote activated gene sets (more genes have positive log2 fold changes and are in the top part of the ranked list), negative values denote suppressed gene sets (more genes have negative log2 fold changes and are in the bottom part of the ranked list).
- “NES” = normalized enrichment score.
- “pvalue” = nominal p value for genes in that gene set being enriched in the differentially expressed list compared to the background list.
- “p.adjust” = Benjamini-Hochberg adjusted p value.
- “qvalue” = gseGO calculated q value.
- “rank” = the position in the ranked gene list where the maximum enrichment score was found.
- “leading_edge” = three fields: tags - the percentage of genes in the gene set found in the ranked list before (if positive enrichment score) or after (if negative enrichment score) the peak rank score. list - the percentage of genes included or not included in the gene set found before (if positive enrichment score) or after (if negative enrichment score) the peak rank score. signal - enrichment signal strength, combines the previous two metrics.
- “core_enrichment” = the subset of genes contributing most to the enrichment score.
Code/software
Raw read quality was checked using FastQC (v0.12.1). Residual adapter sequences were trimmed using cutadapt (v4.1). Reads were aligned to the human transcriptome (GRch38.p14, release 44) using Star (v2.7.11a) and quantified using Salmon (v1.10.2).
All downstream analyses were performed using R (v4.3.1). Salmon counts were imported using tximport. Genes without at least 10 counts across 6 samples were filtered out. Differential expression analysis was performed using DESeq2; genes were considered differentially expressed with an adj.p.val < 0.05 and an absolute log2FC > 0.58.
All processing and analysis commands can be found at https://github.com/Kelada-Lab/HBEC-Media-Comparison.