Data from: Dynamics of the CD9 interactome during bacterial infection of epithelial cells by proximity labelling proteomics

Wolverson, Paige1; Fernandes Parreira, Isabel1; Thompson, Ruth1; Collins, Mark1; Shaw, Jonathan1; Green, Luke1

Published Oct 21, 2025 on Dryad. https://doi.org/10.5061/dryad.m905qfvfc

Data files

Oct 21, 2025 version files (41.52 KB total):

CellVis_Enrichment_240_mins.csv (14.10 KB)
CellVis_Enrichment_30_mins.csv (9.36 KB)
CellVis_Enrichment_60_mins.csv (12.84 KB)
README.md (5.22 KB)

Abstract

Bacterial species utilise different receptors at the cell membrane to adhere to cells. Previously, we demonstrated that interference with CD9, a human tetraspanin, reduces adherence of multiple species of bacteria to cells. CD9 is not a receptor but organises numerous commandeered host proteins at the cell membrane; however, the full interactome has not yet been delineated. Using a CD9 proximity labelling model, a first for CD9, we observed a diverse interactome, with 710 enriched proteins in uninfected cells. Proximal proteins were associated with various cellular processes, including extracellular matrix (ECM)–receptor interactions and tight junctions. Several known bacterial receptors were also detected, including CD44, CD46, and CD147. The interactome was dynamic during infection with two distinct bacterial species, Neisseria meningitidis and Staphylococcus aureus. In total, 12 human proteins were enriched during meningococcal infection, compared to one during staphylococcal infection, demonstrating different host factor requirements during CD9-mediated bacterial adherence. CD44 or CD147 knockdown reduced staphylococcal and meningococcal adherence, respectively, but not vice versa. However, in combination with CD9 interference, no additive effects were observed, demonstrating association of these proteins during infection. We have developed a tool that measures changes within the CD9 interactome, demonstrated CD9 as a universal organiser of bacterial ‘adhesion platforms’, and shown efficacy of a disrupting CD9-derived peptide.

Dataset DOI: 10.5061/dryad.m905qfvfc

Description of the data and file structure

Data S2. Cell compartment and cellular pathway analysis of identified proteins. The data files contain analyses of the significantly enriched proteins identified by mass spectrometry at 30, 60 and 240 minutes. Data files from SubcellulaRVis provide the cellular compartment analysis of the enriched proteins. Data files from WebGestalt provide the KEGG pathway analysis of the significantly enriched proteins.

Files and variables

File: CellVis_Enrichment_30_mins.csv

Description: Cellular compartment analysis from SubcellulaRVis after 30 mins.

File: CellVis_Enrichment_60_mins.csv

Description: Cellular compartment analysis from SubcellulaRVis after 60 mins.

File: CellVis_Enrichment_240_mins.csv

Description: Cellular compartment analysis from SubcellulaRVis after 240 mins.

Compartment: The cellular localization category based on Gene Ontology (GO) Cellular Component terms.
p: p value associated with the enrichment of genes within the GO annotated pathway
FDR: False discovery rate associated with genes enriched within the GO annotated pathway
Significant: Whether the compartment passes the statistical significance threshold
n: The number of genes from the input list that are associated with that cellular compartment
Genes: The list of genes from the dataset that map to that compartment
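
As a minimal sketch of how the columns above might be used, the following Python snippet reads one CellVis_Enrichment CSV file and keeps the compartments whose FDR falls below a chosen cutoff. The column names come from the list above; the 0.05 cutoff, the function name, the placeholder rows, and the assumption that all three files share a header row with exactly these names are illustrative and should be checked against the actual files.

```python
import csv
import io


def significant_compartments(handle, fdr_cutoff=0.05):
    """Return (Compartment, FDR, n) tuples with FDR below the cutoff.

    Assumes the header row uses the column names listed above
    (Compartment, p, FDR, Significant, n, Genes). fdr_cutoff is an
    illustrative choice, not a value stated by the authors.
    """
    rows = []
    for row in csv.DictReader(handle):
        fdr = float(row["FDR"])
        if fdr < fdr_cutoff:
            rows.append((row["Compartment"], fdr, int(row["n"])))
    # Most significant compartments first.
    return sorted(rows, key=lambda r: r[1])


# Hypothetical placeholder rows standing in for a real file.
example = io.StringIO(
    "Compartment,p,FDR,Significant,n,Genes\n"
    "Compartment_A,0.0001,0.001,TRUE,12,GENE_1\n"
    "Compartment_B,0.4,0.6,FALSE,3,GENE_2\n"
)
print(significant_compartments(example))
```

With a downloaded file, the same function could be called as `significant_compartments(open("CellVis_Enrichment_30_mins.csv", newline=""))`.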

File: Supplementary_Data_2.zip (hosted on Zenodo)

Description: Files are organised into directories containing the analyses of the datasets from 30, 60 and 240 minutes.

Each directory contains:

i) a .html file containing the KEGG pathway analysis from WebGestalt

ii) various .txt and .png files which are used to build the .html file

Description of .txt files and .png files within zipped directories:

enriched_geneset_wsc_topsets_wg_result.txt - describes the top GO sets observed through a weighted set cover analysis of the dataset.

enrichment_results_wg_result.txt - describes the top GO sets observed with no redundancy reduction.

Column 1 - geneSet - GO annotated pathway, hsa*

Column 2 - description - description of GO annotated pathway

Column 3 - link - hyperlink to the GO annotated pathway

Column 4 - size - number of genes associated with the GO annotated pathway

Column 5 - overlap - number of genes from the analysed dataset which overlap with the GO annotated pathway

Column 6 - expect - number of genes expected to overlap with the GO annotated dataset

Column 7 - enrichmentRatio - enrichment ratio: the number of overlapping genes (overlap) divided by the number of expected genes (expect)

Column 8 - pValue - p value associated with the enrichment of genes within the GO annotated pathway

Column 9 - FDR - False discovery rate associated with genes enriched within the GO annotated pathway

Column 10 - overlapId - Entrez Gene ID associated with enriched genes from the analysed dataset associated with the GO annotated pathway

Column 11 - userId - Gene names associated with the enriched genes from the analysed dataset associated with the GO annotated pathway
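
The enrichmentRatio column can be checked against its definition (overlap divided by expect). The sketch below does this in Python, assuming the .txt files are tab-delimited with a header row carrying the column names listed above. The delimiter, the function name, the tolerance, and the placeholder rows are assumptions, not details confirmed by this README.

```python
import csv
import io
import math


def check_enrichment_ratio(handle, rel_tol=1e-3):
    """Return geneSet IDs whose enrichmentRatio differs from overlap / expect.

    Assumes a tab-delimited table with a header row (an assumption about
    the WebGestalt .txt export); rel_tol is an illustrative tolerance.
    """
    mismatches = []
    for row in csv.DictReader(handle, delimiter="\t"):
        recomputed = float(row["overlap"]) / float(row["expect"])
        reported = float(row["enrichmentRatio"])
        if not math.isclose(recomputed, reported, rel_tol=rel_tol):
            mismatches.append(row["geneSet"])
    return mismatches


# Hypothetical rows with placeholder IDs, only to show the expected layout.
example = io.StringIO(
    "geneSet\toverlap\texpect\tenrichmentRatio\n"
    "set_A\t10\t2.5\t4.0\n"
    "set_B\t6\t3.0\t5.0\n"
)
print(check_enrichment_ratio(example))  # prints ['set_B']
```

An empty list means every reported ratio agrees with overlap / expect within the tolerance.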

goslim_summary_wg_result.png - bar charts produced by the GO Slim analysis, showing biological process, cellular component and molecular function categories

goslim_summary_wg_result_bp.txt - tabular form of the GO Slim analysis showing biological process analysis

Column 1 - GO biological process annotation

Column 2 - description of GO biological process

Column 3 - number of genes associated

goslim_summary_wg_result_cc.txt - tabular form of the GO Slim analysis showing cellular compartment analysis

Column 1 - GO cellular compartment annotation

Column 2 - description of GO cellular compartment

Column 3 - number of genes associated

goslim_summary_wg_result_mf.txt - tabular form of the GO Slim analysis showing molecular function analysis

Column 1 - GO molecular function annotation

Column 2 - description of GO molecular function

Column 3 - number of genes associated

interestingID_mappingTable_wg_result.txt - table showing mapped function of analysed genes

Column 1 - userId - GeneID provided by the user

Column 2 - geneSymbol - recognised gene symbol

Column 3 - geneName - full gene name

Column 4 - entrezgene - number associated with the gene in the Entrez database

Column 5 - gLink - Hyperlink associated with the gene within the NCBI database

interestingID_unmappedList_wg_result.txt - list of unmapped genes from the analysed dataset, i.e. user-supplied gene names that were not recognised in the database

Code/software

Zipped files must be extracted first, using any compression software. .html files can be opened with any web browser; .csv files can be opened with Microsoft Excel.
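
For readers who prefer to script the extraction, a minimal Python sketch using the standard-library zipfile module is shown here. It assumes Supplementary_Data_2.zip has already been downloaded from Zenodo into the working directory; the output directory name and the helper function name are illustrative.

```python
import zipfile
from pathlib import Path


def extract_and_list_html(zip_path, out_dir):
    """Extract the archive and return the paths of the .html reports found."""
    out_dir = Path(out_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
    # The WebGestalt reports are described above as .html files.
    return sorted(out_dir.rglob("*.html"))


if __name__ == "__main__":
    archive_path = Path("Supplementary_Data_2.zip")  # assumed local download
    if archive_path.exists():
        for report in extract_and_list_html(archive_path, "Supplementary_Data_2"):
            print(report)
```

Each printed path can then be opened in a web browser.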

Access information

Data were analysed using the following tools:

SubcellulaRVis (https://shiny.its.manchester.ac.uk/subcellularvis/)
WebGestalt (https://www.webgestalt.org/)
