Cell type labels for all clustering and normalization combinations compared for CODEX multiplexed imaging

Hickey, John

Published Jul 31, 2021; Updated Nov 17, 2022 on Dryad. https://doi.org/10.5061/dryad.dfn2z352c

Data files

Jul 31, 2021 version files (121.93 MB)

cell_1_annot.csv    36.26 MB
cell_2_annot.csv    27.86 MB
cell_3_annot.csv    27.05 MB
cell_4_annot.csv    30.75 MB
Readme.txt          5.52 KB

Nov 17, 2022 version files (121.93 MB)

cell_1_annot.csv    36.26 MB
cell_2_annot.csv    27.86 MB
cell_3_annot.csv    27.05 MB
cell_4_annot.csv    30.75 MB
README.txt          5.52 KB

Abstract

We performed CODEX (co-detection by indexing) multiplexed imaging on four sections of the human colon (ascending, transverse, descending, and sigmoid) using a panel of 47 oligonucleotide-barcoded antibodies. The images then underwent standard CODEX image processing (tile stitching, drift compensation, cycle concatenation, background subtraction, deconvolution, and determination of the best focal plane) and single-cell segmentation. The output of this process was a dataframe of nearly 130,000 cells with fluorescence values quantified for each marker. We passed this dataframe through each of five normalization conditions: z, double-log(z), min/max, and arcsinh normalization, plus the original unmodified data as a baseline. We then used each normalized dataframe as input to four unsupervised clustering algorithms: k-means, Leiden, X-shift Euclidean, and X-shift angular.
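Three of the normalizations named above have standard definitions and can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes each normalization is applied to one marker's values across cells, the arcsinh cofactor of 5 is an arbitrary placeholder, and the marker values are made up. Double-log(z) is left out because its exact definition is not given here.

```python
import math
from statistics import mean, pstdev


def z_normalize(values):
    """Center to zero mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant marker carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values):
    """Rescale linearly so the smallest value is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def arcsinh_normalize(values, cofactor=5.0):
    """Apply asinh(v / cofactor); the cofactor here is an assumed placeholder."""
    return [math.asinh(v / cofactor) for v in values]


# Hypothetical fluorescence values for one marker across five cells.
marker = [10.0, 20.0, 30.0, 40.0, 50.0]
print(z_normalize(marker))
print(min_max_normalize(marker))
print(arcsinh_normalize(marker))
```

The unmodified baseline corresponds to skipping these functions and clustering the raw values directly.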

From the clustering outputs, we labeled the resulting clusters with the cell types observed in the data, producing 20 unique cell type labels. We also labeled cell types by hierarchical hand-gating of the data in CellEngine (cellengine.com). We created another gold standard for comparison by overclustering unnormalized data with X-shift angular clustering. Finally, we created one last label: the major cell type call for each cell across all 21 cell type labels in the dataset.
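The final label can be read as a per-cell vote across the other label sets. The sketch below assumes that "major cell type call" means the most frequent label among a cell's calls; the abstract does not spell out the rule or how ties are broken, so the tie-breaking here (first label in list order) is a hypothetical choice, and the example labels are invented.

```python
from collections import Counter


def majority_label(labels):
    """Return the most frequent label; ties go to the label that appears first."""
    if not labels:
        raise ValueError("at least one label is required")
    counts = Counter(labels)
    best = max(counts.values())
    # Walk the labels in their original order so ties resolve predictably.
    for label in labels:
        if counts[label] == best:
            return label


# Hypothetical calls for one cell from five different label sets.
calls = ["CD4+ T cell", "CD4+ T cell", "CD8+ T cell", "CD4+ T cell", "B cell"]
print(majority_label(calls))
```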

In the dataset, each row is one segmented cell. Columns give the cell's X and Y position, in pixels, in the overall montage image, and the region the cell came from (four regions in total). The remaining columns hold the labels generated by each of the clustering and normalization combinations compared in the manuscript. These same data were used for the neighborhood analysis in the last figure of the manuscript. Labels are provided at all four levels of cell type granularity (from 7 to 35 cell types).
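Given that layout, a first look at the files might tabulate how many cells of each type fall in each region. The sketch below uses a tiny inline stand-in rather than the real files, and its column names (x, y, region, label_a) are hypothetical; the actual header should be taken from README.txt.

```python
import csv
import io
from collections import Counter

# Stand-in for one annotation file. Column names and values are invented
# for illustration and do not come from the dataset.
sample = """x,y,region,label_a
120,340,1,Epithelial
125,351,1,Epithelial
900,77,2,Plasma cell
"""


def count_labels_by_region(handle, region_col, label_col):
    """Count rows (cells) for every (region, label) pair in a CSV handle."""
    counts = Counter()
    for row in csv.DictReader(handle):
        counts[(row[region_col], row[label_col])] += 1
    return counts


counts = count_labels_by_region(io.StringIO(sample), "region", "label_a")
for (region, label), n in sorted(counts.items()):
    print(region, label, n)
```

For the real files, the same function could be passed an open file handle, for example `open("cell_1_annot.csv", newline="")`, with the region and label column names replaced by those in the actual header.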
