Dryad

Data from: Clustering Deviation Index (CDI): A robust and accurate internal measure for evaluating scRNA-seq data clustering

Data files

Oct 03, 2022 version files 549.63 MB

Abstract

Cell clustering is widely used to explore the heterogeneity of cell populations in single-cell RNA-sequencing (scRNA-seq) data. We proposed a parametric model for monoclonal and polyclonal scRNA-seq data to evaluate clustering results. Based on this model, we proposed a metric, the Clustering Deviation Index (CDI), to quantify the goodness-of-fit of a cell clustering to the data. Here we present two datasets, CT26.WT and T-CELL, used to examine the performance of our model and metric. CT26.WT contains wild-type CT26 cells from the murine colorectal carcinoma cell line; these cells are highly homogeneous. T-CELL contains T-cells from the tumor tissue of mice three weeks after 4T1 tumor injection. Using these datasets together with public datasets, we validated our model and benchmarked our metric.