
Single-cell expression and TCR data from CD19-specific CAR T cells in a phase I/II clinical trial

Cite this dataset

Kim, Hyunjin; Thomas, Paul G.; Crawford, Jeremy Chase (2022). Single-cell expression and TCR data from CD19-specific CAR T cells in a phase I/II clinical trial [Dataset]. Dryad.


Using single-cell transcriptome and T cell receptor (TCR) sequencing, we aimed to track the transcriptional signatures of CAR T cell clonotypes throughout the course of treatment and to identify molecular patterns leading to potent CAR T cell cytotoxicity. The data presented in this study encompass blood and bone marrow samples from patients ≤ 21 years of age with relapsed or refractory B-cell acute lymphoblastic leukemia (B-ALL) participating in the SJCAR19 phase I/II clinical trial (NCT03573700). In brief, following successful generation of autologous CAR T cell products and lymphodepleting chemotherapy, patients enrolled in the clinical trial received either 1 x 10^6 (dose level 1) or 3 x 10^6 (dose level 2) CAR T cells per kilogram of body weight. Peripheral blood was drawn from each participant weekly through week 4 post-infusion, at week 6 or 8, and at month 3 or 6 if feasible. At week 4 post-infusion, bone marrow was also collected from participants. Total T cells (CD3+) were sorted from each post-infusion sample, as well as from the pre-infusion CAR T cell products, and processed through the 10x Genomics single-cell gene expression and V(D)J sequencing platform using the standard protocol. We identified a unique and unexpected transcriptional signature in a subset of pre-infusion CAR T cells that shared TCRs with post-infusion cytotoxic effector CAR T cells. Functional validation of cells with even a subset of these pre-effector markers demonstrated their immediate cytotoxic potential and resistance to exhaustion.


Cells were processed using the Chromium Single Cell V(D)J 5' reagents (10x Genomics). T cell receptor V(D)J cDNA was enriched using the Chromium Single Cell V(D)J Enrichment kit for Human T cells. Corresponding libraries were sequenced on the Illumina NovaSeq platform. Sequencing data were processed using Cell Ranger v3.1.0 (10x Genomics) with the GRCh38 reference (v3.0.0) modified to include the first 825 nucleotides of the CD19-CAR transcript. The resulting gene expression matrices were aggregated, with read depth normalization based on the number of mapped reads. TCR sequences were processed with version 3.1.0 of the GRCh38 V(D)J reference.

Aggregated gene expression matrices were analyzed using Seurat (Hao et al., Cell 2021). Cells were excluded from downstream analyses if they had fewer than 300 detected genes, more than 4,999 detected genes, at least 10% of their expression attributable to mitochondrial genes, or no detected CD19-CAR unique molecular identifiers (UMIs). TCR lineages were integrated with gene expression data using shared cellular barcodes. Additional analyses are described in the corresponding manuscript.
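
The exclusion criteria and the barcode-based integration described above can be illustrated with a minimal sketch. It is written in plain Python for clarity and is not the authors' Seurat code. The `CellQC` record, its field names, the helper functions, and the example barcodes and clonotype labels are all hypothetical. Only the thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell summary values (hypothetical field names)."""
    barcode: str
    n_genes: int      # number of detected genes
    pct_mito: float   # percent of expression from mitochondrial genes
    car_umis: int     # CD19-CAR UMI count


# Thresholds as stated in the methods text.
MIN_GENES = 300       # fewer than 300 detected genes -> excluded
MAX_GENES = 4999      # more than 4,999 detected genes -> excluded
MITO_CUTOFF = 10.0    # at least 10% mitochondrial expression -> excluded


def passes_qc(cell: CellQC) -> bool:
    """Return True if the cell survives all four exclusion criteria."""
    return (
        MIN_GENES <= cell.n_genes <= MAX_GENES
        and cell.pct_mito < MITO_CUTOFF
        and cell.car_umis > 0
    )


def attach_clonotypes(cells, clonotype_by_barcode):
    """Keep QC-passing cells and look up each one's clonotype by barcode.

    Cells without a recovered TCR get None (an assumption of this sketch).
    """
    return {
        c.barcode: clonotype_by_barcode.get(c.barcode)
        for c in cells
        if passes_qc(c)
    }


# Made-up example data for illustration only.
example_cells = [
    CellQC("AAAC-1", 300, 9.9, 2),    # kept: 300 genes is the minimum allowed
    CellQC("AAAG-1", 299, 1.0, 5),    # dropped: too few genes
    CellQC("AACC-1", 5000, 1.0, 5),   # dropped: too many genes
    CellQC("AAGG-1", 1200, 10.0, 5),  # dropped: mitochondrial share at cutoff
    CellQC("ACCC-1", 1200, 2.0, 0),   # dropped: no CD19-CAR UMIs
    CellQC("ACGG-1", 4999, 0.0, 1),   # kept: no TCR recovered for this barcode
]
example_clonotypes = {"AAAC-1": "clonotype1", "AAAG-1": "clonotype2"}

annotated = attach_clonotypes(example_cells, example_clonotypes)
print(annotated)
```

Note that the boundaries are asymmetric: cells with exactly 300 or exactly 4,999 detected genes are retained, while a cell at exactly 10% mitochondrial expression is excluded.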

Usage notes

RDS files can be accessed with the readRDS() function in the R statistical environment. The Seurat R package can be used to analyze Seurat objects and explore included metadata.


National Cancer Institute, Award: P30CA021765

National Institute of Allergy and Infectious Diseases, Award: U01AI150747

National Institute of Allergy and Infectious Diseases, Award: R01AI136514

American Lebanese Syrian Associated Charities

St. Jude Children's Research Hospital