scATAC data from: Organization of the human intestine at single cell resolution
Data files
Mar 24, 2023 version files 144.54 GB
-
atac_sample_location_metadata.csv
-
B001-A-001_atac_fragments.tsv.gz
-
B001-A-001_atac_fragments.tsv.gz.tbi
-
B001-A-006_atac_fragments.tsv.gz
-
B001-A-006_atac_fragments.tsv.gz.tbi
-
B001-A-101_atac_fragments.tsv.gz
-
B001-A-101_atac_fragments.tsv.gz.tbi
-
B001-A-201_atac_fragments.tsv.gz
-
B001-A-201_atac_fragments.tsv.gz.tbi
-
B001-A-301_atac_fragments.tsv.gz
-
B001-A-301_atac_fragments.tsv.gz.tbi
-
B001-A-302_atac_fragments.tsv.gz
-
B001-A-302_atac_fragments.tsv.gz.tbi
-
B001-A-401_atac_fragments.tsv.gz
-
B001-A-401_atac_fragments.tsv.gz.tbi
-
B001-A-406_atac_fragments.tsv.gz
-
B001-A-406_atac_fragments.tsv.gz.tbi
-
B001-A-501_atac_fragments.tsv.gz
-
B001-A-501_atac_fragments.tsv.gz.tbi
-
B004-A-004_atac_fragments.tsv.gz
-
B004-A-004_atac_fragments.tsv.gz.tbi
-
B004-A-004-R2_atac_fragments.tsv.gz
-
B004-A-004-R2_atac_fragments.tsv.gz.tbi
-
B004-A-008_atac_fragments.tsv.gz
-
B004-A-008_atac_fragments.tsv.gz.tbi
-
B004-A-204_atac_fragments.tsv.gz
-
B004-A-204_atac_fragments.tsv.gz.tbi
-
B004-A-304_atac_fragments.tsv.gz
-
B004-A-304_atac_fragments.tsv.gz.tbi
-
B004-A-404_atac_fragments.tsv.gz
-
B004-A-404_atac_fragments.tsv.gz.tbi
-
B004-A-404-R2_atac_fragments.tsv.gz
-
B004-A-404-R2_atac_fragments.tsv.gz.tbi
-
B004-A-408_atac_fragments.tsv.gz
-
B004-A-408_atac_fragments.tsv.gz.tbi
-
B004-A-504_atac_fragments.tsv.gz
-
B004-A-504_atac_fragments.tsv.gz.tbi
-
B005-A-001_atac_fragments.tsv.gz
-
B005-A-001_atac_fragments.tsv.gz.tbi
-
B005-A-002_atac_fragments.tsv.gz
-
B005-A-002_atac_fragments.tsv.gz.tbi
-
B005-A-101_atac_fragments.tsv.gz
-
B005-A-101_atac_fragments.tsv.gz.tbi
-
B005-A-201_atac_fragments.tsv.gz
-
B005-A-201_atac_fragments.tsv.gz.tbi
-
B005-A-301_atac_fragments.tsv.gz
-
B005-A-301_atac_fragments.tsv.gz.tbi
-
B005-A-401_atac_fragments.tsv.gz
-
B005-A-401_atac_fragments.tsv.gz.tbi
-
B005-A-402_atac_fragments.tsv.gz
-
B005-A-402_atac_fragments.tsv.gz.tbi
-
B005-A-501_atac_fragments.tsv.gz
-
B005-A-501_atac_fragments.tsv.gz.tbi
-
B006-A-001_atac_fragments.tsv.gz
-
B006-A-001_atac_fragments.tsv.gz.tbi
-
B006-A-002_atac_fragments.tsv.gz
-
B006-A-002_atac_fragments.tsv.gz.tbi
-
B006-A-101_atac_fragments.tsv.gz
-
B006-A-101_atac_fragments.tsv.gz.tbi
-
B006-A-201_atac_fragments.tsv.gz
-
B006-A-201_atac_fragments.tsv.gz.tbi
-
B006-A-201-R2_atac_fragments.tsv.gz
-
B006-A-201-R2_atac_fragments.tsv.gz.tbi
-
B006-A-301_atac_fragments.tsv.gz
-
B006-A-301_atac_fragments.tsv.gz.tbi
-
B006-A-401_atac_fragments.tsv.gz
-
B006-A-401_atac_fragments.tsv.gz.tbi
-
B006-A-402_atac_fragments.tsv.gz
-
B006-A-402_atac_fragments.tsv.gz.tbi
-
B006-A-501_atac_fragments.tsv.gz
-
B006-A-501_atac_fragments.tsv.gz.tbi
-
B008-A-001_atac_fragments.tsv.gz
-
B008-A-001_atac_fragments.tsv.gz.tbi
-
B008-A-002_atac_fragments.tsv.gz
-
B008-A-002_atac_fragments.tsv.gz.tbi
-
B008-A-101_atac_fragments.tsv.gz
-
B008-A-101_atac_fragments.tsv.gz.tbi
-
B008-A-201_atac_fragments.tsv.gz
-
B008-A-201_atac_fragments.tsv.gz.tbi
-
B008-A-301_atac_fragments.tsv.gz
-
B008-A-301_atac_fragments.tsv.gz.tbi
-
B008-A-401_atac_fragments.tsv.gz
-
B008-A-401_atac_fragments.tsv.gz.tbi
-
B008-A-402_atac_fragments.tsv.gz
-
B008-A-402_atac_fragments.tsv.gz.tbi
-
B008-A-501_atac_fragments.tsv.gz
-
B008-A-501_atac_fragments.tsv.gz.tbi
-
B009-A-001_atac_fragments.tsv.gz
-
B009-A-001_atac_fragments.tsv.gz.tbi
-
B009-A-101_atac_fragments.tsv.gz
-
B009-A-101_atac_fragments.tsv.gz.tbi
-
B009-A-301_atac_fragments.tsv.gz
-
B009-A-301_atac_fragments.tsv.gz.tbi
-
B009-A-405_atac_fragments.tsv.gz
-
B009-A-405_atac_fragments.tsv.gz.tbi
-
B009-A-501_atac_fragments.tsv.gz
-
B009-A-501_atac_fragments.tsv.gz.tbi
-
B010-A-001_atac_fragments.tsv.gz
-
B010-A-001_atac_fragments.tsv.gz.tbi
-
B010-A-002_atac_fragments.tsv.gz
-
B010-A-002_atac_fragments.tsv.gz.tbi
-
B010-A-101_atac_fragments.tsv.gz
-
B010-A-101_atac_fragments.tsv.gz.tbi
-
B010-A-201_atac_fragments.tsv.gz
-
B010-A-201_atac_fragments.tsv.gz.tbi
-
B010-A-301_atac_fragments.tsv.gz
-
B010-A-301_atac_fragments.tsv.gz.tbi
-
B010-A-401_atac_fragments.tsv.gz
-
B010-A-401_atac_fragments.tsv.gz.tbi
-
B010-A-405_atac_fragments.tsv.gz
-
B010-A-405_atac_fragments.tsv.gz.tbi
-
B010-A-501_atac_fragments.tsv.gz
-
B010-A-501_atac_fragments.tsv.gz.tbi
-
B011-A-001_atac_fragments.tsv.gz
-
B011-A-001_atac_fragments.tsv.gz.tbi
-
B011-A-002_atac_fragments.tsv.gz
-
B011-A-002_atac_fragments.tsv.gz.tbi
-
B011-A-101_atac_fragments.tsv.gz
-
B011-A-101_atac_fragments.tsv.gz.tbi
-
B011-A-201_atac_fragments.tsv.gz
-
B011-A-201_atac_fragments.tsv.gz.tbi
-
B011-A-301_atac_fragments.tsv.gz
-
B011-A-301_atac_fragments.tsv.gz.tbi
-
B011-A-401_atac_fragments.tsv.gz
-
B011-A-401_atac_fragments.tsv.gz.tbi
-
B011-A-405_atac_fragments.tsv.gz
-
B011-A-405_atac_fragments.tsv.gz.tbi
-
B011-A-501_atac_fragments.tsv.gz
-
B011-A-501_atac_fragments.tsv.gz.tbi
-
B012-A-001_atac_fragments.tsv.gz
-
B012-A-001_atac_fragments.tsv.gz.tbi
-
B012-A-002_atac_fragments.tsv.gz
-
B012-A-002_atac_fragments.tsv.gz.tbi
-
B012-A-101_atac_fragments.tsv.gz
-
B012-A-101_atac_fragments.tsv.gz.tbi
-
B012-A-201_atac_fragments.tsv.gz
-
B012-A-201_atac_fragments.tsv.gz.tbi
-
B012-A-301_atac_fragments.tsv.gz
-
B012-A-301_atac_fragments.tsv.gz.tbi
-
B012-A-401_atac_fragments.tsv.gz
-
B012-A-401_atac_fragments.tsv.gz.tbi
-
B012-A-405_atac_fragments.tsv.gz
-
B012-A-405_atac_fragments.tsv.gz.tbi
-
B012-A-501_atac_fragments.tsv.gz
-
B012-A-501_atac_fragments.tsv.gz.tbi
-
colon_epithelial_peak_matrix_cells.tsv
-
colon_epithelial_peak_matrix.mtx
-
colon_epithelial_peaks.bed
-
duodenum_epithelial_peak_matrix_cells.tsv
-
duodenum_epithelial_peak_matrix.mtx
-
duodenum_epithelial_peaks.bed
-
ileum_epithelial_peak_matrix_cells.tsv
-
ileum_epithelial_peak_matrix.mtx
-
ileum_epithelial_peaks.bed
-
immune_peak_matrix_cells.tsv
-
immune_peak_matrix.mtx
-
immune_peaks.bed
-
jejunum_epithelial_peak_matrix_cells.tsv
-
jejunum_epithelial_peak_matrix.mtx
-
jejunum_epithelial_peaks.bed
-
non_multiome_colon_epithelial_peak_matrix_cells.tsv
-
non_multiome_colon_epithelial_peak_matrix.mtx
-
non_multiome_colon_epithelial_peaks.bed
-
non_multiome_duodenum_epithelial_peak_matrix_cells.tsv
-
non_multiome_duodenum_epithelial_peak_matrix.mtx
-
non_multiome_duodenum_epithelial_peaks.bed
-
non_multiome_ileum_epithelial_peak_matrix_cells.tsv
-
non_multiome_ileum_epithelial_peak_matrix.mtx
-
non_multiome_ileum_epithelial_peaks.bed
-
non_multiome_immune_peak_matrix_cells.tsv
-
non_multiome_immune_peak_matrix.mtx
-
non_multiome_immune_peaks.bed
-
non_multiome_jejunum_epithelial_peak_matrix_cells.tsv
-
non_multiome_jejunum_epithelial_peak_matrix.mtx
-
non_multiome_jejunum_epithelial_peaks.bed
-
non_multiome_stromal_peak_matrix_cells.tsv
-
non_multiome_stromal_peak_matrix.mtx
-
non_multiome_stromal_peaks.bed
-
peak_matrix_metadata.csv
-
README.md
-
scATAC_multiome_cell_types_epithelial_colon.tsv
-
scATAC_multiome_cell_types_epithelial_duodenum.tsv
-
scATAC_multiome_cell_types_epithelial_ileum.tsv
-
scATAC_multiome_cell_types_epithelial_jejunum.tsv
-
scATAC_multiome_cell_types_immune.tsv
-
scATAC_multiome_cell_types_stromal.tsv
-
scATAC_non_multiome_cell_types_epithelial_colon.tsv
-
scATAC_non_multiome_cell_types_epithelial_duodenum.tsv
-
scATAC_non_multiome_cell_types_epithelial_ileum.tsv
-
scATAC_non_multiome_cell_types_epithelial_jejunum.tsv
-
scATAC_non_multiome_cell_types_immune.tsv
-
scATAC_non_multiome_cell_types_stromal.tsv
-
stromal_peak_matrix_cells.tsv
-
stromal_peak_matrix.mtx
-
stromal_peaks.bed
Abstract
The human adult intestinal system is a complex organ that is approximately 9 meters long and performs a variety of complex functions including digestion, nutrient absorption, and immune surveillance. We performed snATAC-seq on 8 regions of the human intestine (duodenum, proximal-jejunum, mid-jejunum, ileum, ascending colon, transverse colon, descending colon, and sigmoid colon) from 9 donors (B001, B004, B005, B006, B008, B009, B010, B011, and B012). In the corresponding paper, we find that cell compositions differ dramatically across regions of the intestine and demonstrate the complexity of epithelial subtypes. We map gene regulatory differences in these cells suggestive of a regulatory differentiation cascade, and associate intestinal disease heritability with specific cell types. These results describe the complexity of the cell composition, regulation, and organization in the human intestine, and serve as an important reference map for understanding human biology and disease.
Methods
For a detailed description of the protocols and processing steps used to obtain these data, see the materials and methods in the associated manuscript. Briefly, intestine pieces from 8 different sites across the small intestine and colon were flash frozen. Nuclei were isolated from each sample and processed with either 10x scRNA-seq using Chromium Next GEM Single Cell 3’ Reagent Kits v3.1 (10x Genomics, 1000121) or Chromium Next GEM Chip G Single Cell Kits (10x Genomics, 1000120) or 10x multiome sequencing using Chromium Next GEM Single Cell Multiome ATAC + Gene Expression Kits (10x Genomics, 1000283).
Initial processing of the snATAC-seq data was done with the Cell Ranger ATAC pipeline. Initial processing of the multiome data, including alignment and generation of fragments files and expression matrices, was performed with the Cell Ranger ARC pipeline. The fragments files from these pipelines are included here. Downstream processing was performed in R.
Usage notes
The dataset includes the fragments file generated by Cell Ranger ATAC or Cell Ranger ARC for each individual sample included in the study. For each sample there is a ***_atac_fragments.tsv.gz file and a ***_atac_fragments.tsv.gz.tbi file.
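As a quick sanity check on a downloaded fragments file, the sketch below counts fragment records per cell barcode using only the Python standard library. It assumes the usual Cell Ranger fragments layout (tab-separated columns for chromosome, start, end, barcode, and read support, with optional header lines starting with `#`); that layout is not spelled out in this description, so confirm it against a few lines of the actual file. The function name `count_fragments_per_barcode` and the `max_lines` parameter are illustrative.

```python
import gzip
from collections import Counter


def count_fragments_per_barcode(path, max_lines=None):
    """Count fragment records per barcode in a gzip-compressed fragments file.

    Assumes tab-separated columns with the barcode in the fourth column
    (an assumption about the file layout, not stated in the dataset notes).
    """
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for i, line in enumerate(handle):
            if max_lines is not None and i >= max_lines:
                break  # stop early when only a preview is wanted
            if line.startswith("#"):
                continue  # skip header/comment lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 4:
                continue  # skip malformed or empty lines
            counts[fields[3]] += 1
    return counts
```

For example, `count_fragments_per_barcode("B001-A-001_atac_fragments.tsv.gz", max_lines=1_000_000)` would preview the first million lines of one sample without reading the whole file. The accompanying `.tbi` index is not needed for this sequential scan; it matters only for region-based lookups with tabix-aware tools.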
In addition to the individual fragments files, we include peak barcode matrices. These are divided into separate matrices for the duodenum epithelial cells, jejunum epithelial cells, ileum epithelial cells, colon epithelial cells, stromal cells, and immune cells. Each peak matrix is stored in three files. The ***_peak_matrix.mtx file contains the matrix data, the ***_peak_matrix_cells.tsv file contains the cell barcodes represented in the matrix, and the ***_peaks.bed file contains the peaks represented in the matrix. The matrices are also divided by sample type, with separate matrices for cells from the multiome and non-multiome samples; the non-multiome files have the "non_multiome_" prefix. All of the peak matrix files are listed in the peak_matrix_metadata.csv file. These files can be opened with any programming language.
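Because each peak matrix is split across three files, it helps to confirm that they agree before any analysis. The sketch below assumes the `.mtx` file is in Matrix Market coordinate format (the common meaning of that extension, though not stated here) and compares its declared dimensions with the line counts of the cells and peaks files. Whether rows are peaks or cells, and whether the `.tsv` or `.bed` files carry a header line, are not specified in this description, so the function only reports what it finds. All function names are illustrative.

```python
def read_mtx_shape(mtx_path):
    """Return (n_rows, n_cols, n_entries) from a Matrix Market size line."""
    with open(mtx_path) as handle:
        for line in handle:
            if line.startswith("%") or not line.strip():
                continue  # skip the banner, comments, and blank lines
            n_rows, n_cols, n_entries = (int(x) for x in line.split()[:3])
            return n_rows, n_cols, n_entries
    raise ValueError(f"no size line found in {mtx_path}")


def count_nonblank_lines(path):
    """Count non-empty lines in a text file."""
    with open(path) as handle:
        return sum(1 for line in handle if line.strip())


def summarize_peak_matrix(prefix):
    """Compare matrix dimensions with the cells and peaks files for one prefix."""
    n_rows, n_cols, _ = read_mtx_shape(prefix + "_peak_matrix.mtx")
    n_cells = count_nonblank_lines(prefix + "_peak_matrix_cells.tsv")
    n_peaks = count_nonblank_lines(prefix + "_peaks.bed")
    if (n_rows, n_cols) == (n_peaks, n_cells):
        orientation = "peaks x cells"
    elif (n_rows, n_cols) == (n_cells, n_peaks):
        orientation = "cells x peaks"
    else:
        # A mismatch of one often means a header line in one of the files.
        orientation = "unclear"
    return {
        "shape": (n_rows, n_cols),
        "cell_lines": n_cells,
        "peak_lines": n_peaks,
        "orientation": orientation,
    }
```

Calling `summarize_peak_matrix("immune")` or `summarize_peak_matrix("non_multiome_immune")` in the download directory would check one matrix. Once the orientation is known, a library reader such as `scipy.io.mmread` can load the matrix itself as a sparse array.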
A list of the final cell annotations is also included. Note that the multiome annotations are the same as the multiome RNA annotations. These files are titled scATAC_non_multiome_cell_types_***.tsv for the non-multiome cells and scATAC_multiome_cell_types_***.tsv for the multiome cells. Each of these files has a column for the cell ID, consisting of the sample and barcode (Cell), and a column for the cell type annotation of the corresponding cell (CellType). These files can be opened with any programming language.
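A minimal sketch of reading one annotation file follows. It assumes the file is tab-separated with a header row naming the `Cell` and `CellType` columns described above. The cell ID is kept as a single string, since the separator between sample and barcode is not specified here. The function names are illustrative.

```python
import csv
from collections import Counter


def load_cell_types(path):
    """Map each cell ID (Cell column) to its annotation (CellType column)."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["Cell"]: row["CellType"] for row in reader}


def cell_type_counts(path):
    """Count how many cells carry each cell type annotation."""
    return Counter(load_cell_types(path).values())
```

For instance, `cell_type_counts("scATAC_multiome_cell_types_immune.tsv").most_common(5)` would list the five most frequent immune annotations among the multiome cells.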
A metadata file (atac_sample_location_metadata.csv) is also included that contains columns for the ATAC sample name, the RNA sample name, the sample name without location replicates indicated, the donor the sample was collected from, the location in the intestine the sample was collected from, the fragments file name, and the index file name.
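Since the metadata file names the fragments file and index file for every sample, it can be used to check that a download is complete. The sketch below lists any file named in the metadata that is missing from a local directory. The exact column headers are not given in this description, so they are passed in as arguments; the names used in the usage example are hypothetical and should be replaced with the headers found in the actual CSV.

```python
import csv
import os


def missing_files(metadata_csv, data_dir, file_columns):
    """Return file names listed in the given metadata columns but absent from data_dir."""
    missing = []
    with open(metadata_csv, newline="") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        unknown = [c for c in file_columns if c not in header]
        if unknown:
            # Report the real header so the caller can pick the right columns.
            raise KeyError(f"columns not in header: {unknown}; found: {header}")
        for row in reader:
            for column in file_columns:
                name = row[column]
                if name and not os.path.exists(os.path.join(data_dir, name)):
                    missing.append(name)
    return missing
```

A call such as `missing_files("atac_sample_location_metadata.csv", ".", ["fragments_file", "index_file"])` (with hypothetical column names) returns an empty list when every listed file is present.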
See the README for additional details on files included in the submission.