Bulk and single-cell RNA-seq of human fetal pancreatic organoids
Data files
Dec 07, 2024 version files 449.27 MB
- Bulk_OrgsCystVSBud.csv (2.07 MB)
- Bulk_OrgsExpVSAciDiff.csv (1.83 MB)
- Bulk_OrgsExpVSDiff.csv (1.58 MB)
- counts_norm_processed.csv.gz (75.28 MB)
- counts_unprocessed.csv.gz (140.80 MB)
- obs_metadata_processed.csv (488.14 KB)
- obs_metadata_unprocessed.csv (328.16 KB)
- README.md (10.68 KB)
- var_metadata_processed.csv (26.36 MB)
- var_metadata_unprocessed.csv (200.53 MB)
Abstract
The mammalian pancreas consists of three epithelial compartments: the acini and ducts of the exocrine pancreas and the endocrine islets of Langerhans. Murine studies indicate that these three compartments derive from a transient, common pancreatic progenitor. Here, we report the derivation of 18 human fetal pancreas organoid (hfPO) lines from gestational week 8-17 (8-17GW) fetal pancreas samples. Four of these lines, derived from 15-16GW samples, generate acinar-, ductal- and endocrine lineage cells while expanding exponentially for >2 years under optimized culture conditions. Single-cell RNA sequencing identifies rare LGR5+ cells in the fetal pancreas and hfPOs as the root of the developmental hierarchy. These LGR5+ cells share multiple markers with adult gastrointestinal tract stem cells. Organoids derived from single LGR5+ organoid-derived cells recapitulate this tri-potency in vitro. We describe a human fetal tri-potent stem/progenitor cell, capable of long-term expansion in vitro and of generating all three pancreatic cell lineages.
README: Bulk and single-cell RNA-seq of human fetal pancreatic organoids
Bulk RNA sequencing
Human fetal pancreatic organoids (hfPOs) were cultured in expansion, endocrine, or acinar cell differentiation conditions for 10 days.
Organoids were collected and RNA extraction was performed according to the manufacturer's instructions. RNA integrity was determined using the Agilent RNA 6000 Nano kit with the Agilent 2100 Bioanalyzer (Agilent). RNA integrity number (RIN) values ranged from 9.0 to 10.0; no sample with a RIN below 9.0 was used for bulk RNA sequencing. RNA concentrations were determined using the Qubit RNA HS Assay Kit (Thermo Fisher). RNA libraries were prepared with the TruSeq Stranded messenger RNA polyA kit and sequenced paired-end (2 × 50 base pairs) on an Illumina NextSeq 2000. Reads were mapped to the human GRCh38 genome assembly using STAR (DOI: 10.1093/bioinformatics/bts635).
Datasets
File: Bulk_OrgsCystVSBud.csv
Bulk RNA-seq of hfPOs with either cystic or budding morphology, cultured in expansion medium. Contains the count matrix, sample identity, and gene names.
File: Bulk_OrgsExpVSDiff.csv
Bulk RNA-seq of hfPOs cultured in expansion or endocrine differentiation medium. Contains the count matrix, sample identity, and gene names.
File: Bulk_OrgsExpVSAciDiff.csv
Bulk RNA-seq of hfPOs cultured in expansion or acinar differentiation medium. Contains the count matrix, sample identity, and gene names.
File details
Details for: Bulk_OrgsCystVSBud.csv
Description: A comma-delimited file that contains read counts that represent the normalized number of sequencing reads mapped to each gene/transcript.
Format(s): .csv
Sequencing Technology: Illumina
Variables:
- Sample names (columns) in the form hfPO_lineNumber_morphology, e.g. hfPO7_cyst
- Gene names (rows)
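The sample naming scheme above can be split into its parts programmatically. The following is a minimal sketch using only the Python standard library; the regular expression is an assumption based on the single example `hfPO7_cyst`, and the function name `parse_sample_name` is hypothetical.

```python
import re

# Assumed pattern, inferred from the example "hfPO7_cyst":
# "hfPO", a line number, an underscore, then the morphology label.
SAMPLE_NAME = re.compile(r"^hfPO(?P<line>\d+)_(?P<morphology>\w+)$")


def parse_sample_name(name):
    """Return (line_number, morphology), or None if the label does not match."""
    match = SAMPLE_NAME.match(name.strip())
    if match is None:
        return None
    return int(match.group("line")), match.group("morphology")


print(parse_sample_name("hfPO7_cyst"))  # (7, 'cyst')
```

Labels that do not follow the pattern, such as the header of the gene-name column, return `None` and can be skipped.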
Details for: Bulk_OrgsExpVSDiff.csv
Description: A comma-delimited file that contains read counts that represent the normalized number of sequencing reads mapped to each gene/transcript. hfPOs are cultured in either expansion or endocrine differentiation medium.
Format(s): .csv
Sequencing Technology: Illumina
Variables:
- Sample names (columns):
  - Exp1: Expansion medium, biological replicate 1
  - Exp2: Expansion medium, biological replicate 2
  - Exp3: Expansion medium, biological replicate 3
  - Diff1: Endocrine differentiation medium, biological replicate 1
  - Diff2: Endocrine differentiation medium, biological replicate 2
  - Diff3: Endocrine differentiation medium, biological replicate 3
- Gene names (rows)
Details for: Bulk_OrgsExpVSAciDiff.csv
Description: A comma-delimited file that contains read counts that represent the normalized number of sequencing reads mapped to each gene/transcript. hfPOs are cultured in either expansion or acinar differentiation medium.
Format(s): .csv
Sequencing Technology: Illumina
Variables:
- Sample names (columns):
  - Exp1: Expansion medium, biological replicate 1
  - Exp2: Expansion medium, biological replicate 2
  - Exp3: Expansion medium, biological replicate 3
  - Diff1: Acinar differentiation medium, biological replicate 1
  - Diff2: Acinar differentiation medium, biological replicate 2
  - Diff3: Acinar differentiation medium, biological replicate 3
- Gene names (rows)
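As an illustration of the Exp/Diff column layout, the sketch below averages each gene over the replicate columns of one condition. It assumes that the first column holds gene names and that the remaining column labels start with `Exp` or `Diff`; the function name `condition_means` and the toy values are made up for the example and are not taken from the dataset.

```python
import csv
import io
from statistics import mean


def condition_means(handle):
    """Average each gene's counts over the Exp* and Diff* columns of a bulk table."""
    reader = csv.reader(handle)
    header = next(reader)
    # Assumption: column 0 holds gene names, all other columns are samples.
    groups = {"Exp": [], "Diff": []}
    for index, label in enumerate(header[1:], start=1):
        for prefix, columns in groups.items():
            if label.startswith(prefix):
                columns.append(index)
    result = {}
    for row in reader:
        result[row[0]] = {
            prefix: mean(float(row[i]) for i in columns)
            for prefix, columns in groups.items()
            if columns
        }
    return result


# Toy table with invented values, only to show the expected layout.
toy = io.StringIO("gene,Exp1,Exp2,Exp3,Diff1,Diff2,Diff3\nGENE_A,1,2,3,10,20,30\n")
print(condition_means(toy))
```

For a real file, the handle would come from something like `open("Bulk_OrgsExpVSDiff.csv", newline="")`.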
Single-cell RNA sequencing
For scRNA-seq, cells were FACS-sorted into 384-well plates pre-printed with primers (Single Cell Discoveries) and stored at -80°C until used for library preparation. Libraries were prepared according to the previously published VASA-Seq protocol for VASA-plate (DOI: 10.1038/s41587-022-01361-8) and sequenced on a NextSeq 2000 with a high-output 100-cycle flow cell (Illumina).
Data were processed and normalized according to the VASA-Seq workflow (DOI: 10.1038/s41587-022-01361-8) and include all called genes.
Datasets
Processed (post-QC and filtering)
File: counts_norm_processed.csv.gz
Processed count matrix with normalized reads, after QC and filtering.
File: obs_metadata_processed.csv
Observation metadata table for the processed count matrix, with information about tissue origin, cluster number, cell type, etc.
File: var_metadata_processed.csv
Variable metadata table for the processed count matrix, with information about the gene name, total counts, mean counts, etc.
Unprocessed (pre-QC and filtering)
File: counts_unprocessed.csv.gz
Unprocessed count matrix, before QC and filtering.
File: obs_metadata_unprocessed.csv
Observation metadata table for the unprocessed count matrix, with information about tissue origin, cluster number, cell type, etc.
File: var_metadata_unprocessed.csv
Variable metadata table for the unprocessed count matrix, with information about the gene name, total counts, mean counts, etc.
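The count matrices are gzip-compressed and fairly large, so it can help to inspect the header and a few rows before loading everything. The sketch below does this with the Python standard library; the function name `peek_matrix` is hypothetical, and the toy payload (including its orientation, with genes as rows) is invented, since the README does not state the matrix orientation.

```python
import csv
import gzip
import io


def peek_matrix(source, n_rows=3):
    """Return the header and the first n_rows data rows of a gzip-compressed CSV.

    `source` may be a file path or a binary file object.
    """
    with gzip.open(source, mode="rt", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        # zip stops after n_rows, so the rest of the file is never read.
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


# Invented in-memory example; a real call would pass "counts_unprocessed.csv.gz".
payload = gzip.compress(b"gene,cell1,cell2\nGENE_A,0,5\nGENE_B,2,0\n")
header, rows = peek_matrix(io.BytesIO(payload), n_rows=1)
print(header, rows)
```

Reading only the first rows decompresses just the start of the file, which keeps the check quick even for the 140.80 MB unprocessed matrix.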
File details
Details for: counts_norm_processed.csv.gz
Description: A comma-delimited file that contains read counts that represent the normalized number of sequencing reads mapped to each gene/transcript.
Format(s): .csv.gz (gzip-compressed CSV)
Sequencing Technology: Illumina
Details for: obs_metadata_processed.csv
Description: A comma-delimited file that contains observation metadata, providing information about each cell in the scRNA-seq dataset. Each row corresponds to a single cell, and each column represents a different metadata attribute.
Format(s): .csv
Variables:
- Plate_number: The plate number on which the cells were processed (Pl1 to Pl8).
- Cell_ID: A unique identifier for each cell, in the format hfPanc_seqbatchNumber-cellNumber, e.g. hfPanc1_4-001.
- total_counts: The total number of reads mapped to genes in each cell.
- total_counts_mt: The total number of reads mapped to mitochondrial genes in each cell.
- pct_counts_mt: The percentage of reads mapped to mitochondrial genes in each cell.
- total_counts_ribo: The total number of reads mapped to ribosomal genes in each cell.
- pct_counts_ribo: The percentage of reads mapped to ribosomal genes in each cell.
- percent_mt2: same as pct_counts_mt
- n_counts: The number of counts detected in each cell.
- n_genes: The number of genes detected in each cell.
- S_score: A score representing the cell cycle S phase activity.
- phase: The assigned cell cycle phase based on the scores (e.g., G1, S, G2M).
- leiden_0.4 to leiden_4.5: Cluster assignments from the Leiden clustering algorithm at different resolutions (0.4 to 2.0).
- louvain_0.4 to louvain_2.0: Cluster assignments from the Louvain clustering algorithm at different resolutions (0.4 to 2.0).
- kmeans_5 to kmeans_15: Cluster assignments from the k-means clustering algorithm with different numbers of clusters (5 to 15).
- hclust_5 to hclust_15: Cluster assignments from the hclust clustering algorithm with different numbers of clusters (5 to 15).
- cell type: The annotated cell type for each cell:
  - Acinar
  - Acinar Progenitor
  - Ductal
  - Endocrine
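A quick way to check the annotation is to tally cells per annotated type. The sketch below assumes the metadata column is literally named `cell type`, as listed above; the function name `count_cell_types` and the toy rows are hypothetical.

```python
import csv
import io
from collections import Counter


def count_cell_types(handle, column="cell type"):
    """Count how many cells carry each label in the given metadata column."""
    return Counter(row[column] for row in csv.DictReader(handle))


# Invented rows, only to show the expected layout.
toy = io.StringIO(
    "Cell_ID,cell type\n"
    "cell_a,Acinar\n"
    "cell_b,Ductal\n"
    "cell_c,Acinar\n"
)
print(count_cell_types(toy))
```

The same function can tally any of the clustering columns by passing a different `column` name.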
Details for: counts_unprocessed.csv.gz
Description: A comma-delimited file that contains read counts that represent the unprocessed number of sequencing reads mapped to each gene/transcript.
Format(s): .csv.gz (gzip-compressed CSV)
Sequencing Technology: Illumina
Details for: obs_metadata_unprocessed.csv
Description: A comma-delimited file that contains observation metadata, providing information about each cell in the scRNA-seq dataset. Each row corresponds to a single cell, and each column represents a different metadata attribute.
Format(s): .csv
Variables:
- Plate_number: The plate number on which the cells were processed (Pl1 to Pl8).
- Condition: Differentiation or Expansion
- Cell_ID: A unique identifier for each cell, in the format hfPanc_seqbatchNumber-cellNumber, e.g. hfPanc1_4-001.
- total_counts: The total number of reads mapped to genes in each cell.
- total_counts_mt: The total number of reads mapped to mitochondrial genes in each cell.
- pct_counts_mt: The percentage of reads mapped to mitochondrial genes in each cell.
- total_counts_ribo: The total number of reads mapped to ribosomal genes in each cell.
- pct_counts_ribo: The percentage of reads mapped to ribosomal genes in each cell.
- percent_mt2: same as pct_counts_mt
- n_counts: The number of counts detected in each cell.
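The percentage columns can be recomputed from the count columns as a sanity check. The sketch below assumes that `pct_counts_mt` equals 100 × `total_counts_mt` / `total_counts`, which follows from the descriptions above but is not stated as a formula in the README. The 20% threshold and the function names are purely illustrative and are not the filtering criteria used by the authors.

```python
import csv
import io


def percent_mt(row):
    """Recompute the mitochondrial percentage for one metadata row."""
    total = float(row["total_counts"])
    if total == 0:
        return 0.0
    return 100.0 * float(row["total_counts_mt"]) / total


def cells_above_threshold(handle, max_pct_mt):
    """Return the Cell_ID of every cell whose recomputed percentage exceeds max_pct_mt."""
    return [
        row["Cell_ID"]
        for row in csv.DictReader(handle)
        if percent_mt(row) > max_pct_mt
    ]


# Invented rows: cell_a has 5% mitochondrial reads, cell_b has 40%.
toy = io.StringIO(
    "Cell_ID,total_counts,total_counts_mt\n"
    "cell_a,1000,50\n"
    "cell_b,1000,400\n"
)
print(cells_above_threshold(toy, max_pct_mt=20.0))  # ['cell_b']
```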
Details for: var_metadata_processed.csv
Description: A comma-delimited file that contains variable metadata, providing information about each gene in the scRNA-seq dataset. Each row corresponds to a single gene, and each column represents a different metadata attribute.
Format(s): .csv
Variables:
- Gene_name: Gene names
- mt: whether the gene is a mitochondrial gene (TRUE or FALSE).
- ribo: whether the gene is a ribosomal gene (TRUE or FALSE).
- n_cells_by_counts: the number of cells in which a particular gene was detected.
- mean_counts: the average expression level of a gene across all cells.
- pct_dropout_by_counts: the percentage of cells where a gene has zero counts (not detected).
- total_counts: the total number of counts for a gene across all cells.
- n_cells: the number of cells in which a gene or feature is present.
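To make the per-gene columns concrete, the sketch below derives them from the counts of one gene across cells. The formulas are an interpretation of the descriptions above (for example, dropout as the share of cells with zero counts), not a statement of how the authors computed them; the function name `gene_metrics` is hypothetical.

```python
def gene_metrics(counts):
    """Summarize one gene given its count in every cell.

    Assumes `counts` contains at least one cell.
    """
    n_cells = len(counts)
    detected = sum(1 for value in counts if value > 0)
    total = sum(counts)
    return {
        "n_cells_by_counts": detected,
        "mean_counts": total / n_cells,
        "pct_dropout_by_counts": 100.0 * (n_cells - detected) / n_cells,
        "total_counts": total,
    }


# Invented counts for one gene across four cells.
print(gene_metrics([0, 3, 0, 1]))
```

For the toy input, the gene is detected in two of four cells, so the dropout percentage is 50 and the mean count is 1.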
Details for: var_metadata_unprocessed.csv
Description: A comma-delimited file that contains variable metadata, providing information about each gene in the scRNA-seq dataset. Each row corresponds to a single gene, and each column represents a different metadata attribute.
Format(s): .csv
Variables:
- Gene_name: Gene names
- mt: whether the gene is a mitochondrial gene (TRUE or FALSE).
- ribo: whether the gene is a ribosomal gene (TRUE or FALSE).
- n_cells_by_counts: the number of cells in which a particular gene was detected.
- mean_counts: the average expression level of a gene across all cells.
- pct_dropout_by_counts: the percentage of cells where a gene has zero counts (not detected).
- total_counts: the total number of counts for a gene across all cells.
Methods
Bulk RNA sequencing
Human fetal pancreatic organoids were cultured in expansion, endocrine, or acinar cell differentiation conditions for 10 days.
Organoids were collected and RNA extraction was performed according to the manufacturer's instructions. RNA integrity was determined using the Agilent RNA 6000 Nano kit with the Agilent 2100 Bioanalyzer (Agilent). RNA integrity number (RIN) values ranged from 9.0 to 10.0; no sample with a RIN below 9.0 was used for bulk RNA sequencing. RNA concentrations were determined using the Qubit RNA HS Assay Kit (Thermo Fisher). RNA libraries were prepared with the TruSeq Stranded messenger RNA polyA kit and sequenced paired-end (2 × 50 base pairs) on an Illumina NextSeq 2000. Reads were mapped to the human GRCh38 genome assembly using STAR (DOI: 10.1093/bioinformatics/bts635).
Single-cell RNA sequencing
For scRNA-seq, cells were FACS-sorted into 384-well plates pre-printed with primers (Single Cell Discoveries) and stored at -80°C until used for library preparation. Libraries were prepared according to the previously published VASA-Seq protocol for VASA-plate (DOI: 10.1038/s41587-022-01361-8) and sequenced on a NextSeq 2000 with a high-output 100-cycle flow cell (Illumina).
Data were processed and normalized according to the VASA-Seq workflow (DOI: 10.1038/s41587-022-01361-8) and include all called genes.