Single-nucleus multi-omics identifies shared and distinct pathways in Pick’s and Alzheimer’s disease
Data files
Jul 16, 2025 version files 28.20 MB
- TableS1.xlsx (1.51 MB)
- README.md (9.04 KB)
- TableS2.xlsx (24.07 MB)
- TableS3.xlsx (29.02 KB)
- TableS4.xlsx (848.31 KB)
- TableS5.xlsx (1.73 MB)
Abstract
The study of neurodegenerative diseases, particularly tauopathies like Pick’s disease (PiD) and Alzheimer’s disease (AD), offers insights into the underlying regulatory mechanisms. By investigating transcriptomic and epigenomic variations in these conditions, we identified critical regulatory changes driving disease progression, revealing potential therapeutic targets. Our comparative analyses uncovered disease-enriched non-coding regions and genome-wide transcription factor (TF) binding differences, linking them to target genes. Notably, we identified a distal human-gained enhancer (HGE) associated with E3 ubiquitin ligase (UBE3A), highlighting disease-specific regulatory alterations. Additionally, fine-mapping of AD risk genes uncovered loci enriched in microglial enhancers and accessible in other cell types. Shared and distinct TF binding patterns were observed in neurons and glial cells across PiD and AD. We validated our findings using CRISPR to excise a predicted enhancer region in UBE3A and developed an interactive database, scROAD, to visualize predicted single-cell TF occupancy and regulatory networks.
https://doi.org/10.5061/dryad.h9w0vt4t9
Description of the data and file structure
Tab | Information |
---|---|
tableS1A | metadata for PiD and AD |
tableS1B | snATAC-seq FindAllMarkers on Gene Activity |
tableS1C | cell counts for each celltype in snATAC-seq and snRNA-seq |
tableS1D | snRNA-seq FindAllMarkers on Gene Expression |
tableS2A | a complete peakset of snATAC-seq for both PiD and AD |
tableS2B | a summary count for peaktype and biotype in PiD and AD DARs (p < 0.05) |
tableS2C | DAR_PiDvsControl (pct > 0.05) - includes all statistics, not just significant ones |
tableS3A | snATAC-seq overlapped SuSiE fine-mapped credible sets (PIP > 0.95) + overlapped with Xiong et al., 2023 |
tableS3B | a complete list of SuSiE fine-mapped credible sets (PIP > 0.95) |
tableS3C | GWAS gene enrichment analysis in PiD DGE |
tableS4A | DGE - MAST glm PiD vs Control |
tableS4B | DGE - MAST glm AD vs Control |
tableS5 | Table of HGEs overlapped with snATAC-seq peaks |
Files and variables
File: TableS1.xlsx
Description: TableS1 - Supplemental Table 1
(Table S1A) Metadata for PiD and AD.
*In accordance with DataDryad.org’s data sharing policy, sensitive metadata fields (Brainbank, Age, Sample ID, and PMI) have been masked.
- SampleID: Sample ID, a random, anonymized code assigned to each sequenced sample to tell samples apart. This code (number) is used to match the GEO FASTQ files.
- Assignment: Sequencing group.
- BrainBank: Brain Bank.
- Age: Age at death.
- Sex: Sample Sex.
- PMI: Postmortem Interval.
- Diagnosis: Sample Diagnosis.
- Region: Sample Brain Region.
- Dataset: Disease Dataset.
(Table S1B) Results from snATAC-seq FindAllMarkers analysis on gene activity.
- avg_log2FC: Average Log2 Fold Change.
- p_val_adj: FDR adjusted p-value.
- cluster: celltype cluster.
- gene: gene name.
(Table S1C) Cell counts for each cell type across snATAC-seq and snRNA-seq datasets.
- Celltype: Cell type.
- Cell_Count: Nuclei Count by cell type.
- snATAC_or_snRNA: Data from snATAC or snRNA.
- Dataset: Disease dataset.
(Table S1D) Results from snRNA-seq FindAllMarkers analysis on gene expression.
- avg_log2FC: Average Log2 Fold Change.
- p_val_adj: FDR adjusted p-value.
- cluster: celltype cluster.
- gene: gene name.
File: TableS2.xlsx
Description: TableS2 - Supplemental Table 2
(Table S2A) Complete peak set of snATAC-seq for both PiD and AD.
- seqnames: peak location on chromosome.
- start: start position on chromosome.
- end: end position on chromosome.
- nearestGene: nearest gene name to the peak location.
- peakType: peak type.
(Table S2B) Summary counts of peak type and biotype in PiD and AD DARs (p < 0.05).
- peakType: peak type.
- gene_biotype: biological functional type (biotype) of the peak.
- n: Count by biotype.
- TotalN: Count by peakType.
- Dataset: Dataset.
(Table S2C) DAR analysis for PiD vs Control (pct > 0.05), including all statistics, not just significant ones.
- gene: peak location on open chromatin.
- p_val: p-value.
- avg_log2FC: Average Log2 Fold Change.
- p_val_adj: FDR adjusted p-value.
- group: celltype cluster.
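Because Table S2C keeps every tested peak, users will usually filter it before counting DARs. The following is a minimal sketch of that filtering step, assuming the sheet has already been read into a list of dictionaries keyed by the column names above. The function name, the 0.05 cutoff on `p_val_adj`, the placeholder peak and cell-type labels, and the reading of the sign of `avg_log2FC` as direction of change are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter


def summarize_significant_dars(rows, alpha=0.05):
    """Count significant peaks per (cell type, sign of avg_log2FC).

    `rows` is an iterable of dicts keyed by the Table S2C column names.
    `alpha` is an assumed cutoff on the adjusted p-value.
    """
    counts = Counter()
    for row in rows:
        if float(row["p_val_adj"]) >= alpha:
            continue  # skip peaks that do not pass the adjusted p-value cutoff
        # Assumption: the sign of avg_log2FC gives the direction of change.
        # A value of exactly zero is counted as "negative" here.
        sign = "positive" if float(row["avg_log2FC"]) > 0 else "negative"
        counts[(row["group"], sign)] += 1
    return counts


# Made-up rows for illustration only; these are not values from the dataset.
example_rows = [
    {"gene": "peak_1", "p_val": 1e-6, "avg_log2FC": 0.8, "p_val_adj": 1e-3, "group": "celltype_A"},
    {"gene": "peak_2", "p_val": 0.04, "avg_log2FC": -0.3, "p_val_adj": 0.6, "group": "celltype_A"},
    {"gene": "peak_3", "p_val": 1e-5, "avg_log2FC": -1.1, "p_val_adj": 0.01, "group": "celltype_B"},
]

for key, n in sorted(summarize_significant_dars(example_rows).items()):
    print(key, n)
```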
File: TableS3.xlsx
Description: TableS3 - Supplemental Table 3
(Table S3A) Overlap of snATAC-seq peaks with SuSiE fine-mapped credible sets (PIP > 0.95) and Xiong et al., 2023 (27).
- X: SuSiE automatically assigned credible set.
- GeneName: Credible set of SNPs identified near this gene, calculated using SuSiE fine-mapping.
- Lead_SNPs: Single nucleotide polymorphisms (SNPs) that show the strongest association with a particular trait or disease within a specific region of the genome.
- Credible_Set: A set of SNPs for which there is a high posterior probability (95%) that the true causal variant is among them.
- FTD_2014: SuSiE r2 for FTD GWAS credible set.
- AD_2022: SuSiE r2 for AD GWAS credible set.
- Overlapped_celltypes_in_ThisStudy: Peaks of cell type from this study that overlapped with credible set.
- lof.pLI: probability of a gene being loss-of-function intolerant.
- lof.oe_ci.upper: LOEUF - upper bound of 90% confidence interval for o/e ratio for high confidence pLoF variants.
- Overlapped_celltypes_in_Xiong2023Cell: Peaks of cell type from Xiong 2023 Cell that overlapped with credible set.
(Table S3B) Complete list of SuSiE fine-mapped credible sets (PIP > 0.95).
- Number: Row Number.
- X: SuSiE automatically assigned credible set.
- GeneName: Credible set of SNPs identified near this gene, calculated using SuSiE fine-mapping.
- Lead_SNPs: Single nucleotide polymorphisms (SNPs) that show the strongest association with a particular trait or disease within a specific region of the genome.
- Credible_Set: A set of SNPs for which there is a high posterior probability (95%) that the true causal variant is among them.
- FTD_2014: SuSiE r2 for FTD GWAS credible set.
- AD_2022: SuSiE r2 for AD GWAS credible set.
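The Credible_Set definition above can be illustrated with a small worked example. This is a simplified sketch of the general idea (rank SNPs by posterior probability and add them until the cumulative probability reaches 95%), not SuSiE's actual procedure. The function name, SNP identifiers, and probabilities are hypothetical.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest top-ranked SNP set whose summed posterior
    probability reaches `coverage`, together with that summed probability.

    `posteriors` maps SNP id -> posterior probability of being the causal
    variant for a single association signal (values sum to about 1).
    """
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for snp, prob in ranked:
        selected.append(snp)
        cumulative += prob
        if cumulative >= coverage:
            break  # coverage reached; remaining SNPs are left out
    return selected, cumulative


# Hypothetical probabilities for one signal, for illustration only.
example = {"snp_a": 0.70, "snp_b": 0.20, "snp_c": 0.06, "snp_d": 0.04}
snps, total = credible_set(example)
print(snps, round(total, 2))
```

In this toy case the first three SNPs are needed to reach 95% coverage, so the fourth is excluded from the set.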
(Table S3C) GWAS gene enrichment analysis in PiD DGE.
File: TableS4.xlsx
Description: TableS4 - Supplemental Table 4
(Table S4A) DGE results using MAST glm for PiD vs Control.
- avg_log2FC: Average Log2 Fold Change.
- p_val_adj: FDR adjusted p-value.
- gene: Gene name.
- group: Celltype cluster.
(Table S4B) DGE results using MAST glm for AD vs Control.
- avg_log2FC: Average Log2 Fold Change.
- p_val_adj: FDR adjusted p-value.
- gene: Gene name.
- group: Celltype cluster.
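Since Tables S4A and S4B share the same columns, they can be compared directly to list genes that change in both diseases or in only one. The sketch below assumes both sheets have been read into lists of dictionaries keyed by the column names above. The function names, the 0.05 cutoff on `p_val_adj`, and the gene and cell-type labels are illustrative assumptions.

```python
from collections import defaultdict


def significant_genes_by_celltype(rows, alpha=0.05):
    """Map each cell type (`group`) to the set of genes with p_val_adj < alpha."""
    genes = defaultdict(set)
    for row in rows:
        if float(row["p_val_adj"]) < alpha:
            genes[row["group"]].add(row["gene"])
    return genes


def shared_and_distinct(pid_rows, ad_rows, alpha=0.05):
    """Split significant genes per cell type into shared, PiD-only, and AD-only."""
    pid = significant_genes_by_celltype(pid_rows, alpha)
    ad = significant_genes_by_celltype(ad_rows, alpha)
    result = {}
    for celltype in sorted(set(pid) | set(ad)):
        p = pid.get(celltype, set())
        a = ad.get(celltype, set())
        result[celltype] = {"shared": p & a, "pid_only": p - a, "ad_only": a - p}
    return result


# Made-up rows for illustration only; gene and cell-type names are placeholders.
pid_example = [
    {"gene": "GENE_A", "avg_log2FC": 0.5, "p_val_adj": 0.001, "group": "celltype_A"},
    {"gene": "GENE_B", "avg_log2FC": -0.4, "p_val_adj": 0.02, "group": "celltype_A"},
    {"gene": "GENE_C", "avg_log2FC": 0.1, "p_val_adj": 0.80, "group": "celltype_A"},
]
ad_example = [
    {"gene": "GENE_A", "avg_log2FC": 0.6, "p_val_adj": 0.003, "group": "celltype_A"},
    {"gene": "GENE_D", "avg_log2FC": 0.9, "p_val_adj": 0.010, "group": "celltype_B"},
]

for celltype, sets in shared_and_distinct(pid_example, ad_example).items():
    print(celltype, {name: sorted(genes) for name, genes in sets.items()})
```

This comparison ignores the direction of change. A stricter version would also require `avg_log2FC` to have the same sign in both tables before calling a gene shared.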
File: TableS5.xlsx
Description: TableS5 - Supplemental Table 5
Human-gained enhancers (HGEs) overlapped with snATAC-seq peaks in PiD and AD datasets.
- Peak1: Promoter peak location.
- Peak2: Enhancer peak location.
- Peak1_type: Peaktype of Peak1 - Promoter.
- Peak1_nearestGene: Gene whose promoter region overlaps with Peak1.
- Peak2_type: Peaktype of Peak2 - Enhancer.
- hge: Human gained enhancer location.
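For readers who want to repeat an overlap check on their own regions, the notion of one region overlapping another can be sketched as a simple interval test. The example assumes regions have been parsed into `(chromosome, start, end)` tuples with half-open coordinates. The function names, the tuple representation, the coordinate convention, and the coordinates are all assumptions, and the location format used in this table should be checked before parsing.

```python
def overlaps(a, b):
    """Return True if two (chrom, start, end) regions share at least one base.

    Assumes half-open coordinates: start is inclusive, end is exclusive.
    """
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def peaks_overlapping_hges(peaks, hges):
    """Return every (peak, hge) pair that overlaps, by brute-force comparison."""
    return [(peak, hge) for peak in peaks for hge in hges if overlaps(peak, hge)]


# Hypothetical coordinates for illustration only.
example_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
example_hges = [("chr1", 150, 300), ("chr2", 200, 250)]

for peak, hge in peaks_overlapping_hges(example_peaks, example_hges):
    print(peak, "overlaps", hge)
```

The pairwise loop is fine for small inputs. For genome-scale peak sets, a sorted sweep or an interval tree would be the usual choice.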
File: DataS1_iPSC_crispr_ko_karyotype_pluripotency.zip
Description: DataS1 - iPSC CRISPR KO karyotype and pluripotency data files
File 1: Project_Report_UBE3A__Swarup_6_30_22.pdf
This document details the design and results of the UBE3A CRISPR knockout experiment. It includes:
- CRISPR knockout design strategy
- Experimental results of the knockout
- Quality control data for the generated cell lines
- Validation of iPSC characteristics
File 2: Microarray REPORT CLG-46814_ADRC76.pdf
This is the microarray report for the parental iPSC line ADRC76, which was provided by the UCI Alzheimer’s Disease Research Center (ADRC) Induced Pluripotent Stem Cell Core.
Source: Fibroblasts
File 3: Microarray REPORT CLG-46815_C40.pdf
This microarray report corresponds to Clone 40 derived from the ADRC76 iPSC line.
File 4: Microarray REPORT CLG-46816_C14.pdf
This microarray report corresponds to Clone 14 derived from the ADRC76 iPSC line.
Interactive database
An interactive companion resource for this dataset is available at http://swaruplab.bio.uci.edu/scROAD, providing user-friendly access to the Single-cell Regulatory Occupancy Archive in Dementia (scROAD).
Human subjects data
All human tissue samples used in this study were obtained from brain banks with appropriate institutional ethical approval and informed consent from donors or their legal representatives. Consent included permission for de-identified data to be shared publicly.
To ensure compliance with data protection policies, all personally identifiable information (PII) has been removed. Metadata fields that could potentially be used to identify individuals (brain bank name, exact age, sample ID, and postmortem interval (PMI)) have been masked or anonymized in the accompanying metadata and README file.
This dataset contains only de-identified data and is shared in accordance with the ethical standards of the contributing institutions and the data sharing policy of DataDryad.org.