Human tau mutations in cerebral organoids induce a progressive dyshomeostasis of cholesterol

Cite this dataset

Glasauer, Stella M.K. et al. (2023). Human tau mutations in cerebral organoids induce a progressive dyshomeostasis of cholesterol [Dataset]. Dryad.


Single-cell RNA sequencing (Drop-seq) data of forebrain organoids carrying the pathogenic MAPT R406W and V337M mutations. Organoids were generated from five heterozygous donor lines (two R406W lines and three V337M lines) and their respective CRISPR-corrected isogenic controls. Organoids were also generated from one homozygous R406W donor line. Single-cell sequencing was performed at 1, 2, 3, 4, 6 and 8 months of organoid maturation.


Single-cell transcriptomes were obtained using Drop-seq (Macosko et al., 2015).

Count matrices were generated using the Drop-seq tools package (Macosko et al. 2015), with full details available online. Briefly, raw reads were converted to BAM files, cell barcodes and UMIs were extracted, and low-quality reads were removed. Adapter sequences and polyA tails were trimmed, and reads were converted to FASTQ for STAR alignment (STAR version 2.6). Mapping to the human genome (hg19 build) was performed with default settings. Reads mapped to exons were kept and tagged with gene names, bead synthesis errors were corrected, and a digital gene expression matrix was extracted from the aligned library. We extracted data from twice as many cell barcodes as the number of cells targeted (NUM_CORE_BARCODES = 2 x number of targeted cells).
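To make the last two steps concrete, here is a minimal Python sketch of how a digital gene expression matrix can be derived from tagged reads. It is an illustration only, not the Drop-seq tools implementation: the function name `digital_expression` and the input layout (tuples of cell barcode, UMI and gene) are hypothetical, barcodes are assumed to be ranked by read count, and the sketch counts exact-match UMIs without any error-tolerant UMI collapsing.

```python
from collections import Counter, defaultdict


def digital_expression(reads, targeted_cells):
    """Count distinct UMIs per (cell barcode, gene).

    reads: iterable of (cell_barcode, umi, gene) tuples for exon-mapped reads
           (hypothetical input layout, chosen for illustration).
    targeted_cells: number of cells targeted in the run.
    """
    reads = list(reads)

    # Keep twice as many barcodes as cells targeted, as described in the text.
    num_core_barcodes = 2 * targeted_cells

    # Assumption: core barcodes are the ones with the most reads.
    reads_per_barcode = Counter(barcode for barcode, _, _ in reads)
    core = {barcode for barcode, _ in reads_per_barcode.most_common(num_core_barcodes)}

    # Collect distinct UMIs so that PCR duplicates are counted once.
    umis = defaultdict(set)
    for barcode, umi, gene in reads:
        if barcode in core:
            umis[(barcode, gene)].add(umi)

    return {key: len(umi_set) for key, umi_set in umis.items()}


example_reads = [
    ("AAA", "u1", "MAPT"),
    ("AAA", "u1", "MAPT"),  # duplicate UMI, counted once
    ("AAA", "u2", "MAPT"),
    ("AAA", "u3", "APOE"),
    ("CCC", "u1", "APOE"),
    ("CCC", "u2", "APOE"),
    ("GGG", "u9", "MAPT"),  # lowest-ranked barcode, dropped when targeted_cells=1
]
matrix = digital_expression(example_reads, targeted_cells=1)
```

With one targeted cell, the two barcodes with the most reads are kept and the third is discarded, which mirrors the 2x rule on a toy scale.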

Downstream analysis was performed using Seurat 3.0 in R version 3.6.3. An individual Seurat object was generated for each sample, and each sample was filtered and clustered individually. Cells with < 300 genes detected were filtered out, as were cells with > 10% mitochondrial gene content. Counts data were log-normalized using the default NormalizeData function and the default scale factor of 1e4. Then, the top 2000 variable genes were identified using the Seurat FindVariableFeatures function (selection.method = "vst", nfeatures = 2000), followed by scaling and centering using the default ScaleData function. Principal component analysis was carried out on the scaled expression values of the 2000 top variable genes, and the cells were clustered using the first 50 principal components (PCs) as input to the FindNeighbors function and a resolution of 0.4 in the FindClusters function. Non-linear dimensionality reduction was performed by running UMAP on the first 50 PCs.

Following clustering and dimensionality reduction, putative cell doublets were identified using DoubletFinder (McGinnis et al. 2019), assuming a doublet formation rate of 5%. For each sample, the optimal pK value was identified based on the results of the paramSweep_v3, summarizeSweep and find.pK functions of the DoubletFinder package. Rather than using the default paramSweep_v3 function unchanged, we extended the upper range of computed pK values to 1.2. We visually verified that cells identified as doublets had high nFeatures (number of genes expressed) by plotting the pANN metric against nFeatures. For samples not showing this correlation, we adjusted the pK value to the next highest peak in the pK/BCmetric plot. Finally, the individual Seurat objects were merged.
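The filtering thresholds, the log-normalization and the doublet-rate assumption can be written out as plain arithmetic. The Python sketch below is illustrative only and does not call Seurat or DoubletFinder. The function names are hypothetical, the "MT-" prefix for mitochondrial genes is an assumption about the gene naming, and rounding the expected doublet count to the nearest integer follows common DoubletFinder usage rather than anything stated above.

```python
import math

MIN_GENES = 300            # cells with fewer detected genes are removed
MAX_MITO_FRACTION = 0.10   # cells above 10% mitochondrial counts are removed
SCALE_FACTOR = 1e4         # default scale factor for log-normalization
DOUBLET_RATE = 0.05        # assumed doublet formation rate


def passes_qc(counts):
    """counts: dict mapping gene name to UMI count for one cell."""
    total = sum(counts.values())
    if total == 0:
        return False
    n_genes = sum(1 for c in counts.values() if c > 0)
    # Assumption: mitochondrial genes carry the "MT-" prefix.
    mito = sum(c for gene, c in counts.items() if gene.startswith("MT-"))
    return n_genes >= MIN_GENES and mito / total <= MAX_MITO_FRACTION


def log_normalize(counts):
    """Per-cell log-normalization: log1p(count / total_counts * scale factor)."""
    total = sum(counts.values())
    return {gene: math.log1p(c / total * SCALE_FACTOR) for gene, c in counts.items()}


def expected_doublets(n_cells):
    """Expected number of doublets in a sample at the assumed rate."""
    return round(DOUBLET_RATE * n_cells)


# Toy cells: 300 detected genes passes, 299 fails, heavy mitochondrial content fails.
good_cell = {f"G{i}": 1 for i in range(300)}
sparse_cell = {f"G{i}": 1 for i in range(299)}
mito_cell = dict(good_cell, **{"MT-CO1": 100})
normalized = log_normalize({"A": 1, "B": 3})
```

In this sketch a sample of 1000 retained cells would be expected to contain about 50 doublets at the 5% rate.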

Usage notes

The Seurat package in R is required to read the FB1.seurat file.



National Institute on Aging

Rainwater Charitable Foundation

National Institute of Neurological Disorders and Stroke

Swiss National Science Foundation

Larry L. Hillblom Foundation