Cross-species analysis identifies mitochondrial dysregulation as a functional consequence of the schizophrenia-associated 3q29 deletion
Data files
Jun 18, 2024 version files 3.13 MB
- all_ocr_ecar.csv (57.23 KB)
- anno_data.csv (301 B)
- brainspan_heatmap.csv (2.02 KB)
- hek_protein_pak2.csv (246 B)
- hek_seahorse_measures.csv (3.77 KB)
- hek_seahorse_ocr_ecar.csv (19.68 KB)
- human_common_down.csv (9.72 KB)
- human_common_up.csv (17.35 KB)
- human_deg_table.csv (1.70 MB)
- human_pathways_combined_timepoints.csv (832.33 KB)
- mm_astros.csv (65.57 KB)
- mm_nb.csv (7.47 KB)
- mm_neurons.csv (103.56 KB)
- mm_opc.csv (34.80 KB)
- mm_rg.csv (13.46 KB)
- mouse_down.csv (28.31 KB)
- mouse_oxphos_blots.csv (620 B)
- mouse_up.csv (10.92 KB)
- neuron_bg.csv (161.88 KB)
- pathway_data.csv (48.31 KB)
- README.md (10.52 KB)
Abstract
The 1.6 Mb deletion at chromosome 3q29 (3q29Del) is the strongest identified genetic risk factor for schizophrenia, but the effects of this variant on neurodevelopment are not well understood. We interrogated the developing neural transcriptome in two experimental model systems with complementary advantages: isogenic human cortical organoids and isocortex from the 3q29Del mouse model. We profiled transcriptomes from isogenic cortical organoids that were aged for 2 months and 12 months, as well as perinatal mouse isocortex, all at single-cell resolution. Systematic pathway analysis implicated dysregulation of mitochondrial function and energy metabolism. These molecular signatures were supported by analysis of oxidative phosphorylation protein complex expression in mouse brains and assays of mitochondrial function in engineered cell lines. Together, these data indicate that metabolic disruption is associated with 3q29Del and is conserved across species.
README: Cross-species analysis identifies mitochondrial dysregulation as a functional consequence of the schizophrenia-associated 3q29 deletion
Use the following code and input data sets to generate the figures in this paper.
Create a directory for each figure, copy the code and input files into that folder, and put all folders in a directory named "Purcell 2023 dataset". Place the Purcell 2023 dataset directory on the desktop for file paths in R scripts to work smoothly.
Description of the data and file structure
Fig. 1
Cross-species single-cell sequencing. (A) A single-cell RNA-sequencing experiment was performed in isogenic human induced pluripotent stem cell (iPSC)-derived cortical organoids at two time points (2 months and 12 months in vitro) and in postnatal day 7 mouse isocortex. An overview of the strategy to collect and filter differential gene expression data from both model systems is illustrated. (B) The human 3q29 deletion locus is nearly perfectly syntenic with a region of mouse chromosome 16, with the same gene order inverted. Corresponding loci are illustrated in the same orientation to facilitate clearer cross-species comparison. Bex6 (in gray) is the only gene present in the mouse locus but not in the human locus. (C and E) UMAP dimensionality reduction plots colored by the main cell types identified in human (C) and mouse (E) experiments. Human and mouse cells showed no obvious difference in gross distribution by genotype (D, F), but human cells were divided in their transcriptomic clustering patterns by time point (D, top). (G) The average expression profile of each sample was correlated (Spearman) to BrainSpan gene expression data profiling the human brain transcriptome in postmortem specimens across the lifespan. Abbreviations: pcw, post-conception weeks (prenatal); m, months (postnatal); y, years (postnatal).
- Fig. 1 code.R > used to generate panels in Figure 1
- anno_data.csv > input data file for code
  - Annotation data for heat map visualization
- brainspan_heatmap.csv > input data file for code
  - Columns: scRNA-seq samples (human cortical organoids and mouse cortex)
  - Rows: Age (pcw = post-conception week; m = month; y = year)
- scRNA-seq Seurat Objects (contain all scRNA-seq data, required for R script above). Refer to Methods for more details on the generation of samples and data processing.
  - Mouse isocortex Seurat Object: ms_obj.rds
  - Human cortical organoid Seurat Object: sub_obj2.rds
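The Spearman correlation behind brainspan_heatmap.csv can be illustrated with a small standalone sketch. This is Python, not the deposited R code, and the two expression vectors are made-up values standing in for one sample's average expression profile and one BrainSpan age group. Spearman's rho is computed here as the Pearson correlation of average ranks.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene average expression values (illustration only).
sample_profile = [0.2, 1.5, 3.1, 0.0, 2.4]
brainspan_profile = [0.1, 1.9, 2.8, 0.3, 2.0]
rho = spearman(sample_profile, brainspan_profile)
```

Repeating this for every sample and age group pair would give a matrix with the layout described above (samples as columns, ages as rows).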
Fig. 2
Transcriptomic evidence of metabolic changes in 3q29Del. The umbrella pathways most frequently found to be differentially expressed in cortical organoids (A) are shown for up-regulated (B) and down-regulated (C) genes. Oxidative phosphorylation (OXPHOS) was enriched among both increased and decreased genes, but all clusters contributing to up-regulated OXPHOS were from 2-month organoids and all clusters contributing to down-regulated OXPHOS were from 12-month organoids. (D) Example violin plots visualizing log-normalized expression data of genes dysregulated in 2-month organoid clusters: MT-CO3 (increased in 3q29Del) encodes the respiratory chain complex IV subunit COX3, and LDHA (decreased in 3q29Del) encodes a key enzyme in glycolysis. (E) Example violin plots visualizing log-normalized expression data of genes dysregulated in 12-month organoid clusters: MT-ND1 (decreased in 3q29Del) encodes a component of respiratory chain complex I and MT-ATP6 (decreased in 3q29Del) encodes a component of the ATP synthase complex. The most frequently up- (G) and down- (I) regulated umbrella pathways in mouse isocortex (F) are shown. Treemaps derived by Revigo analysis (H and J) display the hierarchical organization of specific Gene Ontology Biological Processes (GO:BP) identified in pathway analysis. Similar colors denote semantic similarity. The size of each rectangle is proportional to the number of clusters exhibiting over-representation of a given GO:BP term. Abbreviations: OPC, oligodendrocyte progenitor cells; NSC, neural stem cells; cl, cluster.
- Fig. 2 code.R > used to generate panels in Figure 2
- mouse_down.csv > input data
  - Columns:
    - Term - gene ontology pathway
    - Cluster - Seurat cluster
    - Direction - Up or Down
    - Cell_Type - Annotation of cluster cell composition
    - Cell_Class - Broader grouping of cell types
    - source - gene ontology database (GO:BP = Gene Ontology: Biological Process)
    - term_id --> intersections - output of g:Profiler analysis
- human_pathways_combined_timepoints.csv > input data
  - Output of g:Profiler analysis
- human_common_down.csv > input data
- human_common_up.csv > input data
- mouse_up.csv > input data
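The treemap rectangles in Fig. 2 are sized by the number of clusters in which a term is over-represented. A minimal Python sketch of that tally follows. It is not the deposited R code. It assumes only the Term, Cluster, and Direction columns described above, and the inline rows are invented placeholders, not values from mouse_down.csv.

```python
import csv
import io
from collections import defaultdict

# Invented rows using the documented column names (illustration only).
SAMPLE_CSV = """Term,Cluster,Direction
aerobic respiration,3,Down
aerobic respiration,5,Down
aerobic respiration,5,Down
ATP metabolic process,3,Down
"""


def clusters_per_term(handle):
    """Count the distinct clusters in which each term appears."""
    clusters = defaultdict(set)
    for row in csv.DictReader(handle):
        clusters[row["Term"]].add(row["Cluster"])
    return {term: len(ids) for term, ids in clusters.items()}


term_counts = clusters_per_term(io.StringIO(SAMPLE_CSV))
```

To run the same tally on a deposited file, pass an open file handle in place of the `io.StringIO` object.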
Fig. 3
Common patterns of differential gene expression in two major mouse and human cell types. Astrocytes were identified in human cortical organoids (12 months) and mouse isocortex. Corresponding clusters are color-coded in blue in UMAP projections (A). The human homologs of mouse DEGs identified by MAST analysis were compared to organoid DEGs based on direction of change and a significant overlap was observed between the down-regulated DEGs of mouse and organoid astrocyte clusters (B). Pathway analysis of overlapping DEGs showed that all significantly enriched Gene Ontology: Biological Process (GO:BP) and Reactome (REAC) terms were related to mitochondrial function and metabolism (C). Upper and deep-layer excitatory neuron DEGs were pooled and unique organoid DEGs were compared to the human homologs of mouse DEGs based on the direction of change. Corresponding clusters are color-coded in red in UMAP projections (D). There was a significant overlap between the DEGs of mouse and organoid excitatory neuron clusters for both up-regulated and down-regulated genes (E). Decreased genes were heavily enriched for GO:BP and REAC terms related to mitochondrial function and cellular respiration (F).
- Fig. 3 code.R > used to generate panels in Figure 3
- mm_hs_cell_type_overlaps > contains input files for the R script above
Fig. 4
Mitochondrial phenotypes in 3q29 mice and engineered cell lines. Mitochondrial fractions from adult mouse brain lysates were found to have selective decreases in components of OXPHOS Complexes II and IV (A, quantified in B, N=5). At least 7 3q29-encoded proteins interact with mitochondria-localized proteins (C, from Antonicka et al. 2020). Symbol size reflects topological coefficients. HEK cell lines were engineered to carry either the heterozygous 3q29Del or completely lack PAK2, as shown by Western blot (D). Control HEK-293T cells (CTRL) transition from a glycolytic to a more aerobic cellular respiration state in galactose medium (E). Oxygen consumption rate (OCR) is significantly increased by 48-hour galactose medium challenge in CTRL cells (F) but not in 3q29 or PAK2 cells. Both 3q29 and PAK2 cells displayed increased baseline OCR (G) and decreased response to galactose (H). In the glucose medium, 3q29 cells showed reduced spare capacity (I) and increased ATP production (J). Proton leak (K) was found to be increased in 3q29 cells in glucose and decreased in PAK2 cells in galactose. Maximal respiration (L) was significantly elevated in 3q29 cells in glucose but was unchanged from CTRL in galactose conditions.
- Fig. 4 code.R > used to generate panels in Figure 4
- blot images
  - Contains mouse OXPHOS blot images
- mouse_oxphos_blots.csv > input data, mouse blot quantifications
- hek_protein_pak2.csv > input data, PAK2 blot quantifications (note: PAK2 blot images are in the Supplemental Figures folder)
- hek_seahorse_ocr_ecar.csv > input data
  - Columns:
    - Experiment (number)
    - Assay_date
    - Cell_Line (CTRL, 1D7, or 3F2)
    - Genotype (Control, del3q29, or PAK2)
    - Medium (Glucose or Galactose)
    - Time (of the assay in minutes)
    - OCR (Oxygen Consumption Rate per minute per microgram of protein)
    - ECAR (ExtraCellular Acidification Rate per minute per microgram of protein)
- hek_seahorse_measures.csv > input data
  - Columns:
    - Metadata (Experiment, Assay_date)
    - Cell_Line (CTRL, 1D7, or 3F2)
    - Genotype (Control, del3q29, or PAK2)
    - Medium (Glucose or Galactose)
    - Seahorse Mitochondrial Stress Test output (Basal (OCR) --> Coupling Efficiency)
    - Additional measures:
      - Spare Respiratory Capacity (pct) - percent of baseline
      - Galactose Response = fold change of Basal OCR in Galactose medium over Basal OCR in Glucose medium
      - Change in ATP Production = ATP production in Galactose medium over ATP Production in Glucose medium
      - Max_Ratio = Maximal Respiration in 3q29 or PAK2 / Maximal Respiration in CTRL
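The three ratio measures defined above are simple quotients. The Python sketch below spells them out as a reading aid. It is not the deposited R code, the function and argument names are hypothetical, and the input numbers are placeholders, not values from hek_seahorse_measures.csv.

```python
def galactose_response(basal_ocr_galactose, basal_ocr_glucose):
    """Fold change of basal OCR in galactose over basal OCR in glucose."""
    return basal_ocr_galactose / basal_ocr_glucose


def change_in_atp_production(atp_galactose, atp_glucose):
    """ATP production in galactose over ATP production in glucose."""
    return atp_galactose / atp_glucose


def max_ratio(maximal_resp_edited, maximal_resp_ctrl):
    """Maximal respiration in a 3q29 or PAK2 line over that of CTRL."""
    return maximal_resp_edited / maximal_resp_ctrl


# Placeholder inputs (illustration only).
response = galactose_response(150.0, 100.0)    # 1.5-fold increase
atp_change = change_in_atp_production(80.0, 100.0)
ratio = max_ratio(120.0, 100.0)
```

A Galactose Response above 1 indicates that basal OCR rose after the galactose challenge. A Max_Ratio of 1 indicates no difference from CTRL.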
Fig. 5
Lack of metabolic flexibility in 3q29Del neural progenitor cells. A) Control and 3q29Del neural progenitor cells (NPCs) exhibited normal morphology and stained positive for the neurofilament protein Nestin, multipotency marker SOX2, and NPC marker PAX6 (scale = 50um, quantified in Fig. S15). B) Illustration of experimental design. NPCs were challenged for 48 hr in a neural medium containing glucose (GLU) or galactose (GAL). C) Table of cell lines used in this experiment. Data from three separate cohorts were combined in plots D-M. N=15 from 6 independent NPC lines for all experiments. D) The energy map indicates that galactose treatment pushes cells from a more glycolytic to a more aerobic metabolic profile. E) Control NPCs significantly increased oxygen consumption rate (OCR) in galactose medium. F) 3q29Del NPCs exhibited no significant change in OCR in the galactose medium. G) No significant difference in baseline OCR mean in glucose medium was observed, but 3q29Del NPCs displayed a significantly lower baseline OCR mean in galactose medium (H). I) Galactose response (i.e., basal OCR fold change over glucose) was unchanged in 3q29Del NPCs. J) Maximal respiration was unchanged in the glucose medium, but was significantly decreased in 3q29Del NPCs in the galactose medium (K). Similarly, (L) the maximal respiration ratio of 3q29Del:Control NPCs was unchanged in glucose medium (GLU) but was significantly reduced in galactose conditions (GAL). There was no significant change in spare capacity in 3q29Del NPCs (M).
- Fig. 5 code.R > used to generate panels in Figure 5
- all_measures.csv > input data
  - See above for column descriptions
- all_ocr_ecar.csv > input data
  - See above for column descriptions
Code/Software
All visualization code is written in R. Single-cell RNA-seq analysis is performed in the Seurat R package.
Methods
Cell Culture and Genome Engineering
Whole blood samples of 5-10 mL were collected in EDTA Vacutainer test tubes and processed for the isolation of erythroid progenitor cells (EPCs) using the Erythroid Progenitor Kit (StemCell Technologies). EPCs were reprogrammed using Sendai particles (CytoTune-iPS 2.0 Reprogramming kit, Invitrogen) and plated onto Matrigel-coated six-well plates (Corning). Cultures were transitioned from erythroid expansion media to ReproTeSR (StemCell Technologies) and then fed daily with ReproTeSR until clones were isolated. iPSCs were maintained on Matrigel-coated tissue culture plates with mTeSR Plus (StemCell Technologies).
Cell lines were characterized for stem cell markers by RT-PCR and immunocytochemistry after at least 10 passages in culture. Total RNA was isolated from each cell line with the RNeasy Plus Kit (Qiagen) according to the manufacturer’s protocol. mRNA was reverse transcribed into cDNA using the High Capacity cDNA Synthesis Kit (Applied Biosystems). Expression of pluripotency genes OCT4, SOX2, REX1, and NANOG was determined by RT-PCR. Sendai virus inactivity was confirmed using Sendai genome-specific primers.
Isogenic 3q29Del iPSC and HEK cell lines were generated using the SCORE method (11). To identify low-copy repeat (LCR) target sequences, the reference sequence (hg38) between TNK2 – TFRC (centromeric) and BDH1 – RUBCN (telomeric) was downloaded and aligned in NCBI BLAST. A ~20 Kb segment was found to be 97% identical and was searched for gRNA sequences using CHOPCHOP (https://chopchop.cbu.uib.no) (14). Three single gRNA sequences (IDT) that were predicted to each cut at a single site in both LCRs were identified and cloned into pSpCas9(BB)-2A-Puro (PX459) V2.0, which was a gift from Feng Zhang (Addgene plasmid #62988; http://n2t.net/addgene:62988; RRID: Addgene_62988) (42).
Single gRNA plasmids were transfected into a neurotypical control iPSC line (IRB#CR002-IRB00088012), maintained in mTeSR or mTeSR+ (STEMCELL, Vancouver) on Matrigel (Corning)-coated plates, using a reverse transfection method and Mirus TransIT-LT1 reagent (Mirus Bio, Madison, WI). Transfected cells were transiently selected for puromycin resistance. Genome cleavage efficiency for each gRNA was calculated using the GeneArt Genomic Cleavage Detection Kit (Thermo), and gRNA_2 (5’-CAGTCTTGGCTACATGACAA-3’, directed to the - strand, hg38 chr3:195,996,820 - chr3:197,634,397) was found to be the most efficient, with cleaved bands at the predicted sizes. Cells transfected with gRNA_2 were dissociated and cloned out by limiting dilution in mTeSR supplemented with 10% CloneR (STEMCELL). Putative clonal colonies were manually transferred to Matrigel-coated 24-well plates for expansion and screened for change in copy number of the 3q29Del locus gene PAK2 (Hs03456434_cn) using TaqMan Copy Number Assays (Thermo). Three (of 100) clones showed an apparent loss of one copy of PAK2 and were subsequently screened for loss of the 3q29 genes TFRC (Hs03499383_cn), DLG1 (Hs04250494_cn), and BDH1 (Hs03458594_cn) and for no change in copy number to external (non-deleted) 3q29 genes TNK2 (Hs03499383_cn) and RUBCN (Hs03499806_cn), all referenced to RNASEP (Thermo #4401631). All cell lines retained normal karyotypes (WiCell, Madison, WI) and were free of mycoplasma contamination (LookOut, Sigma).
To generate 3q29Del HEK-293T cell lines, HEK cells (RRID:CVCL_0063) were transfected with either empty px459 or px459+gRNA_1 (5’-ttagatgtatgccccagacg-3’, directed to the + strand) and screened and verified with TaqMan copy number assays as described above. PAK2 was deleted from a control HEK-293T line as detailed above (PAK2 gRNA 5’-TTTCGTATGATCCGGTCGCG-3’, directed to the - strand). Clones were screened by Western blot (Rabbit monoclonal PAK2 from Abcam; ab76293; RRID AB_1524149; 1:5000 dilution) and confirmed by Sanger sequencing of PCR-amplified gDNA. HEK cell lines were also negative for mycoplasma contamination.
Genome-wide Optical Mapping
1.5E6 iPSCs were pelleted, washed with DPBS, and frozen at -80°C following aspiration of all visible supernatant. 750ng of DNA was labeled, stained, and homogenized using the DNA Labeling Kit-DLS (Bionano; 80005). Stained DNA was loaded onto the Saphyr chip G1.2 and the chip was scanned in order to image the labeled DNA using the Saphyr System. Structural variants were called relative to the reference genome (hg38) using Bionano Solve. Structural variants were compared to the parent (unedited) cell line using the Bionano Solve Variant Annotation Pipeline.
Cortical Organoid Differentiation
Engineered isogenic 3q29Del iPSC lines and the unedited parent line, along with two additional clonal lines from the same donor, were expanded in mTeSR or mTeSR+ on Matrigel-coated plates. On DIV 0, colonies were gently released from plates in 0.35mg/ml Dispase according to an established protocol (15). Floating colonies were re-suspended in mTeSR supplemented with 10uM Y-27632 (Reprocell, Beltsville, MD) in ultra-low attachment 10cm dishes (Corning). After 48hr, spheroids were transitioned to Neural Induction Medium (20% Knockout Serum Replacement, 1% Non-essential amino acids, 100U/mL Pen/Strep, 0.5% GlutaMAX, 0.1mM 2-mercaptoethanol in DMEM/F12 w/ HEPES), supplemented with 5uM Dorsomorphin and 10uM SB-431542 (added fresh) with daily media changes through DIV 6. On DIV 7, Neural Induction Medium was replaced with Neural Medium (Neurobasal-A with 2% B-27 w/o vitamin A, 1% GlutaMAX, 100U/mL Pen/Strep) supplemented with fresh EGF (20ng/ml, R&D Systems) and FGF (20ng/ml, R&D Systems) for daily media changes through day 16. From day 17 to 25, organoids were fed Neural Medium with EGF and FGF every two days. From day 26-42, Neural Medium was supplemented with BDNF (20ng/ml, R&D Systems) and NT-3 (20ng/ml, R&D Systems) every two days. From day 43 onwards, organoids were fed Neural Medium without supplements twice weekly.
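The feeding schedule above can be summarized as a lookup by day in vitro (DIV). The Python sketch below is a reading aid only, with a hypothetical function name. It assumes that "after 48hr" means the switch to Neural Induction Medium happens at DIV 2, which the text implies but does not state.

```python
def feeding_plan(div):
    """Return (base medium, supplements, media-change schedule) for a DIV.

    Day boundaries follow the protocol text; the DIV 2 start of Neural
    Induction Medium is an assumption based on "after 48hr".
    """
    if div < 0:
        raise ValueError("DIV must be non-negative")
    if div < 2:
        return ("mTeSR", "10uM Y-27632", "not specified in the Methods")
    if div <= 6:
        return ("Neural Induction Medium",
                "5uM Dorsomorphin + 10uM SB-431542", "daily")
    if div <= 16:
        return ("Neural Medium", "EGF + FGF (20ng/ml each)", "daily")
    if div <= 25:
        return ("Neural Medium", "EGF + FGF (20ng/ml each)", "every two days")
    if div <= 42:
        return ("Neural Medium", "BDNF + NT-3 (20ng/ml each)", "every two days")
    return ("Neural Medium", "none", "twice weekly")


# The two harvest points used for sequencing (DIV 50 and DIV 360) both
# fall in the unsupplemented Neural Medium phase.
plan_2_month = feeding_plan(50)
plan_12_month = feeding_plan(360)
```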
Mouse Genotyping and Maintenance
All animal experiments were performed under guidelines approved by the Emory University Institutional Animal Care and Use Committee. Mice were genotyped as described previously (6) and noted as either Control (wild-type, C57BL/6 N Charles River Laboratories) or 3q29Del (B6.Del16+/Bdh1-Tfrc, MGI:6241487). Male 3q29Del mice and Control littermates were included in the scRNA-seq study. Both male and female mice were included in mitochondrial fractionation experiments.
Tissue Dissociation and Sorting
Single-cell suspensions from cortical organoids (DIV 50 = “2-month” N=2 Control, N=2 3q29Del, and DIV 360 = “12-month” N=2 Control, N=2 3q29Del) and postnatal day 7 (P7) mouse cortices (N=4 Control, N=4 3q29Del) were produced by a papain dissociation method based on a published protocol (43). Organoids were dissociated in three batches that were each balanced for genotype and “age”. Mouse samples were also dissociated in three batches, each balanced by genotype. In both sets, the experimenter was blinded to genotype. Tissue was coarsely chopped with a sterile scalpel and digested for 1hr at 34°C in a pH-equilibrated papain solution (Worthington, Lakewood, NJ) with constant CO2 flow over the enzyme solution. Digested tissue was gently spun out of papain, through ovomucoid solutions, and sequentially triturated with P1000 and P200 pipet tips. Live cells were counted by manual and automated methods (Countess II, Thermo) and, in organoid samples, were isolated from cellular debris by fluorescence-activated cell sorting on a FACSAria-II instrument (calcein AM-high, Ethidium Homodimer-1 low).
Single-cell Library Prep and RNA-Sequencing
Single-cell suspensions were loaded into the 10X Genomics Controller chip for the Chromium Next GEM Single Cell 3’ kit workflow as instructed by the manufacturer with a goal capture of 10,000 cells per sample. The resulting 10X libraries were sequenced using Illumina chemistry. Mouse samples and libraries were prepared and sequenced at a separate time from human samples.
Single-cell RNA-seq
To quantify gene expression at single-cell resolution, the standard Cell Ranger (10x Genomics) and Seurat data processing pipelines were followed for demultiplexing base call files into FASTQ files, alignment of scRNA-seq reads to species-specific reference transcriptomes with STAR (mouse: mm10, human: GRCh38), cellular barcode and unique molecular identifier (UMI) counting, and gene- and cell-level quality control (QC). To filter out low-quality cells, empty droplets, and multiplets, genes expressed in <10 cells, cells with >30% reads mapping to the mitochondrial genome, and cells with unique feature (gene) counts >7,000 were removed based on manual inspection of the distributions of each QC metric individually and jointly. Outlier cells with low unique feature counts were further removed via sample-specific thresholding of corresponding distributions (<250 for mice; <700 for organoids). Thresholds were set as permissive as possible to avoid filtering out viable cell populations, consistent with current best-practice recommendations.
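The QC thresholds stated above can be expressed as two predicates. This Python sketch is for illustration and is not the Seurat code used in the study. The function and argument names are hypothetical, and treating the stated cut-offs as inclusive keep-bounds (for example, keeping a cell with exactly 30% mitochondrial reads) is an assumption.

```python
MAX_PCT_MITO = 30.0          # cells with >30% mitochondrial reads are removed
MAX_FEATURES = 7000          # cells with >7,000 detected genes are removed
MIN_FEATURES = {"mouse": 250, "organoid": 700}   # dataset-specific lower bounds
MIN_CELLS_PER_GENE = 10      # genes expressed in <10 cells are removed


def keep_cell(n_features, pct_mito, dataset):
    """True if a cell passes the feature-count and mitochondrial filters."""
    return (MIN_FEATURES[dataset] <= n_features <= MAX_FEATURES
            and pct_mito <= MAX_PCT_MITO)


def keep_gene(n_cells_expressing):
    """True if a gene is detected in at least the minimum number of cells."""
    return n_cells_expressing >= MIN_CELLS_PER_GENE


# A cell with 500 detected genes passes the mouse lower bound but not the
# stricter organoid lower bound.
passes_mouse = keep_cell(500, 5.0, "mouse")
passes_organoid = keep_cell(500, 5.0, "organoid")
```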
The sctransform function in Seurat was used for normalization and variance stabilization of raw UMI counts based on regularized negative binomial regression models of the count by cellular sequencing depth relationship for each gene while controlling for mitochondrial mapping percentage as a confounding source of variation. Resulting Pearson’s residuals were used to identify the most variable features in each dataset (n=3,000 by default), followed by dimensionality reduction by PCA and UMAP, shared nearest neighbor (SNN) graph construction based on the Euclidean distance between cells in principal component space, and unbiased clustering of cells by Louvain modularity optimization. Optimal clustering solutions for each dataset were determined by building cluster trees and evaluating the SC3 stability index for every cluster iteratively at ten different clustering resolutions with the clustree function in R. The effect of cell-cycle variation on clustering was examined by calculating and regressing out cell-cycle phase scores in a second iteration of sctransform, based on the expression of canonical G2/M and S phase markers. Consistent with the developmental context of the interrogated datasets, cell-cycle differences were found to covary with cell type and were retained in final analyses as biologically relevant sources of heterogeneity. Cluster compositions were checked to confirm comparable distributions of experimental batch, replicate ID, and genotype metadata. Cluster annotations for cell type were determined based on the expression of known cell-type and cortical layer markers curated from the literature. Clusters exhibiting cell-type ambiguity were further sub-clustered to refine annotations or dropped from downstream analysis in case of inconclusive results (human cl. 7 and cl. 16; mouse cl. 25 and cl. 27).
Seahorse Mitochondrial Stress Assay
HEK-293T cells that had been engineered to carry the 3q29Del (3q29Del), PAK2 knockout (PAK2), and mock-edited control cells (CTRL) were plated on poly-D-lysine coated 96-well Seahorse assay plates (XF96, Agilent) in DMEM (Gibco A144300) supplemented with 10% FBS, 2mM L-glutamine, 1mM sodium pyruvate, and either 10mM D-(+)-glucose (“Glu”, 7.5E3 cells/well) or 10mM galactose (“Gal”, 15E3 cells/well). After 48 hours, cells were washed twice in XF DMEM Assay Medium (Agilent) with either glucose or galactose (10mM), supplemented with 1mM pyruvate and 2mM glutamine.
Neural progenitor cells (NPCs) were plated at 5E4 cells/well in poly-L-ornithine and laminin-coated 96-well Seahorse assay plates in STEMdiff Neural Progenitor Medium. After 24 hours, all media was aspirated and exchanged for Neural Medium containing 2mM L-glutamine, 1mM sodium pyruvate, and either 17.5 mM glucose or galactose for 48hrs.
Mitochondrial stress test compounds were loaded into injection ports as indicated by the manufacturer to achieve the following final concentrations. HEK cell assays: 1uM oligomycin, 0.25uM FCCP, 0.5uM rotenone, and 0.5uM antimycin A (all sourced from Sigma). NPC assays: 2uM oligomycin, 0.5uM FCCP, 1uM rotenone, and 1uM antimycin A. Cells were equilibrated at 37°C with ambient CO2 for approximately 1hr prior to assay initiation. At the end of the experiment, cells were washed twice in PBS + Ca2+ + Mg2+ and lysed at 4°C for 30 min in 0.5% Triton X-100 protein buffer (150mM NaCl, 10mM HEPES, 0.1mM MgCl2, 1mM EGTA, 1x HALT Protease + Phosphatase inhibitor). Protein concentrations in each well were determined by BCA (Pierce) to normalize oxygen consumption rate data. Data were analyzed in Wave (Agilent).
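To show how protein-normalized OCR traces map to stress test read-outs, the Python sketch below applies the conventional Mito Stress Test definitions to one well. It is not the Wave software or the authors' code. The definitions used (for example, non-mitochondrial respiration as the minimum rate after rotenone/antimycin A) are assumed to match the standard Agilent report, and all rates and the protein amount are placeholder numbers.

```python
def stress_test_parameters(basal, oligo, fccp, rot_aa, protein_ug):
    """Derive Mito Stress Test parameters from raw OCR readings of one well.

    Each argument except protein_ug is a list of OCR readings from one
    assay phase (before injection, after oligomycin, after FCCP, after
    rotenone/antimycin A). Readings are divided by protein_ug first.
    """
    basal, oligo, fccp, rot_aa = (
        [rate / protein_ug for rate in phase]
        for phase in (basal, oligo, fccp, rot_aa)
    )
    non_mito = min(rot_aa)
    basal_resp = basal[-1] - non_mito          # last reading before injection
    atp_production = basal[-1] - min(oligo)
    proton_leak = min(oligo) - non_mito
    maximal_resp = max(fccp) - non_mito
    return {
        "basal": basal_resp,
        "atp_production": atp_production,
        "proton_leak": proton_leak,
        "maximal": maximal_resp,
        "spare_capacity": maximal_resp - basal_resp,
        "spare_capacity_pct": maximal_resp / basal_resp * 100,
        "coupling_efficiency_pct": atp_production / basal_resp * 100,
    }


# Placeholder readings for a well containing 10 ug of protein.
params = stress_test_parameters(
    basal=[1000, 1000, 1000],
    oligo=[400, 300, 300],
    fccp=[1500, 2000, 1800],
    rot_aa=[200, 100, 100],
    protein_ug=10,
)
```

With these inputs, basal respiration is 90 and maximal respiration is 190 per microgram of protein, giving a spare capacity of 100.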
Statistical Analysis
Differential gene expression testing for genotype was performed on log normalized expression values (scale.factor=10,000) of each cluster separately with a two-part generalized linear model that parameterizes the bimodal expression distribution and stochastic dropout characteristic of scRNA-seq data, using the MAST algorithm, while controlling for cellular detection rate. A threshold of 0.1 was implemented as the minimum cut off for average log-fold change (logfc.threshold) and detection rates (min.pct) of each gene in either genotype to increase the stringency of differential expression analysis. Multiple hypothesis testing correction was applied conservatively using the Bonferroni method to reduce the likelihood of type 1 errors, based on the total number of genes in the dataset. To facilitate comparative transcriptomics, human homologs (including multiple paralogs) were identified for all differentially-expressed genes (DEGs) in the mouse dataset via the NCBI’s HomoloGene database (ncbi.nlm.nih.gov/homologene/). Data processing and analysis pipelines were harmonized across the mouse and organoid datasets, yielding parallel computational approaches for cross-species comparison of differential expression signals. The BrainSpan Developmental Transcriptome dataset used for developmental stage estimations was obtained by bulk RNA sequencing of postmortem human brain specimens collected from donors with no known history of neurological or psychiatric disorders, as described previously. This large-scale resource is accessible via the Allen Brain Atlas data portal (https://www.brainspan.org/static/download/, file name: “RNA-Seq Gencode v10 summarized to genes”); dbGaP accession number: phs000755.v2.p1. All statistical analyses of scRNA-seq data were performed in R (v.4.0.3).
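The filtering and correction steps described above can be sketched as follows. This Python example only illustrates the thresholds. It does not reproduce the MAST model or Seurat's implementation, the function and argument names are hypothetical, and the 0.05 significance level is taken from the pathway analysis description.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


def is_deg(p_value, avg_logfc, pct_control, pct_del, n_genes,
           alpha=0.05, logfc_threshold=0.1, min_pct=0.1):
    """Apply the stated thresholds to one gene in one cluster.

    A gene must be detected in at least min_pct of cells in either
    genotype, exceed the average log-fold-change threshold in absolute
    value, and remain significant after Bonferroni correction over all
    genes in the dataset (n_genes).
    """
    detected = max(pct_control, pct_del) >= min_pct
    large_enough = abs(avg_logfc) >= logfc_threshold
    significant = bonferroni(p_value, n_genes) < alpha
    return detected and large_enough and significant


# Placeholder values: with 20,000 genes, a raw p-value of 1e-6 adjusts
# to about 0.02 and passes; 1e-5 adjusts to about 0.2 and does not.
call_a = is_deg(1e-6, 0.25, 0.30, 0.05, n_genes=20000)
call_b = is_deg(1e-5, 0.25, 0.30, 0.05, n_genes=20000)
```

Because the correction scales with the total number of genes, it is more conservative than correcting only for the genes that pass the detection and fold-change filters.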
To interpret differential gene expression results, pathways likely impacted by the 3q29Del were determined based on statistically over-represented gene-sets with known functions using g:Profiler. DEGs (Bonferroni adj. p<0.05) for each cluster were identified as described above and input with an experiment-specific background gene set (genes with min.pct > 0.1 in any cluster). GO:Biological Process (GO:BP) and Reactome (REAC) databases were searched with 10 < term size < 2000. Significantly enriched pathways below a threshold of g:SCS < 0.05 were compiled and filtered in Revigo to reduce redundancy and determine umbrella terms.
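The term filters described above (database, term size, and significance) reduce to one predicate per result row. The Python sketch below uses the field names source, term_size, and p_value, which are assumptions about the g:Profiler output layout, with p_value assumed to hold the g:SCS-adjusted value. The example rows are invented.

```python
KEPT_SOURCES = {"GO:BP", "REAC"}


def keep_term(row, alpha=0.05, min_size=10, max_size=2000):
    """True if a result row meets the database, size, and significance filters."""
    return (row["source"] in KEPT_SOURCES
            and min_size < row["term_size"] < max_size
            and row["p_value"] < alpha)


# Invented rows (illustration only).
example_rows = [
    {"term_name": "aerobic respiration", "source": "GO:BP",
     "term_size": 180, "p_value": 1e-4},
    {"term_name": "very broad term", "source": "GO:BP",
     "term_size": 5000, "p_value": 1e-9},
    {"term_name": "very small term", "source": "REAC",
     "term_size": 8, "p_value": 1e-3},
    {"term_name": "non-significant term", "source": "REAC",
     "term_size": 120, "p_value": 0.2},
]
kept_terms = [row["term_name"] for row in example_rows if keep_term(row)]
```

Only the first row survives. The others fail on term size or significance, which is the intended effect of excluding very broad and very small gene sets before redundancy reduction in Revigo.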
Western blot analysis of mouse brain OXPHOS complex components: (Fig. 4B) one sample Wilcoxon signed-rank test (two-tailed), *p<0.05, **p<0.01, N = 5.
Seahorse Mitochondrial Stress Test Analysis (N=4 for all HEK Seahorse experiments; N=15 for all NPC Seahorse experiments):
- (Fig. 4F) two-way ANOVA, main effect of medium F(1, 6)=23.99, **P=0.0027
- (Fig. 4G) one-way ANOVA, effect of genotype F(2, 9)=17.24, P=0.0008; CTRL vs. 3q29 ***P=0.0005, CTRL vs. PAK2 *P=0.0332
- (Fig. 4H) one-way ANOVA, effect of genotype F(2, 9)=8.838, P=0.0075; CTRL vs. 3q29 **P=0.0075, CTRL vs. PAK2 *P=0.0138
- (Fig. 4I) two-way ANOVA, effect of genotype F(2, 18)=448.0, P=0.0003; CTRL vs. 3q29 **P=0.0047
- (Fig. 4J) two-way ANOVA, effect of genotype F(2, 18)=4.309, P=0.0296; CTRL vs. 3q29 **P=0.0079
- (Fig. 4K) two-way ANOVA, main effect of genotype F(2, 18)=31.16, P<0.0001; CTRL vs. 3q29 ****P<0.0001, galactose CTRL vs. PAK2 ***P=0.0007
- (Fig. 4L) two-way ANOVA, interaction of genotype and medium F(2, 18)=4.219, P=0.0314; CTRL vs. 3q29 *P=0.0364
- (Fig. 5E) two-way ANOVA, main effect of medium F(1, 28)=9.295, **P=0.0050
- (Fig. 5F) two-way ANOVA, medium effect F(1, 28)=0.01219, P=0.9129
- (Fig. 5G) two-tailed ratio paired t-test, P=0.7015
- (Fig. 5H) two-tailed ratio paired t-test, **P=0.0009
- (Fig. 5I) two-tailed ratio paired t-test, P=0.0935
- (Fig. 5J) two-tailed ratio paired t-test, P=0.5028
- (Fig. 5K) two-tailed ratio paired t-test, ***P=0.0007
- (Fig. 5L) one sample two-tailed t-test, **P=0.0026
- (Fig. 5M) two-way ANOVA, genotype effect F(1, 56)=0.5930, P=0.4445