Exploratory analysis of sleep deprivation effects on gene expression and regional brain metabolism
Data files
Mar 20, 2025 version files 28.57 MB
- 1_functional_clustering.csv (29.72 KB)
- 10_laterality_anova_ROI.xlsx (33.46 KB)
- 2_Gene_anova_BH_FDR.csv (10.25 MB)
- 3_significant_gene_anova.xlsx (38.58 KB)
- 4_ubl_conj_genes_function-mpv.csv (6.29 MB)
- 5_top_300_genes.xlsx (101.45 KB)
- 6_ROI_ANOVA_FULL.csv (8 KB)
- 7_allROI_PCA_loadings.csv (5.46 KB)
- 8_correlation_matrix_ROI_genePCA.xlsx (153.50 KB)
- 9_correlation_allROI_300cond.xlsx (3.17 MB)
- applied_subset_40000.csv (8.41 MB)
- df_for_corr_matrix.csv (14.49 KB)
- PET_SEGMENTATION.csv (10.92 KB)
- README.md (5.35 KB)
- ROI_faw.csv (48.43 KB)
Abstract
Sleep deprivation affects cognitive performance and immune function, yet its mechanisms and biomarkers remain unclear. This study explored the relationships among gene expression, brain metabolism, sleep deprivation, and sex differences.
Methods
Fluorodeoxyglucose-18 positron emission tomography (18F-FDG PET) measured brain metabolism in regions of interest (ROIs), and RNA analysis of blood samples assessed gene expression pre- and post-sleep deprivation. Mixed model regression and principal component analysis (PCA) identified significant genes and regional metabolic changes.
Results
There were 23 and 28 differentially expressed probesets for the main effects of sex and sleep deprivation, respectively, and 55 probesets for their interaction (FDR-corrected p<0.05). Functional analysis revealed enrichment in nucleoplasm- and UBL conjugation-related genes. Genes showing significant sex effects mapped to chromosomal regions Y and 19 (Benjamini-Hochberg (BH) FDR p<0.05), with 11 genes (4%) and 29 genes (10.5%) involved, respectively. Differential gene expression highlighted sex-based differences in innate and adaptive immunity.
For brain metabolism, sleep deprivation resulted in significant decreases in the left insula, medial prefrontal cortex (BA32), somatosensory cortex (BA1/2), and motor premotor cortex (BA6) and increases in the right inferior longitudinal fasciculus, primary visual cortex (BA17), amygdala, cerebellum, and bilateral pons. Hemispheric asymmetry in brain metabolism was observed, with BA6 decreases correlating with increased UBL conjugation gene expression.
Conclusion
Sleep deprivation broadly impacts brain metabolism, gene expression, and immune function, revealing cellular stress responses and hemispheric vulnerability. These findings enhance understanding of the molecular and functional effects of sleep deprivation.
Description of the data and file structure
These data were collected to study the relationships between gene expression and regional brain metabolism in the context of sleep deprivation and sex.
ROI_faw and applied_subset_40000 contain the data collected from subjects, after preprocessing.
applied_subset_40000 columns:
- ProbeID: the probe names/identifiers from the Affymetrix microarray chip, as designated by the manufacturer. RNA microarray chips measure fluorescence intensity (RFU). The raw RFU values for each probe were background-corrected, normalized, and log2-transformed to populate this table.
- P01 to P08: the patient and the pre/post condition (before or after sleep deprivation). For example, P01_post represents patient 1, post sleep deprivation.
This data set is used by the applied_subset and genes_degradation_residuals functions in the script functions.r.
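As an illustration of the column naming convention, the sketch below splits sample column names into subject and condition and computes a post-minus-pre difference in log2 expression. It uses a small made-up data frame rather than the real file. The probe IDs, the values, and the `P01_pre` spelling of the pre-condition columns are assumptions, and this is not the code in functions.r.

```r
# Toy stand-in for applied_subset_40000.csv; probe IDs and values are made up.
expr <- data.frame(
  ProbeID  = c("probe_a", "probe_b"),
  P01_pre  = c(7.1, 5.3),
  P01_post = c(7.4, 5.0),
  P02_pre  = c(6.9, 5.6),
  P02_post = c(7.0, 5.2),
  stringsAsFactors = FALSE
)

# Split each sample column name into subject and condition.
sample_cols <- setdiff(names(expr), "ProbeID")
samples <- data.frame(
  column    = sample_cols,
  subject   = sub("_(pre|post)$", "", sample_cols),
  condition = sub("^.*_", "", sample_cols),
  stringsAsFactors = FALSE
)

# Post-minus-pre difference in log2 expression, per probe and subject.
pre  <- as.matrix(expr[, samples$column[samples$condition == "pre"]])
post <- as.matrix(expr[, samples$column[samples$condition == "post"]])
log2_change <- post - pre
dimnames(log2_change) <- list(expr$ProbeID, unique(samples$subject))
print(log2_change)
```

Because the values are already on a log2 scale, each difference is a log2 fold change for that subject.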
PET_SEGMENTATION columns:
PET_SEGMENTATION is needed to run the get_top8_component_pcaplot and get_top8_cum_var functions from functions.r.
BrainRegion specifies the name of the brain region (ROI). Subsequent columns identify subjects with a four-letter arbitrary acronym (created during data acquisition to help maintain double blinding) and a number (the patient number, the same as in applied_subset_40000 and ROI_faw). Pre = pre sleep deprivation, SD = post sleep deprivation. Values represent the metabolic value for the specified brain region and are unitless.
ROI_faw is in long format with columns:
MetVal (arbitrary units) is the normalized metabolic value extracted from the brain region (BrainRegion) for that patient (Patient_ID) under the pre or post condition (Condition; Pre = pre sleep deprivation, post = post sleep deprivation). Gender for each patient was also recorded (F = female, M = male).
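A long-format table like this can be pivoted so that each patient and region has its pre and post values side by side. The sketch below does this with base R on made-up rows. The region names and values are hypothetical, and the column spellings follow the description above, so they may need adjusting to match the actual CSV header.

```r
# Toy rows shaped like ROI_faw.csv; region names and values are made up.
roi <- data.frame(
  Patient_ID  = rep(c("P01", "P02"), each = 4),
  Gender      = rep(c("F", "M"), each = 4),
  BrainRegion = rep(c("Amygdala_R", "Insula_L"), times = 4),
  Condition   = rep(rep(c("Pre", "post"), each = 2), times = 2),
  MetVal      = c(1.00, 0.90, 1.10, 0.80, 1.05, 0.95, 1.20, 0.85),
  stringsAsFactors = FALSE
)

# Long to wide: one row per patient and region, one column per condition.
wide <- reshape(roi,
                idvar     = c("Patient_ID", "Gender", "BrainRegion"),
                timevar   = "Condition",
                direction = "wide")

# Change in metabolic value after sleep deprivation.
wide$change <- wide$MetVal.post - wide$MetVal.Pre

# Mean change per region across patients.
mean_change <- aggregate(change ~ BrainRegion, data = wide, FUN = mean)
print(mean_change)
```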
df_for_corr_matrix columns:
The file contains many columns. PC denotes principal component, and the number that follows identifies which component. The suffix _gene or _ROI denotes which principal component analysis the PC came from, as the gene and brain imaging data were analyzed with PCA separately.
Each subject’s PC score is a weighted combination of their original measurements (either gene expression levels or brain ROI values). This score reflects where the subject falls along a latent dimension or principal component. A higher or lower PC score indicates how strongly a subject expresses the underlying pattern defined by that principal component. Because the data was normalized by z-score during principal component analysis, the PC values are unitless.
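To make the "weighted combination" concrete, the sketch below runs a PCA on simulated data and checks that the scores equal the z-scored data multiplied by the loadings. It uses base R's `prcomp` as a stand-in. The PCA routine used for the deposited files may differ, and the matrix here is random noise, not study data.

```r
set.seed(1)

# Toy matrix: 8 subjects (rows) x 5 variables (columns); values are simulated.
x <- matrix(rnorm(8 * 5), nrow = 8,
            dimnames = list(paste0("S", 1:8), paste0("var", 1:5)))

# z-score each variable; prcomp(scale. = TRUE) applies the same scaling.
z   <- scale(x, center = TRUE, scale = TRUE)
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Each PC score is the z-scored data weighted by that component's loadings.
scores_by_hand <- z %*% pca$rotation

# Proportion of total variance captured by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))
```

The columns of `pca$x` are the per-subject PC scores, analogous to the PC columns in df_for_corr_matrix.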
The rest of the columns are brain areas, and the values represent metabolic values from that brain region of interest (ROI) for that subject under the given condition. affyrnadeg denotes the RNA degradation slope value, used to evaluate and control for sample quality. Condition (1 = pre sleep deprivation, 2 = post sleep deprivation) and Sex (1 = male, 2 = female) are categorical. Age is recorded in years. SSS is the Stanford Sleepiness Scale score. PVT is the psychomotor vigilance test result: the subject's response delay to a stimulus, measured in milliseconds.
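Because Condition and Sex are stored as numeric codes, it can help to convert them to labelled factors before modelling, so that they are not treated as continuous. A minimal sketch on made-up rows follows. The label strings and PVT values are illustrative assumptions.

```r
# Toy rows using the numeric coding described above; PVT values are made up.
df <- data.frame(
  Condition = c(1, 2, 1, 2),
  Sex       = c(1, 1, 2, 2),
  PVT       = c(250, 310, 240, 330)  # response delay in ms
)

# Recode the numeric codes as labelled factors.
df$Condition <- factor(df$Condition, levels = c(1, 2),
                       labels = c("pre_SD", "post_SD"))
df$Sex <- factor(df$Sex, levels = c(1, 2), labels = c("male", "female"))

# Mean response delay per condition.
pvt_by_condition <- tapply(df$PVT, df$Condition, mean)
print(pvt_by_condition)
```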
df_for_corr_matrix is used to create the data frame passed to:
calculate_and_save_correlation <- function(df, output_filename)
This function is from the functions.r script.
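The body of calculate_and_save_correlation is not reproduced here. As a rough idea of what a function with that signature might do, the sketch below defines a hypothetical `correlation_sketch` that computes a Pearson correlation matrix over the numeric columns and writes it to a CSV file. The function name, the choice of Pearson correlation, and the toy column names are all assumptions. Refer to functions.r for the actual implementation.

```r
# Hypothetical stand-in with the same arguments as calculate_and_save_correlation.
correlation_sketch <- function(df, output_filename) {
  # Keep numeric columns only (drops identifiers such as subject labels).
  numeric_cols <- df[, vapply(df, is.numeric, logical(1)), drop = FALSE]
  corr <- cor(numeric_cols, use = "pairwise.complete.obs", method = "pearson")
  write.csv(corr, output_filename)
  invisible(corr)
}

# Toy data frame; column names follow the PC<number>_gene / PC<number>_ROI pattern.
toy <- data.frame(
  Subject    = paste0("S", 1:5),
  PC1_gene   = c(-1.2, 0.3, 0.8, 1.5, -0.9),
  PC1_ROI    = c(0.4, -0.7, 1.1, 0.9, -1.6),
  Amygdala_R = c(1.02, 0.97, 1.10, 1.15, 0.93),
  stringsAsFactors = FALSE
)

out_file <- tempfile(fileext = ".csv")
corr <- correlation_sketch(toy, out_file)
print(round(corr, 2))
```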
Supplementary Tables for the associated manuscript:
1_functional_clustering.csv - functional clustering (using DAVID) of the gene expression principal component analysis and ANOVA results
2_Gene_anova_BH_FDR.csv - two-way mixed ANOVA on gene expression, with Benjamini-Hochberg FDR
3_significant_gene_anova.xlsx - genes that remain significant after FDR correction in the gene ANOVA (supplementary table 2_Gene_anova_BH_FDR.csv)
4_ubl_conj_genes_function-mpv.csv - the genes and functions for the UBL conjugation functional cluster (see the first supplementary table). Also includes the ANOVA for genes in this functional cluster
5_top_300_genes.xlsx - the top 300 genes with the strongest loading values for each gene principal component, plus the top 300 most significant genes from the gene ANOVA
6_ROI_ANOVA_FULL.csv - two-way mixed ANOVA of brain region metabolism (all regions)
7_allROI_PCA_loadings.csv - PCA feature loadings for brain regions
8_correlation_matrix_ROI_genePCA.xlsx - correlation matrix of all brain region ROIs and gene PCs
9_correlation_allROI_300cond.xlsx - correlation matrix of all brain region ROIs and the top 300 genes from the condition ANOVA
10_laterality_anova_ROI.xlsx - ANOVA for hemisphere effects, and their interaction with gender, on brain regions
functions.r is a collection of functions used to run the statistical tests for the associated research paper. These include two-way mixed ANOVAs, correlation matrices, and principal component analysis.
Code/software
R
libraries:
ggplot2
car
psych
dplyr
tidyr
Hmisc
lme4
afex
lsr
lmerTest
readr
dbscan
glmnet
biomaRt
qvalue
hgu133plus2.db
AnnotationDbi
purrr
Consent to participate statement: Informed consent was obtained from all participants before the start of this study.
Sleep Deprivation
Eight healthy subjects, 4 male and 4 female, were recruited from the University of California Irvine, after IRB approval. On day 1, subjects were initially assigned a 24-hour period of normal activity (e.g., walking, talking, studying, watching TV, playing games, using the computer). These subjects were tested on the Psychomotor Vigilance Test (PVT) and asked to rate their subjective level of sleepiness on the Stanford Sleepiness Scale (SSS) at baseline. Higher scores on the PVT indicate a longer, more delayed response time, while higher scores on the SSS indicate greater degrees of sleepiness. The SSS scale is shown in Table 1. Each subject's PVT performance and subjective sleepiness ratings (SSS) were recorded both before and after sleep deprivation (Table 2). There was no significant difference in age between male and female subjects (Table 3), all of whom had no prior psychiatric history.
Blood samples were collected on the baseline day at 1 p.m., pre-sleep deprivation (pre-SD). Sleep deprivation activities and blood sample acquisition times are recorded in Table 4. At the end of day 1 (11 p.m.), subjects were moved to an outpatient research facility for the sleep deprivation protocol. They were requested not to nap or sleep during the sleep deprivation period, and were additionally tasked with filling out forms and answering questions about their mood every two to four hours. Staff members monitored the subjects during the sleep deprivation period. Subjects were allowed to walk, talk, study, watch TV, play games or cards, read, and use the computer, but were not allowed caffeinated foods or beverages. A second blood sample was collected 18 hours after starting sleep deprivation activities (SD Day 2, 1 p.m.). Subjects then completed the protocol and were driven home by cab.
Gene Data Processing
Blood samples (3 ml) were drawn from each subject into Tempus tubes (ABI, ThermoFisher, Carlsbad, CA), 24 hours apart. The blood samples collected at baseline and 18 hours after starting sleep deprivation activities were processed with Affymetrix HG-U133 Plus 2.0 gene expression microarray chips according to the manufacturer's instructions (Affymetrix, ThermoFisher, Carlsbad, CA). Data processing was done using R 4.2 and Bioconductor 3.16 [32]. The Affymetrix HG-U133 Plus 2.0 microarray CEL files were read using the affy routine with the hgu133plus2.db package. Quantile normalization was used to standardize probeset data [33]. To eliminate weakly expressed probesets, a linear model was fitted to the expression data for each probeset using lmFit from the limma package, and the top 40,000 probesets were selected using the topTable function. A type III mixed ANOVA was implemented using the lmerTest library in R, with sex and sleep deprivation as main effects, plus the sleep deprivation-sex interaction. Age and RNA integrity number (RIN) were used as covariates. The top 300 probesets for each main effect from the mixed ANOVA and from PCA were analyzed for enrichment using the Database for Annotation, Visualization and Integrated Discovery (DAVID) [34; 35]. Principal component analysis was conducted using the pca function with normalized and scaled expression data.
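The Benjamini-Hochberg FDR correction applied to the ANOVA p-values can be illustrated with base R's `p.adjust`. The p-values and probe names below are made up for the example and are not taken from 2_Gene_anova_BH_FDR.csv.

```r
# Made-up raw p-values for five hypothetical probesets.
p_raw <- c(probe_a = 0.001, probe_b = 0.008, probe_c = 0.039,
           probe_d = 0.041, probe_e = 0.600)

# Benjamini-Hochberg adjustment: each sorted p-value is multiplied by n / rank,
# then made monotone by taking the running minimum from the largest p-value down.
p_bh <- p.adjust(p_raw, method = "BH")

# Probesets that remain significant at FDR < 0.05.
significant <- names(p_bh)[p_bh < 0.05]
print(p_bh)
print(significant)
```

Note that probe_c and probe_d have raw p-values below 0.05 but do not survive the correction, which is why the FDR-corrected counts in the results are smaller than an uncorrected screen would suggest.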
18F-FDG PET Scan Processing
Pre-SD and post-SD 18F-FDG PET scans were obtained from each subject. Each scan was normalized in MATLAB (Mathworks, Sherborn, Massachusetts, USA) using Statistical Parametric Mapping (SPM) 5 software (Functional Imaging Laboratory, Wellcome Department of Cognitive Neurology, University College London, London, UK) to spatially transform the images to a template conforming to the space derived from standard brains from the Montreal Neurological Institute, and to convert them to the space of the stereotactic atlas of Talairach and Tournoux. The images were then smoothed with an 8 mm Gaussian low-pass filter to minimize noise and improve spatial alignment.
Region-of-interest (ROI) analysis was performed by extracting metabolic values from each ROI using VINCI (“Volume Imaging in Neurological Research, Co-Registration and ROI included”) software. Supplementary Figure 1 shows ROI segmentation of FDG-PET scans labeled with brain regions and Brodmann areas (BA).
A type III mixed two-way ANOVA was implemented using the lmerTest library in R. The model considered sex as a between-subjects factor and condition (pre-sleep deprivation vs. post-sleep deprivation) as a within-subjects factor. Principal component analysis was performed using the pca() function in the Bioconductor environment [32] in R. Prior to extracting principal components, all probesets were scaled in R by subtracting the mean value and dividing by the standard deviation for that variable.
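The design described here (sex between subjects, condition within subjects) can be sketched with base R's `aov` and an `Error()` term. This is a swapped-in alternative to the lmerTest model used in the study, chosen so that the example needs no extra packages. The data are simulated, and the subject labels and the `MetVal` name are assumptions for illustration.

```r
set.seed(42)

# Simulated data: 8 subjects (4 male, 4 female), each measured pre and post.
toy <- expand.grid(
  Subject   = factor(paste0("S", 1:8)),
  Condition = factor(c("pre", "post"), levels = c("pre", "post"))
)
toy$Sex    <- factor(ifelse(as.integer(toy$Subject) <= 4, "male", "female"))
toy$MetVal <- rnorm(nrow(toy), mean = 1, sd = 0.1)  # made-up metabolic values

# Two-way mixed ANOVA for one region:
# Sex is tested against between-subject variation,
# Condition and Sex:Condition against within-subject variation.
fit <- aov(MetVal ~ Sex * Condition + Error(Subject / Condition), data = toy)
print(summary(fit))
```

`aov` uses sequential sums of squares, but in a balanced design like this toy one they coincide with type III sums of squares. With unbalanced or incomplete data, a mixed-model approach such as the one described above is the safer choice.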