Differential SP4 expression and HSP60 abundance in buccal swabs from patients with schizophrenia
Data files
Jan 29, 2026 version files 2.15 MB
• Crosta_Figure1_RTqPCR.zip (29.09 KB)
• Crosta_Figure2_AXCPT.zip (798.76 KB)
• Crosta_Figure4_tMS.zip (799 B)
• Crosta_Master_Summary_Data.csv (11.96 KB)
• Crosta_Supplementary_Tables_Correlations.zip (49.05 KB)
• Crosta_SupplementaryFigure1_JOVI.zip (471.02 KB)
• Crosta_SupplementaryFigure2_RTqPCR_SP4.zip (2.05 KB)
• Crosta_SupplementaryFigure4_Western_Blot_Analysis_of_HSP60.zip (747.82 KB)
• README.md (41.31 KB)
Abstract
Schizophrenia (SCZ) is heterogenous and polygenic, making it difficult to diagnose and treat. This is unsurprising as diagnostic criteria are solely based on behavioral markers. There is a critical need for easy-to-collect biomarkers that aid in treatment. To identify potential biomarkers, we recruited SCZ patients and controls. Using buccal cell lysates, we performed real-time quantitative PCR and identified significant differences in Sp4 mRNA between patients and controls. Targeted mass spectrometry identified increased heat shock protein 60 (HSP60) in SCZ samples. To determine the utility of Sp4 mRNA and HSP60 protein as biomarkers, we evaluated their relationship with symptom severity and aberrant cognitive processes. Correlational analyses revealed that elevated Sp4 mRNA and HSP60 protein define a subgroup of SCZ patients who demonstrate symptomology and poor memory. These data support buccal cell Sp4 mRNA and HSP60 protein as easy-to-collect, candidate SCZ biomarkers for a subset of patients.
Short title: SP4 and HSP60 biomarkers in schizophrenia patients
Journal: Science Advances (submitted / revision package)
Dataset DOI: 10.5061/dryad.6djh9w1fj
Overview
This README describes all data files associated with the manuscript, including raw experimental outputs, processed datasets, and a consolidated master summary dataset used for all statistical analyses. The dataset includes molecular, cognitive, and clinical measures derived from buccal swabs and behavioral assessments in patients with schizophrenia and control subjects.
All statistical analyses, normalization procedures, and model specifications are described in detail in the main text and the Materials and Methods section of the manuscript. No analysis scripts or proprietary analysis workspaces are provided. All datasets are supplied in non-proprietary CSV format with NA used to denote missing or non-applicable values.
A single master summary file consolidates all de-identified participant-level variables used for correlation analyses, hierarchical regression models, and generation of all figures and supplementary tables.
Software used
• Microsoft Excel-compatible CSV files — raw and processed tabular data
• QIAGEN Digital Insights software — RT-qPCR data processing, quality assessment, and targeted proteomics data handling
Important clarification:
Raw RT-qPCR output files include numerous machine-generated and quality-control variables produced by the instrument software. Only analysis-relevant variables were used for downstream analyses. These curated variables are compiled in the master summary dataset, which represents the authoritative, analysis-ready version of the data used for all reported results.
File organization
• qRT-PCR — raw RT-qPCR plate-level output files (CSV)
• AXCPT_Data — AX-CPT participant-level raw task output files and summary data (CSV)
• JOVI_Data — JOVI participant-level raw task output files and summary data (CSV)
• HSP60_tMS — targeted mass spectrometry raw abundance and normalized HSP60 data (CSV)
• Supplementary — supplementary figures, tables, and validation source files
• Crosta_Master_Summary_Data.csv — consolidated, analysis-ready dataset
All CSV files contain no embedded formulas, formatting, or merged cells. Blank cells in original outputs have been replaced with NA.
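Because missing and non-applicable values are written as the literal text NA, a reader has to map that string to a missing value before doing arithmetic. The following is a minimal sketch of one way to do this with the Python standard library. The helper names `read_rows` and `to_float` are hypothetical, and the demo values are made up for illustration; they are not taken from the deposited files.

```python
import csv
import io

def read_rows(handle):
    """Read a CSV into a list of dicts, mapping the literal string 'NA' to None."""
    reader = csv.DictReader(handle)
    return [
        {key: (None if value == "NA" else value) for key, value in row.items()}
        for row in reader
    ]

def to_float(value):
    """Convert a cell to float, passing missing values (None) through unchanged."""
    return None if value is None else float(value)

# Tiny in-memory stand-in for a deposited file; the values are made up.
demo = io.StringIO(
    "SubjectID,SP4mRNA,HSP60_tMS_Normalized\n"
    "C01,1.02,NA\n"
    "P02,NA,1.37\n"
)
rows = read_rows(demo)
sp4 = [to_float(row["SP4mRNA"]) for row in rows]
```

To read a deposited file instead of the in-memory demo, pass an open file handle, for example `open("Crosta_Master_Summary_Data.csv", newline="")`, to `read_rows`.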
List of Submitted Files
Crosta_Master_Summary_Data.csv
Consolidated, analysis-ready dataset containing all de-identified participant-level demographic, molecular, cognitive, and clinical variables used in all statistical analyses.
Crosta_Figure1_RTqPCR.zip: (Figure 1 - RT-qPCR of SP4, NOS1AP, Dexras1)
Raw RT-qPCR plate-level output files exported from QIAGEN Digital Insights software.
Files contain machine-generated Ct values and quality-control metrics. Blank cells have been replaced with NA.
RTqPCR_Plate01_Raw.csv
RTqPCR_Plate02_Raw.csv...
RTqPCR_Plate09_Raw.csv
RTqPCR_Plate10_Raw.csv
Crosta_Figure2_AXCPT.zip (AXCPT_Data - Figure 2 — AX Continuous Performance Task)
Raw participant-level AX-CPT task output files. One CSV file per participant containing unprocessed CNTRACS/E-Prime output, including trial-level accuracy and reaction time data.
AXCPT_C[ID]_raw.csv Control participants C01-C29, excluding C04 and C27.
AXCPT_P[ID]_raw.csv Patient participants P02-P35, excluding P15-16, P23-24, P29, P31-32 (P01 is also not included).
axcpt_scores_summary.csv Participant-level summary dataset containing averaged accuracy, error rates, and reaction times (milliseconds) for AX, AY, BX, and BY trial types.
Crosta_Figure4_tMS.zip (HSP60_tMS - Figure 4 — targeted mass spectrometry)
tMS_abundance.csv
Raw HSP60 protein abundance values measured by targeted mass spectrometry.
tMS_normalization.csv
Normalized HSP60 abundance values used for statistical analyses.
Crosta_SupplementaryFigure1_JOVI.zip (JOVI_Data - Supplementary Figure 1 - JOVI task)
Raw participant-level JOVI task output files. One CSV file per participant containing unprocessed CNTRACS/E-Prime output, including trial-level responses, accuracy, and reaction times.
JOVI_C[ID]_raw.csv: Control participants C01-C29, excluding C04 and C27.
JOVI_P[ID]_raw.csv: Patient participants P02-P35, excluding P15-16, P23-24, P29, P31-32 (P01 is also not included).
JOVI_scores_Summary.csv (JOVI summary file):
Condition-level summary dataset (conditions 1–8) containing percent correct and mean reaction time (milliseconds) for correct responses.
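The participant ranges and exclusions listed for the JOVI raw files can be expanded into an explicit list of expected file names, which is useful for checking an unzipped archive. This is a sketch only: it assumes two-digit, zero-padded identifiers (C01, P02) and the lowercase `_raw.csv` suffix shown in the file name patterns above. The helper `expand_ids` is hypothetical.

```python
def expand_ids(prefix, first, last, excluded):
    """Return zero-padded IDs such as 'C01' for first..last, skipping excluded numbers."""
    return [f"{prefix}{n:02d}" for n in range(first, last + 1) if n not in excluded]

# Controls C01-C29 without C04 and C27.
controls = expand_ids("C", 1, 29, {4, 27})
# Patients P02-P35 without P15-16, P23-24, P29, P31-32.
patients = expand_ids("P", 2, 35, {15, 16, 23, 24, 29, 31, 32})

expected_files = [f"JOVI_{pid}_raw.csv" for pid in controls + patients]
```

Under these assumptions the list contains 27 control and 27 patient files. Comparing it against a directory listing (for example with `os.listdir`) would flag missing or unexpected files.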
Crosta_SupplementaryFigure2_RTqPCR_SP4.zip (Supplementary Figure 2 — Orthogonal SP4 primer RT-qPCR)
RTqPCR_SP4-Origen.csv
RTqPCR_SP4-RTP1.csv
RTqPCR_SP4-TF.csv
Each file contains participant-level Ct mean, Delta Ct, and Delta Delta Ct values.
Crosta_SupplementaryFigure4_Western_Blot_Analysis_of_HSP60.zip (Supplementary Figure 4 — Western blot validation)
250110-Western Blot Analysis of HSP60 in Human Frontal Cortex_Determining Linear Range of HSP60_Buccal Swab Western Blot.pdf
Crosta_Supplementary_Tables_Correlations.zip (Supplementary tables)
Supplementary_Table_4_Correlations.csv
Supplementary_Table_5_Multiple_Linear_Regression.csv
Participant identifiers
• C### = control subject
• P### = patient with schizophrenia
All participant identifiers are fully de-identified and consistent across all datasets.
Master summary dataset (central analysis file)
A single CSV file consolidates demographic, molecular, cognitive, and clinical variables used in all statistical analyses reported in the manuscript.
File
• Filename: Crosta_Master_Summary_Data.csv
• Rows: Individual participants
• Columns: Defined below
This dataset was used for:
• Correlation analyses
• Hierarchical multiple linear regression analyses
• Generation of all summary statistics and figures
Data dictionary — Crosta_Master_Summary_Data.csv
Participant and demographic variables
SubjectID - De-identified participant identifier (C### = control participant; P### = participant with schizophrenia).
Age - Age range (5-year bin) that contains the participant's age at the time of the experiment.
Unit: 5-year bins.
Sex - binarized to 1 or 2
Race - Self-reported ethnicity category (1 = Asian; 2 = Black or African American; 3 = Hispanic or Latino; 4 = White).
Buccal swab molecular biomarkers (unitless)
SP4mRNA - Relative SP4 mRNA expression measured by RT-qPCR using the comparative Ct method. Values represent ΔΔCt-derived expression normalized to GAPDH and plate-specific control samples.
Dexras1mRNA - Relative Dexras1 (RASD1) mRNA expression measured by RT-qPCR and normalized to GAPDH using the comparative Ct method.
NOS1APmRNA - Relative NOS1AP mRNA expression measured by RT-qPCR and normalized to GAPDH using the comparative Ct method.
HSP60_tMS_Normalized - Normalized abundance of HSP60 protein measured by targeted mass spectrometry. Values were normalized relative to control samples within analytical batches.
Memory performance (HVLT-R)
HVLTRTotalRecallRaw - Total number of words correctly recalled across learning trials of the Hopkins Verbal Learning Test–Revised.
HVLTRDelayedRecallRaw - Number of words correctly recalled following the delay period.
HVLTRRetentionRaw - Percentage of retained words calculated from delayed recall relative to learning performance.
HVLTRRDIRaw - Recognition Discrimination Index, reflecting the ability to distinguish target words from distractors.
Cognitive control and goal maintenance (AX-CPT)
AXCPT_AX_ErrorRate - Proportion (0–1) of incorrect responses on AX trials.
AXCPT_AY_ErrorRate - Proportion (0–1) of incorrect responses on AY trials.
AXCPT_BX_ErrorRate - Proportion (0–1) of incorrect responses on BX trials.
AXCPT_BY_ErrorRate - Proportion (0–1) of incorrect responses on BY trials.
AXCPT_AX_ReactionTime_ms - Mean reaction time for correct responses on AX trials.
AXCPT_AY_ReactionTime_ms - Mean reaction time for correct responses on AY trials.
AXCPT_BX_ReactionTime_ms - Mean reaction time for correct responses on BX trials.
AXCPT_BY_ReactionTime_ms - Mean reaction time for correct responses on BY trials.
Symptom severity (PANSS)
All PANSS variables are unitless composite scores.
PANSS_Positive - Positive symptom factor score derived from the Positive and Negative Syndrome Scale.
PANSS_Negative - Negative symptom factor score derived from the Positive and Negative Syndrome Scale.
PANSS_Cognitive - Cognitive/disorganization factor score derived from PANSS items.
PANSS_Excitement - Excitement factor score derived from PANSS items.
PANSS_Depression - Depression/anxiety factor score derived from PANSS items.
PANSS_Disorganization - Disorganization symptom cluster score based on Cuesta & Peralta classification.
PANSS_Anergia - Anergia symptom cluster score reflecting reduced motivation and energy.
PANSS_ThoughtDisturbance - Thought disturbance symptom cluster score.
PANSS_Activation - Activation symptom cluster score reflecting agitation and hostility.
PANSS_ParanoidBelligerence - Paranoid/belligerence symptom cluster score.
PANSS_Depression_Cluster - Depression symptom cluster score.
Visual perception and integration (JOVI)
JOVI_percentcorrect_0deg - Percentage of correct responses at 0° orientation jitter.
JOVI_percentcorrect_7_8deg - Percentage of correct responses at 7–8° orientation jitter.
JOVI_percentcorrect_9_10deg - Percentage of correct responses at 9–10° orientation jitter.
JOVI_percentcorrect_11_12deg - Percentage of correct responses at 11–12° orientation jitter.
JOVI_percentcorrect_13_14deg - Percentage of correct responses at 13–14° orientation jitter.
JOVI_percentcorrect_15_16deg - Percentage of correct responses at 15–16° orientation jitter.
JOVI_percentcorrect_LineCatch - Percentage of correct responses on line catch trials.
JOVI_percentcorrect_NoBackgroundCatch - Percentage of correct responses on no-background catch trials.
JOVI_ReactionTime_0deg_ms - Mean reaction time for correct responses at 0° orientation jitter.
JOVI_ReactionTime_7_8deg_ms - Mean reaction time for correct responses at 7–8° orientation jitter.
JOVI_ReactionTime_9_10deg_ms - Mean reaction time for correct responses at 9–10° orientation jitter.
JOVI_ReactionTime_11_12deg_ms - Mean reaction time for correct responses at 11–12° orientation jitter.
JOVI_ReactionTime_13_14deg_ms - Mean reaction time for correct responses at 13–14° orientation jitter.
JOVI_ReactionTime_15_16deg_ms - Mean reaction time for correct responses at 15–16° orientation jitter.
JOVI_ReactionTime_LineCatch_ms - Mean reaction time for correct responses on line catch trials.
JOVI_ReactionTime_NoBackgroundCatch_ms - Mean reaction time for correct responses on no-background catch trials.
Description of the data and file structure: Data Dictionary
File: Crosta_Figure1_RTqPCR.zip: RT-qPCR of SP4, NOS1AP, and Dexras1
Description: Data for Figure 1: RT-qPCR of SP4, NOS1AP, and Dexras1 in the manuscript.
RT-qPCR data for SP4, NOS1AP, and Dexras1 mRNA expression in buccal swab samples are provided as CSV files corresponding to individual qPCR plates (10 total) within the qRT-PCR/ directory. Each plate-level file contains raw instrument outputs exported from the qPCR analysis software, including mean quantification cycle values (Cq Mean, corresponding to Ct Mean), technical replicate information, and machine-generated quality-control metadata.
RT-qPCR data processing and quality assessment were performed using QIAGEN Digital Insights software. Raw RT-qPCR output files contain numerous instrument-generated columns related to amplification performance, melt curve characteristics, and baseline thresholding. Only a subset of these columns was used for downstream analyses. Analysis-relevant values were extracted, processed, and compiled into analysis-ready datasets. Each reaction was run in triplicate.
Positive controls (human brain cDNA) and negative controls (no-template/water controls) were included during assay validation and quality assessment but are not included in the deposited analysis-ready datasets, as they were not used in statistical analyses.
Contents: 10 plate-level raw output files (RTqPCR_Plate01_Raw.csv...RTqPCR_Plate10_Raw.csv) and RTqPCR_Data_Summary.csv
RTqPCR_Data_Summary.csv contains an analysis-ready summary of RT-qPCR measurements used to generate Figure 1. Data are aggregated at the participant level, with one row per participant and gene target. Values in this file were derived from raw plate-level RT-qPCR output files after quality control and normalization.
Variables for raw output files:
Well Position - Position of the reaction well on the qPCR plate (e.g., A1, B3). Used to identify the physical location of each reaction.
Omit - Flag indicating whether the reaction was excluded from analysis by the instrument software.
Values: TRUE = omitted; FALSE = included.
Sample_Name - De-identified sample identifier corresponding to a participant or control material.
C### = control participant; P### = patient. Dilution (e.g., 1:10) indicates RNA input dilution.
Target_Name - Gene amplified in the reaction. Includes GAPDH (housekeeping gene), SP4, NOS1AP, Dexras1, or control reactions (e.g., water).
Task - Reaction classification assigned by the qPCR software.
“Unknown” indicates a biological sample; “NTC” indicates a no-template control.
Reporter - Fluorescent reporter dye used to detect amplification (e.g., SYBR Green).
Quencher - Fluorescent quencher associated with the reporter dye. For SYBR Green assays, this value may be NA or constant.
Quantity - Estimated quantity of amplified target based on standard curve calculations, when applicable.
Unit: arbitrary units defined by the instrument software.
Quantity Mean - Mean estimated quantity across technical replicates.
Unit: arbitrary units.
Quantity SD - Standard deviation of quantity estimates across technical replicates.
Unit: arbitrary units.
RQ - Relative quantity calculated by the software, typically normalized to a reference sample or control.
Unit: relative (unitless).
RQ Min - Minimum relative quantity value across technical replicates.
Unit: relative (unitless).
RQ Max - Maximum relative quantity value across technical replicates.
Unit: relative (unitless).
CT - Cycle threshold (Ct) value for the individual reaction well. Ct is the PCR cycle at which fluorescence exceeds background threshold.
Unit: cycles (unitless numeric value).
Ct_Mean - Mean cycle threshold (Ct) across technical triplicates. Ct is the PCR cycle at which fluorescence exceeds background threshold.
Unit: cycles (unitless numeric value).
Ct SD - Standard deviation of Ct values across technical replicates.
Unit: cycles.
Delta Ct - Difference between Ct value of the target gene and Ct value of the housekeeping gene (GAPDH) for the same sample.
Unit: cycles.
Delta_Ct_Mean - Difference between Ct_Mean of the target gene and Ct_Mean of GAPDH for the same sample. Represents within-sample normalization.
Unit: cycles (unitless numeric value).
Delta Ct Mean - Mean Delta Ct across technical replicates.
Unit: cycles.
Delta Ct SD - Standard deviation of Delta Ct values across technical replicates.
Unit: cycles.
Delta Ct SE - Standard error of the mean of Delta Ct values across technical replicates.
Unit: cycles.
Delta_Delta_Ct - Difference between the sample’s Delta_Ct_Mean and the mean Delta Ct of control samples run on the same qPCR plate. Plate-specific controls were used to correct for inter-plate variability.
Unit: cycles.
Automatic Ct Threshold - Indicator specifying whether the Ct threshold was automatically determined by the software.
Values: TRUE/FALSE.
Ct Threshold - Fluorescence threshold value used to calculate Ct during the exponential phase of amplification.
Unit: fluorescence units (arbitrary).
Automatic Baseline - Indicator specifying whether the baseline range was automatically determined by the software.
Values: TRUE/FALSE.
Baseline Start - PCR cycle number at which the baseline calculation begins.
Unit: cycles.
Baseline End - PCR cycle number at which the baseline calculation ends.
Unit: cycles.
Amp Status - Amplification status assigned by the software.
Values include “Amp” (successful amplification), “No Amp” (no amplification), or “Inconclusive”.
Comments - Free-text comments generated by the software regarding amplification quality or issues.
Cq Conf - Calculated confidence score for the Ct (Cq) value. Values near 1 indicate high confidence; values near 0 indicate failed or unreliable amplification.
Unit: relative confidence (0–1).
MTP - Multiplex flag indicating whether the reaction was multiplexed.
Values: TRUE/FALSE.
EXPFAIL - Experimental failure flag assigned by the software.
TRUE indicates a failed reaction.
NOAMP - Flag indicating no detectable amplification.
TRUE indicates absence of amplification.
THOLDFAIL - Threshold failure flag indicating abnormal threshold determination.
Values: TRUE/FALSE.
Tm1 - Primary melting temperature (Tm) of the amplified product determined from melt curve analysis.
Unit: degrees Celsius (°C).
CQCONF - Binary confidence flag for Ct value.
Y = confident Ct; N = not confident.
Tm2 - Secondary melting temperature peak, if present.
Unit: degrees Celsius (°C).
HIGHSD - Flag indicating high standard deviation among technical replicates.
Values: TRUE/FALSE.
Tm3 - Additional melting temperature peak detected during melt curve analysis.
Unit: degrees Celsius (°C).
OUTLIERRG - Flag indicating that the reaction was identified as a statistical outlier by the software.
Values: TRUE/FALSE.
Tm4 - Additional melting temperature peak, if detected.
Unit: degrees Celsius (°C).
NA - Indicates that a value is not applicable (e.g., housekeeping gene rows do not have Delta Ct or Delta Delta Ct values, or control reactions without amplification).
Variables for RTqPCR_Data_Summary.csv:
Plate Number - Numeric identifier of the RT-qPCR plate on which the sample was processed. Plates were analyzed independently, and control samples within each plate were used as the reference for ΔΔCt calculations.
Subject_ID - De-identified participant identifier (C## = control participant, P## = participant with schizophrenia).
Ct Mean GAPDH - Mean cycle threshold (Ct) value for the housekeeping gene GAPDH, averaged across technical replicates for the given subject on the specified plate. Ct values represent the PCR cycle at which fluorescence crossed the detection threshold (unitless).
Ct Mean SP4 - Mean cycle threshold (Ct) value for SP4 mRNA across technical triplicates. Ct is the PCR cycle at which fluorescence exceeds background threshold.
Unit: cycles (unitless numeric value).
Ct Mean Dexras - Mean cycle threshold (Ct) value for Dexras1 mRNA across technical triplicates. Ct is the PCR cycle at which fluorescence exceeds background threshold.
Unit: cycles (unitless numeric value).
Ct Mean NOS1AP - Mean cycle threshold (Ct) value for NOS1AP mRNA across technical triplicates. Ct is the PCR cycle at which fluorescence exceeds background threshold.
Unit: cycles (unitless numeric value).
Delta Ct Mean SP4 - Difference between Ct_Mean of SP4 mRNA and Ct_Mean of GAPDH for the same sample. Represents within-sample normalization.
Unit: cycles (unitless numeric value).
Delta Ct Mean Dexras - Difference between Ct_Mean of Dexras1 mRNA and Ct_Mean of GAPDH for the same sample. Represents within-sample normalization.
Unit: cycles (unitless numeric value).
Delta Ct Mean NOS1AP - Difference between Ct_Mean of NOS1AP mRNA and Ct_Mean of GAPDH for the same sample. Represents within-sample normalization.
Unit: cycles (unitless numeric value).
Delta Delta Ct SP4 - This value represents relative SP4 expression normalized to plate-specific controls. Difference between the sample’s Delta_Ct_Mean of SP4 mRNA and the mean Delta Ct of control samples run on the same qPCR plate. Plate-specific controls were used to correct for inter-plate variability.
Unit: cycles.
Delta Delta Ct Dexras - This value represents relative Dexras1 expression normalized to plate-specific controls. Difference between the sample’s Delta_Ct_Mean of Dexras1 mRNA and the mean Delta Ct of control samples run on the same qPCR plate. Plate-specific controls were used to correct for inter-plate variability.
Unit: cycles.
Delta Delta Ct NOS1AP - This value represents relative NOS1AP expression normalized to plate-specific controls. Difference between the sample’s Delta_Ct_Mean of NOS1AP mRNA and the mean Delta Ct of control samples run on the same qPCR plate. Plate-specific controls were used to correct for inter-plate variability.
Unit: cycles.
NA values - Indicates that a value was not available or not applicable for that subject–gene combination (e.g., target not amplified, failed quality control, or reference gene only).
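The Delta Ct and Delta Delta Ct definitions above can be sketched as a short calculation. The Ct values below are made up, and the helper names are hypothetical. The final step, 2^(-ΔΔCt), is the standard comparative Ct fold-change formula and assumes roughly 100% amplification efficiency. This README does not state whether that transformation was applied to the deposited values, so treat it as an assumption and consult the manuscript's Materials and Methods for the procedure actually used.

```python
from statistics import mean

def delta_ct(ct_target, ct_gapdh):
    """Within-sample normalization: target Ct mean minus GAPDH Ct mean."""
    return ct_target - ct_gapdh

def delta_delta_ct(sample_dct, control_dcts):
    """Plate-level normalization: sample Delta Ct minus the mean Delta Ct of that plate's controls."""
    return sample_dct - mean(control_dcts)

def relative_expression(ddct):
    """Standard comparative Ct fold change, 2^(-ddCt). Assumption, not confirmed by the README."""
    return 2.0 ** (-ddct)

# Made-up Ct means for a single plate: subject -> (SP4 Ct mean, GAPDH Ct mean).
plate = {"C01": (28.0, 20.0), "C02": (29.0, 20.0), "P02": (27.5, 20.0)}

dct = {sid: delta_ct(target, gapdh) for sid, (target, gapdh) in plate.items()}
# Controls on this plate (IDs starting with "C") define the reference Delta Ct.
control_dcts = [value for sid, value in dct.items() if sid.startswith("C")]
ddct = {sid: delta_delta_ct(value, control_dcts) for sid, value in dct.items()}
fold = {sid: relative_expression(value) for sid, value in ddct.items()}
```

In this toy plate the control mean Delta Ct is 8.5, so P02 (Delta Ct 7.5) has a Delta Delta Ct of -1.0, which corresponds to a two-fold higher relative expression under the stated assumption.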
File: Crosta_Figure2_AXCPT.zip: AXCPT data for patients and controls
Description: Data for Figure 2: AX Continuous Performance Task in the manuscript
AX-CPT behavioral data assessing cognitive control and goal maintenance are provided in CSV format within the AXCPT_Data/ directory. Two complementary data products are included: participant-level raw task output files and a consolidated, analysis-ready summary dataset.
Contents: 53 individual raw data files (AXCPT_C01_raw.csv...AXCPT_P35_raw.csv) and axcpt_scores_summary.csv
Participant-level raw AX-CPT data are provided as individual CSV files (AXCPT_Cxx_raw.csv and AXCPT_Pxx_raw.csv), where each file corresponds to a single subject (C### = control; P### = patient with schizophrenia). These files contain trial-by-trial task output exported directly from the CNTRACS AX-CPT task software (E-Prime format), including detailed information on task events, stimulus presentation, responses, reaction times, and device metadata.
Each raw file includes a set of machine-generated variables reflecting practice and test blocks, cue and probe events, trial types (AX, AY, BX, BY), response accuracy, and timing measures.
All values in the summary dataset were derived by aggregating trial-level data from the participant-level raw AX-CPT files. Missing or non-applicable values are explicitly coded as NA. All statistical analyses reported in the manuscript were performed exclusively using this consolidated summary dataset.
Participant identifiers and group assignments are consistent across raw and summary files.
Variables for raw files:
ExperimentName - Name of the E-Prime experiment file used to administer the AX Continuous Performance Task (AX-CPT).
Subject - De-identified participant identifier.
Session - Session number for the participant.
DataFile.Basename - Base filename of the raw E-Prime data file generated during task execution.
Display.RefreshRate - Refresh rate of the display monitor used during task presentation, in Hertz (Hz).
ExperimentVersion - Version identifier of the AX-CPT task.
Group - Diagnostic group identifier assigned within the task software.
Handedness - Participant-reported handedness as recorded by the task software.
RandomSeed - Random number generator seed used to initialize trial order and stimulus presentation.
RuntimeCapabilities - Capabilities of the runtime environment detected by the E-Prime software.
RuntimeVersion - Version of the E-Prime runtime software used during data collection.
RuntimeVersionExpected - Expected runtime version specified by the experiment configuration.
StudioVersion - Version of E-Prime Studio used to create the task.
SubjectCharacterString - Internal subject identifier string generated by the software.
ThankYou.DEVICE - Marker indicating presentation of the task completion or “thank you” screen.
Title.DEVICE - Marker indicating presentation of the title or introductory screen.
Block - Block number within the task structure.
EndofPractice.ACC - Accuracy during the final practice trial (1 = correct, 0 = incorrect).
EndofPractice.CRESP - Correct response code for the final practice trial.
EndofPractice.DEVICE - Device marker associated with the end-of-practice event.
EndofPractice.DurationError - Flag indicating a timing irregularity during the end-of-practice event.
EndofPractice.OnsetDelay - Delay before onset of the end-of-practice screen, in milliseconds.
EndofPractice.OnsetTime - Onset time of the end-of-practice screen, in milliseconds.
EndofPractice.OnsetToOnsetTime - Time between consecutive onset events during end-of-practice, in milliseconds.
EndofPractice.RESP - Participant response recorded during the end-of-practice event.
EndofPractice.RT - Reaction time recorded during the end-of-practice event, in milliseconds.
EndofPractice.RTTime - Absolute timestamp of the end-of-practice reaction time, in milliseconds.
EndPractice1.DEVICE - Device marker indicating termination of the practice phase.
InstructScreens - Indicator for instructional screen presentation.
InstructScreens.Cycle - Cycle number associated with instructional screen presentation.
InstructScreens.Sample - Sample index associated with instructional screen presentation.
Pause.DEVICE - Marker indicating presentation of a pause screen.
Practice - Indicator for practice trial execution.
Practice.Cycle - Cycle number associated with practice trials.
Practice.Sample - Sample index associated with practice trials.
PracticeReady.DEVICE - Marker indicating readiness screen prior to practice trials.
Procedure[Block] - Procedure name executed at the block level.
Ready.DEVICE - Marker indicating presentation of a readiness screen prior to test blocks.
Running[Block] - Indicator of block-level execution state.
Slide01.DEVICE - Marker indicating presentation of slide 01.
Slide02Build1.DEVICE / Slide02Build8.DEVICE - Markers indicating presentation of sequential build elements within slide 02.
Slide02Build1.RESP – Slide02Build8.RESP - Participant responses associated with slide 02 build elements.
Slide03CardPractice.DEVICE - Marker indicating presentation of card-based practice slide.
Slide04ToTrials.DEVICE - Marker indicating transition from instruction to trials.
Slide05Sounds1.DEVICE / Slide05Sounds4.DEVICE - Markers indicating sound cue presentation events.
Slide05Sounds4.RESP - Participant response associated with the fourth sound cue.
Test - Indicator for execution of test trials.
Test.Cycle - Cycle number associated with test trials.
Test.Sample - Sample index associated with test trials.
Trial - Sequential trial number.
CorrectRespCue - Correct response associated with the cue stimulus.
CorrectRespTarg - Correct response associated with the probe (target) stimulus.
Cue - Cue stimulus presented on the trial.
Cue.ACC - Cue accuracy (1 = correct, 0 = incorrect).
Cue.CRESP - Correct response code for the cue.
Cue.DEVICE - Device marker associated with cue presentation.
Cue.DurationError - Flag indicating abnormal timing during cue presentation.
Cue.OffsetTime - Offset time of the cue stimulus, in milliseconds.
Cue.OnsetDelay - Delay before cue onset, in milliseconds.
Cue.OnsetTime - Onset time of the cue stimulus, in milliseconds.
Cue.OnsetToOnsetTime - Time between consecutive cue onsets, in milliseconds.
Cue.RESP - Participant response to the cue stimulus.
Cue.RT - Reaction time to the cue stimulus, in milliseconds.
Cue.RTTime - Absolute timestamp of cue reaction time, in milliseconds.
Cue.StartTime - Start time of cue presentation, in milliseconds.
CueFeedback.DurationError / CueFeedback.OnsetToOnsetTime - Variables describing timing, duration, and execution of feedback following cue responses, including onset, offset, delays, and timing errors (all time values in milliseconds).
CueList - Identifier for the cue stimulus list.
CueMask.StartTime - Start time of cue masking event, in milliseconds.
Interval - Inter-stimulus interval duration, in milliseconds.
Prac1 - Indicator for an additional practice-related trial or event.
Prac1.Cycle - Cycle number associated with the additional practice event.
Prac1.Sample - Sample index associated with the additional practice event.
Probe.ACC - Probe (target) accuracy (1 = correct, 0 = incorrect).
Probe.CRESP - Correct response code for the probe stimulus.
Probe.DEVICE - Device marker associated with probe presentation.
Probe.OffsetTime - Offset time of the probe stimulus, in milliseconds.
Probe.OnsetTime - Onset time of the probe stimulus, in milliseconds.
Probe.RESP - Participant response to the probe stimulus.
Probe.RT - Reaction time to the probe stimulus, in milliseconds.
Probe.RTTime - Absolute timestamp of probe reaction time, in milliseconds.
Probe.StartTime - Start time of probe presentation, in milliseconds.
ProbeFeedback.ActionDelay / ProbeFeedback.TimingMode - Variables describing timing, duration, execution, and timing mode of feedback following probe responses (all time values in milliseconds).
Procedure[Trial] - Procedure name executed at the trial level.
Running[Trial] - Indicator of trial-level execution state.
Target - Target stimulus presented on the trial.
TargetList - Identifier for the target stimulus list.
Trials1to36, Trials1to36b, Trials1to36c, Trials1to36d - Identifiers for grouped trial sets defined by the task structure.
Trials1to36.Cycle – Trials1to36d.Sample - Cycle and sample indices associated with grouped trial sets.
TrialType - Trial classification (AX, AY, BX, or BY), defining cue–probe combinations used in AX-CPT analyses.
NA - Indicates non-applicable values for task phases without responses.
Variables for axcpt_scores_summary:
subject_ID - De-identified participant identifier (C### = control subject, P### = patient with schizophrenia).
Average_of_Probe.ACC - Mean accuracy across all probe trials.
Unit: proportion (0–1).
Average_of_Probe.RT - Mean reaction time for correct probe responses.
Unit: milliseconds.
AX_mean_error_rate, AY_mean_error_rate, BX_mean_error_rate, BY_mean_error_rate - Proportion of incorrect responses for each trial type.
Unit: proportion (0–1).
AX_mean_RT, AY_mean_RT, BX_mean_RT, BY_mean_RT - Mean reaction time for correct responses by trial type.
Unit: milliseconds.
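The per-trial-type summary variables above can be illustrated with a small aggregation over trial-level rows. This is a sketch under assumptions, not the authors' pipeline: it uses the raw-file column names TrialType, Probe.ACC, and Probe.RT, assumes practice trials and NA rows have already been removed, and uses made-up trial values. The function name `summarize_axcpt` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

def summarize_axcpt(trials):
    """Per trial type: error rate over all trials and mean RT over correct trials only."""
    by_type = defaultdict(list)
    for trial in trials:
        by_type[trial["TrialType"]].append(trial)

    summary = {}
    for trial_type, rows in by_type.items():
        # Reaction times are averaged over correct probe responses only.
        correct_rts = [row["Probe.RT"] for row in rows if row["Probe.ACC"] == 1]
        summary[trial_type] = {
            "error_rate": 1 - mean(row["Probe.ACC"] for row in rows),
            "mean_rt_ms": mean(correct_rts) if correct_rts else None,
        }
    return summary

# Made-up test-block trials for one participant.
trials = [
    {"TrialType": "AX", "Probe.ACC": 1, "Probe.RT": 400},
    {"TrialType": "AX", "Probe.ACC": 1, "Probe.RT": 500},
    {"TrialType": "AX", "Probe.ACC": 0, "Probe.RT": 900},
    {"TrialType": "AX", "Probe.ACC": 1, "Probe.RT": 450},
    {"TrialType": "BX", "Probe.ACC": 0, "Probe.RT": 700},
]
summary = summarize_axcpt(trials)
```

Returning None for the mean reaction time when a trial type has no correct responses mirrors the NA convention used in the deposited CSV files.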
File: Crosta_Figure4_tMS.zip
Description: Data for Figure 4: HSP60 targeted mass spectrometry results in the manuscript
Targeted mass spectrometry (tMS) was used to quantify HSP60 protein abundance in buccal cell lysates from patients with schizophrenia and control subjects. To ensure clarity, accessibility, and reuse, raw mass spectrometry output and normalized values used for statistical analyses are provided in two separate CSV files.
Contents: two files. tMS_abundance.csv contains the raw targeted mass spectrometry signal corresponding to HSP60 protein abundance. tMS_normalization.csv contains HSP60 abundance normalized relative to the mean control abundance within the same analytical batch.
Variables for tMS_abundance.csv:
Sample - Well or sample position identifier used during targeted mass spectrometry acquisition.
Subject_ID - De-identified participant identifier (C### = control subject, P### = patient with schizophrenia).
HSP60_Abundance - Raw targeted mass spectrometry signal corresponding to HSP60 protein abundance.
Unit: instrument intensity (arbitrary units).
Variables for tMS_normalization.csv:
Subject_ID - De-identified participant identifier (C### = control subject, P### = patient with schizophrenia).
HSP60_Abundance - Raw HSP60 abundance value (copied from raw file).
HSP60_normalized - HSP60 abundance normalized relative to the mean control abundance within the same analytical batch.
Unit: relative abundance (unitless).
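The normalization described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes normalization is a simple ratio to the control mean, that all rows passed in belong to one analytical batch (the files list no batch column), and that control subjects are identified by a `Subject_ID` beginning with `C`. The function name and the example values are hypothetical.

```python
from statistics import mean


def normalize_to_controls(rows):
    """Add HSP60_normalized = abundance / mean control abundance.

    Assumes every row comes from the same analytical batch and that
    control subjects have a Subject_ID starting with 'C'.
    """
    control_values = [
        float(r["HSP60_Abundance"]) for r in rows if r["Subject_ID"].startswith("C")
    ]
    if not control_values:
        raise ValueError("batch contains no control subjects")
    control_mean = mean(control_values)
    # Return new dicts so the input rows are left unchanged.
    return [
        dict(r, HSP60_normalized=float(r["HSP60_Abundance"]) / control_mean)
        for r in rows
    ]


# Made-up abundances for illustration only.
batch = [
    {"Subject_ID": "C001", "HSP60_Abundance": 100.0},
    {"Subject_ID": "C002", "HSP60_Abundance": 300.0},
    {"Subject_ID": "P001", "HSP60_Abundance": 400.0},
]
normalized = normalize_to_controls(batch)
for row in normalized:
    print(row["Subject_ID"], row["HSP60_normalized"])
```

Under this convention the control group averages 1.0 within a batch, so a value above 1 indicates abundance above the control mean.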
Supplementary Figures and Tables
File: Crosta_SupplementaryFigure1_JOVI.zip
Description: Data for Supplementary Figure 1: Jittered-Orientation Visual Integration (JOVI) task in the manuscript
JOVI task data are provided as raw participant-level files and a consolidated summary file.
Contents: 54 files of raw data from JOVI task (JOVI_C01_Raw.csv...JOVI_P35_Raw.csv) and JOVI_scores_Summary.csv
Raw JOVI files are provided as CSV files named JOVI_Cxx_raw.csv or JOVI_Pxx_raw.csv. These files contain unprocessed CNTRACS/E-Prime output and include experiment metadata, stimulus presentation variables, trial identifiers, response accuracy, and reaction time measurements.
Variables for raw files:
ExperimentName - Name of the E-Prime experiment file used to administer the Jittered Orientation Visual Integration (JOVI) task.
Subject - De-identified participant identifier assigned by the experimental software.
Session - Session number for the participant.
CNTRACSID - Internal CNTRACS study identifier associated with the testing session.
DataFile.Basename - Base filename of the E-Prime data file generated during task execution.
Display.RefreshRate - Monitor refresh rate during stimulus presentation (Hz).
ExperimentVersion - Version number of the JOVI task used.
Group - Diagnostic group identifier assigned by the task software.
RandomSeed - Random number generator seed used to initialize trial order.
RuntimeCapabilities - Capabilities of the runtime environment detected by the software.
RuntimeVersion - Version of the runtime software used during task execution.
RuntimeVersionExpected - Expected runtime version specified by the task configuration.
StudioVersion - Version of E-Prime Studio used to build the task.
Block - Block number within the task.
MasterList - Identifier for the master stimulus list used for trial presentation.
MasterList.Cycle - Cycle number within the master stimulus list.
MasterList.Sample - Sample index within the master stimulus list.
PleasantnessDisplay.ACC - Accuracy flag for responses during the pleasantness rating display (1 = correct, 0 = incorrect).
PleasantnessDisplay.CRESP - Correct response code for the pleasantness display.
PleasantnessDisplay.RESP - Participant response during the pleasantness display.
PleasantnessDisplay.RT - Reaction time for the pleasantness display response (milliseconds).
PleasantnessDisplay.RTTime - Absolute timestamp of the pleasantness display reaction time (milliseconds).
Procedure[Block] - Procedure state at the block level.
Running[Block] - Execution state at the block level.
Trial - Sequential trial number within the session.
answer - Expected correct answer for the trial.
Block1list – Block24list - Indicators specifying membership of the current trial in predefined stimulus lists associated with different task conditions or difficulty levels. These fields are generated automatically by the task structure.
correctresponse - Correct response associated with the stimulus presented on the trial.
criticaltrials - Indicator of whether the trial is designated as a critical (analysis-relevant) trial.
criticaltrials.Cycle - Cycle number within the critical trials list.
criticaltrials.Sample - Sample index within the critical trials list.
delta - Angular offset or jitter applied to the stimulus orientation on the trial (degrees).
ISI.ACC - Accuracy flag for inter-stimulus interval events (1 = correct, 0 = incorrect).
ISI.CRESP - Correct response code for the inter-stimulus interval.
ISI.RESP - Participant response during the inter-stimulus interval.
ISI.RT - Reaction time during the inter-stimulus interval (milliseconds).
leftegs - Indicator for trials presenting left-oriented contour elements.
leftegs.Cycle - Cycle number within the left-oriented element list.
leftegs.Sample - Sample index within the left-oriented element list.
orientation - Orientation direction of the stimulus (e.g., left or right).
practicetrials - Indicator for practice trials.
practicetrials.Cycle - Cycle number within the practice trials list.
practicetrials.Sample - Sample index within the practice trials list.
pracwithfeedbacklist - Indicator for practice trials presented with feedback.
pracwithfeedbacklist.Cycle - Cycle number within the practice-with-feedback list.
pracwithfeedbacklist.Sample - Sample index within the practice-with-feedback list.
Procedure[Trial] - Procedure state at the individual trial level.
responseinput.ACC - Accuracy of the participant’s response to the stimulus (1 = correct, 0 = incorrect).
responseinput.CRESP - Correct response code for the stimulus.
responseinput.RESP - Participant’s recorded response.
responseinput.RT - Reaction time for the stimulus response (milliseconds).
rightegs - Indicator for trials presenting right-oriented contour elements.
rightegs.Cycle - Cycle number within the right-oriented element list.
rightegs.Sample - Sample index within the right-oriented element list.
Running[Trial] - Execution state at the trial level.
size - Stimulus size condition used for the trial.
stimuli - Filename or identifier of the image stimulus presented on the trial.
NA - Indicates that a value was not applicable or not generated for a given trial (e.g., non-critical trials, practice trials, or software-generated placeholders).
Variables for JOVI_scores_Summary.csv:
Patient_ID - De-identified participant identifier (C### = control subject, P### = patient with schizophrenia).
CONDITION - Stimulus condition (1–8).
Percent_Correct - Percentage of correct responses per condition.
Unit: percent (0–100).
Mean_RT_For_Correct_Responses - Mean reaction time for correct responses per condition.
Unit: milliseconds.
File: Crosta_SupplementaryFigure2_RTqPCR_SP4.zip
Description: Data for Supplementary Figure 2: RT-qPCR of SP4 in the manuscript
RT-qPCR validation of SP4 expression using three independent primer sets.
Contents: three CSV files of raw data for three primer sets: RTqPCR_SP4-Origen.csv, RTqPCR_SP4-RTP1.csv, and RTqPCR_SP4-TF.csv
Variables:
Subject_ID - De-identified participant identifier (C### = control subject, P### = patient with schizophrenia).
CT_Mean_GAPDH - Mean cycle threshold (Ct) value for GAPDH, averaged across technical triplicates for the sample. Ct represents the PCR cycle number at which fluorescence crossed the detection threshold.
Unit: cycles.
CT_Mean_Gene - Mean cycle threshold (Ct) value for SP4 mRNA measured using the SP4-Gene (Origene/RTP1/TF) primer set, averaged across technical triplicates.
Unit: cycles.
DeltaCt_Mean_Gene - Mean ΔCt value for SP4-Gene (Origene/RTP1/TF), calculated as: ΔCt = (Ct Mean SP4-Gene) − (Ct Mean GAPDH). This value represents SP4-Gene (Origene/RTP1/TF) expression normalized to the GAPDH internal control within the same sample.
Unit: cycles.
DeltaDeltaCt_Gene - ΔΔCt value for SP4-Gene (Origene/RTP1/TF) expression, calculated as: ΔΔCt = (ΔCt of the sample) − (mean ΔCt of control subjects on the same qPCR plate). This value represents plate-specific normalization of SP4 expression relative to controls.
Unit: cycles.
NA - Indicates that a ΔΔCt value could not be computed, typically because a valid plate-matched control reference was unavailable for that sample.
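The ΔCt and ΔΔCt definitions above can be illustrated with a short sketch. It assumes all samples passed in share one qPCR plate and that control subjects are identified by a `Subject_ID` beginning with `C`; when a plate has no controls, ΔΔCt is returned as `None`, mirroring the NA convention. The function name and the Ct values are hypothetical.

```python
from statistics import mean


def compute_delta_delta_ct(samples):
    """Compute ΔCt and ΔΔCt for samples measured on one qPCR plate.

    ΔCt  = CT_Mean_Gene - CT_Mean_GAPDH
    ΔΔCt = ΔCt of the sample - mean ΔCt of controls on the same plate
    """
    delta_ct = {
        s["Subject_ID"]: s["CT_Mean_Gene"] - s["CT_Mean_GAPDH"] for s in samples
    }
    control_values = [v for sid, v in delta_ct.items() if sid.startswith("C")]
    # Without a plate-matched control reference, ΔΔCt cannot be computed.
    control_mean = mean(control_values) if control_values else None
    results = {}
    for subject_id, value in delta_ct.items():
        ddct = value - control_mean if control_mean is not None else None
        results[subject_id] = {"DeltaCt_Mean_Gene": value, "DeltaDeltaCt_Gene": ddct}
    return results


# Made-up Ct values for illustration only.
plate = [
    {"Subject_ID": "C001", "CT_Mean_GAPDH": 20.0, "CT_Mean_Gene": 28.0},
    {"Subject_ID": "C002", "CT_Mean_GAPDH": 21.0, "CT_Mean_Gene": 27.0},
    {"Subject_ID": "P001", "CT_Mean_GAPDH": 20.0, "CT_Mean_Gene": 25.0},
]
plate_results = compute_delta_delta_ct(plate)
print(plate_results["P001"])
```

Because Ct falls as the amount of template rises, a negative ΔΔCt indicates a higher SP4 signal than the plate's control mean.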
File: Crosta_SupplementaryFigure4_Western_Blot_Analysis_of_HSP60.zip
Description: Data for Supplementary Figure 4: Western blot data for HSP60 in the manuscript
Western blot linear-range determination and full blot images. Data are provided as a PDF file containing representative blots and corresponding validation information. Quantitative analyses derived from these experiments are reported in the manuscript.
Contents: one PowerPoint file of Western blots and quantitation.
Variables: No tabular raw data files are associated with this supplementary figure.
File: Crosta_Supplementary_Tables_Correlations.zip
Description: Data for Supplementary Tables 4–5 in the manuscript
Contents: Two Excel spreadsheets - Crosta et al Supplementary Table 4.Correlations.cvs and Crosta et al Supplementary Table 5.Linear Regressions.cvs
Correlation and hierarchical regression outputs with Benjamini–Hochberg FDR correction (Excel)
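For readers who want to re-derive adjusted p-values, the following is a minimal sketch of the standard Benjamini–Hochberg step-up adjustment. It is a generic implementation, not the software used for the manuscript; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay
    # monotone: each is the minimum of p * m / rank over equal or higher ranks.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up p-values for illustration only.
example_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(example_p)
print([round(p, 4) for p in adjusted_p])
```

A test is called significant at FDR level q when its adjusted p-value is at or below q.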
External proteomics repository
Proteomics data are deposited with the ProteomeXchange Consortium (PRIDE):
- Dataset ID: PXD055067
- DOI: 10.6019/PXD055067
Reproducibility statement
All statistical analyses, normalization procedures, and model specifications are described in detail in the main manuscript and its Materials and Methods section. The master summary file (Crosta_Master_Summary_Data.csv) is the complete, analysis-ready dataset used to generate all reported results.
Human subjects data
We received explicit consent from our participants to publish the de-identified data in the public domain. All data are coded, with no link to names or other identifying information.
