Data from: Autoantibody landscapes in neurological Long COVID and post-COVID cognitive impairment show heterogeneity without a shared disease signature
Data files
Mar 27, 2026 version files 1.28 GB
- fig3a_PatientHits_CSF_StainPos_nLC_vs_neverCOVIDcontrols.csv (718 B)
- fig4c_PatientHits_CSF_nLC_vs_neverCOVIDcontrols.csv (1.12 KB)
- fig4d_PatientHits_CSF_nLC_vs_postCOVIDcontrols.csv (1.09 KB)
- fig5b_PatientHits_Serum_idcrp_epicc_brace.csv (981 B)
- figS2a_PatientHits_CSF_neverCOVIDcontrols_vs_nLC.csv (834 B)
- figS3a_PatientHits_CSF_postCOVIDcontrols_vs_nLC.csv (1.01 KB)
- figS4a_PatientHits_Serum_nLC_vs_neverCOVIDcontrols.csv (833 B)
- figS6c_PatientHits_Serum_idcrp_epicc_casevscontrol.csv (945 B)
- idcrp_epicc_RPK.csv (567.64 MB)
- README.md (3.47 KB)
- yale_nLC_and_control_rpk_matrix.csv (713.11 MB)
Abstract
Neurologic Long COVID (n-LC) includes persistent cognitive and autonomic symptoms after SARS-CoV-2 infection. Prior studies of post-COVID conditions have described diverse humoral autoreactivity, but findings are heterogeneous, and it remains unclear whether n-LC is associated with a consistent CNS-directed humoral signature. We performed a cross-cohort case-control analysis to detect autoantibodies in cerebrospinal fluid (CSF) and serum from n-LC participants. In the Yale COVID Mind Study, CSF from n-LC participants and from pre-pandemic and post-COVID asymptomatic controls was assessed by mouse brain immunofluorescence and proteome-wide phage immunoprecipitation sequencing (PhIP-seq), with candidate reactivities evaluated by orthogonal assays and supervised modeling. In the Epidemiology, Immunology, and Clinical Characteristics of Emerging Infectious Diseases with Pandemic Potential (IDCRP EPICC) cohort, post-COVID sera collected prior to iPhone- or iPad-based cognitive screening were profiled by PhIP-seq and compared between participants with and without cognitive impairment. CSF immunoreactivity on mouse brain tissue was observed in both n-LC and controls, with similar overall frequencies, although n-LC participants more often showed nuclear-predominant staining patterns. PhIP-seq identified sparse, largely patient-specific peptide reactivities to nuclear and neuronal proteins in CSF and serum. Supervised models provided limited discrimination between cases and controls. Candidate autoantigens had limited disease specificity on orthogonal testing. EPICC serum profiling similarly failed to distinguish individuals with and without cognitive impairment. Across cohorts and compartments, n-LC did not exhibit a shared autoantibody signature. These findings support the absence of a dominant, common CNS autoantibody-mediated mechanism in n-LC.
Dataset DOI: 10.5061/dryad.kprr4xhkt
Description of the data and file structure
PhIP-Seq in neurological Long COVID and post-COVID cognitive impairment cases compared to their respective controls
The following data files have been uploaded:
1) yale_nLC_and_control_rpk_matrix.csv
2) idcrp_epicc_RPK.csv
These two files contain the peptide-level PhIP-seq data generated in two separate cohorts. The first (Yale COVID Mind Study) compares nLC (neuro Long COVID) cases with pre-pandemic controls and post-COVID asymptomatic controls. The second (IDCRP EPICC) focuses on LC participants who underwent iPhone/iPad-based cognitive testing, stratified by cognitive impairment status and compared with control participants. PhIP-seq experiments used a phage display library spanning the entire human proteome. Raw sequencing reads were aligned to the input PhIP-seq library using RAPSearch2, and rpK values were then calculated and uploaded in these CSV files, where:
- peptide: List of peptides screened in the PhIP-seq experiment, which are the row names (alongside other per-row metadata such as peptide_id, gene, and sequence)
- Patient samples, which are the column names
- Units: rpK (reads per 100,000)
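Because both matrices are several hundred megabytes, it can be convenient to stream them one peptide at a time rather than load them whole. The following is a minimal sketch using only the Python standard library. It assumes the four metadata columns are named exactly as listed in this README and that every other column holds per-sample rpK values; the function name `iter_rpk_rows`, the sample names, and the demo values are hypothetical and not taken from the dataset.

```python
import csv
import io

# Metadata columns named in this README; all other columns are ASSUMED to be samples.
META_COLUMNS = ("peptide_id", "peptide", "gene", "sequence")


def iter_rpk_rows(handle):
    """Yield (metadata, rpk_by_sample) for each peptide row of an rpK matrix."""
    reader = csv.DictReader(handle)
    for row in reader:
        meta = {k: row[k] for k in META_COLUMNS if k in row}
        rpk = {
            k: float(v)
            for k, v in row.items()
            if k and k not in META_COLUMNS and v not in ("", None)
        }
        yield meta, rpk


# Tiny in-memory stand-in for a matrix file (hypothetical sample names and values).
demo_matrix = io.StringIO(
    "peptide_id,peptide,gene,sequence,sample_A,sample_B\n"
    "1,PEP_1,GENE1,MKT,0.0,12.5\n"
    "2,PEP_2,GENE2,AVL,3.0,0.0\n"
)
matrix_rows = list(iter_rpk_rows(demo_matrix))
```

To read a real file, the same function could be called on `open("yale_nLC_and_control_rpk_matrix.csv", newline="")` in place of the in-memory demo.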
3) Enriched-peptide results displayed in the manuscript figures (PhIP-seq heatmaps) are available in table format with the naming pattern "ManuscriptFigurenumber_Patienthits_Compartment_comparison". Each table reports how many patients enriched a particular peptide shown in the heatmap.
- The first column gives the peptide (as shown in the heatmap) as peptide_id, and
- the second column gives the total number of patients enriching that peptide as patient_hits
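The naming pattern can be split programmatically when iterating over the figure tables. This is a minimal sketch that assumes the underscore-separated layout seen in the file names of this dataset; the helper name `parse_hits_filename` is hypothetical.

```python
def parse_hits_filename(filename):
    """Split '<figure>_PatientHits_<compartment>_<comparison>.csv' into its parts."""
    stem = filename[:-4] if filename.endswith(".csv") else filename
    parts = stem.split("_", 3)
    if len(parts) != 4 or parts[1].lower() != "patienthits":
        raise ValueError(f"unexpected file name: {filename}")
    figure, _, compartment, comparison = parts
    return {"figure": figure, "compartment": compartment, "comparison": comparison}


parsed_name = parse_hits_filename("fig4c_PatientHits_CSF_nLC_vs_neverCOVIDcontrols.csv")
```

Any extra qualifier after the compartment (for example `StainPos` in the fig3a file) remains part of the `comparison` field under this scheme.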
Files and variables
File: yale_nLC_and_control_rpk_matrix.csv
Description:
- UNIT: rpK
Variables
- peptide_id
- peptide
- gene
- sequence
- Sample names in columns
File: idcrp_epicc_RPK.csv
Description:
- UNIT: rpK
Variables
- peptide_id
- peptide
- gene
- sequence
- Sample names in columns
File: fig3a_PatientHits_CSF_StainPos_nLC_vs_neverCOVIDcontrols.csv
File: fig4c_PatientHits_CSF_nLC_vs_neverCOVIDcontrols.csv
File: fig4d_PatientHits_CSF_nLC_vs_postCOVIDcontrols.csv
File: fig5b_PatientHits_Serum_idcrp_epicc_brace.csv
File: figS2a_PatientHits_CSF_neverCOVIDcontrols_vs_nLC.csv
File: figS3a_PatientHits_CSF_postCOVIDcontrols_vs_nLC.csv
File: figS4a_PatientHits_Serum_nLC_vs_neverCOVIDcontrols.csv
File: figS6c_PatientHits_Serum_idcrp_epicc_casevscontrol.csv
Description:
- Number of patients enriching each peptide shown in the corresponding manuscript heatmap
Variables
- peptide_id
- patient_hits (number of patients enriching the peptide)
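The two-column figure tables are small enough to read directly into a dictionary. The sketch below uses only the Python standard library and assumes the header names are exactly `peptide_id` and `patient_hits`, as listed above; the function name `load_patient_hits` and the demo rows are hypothetical.

```python
import csv
import io


def load_patient_hits(handle):
    """Return {peptide_id: patient_hits} from a two-column figure table."""
    reader = csv.DictReader(handle)
    # int(float(...)) tolerates counts written as '3' or '3.0'.
    return {row["peptide_id"]: int(float(row["patient_hits"])) for row in reader}


# In-memory stand-in for one of the figure tables (hypothetical values).
demo_hits_file = io.StringIO("peptide_id,patient_hits\nPEP_1,3\nPEP_2,1\nPEP_3,5\n")
hits = load_patient_hits(demo_hits_file)

# Peptides ordered from most to fewest enriching patients.
ranked_hits = sorted(hits.items(), key=lambda item: item[1], reverse=True)
```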
Code
Available on GitHub: https://github.com/UCSF-Wilson-Lab/NeuroLongCOVID_Yale_EPICC_PhIPseq
Human subjects data
Consent was obtained from all participants for the public sharing of their de-identified data. All direct personal identifiers, including names, contact information, and any information that could reasonably enable identification of individual participants, were removed or anonymized. Where necessary, data were further generalized or aggregated to reduce the risk of re-identification. These measures ensure that the dataset is fully de-identified and suitable for public dissemination.
DNA libraries were barcoded, amplified, gel-purified, and subjected to next-generation sequencing on an Illumina NovaSeq instrument (Illumina, San Diego, CA). Sequencing reads from the raw FASTQ files were aligned to the reference library using RAPSearch2, and rpK values were calculated. These rpK data are uploaded here.
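For orientation, the normalization step can be illustrated as follows. This is a sketch only: it assumes rpK means a peptide's read count scaled to 100,000 total aligned reads within the same sample, which this README does not spell out, so the analysis code in the GitHub repository should be treated as the authoritative definition. The function name `counts_to_rpk` and the counts are hypothetical.

```python
def counts_to_rpk(counts, scale=100_000):
    """Scale per-peptide read counts for ONE sample to reads per `scale` reads.

    ASSUMPTION: the denominator is the sample's total aligned reads.
    """
    total = sum(counts.values())
    if total == 0:
        return {peptide: 0.0 for peptide in counts}
    return {peptide: scale * count / total for peptide, count in counts.items()}


# Hypothetical aligned read counts for a single sample.
demo_counts = {"PEP_1": 50, "PEP_2": 150, "PEP_3": 0}
demo_rpk = counts_to_rpk(demo_counts)
```

Under this assumption the values within a sample sum to 100,000, which makes peptide signals comparable across samples sequenced to different depths.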
