
Data from: Immunoreactive peptide maps of SARS-CoV-2

Cite this dataset

Joshi, Shreyas; Mishra, Nischay (2021). Data from: Immunoreactive peptide maps of SARS-CoV-2 [Dataset]. Dryad.


Serodiagnosis of SARS-CoV-2 infection is impeded by immunological cross-reactivity among the human coronaviruses (HCoVs) SARS-CoV-2, SARS-CoV-1, MERS-CoV, OC43, 229E, HKU1, and NL63. To study humoral immune responses specific to SARS-CoV-2, it is imperative to identify peptides that may enable discrimination between exposure to SARS-CoV-2 and exposure to other HCoVs. We used a high-density peptide microarray and plasma samples collected at two time points from 50 subjects with SARS-CoV-2 infection confirmed by qPCR, together with samples collected in 2004-2005 from 11 subjects with IgG antibodies to SARS-CoV-1, 11 subjects with IgG antibodies to other seasonal HCoVs, and 10 healthy human subjects. The peptide microarrays consist of ~172,000 12-mer peptide probes spanning whole viral genomes. This dataset consists of reactivity values for individual peptides from all 132 plasma samples for both IgG and IgM. Also provided is a correspondence key that lists, for each peptide, all viral genome sequences from which it originated.


Each peptide array is divided into 12 subarrays, with each subarray comprising ~172,000 nonredundant linear peptides, each twelve amino acids (aa) long, that tile the proteomes of known HCoVs with an 11-aa overlap. A total of 132 plasma samples were tested using eleven 12-plex peptide arrays. If a sample shows three or more consecutive reactive peptides (reactivity values above the threshold of 10,000), those peptide sequences are reassembled to constitute an epitope. The length of an epitope depends on the number of such consecutive reactive peptides.
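The reassembly rule described above can be illustrated with a minimal Python sketch. This is not the authors' script; it assumes that the peptides of one protein in one sample are available as (start position, sequence, reactivity) tuples and that neighbouring peptides are offset by exactly one residue, as the 11-aa overlap implies. The function name and the tuple layout are hypothetical.

```python
THRESHOLD = 10000   # reactivity cutoff stated in the text
MIN_RUN = 3         # minimum number of consecutive reactive peptides


def reassemble_epitopes(peptides, threshold=THRESHOLD, min_run=MIN_RUN):
    """Merge runs of consecutive reactive 12-mers into epitopes.

    `peptides` is a list of (start_position, sequence, reactivity) tuples
    for one protein in one sample (hypothetical layout). Neighbouring
    peptides are assumed to be offset by one residue (11-aa overlap).
    Returns a list of (start_position, epitope_sequence) tuples.
    """
    epitopes = []
    run = []  # current run of consecutive reactive peptides: (pos, seq)

    def flush():
        # Keep the run only if it is long enough, then reset it.
        if len(run) >= min_run:
            start, first_seq = run[0]
            # Each further peptide contributes exactly one new residue.
            seq = first_seq + "".join(s[-1] for _, s in run[1:])
            epitopes.append((start, seq))
        run.clear()

    for pos, seq, value in sorted(peptides):
        reactive = value > threshold
        if reactive and run and pos == run[-1][0] + 1:
            run.append((pos, seq))      # extend the current run
        else:
            flush()                     # close any run that just ended
            if reactive:
                run.append((pos, seq))  # start a new run
    flush()
    return epitopes
```

Under these assumptions, a run of n consecutive reactive 12-mers yields an epitope of 12 + (n - 1) residues, so the shortest possible epitope (n = 3) is 14 aa long.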

Usage notes

Aggregate data for all samples:

The peptide microarray platform generates a tab-delimited text file for each sample, containing the reactivity value of every peptide probe. Data from these files were aggregated into CSV files for IgG and IgM. Each row contains a probe sequence and its unique identifier, which can be matched against the correspondence key to find the probe's source, followed by the reactivity values for all 132 samples.
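As a rough illustration of how such an aggregated file might be read, the following Python sketch parses a CSV whose first two columns hold the probe identifier and sequence and whose remaining columns hold one reactivity value per sample. The column names PROBE_ID and PROBE_SEQUENCE, the sample names, and all values in the toy input are assumptions made for this example; the actual headers in the dataset files should be checked before use.

```python
import csv
import io

ID_COL = "PROBE_ID"          # hypothetical column name
SEQ_COL = "PROBE_SEQUENCE"   # hypothetical column name


def load_reactivity(handle):
    """Return {probe_id: (sequence, {sample_name: reactivity})}."""
    table = {}
    for row in csv.DictReader(handle):
        probe_id = row.pop(ID_COL)
        sequence = row.pop(SEQ_COL)
        # Every remaining column is treated as one sample.
        table[probe_id] = (sequence, {k: float(v) for k, v in row.items()})
    return table


def reactive_samples(table, probe_id, threshold=10000):
    """List the samples in which a probe exceeds the threshold."""
    _, values = table[probe_id]
    return [name for name, v in values.items() if v > threshold]


# Toy stand-in for an aggregated file (made-up values, two samples only).
toy_csv = io.StringIO(
    "PROBE_ID,PROBE_SEQUENCE,sample_001,sample_002\n"
    "P0001,ACDEFGHIKLMN,512.0,15230.5\n"
    "P0002,CDEFGHIKLMNP,11000.0,18000.0\n"
)
table = load_reactivity(toy_csv)
```

With a real file, the io.StringIO object would be replaced by an opened file handle, for example open(path, newline="").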



Correspondence key and header map:

Cross-reactivity among human coronaviruses arises from the highly conserved nature of their genomes. It is therefore imperative to map each 12-mer peptide to its source for proper epitope reassembly. The correspondence key file, which contains the probe sequences, their unique identifiers, and their sources, makes it possible to detect the longest epitope sequence. The header map file adds a further layer of information, giving protein names and their locations in the genome.
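To show why the mapping matters, the sketch below groups probes by source and orders them by position, so that consecutive peptides can be found separately for each virus. It assumes the correspondence key can be reduced to (probe identifier, source, start position) records; that layout, the function name, and the toy entries are hypothetical and do not describe the actual key file.

```python
from collections import defaultdict


def probes_by_source(key_rows):
    """Group probes by source and sort each group by start position.

    `key_rows` is an iterable of (probe_id, source, start) tuples
    (hypothetical layout). A probe whose sequence is conserved across
    viruses appears once per source.
    """
    groups = defaultdict(list)
    for probe_id, source, start in key_rows:
        groups[source].append((start, probe_id))
    return {source: sorted(items) for source, items in groups.items()}


# Toy key with made-up identifiers, sources, and positions.
toy_key = [
    ("P0002", "virus_A", 11),
    ("P0001", "virus_A", 10),
    ("P0001", "virus_B", 42),   # same peptide, conserved in a second genome
    ("P0003", "virus_A", 12),
]
by_source = probes_by_source(toy_key)
```

In this toy example, probe P0001 is listed under both sources, while only virus_A has three positionally consecutive probes that could form an epitope.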



Epitope re-assembly

The software for epitope re-assembly is an R script that takes either IgG or IgM as an argument and produces separate epitope files for each of the coronaviruses.


Multidimensional analysis