MHC class II genes mediate susceptibility and resistance to coronavirus infections in bats
Data files
May 10, 2023 version files 500.93 KB
- Hip_community-CoV-data_Schmidetal2023MolEcol.csv (129.60 KB)
- Hip_community-data_Schmidetal2023MolEcol.csv (102.30 KB)
- Hip_MHC-CoV-data_Schmidetal2023MolEcol.csv (245.14 KB)
- Hip_MHCII-DRB2_sequences_Schmidetal2023MolEcol.xlsx (20.07 KB)
- README.md (3.82 KB)
Abstract
Understanding the immunogenetic basis of coronavirus (CoV) susceptibility in major pathogen reservoirs, such as bats, is central to inferring their zoonotic potential. Members of the cryptic Hipposideros bat species complex differ in CoV susceptibility, but the underlying mechanisms remain unclear. The genes of the major histocompatibility complex (MHC) are the best understood genetic basis of pathogen resistance, and differences in MHC diversity are one possible reason for asymmetrical infection patterns among closely related species. Here, we aimed to link asymmetries in observed CoV (CoV-229E, CoV-2B, and CoV-2Bbasal) susceptibility to immunogenetic differences amongst four Hipposideros bat species. Of the 2,072 bats assigned to their respective species using the mtDNA cytochrome b gene, members of the most numerous and ubiquitous species, Hipposideros caffer D, were most infected with CoV-229E and SARS-related CoV-2B. Using a subset of 569 bats, we determined that much of the existent allelic and functional (i.e., supertype) MHC DRB class II diversity originated from common ancestry. One MHC supertype shared amongst all species, ST12, was consistently linked to susceptibility to CoV-229E, which is closely related to the common cold agent HCoV-229E, and infected bats with ST12 had a lower body condition. The same MHC supertype was connected to resistance to CoV-2B, and bats with ST12 were less likely to be co-infected with CoV-229E and CoV-2B. Our work suggests a role of immunogenetics in determining CoV susceptibility in bats. We advocate for the preservation of functional genetic and species diversity in reservoirs as a means of mitigating the risk of disease spillover.
Methods
Bats were live-trapped at five locations with one to three roosting sites in 12 two-month-long capture periods between September 2010 and August 2012 in Central Ghana, West Africa (Figure 1A). Where possible, all bats were classified to species level using morphological characteristics. Morphometric details such as forearm length or weight are described elsewhere (Baldwin et al., 2021). Two minimally invasive wing punches (3 mm) were taken from each bat and stored in molecular-grade ethanol at -20°C for DNA extraction. Additionally, faecal samples were collected and stored in RNAlater at -80°C for virus and microbiome screening.
DNA was extracted from wing punch tissue of 2,072 of the 6,654 bats assigned to the H. caffer complex or H. abae. The extraction followed an ammonium acetate protocol (Nicholls et al., 2000). Building on previous primer designs for the Hipposideros species complex (Vallo et al., 2008), the mtDNA cytochrome b gene (cytb) was amplified by polymerase chain reaction (PCR) using adapted primers suitable for high-throughput Illumina sequencing. After sequencing, the cytb gene was confirmed by homology analysis using the NCBI BLAST search. Subsequently, all sequences were analysed in Geneious 11.1.5 and assigned to the lineages B, C, or D of the H. caffer species complex (henceforth called species; Baldwin et al., 2014, 2021; Vallo et al., 2008) or the sympatric species H. abae using the MAFFT alignment tool (Katoh & Standley, 2013).
RNA was purified from approximately 20 mg of faecal material suspended in 500 µl RNAlater stabilizing solution using the MagNA Pure 96 system (Roche, Penzberg, Germany) with elution volumes set at 100 µl. We used a real-time reverse transcription-PCR assay designed to detect several alpha- and beta-CoVs and genetically related bat CoVs using the SSIII RT-PCR kit (Life Technologies, Karlsruhe, Germany) and a cycling protocol in a LightCycler 480 (Roche, Penzberg, Germany) as described previously (Corman et al., 2015; Drexler et al., 2009; Pfefferle et al., 2009). Bats were categorised as positive for a specific CoV if the CT-value was equal to or smaller than 38.0 (Corman et al., 2015).
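The positivity rule stated above (CT-value equal to or smaller than 38.0) can be written as a small Python sketch. This is illustrative only: the original analysis was run in R, the bat identifiers and CT values below are hypothetical, and treating a missing CT value as negative is an assumption rather than something documented for this dataset.

```python
CT_THRESHOLD = 38.0  # cut-off stated in the Methods (Corman et al., 2015)


def is_cov_positive(ct_value, threshold=CT_THRESHOLD):
    """Return True when a CT value counts as positive (CT <= threshold).

    None stands for "no amplification" and is treated as negative here;
    that convention is an assumption, not taken from the dataset.
    """
    if ct_value is None:
        return False
    return ct_value <= threshold


# Hypothetical CT values for four bats in a single CoV assay
example_ct = {"bat_A": 31.2, "bat_B": 38.0, "bat_C": 39.4, "bat_D": None}

# Map each bat to its positive/negative status
status = {bat: is_cov_positive(ct) for bat, ct in example_ct.items()}
print(status)
```

Note that a bat with a CT-value of exactly 38.0 is counted as positive, because the threshold is inclusive.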
A 171 bp fragment within the MHC class II DRB exon 2 loci of 575 Hipposideros samples was amplified using primers modified from Schad et al. (2011). Hipposideros samples were MHC-genotyped using an Illumina platform (see Supplementary Material). The sample size exceeded the threshold (>200) suggested for wild populations (Gaigher et al., 2019). The MHC class II DRB exon 2 sequences were analysed using the genotyping pipeline ACACIA (Allele CAlling proCedure for Illumina Amplicon sequencing data; Gillingham et al., 2021). The ACACIA workflow and the post hoc elimination of singletons and alleles with low reliability preserved reliable MHC allele information for a total of 569 bats.
Allelic MHC diversity was then grouped into functional supertypes based on shared amino acid motifs at positively selected sites (PSSs), following the assumption that PSSs likely belong to, or are closely linked to, functionally important antigen-binding sites (Roved et al., 2022; Schwensow et al., 2019; Sepil et al., 2013). Positively selected sites were identified for each species separately using CODEML integrated into the program PAML4 (Yang, 2007). Four PAML models were tested: M1a (nearly neutral), M2a (positive selection), M7 (beta), and M8 (beta and ω). M2a and M8 performed equally well, providing evidence of selection acting on specific sites. For subsequent supertype assignment across species, a total of 14 ‘consensus’ PSSs were selected based on being a) identified by both M2a and M8 with a posterior probability of at least 95% and b) selected in at least 3 of the 4 species. Antigen-binding specificity can be quantified by z-values describing the physicochemical properties of the amino acids encoded by the codons present at PSSs (Sandberg et al., 1998). A matrix containing the z-values of each allele’s PSS amino acids was used in the functions find.clusters and dapc of the ‘adegenet’ R package (Jombart et al., 2010) to cluster alleles into groups (i.e., MHC supertypes or STs) with similar binding functionality.
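The two-part rule for choosing ‘consensus’ PSSs described above can be sketched as follows. This is a minimal Python illustration, not the authors' workflow, which relied on CODEML output and R. The site numbers and posterior probabilities are invented for the example, and the function names `species_pss` and `consensus_pss` are hypothetical.

```python
from collections import Counter

PP_CUTOFF = 0.95   # posterior probability of at least 95% (Methods)
MIN_SPECIES = 3    # site must be selected in at least 3 of the 4 species


def species_pss(m2a, m8, cutoff=PP_CUTOFF):
    """Sites reaching the cutoff under BOTH models M2a and M8 for one species."""
    return {site for site, pp in m2a.items()
            if pp >= cutoff and m8.get(site, 0.0) >= cutoff}


def consensus_pss(per_species, min_species=MIN_SPECIES):
    """Sites that qualify as PSSs in at least `min_species` species."""
    counts = Counter(site for sites in per_species.values() for site in sites)
    return sorted(site for site, n in counts.items() if n >= min_species)


# Hypothetical posterior probabilities per codon site (NOT real results)
posteriors = {
    "H. caffer B": {"M2a": {9: 0.99, 11: 0.97, 30: 0.80},
                    "M8": {9: 0.99, 11: 0.98, 30: 0.96}},
    "H. caffer C": {"M2a": {9: 0.98, 11: 0.96},
                    "M8": {9: 0.99, 11: 0.95}},
    "H. caffer D": {"M2a": {9: 0.99, 30: 0.97},
                    "M8": {9: 0.99, 30: 0.99}},
    "H. abae":     {"M2a": {11: 0.99, 30: 0.96},
                    "M8": {11: 0.97, 30: 0.98}},
}

# Step a) per-species PSSs supported by both models
per_species = {sp: species_pss(p["M2a"], p["M8"]) for sp, p in posteriors.items()}

# Step b) keep sites shared by at least three species
print(consensus_pss(per_species))  # [9, 11]
```

In this toy input, site 30 is dropped because it passes both models in only two species, while sites 9 and 11 each qualify in three.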
This was the methodological backbone of the data uploaded here. For more information please refer to the open-access publication and its supplementary material.
Usage notes
We provide three datasets (Hip_MHC-CoV-data_Schmidetal2023MolEcol.csv; Hip_community-data_Schmidetal2023MolEcol.csv; and Hip_community-CoV-data_Schmidetal2023MolEcol.csv) in CSV format. They form the basis of the analysis performed in the associated R Markdown file (Schmid.Meyer_MolEcol23_Rscript.Rmd).
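For a quick look at the three CSV files outside R, a minimal Python sketch using only the standard library is given below. It assumes comma-separated files with a header row and UTF-8 encoding, none of which is confirmed here; the README.md is the authoritative description of the columns. The helper names `read_table` and `summarise` are hypothetical.

```python
import csv
from pathlib import Path

# File names as listed under "Data files"
DATA_FILES = [
    "Hip_MHC-CoV-data_Schmidetal2023MolEcol.csv",
    "Hip_community-data_Schmidetal2023MolEcol.csv",
    "Hip_community-CoV-data_Schmidetal2023MolEcol.csv",
]


def read_table(path):
    """Read one CSV file into (column names, list of row dicts).

    Assumes a comma delimiter and a header row (an assumption).
    'utf-8-sig' also tolerates a leading byte-order mark.
    """
    with open(path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return list(reader.fieldnames or []), rows


def summarise(data_dir="."):
    """Return {file name: (row count, column names)} for files that exist."""
    summary = {}
    for name in DATA_FILES:
        path = Path(data_dir) / name
        if not path.exists():
            continue  # skip files that have not been downloaded
        columns, rows = read_table(path)
        summary[name] = (len(rows), columns)
    return summary


if __name__ == "__main__":
    for name, (n_rows, columns) in summarise().items():
        print(f"{name}: {n_rows} rows, {len(columns)} columns")
```

If the files turn out to use a different delimiter, pass `delimiter=";"` (or similar) to `csv.DictReader` inside `read_table`.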
All statistical data analyses were computed in the open-access platform R.
Additionally, we provide an Excel file (Hip_MHCII-DRB2_sequences_Schmidetal2023MolEcol.xlsx) with four sheets listing the MHC class II DRB exon 2 allele sequences identified among the four hipposiderid species.