Association of SNPs in Microtus arvalis and clade infections by TULV-CEN.S and TULV-EST.S

Cite this dataset

Labutin, Anton; Saxenhofer, Moritz; White, Thomas; Heckel, Gerald (2021). Association of SNPs in Microtus arvalis and clade infections by TULV-CEN.S and TULV-EST.S [Dataset]. Dryad.


The natural host ranges of many viruses are restricted to very specific taxa. Little is known about the molecular barriers between species that lead to the establishment of this restriction or generally prevent virus emergence in new hosts. Here, we identify genomic polymorphisms in a natural rodent host associated with a strong genetic barrier to the transmission of the European Tula orthohantavirus (TULV). We analyzed the very abrupt spatial transition between two major phylogenetic clades in TULV across the comparatively much wider natural hybrid zone between evolutionary lineages of their reservoir host, the common vole (Microtus arvalis). A genomic scan of 79 225 Single Nucleotide Polymorphisms (SNPs) in 323 TULV infected host individuals detected 30 SNPs that were associated with specific TULV clades in two replicate sampling transects. Focusing the analysis on 199 voles with evidence of genomic admixture at the individual level (0.1 - 0.9) supported statistical significance for all 30 loci. Host genomic variation at these SNPs explained up to 37.6% of clade-specific TULV infections. Genes in the vicinity of associated SNPs are involved in functions related to immune response or membrane transport. This study demonstrates the relevance of natural hybrid zones as systems not only for studying processes of evolutionary divergence and speciation, but also for the detection of evolving genetic barriers for specialized parasites.


Host genotyping:

Genotyping by sequencing (GBS) (Elshire et al., 2011) was carried out on the Illumina NextSeq platform at Cornell University for all TULV-infected common voles and for additional samples from locations where no TULV infection was detected. Libraries were generated in 96-well plates using the restriction enzymes PstI and MspI. SNPs were identified and individuals genotyped simultaneously with the GBS v2 pipeline (part of the TASSEL 5 software) (Glaubitz et al., 2014), using a chromosome-level M. arvalis genome assembly as the reference sequence (Gouy et al., in prep). Default parameters were used, except that a minimum of five reads was required to identify a unique tag.

Genome-wide association study using GEMMA:

The genome-wide association analysis was conducted with GEMMA (Genome-wide Efficient Mixed Model Association) v.0.98.1 (Zhou & Stephens, 2012). We ran GEMMA’s linear model with default parameters on the whole dataset of 323 infected individuals, and separately on the Bavaria transect (190 infected individuals) and the Porcelain transect (133 infected individuals). The association strength of individual SNPs was estimated from the Wald test p-value calculated by GEMMA. Only SNPs that were significantly (p < 0.05) associated with clade-specific TULV infections in all three GWAS (full dataset, Bavaria transect only, Porcelain transect only) were considered for further analyses. To assess the potential impact of imputation on the GWAS results, we removed every individual with missing data at any of the significant SNPs, which left 165 of the 323 individuals. To further assess the possible impact of spatial autocorrelation of vole lineages and virus clades, we ran a separate GWAS including only the 199 admixed individuals with cluster membership between 0.1 and 0.9.
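The two filtering steps described above (keeping SNPs significant in all three GWAS, and selecting admixed individuals) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes tab-delimited association tables with an "rs" column for the SNP identifier and a "p_wald" column for the Wald test p-value, as GEMMA writes for its linear model. The function names, the toy values, and the inclusive treatment of the 0.1 and 0.9 bounds are assumptions made for illustration only.

```python
import csv
import io


def read_wald_pvalues(handle, id_col="rs", p_col="p_wald"):
    """Map SNP id -> Wald p-value from a tab-delimited association table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_col]: float(row[p_col]) for row in reader}


def shared_significant(results, alpha=0.05):
    """Return SNP ids with p < alpha in every result set."""
    sig_sets = [{snp for snp, p in r.items() if p < alpha} for r in results]
    return set.intersection(*sig_sets) if sig_sets else set()


def admixed(q_values, low=0.1, high=0.9):
    """Return ids whose cluster membership lies in [low, high].

    Treating the bounds as inclusive is an assumption of this sketch.
    """
    return [ind for ind, q in q_values.items() if low <= q <= high]


# Toy tables standing in for the three GWAS outputs (values are invented).
full = "rs\tp_wald\nsnp1\t0.001\nsnp2\t0.20\nsnp3\t0.04\n"
bavaria = "rs\tp_wald\nsnp1\t0.01\nsnp2\t0.03\nsnp3\t0.049\n"
porcelain = "rs\tp_wald\nsnp1\t0.03\nsnp2\t0.01\nsnp3\t0.06\n"

results = [read_wald_pvalues(io.StringIO(t)) for t in (full, bavaria, porcelain)]
shared = shared_significant(results)

# Toy cluster membership coefficients per individual (values are invented).
membership = {"v1": 0.05, "v2": 0.5, "v3": 0.95, "v4": 0.1}
admixed_ids = admixed(membership)

print(sorted(shared))
print(admixed_ids)
```

With real data, each toy string would be replaced by an open file handle on the corresponding association output, and the membership values by the cluster assignment results for each vole.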


Swiss National Science Foundation, Award: 31003A_149585

Swiss National Science Foundation, Award: 176209