Seasonal dynamics of the wild rodent faecal virome
Data files
Jul 07, 2022 version files 3.02 MB
- blastx-viruses-results.csv
- families-host-range.csv
- imputed_timeseries_data_Wytham_rodent_virome_2022-06.csv
- MNKA_wytham_since_Nov2016.csv
- README.txt
- results-cleaning-blastx-contingency.csv
- sample_per_pool.csv
- WTCHG_contigs_blastxNR_results_virus_uniq.fasta
Abstract
Viral discovery studies in wild animals often rely on cross‐sectional surveys at a single time point. As a result, our understanding of the temporal stability of wild animal viromes remains poorly resolved. While studies of single host–virus systems indicate that host and environmental factors influence seasonal virus transmission dynamics, comparable insights for whole viral communities in multiple hosts are lacking. Utilizing noninvasive faecal samples from a long‐term wild rodent study, we characterized viral communities of three common European rodent species (Apodemus sylvaticus, A. flavicollis and Myodes glareolus) living in temperate woodland over a single year. Our findings indicate that a substantial fraction of the rodent virome is seasonally transient and associated with vertebrate or bacteria hosts. Further analyses of one of the most common virus families, Picornaviridae, show pronounced temporal changes in viral richness and evenness, which were associated with concurrent and up to ~3‐month lags in host density, ambient temperature, rainfall and humidity, suggesting complex feedbacks from the host and environmental factors on virus transmission and shedding in seasonal habitats. Overall, this study emphasizes the importance of understanding the seasonal dynamics of wild animal viromes in order to better predict and mitigate zoonotic risks.
Methods
Full methods are detailed in the manuscript (see preprint doi: https://doi.org/10.1101/2022.02.09.479684). An abridged version is provided below describing the data.
Study population: Wild rodents were trapped and sampled over a one-year period (January 2017 to January 2018) in Wytham Woods, a 385-ha mixed deciduous woodland near Oxford, UK. Three common rodent species are regularly caught at this site: two species of Apodemus mice (Apodemus sylvaticus and A. flavicollis, with A. sylvaticus more abundant) and the bank vole (Myodes glareolus). One night of trapping on a single c. 2.4 ha trapping grid was carried out approximately fortnightly year-round. Small Sherman traps (baited with six peanuts, a slice of apple, and sterile cotton wool for bedding material) were set at dusk and collected at dawn the following day. Newly captured individuals were PIT-tagged for unique identification. Faecal samples were collected from the bedding material with sterilized tweezers and frozen at -80°C within 10 hours of trap collection. Traps that showed any sign of animal contact (traps that held captured animals, and trigger failures where an animal had interfered with the trap but had not been captured) were washed thoroughly with bleach between trapping sessions to prevent cross-contamination. All live-trapping work was conducted with institutional ethical approval and under Home Office licence PPL-I4C48848E.
Sample selection and processing: We randomly selected 133 individual faecal samples (57 A. sylvaticus, 25 A. flavicollis, 51 M. glareolus). Five sampling intervals were defined to take into account the breeding cycle of the three rodent species: 1) Jan–Feb 2017, 2) Mar–Apr 2017, 3) May–Jul 2017, 4) Aug–Oct 2017, 5) Nov–Jan 2017/18. Faecal samples were pooled by species and sampling interval, using equal aliquots of 40 mg faeces per individual per pool. For the last sampling interval, where fewer individuals of A. flavicollis and M. glareolus were available (2 and 7, respectively), greater masses of faeces per individual (150 mg and 70 mg, respectively) were used for pooling to ensure sufficient material for sequencing.
The samples were processed as follows to enrich for RNA within encapsulated viruses: 1) frozen archived faecal samples were first pooled, then suspended in DNA/RNA Shield Stabilization Buffer (Zymo) and vortexed thoroughly, and the supernatant was filtered through a 0.45 µm pore filter; 2) RNase treatment (RNase One) to remove non-encapsulated RNA from the sample; 3) RNA extraction using Zymo Quick Viral RNA and RNA Clean and Concentrator 5 kits; 4) DNA digestion following RNA extraction; 5) ribosomal depletion during sequencing library preparation with the Illumina Ribo-Zero Plus kit, which removes ribosomal RNA from human, mouse, rat, and bacterial samples. Sequencing library preparation, which included cDNA synthesis, and sequencing were carried out by the Oxford Genomics Centre on an Illumina NovaSeq 6000 platform.
Viral genome reconstruction: A total of 355,917,017 paired-end reads of 150 base pairs (bp) were obtained after sequencing. Illumina adaptors were removed, and reads were filtered (quality scores ≥30 and read length >45 bp) using cutadapt 1.18 (Martin, 2011). A total of 352,872,111 cleaned paired-end reads were de novo assembled into 435,021 contigs by MEGAHIT 1.2.8 with default parameters (D. Li et al., 2015). Viral contigs were identified by comparing the assembled contigs against the NCBI RefSeq viral database using DIAMOND 0.9.22 with an e-value cutoff of <10^-5 (Buchfink et al., 2014). To eliminate false positives, all contigs that matched virus sequences were used as queries in reciprocal searches against the NCBI non-redundant protein sequence database with an e-value cutoff of <10^-5 (Altschul et al., 1990). We considered each viral contig as a viral operational taxonomic unit (vOTU). The abundance of each vOTU contig was assessed by iteratively mapping reads against each contig using BOWTIE2 2.3.4.3 (Langmead, 2010) and BBMap 35.34 (Bushnell, 2014). For viral contigs corresponding to complete or nearly complete genomes, we examined Open Reading Frames (ORFs) using ORF finder (parameters: minimum ORF size of 300 bp, standard genetic code, and assuming there are start and stop codons outside sequences) in Geneious Prime version 2019.1.1 (Kearse et al., 2012) to exclude misassembled genomes.
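The two-step screen described above (a match to the viral database, then a reciprocal search against the non-redundant database) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the tuple layouts, the function names, and the rule that a contig is kept only when its best non-redundant hit is viral are all assumptions made for illustration. Only the 10^-5 e-value cutoff comes from the text.

```python
E_VALUE_CUTOFF = 1e-5  # cutoff stated in the methods text


def viral_candidates(refseq_hits):
    """Contigs with at least one viral-database hit below the cutoff.

    refseq_hits: iterable of (contig_id, evalue) tuples (hypothetical layout).
    """
    return {contig for contig, evalue in refseq_hits if evalue < E_VALUE_CUTOFF}


def best_nr_hit(nr_hits):
    """Lowest e-value hit per contig among hits below the cutoff.

    nr_hits: iterable of (contig_id, subject_kingdom, evalue) tuples
    (hypothetical layout).
    """
    best = {}
    for contig, kingdom, evalue in nr_hits:
        if evalue >= E_VALUE_CUTOFF:
            continue
        if contig not in best or evalue < best[contig][1]:
            best[contig] = (kingdom, evalue)
    return best


def confirmed_viral_contigs(refseq_hits, nr_hits):
    """Keep candidates whose best reciprocal hit is a virus (assumed rule)."""
    candidates = viral_candidates(refseq_hits)
    best = best_nr_hit(nr_hits)
    return sorted(c for c in candidates if c in best and best[c][0] == "virus")


# Toy data, invented for illustration only.
refseq_hits = [("c1", 1e-20), ("c2", 1e-3), ("c3", 1e-8)]
nr_hits = [("c1", "virus", 1e-30), ("c3", "bacteria", 1e-40), ("c3", "virus", 1e-9)]
print(confirmed_viral_contigs(refseq_hits, nr_hits))
```

In the toy data, c2 fails the cutoff in the first search and c3 is dropped because its strongest reciprocal hit is bacterial, so only c1 is retained.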
Virus abundance and diversity metrics: After assignment of contigs to vOTUs, we normalised the abundance of contigs to the total reads and the number of individuals used in a pool. To reduce the impact of contamination in our analyses, we excluded viral contigs with fewer than one read per 10 million. The abundance of viruses was then compared using normalised read abundance. Virus diversity was assessed using both the number of virus genera (hereafter ‘richness’) and virus genera evenness (calculated with Shannon diversity, hereafter ‘evenness’) using the R package vegan.
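A minimal Python sketch of these metrics follows. The text does not give the exact normalisation formula, so the version here (reads per 10 million pool reads, then divided by the number of individuals in the pool) is an assumption, as are all function names. The Shannon index is computed directly as H = -sum(p * ln p) rather than by calling vegan; vegan's diversity() function also uses the natural logarithm by default.

```python
import math


def reads_per_10_million(mapped_reads, total_reads):
    """Mapped reads scaled to a library of 10 million reads."""
    return mapped_reads * 1e7 / total_reads


def normalised_abundance(mapped_reads, total_reads, n_individuals):
    """Assumed normalisation: reads per 10 million, per pooled individual.

    Returns 0.0 for contigs below one read per 10 million, mirroring the
    contamination filter described in the text.
    """
    scaled = reads_per_10_million(mapped_reads, total_reads)
    if scaled < 1.0:
        return 0.0
    return scaled / n_individuals


def richness(abundances):
    """Number of genera with non-zero abundance."""
    return sum(1 for a in abundances if a > 0)


def shannon_index(abundances):
    """Shannon diversity H = -sum(p * ln p) over non-zero abundances."""
    present = [a for a in abundances if a > 0]
    total = sum(present)
    return -sum((a / total) * math.log(a / total) for a in present)


# Toy genus-level abundances for one pool, invented for illustration.
genus_abundance = {"genus_A": 50.0, "genus_B": 50.0, "genus_C": 0.0}
print(richness(genus_abundance.values()))
print(shannon_index(genus_abundance.values()))
```

With two equally abundant genera the index equals ln 2, the maximum possible for a richness of two. A single dominant genus drives it towards zero.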
Predictors of picornavirus richness and evenness: We evaluated drivers of two outcome variables – picornavirus richness and evenness – using Gaussian-distributed generalised linear models (GLMs). We modelled picornaviruses in wood mice and bank voles separately, and only modelled these virus-host combinations as up to 6 picornaviruses were found and these hosts were sampled for viruses at each timepoint throughout the year. Four predictor variables with time series covering the preceding relevant seasons (June 2016–Dec 2016) and the picornavirus characterization period (Jan 2017–Dec 2017) were used to identify significant environmental and population factors affecting picornavirus richness and evenness. Temperature, humidity, and rain data were collected hourly at two microclimate stations within the woodlands. Host population density for each species was measured as the minimum number known alive (MNKA) per hectare, based on bimonthly trapping events across a 2.4 ha grid between November 2016 and January 2018. Since predictor and outcome variables were calculated at different frequencies (daily to seasonally), we used locally estimated scatterplot smoothing (LOESS) and generalised additive models (GAMs) to model a continuous estimate of each variable over the study period (June 2016–Jan 2018 for predictor variables; Jan 2017–Jan 2018 for outcome variables). Bimonthly estimates for picornavirus richness, picornavirus evenness (calculated with the Shannon diversity index in the R package vegan), and host population density were inferred with LOESS, while bimonthly estimates for microclimate data (temperature, humidity, and rain) were inferred with GAMs.
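The host density predictor can be illustrated with a short Python sketch of the standard minimum-number-known-alive calculation: an individual counts as alive in every trapping session between its first and last capture, including sessions in which it was not caught. The capture-record layout and function names are hypothetical, and the sketch covers a single species. Only the 2.4 ha grid area comes from the text.

```python
GRID_AREA_HA = 2.4  # trapping grid area stated in the text


def mnka_counts(captures, n_sessions):
    """Minimum number known alive for each trapping session.

    captures: iterable of (animal_id, session_index) tuples, with sessions
    numbered 0 .. n_sessions - 1 (hypothetical layout).
    """
    first, last = {}, {}
    for animal, session in captures:
        first[animal] = min(session, first.get(animal, session))
        last[animal] = max(session, last.get(animal, session))
    return [
        sum(1 for animal in first if first[animal] <= t <= last[animal])
        for t in range(n_sessions)
    ]


def mnka_density(captures, n_sessions, area_ha=GRID_AREA_HA):
    """MNKA per hectare for each session."""
    return [count / area_ha for count in mnka_counts(captures, n_sessions)]


# Toy capture histories, invented for illustration.
captures = [("A", 0), ("A", 2), ("B", 1), ("C", 0), ("C", 1)]
print(mnka_counts(captures, 3))
print(mnka_density(captures, 3))
```

In the toy data, animal A is missed in session 1 but is still counted there because it was caught both before and after, which is what distinguishes MNKA from a simple per-session capture count.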
Usage notes
All data tables can be opened with Excel or R, and all code can be opened in RStudio.