Data from: Virus community structuring is shaped by habitat heterogeneity and resource utilisation strategies
Data files
Jul 12, 2023 version files 4.03 GB
- Dryad_McLeish_etal_raw_virome.txt
- README.md
Abstract
After decades of disconcerted research, we still recognise large gaps in the understanding of mechanisms that govern disease dynamics in complex biological communities. To determine how spatial structuring of plant communities caused by anthropic disturbance affects resource utilisation traits of viruses, we combine high-throughput, network, and metacommunity approaches. We find that the disturbance gradient corresponded to network modules and habitat specificity exhibited by a majority of viruses. Communities were connected through key hub species of either generalist viruses or high potential host reservoirs. Spatial dependencies were evident in regression models of species richness and correlations between metacommunity structure and both host range and transmission mode at finer spatial resolutions. We propose that virus community assembly is influenced by variation in niche opportunities. Distinctions in virus community composition caused by resource compartmentalisation can be used to track ecological traits important in forecasting transmission risk.
Methods
We conducted 78 collections at 23 sampling sites between July 2015 and June 2017 in the Vega del Tajo-Tajuña agricultural region of the south-central plateau of the Iberian Peninsula.
Total RNA extractions were conducted for the detection of all virus genome types (+ssRNA, -ssRNA, dsRNA, ssDNA, and dsDNA). Individual RNA extracts were pooled by plant species and collection (i.e., a single instance of sampling at a study site) to obtain a single preparation for HTS. Together, the HTS sample comprised 323 libraries of 2,037 pooled individual samples.
High-throughput sequencing was outsourced (CRG, Barcelona, Spain; http://www.crg.eu/). All library preparation included an rRNA depletion step. Paired-end reads of 125 or 150 nt were sequenced on Illumina HiSeq platforms. All reads were provided with Phred quality scores greater than or equal to Q30. Adapter contamination was trimmed using cutadapt v1.8.3.
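For orientation, the Q30 threshold can be related to a base-call error probability through the standard Phred definition, P = 10^(-Q/10). The short sketch below only illustrates that conversion; the function name is hypothetical and it is not part of the sequencing provider's workflow.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score Q to an error probability, P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


# Q30 corresponds to roughly one expected error per 1,000 base calls.
q30_error = phred_to_error_probability(30)
```

Under this definition, higher scores mean lower error probabilities, so Q30 or above corresponds to an error probability of about 0.001 or less.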
To standardise the counts of OTUs, the following pipeline was applied to the BLAST query matches of each read library. Reads were retained only if: 1) the query coverage was 100%; 2) the alignment length was greater than or equal to 125 nt; 3) the matches were from paired reads only; 4) each read pair of a given library matched a single reference genome only; 5) the read pair matched a single region of that reference; and 6) the difference between the maximum and minimum query start positions relative to a reference genome spanned more than 1% of the length of the reference. Local BLAST queries were conducted with BLAST+ version 2.2.29 to identify virus OTUs. The queries were conducted against a database of plant virus genomic references of +ssRNA, -ssRNA, dsRNA, ssDNA, and dsDNA viruses available from NCBI's Viral Genome Browser. Information about virus and host taxonomy was merged with the BLAST query output. The data represent all raw query results before the application of the read filtering pipeline. The column headers of the data table consist of the following:
"lib" (sequencing library code), "ncbi" (NCBI accession code), "read" (read orientation), "pident" (standard BLAST flag), "length" (standard BLAST flag), "sstart" (standard BLAST flag), "send" (standard BLAST flag), "bitscore" (standard BLAST flag), "qcovs" (standard BLAST flag), "taxon" (host species name), "host.family" (host family name), "habitat" (habitat category), "site" (code of study site), "collection" (code of collection), "title" (virus OTU species title), "vir.abbr" (virus OTU abbreviation), "vir.family" (virus OTU family), "genera" (virus OTU genus), and "genome" (virus genome type).
Usage notes
A text editor or Microsoft Excel can be used to open the data.
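Because the version files total 4.03 GB, it may be more practical to stream the table row by row than to load it at once. The Python sketch below does this and applies a simplified subset of the filtering criteria described in the Methods (criteria 1, 2, and 6). It is not the authors' pipeline, and it rests on several assumptions:

- The file is assumed to be tab-delimited, UTF-8 encoded, with the column headers on the first line.
- "sstart" is assumed to be the start position used for the span criterion.
- The span is computed after criteria 1 and 2 are applied.
- `ref_lengths` is a hypothetical, user-supplied mapping from NCBI accession to reference length, which the data table does not contain.
- All function names are illustrative.

```python
import csv


def read_hits(path, delimiter="\t"):
    """Yield one dict per BLAST hit without loading the whole file.

    ASSUMPTION: tab-delimited, UTF-8, header on the first line.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle, delimiter=delimiter)


def passes_row_filters(row, min_length=125):
    """Criteria 1 and 2: 100% query coverage and alignment length >= 125 nt."""
    return float(row["qcovs"]) == 100.0 and int(row["length"]) >= min_length


def start_spans(rows):
    """Return max(sstart) - min(sstart) per (lib, ncbi) for rows passing criteria 1-2."""
    lowest, highest = {}, {}
    for row in rows:
        if not passes_row_filters(row):
            continue
        key = (row["lib"], row["ncbi"])
        start = int(row["sstart"])
        lowest[key] = min(start, lowest.get(key, start))
        highest[key] = max(start, highest.get(key, start))
    return {key: highest[key] - lowest[key] for key in lowest}


def filter_hits(path, ref_lengths, min_span_fraction=0.01, delimiter="\t"):
    """Yield hits passing criteria 1, 2, and 6, reading the file twice.

    ref_lengths: HYPOTHETICAL mapping of NCBI accession -> reference length (nt).
    """
    spans = start_spans(read_hits(path, delimiter))  # first pass: spans per library/reference
    for row in read_hits(path, delimiter):  # second pass: emit retained hits
        if not passes_row_filters(row):
            continue
        key = (row["lib"], row["ncbi"])
        if spans[key] > min_span_fraction * ref_lengths[row["ncbi"]]:
            yield row
```

A call such as `filter_hits("Dryad_McLeish_etal_raw_virome.txt", ref_lengths)` returns a generator, so retained hits can be counted or written out without holding the full table in memory. The read-pair criteria (3 to 5) are omitted here because the listed columns do not include a read identifier.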