Data from: The role of a viral symbiont in the thermal mismatch of host-parasitoid interactions
Data files
Mar 05, 2026 version files 70.75 MB
- ccbv_annot_IDs_goterms.csv (42.59 KB)
- CcBV_F_allContrasts_merged_ANNOTATED.csv (43.78 KB)
- CcBV_L_allContrasts_merged_ANNOTATED.csv (43.88 KB)
- Ccmaster_coldata.csv (18.97 KB)
- ccwide_counts.csv (54.54 KB)
- cds_CcBV_counts_matrix.mx (53.92 KB)
- ffield_tot_immunedata.csv (12.21 KB)
- lab_tot_immunedata.csv (15.88 KB)
- master_coldata.csv (19.59 KB)
- ms_annot_IDs_goterms_man_annot.csv (5.43 MB)
- ms_annot_IDs_goterms.csv (3.45 MB)
- ms_F_allContrasts_merged_ANNOTATED.csv (12.37 MB)
- ms_L_allContrasts_merged_ANNOTATED.csv (12.31 MB)
- ms_ncbi_dataset_allIDs.tsv (2.81 MB)
- Ms_transcript_counts_matrix.mx (14.11 MB)
- mswide_counts.csv (6.64 MB)
- README.md (14.67 KB)
- refseq_annotations_ms_GCF_014839805.1_complete.tsv (7.06 MB)
- Supplementary_datafile_genecategories.xlsx (2.95 MB)
- term2gene_ccbv.csv (1.96 KB)
- term2gene_ms.csv (1.75 MB)
- term2name_ccbv.csv (1.10 KB)
- term2name_ms.csv (1.50 MB)
- tricem_immune_data_combined.csv (23.83 KB)
- tricem_tot_rnaData_-_RNA_data.csv (13.48 KB)
Abstract
High temperature events are becoming more severe with climate change, altering species interactions and ecological networks. Symbionts can influence the thermal tolerance of their hosts, yet the mechanisms underlying these effects are poorly understood. We tested the impact of a high temperature event on the molecular interactions among a caterpillar host, Manduca sexta, its parasitoid wasp, Cotesia congregata, and the wasp’s symbiotic virus. As in many host–parasitoid systems, high temperatures are lethal to developing parasitoids, but not hosts. Typically, the parasitoid’s viral symbiont immunosuppresses M. sexta. Here, we show that elevated temperatures led to an impairment of this immunosuppression, persisting for days after the event ended. Viral gene expression in the host was altered by heat, with distinct expression patterns tied to the virus’s genomic architecture. Specifically, viral transcription varied according to the gene’s position on viral circular genomic segments: genes located on circles known to integrate into host DNA exhibited increased or unchanged expression following high temperature exposure, while genes on nonintegrating circles showed marked reductions in expression. These results demonstrate that high temperatures can disrupt parasitic immunosuppression, which could help explain the lower thermal tolerance of parasitoids relative to hosts. The genomic structure of the viral symbiont may be associated with these effects, but additional research is needed to evaluate this hypothesis. Our findings highlight the importance of complex interactions between environmental temperature, microbial symbionts and host immunity in the ecological responses of host–parasitoid systems to high temperature events.
https://doi.org/10.5061/dryad.47d7wm3p0
This submission contains immune and RNAseq data; code for data cleaning, analysis, statistics, and figure generation; all annotation and other input files needed to run the code (some of which are sourced from the NCBI database and cited throughout the submission); and a collated table of M. sexta and CcBV gene synonyms and functional characterizations used in preparing the associated manuscript.
Description of the data and file structure
Files:
There are 3 scripts included: one for cleaning the RNAseq matrices, one for running DESeq analyses of the cleaned matrices, and one for generating figures and statistics for both the RNAseq and immune assays.
We have included all data files necessary to generate the figures. To change inputs, start with the cleaning script (1), or proceed to the DESeq script (2) to change the DESeq analysis inputs. Script 2 must be run before script 3, because script 3 relies on the .RData files generated by script 2.
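The dependency of script 3 on script 2 can be checked before running anything. The following is a minimal sketch in Python (the provided scripts themselves are R Markdown); it assumes the two .RData files named later in this description are written to a single folder, and the function name `missing_rdata` is hypothetical.

```python
from pathlib import Path

# File names as listed under the inputs for script 3.
REQUIRED_RDATA = ("interactive_deseq.RData", "pairwise_deseq.RData")


def missing_rdata(directory):
    """Return the required .RData files that are not present in `directory`.

    Assumption: script 2 writes both files to the same folder; adjust the
    path if your copy of the scripts saves them elsewhere.
    """
    folder = Path(directory)
    return [name for name in REQUIRED_RDATA if not (folder / name).is_file()]


# Example use (not executed here):
#   if missing_rdata("."):
#       print("Run tricem_cleaned_deseq.Rmd before the figures script.")
```

An empty list means both files are present and script 3 should be able to load them.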
Lastly, we have included a table (Supplementary_datafile_genecategories.xlsx) which details the protein and transcript IDs (where appropriate) and their species-specific synonyms, with functional characterizations.
- This file has multiple sheets: one for M. sexta, one for CcBV, and one readme. Column descriptions are listed in the 'readme' sheet.
1) tricem_rnaseq_cleaning.Rmd
Note: the output of this script is also included in the Dryad upload, so skip this step unless you want to change how the raw transcript count matrices are cleaned.
This script takes the count matrices for M. sexta and CcBV (output from FeatureCounts) and creates datasets in the format required by the DESeq script. It creates several output files: one for each population and one combined file each for M. sexta and CcBV. The cleaning script takes 3 input files:
- cds_CcBV_counts_matrix.mx
- This file contains the counts matrix for CcBV aligned sequences
- Ms_transcript_counts_matrix.mx
- This file contains the counts matrix for M. sexta aligned sequences
- tricem_tot_rnaData_-_RNA_data.csv
- This is the master spreadsheet with treatment group information for all libraries
- MISSING DATA: empty values in the date.hatch and wt.3rd columns are missing data (values that were not recorded). Empty values in all other columns are not applicable (e.g., number of ovipositions for non-parasitized caterpillars, or time of removal from heat shock for control caterpillars).
- Column information
- Sample_ID: unique sample identifier with lab and field identifying information
- expt: lab or field population experiment
- sample.num: unique sample identifier internal to lab and field experiments
- origin: population origin. CC = Central Crops Research Station, RM = Rocky Mount Research Station, MF = Mason Farm preserve, LAB = lab population.
- temp.treat: treatment temperature. Unit is degrees Celsius.
- para.status: parasitism status. NP = no parasitism, P = parasitized
- date.hatch: date the egg hatched, if known
- date.3rd: date of molt to 3rd instar
- wt.3rd: mass (mg) at molt to 3rd instar
- date.inj.or.ovp: date of sham injection (NP) or parasitism oviposition (P)
- time.inj.or.ovp: time of sham injection (NP) or parasitism oviposition (P)
- num.ovp: number of observed oviposition events by C. congregata females into caterpillar. Empty values in this column indicate no ovipositions (for NP groups)
- date.HS.in: date the heat shock treatment started. Empty values in this column indicate no heat shock treatment (for 25C control group)
- time.HS.in: time placed into heat shock. Empty values in this column indicate no heat shock treatment (for 25C control group)
- date.HS.out: date the heat shock treatment ended. Empty values in this column indicate no heat shock treatment (for 25C control group)
- time.HS.out: time of removal from heat shock treatment. Empty values in this column indicate no heat shock treatment (for 25C control group)
- date.freeze: date the sample was frozen for later RNA extraction
- time.freeze: time sample was frozen
- date.soak: date the sample was soaked in RNAlater solution
- date.FB.dissect: date the sample was removed from RNAlater solution and fat body was dissected
- date.RNA.extract: date RNA was extracted from fat body
- date.library: date the library was prepared for transcriptomics.
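Because an empty cell in this spreadsheet means either "not recorded" or "not applicable" depending on the column, it can help to label empty cells explicitly when loading the file. The sketch below is one possible way to do that in Python with the standard library; the rows are made-up demonstration values, not data from the file, and the helper name `classify_empty` is hypothetical.

```python
import csv
import io

# Columns where an empty cell means the value was not recorded
# (per the MISSING DATA note above).
NOT_COLLECTED = {"date.hatch", "wt.3rd"}


def classify_empty(column, value):
    """Return the value unchanged, or a label explaining why it is empty."""
    if value.strip() != "":
        return value
    return "missing" if column in NOT_COLLECTED else "not_applicable"


# Made-up example rows; the real file has many more columns and samples.
example = io.StringIO(
    "Sample_ID,para.status,wt.3rd,num.ovp\n"
    "demo_1,NP,,\n"
    "demo_2,P,31.5,2\n"
)

rows = [
    {col: classify_empty(col, val) for col, val in row.items()}
    for row in csv.DictReader(example)
]
```

Here the empty wt.3rd cell of the first row is labelled as missing, while its empty num.ovp cell is labelled as not applicable, since that caterpillar was not parasitized.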
2) tricem_cleaned_deseq.Rmd
This script takes the cleaned RNA data from the cleaning script and runs differential expression analyses. To be thorough, we ran both pairwise and interactive analyses, but used the interactive analysis in the manuscript because it was the most informative (patterns of expression were the same in both).
This script outputs several files used by the figure and statistics script (below). The output files are also included in the Dryad upload, but you still need to run this script to produce the .RData files that script 3 requires. The output files contain library information, annotation information, and differential expression information for M. sexta and CcBV, with one file for each population:
- ms_L_allContrasts_merged_ANNOTATED.csv
- ms_F_allContrasts_merged_ANNOTATED.csv
- CcBV_L_allContrasts_merged_ANNOTATED.csv
- CcBV_F_allContrasts_merged_ANNOTATED.csv
- MISSING DATA: NA values in these datasets indicate that the gene was not detected at a high enough level in either treatment group to estimate differential expression.
- COLUMN INFORMATION: transcript_accession is the GenBank transcript accession associated with the read. Subsequent columns show the mean number of reads (baseMean), log2 fold change (log2FoldChange), log2 fold change standard error (lfcSE), F statistic (Fstat), p value (Fpvalue), and adjusted p value (Fpadj) for each DESeq term (40P treatment vs. 25P, 25P vs. 25NP, etc.)
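When reading these files outside R, two details matter: log2 fold changes must be exponentiated to obtain fold changes, and NA cells must not be treated as numbers. The following Python sketch illustrates both. The 0.05 threshold is an illustrative assumption rather than the cutoff used in the manuscript, and the helper names are hypothetical; the cells passed in would come from whichever contrast-specific columns you choose.

```python
def parse_number(cell):
    """Convert a CSV cell to float; empty or 'NA' cells become None."""
    cell = cell.strip()
    if cell == "" or cell.upper() == "NA":
        return None
    return float(cell)


def fold_change(log2fc):
    """Convert a log2 fold change to a plain fold change (2 ** log2fc)."""
    return 2.0 ** log2fc


def is_significant(log2fc_cell, padj_cell, alpha=0.05):
    """True only when both values are present and padj is below alpha.

    alpha=0.05 is an assumed, illustrative threshold.
    """
    log2fc = parse_number(log2fc_cell)
    padj = parse_number(padj_cell)
    if log2fc is None or padj is None:
        return False
    return padj < alpha
```

A log2 fold change of 1 corresponds to a doubling and -2 to a four-fold reduction; rows with NA in either column are simply reported as not significant rather than raising an error.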
Input files:
- ccwide_counts.csv and mswide_counts.csv - datasets with read counts for each gene for each sample, in wide format. 'cc' refers to Cotesia congregata bracovirus and 'ms' to Manduca sexta. These were output from script 1. Columns indicate individual experimental samples.
- master_coldata.csv and Ccmaster_coldata.csv - master metadata files for M. sexta and Cotesia congregata bracovirus, respectively (including treatment conditions and other experimental data). These were output from script 1.
- MISSING DATA: empty values in the date.hatch and wt.3rd columns are missing data (values that were not recorded). Empty values in all other columns are not applicable (e.g., number of ovipositions for non-parasitized caterpillars, or time of removal from heat shock for control caterpillars).
- COLUMN INFORMATION: same as tricem_tot_rnaData_-_RNA_data.csv, with the addition of the column bugID, which indicates whether the sample was from the lab population experiment, the field population experiment, or a subset of individuals from the lab population included in the field experiment for comparison purposes
- ccbv_annot_IDs_goterms.csv - has annotation information for CcBV transcripts
- MISSING DATA: empty or NA values indicate no hit (e.g., no GO term, no synonym for the gene, etc)
- COLUMN INFORMATION:
- Description: gene name
- category: bracovirus gene category
- cat_ALT_BEN: subcategory indicating BEN domain, if applicable
- synonym: gene synonym, if applicable
- integrating: whether the gene is located on an integrating or non-integrating DNA circle
- circle: CcBV circle number the gene is located on
- e.Value: E-value for NCBI search
- GO.IDs: GO numeric identifiers
- GO.Names: GO ID names
- Enzyme.Codes: enzyme codes, if applicable
- Enzyme.Names: enzyme names, if applicable
- InterPro.IDs: InterPro IDs, if known
- InterPro.GO.IDs: combination of InterPro and GO IDs
- InterPro.GO.Names: combination of InterPro and GO ID names
- transcript_accession: transcript accession for gene
- GENE FAMILY: duplicate column; empty
- NOTES: notes, if applicable
- ms_annot_IDs_goterms_man_annot.csv - annotation information for M. sexta, confirmed by searching protein IDs against the database from He and colleagues
- MISSING DATA: empty or NA values indicate no hit (e.g., no GO term, no synonym for the gene, etc)
- COLUMN INFORMATION:
- Symbol: gene name shorthand
- GO_terms: GO IDs
- NAME: full gene name
- Gene.ID: Gene ID from NCBI
- Protein.accession: NCBI protein accession
- transcript_accession: NCBI transcript accession
- go_description: GO description
- Synonyms: gene synonyms, if applicable
- SEXTA-SPECIFIC-CONFIRMED: gene function, if specifically known from M. sexta
- NON-SEXTA-SPEC-BLAST: gene function predicted from BLAST search of other organisms, not M. sexta specific
- ms_annot_IDs_goterms.csv - annotation information (GO terms, descriptions, etc.) for M. sexta collated from:
- ms_ncbi_dataset_allIDs.tsv and refseq_annotations_ms_GCF_014839805.1_complete.tsv - annotation information for M. sexta
- To rerun the GO search, use the above 2 files. Otherwise, ms_annot_IDs_goterms.csv already contains the results.
- MISSING DATA: empty or NA values indicate no hit (e.g., no GO term, no synonym for the gene, etc)
- COLUMN INFORMATION: same as ms_annot_IDs_goterms_man_annot.csv
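The wide count files store one column per sample, while the coldata files store one row per sample, so the two can be joined by sample identifier. The sketch below shows one way to reshape wide counts to long format and attach treatment information in Python. All values are made up for demonstration, and the name of the identifier column (`transcript_accession`) is an assumption about the count files; check the actual header before use.

```python
import csv
import io


def wide_to_long(wide_rows, id_column):
    """Turn one row per gene (one column per sample) into one row per
    gene-sample pair."""
    long_rows = []
    for row in wide_rows:
        gene = row[id_column]
        for sample, count in row.items():
            if sample == id_column:
                continue
            long_rows.append({"gene": gene, "sample": sample, "count": int(count)})
    return long_rows


# Made-up wide counts: rows are genes, columns are samples.
wide = io.StringIO(
    "transcript_accession,demo_1,demo_2\n"
    "geneA,10,0\n"
    "geneB,3,7\n"
)

# Made-up metadata keyed by sample, standing in for a coldata file.
coldata = {
    "demo_1": {"temp.treat": "25", "para.status": "NP"},
    "demo_2": {"temp.treat": "40", "para.status": "P"},
}

long_rows = wide_to_long(list(csv.DictReader(wide)), "transcript_accession")
for record in long_rows:
    # Attach the treatment information for this record's sample.
    record.update(coldata[record["sample"]])
```

Each resulting record carries the gene, the sample, the read count, and that sample's temperature and parasitism treatment, which is a convenient shape for plotting or summarising by group.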
3) Tricem_figs_stats_combined.Rmd
This script runs statistics and figure generation for the manuscript (all main figures and analyses, as well as supplemental). This includes immune assays and RNAseq.
Data inputs for this script:
- ffield_tot_immunedata.csv
- This file contains data for immune assays for the field population hosts
- COLUMN INFORMATION:
- sample.num: unique sample identifier
- temp.treat.x: treatment temperature in degrees Celsius
- para.status.x: parasitism status (NP = no parasitism, P = parasitized)
- origin.x: population origin. CC = Central Crops Research Station, RM = Rocky Mount Research Station, MF = Mason Farm preserve
- avg.field.cntrl.SD: standard deviation of area encompassing control non-injected side of implant (pixels), averaged from 2 images taken of the same implant rotated 90 degrees between each photograph.
- avg.field.cntrl.GV: gray value of area encompassing control non-injected side of implant (0 = pure black, 255 = pure white), averaged from 2 images taken of the same implant rotated 90 degrees between each photograph.
- avg.field.cntrl.area: area (pixels) encompassing control non-injected side of implant, averaged from 2 images taken of the same implant rotated 90 degrees between each photograph.
- temp.treat.y: duplicate column, artifact of merging the encapsulation and capsule melanization datasets. Treatment temperature in degrees Celsius
- para.status.y: duplicate column, artifact of merging the encapsulation and capsule melanization datasets. Parasitism status (NP = no parasitism, P = parasitized)
- origin.y: duplicate column, artifact of merging the encapsulation and capsule melanization datasets. Population origin. CC = Central Crops Research Station, RM = Rocky Mount Research Station, MF = Mason Farm preserve
- avg.field.trt.SD: standard deviation of encapsulated area (pixels), averaged from 3 images taken of the same implant rotated 90 degrees between each photograph.
- avg.field.trt.GV: gray value of encapsulated area (0 = pure black, 255 = pure white), averaged from 3 images taken of the same implant rotated 90 degrees between each photograph.
- avg.field.trt.area: encapsulated area (pixels), averaged from 3 images taken of the same implant rotated 90 degrees between each photograph.
- expt: experiment (lab or field population M. sexta)
- end.area: encapsulated area, calculated by subtracting the control area from the injected side of the implant to give just the area encompassed by encapsulating hemocytes
- end.SD: standard deviation of encapsulated area, calculated by subtracting the control SD from the injected side of the implant to give just the SD from encapsulating hemocytes
- end.GV: gray value of encapsulating hemocytes, calculated by subtracting the control GV from the injected side of the implant to give just the GV encompassed by encapsulating hemocytes
- lab_tot_immunedata.csv
- This file contains data for immune assays for the lab population hosts
- tricem_immune_data_combined.csv includes combined immune assay data from the lab and field populations. It was not used in this script but is provided in case users need it. The data are the same as in lab_tot_immunedata.csv and ffield_tot_immunedata.csv
- COLUMN INFORMATION: same as ffield_tot_immunedata.csv, except column names have 'lab' instead of 'field' to preserve data from each experiment
- Term2name files
- One for M. sexta and one for CcBV, used for GO enrichment (term2name_ms.csv and term2name_ccbv.csv, respectively)
- COLUMN INFORMATION: GO.IDs GO numeric identifiers; GO.Names description of GO term
- Term2gene files
- One for M. sexta and one for CcBV, used for GO enrichment (term2gene_ms.csv and term2gene_ccbv.csv, respectively)
- COLUMN INFORMATION: TERM GO IDs; GENE transcripts assigned to the associated GO terms
- Annotation files: ms_annot_IDs_goterms_man_annot.csv; ccbv_annot_IDs_goterms.csv (same file information as described for the above script)
- All contrasts merged files - these are generated by the above script, but are also included in the Dryad upload in case the user wants to skip that step
- ms_L_allContrasts_merged_ANNOTATED.csv
- ms_F_allContrasts_merged_ANNOTATED.csv
- CcBV_L_allContrasts_merged_ANNOTATED.csv
- CcBV_F_allContrasts_merged_ANNOTATED.csv
- COLUMN INFORMATION listed in above section
- R data files from above script
- "interactive_deseq.RData"
- "pairwise_deseq.RData"
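The derived columns end.area, end.SD and end.GV in the immune assay files are described above as the injected (trt) side of the implant minus the control (cntrl) side. As a sanity check when reusing the data, that subtraction can be recomputed from the averaged columns. The sketch below is a Python illustration with made-up numbers; the function name `derive_end_values` is hypothetical, and the `expt` argument reflects the note that lab files use 'lab' where field files use 'field' in column names.

```python
def derive_end_values(row, expt="field"):
    """Recompute end.area, end.SD and end.GV as treatment minus control.

    `row` maps column names to values for one sample; `expt` is "field"
    or "lab", matching the column naming of the respective file.
    """
    derived = {}
    for measure in ("area", "SD", "GV"):
        trt = float(row[f"avg.{expt}.trt.{measure}"])
        cntrl = float(row[f"avg.{expt}.cntrl.{measure}"])
        derived[f"end.{measure}"] = trt - cntrl
    return derived


# Made-up values for a single sample, for demonstration only.
example_row = {
    "avg.field.trt.area": "1500",
    "avg.field.cntrl.area": "1000",
    "avg.field.trt.SD": "30",
    "avg.field.cntrl.SD": "20",
    "avg.field.trt.GV": "90",
    "avg.field.cntrl.GV": "120",
}

end_values = derive_end_values(example_row)
```

In this example the derived gray value is negative, which simply means the injected side is darker than the control side on the 0 (black) to 255 (white) scale.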
Code/Software
R code is provided for cleaning the RNAseq data, for differential expression analysis, and for analysis and figure generation for the immune assays and RNAseq.
