Data from: Parasite prevalence in a social host has colony-wide impacts on transcriptional activity and survival
Data files
Version of Oct 23, 2025; files total 316.16 MB
- BLAST_output_complete.tsv (193.78 MB)
- BLAST_output.tsv (39.81 MB)
- cestode_degscript_complete.R (20.86 KB)
- cestode_gene_count_matrix.csv (792.21 KB)
- colony_all.csv (805 B)
- demo_data.csv (14.39 KB)
- eggnog_output.tsv (1.70 MB)
- GO_genelevel_filtered.tsv (6.51 MB)
- GO_terms.tsv (5.61 MB)
- infected_degscript.R (23.52 KB)
- JAUOZQ01.1.annotation_BRAKER3-v1.0.gtf (49.46 MB)
- MM_v50qswcl.emapper.annotations.tsv (11.06 MB)
- parrate_gene_count_matrix.csv (1.92 MB)
- parrate_transcript_count_matrix.csv (5.41 MB)
- readfile_cestodes.csv (423 B)
- readfile_combined.csv (931 B)
- readfile_infected.csv (427 B)
- readfile_uninfected.csv (421 B)
- README.md (8.14 KB)
- Script_survival_analysis.R (5.24 KB)
- uninfected_degscript.R (17.99 KB)
Abstract
Parasites pose significant challenges not only to individual hosts but also to entire social groups. We investigated the effects of parasitism by the cestode Anomotaenia brevis on colonies of its intermediate host, the ant Temnothorax nylanderi. We evaluated changes in worker and queen survival rates and transcriptional activity in the fat body of infected and uninfected workers, as well as in the parasite itself, in relation to infected worker prevalence and colony size. Cestode-infected workers are known to exhibit a significantly extended lifespan compared to uninfected workers. Here, we demonstrate that the survival rates of infected workers, uninfected queens, and uninfected workers decrease with increasing infected worker prevalence and increase with colony size. Transcriptomic analysis revealed stress-related signatures in all workers, regardless of infection status, as infection prevalence increased. Moreover, gene expression patterns, particularly in uninfected workers, were strongly influenced by colony size. The transcriptional activity of the parasitic cestode also shifted with infected worker prevalence, highlighting the complex dynamics of host-parasite interactions. These results demonstrate that parasites in social species impose colony-wide impacts that extend beyond infected individuals, even in the absence of direct cross-nestmate infection risks. Moreover, the consequences of parasitism can be modulated by colony size.
Dataset DOI: 10.5061/dryad.8cz8w9h3b
Description of the data and file structure
Transcriptomics data
RNA was extracted using the Qiagen RNeasy extraction kit. RNA extraction failed for a single sample from an uninfected colony (colony F; table S1). We sent the remaining samples to Novogene (Cambridge, UK) for mRNA library preparation using a NEBNext Ultra RNA Library Prep Kit. For cestode samples, a SMARTer amplification step was performed to accommodate low RNA content. The libraries were sequenced on an Illumina NovaSeq 6000 in paired-end 150 bp mode, yielding a minimum of 9 Gb per sample. RNA-seq data for four of the uninfected workers, obtained using the same protocol, were also used in Sistermans et al. (2023).
We mapped the cestode RNA reads using STAR v2.7.10b with default parameters (Dobin et al., 2013) onto a concatenated reference assembly combining the cestode genome and the T. nylanderi genome assembly (Jongepier et al., 2022). This approach enabled us to filter out contaminant reads that mapped to the ant genome. We then constructed a gene count matrix using htseq-count (Zanini et al., 2022) with default parameters, which include discarding multiply mapped reads. For functional annotation, we searched the longest CDS isoform of each gene against the non-redundant invertebrate database (retrieved August 10, 2021) using DIAMOND BLAST (Buchfink et al., 2014), and obtained Gene Ontology (GO) terms for these isoforms using eggNOG-mapper against the eggNOG 5 database (Cantalapiedra et al., 2021).
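The logic of mapping against a concatenated reference can be illustrated with a minimal Python sketch. It is not part of the deposited pipeline: it only shows how alignments in SAM format could be sorted into host-derived, uniquely mapped cestode, and multiply mapped cestode reads. The contig prefix `JAUOZQ01` (taken from the assembly accession), the host contig name `ant_scaffold_1`, and the function names are assumptions made for illustration. The `NH:i:` tag is the optional SAM field that reports the number of alignments for a read.

```python
from collections import Counter


def classify_alignment(sam_line, cestode_prefix="JAUOZQ01"):
    """Classify one SAM record by the origin of its reference contig.

    cestode_prefix is an assumed naming convention for cestode contigs.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname = fields[2]
    # Bit 0x4 of the FLAG field marks an unmapped read.
    if flag & 0x4 or rname == "*":
        return "unmapped"
    if not rname.startswith(cestode_prefix):
        return "host"
    # NH:i:<n> gives the number of reported alignments for this read.
    n_hits = 1
    for tag in fields[11:]:
        if tag.startswith("NH:i:"):
            n_hits = int(tag[5:])
            break
    return "cestode_unique" if n_hits == 1 else "cestode_multi"


def summarize(sam_lines):
    """Count records per class, skipping SAM header lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        counts[classify_alignment(line)] += 1
    return counts


# Invented records: one unique cestode read, one host read,
# one multiply mapped cestode read, and one unmapped read.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tJAUOZQ010000001.1\t100\t255\t150M\t*\t0\t0\t*\t*\tNH:i:1",
    "r2\t0\tant_scaffold_1\t200\t255\t150M\t*\t0\t0\t*\t*\tNH:i:1",
    "r3\t256\tJAUOZQ010000002.1\t300\t3\t150M\t*\t0\t0\t*\t*\tNH:i:2",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
summary = summarize(example)
print(dict(summary))
```

In this sketch only the `cestode_unique` class would contribute to a gene count matrix; host and multiply mapped reads are set aside.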
For the ant RNA libraries, we first removed contaminant sequences from A. brevis, Homo sapiens, and Escherichia coli, as well as adaptors and vectors using fastqscreen v0.14.0 (Wingett & Andrews, 2018). The remaining sequences were trimmed down to 120 bp using fastp (Chen et al., 2018) with an N-base cut-off of 15 and mapped against a genome assembly of T. nylanderi (Jongepier et al., 2022) with hisat2 v2.1.0 (Kim et al., 2015) using the "--downstream-transcriptome-assembly" parameter. We then obtained a genome-guided transcriptome assembly and gene count matrix with stringtie v1.3.6 (Pertea et al., 2015). Functional annotation was performed on the transcriptome assembly as described above for the cestode.
GTF annotation
We generated a reference genome assembly for the cestode A. brevis. We extracted high-molecular-weight DNA from a pool of 29 cestodes isolated from a single infected worker. An ultra-low input library was prepared and sequenced using a PacBio Sequel II system. HiFi reads were called using DeepConsensus v1.2.0 (Baid et al., 2023) and assembled using Flye v2.9.2 (Kolmogorov et al., 2019), with the option "--pacbio-raw" to address sequence diversity within the sample, as well as the options "-i 4", "--no-alt-contigs" and "--scaffold". The resultant assembly was deposited in the NCBI database under accession number JAUOZQ01. The assembly was annotated with BRAKER v3.0.3 (Brůna et al., 2021) in native mode using A. brevis RNA-seq data (from this study, Sistermans et al., 2023, and Stoldt et al., 2021) and all nuclear protein sequences of the Cyclophyllid family available on NCBI (accessed on August 13, 2023).
Survival analysis
The survival analysis is a re-analysis of data from Beros et al. (2021).
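As a rough illustration of how survival data of this kind can be summarised, the following Python sketch computes a Kaplan-Meier survival curve from pairs of age (in days) and status, mirroring columns 3 and 4 of demo_data.csv (0 = alive at the end of the experiment, 1 = dead). The Kaplan-Meier estimator is used here only as a generic example and is not necessarily the method applied in Script_survival_analysis.R. The function name and the toy records are invented for illustration.

```python
def kaplan_meier(records):
    """Kaplan-Meier estimate from (age_in_days, status) pairs.

    status is 1 if the ant died at that age and 0 if it was still
    alive at the end of the experiment (right-censored).
    Returns a list of (time, survival_probability) at each death time.
    """
    records = sorted(records)
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        time = records[i][0]
        deaths = 0
        removed = 0
        # Group all individuals that share the same time point.
        while i < len(records) and records[i][0] == time:
            deaths += records[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((time, survival))
        # Deaths and censored individuals both leave the risk set.
        at_risk -= removed
    return curve


# Invented toy data: (age in days, status)
toy = [(10, 1), (20, 1), (20, 0), (35, 1), (50, 0)]
curve = kaplan_meier(toy)
for time, prob in curve:
    print(time, round(prob, 3))
```

Censored ants (status 0) reduce the number at risk without lowering the survival estimate, which is why the status column is needed alongside the age column.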
Files and variables
File: infected_degscript.R
Description: R transcriptomics script for infected ants
File: uninfected_degscript.R
Description: R transcriptomics script for uninfected ants
File: parrate_gene_count_matrix.csv
Description: Gene count matrix for all ants. The first column contains the gene names; all other columns contain the gene counts for each sample (sample names are in the first row).
File: readfile_combined.csv
Description: Readfile containing metadata for all ants. Column 1: sample name, column 2: proportion of infected ants in the colony from which the sample was taken, column 3: colony ID, column 4: colony size in number of total adult ants disregarding the queen, column 5: number of cestodes the ant contains, column 6: whether the RNA was isolated in 2020 (a) or 2022 (b), column 7: whether or not the ant was infected with the cestode.
File: readfile_infected.csv
Description: Readfile containing metadata for infected ants. Column 1: sample name, column 2: proportion of infected ants in the colony from which the sample was taken, column 3: colony ID, column 4: colony size in number of total adult ants disregarding the queen, column 5: number of cestodes the ant contains.
File: parrate_transcript_count_matrix.csv
Description: Transcript count matrix for all ants. The first column contains the transcript names; every other column contains the counts of the corresponding transcripts for one sample (sample names are in the first row).
File: readfile_uninfected.csv
Description: Readfile containing metadata for uninfected ants. Column 1: sample name, column 2: proportion of infected ants, column 3: colony ID, column 4: colony size, column 5: whether the RNA was isolated in 2020 (a) or 2022 (b).
File: GO_genelevel_filtered.tsv
Description: GO terms of all ant genes
File: eggnog_output.tsv
Description: Eggnog output for ants including GO terms
File: BLAST_output_complete.tsv
Description: BLAST output for all ant genes
File: cestode_degscript_complete.R
Description: R transcriptomics script for cestodes.
File: GO_terms.tsv
Description: GO terms for all cestode genes
File: BLAST_output.tsv
Description: BLAST output for all cestode genes
File: readfile_cestodes.csv
Description: Readfile containing metadata for all cestodes. Column 1: sample name, column 2: proportion of infected ants in the colony from which the sample was taken, column 3: colony ID, column 4: colony size in number of total adult ants disregarding the queen, column 5: number of cestodes sequenced in the sample.
File: cestode_gene_count_matrix.csv
Description: Gene count matrix for all cestode genes. First column contains all transcript names, every other column gene counts of the corresponding transcripts (sample names are in the first row).
File: MM_v50qswcl.emapper.annotations.tsv
Description: Eggnog output for cestodes including GO terms
File: colony_all.csv
Description: Colony information for the survival analysis (derived from Beros et al. 2021). Column 1: Colony ID, column 2: number of infected ants in the colony, column 3: total number of workers.
File: demo_data.csv
Description: Survival data for the survival analysis (derived from Beros et al. 2021). Column 1: colony ID of the sample, column 2: whether the ant was infected (infected), freshly emerged from the pupa (callow), an uninfected adult (old), or a queen (queen) at the time of the first observation, column 3: age at death of the ant in days, column 4: whether the ant was alive at the end of the experiment (0) or dead (1).
File: Script_survival_analysis.R
Description: R script survival analysis
File: JAUOZQ01.1.annotation_BRAKER3-v1.0.gtf
Description: GTF annotation for Anomotaenia brevis genome
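The count matrices and readfiles described above share sample names, so they can be cross-checked before any analysis. The following Python sketch (standard library only) shows one way to do this. It runs on small inline stand-ins rather than the deposited files: the gene IDs, sample names, and header labels are invented, and it is an assumption that the readfiles carry a header row.

```python
import csv
import io

# Toy stand-ins for parrate_gene_count_matrix.csv and readfile_combined.csv.
# All identifiers and header labels below are hypothetical.
COUNTS_CSV = """gene_id,S1,S2,S3
geneA,10,0,5
geneB,3,7,2
"""
READFILE_CSV = """sample,prevalence,colony,colony_size
S1,0.10,A,50
S2,0.25,B,80
S3,0.00,C,120
"""


def read_count_matrix(handle):
    """Return (sample_names, {gene: [counts per sample]})."""
    reader = csv.reader(handle)
    header = next(reader)
    samples = header[1:]  # first column holds the gene names
    counts = {row[0]: [int(v) for v in row[1:]] for row in reader if row}
    return samples, counts


def read_metadata(handle):
    """Return {sample_name: row_dict}, keyed on the first metadata column."""
    return {row["sample"]: row for row in csv.DictReader(handle)}


samples, counts = read_count_matrix(io.StringIO(COUNTS_CSV))
metadata = read_metadata(io.StringIO(READFILE_CSV))

# Every sample in the count matrix should have a metadata entry.
missing = [s for s in samples if s not in metadata]
if missing:
    raise ValueError(f"Samples without metadata: {missing}")

# Total counts per sample, a quick sanity check on library size.
library_sizes = {
    sample: sum(row[i] for row in counts.values())
    for i, sample in enumerate(samples)
}
print(library_sizes)
```

To run this against the deposited files, the `io.StringIO` objects would be replaced by opened file handles, and the key column name adjusted to whatever the readfile actually uses.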
Code/software
All data were analysed using R (version 4.3.2 for the transcriptomics analysis and version 4.2.3 for the survival analysis). Most other files were therefore also opened in R. The exception is the GTF file containing the annotation, which was used for the pre-processing of raw reads of Anomotaenia brevis.
Access information
Other publicly accessible locations of the data:
- Raw RNA-seq reads are available on NCBI under BioProject ID PRJNA1246159
Survival data was derived from:
- Beros, S., Lenhart, A., Scharf, I., Negroni, M. A., Menzel, F., & Foitzik, S. (2021). Extreme lifespan extension in tapeworm-infected ant workers. Royal Society Open Science, 8(5). https://doi.org/10.1098/rsos.202118
