Data from: Experimental horizontal transfer of phage-derived genes to Drosophila confers innate immunity to parasitoids
Data files
Jan 02, 2025 version files: 5.07 GB
- Fig1_rawdata.zip (5.07 GB)
- Fig2_rawdata.zip (26.84 KB)
- Fig5_rawdata.zip (10.79 KB)
- README.md (5.51 KB)
Abstract
Metazoan parasites have played an outsized role in shaping innate immunity in animals. Insects are excellent models for illuminating the strategies that animals evolved to neutralize such enemies, including nematodes and parasitoid wasps. One such strategy relies on endosymbioses between insects and bacteria that express phage-encoded toxins as well as horizontal transfer of the genes that encode the toxins to insects. Here, we used genome editing in Drosophila melanogaster to recapitulate the evolution of two of these toxin genes — cytolethal distending toxin B (cdtB) and apoptosis inducing protein of 56kDa (aip56) — that were horizontally transferred likely from phages of endosymbiotic bacteria to insects millions of years ago. We found that a cdtB::aip56 fusion gene (fusionB), which is conserved in D. ananassae subgroup species, dramatically promoted fly survival and suppressed parasitoid wasp development when heterologously expressed in D. melanogaster immune tissues. We found that FusionB was a functional nuclease and was secreted into the host hemolymph where it targeted the parasitoid embryo’s serosal tissue. Although the killing mechanism remains unknown, when expressed ubiquitously, fusionB resulted in delayed development of late stage fly larvae and eventually killed pupating flies. These results point to the salience of regulatory constraint in mitigating autoimmunity during the domestication process following horizontal transfer. Our findings demonstrate how horizontal gene transfer can instantly provide new, potent innate immune modules in animals.
README: Experimental horizontal transfer of phage-derived genes to Drosophila confers innate immunity to parasitoids
https://doi.org/10.5061/dryad.q573n5tsh
Description of the data and file structure
This repository contains raw data for the genomic analyses and count-based experiments described in the manuscript "Experimental horizontal transfer of phage-derived genes to Drosophila confers innate immunity to parasitoids".
Authors: Rebecca L. Tarnopol*, Josephine Tamsil, Gyöngyi Cinege, Ji Heon Ha, Kirsten I. Verster, Edit Ábrahám, Lilla B. Magyar, Bernard Y. Kim, Susan L. Bernstein, Zoltán Lipinszki, István Andó, Noah K. Whiteman
Please cite the associated manuscript: https://doi.org/10.1016/j.cub.2024.11.071
Files and variables
File: Fig1_rawdata.zip
Description: Contains raw data for generation of Figure 1 and related supplementary figures
> ananassae_final_wholegenomes.zip contains all files used for genome annotation using Cactus, CAT, and manual curation at the HGT locus, used for Figure 1. Contents of the zip file are:
> reference (D. ananassae reference genome for Cactus/CAT annotation)
> genomes (genome sequences for all ananassae subgroup spp. tested)
> commands.sh (Cactus and CAT commands for genome annotation)
> ananassae.txt (contains species tree in newick format and filenames of genome assemblies)
> ananassae.treefile (species tree)
> ananassae_CAT.config (code for genome annotation pipeline)
> ananassae-001.hal (whole genome alignments from Cactus, can be viewed with the Comparative Genomics Toolkit)
> HGT Loci gff files (curated annotations of the HGT loci in each of the *ananassae* species, contains only contigs with *cdtB* or *fusion* genes)
> proteinphylogenies contains all files used for generating the protein trees seen in Fig 1A and Fig S1. Contents of the zip file are:
> 240313_allcdtbs_NEW.aln (untrimmed CdtB protein alignment)
> 240314_allalnaip.aln (untrimmed AIP56 protein alignment)
> 240313_allcdtbs_NEW_kpicgappy.fa (trimmed CdtB protein alignment)
> 240314_allalnaip_kpicsmartgap.fa (trimmed AIP56 protein alignment)
> 240313_IQtree_allcdtbkpicgappy_VTG4 (IQTree output files for CdtB protein phylogeny)
> 240315_allalnaip_kpicsmartgap tree (IQTree output files for AIP56 protein phylogeny)
> cdtbtree_midpointrooted.nwk (midpoint rooted CdtB protein phylogeny in newick format)
> aip56tree_midpointrooted.nwk (midpoint rooted AIP56 protein phylogeny in newick format)
> mergedcdtdf.csv (state file for mapping states to CdtB phylogeny)
> mergedaipdf.csv (state file for mapping states to AIP56 phylogeny)
> cdtbaip56genetrees.R (R script to recreate phylogenies in Fig 1 and Fig S1)
> FigS1genetrees.pdf (PDF version of trees in Fig S1)
> hgt_alignments.zip contains the alignment files for the HGT loci shown in Figs S2-3. These alignments were displayed untrimmed to capture amino acid diversity among these gene products. Contents of the zip file are:
> AIP56alignment.fasta
> CdtBalignment.fasta
> alphafold.zip contains the raw results from ColabFold runs on *E. coli* CDT, FusionA, and FusionB as seen in Figure S4. Contents of this folder are:
> EcoliCDT_55775.result.zip (*E. coli* CDT modeling)
> FusA_82091.result.zip (FusionA modeling)
> FusB_ea177.result.zip (FusionB modeling)
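Because Fig1_rawdata.zip bundles further archives (for example hgt_alignments.zip and alphafold.zip), it can be convenient to list the contents before extracting anything. The following is a minimal Python sketch for quick inspection, not part of the repository, whose own scripts are in R. The function name `list_nested_zip` and the `!/` separator are hypothetical choices. Nested archives are read into memory here, so for the multi-gigabyte genome archive it is probably better to extract to disk first.

```python
import io
import zipfile


def list_nested_zip(source, prefix=""):
    """Return member paths of a zip archive, descending into nested .zip members.

    `source` is a path or a seekable binary file object. Each nested archive
    is loaded fully into memory, so this only suits modestly sized archives.
    """
    paths = []
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            full_name = prefix + info.filename
            paths.append(full_name)
            # Recurse into members that look like zip archives themselves.
            if not info.is_dir() and info.filename.lower().endswith(".zip"):
                inner = io.BytesIO(archive.read(info))
                paths.extend(list_nested_zip(inner, prefix=full_name + "!/"))
    return paths
```

Calling `list_nested_zip("Fig1_rawdata.zip")` would return one path per member, with members of inner archives shown after the `!/` separator.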
File: Fig2_rawdata.zip
Description: Raw count data and code for Figure 2 and associated supplementary figures. Empty/missing data are indicated with NA values.
> Fig2metadata.txt (description of data contained in each variable in .csv files to generate Figure 2)
> parasitizationdata_clean.csv (raw counts per vial for parasitization experiments)
> Lb_alivedead.csv (raw counts per larva for wasp survivorship experiment in Fig 2E)
> parastization_clean.R (code for analysis of all parasitization experiments, uses parasitizationdata_clean.csv as input)
> waspsurvival.R (code for analysis for Fig 2E, uses Lb_alivedead.csv as input)
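The count tables mark empty or missing values as NA. As a minimal sketch for inspecting them outside R, the Python function below reads a CSV and maps NA cells to `None`. The name `read_counts` is hypothetical and not part of the repository. No column names are assumed, since Fig2metadata.txt is the authoritative description of the variables.

```python
import csv


def read_counts(lines, missing="NA"):
    """Read CSV text into a list of dicts, mapping `missing` cells to None.

    `lines` is an open text file or any iterable of CSV lines. The first line
    is treated as the header. All other values are kept as strings.
    """
    reader = csv.DictReader(lines)
    rows = []
    for row in reader:
        rows.append({key: (None if value == missing else value)
                     for key, value in row.items()})
    return rows
```

For example, `read_counts(open("parasitizationdata_clean.csv", newline=""))` would return one dictionary per vial. Values stay as strings, so counts need converting with `int` before any arithmetic.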
File: Fig5_rawdata.zip
Description: Raw data and code for Figure 5 and associated supplementary figures.
> Fig5metadata.txt (description of data contained in each variable for the .csv files to generate Figure 5)
> Fig5rawdata.csv (raw counts per vial for Actin-GAL4 experiments)
> 231203 FusB Development Timecourse - development time.csv (raw development time data for Actin-GAL4 experiments)
> 231203 FusB Development Timecourse - pupariation rate.csv (raw pupariation rate data for Actin-GAL4 experiments)
> Fig5a_analysis.R (script to recreate Figure 5A and associated statistical analysis, uses Fig5rawdata.csv as input)
> developmenttime.R (script to recreate Figures 5B-C and associated statistical analyses, uses 231203 FusB Development Timecourse .csv files as inputs)
Code/software
R scripts to generate the phylogenies and graphs in Figures 1, 2, and 5 are provided in the respective .zip files. Scripts were built in R v4.3.2.
Access information
Data was derived from the following sources:
- Genomes used for analysis were derived from the following study: https://doi.org/10.1371/journal.pbio.3002697