The Enterprise, a massive transposon carrying Spok meiotic drive genes
Data files
Jan 28, 2021 version files 50.43 MB
Abstract
The genomes of eukaryotes are full of parasitic sequences known as transposable elements (TEs). Most TEs studied to date are relatively small (50 – 12000 bp), but can contribute to very large proportions of genomes. Here we report the discovery of a putative giant tyrosine-recombinase-mobilized DNA transposon, Enterprise, from the model fungus Podospora anserina. Previously, we described a large genomic feature called the Spok block which is notable due to the presence of meiotic drive genes of the Spok gene family. The Spok block ranges from 110 kb to 247 kb and can be present in at least four different genomic locations within P. anserina, despite what is an otherwise highly conserved genome structure. We propose that the reason for its varying positions is that the Spok block is not only capable of meiotic drive, but is also capable of transposition. More precisely, the Spok block represents a unique case where the Enterprise has captured the Spoks, thereby parasitizing a resident genomic parasite to become a genomic hyperparasite. Furthermore, we demonstrate that Enterprise (without the Spoks) is found in other fungal lineages, where it can be as large as 70 kb. Lastly, we provide experimental evidence that the Spok block is deleterious, with detrimental effects on spore production in strains which carry it. This union of meiotic drivers and a transposon has created a selfish element of impressive size in Podospora, challenging our perception of how TEs influence genome evolution and broadening the horizons in terms of what the upper limit of transposition may be.
Methods
This dataset contains the genome assembly and annotation files used for our study on the Spok block elements (i.e. the Enterprises). The assemblies of the strains CBS237.71m, PaWa137, and PaWa139 were made from MinION Oxford Nanopore data assembled with Minimap2 v. 2.11 and Miniasm v. 0.2, and polished with Illumina HiSeq X data. For the strain PaWa131, only Illumina HiSeq X data are available; these were assembled with SPAdes v. 3.12.0. The Spok blocks were extracted from previously published genomes in the Spok paper dataset, where the assembly of CBS237.71m is also available.
Folders included:
- Genomes: genome assemblies and annotations. The assemblies are also available in the NCBI BioProject PRJNA685103.
- SpokBlocks: multifasta file with the sequences of all Spok blocks studied, along with their individual annotation files in GFF format. Note that the coordinates in the annotation files are relative to the blocks, not to their native genomes.
- Others: The repeat library of Podospora and the data file used for the fitness experiment.
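Because the SpokBlocks annotations use block coordinates, features can be cut directly out of the multifasta without reference to the source genomes. The following is a minimal Python sketch of that idea, not part of the dataset itself. It assumes that the first column of each GFF file matches the first word of the corresponding FASTA header, and that the files follow the usual nine-column, tab-separated GFF layout with 1-based inclusive coordinates. The function names and the file names in the usage part are hypothetical.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Feature(NamedTuple):
    seqid: str       # name of the block the feature lies on
    ftype: str       # feature type, e.g. "gene"
    start: int       # 1-based, inclusive (GFF convention)
    end: int         # 1-based, inclusive
    strand: str      # "+", "-" or "."
    attributes: str  # raw ninth column


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read a multifasta into {name: sequence}; name is the first word of the header."""
    seqs: Dict[str, str] = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def read_gff(lines: Iterable[str]) -> Iterator[Feature]:
    """Yield features from GFF lines, skipping comments and malformed rows."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        yield Feature(cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6], cols[8])


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def feature_sequence(seqs: Dict[str, str], feat: Feature) -> str:
    """Return the sequence of a feature given in block coordinates."""
    sub = seqs[feat.seqid][feat.start - 1:feat.end]
    if feat.strand == "-":
        # Reverse-complement features annotated on the minus strand.
        sub = sub.translate(_COMPLEMENT)[::-1]
    return sub


if __name__ == "__main__":
    # Hypothetical file names; replace with the actual files in SpokBlocks.
    with open("SpokBlocks.fa") as fa, open("SomeBlock.gff") as gff:
        blocks = read_fasta(fa)
        for feat in read_gff(gff):
            if feat.ftype == "gene" and feat.seqid in blocks:
                print(feat.attributes, len(feature_sequence(blocks, feat)))
```

Mapping the features back to genome coordinates would additionally require the position and orientation of each block in its native genome, which this sketch does not attempt.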
Usage notes
Most of these files are also available in a GitHub repository, where they are arranged to follow the analyses more closely. Here, the files are instead organized by type.