Transcriptome-wide comparisons and virulence gene polymorphisms of host-associated genotypes of the cnidarian parasite Ceratonova shasta in salmonids
Data files
Dec 14, 2020 version; files: 2.43 GB
Abstract
Ceratonova shasta is an important myxozoan pathogen affecting the health of salmonid fishes in the Pacific Northwest of North America. C. shasta exists as a complex of host-specific genotypes, some with low to moderate virulence, and one that causes a profound, lethal infection in susceptible hosts. High throughput sequencing methods are powerful tools for discovering the genetic basis of these host/virulence differences, but deep sequencing of myxozoans has been challenging due to extremely fast molecular evolution of this group, yielding strongly divergent sequences that are difficult to identify, and unavoidable host contamination. We designed and optimized different bioinformatic pipelines to address these challenges. We obtained a unique set of comprehensive, host-free myxozoan RNA-seq data from C. shasta genotypes of varying virulence from different salmonid hosts. Analyses of transcriptome-wide genetic distances and maximum likelihood multigene phylogenies elucidated the evolutionary relationship between lineages and demonstrated the limited resolution of the established Internal Transcribed Spacer marker for C. shasta genotype identification, as this marker fails to differentiate between biologically distinct genotype II lineages from coho salmon and rainbow trout. We further analyzed the datasets based on polymorphisms in two gene groups related to virulence: cell migration and proteolytic enzymes including their inhibitors. The developed SNP-calling pipeline identified polymorphisms between genotypes and demonstrated that variations in both motility and protease genes were associated with different levels of virulence of C. shasta in its salmonid hosts. The prospective use of proteolytic enzymes as promising candidates for targeted interventions against myxozoans in aquaculture is discussed. 
We developed host-free transcriptomes of a myxozoan model organism from strains that exhibited different degrees of virulence, as a unique source of data that will foster functional gene analyses and serve as a base for the development of potential therapeutics for efficient control of these parasites.
Usage notes
Host filtering at read level – Ceratonova shasta and neither reads lists
Lists of reads that mapped to Ceratonova shasta and of reads that mapped neither to the host (Oncorhynchus mykiss) nor to the parasite (C. shasta). The reads belong to three genotypes (I, IIC, IIR) and five libraries (I, IIC, IIR-RBTJ7, IIR-RBTC16, IIR-RBT6).
Lists of reads, 20 .list files.
I_csR1.list
I_csR2.list
I_neitherR1.list
I_neitherR2.list
IIC_csR1.list
IIC_csR2.list
IIC_neitherR1.list
IIC_neitherR2.list
IIR_RBTJ7_csR1.list
IIR_RBTJ7_csR2.list
IIR_RBTJ7_neitherR1.list
IIR_RBTJ7_neitherR2.list
IIR_RBTC16_csR1.list
IIR_RBTC16_csR2.list
IIR_RBTC16_neitherR1.list
IIR_RBTC16_neitherR2.list
IIR_RBT6_csR1.list
IIR_RBT6_csR2.list
IIR_RBT6_neitherR1.list
IIR_RBT6_neitherR2.list
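The read lists above can be used to pull the corresponding records out of a raw FASTQ file. The following is a minimal sketch, not the pipeline used to build the dataset. It assumes each .list file holds one read identifier per line (with or without a leading "@") and that the FASTQ file uses four lines per record; both are assumptions about formats not stated here. The function names load_read_ids and filter_fastq are hypothetical.

```python
from typing import Iterable, Iterator, Set, Tuple

FastqRecord = Tuple[str, str, str, str]


def _read_id(header: str) -> str:
    """Return the identifier from a header line: drop '@' and any description."""
    fields = header.strip().lstrip("@").split()
    return fields[0] if fields else ""


def load_read_ids(lines: Iterable[str]) -> Set[str]:
    """Collect read identifiers from a .list file (assumed: one ID per line)."""
    ids = set()
    for line in lines:
        name = _read_id(line)
        if name:
            ids.add(name)
    return ids


def filter_fastq(fastq_lines: Iterable[str], keep: Set[str]) -> Iterator[FastqRecord]:
    """Yield the four-line FASTQ records whose identifier is in `keep`."""
    it = iter(fastq_lines)
    # zip over the same iterator groups the input into blocks of four lines;
    # an incomplete trailing block is silently dropped.
    for header, seq, plus, qual in zip(it, it, it, it):
        if _read_id(header) in keep:
            yield (header.strip(), seq.strip(), plus.strip(), qual.strip())
```

For example, opening I_csR1.list and the matching R1 FASTQ file as text handles and passing them to load_read_ids and filter_fastq would yield only the C. shasta reads of library I. If the identifiers in the lists carry pair suffixes such as "/1", the comparison in _read_id would need to be adjusted accordingly.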
Genotype IIR (RBT6) assembled transcriptomes
Two transcriptomes assembled from the IIR RBT6 library: one from the C. shasta reads only, and one from the C. shasta reads plus the "neither" reads.
2 .fasta files
IIR_RBT6_cs_only.fasta
IIR_RBT6_cs_neither.fasta
IIR RBT6 assemblies filtered for host and other contaminants (other microorganisms)
Assembled transcriptomes filtered with Taxon ID annotations to remove sequences from the host (Oncorhynchus mykiss) and from other microorganisms (viruses, fungi, bacteria).
2 .fasta files
IIR_RBT6_cs_only_extra_clean.fasta
IIR_RBT6_cs_neither_extra_clean.fasta
Reference assembly of longest representatives (IIR RBT6, cs+neither) used for SNP analyses
1 .fasta file
IIR_RBT6_cs_neither_extra_clean_woreps.fasta
SNPs tables
Genotypes called from nucleotide frequencies with a minimum coverage of 5 reads and a heterozygosity threshold of 0.25.
5 .tab files
I_genotypes.tab
IIC_genotypes.tab
IIR_RBTJ7_genotypes.tab
IIR_RBTC16_genotypes.tab
IIR_RBT6_genotypes.tab
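To illustrate what calling a genotype from nucleotide frequencies with a coverage minimum and a heterozygosity threshold can look like, here is a minimal sketch. It is not the authors' SNP-calling pipeline. It assumes that a site below the minimum coverage is left uncalled and that a site is called heterozygous when the second most frequent nucleotide reaches the threshold fraction of the coverage; that reading of the threshold is an assumption. The function name call_genotype and the "A/G" output notation are hypothetical and do not describe the layout of the .tab files.

```python
from typing import Dict, Optional


def call_genotype(
    counts: Dict[str, int],
    min_coverage: int = 5,
    het_threshold: float = 0.25,
) -> Optional[str]:
    """Call a diploid-style genotype at one site from per-nucleotide read counts.

    Returns None when coverage is below `min_coverage` (site left uncalled).
    """
    coverage = sum(counts.values())
    if coverage < min_coverage:
        return None

    # Rank nucleotides by count (descending), breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    major = ranked[0][0]

    if len(ranked) > 1:
        minor, minor_count = ranked[1]
        # Assumed rule: heterozygous if the minor allele frequency
        # reaches the threshold.
        if minor_count > 0 and minor_count / coverage >= het_threshold:
            return "/".join(sorted((major, minor)))

    return f"{major}/{major}"
```

Under these assumptions, the stricter tables would correspond to calling the same function with min_coverage=20 and het_threshold=0.1.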
Genotypes called from nucleotide frequencies with a minimum coverage of 20 reads and a heterozygosity threshold of 0.1.
5 .tab files
I_genotypes2000101.tab
IIC_genotypes2000101.tab
IIR_RBTJ7_genotypes2000101.tab
IIR_RBTC16_genotypes2000101.tab
IIR_RBT6_genotypes2000101.tab
Phylogenomic and SNPs-based phylogenetic alignments
9 .nex and 51 .fasta files
cmkb_permissive.nex
cmkb_permissive_filtered.nex
combined.nex
combined_cmkb_curated.nex
combined_merops_curated.nex
combined_only_available_all.nex
merops_permissive.nex
merops_permissive_filtered.nex
phylogenomic_tree.nex
51 individual gene alignments for phylogenomics, in .fasta format