A new genome of an African weakly electric fish (Campylomormyrus compressirostris, Mormyridae) indicates rapid gene family evolution in Osteoglossomorpha
Data files
Jan 25, 2023 version files 17.01 GB
- campylomormyrus.ccs.bam (14.44 GB)
- campylomormyrus.ccs.bam.pbi (10.84 MB)
- campylomormyrus.fasta (862.61 MB)
- campylomormyrus.gff (1.70 GB)
- KCNA.gene.fasta (25.50 KB)
- README.md (626 B)
Abstract
Background
Teleost fishes comprise more than half of the vertebrate species. Within teleosts, most phylogenies consider the split between Osteoglossomorpha and Euteleosteomorpha/Otomorpha as basal, preceded only by the derivation of the most primitive group of teleosts, the Elopomorpha. While Osteoglossomorpha are generally species-poor, the taxon contains the African weakly electric fish (Mormyroidei), which have radiated into numerous species. Within the mormyrids, the genus Campylomormyrus is mostly endemic to the Congo Basin. Campylomormyrus serves as a model to understand mechanisms of adaptive radiation and ecological speciation, especially with regard to its highly diverse species-specific electric organ discharges (EOD). Currently, there are few well-annotated genomes available for electric fish in general and mormyrids in particular. Our study aims to produce a high-quality genome and to use it to examine genome evolution in relation to other teleosts. This will facilitate further understanding of the evolution of osteoglossomorph fish in general and of electric fish in particular.
Results
A high-quality weakly electric fish (C. compressirostris) genome was produced from a single individual. The assembly has a size of 862 Mb and consists of 1,497 contigs with an N50 of 1,399 kb and a GC content of 43.69%. Gene prediction identified 34,492 protein-coding genes, which is more than in the two other available Osteoglossomorpha genomes, those of Paramormyrops kingsleyae and Scleropages formosus. A CAFE5 analysis of gene family evolution comparing 33 teleost fish genomes suggests an overall faster gene family turnover rate in Osteoglossomorpha than in Otomorpha and Euteleosteomorpha. Moreover, the ratios of expanded to contracted gene family numbers in Osteoglossomorpha are significantly higher than in the other two taxa, except for species that have undergone an additional genome duplication (Cyprinus carpio and Oncorhynchus mykiss). As potassium channel proteins are hypothesized to play a key role in EOD diversity among species, we put a special focus on them and manually curated 16 Kv1 genes. We identified a tandem duplication of the KCNA7a gene in the genome of C. compressirostris.
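The summary statistics quoted above (contig count, N50, GC content) can be recomputed from an assembly FASTA file. The following is a minimal sketch, not the pipeline used in the study; the function names `parse_fasta`, `n50`, and `gc_content` are hypothetical helpers introduced for illustration, and GC content is assumed to be computed over unambiguous A/C/G/T bases only.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record name: sequence} mapping."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def n50(lengths: Iterable[int]) -> int:
    """Return the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    return 0


def gc_content(seqs: Iterable[str]) -> float:
    """Fraction of G and C among unambiguous A/C/G/T bases (assumption:
    ambiguous bases such as N are excluded from the denominator)."""
    gc = at = 0
    for seq in seqs:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0
```

To apply this to the deposited assembly, one could open `campylomormyrus.fasta`, pass the file handle to `parse_fasta`, and then call `n50` on the sequence lengths and `gc_content` on the sequences. Dedicated assembly-statistics tools may handle edge cases (soft-masking, ambiguous bases) differently, so small deviations from the reported values are possible.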
Conclusions
We present the fourth genome of an electric fish and the third well-annotated genome for Osteoglossomorpha, enabling us to compare gene family evolution among major teleost lineages. Osteoglossomorpha appear to exhibit rapid gene family evolution, with more gene family expansions than contractions. The curated Kv1 gene family showed seven gene clusters, which is more than in other analyzed fish genomes outside Osteoglossomorpha. The KCNA7a gene, which encodes a potassium channel central to EOD production and modulation, is tandemly duplicated, which may be related to the diverse EODs observed among Campylomormyrus species.
Methods
Samples
Genomic DNA was isolated from available frozen fin clips, which had previously been taken in the course of another study from an adult C. compressirostris artificially bred and raised at the University of Potsdam, Germany. The CTAB protocol was used to obtain high molecular weight genomic DNA. The concentration and quality were further verified with a NanoDrop spectrophotometer and an Agilent TapeStation before sequencing.
Genome sequencing
For PacBio sequencing, a 15-kb SMRT cell DNA library was prepared and sequenced on a PacBio Sequel platform with one SMRT cell by a commercial company (Novogene). This produced 294 Gb of long reads, from which the HiFi reads were generated using circular consensus sequencing (CCS) mode (Pacific Biosciences, USA).
De novo genome assembly
The genome size and heterozygosity were estimated with GenomeScope 2.0 using a k-mer size of 32. The genome was then assembled with hifiasm, using the HiFi reads as input. The separated primary haplotigs were visualized in Bandage. This showed that some of the contigs contain two forks, which are likely homozygous breakpoints. Therefore, the program purge_dups was additionally applied for haplotig purging in the primary haplotigs. The mitochondrial DNA was assembled separately with MitoHiFi.
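The idea behind k-mer based genome size estimation can be illustrated with a simplified peak-based calculation: count all k-mers in the reads, find the most common multiplicity (the coverage peak), and divide the total number of k-mers by that peak. This is only a sketch of the principle. GenomeScope 2.0 fits a statistical model to the k-mer spectrum and also estimates heterozygosity, which this example does not attempt. The function names and the `min_depth` error cutoff are assumptions introduced for illustration, and real data would normally be counted with a dedicated k-mer counter using canonical k-mers.

```python
from collections import Counter
from typing import Iterable


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every k-mer (forward strand only) across all reads."""
    counts: Counter = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def kmer_spectrum(counts: Counter) -> Counter:
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def estimate_genome_size(spectrum: Counter, min_depth: int = 2) -> float:
    """Peak-based estimate: total k-mers divided by the peak multiplicity.

    Multiplicities below min_depth are ignored as likely sequencing errors
    (an assumed cutoff, not a value taken from the study).
    """
    usable = {depth: n for depth, n in spectrum.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak = max(usable, key=lambda depth: usable[depth])
    total = sum(depth * n for depth, n in usable.items())
    return total / peak


# Toy example: a 12-bp "genome" sequenced five times without errors.
toy_genome = "ACGTAGCTTGAC"
toy_reads = [toy_genome] * 5
toy_spectrum = kmer_spectrum(count_kmers(toy_reads, k=4))
toy_estimate = estimate_genome_size(toy_spectrum)
```

In the toy example every one of the nine distinct 4-mers occurs five times, so the estimate equals the number of k-mer positions in the sequence. With real HiFi reads the spectrum is broader and may show separate heterozygous and homozygous peaks, which is why a model-based tool is preferable in practice.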