Data from: De novo assembly and characterization of the Hucho taimen transcriptome

Cite this dataset

Tong, Guang-Xiang et al. (2018). Data from: De novo assembly and characterization of the Hucho taimen transcriptome [Dataset]. Dryad.


Taimen (Hucho taimen) is an ecologically and economically important species that is classified as vulnerable on the IUCN Red List of Threatened Species; however, limited genomic information is available for it. In addition to its use in gene expression profiling, RNA-Seq is a useful tool for obtaining genetic information and developing genetic markers for non-model species. In this study, we performed a comprehensive RNA-Seq analysis of taimen. We obtained 157 M clean reads (14.7 Gb) and used them to assemble de novo a high-quality transcriptome with an N50 of 1060 bp. In the assembly, 82% of the transcripts were annotated using several databases, and 14,666 of the transcripts contained a full open reading frame. The assembly covered 75% of the transcripts of Atlantic salmon and 57.3% of the protein-coding genes of rainbow trout. To investigate genome evolution, we performed a systematic comparative analysis across 11 teleosts, including 8 salmonids, and found 313 gene families unique to taimen. Using the Atlantic salmon and rainbow trout transcriptomes as the background, we identified 250 positively selected transcripts. The pathway enrichment analysis revealed a distinctive characteristic of taimen: it possesses more immune-related genes than Atlantic salmon and rainbow trout, and some of these genes have undergone strong positive selection. We also developed a pipeline for genotyping microsatellite markers in samples and used it to identify 24 polymorphic microsatellite markers for taimen. These data and tools are useful for studying conservation genetics, phylogenetics, and evolution among salmonids, and for selective breeding of the threatened taimen.
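For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: transcripts are sorted from longest to shortest, and the N50 is the length of the transcript at which the running total first reaches half of the summed assembly length. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from this dataset or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Sort longest first, then walk down until half the total is covered.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths invented for illustration only (not real taimen transcripts).
toy_lengths = [2000, 1500, 1060, 800, 400, 200]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembled FASTA file rather than typed in by hand.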

Usage notes