Data from: De novo assembly and characterization of the Hucho taimen transcriptome

Tong G, Xu W, Zhang Y, Zhang Q, Yin J, Kuang Y

Date Published: December 27, 2017

DOI: https://doi.org/10.5061/dryad.9gd3n

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0).

Title Trinotate annotation
Downloaded 1 time
Description The taimen transcriptome was annotated using Trinotate (https://trinotate.github.io/) following the Trinotate guidelines. The NR, UniProt/Swiss-Prot, and Pfam databases were used.
Download Trinotate.tsv.zip (24.26 Mb)
Title Interproscan annotation
Downloaded 1 time
Description InterProScan annotation of the taimen transcriptome. A total of 79,800 transcripts were annotated using InterProScan.
Download Interproscan.tsv.zip (15.31 Mb)
Title Gene Ontology annotation
Downloaded 1 time
Description Sequences with significant hits in the UniProt or Pfam databases were assigned GO terms using the Trinotate package, and GO terms were also assigned using InterProScan. In total, 72,728 transcripts were assigned to 15,107 GO terms, including 10,185 biological process terms, 1,429 cellular component terms, and 3,493 molecular function terms.
Download GO.txt.zip (2.690 Mb)
Title KEGG annotation
Downloaded 1 time
Description A KEGG pathway analysis was performed using GhostKOALA. A total of 51,698 transcripts were assigned to 8,052 KEGG ortholog groups.
Download KEGG.txt.zip (873.9 Kb)
Title eggNOG annotation
Downloaded 2 times
Description COG functional categories were annotated using eggNOG-mapper (Huerta-Cepas et al., 2017); 72,605 putative proteins were annotated.
Download eggNOG-mapper.tsv.zip (7.547 Mb)
Title Assembly transcriptome
Downloaded 3 times
Description The transcriptome sequences were assembled using the Trinity package. Before assembly, low-quality reads were filtered from the raw reads using Trimmomatic with the parameters LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:50. The clean reads from the two pooled libraries were merged and normalized in silico using the Trinity package with default parameters to reduce running time and memory consumption. A k-mer size of 25 and a minimum k-mer depth of two were used for assembly with the Trinity package. The contigs produced by Trinity were then passed to the TGI Clustering Tool (version 2.1) to handle alternatively spliced and redundant sequences. The raw RNA-Seq reads and assembled transcripts were deposited in the European Nucleotide Archive under the project ID PRJEB19675, with accession numbers HAGJ01000001 to HAGJ01190473 for the assembled transcripts.
Download transcriptome.embl.dat.zip (68.29 Mb)
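To illustrate what the Trimmomatic parameters quoted in the assembly description ask for, the following is a minimal Python sketch of the four steps applied to a list of per-base Phred qualities. It is a simplified illustration, not Trimmomatic's implementation: the function name `trim_read` is hypothetical, and the sliding-window step here simply cuts at the start of the first window whose mean quality falls below the threshold, which may differ in detail from the real tool.

```python
def trim_read(quals, leading=20, trailing=20, window=4, window_q=20, minlen=50):
    """Return (start, end) of the retained slice of a read, or None if dropped.

    quals is a list of per-base Phred quality scores. The steps are applied in
    the order LEADING, TRAILING, SLIDINGWINDOW, MINLEN (simplified sketch).
    """
    start, end = 0, len(quals)

    # LEADING:20 -> remove bases from the 5' end while quality is below 20
    while start < end and quals[start] < leading:
        start += 1

    # TRAILING:20 -> remove bases from the 3' end while quality is below 20
    while end > start and quals[end - 1] < trailing:
        end -= 1

    # SLIDINGWINDOW:4:20 -> scan 4-base windows from the 5' end and cut at the
    # first window whose mean quality is below 20 (assumed cut position)
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < window_q:
            end = i
            break

    # MINLEN:50 -> discard the read if fewer than 50 bases remain
    if end - start < minlen:
        return None
    return start, end
```

For example, a read with two poor bases at the start, 60 good bases, and three poor bases at the end keeps only the 60 good bases, while a 40-base read is discarded regardless of quality.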
Title SNP.vcf
Downloaded 1 time
Description Clean reads were first mapped to the transcripts using Bowtie2, and SNPs were then called using SAMtools. Raw SNPs were filtered with VCFtools (Danecek et al., 2011) using a minimum depth of 4 and a minimum quality of 20, and SNPs clustered within 50 bp of each other were also filtered out. SNPs were annotated using SnpEff (http://snpeff.sourceforge.net/).
Download SNP.vcf.zip (12.49 Mb)
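The SNP filtering criteria above can be sketched in Python as follows. This is an illustration only and does not reproduce the VCFtools commands the authors ran. It assumes that the depth and quality values are retention thresholds (SNPs below them are discarded), that the 50 bp clustering rule is applied per transcript after the depth and quality filter, and that every SNP with a neighbor within 50 bp is removed. The function name `filter_snps` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def filter_snps(snps, min_depth=4, min_qual=20.0, cluster_bp=50):
    """Filter SNPs given as (transcript, position, quality, depth) tuples.

    Assumed rules: keep SNPs with depth >= min_depth and quality >= min_qual,
    then drop any SNP that has another passing SNP within cluster_bp on the
    same transcript.
    """
    passed = defaultdict(list)
    for transcript, pos, qual, depth in snps:
        if depth >= min_depth and qual >= min_qual:
            passed[transcript].append((pos, qual, depth))

    kept = []
    for transcript in sorted(passed):
        records = sorted(passed[transcript])
        for i, (pos, qual, depth) in enumerate(records):
            # Distance to the nearest neighbors on the same transcript
            near_prev = i > 0 and pos - records[i - 1][0] <= cluster_bp
            near_next = (i + 1 < len(records)
                         and records[i + 1][0] - pos <= cluster_bp)
            if not (near_prev or near_next):
                kept.append((transcript, pos, qual, depth))
    return kept
```

With this sketch, two SNPs at positions 100 and 120 of the same transcript would both be removed as a cluster, while an isolated SNP at position 500 with sufficient depth and quality would be kept.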
Title ORF prediction
Downloaded 1 time
Description TransDecoder (https://transdecoder.github.io/) was used to predict open reading frames (ORFs) and translate them into proteins, and homology searches against the Pfam and UniProt databases were performed as supporting evidence for the ORFs. ORFs encoding fewer than 30 amino acids were discarded.
Download TransDecoder.zip (79.59 Mb)
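The 30-amino-acid length cutoff described above can be checked on any peptide FASTA file with a short Python sketch like the one below. It is not part of TransDecoder; the helper names `parse_fasta` and `drop_short_orfs` are hypothetical, and the sketch assumes that a trailing `*` marks a stop codon and does not count toward the protein length.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {record_id: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def drop_short_orfs(records, min_aa=30):
    """Keep only proteins with at least min_aa residues (stop '*' excluded)."""
    return {name: seq for name, seq in records.items()
            if len(seq.rstrip("*")) >= min_aa}
```

A typical use would be `drop_short_orfs(parse_fasta(open(path).read()))`, where `path` points to a peptide FASTA file of the reader's choosing.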
Title Microsatellite Primers
Downloaded 1 time
Description Sputnik software was used to search for di-, tri-, tetra-, penta-, and hexa-nucleotide motif SSRs. Primers were designed using the Primer3 package. After aligning the amplicon sequences to the Atlantic salmon genome with BLAT, primers were chosen if their amplicons mapped to the Atlantic salmon genome with identities >70% and spanned distances close to the length of the amplicons.
Download SSR.primers.txt.zip (127.4 Kb)
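As a rough illustration of what searching for di- to hexa-nucleotide SSRs involves, the Python sketch below scans a sequence for perfect tandem repeats. It is not Sputnik's algorithm, which scores imperfect repeats. The function names and the minimum repeat count of 5 are assumptions chosen for the example, not values taken from the dataset.

```python
def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    for d in range(1, len(motif)):
        if len(motif) % d == 0 and motif[:d] * (len(motif) // d) == motif:
            return False
    return True


def find_ssrs(seq, min_repeats=5, motif_lengths=(2, 3, 4, 5, 6)):
    """Return sorted (start, motif, repeat_count) for perfect tandem repeats.

    min_repeats is a hypothetical threshold. Motifs such as "AA" or "ACAC"
    are skipped because they reduce to a shorter repeat unit.
    """
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            if not set(motif) <= set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            # Count how many consecutive copies of the motif start at i
            repeats = 1
            while seq[i + repeats * k:i + (repeats + 1) * k] == motif:
                repeats += 1
            if repeats >= min_repeats:
                hits.append((i, motif, repeats))
                i += repeats * k  # jump past the repeat tract
            else:
                i += 1
    return sorted(hits)
```

Each hit reports the 0-based start position, the repeat unit, and the number of copies, which is the kind of information needed before designing flanking primers.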
Title Sequences of index and primers
Downloaded 1 time
Description This package contains four files: "forward_index.txt" and "reverse_index" hold the index sequences for demultiplexing reads to samples, "primers.txt" holds the primer sequences for classifying reads to loci, and "sample_config.txt" is the index configuration for the samples. These files were used to genotype 32 taimen samples collected from the Hutou section of the Wusuli River (133°40′17″ E, 45°58′50″ N). The raw reads, sequenced on the Illumina HiSeq 2500 platform in 250 bp paired-end mode, were deposited in the European Nucleotide Archive under the project ID PRJEB19675 with accession number ERR2029723.
Download Index_and_primers_for_genotype.zip (2.395 Kb)
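The role of the forward and reverse index files in assigning reads to samples can be sketched in Python as below. This is a hypothetical illustration and does not reflect the actual file formats or the behavior of the DeMultiIndex program: it assumes that each read of a pair begins with its index sequence, that matching is exact, and that the sample configuration maps a (forward index name, reverse index name) pair to a sample name. All names and sequences in the example are invented.

```python
def assign_sample(read1, read2, forward_index, reverse_index, sample_config):
    """Assign a read pair to a sample by exact prefix match of both indexes.

    forward_index / reverse_index: dict of {index_name: index_sequence}
    sample_config: dict of {(forward_name, reverse_name): sample_name}
    Returns the sample name, or None if either index is not recognized or the
    index combination is not listed in sample_config.
    """
    fwd = next((name for name, idx in forward_index.items()
                if read1.startswith(idx)), None)
    rev = next((name for name, idx in reverse_index.items()
                if read2.startswith(idx)), None)
    return sample_config.get((fwd, rev))


# Invented example data, for illustration only
forward_index = {"F1": "ACGTAC", "F2": "TGCATG"}
reverse_index = {"R1": "GATCGA", "R2": "CTAGCT"}
sample_config = {("F1", "R1"): "sample_01", ("F2", "R2"): "sample_02"}

sample = assign_sample("ACGTACTTTTGGGG", "GATCGACCCCAAAA",
                       forward_index, reverse_index, sample_config)
```

A real pipeline would typically also tolerate sequencing errors in the index and then use the primer sequences to sort each sample's reads by locus.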
Title Pipeline for characterizing polymorphism and defining genotype of microsatellite markers
Downloaded 3 times
Description This package contains the DeMultiIndex and SSRGeno binaries (for 64-bit Linux and 64-bit macOS), an R script for drawing allele-depth bar plots, and a manual.
Download microsatellite_pipeline.zip (1.136 Mb)

When using this data, please cite the original publication:

Tong G, Xu W, Zhang Y, Zhang Q, Yin J, Kuang Y (2017) De novo assembly and characterization of the Hucho taimen transcriptome. Ecology and Evolution, online in advance of print. https://doi.org/10.1002/ece3.3735

Additionally, please cite the Dryad data package:

Tong G, Xu W, Zhang Y, Zhang Q, Yin J, Kuang Y (2017) Data from: De novo assembly and characterization of the Hucho taimen transcriptome. Dryad Digital Repository. https://doi.org/10.5061/dryad.9gd3n