Data for: Heterosigma akashiwo transcriptome gene annotations

Cite this dataset

Ueki, Shoko; Sato, Masanao (2023). Data for: Heterosigma akashiwo transcriptome gene annotations [Dataset]. Dryad. https://doi.org/10.5061/dryad.m0cfxpp56

Abstract

Heterosigma akashiwo is a cosmopolitan, unicellular eukaryotic alga (class: Raphidophyceae) that produces fish-killing blooms. There is substantial scientific and practical interest in the ecophysiological characteristics that determine its bloom dynamics and its adaptation to broad climate zones. Well-annotated genomic and genetic sequence information enables researchers to characterize organisms using modern molecular technology. In the present study, we sequenced H. akashiwo RNA and performed a de novo transcriptome assembly from 84,693,530 high-quality, deduplicated short reads. The reads were assembled with the Trinity assembler, yielding 144,777 contigs with an N50 of 1,085 bp. The raw data were deposited in the NCBI SRA database (BioProject PRJDB6241 and PRJDB15108), and the assemblies are available in the NCBI TSA database (ICRV01). In total, 60,877 open reading frames of 150 bp or longer were predicted. Here, the top Gene Ontology terms, Pfam hits, and BLAST hits for all predicted genes are provided as text files.
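For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed from a list of contig lengths. It is not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the contig at which the running total of
    lengths, taken from longest to shortest, first reaches half of
    the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


# Toy example (hypothetical lengths, not taken from the dataset).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

Applied to the lengths of the assembled contigs, a function of this kind would be expected to reproduce the reported value, provided the same contig set is used.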

Methods

For functional characterization of the predicted gene models, the transcriptome (ICRV01, ICRV01000001-ICRV01144777, https://www.ncbi.nlm.nih.gov/Traces/wgs/ICRV01) was subjected to gene ontology (GO) analysis, BLASTP search, and Pfam domain search.
The GO terms were assigned to the predicted peptides in a two-step process. First, the best-match homologs of the H. akashiwo peptides were identified by a BLASTP search (E-value<1) against a custom database composed of RefSeq gene models of Arabidopsis thaliana, Homo sapiens, Mus musculus, and Saccharomyces cerevisiae (S288C). Second, the H. akashiwo peptides were annotated with the GO terms (http://geneontology.org) assigned to their best-match homologs. The Pfam database (http://pfam.xfam.org) was used to predict the domains in the H. akashiwo gene models.
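The two-step GO assignment described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes the BLASTP results are available in the standard 12-column tabular layout (query ID, subject ID, ..., E-value in column 11, bit score in column 12), that "best match" means the lowest E-value with ties broken by the highest bit score, and that GO terms for the reference proteins are held in a dictionary. The names `best_hits`, `transfer_go`, `pepA`, `refX`, and so on are hypothetical.

```python
import csv
import io


def best_hits(blast_tsv, max_evalue=1.0):
    """Step 1: keep the best-scoring subject for each query peptide.

    blast_tsv is any iterable of tab-separated lines in the 12-column
    tabular layout (assumed here). Hits with E-value >= max_evalue
    are discarded.
    """
    best = {}
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= max_evalue:
            continue
        key = (evalue, -bitscore)  # lower E-value first, then higher bit score
        if query not in best or key < best[query][0]:
            best[query] = (key, subject)
    return {query: subject for query, (_, subject) in best.items()}


def transfer_go(hits, go_by_subject):
    """Step 2: give each query the GO terms of its best-match homolog."""
    return {q: sorted(go_by_subject.get(s, ())) for q, s in hits.items()}


# Hypothetical demo data; IDs and scores are invented for illustration.
demo = "\n".join([
    "pepA\trefX\t40.0\t100\t60\t0\t1\t100\t1\t100\t1e-20\t95.0",
    "pepA\trefY\t35.0\t100\t65\t0\t1\t100\t1\t100\t1e-5\t50.0",
    "pepB\trefY\t30.0\t50\t35\t0\t1\t50\t1\t50\t2.0\t20.0",
])
go_terms = {"refX": {"GO:0006412"}, "refY": {"GO:0005524"}}

hits = best_hits(io.StringIO(demo))
annotations = transfer_go(hits, go_terms)
print(annotations)
```

In this toy input, the second peptide has no hit below the E-value threshold and therefore receives no annotation, which mirrors how peptides without a qualifying homolog would remain unannotated under this scheme.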

Usage notes

The provided data are all in text format.

Funding

Japan Society for the Promotion of Science, Award: 16H06449

Japan Science and Technology Agency, Award: 989459

Nagase Science Technology Foundation

Casio Science Promotion Foundation

Joint Usage/Research Center, Institute of Plant Science and Resources, Okayama University

Japan Society for the Promotion of Science, Award: 221S0002