A recent study identified candidate genes linked to magnetoreception in rainbow trout (Oncorhynchus mykiss) by sequencing transcriptomes from the brains of fish exposed to a magnetic pulse. However, the discovery of these candidate genes was limited to sequences that aligned to the reference genome. The unaligned, or unmapped, sequences may still contain valuable information from regions that are missing from, misassembled in, or divergent from the reference. Using the available sequencing data from the trout brain transcriptomes, we assembled >27 million unmapped sequences (5.8% of total sequences) into 45,142 contigs and identified 12 contigs that were differentially expressed following exposure to a pulsed magnetic field. These contigs encoded a putative superoxide dismutase (a protein necessary to prevent oxidative damage) and collagen alpha-1 type II (a structural protein important for the development and integrity of the retina). These genes are consistent with the previous study, which suggested an effect of the magnetic pulse on the oxidative consequences of free iron and on non-visual encephalic photoreceptors. Our results demonstrate the utility of assembling unmapped sequencing reads in studies of gene expression and identify additional candidate genes associated with a magnetic sense in trout.
Cleaned Assembly
This file contains the filtered de novo assembly (contigs with TPM > 1) produced from the unmapped reads using Trinity. It contains 45,142 contigs.
Trinity_TPM1.fasta
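As a quick sanity check after download, the number of contigs in the assembly can be compared with the count stated above. The following is a minimal sketch, assuming a standard FASTA layout in which each contig starts with a ">" header line. The function name and the demo contig names are hypothetical and are not part of the archive.

```python
import io


def count_fasta_records(handle):
    """Count sequences in a FASTA stream by counting '>' header lines."""
    return sum(1 for line in handle if line.startswith(">"))


# Tiny in-memory stand-in for the real file; contig names are made up.
demo = io.StringIO(">contig_1\nACGT\n>contig_2\nGGCC\nTTAA\n")
n_demo = count_fasta_records(demo)
print(n_demo)

# For the real file (assumed to sit in the working directory):
# with open("Trinity_TPM1.fasta") as fh:
#     print(count_fasta_records(fh))
```

If the download is intact, the count for Trinity_TPM1.fasta should match the 45,142 contigs stated in the description.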
Blast2GO Annotation file
This file contains the entire database created using Blast2GO to annotate all the contigs from the filtered unmapped reads assembly (Trinity_TPM1.fasta).
Trinity_TPM1.b2g
Blast2GO Annotations (Text format)
This file contains a summary of the annotation using Blast2GO in text format for all the contigs from the filtered unmapped reads assembly (Trinity_TPM1.fasta).
Full-annotation.tsv
Expression Matrix
This spreadsheet contains the raw expression levels (counts) for each of the 10 libraries analyzed for differential expression. The first row lists the names of the 10 RNA-seq libraries (see 'Note' in README). Raw counts were produced using RSEM v1.3.0.
Expression-matrix.tsv
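The count matrix can be inspected without specialised tools. The sketch below reads a tab-separated matrix and derives per-library totals and counts per million (CPM). It assumes that the first column holds the contig name and that the header row's first cell is a label or is empty; this layout is an assumption, not something confirmed by the description. The function names and the demo values are hypothetical.

```python
import csv
import io


def read_count_matrix(handle):
    """Return (library_names, {contig: [counts...]}) from a TSV stream."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    libraries = header[1:]  # assumption: first cell is a label or empty
    counts = {}
    for row in reader:
        if not row:
            continue
        # RSEM expected counts can be fractional, so parse as float.
        counts[row[0]] = [float(v) for v in row[1:]]
    return libraries, counts


def library_sizes(libraries, counts):
    """Sum the counts of every contig for each library."""
    totals = [0.0] * len(libraries)
    for values in counts.values():
        for i, value in enumerate(values):
            totals[i] += value
    return dict(zip(libraries, totals))


def counts_per_million(libraries, counts):
    """Scale each count by its library total (zero-sized libraries give 0)."""
    sizes = library_sizes(libraries, counts)
    scale = [1e6 / sizes[lib] if sizes[lib] else 0.0 for lib in libraries]
    return {c: [v * s for v, s in zip(vals, scale)] for c, vals in counts.items()}


# Made-up demo data; the real file has 10 library columns.
demo = io.StringIO("\tlibA\tlibB\ncontig_1\t10\t0\ncontig_2\t5.5\t4.5\n")
libraries, counts = read_count_matrix(demo)
sizes = library_sizes(libraries, counts)
cpm = counts_per_million(libraries, counts)
print(sizes)
```

For the real file, open it with `open("Expression-matrix.tsv", newline="")` and pass the handle to `read_count_matrix`. This plain CPM is only for inspection; it does not reproduce any normalisation applied in the differential expression analysis.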
Sample Matrix for edgeR
This spreadsheet contains only two columns. The first column is the group (control or pulsed) and the second is the library ID (4,7,9,10,11,12,13,14,15,16). This file is used only by edgeR to analyze differential expression.
samples.tsv
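To see which libraries belong to each treatment group, the two-column sample file can be read into a small lookup table. This is a minimal sketch, assuming a tab-separated file without a header row (the description does not say whether one is present). The function name and the library IDs in the demo are placeholders, not the real group assignments.

```python
import csv
import io
from collections import defaultdict


def read_sample_groups(handle):
    """Map each group name (column 1) to its list of library IDs (column 2)."""
    groups = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        group, library = row[0].strip(), row[1].strip()
        groups[group].append(library)
    return dict(groups)


# Placeholder IDs only; consult samples.tsv for the real assignments.
demo = io.StringIO("control\tlibA\ncontrol\tlibB\npulsed\tlibC\n")
groups_demo = read_sample_groups(demo)
print(groups_demo)
```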
Differential Expression Results
This spreadsheet contains the differential expression results for each contig. Only 39,714 of the 45,142 contigs in the filtered assembly were assessed for differential expression. The columns are: Contig_name; sampleA; sampleB; LogFC (log2 fold change); logCPM (log2 counts per million); P-value; FDR (false discovery rate).
DE-table.tsv
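The differentially expressed contigs can be pulled out of the results table by filtering on the FDR column. The sketch below relies on the column order listed above (contig name first, LogFC fourth, FDR seventh) and skips any row whose numeric fields cannot be parsed, such as a header. The cutoff of 0.05 is an illustrative assumption and is not necessarily the threshold used in the study. The function name and demo rows are hypothetical.

```python
import csv
import io


def significant_contigs(handle, fdr_cutoff=0.05):
    """Return (contig, logFC, FDR) tuples with FDR <= cutoff, lowest FDR first."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 7:
            continue
        try:
            log_fc, fdr = float(row[3]), float(row[6])
        except ValueError:
            continue  # header row or non-numeric entry
        if fdr <= fdr_cutoff:
            hits.append((row[0], log_fc, fdr))
    return sorted(hits, key=lambda hit: hit[2])


# Made-up rows that follow the documented column order.
demo = io.StringIO(
    "Contig_name\tsampleA\tsampleB\tLogFC\tlogCPM\tP-value\tFDR\n"
    "contig_1\tcontrol\tpulsed\t2.5\t4.0\t0.0001\t0.01\n"
    "contig_2\tcontrol\tpulsed\t-0.3\t6.1\t0.6\t0.9\n"
    "contig_3\tcontrol\tpulsed\t-3.1\t3.2\t0.00001\t0.002\n"
)
hits_demo = significant_contigs(demo)
for contig, log_fc, fdr in hits_demo:
    print(contig, log_fc, fdr)
```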
Raw Assembly
This file contains the raw de novo assembly produced from the unmapped reads using Trinity. It contains 104,588 contigs.
Trinity.fasta
C1 Unmapped Reads
C1.fastq.gz
C2 Unmapped Reads
C2.fastq.gz
C3 Unmapped Reads
C3.fastq.gz
C4 Unmapped Reads
C4.fastq.gz
C9 Unmapped Reads
C9.fastq.gz
C10 Unmapped Reads
C10.fastq.gz
C11 Unmapped Reads
C11.fastq.gz
C12 Unmapped Reads
C12.fastq.gz
P5 Unmapped Reads
P5.fastq.gz
P6 Unmapped Reads
P6.fastq.gz
P7 Unmapped Reads
P7.fastq.gz
P8 Unmapped Reads
P8.fastq.gz
P13 Unmapped Reads
P13.fastq.gz
P14 Unmapped Reads
P14.fastq.gz
P15 Unmapped Reads
P15.fastq.gz
P16 Unmapped Reads
P16.fastq.gz
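The number of unmapped reads in each library can be checked directly from the compressed files. This is a minimal sketch, assuming gzip-compressed FASTQ with the usual four lines per read (no wrapped sequence lines). The function name and the demo reads are hypothetical.

```python
import gzip
import io


def count_fastq_reads(binary_stream):
    """Count reads in a gzip-compressed FASTQ stream (4 lines per read)."""
    with gzip.open(binary_stream, "rt") as fh:
        n_lines = sum(1 for _ in fh)
    if n_lines % 4:
        raise ValueError("line count is not a multiple of 4; file may be truncated")
    return n_lines // 4


# In-memory stand-in for a real .fastq.gz file; the reads are made up.
payload = gzip.compress(b"@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n")
n_reads_demo = count_fastq_reads(io.BytesIO(payload))
print(n_reads_demo)

# For a real file (assumed to sit in the working directory):
# with open("C1.fastq.gz", "rb") as fh:
#     print(count_fastq_reads(fh))
```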
README file
This file contains a complete description of all files in this data archive, including MD5 checksums, file formats, and all code used to generate the data in this study.
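The MD5 values listed in the README can be used to confirm that each download is intact. The sketch below computes an MD5 digest in chunks so that large files need not fit in memory. The function name is hypothetical, and the expected value must be copied from the README by hand.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Demo on in-memory bytes rather than a real archive file.
demo_digest = md5_of_stream(io.BytesIO(b"abc"))
print(demo_digest)

# For a real file (assumed to sit in the working directory):
# with open("Trinity.fasta", "rb") as fh:
#     print(md5_of_stream(fh))  # compare with the value in the README
```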