Skip to main content

Paternal effects in a wild-type zebrafish implicate a role of sperm-derived small RNAs

Cite this dataset

Ord, James; Heath, Paul; Fazeli, Alireza; Watt, Penelope (2020). Paternal effects in a wild-type zebrafish implicate a role of sperm-derived small RNAs [Dataset]. Dryad.


While the importance of maternal effects has long been appreciated, a growing body of evidence now points to the paternal environment having an important influence on offspring phenotype. Indeed, research on rodent models suggests that paternal stress leaves an imprint on the behaviour and physiology of offspring via non-genetic information carried in the spermatozoa, however fish have been understudied with regard to these sperm-mediated effects. Here we investigated whether the zebrafish was subject to heritable influences of paternal stress by exposing males to stressors (conspecific-derived alarm cue, chasing, and bright light) before mating and assessing the behavioural and endocrine responses of their offspring, including their behavioural response to conspecific-derived alarm cue. We found that after males are exposed to stress, their larval offspring show weakened responses to stressors. Small RNA sequencing subsequently revealed that the levels of several small noncoding RNAs, including microRNAs, PIWI-interacting RNAs, and tRNA-derived small RNAs, were altered in the spermatozoa of stressed fathers, suggesting that stress-induced alterations to the spermatozoal RNA landscape may contribute to shaping offspring phenotype. The work demonstrates that paternal stress should not be overlooked as a source of phenotypic variation and that spermatozoal small RNAs may be important intergenerational messengers in fish.


Phenotype data and analyses (folder: Phenotype)


Included are data of zebrafish larval thigmotaxis (edge preference behaviour) expressed as % of animals detected in the peripheral 6mm of a 5cm petri dish in response to alarm substance or a 'blank' control treatment (water). The data were recorded from videos using automated tracking software (Viewpoint® ZebraLab) which calculated the averge number of animals present in a defined central zone (covering most of the dish area apart from the outer 6mm, or approx. 1.5 larval standard lengths) for every 10-second time interval. These values were subtracted from the total number of animals to calculate the percentage of animals in the peripheral zone (% thigmotaxis).

Contents of 'larval_thigmo.csv':

  1. Time: Timepoint in seconds at which the avgcount variable was recorded.
  2. avgcount: the raw output value from the Viewpoint Zebralab software; the average number of animals detected within the defined central zone (6mm from the edge of the 5cm petri dish) during the preceeding 10-second time interval.
  3. ID: The identity code of the petri dish or group of larval offspring measured.
  4. Batch: Experimental batch (6 in total)
  5. Treatment: Larval treatment; Sham = control / 'blank' treatment (50ul water), Alarm = alarm substance derived from the skin of adult conspecifics (50ul administered)
  6. Parent: The identity code of the male animal that sired the offspring.
  7. Paternal: Paternal treatment; Control or PCS (paternal chronic stress)
  8. N: Total number of larvae in the dish
  9. thigmoper: % thigmotaxis; calculated as ((N-avgcount)/N)*100.
  10. Time_min: Timepoint in minute; used to summarise the data into 1-minute time bins.


Included are data of cortisol measurements (estimates in pg/larva) from pools of larval offspring from control and stressed males. Larvae were exposed to one of two stress treatments (alarm substance or stirring), alongside larvae which were not exposed to stressors ('blank' treatment). These data are presented as two separate datasets. The 'blank' samples in the stirring dataset are the same as those that appear in the alarm substance dataset. A description of how the data were obtained follows below:

Following stress treatments, pools of larvae (13-30 per sample) were euthanised in ice water and stored at -20°C. For cortisol extraction, pools of larvae were homogenised in 2ml tubes in 120µl distilled water using a pellet mixer (Qiagen Tissue Lyser II). One ml of ethyl acetate (Sigma) was added to homogenates, tubes were vortexed for 30 seconds, incubated at room temperature for 30 mins, and centrifuged at 3000xg for 10 mins to induce phase separation. Tubes were snap-frozen on dry ice and the ethyl acetate layer was transferred to a fresh 2ml tube. The extraction was then repeated with another 1ml of ethyl acetate and the two extracts were pooled. Ethyl acetate was evaporated for 24h under a fume hood, and dried extracts were resuspended in 60µl of sample buffer (2% BSA in PBS) and stored at 4°C until ELISA (no more than one week).

For ELISA, wells of Immulon-2 high binding 96-well plates were coated with rabbit anti-cortisol monoclonal antibody (1.6µg/ml) for 16 hours, washed 3x with 300µl 0.05% tween-20 in PBS, blocked with 300µl 1% BSA in PBS for 30-60 minutes, and re-washed before 50µl of sample was added with 50ul of cortisol horseradish peroxidase conjugate (1:1600 dilution). Plates were then incubated for 2 hours at room temp. on an orbital shaker and re-washed before 100µl of tetramethylbenzedene (Abcam) was added. Plates were incubated in darkness for a further 20-30 minutes at room temperature and the colour reaction was stopped by adding 100µl of 1M sulphuric acid. Absorbance at 450nm (OD value) was read using a Varioskan Flash plate reader with SkanIt user interface software (Thermo Scientific). Sample cortisol was quantitated using a standard curve generated from at least three replicates of each of seven cortisol concentrations (0, 0.5, 1, 5, 10, 20, and 50ng/ml).

The raw OD and BB0 (OD of sample / mean OD of 0 standard) values for each sample are included in the datasets. The OD and BB0 values of the standards are included as a separate excel file. To obtain sample concentrations, a separate standard curve using BB0 values of standards was generated for each assay and BB0 values of each sample fitted to the standard curve for each assay separately. The script used to estimate cortisol concentrations of samples via a standard curve is included (ELISA_script.R). In the case of one sample for which the BB0 value was beyond the lower concentration limit of the standard curve, a concentration value of 0 was assigned. To obtain the final value used for analysis (pg/larva), the concentration estimates obtained from the standard curve (ng/ml, equivalent to pg/µl) were multiplied by 60 (60µl being the total volume in which the sample was suspended) to obtain an estimate of total quantity in pg, then divided by the number of larvae in the pool to obtain pg/larva.

Contents of 'cortisol_alarm.csv' and 'cortisol_stirring.csv':

  1. Assay: The assay / plate on which the sample was run. All assays followed the same protocol and were all carried out within the same two-week period.
  2. OD: Raw 450nm absorbance value from the plate reader
  3. BB0: OD of the sample divided by the OD of the 0 standard for that assay (see 'assay_standards.xlsx')
  4. conc: concentration estimate from the standard curve
  5. quant_est: conc*60; an estimate of the total quantity of cortisol (pg) in 60ul of solution (the volume in which the larval extracts were resuspended).
  6. Nlarvae: number of larvae in the pool
  7. cortperlarva: quant_est/Nlarvae; estimate of amount of cortisol (pg) in one larva; the final response variable
  8. ID: The identity code of the petri dish or group of larval offspring measured.
  9. Paternal: Paternal treatment; Control or PCS (paternal chronic stress)
  10. Batch: Experimental batch (6 in total)
  11. Father: The identity code of the male animal that sired the offspring.
  12. Treatment: Larval treatment; Sham = control / 'blank' treatment (50ul water), Alarm = alarm substance derived from the skin of adult conspecifics (50ul administered); Stirring = stirring the dish (~2 rotations per second) for 1 minute.

The script to analyse both thigmotaxis and cortisol data is via mixed effects models is included (phenotype_analysis_2020.R).

Small RNA sequencing workflow (folder: sRNAseq)

The attached workflow was developed to perform differential expression analyses of small RNAs sequenced from zebrafish (Danio rerio) spermatozoal samples derived from stressed and non-stressed (control) males. Details of the folder contents can be found in the README files, and a step-by-step summary of the workflow can be found in the workflow.txt file.

The workflow targets three distinct classes of small RNAs in a sequential manner: microRNAs (miRNAs), PIWI-interacting RNA (piRNA) clusters, and tRNA-derived small RNAs (tsRNAs). For miRNAs and piRNAs, 18-24nt and 25-32nt sequences could be aligned to known miRNA and piRNA cluster sequences in public databases (miRbase and piRNA cluster database, respectively). However, as we were not aware of any extensive databases of D, rerio tsRNAs at the time of the analysis, a custom database was built after identifying sequences in our own sequencing data which aligned to the 5' and 3' ends of D. rerio mature tRNA sequences in the genomic tRNA databse. Due to the high sequence redundancy of tRNAs, the putative tsRNAs identified were subject to clustering, producing a database of putative tsRNAs clustered according to sequence similarity. The original reads could then be aligned to this database and counted to each cluster, thus allowing for assessment of cluster-level differential expression. Although used here for zebrafish, the tsRNA workflow could be applied to any species provided mature tRNA sequences are available.

Also included are scipts for differential expression analyses of the three small RNA subtypes, as well as for identification of sperm miRNA target genes in the oocyte transcriptome (after downloading predicted target data from miRmap). This analysis uses a previously-published unfertilised egg transcriptome dataset (GEO accession GSM1085060).

Major elements of the attached workflow (FASTQ processing, alignment, read counting, sequence clustering) were developed, tested, and run on a high performance computing cluster (Sheffield Advnaced Research Computer). Other elements (differential expression analyses) were performed locally on a laptop computer running Windows 10 and R version 3.6.1.

Usage notes

FASTQ location and adaptor sequence

The raw FASTQ files for use with the small RNA sequencing workflow have been deposited to the NCBI Sequence Read Archive uder BioProject PRJNA635603. Read files still contain the original adaptor sequence (TGGAATTCTCGGGTGCCAAGG) which should be trimmed (FASTA file provided in the processing_alignment_counting folder of the sRNAseq directory). A number of contaminating primer-derived sequences were also identified (contained in an excel sheet, also in the aforementioned folder) which can be filtered out in the same manner as performed for the adapter sequence.

Key resources used:


European Research Council, Award: 668989