Data from: Genetic drift dominates genome-wide regulatory evolution following an ancient whole genome duplication in Atlantic salmon

Citation

Verta, Jukka-Pekka; Barton, Henry; Pritchard, Victoria; Primmer, Craig (2021), Data from: Genetic drift dominates genome-wide regulatory evolution following an ancient whole genome duplication in Atlantic salmon, Dryad, Dataset, https://doi.org/10.5061/dryad.t4b8gtj1b

Abstract

Whole genome duplications (WGD) have been considered as springboards that potentiate lineage diversification through increasing functional redundancy. Divergence in gene regulatory elements is a central mechanism for evolutionary diversification, yet the patterns and processes governing regulatory divergence following events that lead to massive functional redundancy, such as WGD, remain largely unknown. We studied the patterns of divergence and strength of natural selection on regulatory elements in the Atlantic salmon (Salmo salar) genome, which underwent a WGD 100-80 Mya. Using ChIPmentation, we first show that H3K27ac, a histone modification typical of enhancers and promoters, is associated with genic regions, tissue specific transcription factor binding motifs, and with gene transcription levels in immature testes. Divergence in transcription between duplicated genes from WGD (ohnologs) correlated with difference in the number of proximal regulatory elements, but not with promoter elements, suggesting that functional divergence between ohnologs after WGD is mainly driven by enhancers. By comparing H3K27ac regions between duplicated genome blocks, we further show that a longer polyploid state post-WGD has constrained regulatory divergence. Patterns of genetic diversity across natural populations inferred from re-sequencing indicate that recent evolutionary pressures on H3K27ac regions are dominated by largely neutral evolution. In sum, our results suggest that post-WGD functional redundancy in regulatory elements continues to have an impact on the evolution of the salmon genome, promoting largely neutral evolution of regulatory elements despite their association with transcription levels. These results highlight a case where genome-wide regulatory evolution following an ancient WGD is dominated by genetic drift.

Methods

Experimental design and collection of material

We collected immature male gonads from five 11-12-month-old male Atlantic salmon raised in common-garden conditions (see details in Verta et al. 2020). Fish were euthanized using an overdose of MS-222, followed by decapitation. Gonads were dissected under a microscope, flash-frozen in liquid nitrogen and stored at -80 degrees C until used for chromatin extraction.

H3K27ac ChIPmentation

We integrated the original ChIPmentation protocol (Schmidl et al. 2015) into the workflow of the ThermoFisher MAGnify ChIP kit. Gonads were homogenized in D-PBS buffer using an OMNI Beadruptor Elite device in 2 ml tubes with 2.8 mm stainless steel beads. Chromatin was fixed using 1% formaldehyde for 2 min, followed by quenching with 0.125 M glycine for 5 min. Cells were collected by centrifugation and resuspended in lysis buffer supplemented with protease inhibitors. Chromatin was sheared in 150 ul volumes using a Bioruptor device set to high power, with 8 cycles of 30 sec on and 30 sec off. Debris was pelleted by centrifugation and sheared chromatin was diluted to IP conditions. An aliquot of sheared chromatin was reserved as input control. Acetylated histones were immunoprecipitated at +4 degrees C for 2 hours using 1 microgram of Abcam ab4729 on ThermoFisher Dynabeads Protein A/G. Beads were subsequently washed following the MAGnify kit protocol, with an additional final wash using 10 mM Tris (pH 8). Bead-bound chromatin was then treated with a tagmentation reaction containing Illumina Tn5 transposase for 5 min at 37 degrees C. Tagmentation was terminated by adding 7.5 volumes of RIPA buffer and incubating on ice for 5 min. ChIPmented chromatin was subsequently washed twice with both RIPA and TE buffer. Crosslinks were reversed using a proteinase-K treatment and ChIPmented DNA was captured using magnetic beads. These steps were performed following the MAGnify kit protocol (for 3 samples), or alternatively by using a reverse crosslinking buffer (10 mM Tris-HCl pH 8, 0.5% SDS, 300 mM NaCl, 5 mM EDTA, proteinase-K) and Macherey-Nagel NucleoMag magnetic beads (for 2 samples). Input controls were treated with a tagmentation reaction for 5 min at 55 degrees C. Tn5 was inactivated by adding SDS and tagmented DNA was purified using Macherey-Nagel NucleoMag magnetic beads. Successful adapter integration was tested using PCR with primers aligning to the Nextera adapters.

Alignment of ChIP-seq reads

ChIPmentation and matched input control libraries were sequenced using Illumina NextSeq chemistry at the Institute of Biotechnology of the University of Helsinki. Libraries were sequenced using both single-end and paired-end strategies, and the resulting ChIP fragment directories were combined for each sample as described in the following section. For single-end libraries, reads were passed through quality control, including Nextera adapter trimming, using fastp (Chen et al. 2018) and the following parameters: --low_complexity_filter --trim_tail1=1 --trim_front1=19. Reads were then aligned to the Atlantic salmon genome (Lien et al. 2016) downloaded from NCBI (version: ICSASG_v2, available from: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/233/375/GCF_000233375.1_ICSASG_v2/GCF_000233375.1_ICSASG_v2_genomic.fna.gz) using bowtie2 v2.4.2 (Langmead and Salzberg 2012) and the following parameters: --very-sensitive --end-to-end. Paired-end libraries were correspondingly analysed using fastp (--low_complexity_filter --trim_front1=19 --trim_tail1=50 --trim_tail2=12) and bowtie2 (--very-sensitive --maxins 1500 --end-to-end). Alignment files were then quality filtered using samtools (Li et al. 2009) and parameters -F 256 -q 20.
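
The read-processing steps above can be sketched as command lines. The Python snippet below only assembles argument lists from the parameters stated in the text and does not run any tool. The file names, the bowtie2 index prefix, and the samtools -b and -o output options are illustrative assumptions and are not taken from the dataset.

```python
import shlex

# Assumed bowtie2 index prefix (hypothetical); the index would be built
# from the NCBI ICSASG_v2 FASTA named in the text.
INDEX = "ICSASG_v2"


def fastp_cmd(reads_in, reads_out, paired=False):
    """fastp trimming command with the options listed in the Methods."""
    cmd = ["fastp", "--low_complexity_filter", "--trim_front1=19"]
    if paired:
        r1_in, r2_in = reads_in
        r1_out, r2_out = reads_out
        cmd += ["--trim_tail1=50", "--trim_tail2=12",
                "-i", r1_in, "-I", r2_in, "-o", r1_out, "-O", r2_out]
    else:
        cmd += ["--trim_tail1=1", "-i", reads_in, "-o", reads_out]
    return cmd


def bowtie2_cmd(reads, sam_out, paired=False):
    """bowtie2 alignment command; --maxins is used for paired-end only."""
    cmd = ["bowtie2", "--very-sensitive", "--end-to-end", "-x", INDEX]
    if paired:
        cmd += ["--maxins", "1500", "-1", reads[0], "-2", reads[1]]
    else:
        cmd += ["-U", reads]
    return cmd + ["-S", sam_out]


def samtools_filter_cmd(sam_in, bam_out):
    """-F 256 drops secondary alignments; -q 20 keeps MAPQ >= 20."""
    return ["samtools", "view", "-b", "-F", "256", "-q", "20",
            "-o", bam_out, sam_in]


if __name__ == "__main__":
    # Hypothetical single-end sample, printed as shell-ready command lines.
    steps = [
        fastp_cmd("sample1.fastq.gz", "sample1.trimmed.fastq.gz"),
        bowtie2_cmd("sample1.trimmed.fastq.gz", "sample1.sam"),
        samtools_filter_cmd("sample1.sam", "sample1.filtered.bam"),
    ]
    for step in steps:
        print(shlex.join(step))
```

Keeping each command as a list rather than a single string makes it straightforward to hand to subprocess.run without shell quoting problems, should one wish to execute the steps.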

Identification and annotation of H3K27ac peaks

We used HOMER (Heinz et al. 2010) to call enriched H3K27ac regions over input control and to annotate the reproducible peaks. ChIP fragment distributions were created using the command makeTagDirectory and parameters -keepOne -single -tbp 1 -mis 5 -GCnorm default. Tag directories for single-end and paired-end libraries of the same sample were then combined, again with a makeTagDirectory command. Reproducible H3K27ac regions were identified with the command getDifferentialPeaksReplicates.pl, specifying the parameter -style histone. Results were transformed into a bed file with pos2bed.pl. We then used custom R (R Core Team 2013) code and bedtools intersect (Quinlan and Hall 2010) to filter out peaks overlapping any 1 kb genomic window in the top 1% of mean sequencing coverage, to avoid problematic regions of the genome. Peaks were then annotated using the annotatePeaks.pl command and parameter -CpG, and a HOMER motif search was performed using findMotifsGenome.pl and parameter -size given. Finally, specific instances of motifs and their coverage were identified using annotatePeaks.pl with parameter -m, and with parameters -size 4000 -hist 10 -m -d.
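
The coverage-based peak filter described above can be illustrated with a small pure-Python sketch. It is not the authors' R and bedtools code: the tuple layouts, the function names, and the way the top 1% is selected (by rank, with at least one window flagged) are assumptions made for illustration. Coordinates are treated as half-open intervals, as in BED files.

```python
import math


def top_fraction_windows(windows, fraction=0.01):
    """Return the windows whose mean coverage ranks in the top `fraction`.

    `windows` is a list of (chrom, start, end, mean_coverage) tuples.
    """
    ranked = sorted(windows, key=lambda w: w[3], reverse=True)
    n_top = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:n_top]


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def filter_peaks(peaks, windows, fraction=0.01):
    """Drop peaks that overlap any high-coverage window on the same chromosome.

    `peaks` is a list of (chrom, start, end) tuples.
    """
    flagged = {}
    for chrom, start, end, _cov in top_fraction_windows(windows, fraction):
        flagged.setdefault(chrom, []).append((start, end))
    kept = []
    for chrom, start, end in peaks:
        hits = flagged.get(chrom, [])
        if not any(overlaps(start, end, s, e) for s, e in hits):
            kept.append((chrom, start, end))
    return kept
```

For genome-scale input, an interval tool such as bedtools intersect, as used by the authors, would be the more efficient choice; the linear scan here only makes the logic explicit.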

RNA-seq alignment and quantification

RNA-seq reads of immature male gonads (SRR8479243, SRR8479245, SRR8479246) (Skaftnesmo et al. 2017) as well as 14 salmon tissues from Lien et al. (2016) were downloaded from the Sequence Read Archive and filtered using fastp with default parameters. We used STAR (Dobin et al. 2013) to create a genome index (--runMode genomeGenerate) and to align RNA-seq reads to the Atlantic salmon genome downloaded from NCBI, in manual two-pass mode, with the following parameters: --outFilterIntronMotifs RemoveNoncanonicalUnannotated --chimSegmentMin 10 --outFilterType BySJout --alignSJDBoverhangMin 1 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --quantMode GeneCounts --alignEndsProtrude 10 ConcordantPair --limitOutSJcollapsed 5000000. Alignments were quantified over gene models downloaded from NCBI using the R function featureCounts from the Rsubread package (Liao et al. 2019) and normalized using DESeq2 varianceStabilizingNormalization (Love et al. 2014) (immature male gonads) or RPKM (14 tissues).
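
For readers unfamiliar with RPKM, the normalization applied to the 14-tissue data can be written as a short function. This is a minimal sketch of the standard definition (reads per kilobase of gene length per million reads); the function name and the example numbers are hypothetical, and the text does not state which gene length or library-size total the authors used.

```python
def rpkm(count, gene_length_bp, total_reads):
    """Reads per kilobase of gene length per million reads in the library.

    Equivalent to count / (gene_length_bp / 1e3) / (total_reads / 1e6).
    """
    if gene_length_bp <= 0 or total_reads <= 0:
        raise ValueError("gene length and total reads must be positive")
    return count * 1e9 / (gene_length_bp * total_reads)


if __name__ == "__main__":
    # Hypothetical gene: 500 reads, 2 kb long, in a library of 10 million reads.
    print(rpkm(500, 2000, 10_000_000))  # 25.0
```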

 

References

Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 34:i884–i890. doi: 10.1093/bioinformatics/bty560.

Dobin A et al. 2013. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 29:15–21. doi: 10.1093/bioinformatics/bts635.

Heinz S et al. 2010. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell. 38:576–589. doi: 10.1016/j.molcel.2010.05.004.

Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat Meth. 9:357–359. doi: 10.1038/nmeth.1923.

Li H et al. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 25:2078–2079.

Liao Y, Smyth GK, Shi W. 2019. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res. 47:e47. doi: 10.1093/nar/gkz114.

Lien S et al. 2016. The Atlantic salmon genome provides insights into rediploidization. Nature. 533:200–205. doi: 10.1038/nature17164.

Love MI, Huber W, Anders S. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15:550. doi: 10.1186/s13059-014-0550-8.

Quinlan AR, Hall IM. 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 26:841–842. doi: 10.1093/bioinformatics/btq033.

Schmidl C, Rendeiro AF, Sheffield NC, Bock C. 2015. ChIPmentation: fast, robust, low-input ChIP-seq for histones and transcription factors. Nat Meth. 12:963–965. doi: 10.1038/nmeth.3542.

Skaftnesmo K, Edvardsen R, Taranger G, Schulz R. 2017. Integrative testis transcriptome analysis reveals differentially expressed miRNAs and their mRNA targets during early puberty in Atlantic salmon. BMC Genomics. 18:801. doi: 10.1186/s12864-017-4205-5.

R Core Team. 2013. R: A Language and Environment for Statistical Computing.

Verta J-P et al. 2020. Cis-regulatory differences in isoform expression associate with life history strategy variation in Atlantic salmon. PLoS Genet. 16:e1009055. doi: 10.1371/journal.pgen.1009055.

Usage Notes

The results presented in the main article can be reproduced by running the R Markdown scripts in this Dryad repository. 

Funding

H2020 European Research Council, Award: 742312

Academy of Finland, Award: 314254

Academy of Finland, Award: 314255

Helsingin Yliopisto