Data for: A complex mechanism translating variation of a simple genetic architecture into alternative life histories
Data files
Nov 11, 2024 version files 44.92 GB
- 1148_Gn_H3K27ac_merged_treat_pileup.bdg (652.26 MB)
- 1148_Gn_H3K4me3_merged_treat_pileup.bdg (401.10 MB)
- 1148_Gn_vgll3_merged_treat_pileup.bdg (579.90 MB)
- 1151_Gn_H3K27ac_merged_treat_pileup.bdg (989.88 MB)
- 1151_Gn_H3K4me3_merged_treat_pileup.bdg (1.14 GB)
- 1151_Gn_vgll3_merged_treat_pileup.bdg (131.42 MB)
- 1172_Gn_H3K27ac_merged_treat_pileup.bdg (912.31 MB)
- 1172_Gn_H3K4me3_merged_treat_pileup.bdg (1.06 GB)
- 1172_Gn_vgll3_merged_treat_pileup.bdg (112.47 MB)
- 1173_Gn_H3K27ac_merged_treat_pileup.bdg (1.13 GB)
- 1173_Gn_H3K4me3_merged_treat_pileup.bdg (778.42 MB)
- 1173_Gn_vgll3_merged_treat_pileup.bdg (382.02 MB)
- 1176_Gn_H3K27ac_merged_treat_pileup.bdg (680.84 MB)
- 1176_Gn_H3K4me3_merged_treat_pileup.bdg (1.29 GB)
- 1176_Gn_vgll3_merged_treat_pileup.bdg (221.80 MB)
- 1215_Gn_H3K27ac_merged_treat_pileup.bdg (1.04 GB)
- 1215_Gn_H3K4me3_merged_treat_pileup.bdg (1.25 GB)
- 1215_Gn_vgll3_merged_treat_pileup.bdg (336.86 MB)
- 1216_Gn_H3K27ac_merged_treat_pileup.bdg (993.50 MB)
- 1216_Gn_H3K4me3_merged_treat_pileup.bdg (930.94 MB)
- 1216_Gn_vgll3_merged_treat_pileup.bdg (165.97 MB)
- 1249_Gn_H3K27ac_merged_treat_pileup.bdg (971.62 MB)
- 1249_Gn_H3K4me3_merged_treat_pileup.bdg (1.39 GB)
- 1249_Gn_vgll3_merged_treat_pileup.bdg (224.58 MB)
- 1358_Gn_H3K27ac_merged_treat_pileup.bdg (1.04 GB)
- 1358_Gn_H3K4me3_merged_treat_pileup.bdg (233.82 MB)
- 1358_Gn_vgll3_merged_treat_pileup.bdg (233.82 MB)
- 1384_Gn_H3K27ac_merged_treat_pileup.bdg (1.03 GB)
- 1384_Gn_H3K4me3_merged_treat_pileup.bdg (803.86 MB)
- 1384_Gn_vgll3_merged_treat_pileup.bdg (1.27 GB)
- 1428_Gn_H3K27ac_merged_treat_pileup.bdg (352.75 MB)
- 1428_Gn_H3K4me3_merged_treat_pileup.bdg (204.23 MB)
- 1428_Gn_vgll3_merged_treat_pileup.bdg (468.27 MB)
- 1453_Gn_H3K27ac_merged_treat_pileup.bdg (1.20 GB)
- 1453_Gn_H3K4me3_merged_treat_pileup.bdg (875.01 MB)
- 1453_Gn_vgll3_merged_treat_pileup.bdg (1.32 GB)
- 1454_Gn_H3K27ac_merged_treat_pileup.bdg (1.31 GB)
- 1454_Gn_H3K4me3_merged_treat_pileup.bdg (1.10 GB)
- 1454_Gn_vgll3_merged_treat_pileup.bdg (234.45 MB)
- 1597_Gn_H3K27ac_merged_treat_pileup.bdg (1.11 GB)
- 1597_Gn_H3K4me3_merged_treat_pileup.bdg (623.40 MB)
- 1597_Gn_vgll3_merged_treat_pileup.bdg (917.97 MB)
- 1641_Gn_H3K27ac_merged_treat_pileup.bdg (673.73 MB)
- 1641_Gn_H3K4me3_merged_treat_pileup.bdg (424.95 MB)
- 1641_Gn_vgll3_merged_treat_pileup.bdg (807.74 MB)
- 1671_Gn_H3K27ac_merged_treat_pileup.bdg (851.56 MB)
- 1671_Gn_H3K4me3_merged_treat_pileup.bdg (1.10 GB)
- 1671_Gn_vgll3_merged_treat_pileup.bdg (450.87 MB)
- 1709_Gn_H3K27ac_merged_treat_pileup.bdg (918.23 MB)
- 1709_Gn_H3K4me3_merged_treat_pileup.bdg (1.27 GB)
- 1709_Gn_vgll3_merged_treat_pileup.bdg (354.86 MB)
- 1735_Gn_H3K27ac_merged_treat_pileup.bdg (754.48 MB)
- 1735_Gn_H3K4me3_merged_treat_pileup.bdg (612.05 MB)
- 1735_Gn_vgll3_merged_treat_pileup.bdg (311.85 MB)
- 1838_Gn_H3K27ac_merged_treat_pileup.bdg (950.51 MB)
- 1838_Gn_H3K4me3_merged_treat_pileup.bdg (935.85 MB)
- 1838_Gn_vgll3_merged_treat_pileup.bdg (174.15 MB)
- Aligned.primary.filtered.sortedByCoord.reheaded.RGadded.dupRem.blackList.bed (657.46 KB)
- blue_ensgenes_NCBIannot.txt (464.63 KB)
- ChIPmentationAllSamples.MACS3.Rmd (23.86 KB)
- ChIPmentationGenotypes.MACS3.Rmd (54.43 KB)
- ChIPmentationWGCNA.Rmd (7.12 KB)
- coral1_ensgenes_NCBIannot.txt (9.77 KB)
- cyan_ensgenes_NCBIannot.txt (102.75 KB)
- CytoscapeInput-edges-blue.txt (125.15 MB)
- CytoscapeInput-edges-magenta.txt (2.70 MB)
- CytoscapeInput-nodes-blue.txt (96.24 KB)
- CytoscapeInput-nodes-magenta.txt (25.57 KB)
- darkgreen_ensgenes_NCBIannot.txt (33.89 KB)
- expressed_NCBIannot.txt (5.51 MB)
- featureCountsEnsembl.RData (18.49 MB)
- geneCountsEnsembl.txt (5.92 MB)
- H3K27ac_peaks.txt (14.76 MB)
- H3K27acEE_peaks.txt (17.42 MB)
- H3K27acLLdown_peaks.txt (18.41 MB)
- H3K4me3_peaks.txt (11.74 MB)
- H3K4me3EE_peaks.txt (11.93 MB)
- H3K4me3LLdown_peaks.txt (14.43 MB)
- lavenderblush3_ensgenes_NCBIannot.txt (9.97 KB)
- lightyellow_ensgenes_NCBIannot.txt (44.50 KB)
- magenta_ensgenes_NCBIannot.txt (117.72 KB)
- midnightblue_ensgenes_NCBIannot.txt (67.81 KB)
- motifs_VGLL3StrongPromotersEnhancersEEknownResults.txt (51.69 KB)
- motifs_VGLL3StrongPromotersEnhancersLLknownResults.txt (51.74 KB)
- README.md (5.57 KB)
- RNAseqAnalysisRsubreadCountsInteractionContinuous.RData (389.89 MB)
- RNAseqAnalysisRsubreadCountsInteractionContinuous.Rmd (29.36 KB)
- RNAseqAnalysisWGCNAwithRUVcounts.Rmd (39.15 KB)
- Salmo_salar-GCA_905237065.2-2021_07-genes.gtf (1.57 GB)
- Salmo_salar-GCA_905237065.2-unmasked.sizes.genome (95.79 KB)
- salmon4_ensgenes_NCBIannot.txt (11.15 KB)
- sig_days_Annot_ensgenes_NCBIannot.txt (131.57 KB)
- sig_int_Annot_ensgenes_NCBIannot.txt (3.73 KB)
- sig_LLvsEE_Annot_ensgenes_NCBIannot.txt (8.48 KB)
- VGLL3_blue_neighbours.txt (28.31 KB)
- VGLL3broad_peaks.txt (8.68 MB)
- VGLL3EE_peaks.txt (4.77 MB)
- VGLL3LLdown_peaks.txt (6.31 MB)
- VGLL3PromotersEnhancersEEunique_ensgenes_NCBIannot.txt (1.02 MB)
- VGLL3PromotersEnhancersLLunique_ensgenes_NCBIannot.txt (1.48 MB)
- VGLL3PromotersEnhancersShared_ensgenes1_NCBIannot.txt (871.54 KB)
- VGLL3PromotersEnhancersShared_ensgenes2_NCBIannot.txt (892.48 KB)
Abstract
Pubertal age is an important life-history trait that is underlain by a relatively simple genetic architecture in Atlantic salmon (Salmo salar). Nearly 40% of pubertal age variation in natural populations is explained by genetic variation cosegregating with the transcription cofactor gene vestigial-like 3 (vgll3). Using controlled-crossed salmon homozygous for either the late (L) or early (E) maturation conferring allele of vgll3, we investigated the molecular mechanisms mediating vgll3 association with pubertal age. Salmon were produced by controlled crosses of gametes from alternative homozygous vgll3 genotypes from the "Oulujoki" stock obtained from the Finnish Natural Resources Institute (LUKE). Salmon were raised in common garden conditions in a recirculating aquatic facility at the University of Helsinki with natural temperature and photoperiod. Individually tagged and genotyped males were sacrificed and dissected periodically during the second year of growth and testis samples were taken from phenotypically immature males. One testis from each fish was flash-frozen in liquid nitrogen and stored at -80 °C until analysis of gene expression (RNA-seq). The other testis was frozen in a cryostorage buffer containing 10% DMSO with a 1 °C/min cooling rate and stored at -80 °C until analysis of chromatin modifications (H3K27ac, H3K4me3, and VGLL3 ChIPmentation). Our results showed that seasonal variation was a major driver of gene expression differences in the testes. In addition, multiple key puberty genes were upregulated in vgll3EE compared to vgll3LL genotypes, indicating that the vgll3 genotype mediates pubertal age differences by coordinated regulation of diverse cellular pathways including hormonal signalling, cell motility, TGF-β signalling, and cellular metabolism.
Gene co-expression modules differentially expressed between vgll3 genotypes showed an over-representation of corresponding cellular processes and functions, indicating that the vgll3 genotype has a large-scale influence on signalling pathway activity. Using ChIPmentation in paired samples from the same individuals, we identified enhancers (H3K27ac), promoters (H3K27ac and H3K4me3), and VGLL3 binding regions to test whether vgll3 function directly mediates the observed differences in gene expression and cellular phenotypes. Vgll3 genotypes showed marked differences in the activity of VGLL3 regulatory elements that are associated with unique cellular functions in each genotype, for example, signalling receptors and cell adhesion genes in vgll3EE and regulators of cell cycle progression in vgll3LL. Furthermore, the majority of differentially expressed genes (DEGs) between vgll3 genotypes were associated with VGLL3 binding regions, suggesting that differential expression may be directly mediated by functional differences in VGLL3 protein. Taken together, these results indicate that VGLL3 is widely associated with gene regulatory regions in immature testes and suggest that vgll3 genotype has a wide-scale influence on cellular physiology and development through coordinated regulation of distinct genomic loci and cellular functions. Despite the relatively simple genetic architecture of pubertal age variation in Atlantic salmon, the mechanism acting through the transcription cofactor vgll3 integrates the regulation of multiple distinct signalling pathways and developmental programs. Overall, our results exemplify a hidden complexity of molecular mechanisms mediating the large, pleiotropic effect of single genes on alternative life histories.
README: Data for: A complex mechanism translating variation of a simple genetic architecture into alternative life histories
https://doi.org/10.5061/dryad.vhhmgqp1g
This dataset contains custom R code to analyse RNA-seq and ChIPmentation data produced from Atlantic salmon testes. RNA-seq count tables, MACS3 output files, other output files, and the salmon genome version used in the analysis are included in the dataset as stand-alone files or RData objects.
Description of the data and file structure
Most primary RNA-seq data are in the "featureCountsEnsembl.RData" object, which holds RNA-seq counts summarised over gene models (from Rsubread featureCounts). Counts can also be accessed as a stand-alone file, "geneCountsEnsembl.txt". RNA-seq data are analysed with code in the RNAseqAnalysisRsubreadCountsInteractionContinuous.Rmd file and subsequently with code in the RNAseqAnalysisWGCNAwithRUVcounts.Rmd file.
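For users working outside R, the stand-alone count table can be read with a few lines of Python. This is only a minimal sketch: it assumes (the README does not state this) that geneCountsEnsembl.txt is tab-delimited, with a gene-identifier column followed by one column per sample. If the header row has one field fewer than the data rows, as R's write.table produces when row names are written, the header handling needs adjusting. The function names and the toy data are hypothetical.

```python
import csv
import io


def read_count_table(handle, delimiter="\t"):
    """Read a gene-by-sample count table into {gene_id: {sample: count}}.

    Assumes (not confirmed by the README) a header row whose first field
    labels the gene-identifier column and whose remaining fields are
    sample names.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    samples = header[1:]
    counts = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene_id, values = row[0], row[1:]
        counts[gene_id] = {s: int(float(v)) for s, v in zip(samples, values)}
    return counts


def library_sizes(counts):
    """Sum counts per sample (total reads assigned to genes per library)."""
    totals = {}
    for per_sample in counts.values():
        for sample, value in per_sample.items():
            totals[sample] = totals.get(sample, 0) + value
    return totals


# Toy data standing in for the real file; gene and sample names are made up.
example = io.StringIO("gene\tS1\tS2\ngeneA\t10\t0\ngeneB\t5\t7\n")
table = read_count_table(example)
print(library_sizes(table))
```

With the real file, the same function would be called on `open("geneCountsEnsembl.txt", newline="")` instead of the in-memory example.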
ChIPmentation peak data are provided as MACS3 peak tables (the tab-delimited format that MACS3 writes with an .xls extension, included here as the files ending in "_peaks.txt") or as bedgraph (.bdg) output files from MACS3. Transcription factor motif overrepresentation results are provided as .txt files from HOMER. ChIPmentation data are analysed with code in the ChIPmentationAllSamples.MACS3.Rmd, ChIPmentationGenotypes.MACS3.Rmd and ChIPmentationWGCNA.Rmd files. Comparison with gene expression requires that the RNA-seq analysis scripts are run first and the results saved as RData objects (provided in the dataset).
Sharing/Access information
Links to other publicly accessible locations of the data:
- SRA accession PRJNA1042649
Usage notes
The scripts in this dataset were run on R version 4.2.0 and require the following libraries:
- DESeq2
- vsn
- AnnotationHub
- WGCNA
- ggplot2
- magrittr
- GenomicFeatures
- ChIPpeakAnno
- reshape2
- clusterProfiler
- viridis
- stringr
- biomaRt
- UpSetR
- pheatmap
- RColorBrewer
- ggrepel
- igraph
- network
- sna
- ggraph
- visNetwork
- threejs
- networkD3
- ndtv
The analysis workflow runs the scripts in the following order:
- RNAseqAnalysisRsubreadCountsInteractionContinuous.Rmd
- RNAseqAnalysisWGCNAwithRUVcounts.Rmd
- ChIPmentationAllSamples.MACS3.Rmd
- ChIPmentationGenotypes.MACS3.Rmd
- ChIPmentationWGCNA.Rmd
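The script order above can be captured in a small driver. The sketch below (in Python, not part of the dataset) only builds the command lines. It assumes the scripts are rendered non-interactively with `rmarkdown::render()`, which is an assumption: the README does not say how the authors ran them, and the rmarkdown package is not in the library list above.

```python
import shlex

# Script order as listed in the README workflow.
WORKFLOW = [
    "RNAseqAnalysisRsubreadCountsInteractionContinuous.Rmd",
    "RNAseqAnalysisWGCNAwithRUVcounts.Rmd",
    "ChIPmentationAllSamples.MACS3.Rmd",
    "ChIPmentationGenotypes.MACS3.Rmd",
    "ChIPmentationWGCNA.Rmd",
]


def render_commands(scripts=WORKFLOW):
    """Build one `Rscript -e` call per R Markdown file, in workflow order."""
    commands = []
    for script in scripts:
        # Assumption: rendering via rmarkdown::render(); adjust if the
        # scripts are meant to be run chunk by chunk in an R session.
        expr = f'rmarkdown::render("{script}")'
        commands.append(["Rscript", "-e", expr])
    return commands


for cmd in render_commands():
    print(shlex.join(cmd))
```

Each command list could be passed to `subprocess.run(cmd, check=True)` so that a failing script stops the workflow before the later scripts, which depend on its saved RData objects, are run.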
Files
This submission contains the following files (by format):
.txt files
Files containing read counts for RNA-seq data.
- geneCountsEnsembl.txt
Files containing results from differential expression analysis. Filename describes tested effect (genotype "LLvsEE", sampling date "days", interaction between genotype and sampling date "int").
- sig_LLvsEE_Annot_ensgenes_NCBIannot.txt
- sig_days_Annot_ensgenes_NCBIannot.txt
- sig_int_Annot_ensgenes_NCBIannot.txt
Files containing results from gene coexpression network analysis. Gene annotations for module genes.
- magenta_ensgenes_NCBIannot.txt
- cyan_ensgenes_NCBIannot.txt
- coral1_ensgenes_NCBIannot.txt
- lavenderblush3_ensgenes_NCBIannot.txt
- lightyellow_ensgenes_NCBIannot.txt
- darkgreen_ensgenes_NCBIannot.txt
- blue_ensgenes_NCBIannot.txt
- salmon4_ensgenes_NCBIannot.txt
- midnightblue_ensgenes_NCBIannot.txt
Files containing edges and nodes for specific coexpression modules.
- CytoscapeInput-nodes-magenta.txt
- CytoscapeInput-edges-magenta.txt
- CytoscapeInput-nodes-blue.txt
- CytoscapeInput-edges-blue.txt
- VGLL3_blue_neighbours.txt
Files containing results from transcription factor binding motif analysis in HOMER:
- motifs_VGLL3StrongPromotersEnhancersEEknownResults.txt
- motifs_VGLL3StrongPromotersEnhancersLLknownResults.txt
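A minimal Python sketch for ranking motifs in these files follows. It assumes the files keep HOMER's standard knownResults.txt layout, in which the first five tab-delimited columns are motif name, consensus, P-value, log P-value, and Benjamini q-value. Columns are read by position because some HOMER header names embed sequence counts. The function names and the toy rows are hypothetical and are not results from this dataset.

```python
import csv
import io


def read_homer_known(handle):
    """Parse a HOMER knownResults-style table into a list of dicts."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    rows = []
    for fields in reader:
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        rows.append({
            "motif": fields[0],
            "consensus": fields[1],
            "p_value": float(fields[2]),
            "log_p": float(fields[3]),
            "q_value": float(fields[4]),
        })
    return rows


def top_motifs(rows, n=10):
    """Return the n most enriched motifs (most negative log P-value first)."""
    return sorted(rows, key=lambda r: r["log_p"])[:n]


# Toy table with made-up motifs, for illustration only.
toy = io.StringIO(
    "Motif Name\tConsensus\tP-value\tLog P-value\tq-value (Benjamini)\n"
    "motifA\tACGTAC\t1e-3\t-6.908\t0.01\n"
    "motifB\tGGAATG\t1e-10\t-23.03\t0.0001\n"
)
motifs = read_homer_known(toy)
print([m["motif"] for m in top_motifs(motifs, n=2)])
```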
Files containing ChIPmentation peaks called by MACS3 including genomic locations, read pileups, fold enrichments, and p-values. Filenames describe the antigen (H3K27ac, H3K4me3, VGLL3) and the vgll3 genotype (EE, LL).
- H3K27ac_peaks.txt
- H3K4me3_peaks.txt
- VGLL3broad_peaks.txt
- H3K27acEE_peaks.txt
- H3K4me3EE_peaks.txt
- VGLL3EE_peaks.txt
- VGLL3LLdown_peaks.txt
- H3K27acLLdown_peaks.txt
- H3K4me3LLdown_peaks.txt
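The peak tables can be read outside R along the following lines. This Python sketch assumes the files keep the layout MACS writes for its peak tables: optional comment lines starting with "#", then a tab-delimited table with a header row. The column name `fold_enrichment` is the one MACS uses, but the columns in these particular files have not been checked, so treat the names as assumptions. The toy rows are made up.

```python
import csv
import io


def read_peaks(handle):
    """Read a MACS-style peak table, skipping '#' comments and blank lines."""
    lines = (ln for ln in handle if ln.strip() and not ln.startswith("#"))
    return list(csv.DictReader(lines, delimiter="\t"))


def filter_by_fold(peaks, min_fold, column="fold_enrichment"):
    """Keep peaks whose fold enrichment is at least min_fold."""
    return [p for p in peaks if float(p[column]) >= min_fold]


# Toy table for illustration; coordinates and scores are invented.
toy = io.StringIO(
    "# toy comment line\n"
    "\n"
    "chr\tstart\tend\tlength\tpileup\tfold_enrichment\tname\n"
    "chrA\t101\t600\t500\t25.0\t6.1\tpeak_1\n"
    "chrA\t2001\t2300\t300\t8.0\t2.4\tpeak_2\n"
)
peaks = read_peaks(toy)
strong = filter_by_fold(peaks, min_fold=5.0)
print([p["name"] for p in strong])
```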
Files containing results of ChIPmentation and RNA-seq analysis. Each row is an expressed gene assigned to a VGLL3 promoter or enhancer. Gene sets are either unique to vgll3 genotypes ("EEunique", "LLunique") or shared between genotypes ("Shared").
- VGLL3PromotersEnhancersEEunique_ensgenes_NCBIannot.txt
- VGLL3PromotersEnhancersLLunique_ensgenes_NCBIannot.txt
- VGLL3PromotersEnhancersShared_ensgenes1_NCBIannot.txt
- VGLL3PromotersEnhancersShared_ensgenes2_NCBIannot.txt
RData files
These files contain results and data of RNA-seq analysis in R.
- featureCountsEnsembl.RData
- RNAseqAnalysisRsubreadCountsInteractionContinuous.RData
RMarkdown files
These files contain scripts used in the analysis.
- RNAseqAnalysisRsubreadCountsInteractionContinuous.Rmd
- RNAseqAnalysisWGCNAwithRUVcounts.Rmd
- ChIPmentationAllSamples.MACS3.Rmd
- ChIPmentationGenotypes.MACS3.Rmd
- ChIPmentationWGCNA.Rmd
Bedgraph files
These files (.bdg) contain ChIPmentation read pileups for visualization in a genome browser or R. The filename specifies the individual sample (four digits), tissue (gonad, "Gn"), and antigen (H3K27ac, H3K4me3, or VGLL3, written "vgll3" in the filenames). Information on the genotype of each individual sample and the sampling date can be found in the RNAseqAnalysisRsubreadCountsInteractionContinuous.RData object under "design".
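The filename convention and the bedGraph layout (chromosome, 0-based start, end, value) can be handled with a short Python sketch such as the one below. The regular expression follows the filenames in this dataset. The helper names are hypothetical, and the averaging function is only an illustration of reading pileup values, not a step of the authors' analysis.

```python
import re

# <sample>_<tissue>_<antigen>_merged_treat_pileup.bdg
BDG_NAME = re.compile(
    r"^(?P<sample>\d{4})_(?P<tissue>[A-Za-z]+)_(?P<antigen>[A-Za-z0-9]+)"
    r"_merged_treat_pileup\.bdg$"
)


def parse_bdg_name(filename):
    """Split a bedgraph filename into sample, tissue, and antigen."""
    match = BDG_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected bedgraph filename: {filename}")
    return match.groupdict()


def mean_pileup(lines, chrom, lo, hi):
    """Length-weighted mean pileup over the covered part of [lo, hi)."""
    covered = 0
    total = 0.0
    for line in lines:
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue  # skip blank lines and optional header lines
        c, start, end, value = line.split()[:4]
        if c != chrom:
            continue
        overlap = min(int(end), hi) - max(int(start), lo)
        if overlap > 0:
            covered += overlap
            total += overlap * float(value)
    return total / covered if covered else 0.0


info = parse_bdg_name("1148_Gn_H3K27ac_merged_treat_pileup.bdg")
toy_lines = ["chrA\t0\t100\t2.0", "chrA\t100\t200\t4.0"]  # made-up values
print(info, mean_pileup(toy_lines, "chrA", 50, 150))
```

Because the .bdg files are large (up to 1.39 GB), iterating over an open file handle line by line, as `mean_pileup` does, avoids loading a whole file into memory.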
Other formats
Genome files, genome annotations, and blacklist regions used in the analysis.
- Salmo_salar-GCA_905237065.2-unmasked.sizes.genome
- Salmo_salar-GCA_905237065.2-2021_07-genes.gtf
- Aligned.primary.filtered.sortedByCoord.reheaded.RGadded.dupRem.blackList.bed
Methods
Overall design:
Immature testis samples from 20 fish (11 vgll3EE and 9 vgll3LL) were analysed for gene expression. Chromatin profiles were analysed for 19 fish (9 vgll3EE and 10 vgll3LL), of which 17 were also among the RNA-seq samples. Samples represented a continuum of sampling dates from early April to December of the fish's second year.
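The sample numbers above imply how many fish contribute to only one assay. The short Python sketch below works through that arithmetic; the function name is hypothetical. It also checks the expected number of bedgraph files, on the assumption that each of the 19 ChIPmentation fish has one pileup per antigen, which agrees with the 57 .bdg files listed for this dataset.

```python
def design_overlap(rna_n, chip_n, shared_n):
    """Split two overlapping sample sets into exclusive and shared counts."""
    if shared_n > min(rna_n, chip_n):
        raise ValueError("shared count cannot exceed either assay's sample count")
    return {
        "rna_only": rna_n - shared_n,
        "chip_only": chip_n - shared_n,
        "shared": shared_n,
        "total_fish": rna_n + chip_n - shared_n,
    }


rna_n = 11 + 9    # vgll3EE + vgll3LL fish with RNA-seq
chip_n = 9 + 10   # vgll3EE + vgll3LL fish with chromatin profiles
summary = design_overlap(rna_n, chip_n, shared_n=17)
bedgraph_files = chip_n * 3  # one pileup per antigen: H3K27ac, H3K4me3, VGLL3
print(summary, bedgraph_files)
```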
RNA-seq methods:
Total RNA was extracted from flash-frozen testes using the Macherey-Nagel Nucleospin 96 RNA kit and quantified using a Qubit instrument. 100 ng of total RNA was used for RNA-seq library construction with an Illumina stranded mRNA kit following the manufacturer's instructions. Libraries were sequenced at the University of Helsinki Institute of Biotechnology sequencing service on a NextSeq500 instrument with 75 bp paired-end reads.
ChIPmentation methods:
Cryostored testis tissue was removed from -80 °C storage and rapidly thawed in a 37 °C water bath. Thawed tissue was rinsed with ice-cold D-PBS and transferred to a clean tube. Tissue was homogenised in 250 μl ice-cold D-PBS using an OMNI Bead Ruptor Elite instrument with 7 ml tubes and 2.4 mm ceramic beads, and a program with one 5 s burst at speed 2.4. The cell suspension was filtered through a Flowmi Cell Strainer and inspected under a microscope with Trypan blue staining. The cell suspension was diluted with room-temperature D-PBS to a final volume of 450 μl and cross-linked with Diagenode ChIP crosslinking gold at 1X concentration for 30 min, followed by fixation with 1% formaldehyde for 2 min. Formaldehyde was quenched in 0.125 M glycine for 5 min and cells were collected by centrifugation at 400 g for 10 min. Cells were washed twice with 500 μl ice-cold PBS, with centrifugation at 400 g for 10 min between washes.
Cells were subjected to ChIPmentation with the Thermo MAGnify ChIP kit and Illumina Tn5 reagents as follows. Cells were collected by centrifugation, resuspended in 50 μl lysis buffer supplemented with protease inhibitors, and lysed on ice for 5 min. Chromatin was sheared in 50 μl volumes using a Bioruptor device at the high power setting with 3x eight cycles of 30 s on, 30 s off. Debris was pelleted by centrifugation and sheared chromatin was diluted to four equal aliquots of 100 μl using dilution buffer supplemented with protease inhibitors. One aliquot of sheared chromatin was reserved as input control. The remaining three aliquots were immunoprecipitated at 4 °C overnight using 1 μg of Abcam ab4729, 2 μg of Abcam ab8580, and 10 μg of a custom-procured anti-VGLL3 antibody on ThermoFisher Dynabeads Protein A/G. Beads were subsequently washed following the MAGnify kit protocol, with an additional final wash using 150 μl of ice-cold 10 mM Tris (pH 8). Bead-bound chromatin was then treated in a 20 μl tagmentation reaction containing Illumina Tn5 transposase for 5 min at 37 °C. Input controls were treated with tagmentation reaction for 5 min at 55 °C. Tagmentation was terminated by adding 7.5 volumes of RIPA buffer and incubating on ice for 5 min. Chromatin was subsequently washed twice with 150 μl of ice-cold RIPA and TE buffer. Crosslinks were reversed using a proteinase K treatment and ChIPmentation DNA was captured using Macherey-Nagel NucleoMag magnetic beads. ChIPmentation libraries were measured using a Qubit instrument and a control PCR was run with Nextera sequencing oligos to assess library amplification on agarose gel. Finally, libraries were indexed, pooled, and sequenced at the University of Helsinki Institute of Biotechnology sequencing and FIMM sequencing services using NextSeq500 (75 bp paired-end) and NovaSeq6000 (150 bp paired-end) instruments, respectively.
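Some of the quantities in this protocol follow from simple arithmetic, sketched below in Python. The sketch rests on assumptions that are not stated in the methods: that "3x eight cycles" means 24 sonication cycles in total, that the full 50 μl of sheared chromatin is carried into the dilution, and that the 7.5 volumes of RIPA buffer refer to the 20 μl tagmentation reaction. The function names are hypothetical.

```python
def sonication_minutes(rounds, cycles_per_round, on_s, off_s):
    """Return (minutes of sonication, minutes of total run time)."""
    cycles = rounds * cycles_per_round
    return cycles * on_s / 60, cycles * (on_s + off_s) / 60


def dilution_factor(start_ul, aliquots, aliquot_ul):
    """Fold dilution when start_ul is brought up to aliquots * aliquot_ul."""
    return aliquots * aliquot_ul / start_ul


def stop_buffer_ul(reaction_ul, volumes):
    """Volume of stop buffer for a given number of reaction volumes."""
    return reaction_ul * volumes


# Assumption: 3 rounds x 8 cycles, each cycle 30 s on and 30 s off.
on_min, total_min = sonication_minutes(3, 8, 30, 30)
# Assumption: all 50 ul of sheared chromatin goes into 4 x 100 ul aliquots.
fold = dilution_factor(50, 4, 100)
# Assumption: 7.5 volumes relative to the 20 ul tagmentation reaction.
ripa_ul = stop_buffer_ul(20, 7.5)
print(on_min, total_min, fold, ripa_ul)
```

Under these assumptions the chromatin receives 12 min of sonication over a 24 min run, is diluted 8-fold before immunoprecipitation, and the tagmentation is stopped with 150 μl of RIPA buffer.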