Prevalent and dynamic binding of the cell cycle checkpoint kinase Rad53 to gene promoters
Data files
Dec 18, 2022 version files 79.63 GB
- ChIP_seq_all.zip
- EdU_seq.zip
- LID300600.zip
- LID301136_2_1.zip
- LID301136_2_2.zip
- LID301417.zip
- LID301760.zip
- LID302393.zip
- LID303277.zip
- LID305186_2_1.zip
- LID305186_2_2.zip
- LID305218.zip
- LID305704.zip
- LID305826.zip
- LID306560.zip
- LID306595.zip
- LID306600.zip
- README_Datasets_for_Prevalent_and_dynamic_binding_of_the_cell_cycle_checkpoint_kinase_Rad53_to_gene_promoters.rtf
- RNA_seq.zip
Abstract
Replication of the genome must be coordinated with gene transcription and cellular metabolism, especially following replication stress in the presence of limiting deoxyribonucleotides. The S. cerevisiae Rad53 (CHEK2 in mammals) checkpoint kinase plays a major role in cellular responses to DNA replication stress. Cell-cycle-regulated, genome-wide binding of Rad53 to chromatin was examined. Under replication stress, the kinase bound to sites of active DNA replication initiation and fork progression, but unexpectedly to the promoters of about 20% of genes encoding proteins involved in multiple cellular functions. Rad53 promoter binding correlated with changes in expression of a subset of genes. Rad53 promoter binding to certain genes was influenced by sequence-specific transcription factors and less by checkpoint signaling. However, in checkpoint mutants, untimely activation of late-replicating origins reduces the transcription of nearby genes, with concomitant localization of Rad53 to their gene bodies. We suggest that the Rad53 checkpoint kinase coordinates genome-wide replication and transcription under replication stress conditions.
Methods
Isolation and preparation of DNA for whole-genome replication profile analysis
The protocol was modified from one described previously (Sheu et al., 2016, 2014). Briefly, yeast cells were synchronized in G1 with α-factor and released into medium containing 0.2 mg/mL pronase E and 0.5 mM 5-ethynyl-2′-deoxyuridine (EdU), with or without 200 mM HU as indicated in the main text. At the indicated time points, cells were collected for preparation of genomic DNA. The genomic DNA was fragmented, biotinylated, and then purified. Libraries for Illumina sequencing were constructed using the TruSeq ChIP Library Preparation Kit (Illumina). Libraries were pooled and submitted for 50 bp paired-end sequencing.
Sample preparation for chromatin immunoprecipitation coupled to deep sequencing (ChIP-seq)
Chromatin immunoprecipitation (ChIP) was performed as described (Behrouzi et al., 2016) with modifications. About 10^9 synchronized yeast cells were fixed with 1% formaldehyde for 15 min at room temperature (RT), then quenched with 130 mM glycine for 5 min at RT, harvested by centrifugation, washed twice with TBS (50 mM Tris-HCl pH 7.6, 150 mM NaCl), and flash frozen. Cell pellets were resuspended in 600 µl lysis buffer (50 mM HEPES-KOH pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% Na-deoxycholate, 0.1% SDS, 1 mM PMSF, protease inhibitor tablet (Roche)) and disrupted by bead beating using a multi-tube vortexer (Multi-Tube Vortexer, Baxter Scientific Products) for 12-15 cycles of 30 seconds of vortexing at maximum intensity. Cell extracts were collected and sonicated using a Bioruptor (UCD-200, Diagenode) for 38 cycles of 30 seconds “ON” and 30 seconds “OFF” at amplitude setting High (H). The extract was centrifuged for 5 min at 14,000 rpm. The soluble chromatin was used for IP.
Antibodies against Cdc45 (CS1485, this lab (Sheu and Stillman, 2006)), Rad53 (ab104232, Abcam), and γ-H2A (ab15083, Abcam) were preincubated with washed Dynabeads Protein A/G (Invitrogen, 1002D and 1004D). For each immunoprecipitation, 80 μl antibody-coupled beads were added to soluble chromatin. Samples were incubated overnight at 4°C with rotation, after which the beads were collected on magnetic stands, washed 3 times with 1 ml lysis buffer and once with 1 ml TE, and eluted with 250 μl preheated buffer (50 mM Tris-HCl pH 8.0, 10 mM EDTA, 1% SDS) at 65°C for 15 min. Immunoprecipitated samples were incubated overnight at 65°C to reverse crosslinks and treated with 50 μg RNase A at 37°C for 1 hr. Then 5 μl proteinase K (Roche) was added and incubation was continued at 55°C for 1 hr. Samples were purified using the MinElute PCR purification kit (Qiagen). Libraries for Illumina sequencing were constructed using the TruSeq ChIP Library Preparation Kit (Illumina, IP-202-1012 and IP-202-1024).
The duplicate Rad53 ChIP-Seq data were compared to published ChIP-Seq data for Swi6 (Park et al., 2013) (SRX360900: GSM1241092: swi6_DMSO_illumina; Saccharomyces cerevisiae; ChIP-Seq) by computing Gini indexes from calculated Lorenz curves (Andri Signorell et mult. al. (2021). DescTools: Tools for Descriptive Statistics. R package version 0.99.41, https://cran.r-project.org/package=DescTools).
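The relationship between a Lorenz curve and its Gini index can be illustrated with a minimal Python sketch. This is not the DescTools implementation used for the dataset: the function names `lorenz_curve` and `gini_index` are hypothetical, the input is assumed to be a list of non-negative per-bin signal values, and no small-sample correction is applied (DescTools may apply one, so values can differ slightly).

```python
def lorenz_curve(values):
    """Return (population_share, signal_share) lists, both starting at 0.

    `values` is assumed to hold non-negative per-bin signal (hypothetical input).
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    population = [0.0]
    share = [0.0]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        population.append(i / n)        # fraction of bins seen so far
        share.append(running / total)   # fraction of total signal they hold
    return population, share


def gini_index(values):
    """Gini index = 1 - 2 * (area under the Lorenz curve), trapezoid rule."""
    population, share = lorenz_curve(values)
    area = 0.0
    for i in range(1, len(population)):
        width = population[i] - population[i - 1]
        area += width * (share[i] + share[i - 1]) / 2
    return 1 - 2 * area
```

A perfectly uniform signal gives an index of 0, and a signal concentrated in a few bins gives an index closer to 1, which is what makes the index useful for comparing how focused two ChIP-seq profiles are.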
Sample preparation for RNA seq
About 2-3 × 10^8 flash-frozen yeast cells were resuspended in Trizol (cell pellet: Trizol = 1:10), vortexed for 15 sec, and incubated at 25ºC for 5 min. Chloroform was added (200 μl per 1 ml of Trizol-cell suspension), and the mixture was vortexed for 15 sec, incubated at room temperature for 5 min, and centrifuged to recover the aqueous layer. The RNA in the aqueous layer was further purified and concentrated using a PureLink Column (Invitrogen, 12183018A). The RNA was eluted in 50 µl and stored at -20ºC if not used immediately, or at -80ºC for long-term storage. Paired-end RNA-seq libraries were prepared using the TruSeq stranded mRNA library preparation kit (Illumina, 20020594).
Generation of coverage tracks using the Galaxy platform
For visualization of read coverage in the Integrated Genome Browser (Freese et al., 2016), the coverage tracks were generated using the Galaxy platform maintained by the Bioinformatics Shared Resource (BSR) of Cold Spring Harbor Lab. The paired-end reads from each library were trimmed to 31 bases and mapped to the sacCer3 genome using Bowtie (Langmead, 2010). The coverage track of mapped reads was then generated using bamCoverage (Ramírez et al., 2014) with normalization to 1x genome.
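The idea behind normalization to 1x genome coverage can be sketched as follows. This is an illustration of the principle only, not the bamCoverage code: the function name `normalize_to_1x` and its parameters are hypothetical, and the input is assumed to be the mean per-base depth of each fixed-size bin. The coverage is rescaled so that the total number of sequenced bases equals the effective genome size, which gives an average depth of 1.

```python
def normalize_to_1x(bin_coverage, bin_size, effective_genome_size):
    """Rescale per-bin depth so the genome-wide average depth becomes 1.

    bin_coverage: mean per-base read depth in each bin (hypothetical input)
    bin_size: bin width in bp
    effective_genome_size: mappable genome size in bp
    """
    total_bases = sum(depth * bin_size for depth in bin_coverage)
    if total_bases <= 0:
        raise ValueError("no coverage to normalize")
    scale = effective_genome_size / total_bases
    return [depth * scale for depth in bin_coverage]
```

After this scaling, a value of 2 in a track means twice the genome-wide average depth, so tracks from libraries of different sizes become directly comparable.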
Definition of the origin-types
Based on the bamCoverage output for EdU signal in WT, rad53K227A and mrc1Δ, we categorized the 829 origins listed in the OriDB database (Siow et al., 2012). We define early origins as those whose signal at the first time point is larger than 2. Late origins are taken from the remaining origins if the average signal value at the later time point is larger than 2 in the rad53K227A and mrc1Δ mutants. Among the 829 entries in OriDB, we defined 521 as active origins (with EdU signal in WT or in the checkpoint mutants rad53K227A and mrc1Δ), of which 256 were categorized as early origins (with EdU signal in WT) and 265 as late origins (with signal in the checkpoint mutants but not in WT). The remaining 308 entries did not have significant signal under our conditions and were deemed inactive origins.
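The classification rule can be written as a short decision function. The sketch below rests on two assumptions that the description does not spell out: the early-origin test uses the WT signal at the first time point, and the late-origin test averages the later-time-point signal across the two checkpoint mutants. The names `classify_origin` and `THRESHOLD` are hypothetical.

```python
THRESHOLD = 2.0  # normalized EdU coverage cutoff stated in the text


def classify_origin(wt_first_timepoint, mutant_late_signals):
    """Label one origin as 'early', 'late', or 'inactive'.

    wt_first_timepoint: WT EdU signal at the first time point (assumption)
    mutant_late_signals: later-time-point signals in the checkpoint mutants,
        e.g. [rad53K227A_signal, mrc1_delta_signal] (assumption: averaged)
    """
    if wt_first_timepoint > THRESHOLD:
        return "early"
    mean_mutant = sum(mutant_late_signals) / len(mutant_late_signals)
    if mean_mutant > THRESHOLD:
        return "late"
    return "inactive"
```

Because the early test is applied first, an origin that fires in WT is never relabeled as late, which matches the description of late origins as those with signal in the checkpoint mutants but not in WT.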
Computational analysis of sequence data
The sequenced reads were trimmed by cutadapt with the “nextseq-trim” option, then aligned by STAR (Dobin et al., 2013) in paired-end mode to the sacCer3 genome masked at repetitive regions. Gene structures were taken from the SGD reference genome annotation R64.1.1 as of Oct. 2018. For RNA-seq quantification analysis, the total counts of aligned reads were computed for each gene by applying the “GeneCounts” mode. For ChIP-seq quantification analysis, the reads were mapped using the same pipeline. Additionally, peak calling was done by MACS2 in narrow peak mode.
Gene expression analysis
Differentially expressed genes (DEGs) and their p-values were computed for each pair of cases by nbinomWaldTest after size factor normalization using DESeq2 (Love et al., 2014). Using the list of DEGs, GO and KEGG enrichment analyses were performed via the Pathview library. ClusterProfiler was applied to visualize fold changes of DEGs in each KEGG pathway. Co-expression analysis of significant DEGs was further performed based on the co-expression network constructed in CoCoCoNet (Lee et al., 2020). CoCoCoNet has established a co-expression matrix of Spearman’s correlation rankings based on 2,690 samples downloaded from the SRA database. We clustered the correlation matrix downloaded from CoCoCoNet (yeast_metaAggnet) by dynamicTreeCut in R (or hierarchical clustering) to obtain at most 10 clusters. The enrichment analysis for the gene set of each cluster was performed in the same way as for the RNA-seq analysis.
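Size factor normalization in DESeq2 is based on the median-of-ratios method (Love et al., 2014). The sketch below illustrates that method in plain Python under simplifying assumptions; it is not the DESeq2 code, and the function name `size_factors` and the dictionary layout of the count table are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw gene counts,
            with genes in the same order for every sample (hypothetical layout).
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Log geometric mean per gene, using only genes with no zero counts.
    log_geo_means = []
    for g in range(n_genes):
        gene_counts = [counts[s][g] for s in samples]
        if all(c > 0 for c in gene_counts):
            mean_log = sum(math.log(c) for c in gene_counts) / len(gene_counts)
            log_geo_means.append((g, mean_log))
    if not log_geo_means:
        raise ValueError("every gene has a zero count in some sample")
    # Size factor = median ratio of a sample's counts to the geometric means.
    return {
        s: math.exp(median(math.log(counts[s][g]) - m for g, m in log_geo_means))
        for s in samples
    }
```

Dividing each sample's counts by its size factor puts libraries of different sequencing depth on a common scale before the Wald test is applied.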
ChIP-seq signal normalization
For ChIP-seq signal normalization, two different methods were applied depending on the type of analysis. For ChIP-seq residual analysis, we used simple normalization. In this process, each case sample is compared with the corresponding DNA input control to compute log2 fold changes within each 25 bp window, with the counts of each sample reciprocally scaled by multiplying by the total read count of the other sample. The fold changes are then averaged over the duplicates. For ChIP-seq heatmap analysis, we employed origin-aware normalization to account for the higher background around origins that results from DNA replication. In origin-aware normalization, the same computation used in simple normalization (log2 fold change with scaling by the total read count) is applied independently to the origin-proximal regions and to all other regions. For the heatmaps presented in this paper, the origin-proximal region is defined as the region within 5,000 bp upstream and downstream of an origin.
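Both normalization schemes can be sketched in a few lines of Python. Several details here are assumptions rather than the authors' implementation: the `pseudocount` that avoids division by zero, the choice to compute the total read counts within each region class for the origin-aware variant, and all function names.

```python
import math


def log2_fold_change(chip, control, chip_total, control_total, pseudocount=1.0):
    """Per-window log2(ChIP / input), each count scaled by the other sample's total."""
    return [
        math.log2(((c + pseudocount) * control_total)
                  / ((i + pseudocount) * chip_total))
        for c, i in zip(chip, control)
    ]


def origin_aware_log2fc(chip, control, near_origin, pseudocount=1.0):
    """Apply the same computation separately to origin-proximal and other windows.

    near_origin: list of booleans, True if the window lies within the
                 origin-proximal region (for example, 5,000 bp of an origin).
    """
    result = [0.0] * len(chip)
    for flag in (True, False):
        idx = [k for k, near in enumerate(near_origin) if near == flag]
        if not idx:
            continue
        # Assumption: totals are taken over the windows of this class only.
        chip_total = sum(chip[k] for k in idx)
        control_total = sum(control[k] for k in idx)
        values = log2_fold_change(
            [chip[k] for k in idx], [control[k] for k in idx],
            chip_total, control_total, pseudocount,
        )
        for k, v in zip(idx, values):
            result[k] = v
    return result


def average_replicates(rep1, rep2):
    """Average the per-window log2 fold changes of two duplicates."""
    return [(a + b) / 2 for a, b in zip(rep1, rep2)]
```

Treating the two region classes separately keeps the elevated replication-associated background near origins from inflating or suppressing the fold changes computed elsewhere in the genome.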
Heatmap analyses at origins and TSS
After the average fold change computation and normalization of the ChIP-seq signals, the signal strength around target regions such as TSSs and replication origins is extracted using the normalizeToMatrix function in EnrichedHeatmap (window size 25 bp, average mode w0) and visualized. We ordered the heatmap rows to examine differences in signal enrichment patterns according to the characteristics of each origin or gene. For ChIP-seq signals around replication origins, the rows are ordered by the assigned replication timing. The replication time of each origin is annotated using previously published replication timing data (Yabuki et al., 2002). From the estimated replication time for each 1,000 bp window, we took the window closest to the center of each replication origin and assigned its value as the representative replication timing if the distance between them is no more than 5,000 bp. Early and late origin groups are categorized according to the definition of the origin-types using the replication profile data from this study. The final set of replication origins used in the heatmap analysis is obtained after filtering out origins that overlap any of the 238 hyper-ChIPable regions defined in a previous study (Teytelman et al., 2013). In total, 167 early and 231 late origins pass this filter and are used in the heatmap analysis in this study. For heatmaps of the ChIP-seq signals around TSSs, we ordered genes by RNA-seq fold change, either for all DEGs or per co-expression cluster of DEGs based on the gene co-expression network constructed in CoCoCoNet (Lee et al., 2020).
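The assignment of a representative replication time to each origin amounts to a nearest-window lookup with a distance cutoff. A minimal sketch, assuming the timing data for one chromosome are available as (window center, replication time) pairs; the function name and data layout are hypothetical:

```python
def assign_replication_time(origin_center, windows, max_distance=5000):
    """Return the replication time of the nearest window, or None if too far.

    origin_center: origin midpoint in bp
    windows: list of (window_center_bp, replication_time) tuples for the
             same chromosome (hypothetical layout)
    max_distance: largest allowed distance in bp (5,000 in the text)
    """
    if not windows:
        return None
    center, rep_time = min(windows, key=lambda w: abs(w[0] - origin_center))
    if abs(center - origin_center) <= max_distance:
        return rep_time
    return None
```

Origins that receive `None` have no timing window nearby and cannot be placed in the timing-ordered heatmap rows.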
ChIP-seq residual analysis
To detect time-dependent increases or decreases in Rad53 binding signals, we first focused on the 500 bp window upstream of each TSS and computed the sum of the fold change signals estimated for each 25 bp window, scaled by the window size, as the Rad53 binding activity for each gene. The overall activity scores vary between time points, probably because of differences in Rad53 protein level or other batch-specific reasons. To adjust for such sample-specific differences and allow a fair comparison, a linear regression is fitted to the activity scores of all genes between G1 and each of the other time points (HU45 and HU90) using the lm function in R. We then selected the genes deviating most from the overall trend according to the absolute residual between the actual and predicted values, excluding genes with a signal value lower than -0.075 after scaling the maximal signal to 1. The top 1,000 genes with the highest absolute residual values were selected from each of the 2 sets of experiments. The 435 genes common to both duplicates were selected for further analysis.
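The residual-based selection can be sketched as an ordinary least-squares fit followed by a ranking on absolute residuals. The example below is a simplified illustration, not the R code used for the dataset: it omits the low-signal filter, uses a hand-written fit instead of lm, and all function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def top_deviating_genes(genes, g1_scores, hu_scores, n_top):
    """Genes with the largest absolute residual from the G1-vs-HU regression."""
    slope, intercept = fit_line(g1_scores, hu_scores)
    residuals = {
        gene: hu - (slope * g1 + intercept)
        for gene, g1, hu in zip(genes, g1_scores, hu_scores)
    }
    ranked = sorted(residuals, key=lambda gene: abs(residuals[gene]), reverse=True)
    return set(ranked[:n_top])


def common_genes(replicate1_top, replicate2_top):
    """Keep only genes selected in both duplicate experiments."""
    return replicate1_top & replicate2_top
```

Ranking on residuals rather than raw score differences means a global shift or rescaling of the signal between time points does not by itself cause a gene to be selected.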
References
Behrouzi R, Lu C, Currie M, Jih G, Iglesias N, Moazed D. 2016. Heterochromatin assembly by interrupted Sir3 bridges across neighboring nucleosomes. Elife 5:e17556. doi:10.7554/elife.17556
Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. 2013. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29:15–21. doi:10.1093/bioinformatics/bts635
Freese NH, Norris DC, Loraine AE. 2016. Integrated genome browser: visual analytics platform for genomics. Bioinformatics 32:2089–2095. doi:10.1093/bioinformatics/btw069
Lee J, Shah M, Ballouz S, Crow M, Gillis J. 2020. CoCoCoNet: conserved and comparative co-expression across a diverse set of species. Nucleic Acids Res 48:gkaa348. doi:10.1093/nar/gkaa348
Love MI, Huber W, Anders S. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15:550. doi:10.1186/s13059-014-0550-8
Park D, Lee Y, Bhupindersingh G, Iyer VR. 2013. Widespread Misinterpretable ChIP-seq Bias in Yeast. Plos One 8:e83506. doi:10.1371/journal.pone.0083506
Ramírez F, Dündar F, Diehl S, Grüning BA, Manke T. 2014. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res 42:W187–W191. doi:10.1093/nar/gku365
Sheu Y-J, Stillman B. 2006. Cdc7-Dbf4 phosphorylates MCM proteins via a docking site-mediated mechanism to promote S phase progression. Mol Cell 24:101–113. doi:10.1016/j.molcel.2006.07.033
Sheu Y-J, Kinney JB, Lengronne A, Pasero P, Stillman B. 2014. Domain within the helicase subunit Mcm4 integrates multiple kinase signals to control DNA replication initiation and fork progression. Proceedings of the National Academy of Sciences 111:E1899-908. doi:10.1073/pnas.1404063111
Sheu Y-J, Kinney JB, Stillman B. 2016. Concerted activities of Mcm4, Sld3, and Dbf4 in control of origin activation and DNA replication fork progression. Genome Research 26:315–330. doi:10.1101/gr.195248.115
Siow CC, Nieduszynska SR, Müller CA, Nieduszynski CA. 2012. OriDB, the DNA replication origin database updated and extended. Nucleic Acids Res 40:D682–D686. doi:10.1093/nar/gkr1091
Yabuki N, Terashima H, Kitada K. 2002. Mapping of early firing origins on a replication profile of budding yeast. Genes Cells 7:781–789. doi:10.1046/j.1365-2443.2002.00559.x