Epigenetic gene regulation is controlled by distinct regulatory complexes utilizing specialized paralogs of Telomere Repeat Binding factors
Data files
Apr 15, 2026 version files 116.41 MB
- 1B_Fig.R (2.40 KB)
- 1B_Fig.txt (2.06 KB)
- Araport11.txdb (39.37 MB)
- ChIPseqscripts.zip (8.77 KB)
- CLF_GFP-Rep1-islands-summary (389.50 KB)
- CLF_GFP-Rep2-islands-summary (504.86 KB)
- Data_Processing_After_review.R (25.52 KB)
- Data_processing_Jacobsen_Data_After_review.R (25.90 KB)
- database_genes_all_overlapping.csv (2.43 MB)
- Figure_2_Script_overlapping_genes_after_review.R (22.30 KB)
- Figure_5_All_Overlapping_v2.R (8.11 KB)
- GSM6185716_FLAG-ChIPseq-JMJ14-Rep1_peaks.narrowPeak (604.60 KB)
- GSM6185717_FLAG-ChIPseq-JMJ14-Rep2_peaks.narrowPeak (647.88 KB)
- Jacobsen_Comparisons_bins.csv (14.32 MB)
- Jacobsen_TRB1.bed (201.13 KB)
- Jacobsen_TRB2.bed (123.38 KB)
- Jacobsen_TRB3.bed (69.54 KB)
- Krause_et_al_env.yml (1.14 KB)
- PEAT.xlsx (7.75 MB)
- README.md (35.02 KB)
- RNA-seq.R (50.09 KB)
- RNAseqcounts.zip (2.83 MB)
- RNAseqscripts.zip (6.10 KB)
- S1_File.xlsx (817.95 KB)
- S2_File.xlsx (1.07 MB)
- S3_File.xlsx (17.11 MB)
- S4_File.xlsx (987.58 KB)
- S5_File.xlsx (145.66 KB)
- Samples.ChIP-seq.txt (2 KB)
- Samples.RNA-seq.txt (1.59 KB)
- Supplementary_Table_Bins_2_nucleosomes.csv (14.64 MB)
- Supplementary_Table_ChIP_Seq_TRB1.csv (1.28 MB)
- Supplementary_Table_ChIP_Seq_TRB2.csv (656.53 KB)
- Supplementary_Table_ChIP_Seq_TRB3.csv (146.52 KB)
- SWN_GFP-Rep1-islands-summary (580.61 KB)
- SWN_GFP-Rep2-islands-summary (503.21 KB)
- TRB1-IDR.SigpValueBL (749.99 KB)
- TRB1.fasta (4.63 MB)
- TRB2-IDR.SigpValueBL (376.69 KB)
- TRB2.fasta (2.69 MB)
- TRB3-IDR.SigpValueBL (83.82 KB)
- TRB3.fasta (513 KB)
Abstract
Epigenetic regulators shape chromatin landscapes, allowing cells to express distinct gene sets depending on cell type, developmental stage or environmental cues. These regulatory complexes rely on interactions with sequence-specific DNA binding proteins, such as the small family of TELOMERE REPEAT BINDING FACTORS (TRBs). TRBs are components of chromatin regulatory complexes with opposing functions, such as the epigenetic repressors Polycomb Repressive Complex 2 (PRC2) and a JMJ14/NAC complex, which respectively add the repressive H3K27me3 and remove the positive H3K4me3 modification, as well as the plant-specific PEAT complex that is linked to histone acetylation and gene activation. We dissected the partial redundancy between TRB1, TRB2 and TRB3 in target gene selection and interaction with different chromatin regulatory complexes. High redundancy of TRBs is suggested by major phenotypic changes that are only observed in trb triple mutants; however, we found different target site preferences among TRB1-3 and preferred partnerships with chromatin complexes. Furthermore, TRB paralogs interacted with the NuA4 histone acetylation complex, both together with and in the absence of PEAT. Among the three paralogs, TRB1 had more unique binding sites and correlated more strongly with PEAT and NuA4 functions. In contrast, TRB2 and TRB3 were more dependent on the presence of bona fide telo-box motifs and were more likely to be found at PRC2-associated sites. Overall, we provide insight into the diverse roles of TRBs in epigenetic gene regulation and how their diversification contributes to their apparent redundancy, as well as their observed activating and repressing effects on gene expression.
Dataset DOI: 10.5061/dryad.gtht76j2r
Description of the data and file structure
Description of file repository
This repository contains raw data files, intermediate analysis files and scripts that allow recapitulating the analyses presented in the PLOS Genetics manuscript PGENETICS-D-25-01278.
Some larger items, such as raw read data for RNAseq and ChIPseq and the raw proteomics data, are not included as they have been deposited in public databases.
The ChIPseq and RNAseq data that support the findings of this study are publicly available from the European Nucleotide Archive (ENA) with the identifier PRJEB63124.
The proteomics data that support the findings of this study are publicly available from the Proteome Xchange database with the identifier PXD069673.
The repository does not include aligned BAM files. These can be regenerated with the provided scripts if necessary.
Files and R script for Flowering time analysis
Raw data table: 1B_Fig.txt
Rscript: 1B_Fig.R
Bash commands for RNAseq mapping
The scripts depend on software that can be installed in a conda or micromamba environment using the supplied YAML file: Krause_et_al_env.yml
To install the environment use:
micromamba install -n Mendler_Krause -f Krause_et_al_env.yml
The Linux bash scripts (and some helper scripts in R) include download instructions for the raw RNAseq reads deposited in ENA. They should be called in the order of their numbering.
Bash commands require an RNAseq sample table: Samples.RNA-seq.txt
The final output of read mapping to the Arabidopsis thaliana genome TAIR10 and gene counts are provided in this zipped archive: RNAseqcounts.zip
R commands for RNAseq analysis
Rscript to recapitulate the analysis of RNAseq data presented in the manuscript. It can be run after downloading and unpacking RNAseqcounts.zip and adding Samples.RNA-seq.txt to the same folder.
The script regenerates all RNAseq-related Figures (1_Fig, S1_Fig, S2_Fig) of the manuscript and S1_File.xlsx.
Bash commands for ChIPseq mapping and peak identification
The files depend on software that can be installed in a conda or micromamba environment using the supplied YAML file, see above for RNAseq.
The scripts should be run in their numbered order and include download instructions for the raw ChIPseq reads deposited in ENA.
Bash commands require a ChIPseq sample table: Samples.ChIP-seq.txt
The final output of the analysis are IDR-merged peak files: TRB1-IDR.SigpValueBL, TRB2-IDR.SigpValueBL, TRB3-IDR.SigpValueBL
Metaanalysis of ChIPseq data
R scripts to generate the related Figures (Figs 2, 4, and 5) and Supplementary Figures (S5_Fig, S6_Fig) in the manuscript and the Supplemental Files (S2_File.xlsx, S3_File.xlsx, S5_File.xlsx).
Generate cross-correlation matrix of peaks and annotate them to genes
The script below creates the final cross-correlation database (S3_File.xlsx) and annotates TRB peaks to genes (S2_File.xlsx).
Data_Processing_After_review.R
The script requires this annotation database in the same folder: Araport11.txdb
The script requires Peak data files in the same folder:
The final output of the script are these tables:
| Output | Description | Supplementary File |
|---|---|---|
| Supplementary_Table_ChIP_Seq_TRB1.csv | TRB1 ChIPseq peaks annotated to gene | S2_File.xlsx |
| Supplementary_Table_ChIP_Seq_TRB2.csv | TRB2 ChIPseq peaks annotated to gene | S2_File.xlsx |
| Supplementary_Table_ChIP_Seq_TRB3.csv | TRB3 ChIPseq peaks annotated to gene | S2_File.xlsx |
| Supplementary_Table_Bins_2_nucleosomes.csv | Genomic bins of 196bp annotated to all complexes under study | S3_File.xlsx |
| database_genes_all_overlapping.csv | Genes annotated to all complexes under study | S3_File.xlsx |
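The 196 bp genomic bins listed in the table can be pictured with a short sketch. This is not the authors' R code: the function name `tile_chromosome`, the 1-based inclusive coordinates and the truncated final bin are assumptions made for illustration. Only the bin size and the `Chr:start-end` label style of the `tiled_genome` column come from this README.

```python
def tile_chromosome(chrom, length, bin_size=196):
    """Tile one chromosome into consecutive bins labelled 'Chr:start-end'.

    Assumptions (not taken from the repository scripts): coordinates are
    1-based and inclusive, and the last bin is truncated at the chromosome end.
    """
    labels = []
    start = 1
    while start <= length:
        end = min(start + bin_size - 1, length)
        labels.append(f"{chrom}:{start}-{end}")
        start = end + 1
    return labels


# Hypothetical 500 bp chromosome, used only to show the label format.
example_bins = tile_chromosome("Chr1", 500)
print(example_bins)
```

Each label can then serve as a key for attaching peak scores of the individual complexes to a bin.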
Comparing our data to TRB ChIPseq data from the Jacobsen group
During the revision, a reviewer requested a comparison of our analysis to a previously published dataset: "Arabidopsis TRB proteins function in H3K4me3 demethylation by recruiting JMJ14" (Nature Communications).
Data_processing_Jacobsen_Data_After_review.R
This script recapitulates the analysis. It requires all inputs of the previous script and these peak files from the published dataset: Jacobsen_TRB1.bed, Jacobsen_TRB2.bed, Jacobsen_TRB3.bed
The output database is named: Jacobsen_Comparisons_bins.csv
Generate the panels presented in Figure 2
The script generates the graphic output presented in the manuscript.
Figure_2_Script_overlapping_genes_after_review.R
It requires these tables that are the output of the previous R script:
Supplementary_Table_Bins_2_nucleosomes.csv
database_genes_all_overlapping.csv
IP MS-MS data presented in Figure 3
The processed IP-MS data were analyzed using Perseus_v2.1.5.0 as described in the manuscript (see also the Materials and Methods section of this repository). The data uploaded to Perseus and the analysis results are part of the manuscript and can be downloaded here:
All plots were generated within Perseus.
Motif enrichment presented in Figure 4
Multi-fasta files were extracted from merged peak files for TRB1-3 (S1_File.xlsx) and submitted to MEME-ChIP - MEME Suite.
The fasta files are included in the repository: TRB1.fasta, TRB2.fasta, TRB3.fasta
To recapitulate the motif enrichment analysis, these files can be uploaded to MEME-ChIP - MEME Suite.
GO-term enrichment presented in Figure 5
The script Figure_5_All_Overlapping_v2.R underlies the GO-term enrichment analysis presented in Figure 5 and generates files that are compiled in S5_File.xlsx.
The script requires database_genes_all_overlapping.csv in the same folder.
Files and variables
File: Krause_et_al_env.yml
Description: Set-up of conda environment required to recapitulate analyses
File: 1B_Fig.txt
Description: Leaf count of trb double and prope triple combinations compared to Col-0 scored as total leaves until first flower.
Variables
- Genotype: Col-0, trb1-2 trb2-3, trb1-2 trb3-2, trb2-3 trb3-2, and prope triple combinations
- Experiment: Flowering time analysis
- Condition: greenhouse
- Photoperiod: long days (16 h light / 8 h dark)
- Temperature: ca. 22°C in soil
- Rosette: scored number
- Cauline: scored number
- Total: sum of both leaf types
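As a quick consistency check on this table, the Total column can be recomputed from the two leaf counts and averaged per genotype. The sketch below is a minimal illustration only: the helper name `mean_total_leaves` and the demo rows are invented, and it assumes the rows have already been read into dictionaries keyed by the column names listed above.

```python
from collections import defaultdict
from statistics import mean


def mean_total_leaves(rows):
    """Return the mean total leaf number (rosette + cauline) per genotype."""
    totals = defaultdict(list)
    for row in rows:
        total = int(row["Rosette"]) + int(row["Cauline"])
        totals[row["Genotype"]].append(total)
    return {genotype: mean(values) for genotype, values in totals.items()}


# Invented example rows, NOT values from 1B_Fig.txt.
demo_rows = [
    {"Genotype": "Col-0", "Rosette": "10", "Cauline": "3"},
    {"Genotype": "Col-0", "Rosette": "12", "Cauline": "3"},
    {"Genotype": "trb1-2 trb2-3", "Rosette": "8", "Cauline": "2"},
]
summary = mean_total_leaves(demo_rows)
print(summary)
```

The statistical comparison itself (ANOVA with HSD grouping) is done in 1B_Fig.R and is not reproduced here.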
File: 1B_Fig.R
Description: Analysis script for flowering time analysis
File: Samples.RNA-seq.txt
Description: Sample table required to download and process RNAseq dataset
Variables
- sample: Individual name combining genotype and biological replicate
- set: Experimental set, biological replicate
- run: BGI run name
- genotype: Genotype
- tissue: tissue type
- age: age of biological material (days)
- sample_accession: ENA sample accession
- Alias: ENA sample alias
- read_accession: ENA read accession
File: RNAseqscripts.zip
Description: Archive of Linux bash commands to recapitulate RNAseq mapping and read to gene quantification
File: RNAseqcounts.zip
Description: raw count data files that allow skipping the mapping steps
File: RNA-seq.R
Description: Analysis script in R to recapitulate RNAseq analysis
File: S1_File.xlsx
Description: Details of RNAseq analysis
Variables
- Tab sig_1_wt_inS: DEGs trb1-2 vs Col-0
- baseMean: Euclidean mean of log-transformed count data
- log2FoldChange: Log2 fold change of count data trb1-2 vs Col-0
- lfcSE: Standard error of the log2FoldChange
- pvalue: pvalue of log2FoldChange
- padj: adjusted pvalue
- Tab sig_2_wt_inS: DEGs trb2-3 vs Col-0
- baseMean: Euclidean mean of log-transformed count data
- log2FoldChange: Log2 fold change of count data trb2-2 vs Col-0
- lfcSE: Standard error of the log2FoldChange
- pvalue: pvalue of log2FoldChange
- padj: adjusted pvalue
- Tab sig_3_wt_inS: DEGs trb3-2 vs Col-0
- baseMean: Euclidean mean of log-transformed count data
- log2FoldChange: Log2 fold change of count data trb3-2 vs Col-0
- lfcSE: Standard error of the log2FoldChange
- pvalue: pvalue of log2FoldChange
- padj: adjusted pvalue
- Tab sig_12_wt_inS: DEGs trb1-2 trb2-3 vs Col-0
- baseMean: Euclidean mean of log-transformed count data
- log2FoldChange: Log2 fold change of count data trb1-2 trb2-3 vs Col-0
- lfcSE: Standard error of the log2FoldChange
- pvalue: pvalue of log2FoldChange
- padj: adjusted pvalue
- Tab sig_13_wt_inS: DEGs trb1-2 trb3-2 vs Col-0
- baseMean: Euclidean mean of log-transformed count data
- log2FoldChange: Log2 fold change of count data trb1-2 trb3-2 vs Col-0
- lfcSE: Standard error of the log2FoldChange
- pvalue: pvalue of log2FoldChange
- padj: adjusted pvalue
- Tab sig_23_wt_inS: DEGs trb2-2 trb3-2 vs Col-0
- baseMean: Euclidean mean of log-transformed count data
- log2FoldChange: Log2 fold change of count data trb2-3 trb3-2 vs Col-0
- lfcSE: Standard error of the log2FoldChange
- pvalue: pvalue of log2FoldChange
- padj: adjusted pvalue
- Tab sig_12_1_inS: DEGs trb1-2 trb2-2 vs trb1-2
- baseMean: Euclidean mean of log-transformed count data
- log2FoldChange: Log2 fold change of count data trb1-2 trb2-2 vs trb1-2
- lfcSE: Standard error of the log2FoldChange
- pvalue: pvalue of log2FoldChange
- padj: adjusted pvalue
- Tab sig_12_2_inS: DEGs trb1-2 trb2-2 vs trb2-2
- baseMean: Euclidean mean of log-transformed count data
- log2FoldChange: Log2 fold change of count data trb1-2 trb2-2 vs trb2-2
- lfcSE: Standard error of the log2FoldChange
- pvalue: pvalue of log2FoldChange
- padj: adjusted pvalue
- Tab sig_23_2_inS: DEGs trb2-2 trb3-2 vs trb2-2
- baseMean: Euclidean mean of log-transformed count data
- log2FoldChange: Log2 fold change of count data trb2-2 trb3-2 vs trb2-2
- lfcSE: Standard error of the log2FoldChange
- pvalue: pvalue of log2FoldChange
- padj: adjusted pvalue
- Tab sig_23_3_inS: DEGs trb2-2 trb3-2 vs trb3-2
- baseMean: Euclidean mean of log-transformed count data
- log2FoldChange: Log2 fold change of count data trb2-2 trb3-2 vs trb3-2
- lfcSE: Standard error of the log2FoldChange
- pvalue: pvalue of log2FoldChange
- padj: adjusted pvalue
- Tab gene_categories: DEG categories for all conditions
- AGI: gene symbols
- trb1_vs_wt: categories (down, up or NA) for trb1-2 vs Col-0
- trb2_vs_wt: categories (down, up or NA) for trb2-2 vs Col-0
- trb3_vs_wt: categories (down, up or NA) for trb3-2 vs Col-0
- trb12_vs_wt: categories (down, up or NA) for trb1-2 trb2-2 vs Col-0
- trb13_vs_wt: categories (down, up or NA) for trb1-2 trb3-2 vs Col-0
- trb23_vs_wt: categories (down, up or NA) for trb2-2 trb3-2 vs Col-0
- trb12_vs_trb1: categories (down, up or NA) for trb1-2 trb2-2 vs trb1-2
- trb12_vs_trb2: categories (down, up or NA) for trb1-2 trb2-2 vs trb2-2
- trb13_vs_trb1: categories (down, up or NA) for trb1-2 trb3-2 vs trb1-2
- trb13_vs_trb3: categories (down, up or NA) for trb1-2 trb3-2 vs trb3-2
- trb23_vs_trb2: categories (down, up or NA) for trb2-2 trb3-2 vs trb2-2
- trb23_vs_trb3: categories (down, up or NA) for trb2-2 trb3-2 vs trb3-2
- PAMk5: row-scaled VST-transformed count matrix across all genotypes for all DEGs with PAM clusters indicated
- as above with the last tab Cluster
- detected_AGIs: all genes with acceptable read counts (sum >50 counts across all samples)
- List of genes scored as "expressed" in RNAseq analysis
File: Samples.ChIP-seq.txt
Description: ChIPseq sample file required for download and processing
Variables
- Name: Original file base name
- Alias: Sample name as combination of Genotype, Type and Replicate
- Category: Genotype
- Type: Experimental type (IP or INPUT)
- Replicate: Biological replicate
- Experiment: Experiment number
- Project: ENA project code
- Submission: ENA sample accession
- EBI-accession: ENA run accession
- link: ENA accession for download
- md5sum: MD5 checksum
File: ChIPseqscripts.zip
Description: Archive of Linux bash scripts that allow downloading the data from ENA and recapitulate the analysis
File: TRB1-IDR.SigpValueBL
Description: Merged peaks of two TRB1 ChIP replicates using the IDR
- Column1: Chromosome name
- Column2: Peak start
- Column3: Peak end
- Column4: Orientation (NA)
- Column5: IDR score
- Column6: NA
- Column7: -log10 scaled rank of Peak 1
- Column8: -log10 scaled rank of Peak 2
- Column9: Peak1 start
- Column10: Peak1 end
- Column11: Peak 1 SignalValue from EPIC2
- Column12: Peak2 start
- Column13: Peak2 end
- Column14: Peak2 SignalValue from EPIC2
File: TRB2-IDR.SigpValueBL
Description: Merged peaks of two TRB2 ChIP replicates using the IDR
- Column1: Chromosome name
- Column2: Peak start
- Column3: Peak end
- Column4: Orientation (NA)
- Column5: IDR score
- Column6: NA
- Column7: -log10 scaled rank of Peak 1
- Column8: -log10 scaled rank of Peak 2
- Column9: Peak1 start
- Column10: Peak1 end
- Column11: Peak 1 SignalValue from EPIC2
- Column12: Peak2 start
- Column13: Peak2 end
- Column14: Peak2 SignalValue from EPIC2
File: TRB3-IDR.SigpValueBL
Description: Merged peaks of two TRB3 ChIP replicates using the IDR
- Column1: Chromosome name
- Column2: Peak start
- Column3: Peak end
- Column4: Orientation (NA)
- Column5: IDR score
- Column6: NA
- Column7: -log10 scaled rank of Peak 1
- Column8: -log10 scaled rank of Peak 2
- Column9: Peak1 start
- Column10: Peak1 end
- Column11: Peak 1 SignalValue from EPIC2
- Column12: Peak2 start
- Column13: Peak2 end
- Column14: Peak2 SignalValue from EPIC2
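The 14-column layout shared by the three IDR files can be read with a few lines of Python. This is a sketch under assumptions: the files are taken to be tab-delimited without a header row, and the field names in `IDR_COLUMNS` are labels chosen here to mirror the column descriptions above, not names stored in the files. The demo line is invented.

```python
import csv
import io

# Labels chosen for this sketch, following the column descriptions above.
IDR_COLUMNS = [
    "chrom", "start", "end", "orientation", "idr_score", "unused",
    "rank_peak1", "rank_peak2",
    "peak1_start", "peak1_end", "peak1_signal",
    "peak2_start", "peak2_end", "peak2_signal",
]
INTEGER_FIELDS = ("start", "end", "peak1_start", "peak1_end",
                  "peak2_start", "peak2_end")


def read_idr_peaks(handle):
    """Parse tab-delimited IDR peak lines into dictionaries.

    Assumes no header row; lines with fewer than 14 fields are skipped.
    """
    peaks = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < len(IDR_COLUMNS):
            continue
        record = dict(zip(IDR_COLUMNS, row))
        for field in INTEGER_FIELDS:
            record[field] = int(record[field])
        peaks.append(record)
    return peaks


# Invented demo line, NOT taken from TRB1-IDR.SigpValueBL.
demo = io.StringIO(
    "Chr1\t100\t500\t.\t1000\t.\t3.2\t3.1\t100\t450\t5.5\t120\t500\t6.1\n"
)
peaks = read_idr_peaks(demo)
print(peaks[0]["chrom"], peaks[0]["end"] - peaks[0]["start"])
```

For a real file, `open("TRB1-IDR.SigpValueBL", newline="")` would replace the `io.StringIO` demo handle.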
File: TRB1.fasta
Description: Multi-fasta file for merged peaks of TRB1
File: TRB2.fasta
Description: Multi-fasta file for merged peaks of TRB2
File: TRB3.fasta
Description: Multi-fasta file for merged peaks of TRB3
File: S2_File.xlsx
Description: Details of ChIP-Seq analysis
- Tab peaks_TRB1: Annotated ChIP-Seq Peaks of TRB1:YFP
Variables
- Number: Peak number
- TRB1_GRange.seqnames: Chromosome name
- TRB1_GRange.start: Peak start
- TRB1_GRange.end: Peak end
- TRB1_GRange.width: Peak width
- TRB1_GRange.strand: Peak orientation (NA)
- TRB1_GRange.Score: IDR score
- TRB1_GRange.annotation: Genomic annotation (Promoter, Distal Intergenic, 3'UTR)
- TRB1_GRange.geneChr: Chromosome number
- TRB1_GRange.geneStart: Gene start
- TRB1_GRange.geneEnd: Gene end
- TRB1_GRange.geneLength: Gene length
- TRB1_GRange.geneStrand: Orientation (1,2)
- TRB1_GRange.geneId: AGI gene code
- TRB1_GRange.distanceToTSS: Distance peak to TSS
- Tab peaks_TRB2: Annotated ChIP-Seq Peaks of TRB2:YFP
Variables
- Number: Peak number
- TRB2_GRange.seqnames: Chromosome name
- TRB2_GRange.start: Peak start
- TRB2_GRange.end: Peak end
- TRB2_GRange.width: Peak width
- TRB2_GRange.strand: Peak orientation (NA)
- TRB2_GRange.Score: IDR score
- TRB2_GRange.annotation: Genomic annotation (Promoter, Distal Intergenic, 3'UTR)
- TRB2_GRange.geneChr: Chromosome number
- TRB2_GRange.geneStart: Gene start
- TRB2_GRange.geneEnd: Gene end
- TRB2_GRange.geneLength: Gene length
- TRB2_GRange.geneStrand: Orientation (1,2)
- TRB2_GRange.geneId: AGI gene code
- TRB2_GRange.distanceToTSS: Distance peak to TSS
- Tab peaks_TRB3: Annotated ChIP-Seq Peaks of TRB3:YFP
Variables
- Number: Peak number
- TRB3_GRange.seqnames: Chromosome name
- TRB3_GRange.start: Peak start
- TRB3_GRange.end: Peak end
- TRB3_GRange.width: Peak width
- TRB3_GRange.strand: Peak orientation (NA)
- TRB3_GRange.Score: IDR score
- TRB3_GRange.annotation: Genomic annotation (Promoter, Distal Intergenic, 3'UTR)
- TRB3_GRange.geneChr: Chromosome number
- TRB3_GRange.geneStart: Gene start
- TRB3_GRange.geneEnd: Gene end
- TRB3_GRange.geneLength: Gene length
- TRB3_GRange.geneStrand: Orientation (1,2)
- TRB3_GRange.geneId: AGI gene code
- TRB3_GRange.distanceToTSS: Distance peak to TSS
File: Supplementary_Table_ChIP_Seq_TRB1.csv
Description: TRB1 gene annotated peaks
Variables
- : row numbers
- TRB1_GRange.seqnames: Chromosome name
- TRB1_GRange.start: Peak start
- TRB1_GRange.end: Peak end
- TRB1_GRange.width: Peak width
- TRB1_GRange.strand: Peak orientation (NA)
- TRB1_GRange.Score: IDR score
- TRB1_GRange.annotation: Genomic annotation (Promoter, Distal Intergenic, 3'UTR)
- TRB1_GRange.geneChr: Chromosome number
- TRB1_GRange.geneStart: Gene start
- TRB1_GRange.geneEnd: Gene end
- TRB1_GRange.geneLength: Gene length
- TRB1_GRange.geneStrand: Orientation (1,2)
- TRB1_GRange.geneId: AGI gene code
- TRB1_GRange.distanceToTSS: Distance peak to TSS
File: Supplementary_Table_ChIP_Seq_TRB2.csv
Description: TRB2 gene annotated peaks
Variables
- : row numbers
- TRB2_GRange.seqnames: Chromosome name
- TRB2_GRange.start: Peak start
- TRB2_GRange.end: Peak end
- TRB2_GRange.width: Peak width
- TRB2_GRange.strand: Peak orientation (NA)
- TRB2_GRange.Score: IDR score
- TRB2_GRange.annotation: Genomic annotation (Promoter, Distal Intergenic, 3'UTR)
- TRB2_GRange.geneChr: Chromosome number
- TRB2_GRange.geneStart: Gene start
- TRB2_GRange.geneEnd: Gene end
- TRB2_GRange.geneLength: Gene length
- TRB2_GRange.geneStrand: Orientation (1,2)
- TRB2_GRange.geneId: AGI gene code
- TRB2_GRange.distanceToTSS: Distance peak to TSS
File: Supplementary_Table_ChIP_Seq_TRB3.csv
Description: TRB3 gene annotated peaks
Variables
- : row numbers
- TRB3_GRange.seqnames: Chromosome name
- TRB3_GRange.start: Peak start
- TRB3_GRange.end: Peak end
- TRB3_GRange.width: Peak width
- TRB3_GRange.strand: Peak orientation (NA)
- TRB3_GRange.Score: IDR score
- TRB3_GRange.annotation: Genomic annotation (Promoter, Distal Intergenic, 3'UTR)
- TRB3_GRange.geneChr: Chromosome number
- TRB3_GRange.geneStart: Gene start
- TRB3_GRange.geneEnd: Gene end
- TRB3_GRange.geneLength: Gene length
- TRB3_GRange.geneStrand: Orientation (1,2)
- TRB3_GRange.geneId: AGI gene code
- TRB3_GRange.distanceToTSS: Distance peak to TSS
File: Data_Processing_After_review.R
Description: R script to generate database of peak and annotation overlaps
File: Figure_2_Script_overlapping_genes_after_review.R
Description: R script to overlap peaks with gene annotations
File: Data_processing_Jacobsen_Data_After_review.R
Description: R script to compare independent TRB1-3 dataset
File: Araport11.txdb
Description: Araport11 gene annotation as txdb for import to R
File: PEAT.xlsx
Description: Excel table compiling published data for PEAT and NuA4 complex components
- Tab UBP5
Variables
- chr: Chromosome name
- Pstart: peak start
- Pend: Peak end
- width: Peak width
- abs_summit: Position of peak summit
- pileup: coverage count
- -LOG10(pvalue): P value of enrichment
- fold_enrichment: fold-enrichment
- -LOG10(qvalue): corrected p value
- name: Gene name
- Gene_id: AGI code for gene
- Strand: orientation of gene
- Gene_name: Gene name
- overlap_with_Genebody: Overlap type of peak with gene
- Tab PWWP1
Variables
- chr: Chromosome name
- Pstart: peak start
- Pend: Peak end
- width: Peak width
- abs_summit: Position of peak summit
- pileup: coverage count
- -LOG10(pvalue): P value of enrichment
- fold_enrichment: fold-enrichment
- -LOG10(qvalue): corrected p value
- name: Gene name
- Gene_id: AGI code for gene
- Strand: orientation of gene
- Gene_name: Gene name
- overlap_with_Genebody: Overlap type of peak with gene
- Tab EPCR1
Variables
- chr: Chromosome name
- Pstart: peak start
- Pend: Peak end
- width: Peak width
- abs_summit: Position of peak summit
- pileup: coverage count
- -LOG10(pvalue): P value of enrichment
- fold_enrichment: fold-enrichment
- -LOG10(qvalue): corrected p value
- name: Gene name
- Gene_id: AGI code for gene
- Strand: orientation of gene
- Gene_name: Gene name
- overlap_with_Genebody: Overlap type of peak with gene
- Tab HAM1
Variables
- chr: Chromosome name
- Pstart: peak start
- Pend: Peak end
- width: Peak width
- abs_summit: Position of peak summit
- pileup: coverage count
- -LOG10(pvalue): P value of enrichment
- fold_enrichment: fold-enrichment
- -LOG10(qvalue): corrected p value
- name: Gene name
- Gene_id: AGI code for gene
- Strand: orientation of gene
- Gene_name: Gene name
- overlap_with_Genebody: Overlap type of peak with gene
- Tab EPL1B
Variables
- chr: Chromosome name
- Pstart: peak start
- Pend: Peak end
- width: Peak width
- abs_summit: Position of peak summit
- pileup: coverage count
- -LOG10(pvalue): P value of enrichment
- fold_enrichment: fold-enrichment
- -LOG10(qvalue): corrected p value
- name: Gene name
- Gene_id: AGI code for gene
- Strand: orientation of gene
- Gene_name: Gene name
- overlap_with_Genebody: Overlap type of peak with gene
File: GSM6185716_FLAG-ChIPseq-JMJ14-Rep1_peaks.narrowPeak
Description: Published data in MACS2 narrow Peak format
File: GSM6185717_FLAG-ChIPseq-JMJ14-Rep2_peaks.narrowPeak
Description: Published data in MACS2 narrow Peak format
File: CLF_GFP-Rep1-islands-summary
Description: Published data in MACS2 broad Peak format
File: CLF_GFP-Rep2-islands-summary
Description: Published data in MACS2 broad Peak format
File: SWN_GFP-Rep2-islands-summary
Description: Published data in MACS2 broad Peak format
File: SWN_GFP-Rep1-islands-summary
Description: Published data in MACS2 broad Peak format
File: Jacobsen_TRB1.bed
Description: Published TRB1 peak data in bed format
File: Jacobsen_TRB2.bed
Description: Published TRB2 peak data in bed format
File: Jacobsen_TRB3.bed
Description: Published TRB3 peak data in bed format
File: Supplementary_Table_Bins_2_nucleosomes.csv
Description: Correlative peak database
Variables
- : row number
- bin_number: number of bin
- tiled_genome: genomic position (Chr:start-end)
- UBP5: Peak scores for UBP5
- EPCR1: Peak scores for EPCR1
- PWWP1: Peak scores for PWWP1
- HAM1: Peak scores for HAM1
- EPL1B: Peak scores for EPL1B
- CLF: Peak scores for CLF
- SWN: Peak scores for SWN
- JMJ14: Peak scores for JMJ14
- TRB1: Peak scores for TRB1
- TRB2: Peak scores for TRB2
- TRB3: Peak scores for TRB3
- cum: sum of peak scores
- PEAT: Category PEAT
- NuA4: Category NuA4
- PRC2: Category PRC2
- JMJ14_Cluster: Category JMJ14
- Category: Named category
- TRB_Combination: Named TRB combination
File: database_genes_all_overlapping.csv
Description: Genes annotated to all complexes under study
Variables
- : row number
- AGI: AGI gene code
- ENTREZ: ENTREZ gene code
- UBP5: Peak scores for UBP5
- EPCR1: Peak scores for EPCR1
- PWWP1: Peak scores for PWWP1
- HAM1: Peak scores for HAM1
- EPL1B: Peak scores for EPL1B
- CLF: Peak scores for CLF
- SWN: Peak scores for SWN
- JMJ14: Peak scores for JMJ14
- TRB1: Peak scores for TRB1
- TRB2: Peak scores for TRB2
- TRB3: Peak scores for TRB3
- cum: sum of peak scores
- PEAT: Category PEAT
- NuA4: Category NuA4
- PRC2: Category PRC2
- JMJ14_Cluster: Category JMJ14
- Category: Named category
- TRB_Combination: Named TRB combination
File: Jacobsen_Comparisons_bins.csv
Description: Output database of the comparison with the published TRB ChIPseq data from the Jacobsen group
Variables
- : row number
- bin_number: number of bin
- tiled_genome: genomic position (Chr:start-end)
- UBP5: Peak scores for UBP5
- EPCR1: Peak scores for EPCR1
- PWWP1: Peak scores for PWWP1
- HAM1: Peak scores for HAM1
- EPL1B: Peak scores for EPL1B
- CLF: Peak scores for CLF
- SWN: Peak scores for SWN
- JMJ14: Peak scores for JMJ14
- TRB1: Peak scores for TRB1
- TRB2: Peak scores for TRB2
- TRB3: Peak scores for TRB3
- cum: sum of peak scores
- PEAT: Category PEAT
- NuA4: Category NuA4
- PRC2: Category PRC2
- JMJ14_Cluster: Category JMJ14
- Category: Named category
- TRB_Combination: Named TRB combination
File: S3_File.xlsx
Description: Details of genomic binding analysis
Variables
- Tab database_bins
- as Supplementary_Table_Bins_2_nucleosomes.csv
- Tab database_genes
- as database_genes_all_overlapping.csv
File: S4_File.xlsx
Description: Details of IP-MS-MS analysis
Variables
- Tab raw: MaxQuant result
- Tab ANOVA_HSD_full: log2 transformed, filtered and imputed LFQ values, result of ANOVA and HSD annotated to all proteins
- Tab Significant: All proteins part of significant HSD groups
File: Figure_5_All_Overlapping_v2.R
Description: R script to run GO-term analysis
File: S5_File.xlsx
Description: Details of GO-Term enrichment analysis
Excel file containing one sheet:
All_GO_Terms: Table containing all enriched (p<=0.05) GO-terms, their associated complexes, and co-bound TRB-paralogs
Variables
- Column1: row number
- Cluster: Named complex category
- ID: GO term ID
- Description: GO term description
- GeneRatio: Ratio of genes in set with annotation versus genes in set
- BgRatio: Background of genes with GO annotation versus all genes in analysis
- RichFactor: Enrichment Factor
- FoldEnrichment: Fold Enrichment
- zScore: ZScore of Enrichment
- pvalue: pValue of Enrichment
- p.adjust: adjusted P-Value of enrichment
- qvalue: q value of enrichment
- geneID: list of genes in set with GO term annotation (Entrez code)
- Count: number of genes in set with GO term annotation
- cat: complex category
- trbs: TRB category
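The relation between the ratio columns can be illustrated with a small calculation. This sketch rests on two assumptions that the README does not state: that GeneRatio and BgRatio are stored as "k/n" text, and that FoldEnrichment is GeneRatio divided by BgRatio. The function names and the example numbers are hypothetical.

```python
from fractions import Fraction


def parse_ratio(text):
    """Turn a 'k/n' string into an exact fraction (assumed storage format)."""
    numerator, denominator = text.split("/")
    return Fraction(int(numerator), int(denominator))


def fold_enrichment(gene_ratio, bg_ratio):
    """Fold enrichment computed as GeneRatio / BgRatio (assumed definition)."""
    return float(parse_ratio(gene_ratio) / parse_ratio(bg_ratio))


# Invented numbers: 10 of 200 genes in the set carry the term,
# against 100 of 20000 genes in the background.
example_fold = fold_enrichment("10/200", "100/20000")
print(example_fold)
```

If the values in S5_File.xlsx do not match this calculation, the definition used by the enrichment package in Figure_5_All_Overlapping_v2.R takes precedence.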
Plant material and growth conditions
Plants were grown in greenhouse conditions or growth chambers as indicated under long day (LD) (16h light, 8h dark) photoperiod at 22°C ambient temperature. Plants were randomized within trays for phenotypic analyses. Plants for RNA-seq analysis were grown in growth chambers. Three biological replicates were grown at 1-week intervals in the same chamber, and material was collected at ZT10 from 14-day-old seedlings. For ChIP-seq, plants were grown in tissue culture on GM plates in LD conditions at 21°C. Material from replicate plates was collected from 14-day-old seedlings at ZT10 as biological replicates. The trb1-2 (Salk_001540) and trb3-2 (Salk_134641) alleles are previously described T-DNA insertion lines; trb2-2 and trb2-3 are CRISPR/Cas9-edited alleles as previously described, except that the editing transgene was removed by segregation (1). Double and triple mutants were generated by crosses. Transgenic lines TRB2pro-TRB2-YFP and TRB3pro-TRB3-YFP were previously described (1).
Scoring of flowering time
Flowering time was scored as the number of leaves on the main shoot (rosette and cauline leaves). Statistical analysis was done by ANOVA with HSD grouping using the agricolae package in R. To distinguish segregating prope triple from double mutants, individual plants were genotyped on genomic DNA prepared using BioSprint 96 (Qiagen) following the manufacturer’s instructions. Alleles trb1-2 and trb3-2 were amplified using SALK_LBb1.3: ATTTTGCCGATTTCGGAAC in combination with SALK_001540_RP: ATGCCACCACAATAAATCTCG and SALK_13464_RP: ATGGTTCACGAGAAACCTGTG, respectively. To distinguish between trb2-3 and TRB2, two reactions were carried out for 28 cycles at 62°C annealing temperature using dCAPS_TRB2_R: ATTGCCTCAAAGATGATCTTATCC in combination with 8-18-10-specifi: ACTTCCCCCGGAGGTTCTTG and 8-18-10-WT: ACTTCCCCCGGAGGTTCTG, respectively.
RNA preparation and RNA-seq
Total RNA was extracted from 3-4 14-d-old seedlings with an RNeasy Plant Mini kit (Qiagen) according to the manufacturer’s instructions. To remove gDNA contamination, 10 µg of total RNA was DNase I treated, using the DNA-free DNA Removal kit (Invitrogen), as described in the kit’s instructions. RNA quality was assessed by agarose gel electrophoresis of a 200 ng aliquot of DNase I-treated RNA. The RNA samples were sent to BGI TECH SOLUTIONS (HONGKONG) for poly-A enrichment, library preparation and directional paired-end Nanoball sequencing on the DNBSEQ platform.
RNA-seq analysis
Paired-end reads were mapped to the Arabidopsis thaliana TAIR10 reference genome indexed with the Araport11 genome annotation using STAR. Read counts were pooled across all splice variants to obtain per-gene counts. Sense strand gene counts were used for differential expression analysis with the R package DESeq2, using a threshold of padj < 0.05 to call differential expression of mutants vs Col-0 and of double mutants vs their respective single mutants. Sample correlation clustering revealed that the trb2.1 library was an outlier, and it was excluded from all statistical analyses. Venn diagrams and statistical testing of overlaps between samples used the R packages ggvenn and SuperExactTest, respectively. Clustering of expression data and drawing of gene-normalized expression heatmaps were carried out with the R package ComplexHeatmap using PAM clustering.
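The padj < 0.05 threshold and the down/up/NA categories used in the gene_categories tab of S1_File.xlsx can be expressed as a small decision rule. The sketch below is an illustration, not the code in RNA-seq.R: the function name `categorize_gene` is hypothetical, and the handling of a missing padj or a fold change of exactly zero is an assumption.

```python
import math


def categorize_gene(log2_fold_change, padj, alpha=0.05):
    """Classify a gene as 'up', 'down' or 'NA' from DESeq2-style results.

    Assumption: a gene without a usable adjusted p-value, or with
    padj >= alpha, is reported as 'NA' (not differentially expressed).
    """
    if padj is None or math.isnan(padj) or padj >= alpha:
        return "NA"
    if log2_fold_change > 0:
        return "up"
    if log2_fold_change < 0:
        return "down"
    return "NA"


# Invented example values for illustration.
print(categorize_gene(1.5, 0.01))
print(categorize_gene(-2.0, 0.001))
print(categorize_gene(3.0, 0.2))
```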
ChIP and ChIP-seq library preparation
For all ChIP experiments, 2g of 14-d-old seedlings were collected in 50 ml 1x PBS buffer (137mM NaCl, 1.8mM KH2PO4, 10.1mM Na2HPO4, 2.7mM KCl), fixed with 1 % formaldehyde under vacuum two times for 10 min after which the crosslinking reaction was quenched with 5 ml glycine (1 M) under vacuum for 5 min. Fixed plant material was collected in a sieve, washed with autoclaved water, and dried with paper towels before being snap frozen with liquid N2. Frozen samples were ground at 7200 rpm three times for 30 s, using the Precellys Evolution Homogenizer in combination with a Cryolys Cooling Option (Bertin Instruments) in 7 ml reaction tubes with 3mm ceramic beads.
To extract nuclei, the ground samples were mixed with 30 ml NIB buffer (50mM HEPES-NaOH (pH 7.4), 5mM MgCl2, 25mM NaCl, 5% sucrose, 30% glycerine, 0.25% Triton X 100, freshly added: 0.1% β-mercaptoethanol, 0.1% SIGMA proteinase inhibitor), vortexed, filtered using Miracloth (Merck) and spun down at 4000rpm and 4°C for 10min. The pellet was resuspended in 20ml 1x Washing buffer (16.7mM HEPES-NaOH (pH 7.4), 6.7mM MgCl2, 33.3mM NaCl, 13.3% sucrose, 13.3% glycerine, 0.25% Triton X 100, freshly added: 0.001% β-mercaptoethanol, 0.001% SIGMA proteinase inhibitor) and spun down at 4000rpm and 4°C for 10min. Then, extracted nuclei were resuspended in TE-SDS (1mM EDTA (pH 8.0), 10mM Tris-HCl (pH 7.4), 0.25% SDS) in a total volume of 600µl, rotated at 4°C and 12 rpm for 20 min, split in 2x300µl and sonicated with a Bioruptor Sonicator (Diagenode) that was attached to a Minichiller cooling system (Huber) (Programme: red - 0.5 (on); green - 1 (off); 15 min, H) to produce DNA fragments of 200-500bp. Sonicated chromatin was separated from debris by centrifugation at 4°C and maximum speed for 10 min. For ChIP-seq, 400µl sonicated chromatin was mixed with 600 µl of IP dilution buffer (80mM Tris-HCl (pH 7.4), 230mM NaCl, 1.7% NP40, 0.17% DOC), 2µl RNase I (10 mg/ml), 2µl DTT (1M), and 2µl SIGMA proteinase inhibitor. Afterwards, equal volumes of the sonicated chromatin mix were split into two different tubes and 5µl of α-GFP (ab290, Abcam) was added to carry out IP. Samples were rotated at 4°C, 12 rpm overnight in a bohemian wheel. After overnight incubation, unspecific precipitates were removed by centrifugation (4°C, 20000g, 10 min) and the supernatant transferred to a tube containing 30µl rProtein A Sepharose Fast Flow antibody purification resin (GE Healthcare) beads equilibrated in RIPA buffer (0.6x IP Dilution buffer, 0.1% SDS). Samples were rotated at 12rpm and 4°C for 3 h. After centrifugation, 200µl of the supernatant from control samples was reserved as input and kept on ice.
Beads were washed five times with 1 ml RIPA to remove background. During the fifth wash, the samples were transferred to fresh tubes with 800µl RIPA, and protein-DNA complexes were eluted from the precipitated beads by mixing them two times with 160µl glycine elution buffer at RT. IP samples were neutralised with 80µl of Tris-HCl (1 M, pH 9.7) and de-crosslinked by adding 8µl SDS (10%) and 5µl proteinase K (5mg/ml). For input samples, only 5µl proteinase K was added. DNA was extracted twice with equal amounts of phenol/chloroform and precipitated with 1/10 volume NaAc (3M), 2.5 volumes EtOH (100%), and 1µl glycogen (10mg/ml) at -20°C for 3h. Afterwards, the DNA was washed with 1ml EtOH (70%), dried, and resuspended in 14µl H2O.
For ChIP–seq library preparation, two independent immunoprecipitations for Col-0, TRB2pro-TRB2-YFP and TRB3pro-TRB3-YFP were processed. Libraries were prepared with the Ovation Ultralow Library System (NuGEN) according to the manufacturer’s instructions, using 71% (10µl) of each ChIP as starting material. Before amplification, the DNA concentration was measured using a Qubit 4 (ThermoFisher Scientific) to determine the appropriate number of PCR cycles for each sample (see manufacturer’s manual). After amplification, DNA was run on a 2% low-melt agarose gel and fragments between 200 and 500 bp in length were purified using the MinElute Gel Extraction Kit (Qiagen) according to the manufacturer’s instructions, except that gel fragments were dissolved at RT and eluted in 15µl EB buffer. An aliquot of each library was tested via qPCR before and after PCR amplification to confirm that libraries showed similar fold-change between control and target regions. Sequencing was performed as single-end 100-nt reads (approximately 13 million reads per sample) on the Illumina HiSeq3000 platform by the Max Planck Genome Centre Cologne.
ChIP-seq analysis
After sequencing, adapter sequences ≥ 12bp were removed using Cutadapt (2). Reads were aligned to the A. thaliana genome (TAIR10) with the Burrows-Wheeler Aligner (BWA) (3) and BAM files were created using SAMtools (4). Multi-mapping reads were removed with SAMtools by discarding reads with a MAPQ score <10, which left 8.8 to 12.2 million reads per sample. The resulting BAM files of uniquely mapped reads were indexed with SAMtools, normalised to Counts Per Million mapped reads (CPM), and converted to bigWig files using bamCoverage of the deepTools2 suite (5) for visualization in the Integrative Genomics Viewer (IGV). A blacklist of over- and undersampled regions was generated by scoring read coverage of input and Col-0 ChIP samples across 200bp windows using BEDtools (6). Windows that were statistical outliers were determined using R and subsequently removed from the analysis. EPIC2 was used to determine enriched regions in two replicates against the pool of two Col-0 control IPs, using pooled input samples as correction (7). Replicates were compared using the Irreproducible Discovery Rate (IDR) framework (8). Peaks passing the threshold of IDR < 0.01 were merged using BEDtools. Previously generated 35Sp-TRB1-YFP reads were included in the IDR analysis for better comparison (9).
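The MAPQ filter and the CPM normalisation described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, which relied on SAMtools and bamCoverage; the `Alignment` record, the toy reads, and the function names are hypothetical and only serve to show the arithmetic.

```python
from typing import Dict, List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical, simplified alignment record (not a real BAM entry)."""
    read_id: str
    window: int  # index of the genomic window the read falls into
    mapq: int    # mapping quality reported by the aligner


MIN_MAPQ = 10  # reads with MAPQ < 10 are discarded, as stated in the text


def keep_confident(alignments: List[Alignment], min_mapq: int = MIN_MAPQ) -> List[Alignment]:
    """Drop alignments whose mapping quality is below the threshold."""
    return [a for a in alignments if a.mapq >= min_mapq]


def cpm_per_window(alignments: List[Alignment]) -> Dict[int, float]:
    """Scale per-window read counts to counts per million mapped reads."""
    total = len(alignments)
    counts: Dict[int, int] = {}
    for a in alignments:
        counts[a.window] = counts.get(a.window, 0) + 1
    return {w: c * 1_000_000 / total for w, c in counts.items()}


# Toy data: five reads, one of which maps ambiguously (MAPQ 0).
reads = [
    Alignment("r1", 0, 60),
    Alignment("r2", 0, 37),
    Alignment("r3", 0, 0),
    Alignment("r4", 0, 25),
    Alignment("r5", 1, 60),
]
filtered = keep_confident(reads)
cpm = cpm_per_window(filtered)
```

Because CPM divides by the number of retained reads, samples with 8.8 and 12.2 million reads end up on the same scale.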
Analysis of the binding behavior of various epigenetic regulatory complexes
Binding peaks of UBP5, EPRC1, PWWP1, HAM1, and EPL1B were sourced from (10), those of CLF and SWN from (11), and those of JMJ14 together with a second set of TRB1-3 binding data from (12). To visualize the overlapping binding sites, the TAIR10 genome of A. thaliana was tiled into 397160 bins of 300 bp length, and all peaks of these eight datasets and of TRB1, TRB2 and TRB3 were assigned to overlapping bins. As the datasets were derived using different ChIP-Seq pipelines, the “Score” column of each dataset was normalised into deciles. Preliminary analysis of the overlap of the ChIP-Seq sets was performed through Pearson correlation using Hmisc (13).
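The binning and decile normalisation can be sketched as follows. This is an illustrative Python sketch, not the R code deposited with the dataset; the function names, the 0-based half-open coordinate convention, and the rank-based handling of ties are assumptions.

```python
import math
from typing import List

BIN_SIZE = 300  # bin length in bp, as stated in the text


def n_bins(chrom_length: int, bin_size: int = BIN_SIZE) -> int:
    """Number of bins needed to cover one chromosome (last bin may be partial)."""
    return math.ceil(chrom_length / bin_size)


def overlapping_bins(start: int, end: int, bin_size: int = BIN_SIZE) -> List[int]:
    """Bin indices touched by a peak given as a 0-based half-open interval [start, end)."""
    return list(range(start // bin_size, (end - 1) // bin_size + 1))


def decile_ranks(scores: List[float]) -> List[int]:
    """Replace each score by its decile (1 = lowest tenth, 10 = highest tenth).

    Ties are broken by input order, which is a simplification.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    deciles = [0] * len(scores)
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // len(scores) + 1
    return deciles


# A peak spanning positions 250-649 touches the first three 300 bp bins.
bins = overlapping_bins(250, 650)
# Twenty toy scores: each decile receives two of them.
toy_deciles = decile_ranks([float(s) for s in range(1, 21)])
```

Converting scores to deciles discards the absolute scale of each peak caller, so only the within-dataset ranking is compared across datasets.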
To visualize the overlapping binding sites, the bins were assigned to distinct categories: bins bound by at least two of the PEAT components, both NuA4 components, both PRC2 components, or JMJ14 were assigned to “PEAT”, “NuA4”, “PRC2”, or “JMJ14”, respectively. In addition, each of the 12 possible combinations of multiple complex assignments was added along with a category for unassigned bins, bringing the total to 17 categories. Each bin was assigned to exactly one of these categories. Statistical analysis was performed using the MSET function of the SuperExactTest package (14) for pairwise comparison of the overlap of the generated categories with TRB1,2,3-bound bins. Heatmaps were generated using the ComplexHeatmap (15) package.
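The assignment rule for a single bin can be written down as a short Python sketch. The grouping of UBP5, EPRC1 and PWWP1 as PEAT components and of HAM1 and EPL1B as NuA4 components is an assumption based on the cited datasets, and the function name and the "+"-joined labels for combined categories are hypothetical; the deposited R scripts may encode the categories differently.

```python
from typing import Set

# Assumed complex membership (see lead-in); names follow the spelling in the text.
PEAT = {"UBP5", "EPRC1", "PWWP1"}
NUA4 = {"HAM1", "EPL1B"}
PRC2 = {"CLF", "SWN"}


def assign_category(bound: Set[str]) -> str:
    """Return the category of a bin from the set of factors with a peak in it."""
    labels = []
    if len(bound & PEAT) >= 2:   # at least two PEAT components
        labels.append("PEAT")
    if NUA4 <= bound:            # both NuA4 components
        labels.append("NuA4")
    if PRC2 <= bound:            # both PRC2 components
        labels.append("PRC2")
    if "JMJ14" in bound:
        labels.append("JMJ14")
    return "+".join(labels) if labels else "unassigned"


examples = {
    "peat_only": assign_category({"UBP5", "PWWP1"}),
    "single_peat_component": assign_category({"UBP5"}),
    "prc2_and_jmj14": assign_category({"CLF", "SWN", "JMJ14"}),
}
```

Each bin receives exactly one label, so the categories are mutually exclusive and can be compared directly with the TRB-bound bins.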
Cis-motif enrichment analysis
DNA motifs enriched in peak-assigned bins were identified through the XSTREME pipeline of the MEME Suite (16) using standard settings except for --meme-mod “anr”, providing the DNA motifs identified by (17). Motifs were declared telo-box-like if the identified motif was closely related to the telo-box but did not fully capture the canonical Arabidopsis telomere repeat sequence TTTAGGG. For each peak category, motifs were ranked by their E-value.
Gene Ontology enrichment analysis
The peaks of all ChIP-Seq sets used in this study were annotated to genes using “annotatePeak” of the “ChIPseeker” package (18,19). Since epigenetic regulatory complexes are not solely found at or near the TSS, the parameters “tssRegion” and “overlap” were set to “c(-2000, 2000)” and "all" respectively in order to correctly assign binding sites further away from the TSS. Additionally, the wide nature of peaks derived from epigenetic regulatory complexes leads to a high proportion of multi-gene spanning peaks. To compensate for this, addFlankGeneInfo was set to true and all flanking genes with “flank_gene_distances” of 0 were also annotated to the same peak. The resulting genes were assigned to epigenetic regulatory complexes in the same manner as the genomic bins. The different sets were subsequently used to calculate GO-Term enrichment using the “clusterProfiler” package (20).
Sample preparation for LC-MS/MS
Leaf tissue (7g) harvested from 5-week-old transgenic plants (CaMV 35Sp-TRB1-GFP, CaMV 35Sp-TRB3-GFP, CaMV 35Sp-EDS1-GFP) was cut with scissors into 0.5 - 1.0 cm pieces and disrupted on ice in 15 ml Precellys tubes containing 5 ml extraction buffer (2 M hexylene glycol, 0.5 M PIPES-KOH pH7.0, 10 mM MgCl2, 5 mM β-mercaptoethanol) and 13-15 sterilized metal beads using a Precellys 24 homogenizer (Bertin Instruments) for three rounds set to 10 s at 7500 rpm. Samples were filtered through a single and then a double Miracloth (Merck) layer and adjusted to a volume of 45 ml. Triton X-100 (10%) was added stepwise to a final concentration of 0.8%. While samples were incubated on ice, the Percoll (Sigma-Aldrich) gradient was assembled by carefully underlaying 6 ml of 30% Percoll solution with 6 ml of 80% Percoll in a centrifuge tube (Beckman Coulter #355631). In parallel, three 15 ml aliquots per sample were layered onto gradients and centrifuged (2,000 g, 4°C, 30 min). The nuclei-enriched fractions (5ml) were collected from the interface between the Percoll layers using a 5-ml pipette, and the combined aliquots were diluted in 23 ml gradient buffer (0.09 M hexylene glycol, 0.09 mM PIPES-KOH pH7, 1.83 mM MgCl2, 0.92 mM β-mercaptoethanol, 0.18% Triton X-100). To gently pellet the nuclei, the samples cushioned on 6 ml 30% Percoll solution were centrifuged at 2,000 g and 4°C for 10 min. The isolated nuclei were resuspended in 1 ml sample buffer (20 mM Tris-HCl pH7.4, 2 mM MgCl2, 150 mM NaCl, 5% glycerol, 5 mM DTT, complete protease inhibitor (Roche)), transferred into fresh 1.5 ml Eppendorf tubes, washed once in sample buffer (centrifugation at 1,000 g and 4°C for 15 min) and resuspended in a final volume of 600µl sample buffer. Samples were treated with 1 µl DNase I (10 u/µl) and 2 µl of RNase A (10 mg/ml) for 15 min at 37°C and subsequently sonicated in a Bioruptor (Diagenode) water bath connected to a Minichiller cooling system (Huber) (6x 15 s “on”/15 s “off” at high intensity).
After removal of debris (centrifugation at 16,000 g and 4°C for 15 min), supernatants were transferred into clean 2 ml Protein LoBind tubes (Eppendorf). The protein concentration was determined by Bradford assay (Bradford, 1976) and equal amounts (i.e. 1 mg) were used for subsequent affinity purification. Immunoprecipitation was carried out with 25 µl GFP-trap Agarose beads (gta-20; Chromotek) in 2 ml sample buffer with Triton X-100 (0.1%) and EDTA (2 mM) after incubation at 4°C for 2.5 h at constant rotation (12 rpm). The protein-bound GFP-trap beads were washed four times with 300 µL of wash buffer (20 mM Tris-HCl pH7.4, 150 mM NaCl, 2 mM EDTA).
Sample preparation and LC-MS/MS data acquisition
Proteins from GFP-trap enrichment were subjected to on-bead digestion. In brief, dry beads were re-dissolved in 25 µL digestion buffer 1 (50 mM Tris, pH 7.5, 2M urea, 1mM DTT, 5 ng/µL trypsin) and incubated for 30 min at 30 °C in a Thermomixer at 400 rpm. Next, beads were pelleted, and the supernatant was transferred to a fresh tube. Digestion buffer 2 (50 mM Tris, pH 7.5, 2M urea, 5 mM CAA) was added to the beads; after mixing, the beads were pelleted, and the supernatant was collected and combined with the previous one. The combined supernatants were then incubated o/n at 32 °C in a Thermomixer at 400 rpm; samples were protected from light during incubation. The digestion was stopped by adding 1 µL TFA, and peptides were desalted with C18 Empore disk membranes according to the StageTip protocol (21). Dried peptides were re-dissolved in 2% ACN, 0.1% TFA (10 µL) for analysis and measured without dilution. Samples were analyzed using an EASY-nLC 1200 (Thermo Fisher) coupled to a Q Exactive Plus mass spectrometer (Thermo Fisher). Peptides were separated on 16 cm frit-less silica emitters (New Objective, 75 µm inner diameter), packed in-house with reversed-phase ReproSil-Pur C18 AQ 1.9 µm resin (Dr. Maisch). Peptides were loaded on the column and eluted for 115 min using a segmented linear gradient of 5% to 95% solvent B (0 min: 5%B; 0-5 min -> 5%B; 5-65 min -> 20%B; 65-90 min -> 35%B; 90-100 min -> 55%B; 100-105 min -> 95%B; 105-115 min -> 95%B) (solvent A 0% ACN, 0.1% FA; solvent B 80% ACN, 0.1% FA) at a flow rate of 300 nL/min. Mass spectra were acquired in data-dependent acquisition mode with a TOP15 method. MS spectra were acquired in the Orbitrap analyzer with a mass range of 300–1750 m/z at a resolution of 70,000 FWHM and a target value of 3×10^6 ions. Precursors were selected with an isolation window of 1.3 m/z. HCD fragmentation was performed at a normalized collision energy of 25.
MS/MS spectra were acquired with a target value of 10^5 ions at a resolution of 17,500 FWHM, a maximum injection time of 55 ms and a fixed first mass of m/z 100. Peptides with a charge of +1, greater than 6, or with unassigned charge state were excluded from fragmentation for MS2; dynamic exclusion for 30s prevented repeated selection of precursors.
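The segmented gradient described above can be read as a piecewise-linear function of time. The following Python sketch interpolates the percentage of solvent B between the stated breakpoints and converts it to the effective acetonitrile content, given that solvent A contains 0% ACN and solvent B 80% ACN. The function and variable names are hypothetical.

```python
from typing import List, Tuple

# (time in min, % solvent B) breakpoints taken from the gradient description.
GRADIENT: List[Tuple[float, float]] = [
    (0, 5), (5, 5), (65, 20), (90, 35), (100, 55), (105, 95), (115, 95),
]
ACN_IN_SOLVENT_B = 80  # % ACN in solvent B; solvent A contains no ACN


def percent_b(t_min: float) -> float:
    """Linearly interpolate % solvent B at a given time within the 115 min run."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the full run")


def percent_acn(t_min: float) -> float:
    """Effective acetonitrile percentage of the mobile phase."""
    return percent_b(t_min) * ACN_IN_SOLVENT_B / 100


# Halfway through the 5-65 min segment the mobile phase is at 12.5% B, i.e. 10% ACN.
b_at_35 = percent_b(35)
acn_at_35 = percent_acn(35)
```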
LC-MS/MS data analysis
Raw data were processed using MaxQuant software (version 1.6.3.4, http://www.maxquant.org/) (22) with label-free quantification (LFQ) and iBAQ enabled (23). MS/MS spectra were searched by the Andromeda search engine against a combined database containing the sequences from A. thaliana (TAIR10_pep_20101214; ftp://ftp.arabidopsis.org/home/tair/Proteins/TAIR10_protein_lists/) and sequences of 248 common contaminant proteins and decoy sequences. Trypsin specificity was required and a maximum of two missed cleavages allowed. Minimal peptide length was set to seven amino acids. Carbamidomethylation of cysteine residues was set as fixed, oxidation of methionine and protein N-terminal acetylation as variable modifications. Peptide-spectrum-matches and proteins were retained if they were below a false discovery rate of 1%.
Statistical analysis of the MaxLFQ values was carried out using Perseus (version 1.5.8.5, http://www.maxquant.org/). Quantified proteins were filtered for reverse hits and hits “identified by site”, and MaxLFQ values were log2 transformed. Missing values were imputed from a normal distribution (1.8 downshift, separately for each column). After grouping samples by condition, only proteins with three valid values in at least one condition were retained for subsequent analysis. Statistically significant enrichment was determined by ANOVA followed by Tukey’s Honestly Significant Difference (HSD) test for the groups TRB1, TRB3 and EDS1 with FDR<0.05.
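The log2 transformation and the column-wise imputation of missing values can be sketched in Python as follows. This is an approximation of the idea behind the Perseus procedure, not its implementation: the downshift of 1.8 standard deviations is taken from the text, whereas the width of 0.3 standard deviations, the treatment of zero intensities as missing, and all names are assumptions.

```python
import math
import random
import statistics
from typing import List, Optional


def log2_or_missing(intensity: float) -> Optional[float]:
    """Log2-transform an LFQ intensity; zero is treated as a missing value (assumption)."""
    return math.log2(intensity) if intensity > 0 else None


def impute_column(values: List[Optional[float]], downshift: float = 1.8,
                  width: float = 0.3, seed: int = 0) -> List[float]:
    """Fill missing log2 values of one sample from a down-shifted, narrowed normal distribution."""
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        v if v is not None else rng.gauss(mean - downshift * sd, width * sd)
        for v in values
    ]


# Toy column with one missing intensity.
raw = [2.0 ** 20, 2.0 ** 22, 0.0, 2.0 ** 24, 2.0 ** 26]
logged = [log2_or_missing(x) for x in raw]
imputed = impute_column(logged)
```

Drawing from the lower tail of the observed distribution reflects the assumption that proteins are missing because they are low in abundance, which keeps imputed values from mimicking true enrichment.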
Protein expression and purification
A single colony of E. coli SoluBL21 (amsbio) carrying either pET-28b-TRB1 or pET-28b-TRB3 was used to inoculate a 5ml preculture in LB-AMP (100 mg/ml ampicillin), which was grown at 37°C, 200rpm overnight. The preculture was added to 1l LB-AMP medium and grown at 37°C, 200rpm until the OD600 reached 0.6 – 0.8. After addition of 1mM IPTG, the culture was transferred to 16°C, 200rpm overnight. Bacterial cells were collected using a JLA 10.500 rotor (Beckman/centrifuge Avanti J-25) at 4000rpm and 4°C for 10 min. Afterwards, bacterial cells were resuspended in 40ml ice-cold lysis buffer (50mM NaPO4, 300mM NaCl, 10mM imidazole, 0.1M PMSF at pH 7.5) and disrupted by sonication (Ultrasonic-Desintegrator, Branson) (Programme: Strength: 6, Duty cycle: 40, 3x 2min). The cell debris was removed using a JA 25.50 rotor (Beckman/centrifuge Avanti J-25) at 13000rpm and 4°C for 30 min. For affinity purification, 500µl of Ni-NTA Agarose beads (Qiagen) were washed three times with 5ml lysis buffer, collected at 800g, 4°C for 1 min and added to the cell lysate. After incubation at 4°C, 12rpm for 2h, the beads were collected, transferred to a fresh 5ml Eppendorf tube and washed five times with 5ml washing buffer (50mM NaPO4, 300mM NaCl, 20mM imidazole, pH 7.5) at 4°C, 12rpm for 5min. To elute proteins, the beads were incubated with 1.5ml elution buffer (50mM NaPO4, 300mM NaCl, 250mM imidazole, pH 7.5) at 4°C, 12rpm for 2h. After pelleting the beads at 4°C, 800g for 1 min, the supernatant was collected and dialysed in 500ml dialysis buffer (50mM NaPO4, 300mM NaCl, pH 7.5) using Slide-A-Lyzer Dialysis Cassettes (10K MWCO, 3 mL, ThermoFisher Scientific) to remove the imidazole. Dialysed proteins were collected in Protein LoBind Tubes (Eppendorf) and kept at 4°C. Protein quantity was determined with the Bio-Rad Protein Assay according to the manufacturer’s instructions.
To check protein integrity, 1µg of protein was mixed with 2x SDS-loading buffer (126 mM Tris-HCl (pH 6.8), 20% glycerol, 4% SDS, 0.02% bromophenol blue), incubated at 95°C for 5min, and run in 1x TGS buffer (Bio-Rad) on a 1.5mm, 12% SDS-PAGE gel at 100 V for approximately 1.5h. The gel was stained with Coomassie brilliant blue staining solution (1g Coomassie Brilliant Blue (Bio-Rad), 500ml MeOH, 100ml glacial acetic acid, 400ml H2O) and de-stained with H2O overnight.
Microscale thermophoresis (MST)
Forward and reverse 5’-Cy3-labelled oligonucleotides of 28bp were ordered from Sigma-Aldrich. Sequences originating from the SEP3 promoter region were Cy3-proSEP3-telo-box-Cy3: TTTAAATGTTAGGGTTTTTTGTAGGATT and Cy3-proSEP3-NonInter-Cy3: AAAAATATTTATATCACATCATTGTTAT. Two versions of the (C)RACCTA motif were Cy3-(C)AACCTAA-Cy3: CATCATGGCAACCTAAGGCTGGTACTAG and Cy3-(C)GACCTAA-Cy3: CATCATGGCGACCTAAGGCTGGTACTAG. A four-telo-box-repeat (R4) oligomer, Cy3-R4-telo-box-Cy3: GGTTTAGGGTTTAGGGTTTAGGGTTTAG, was published in (24). Annealing was carried out in a heating block in dialysis buffer at a concentration of 10µM sense and anti-sense oligonucleotides by first incubating at 95°C for 15 min and then cooling slowly by switching off the heating block.
For all MST experiments, the Monolith NT.115 instrument (NanoTemper Technologies) and 1x dialysis buffer with Tween 20 (0.05%) and BSA (1.25 mg/ml) were used. Oligomer fluorescence intensity, absorption, and bleaching were tested with the instrument’s green channel via the Pretest feature included in the machine’s MO.Control software (NanoTemper Technologies). Samples were prepared according to the suggested protocol included in the software. Oligomer concentration was adjusted to obtain ≥200 fluorescent counts at a laser power ≤ 80%. These conditions were met at an oligomer concentration of 20nM and an IR-laser power of 60% for Cy3-proSEP3-telo-box-Cy3 and Cy3-proSEP3-NonInter-Cy3 and of 80% for Cy3-R4-telo-box-Cy3, Cy3-(C)AACCTAA-Cy3, and Cy3-(C)GACCTAA-Cy3. Next, general TRB-telo-box/telo-box-like element interaction and the suitability of different capillaries were tested via the Binding Check feature, with samples prepared as suggested by the software. For this purpose, the highest possible protein concentration was mixed with 20nM of fluorescently labelled oligomers and incubated at RT in the dark for 10 min. The TRB-dsDNA mix was then loaded onto Monolith NT.115 Premium Capillaries (NanoTemper Technologies), which prevented surface adsorption, as TRB proteins tended to adsorb to standard capillaries. Finally, TRB-telo-box/telo-box-like element interaction was quantified using the software’s Binding Affinity feature. For this MST assay, a dilution series was prepared according to the software’s instructions, using the previously determined laser powers and 20nM of oligomer mixed with 8.5µM to 260pM of TRB protein. The TRB-DNA mix was incubated at RT in the dark for 10 min before being loaded onto Premium capillaries. Each measurement was repeated at least three times. Binding curves were analysed and KD values were calculated with the MO.Affinity Analysis software (NanoTemper Technologies) according to the manufacturer’s instructions.
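The stated protein range of 8.5µM to 260pM is consistent with a 16-point two-fold serial dilution, as the following Python sketch shows. The number of points and the dilution factor are assumptions that are not given in the text, and the function name is hypothetical.

```python
from typing import List


def serial_dilution(top_conc_molar: float, n_points: int = 16,
                    factor: float = 2.0) -> List[float]:
    """Concentrations (in mol/l) of a serial dilution starting at top_conc_molar."""
    return [top_conc_molar / factor ** i for i in range(n_points)]


# Assumed layout: 16 capillaries, each step diluting the protein two-fold.
series = serial_dilution(8.5e-6)
lowest_pm = series[-1] * 1e12  # lowest concentration expressed in pM
```

Under these assumptions the lowest concentration is 8.5µM / 2^15, which is approximately 259pM and rounds to the 260pM given above.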
References
1. Zhou Y, Wang Y, Krause K, Yang T, Dongus JA, Zhang Y, et al. Telobox motifs recruit CLF/SWN-PRC2 for H3K27me3 deposition via TRB factors in Arabidopsis. Nat Genet. 2018 May;50(5):638–44.
2. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet j. 2011 May 2;17(1):10.
3. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009 Jul 15;25(14):1754–60.
4. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078–9.
5. Ramírez F, Ryan DP, Grüning B, Bhardwaj V, Kilpert F, Richter AS, et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016 Jul 8;44(W1):W160–5.
6. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010 Mar 15;26(6):841–2.
7. Stovner EB, Sætrom P. epic2 efficiently finds diffuse domains in ChIP-seq data. Bioinformatics. 2019 Nov 1;35(21):4392–3.
8. Li Q, Brown JB, Huang H, Bickel PJ. Measuring reproducibility of high-throughput experiments. Ann Appl Stat. 2011 Sep;5(3):1752–79.
9. Zhou Y, Hartwig B, James GV, Schneeberger K, Turck F. Complementary activities of TELOMERE REPEAT BINDING proteins and polycomb group complexes in transcriptional regulation of target genes. Plant Cell. 2016 Jan;28(1):87–101.
10. Zheng S-Y, Guan B-B, Yuan D-Y, Zhao Q-Q, Ge W, Tan L-M, et al. Dual roles of the Arabidopsis PEAT complex in histone H2A deubiquitination and H4K5 acetylation. Mol Plant. 2023 Nov 6;16(11):1847–65.
11. Shu J, Chen C, Thapa RK, Bian S, Nguyen V, Yu K, et al. Genome-wide occupancy of histone H3K27 methyltransferases CURLY LEAF and SWINGER in Arabidopsis seedlings. Plant Direct. 2019 Jan 31;3(1):e00100.
12. Wang M, Zhong Z, Gallego-Bartolomé J, Feng S, Shih Y-H, Liu M, et al. Arabidopsis TRB proteins function in H3K4me3 demethylation by recruiting JMJ14. Nat Commun. 2023 Mar 28;14(1):1736.
13. Harrell Jr FE. Hmisc: Harrell Miscellaneous. The R Foundation. 2024.
14. Wang M, Zhao Y, Zhang B. Efficient Test and Visualization of Multi-Set Intersections. Sci Rep. 2015 Nov 25;5:16923.
15. Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016 Sep 15;32(18):2847–9.
16. Grant CE, Bailey TL. XSTREME: Comprehensive motif analysis of biological sequence datasets. bioRxiv. 2021 Jan 1.
17. O’Malley RC, Huang S-SC, Song L, Lewsey MG, Bartlett A, Nery JR, et al. Cistrome and epicistrome features shape the regulatory DNA landscape. Cell. 2016 May 19;165(5):1280–92.
18. Yu G, Wang L-G, He Q-Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics. 2015 Jul 15;31(14):2382–3.
19. Wang Q, Li M, Wu T, Zhan L, Li L, Chen M, et al. Exploring epigenomic datasets by chipseeker. Curr Protoc. 2022 Oct;2(10):e585.
20. Wu T, Hu E, Xu S, Chen M, Guo P, Dai Z, et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation (Camb). 2021 Aug 28;2(3):100141.
21. Rappsilber J, Ishihama Y, Mann M. Stop and go extraction tips for matrix-assisted laser desorption/ionization, nanoelectrospray, and LC/MS sample pretreatment in proteomics. Anal Chem. 2003 Feb 1;75(3):663–70.
22. Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol. 2008 Dec;26(12):1367–72.
23. Tyanova S, Temu T, Cox J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat Protoc. 2016 Dec;11(12):2301–19.
24. Hofr C, Sultesová P, Zimmermann M, Mozgová I, Procházková Schrumpfová P, Wimmerová M, et al. Single-Myb-histone proteins from Arabidopsis thaliana: a quantitative study of telomere-binding specificity and kinetics. Biochem J. 2009 Apr 1;419(1):221–8.
