Metazoan transcriptional repressors regulate chromatin through diverse histone modifications. Contributions of individual factors to the chromatin landscape in development are difficult to establish, as global surveys reflect multiple changes in regulators. Therefore, we studied the conserved Hairy/Enhancer of Split family repressor Hairy, analyzing histone marks and gene expression in Drosophila embryos. This long-range repressor mediates histone acetylation and methylation in large blocks, with highly context-specific effects on target genes. Most strikingly, Hairy exhibits biochemical activity on many loci that is uncoupled from changes in gene expression. Rather than representing inert binding sites, as suggested for many eukaryotic factors, many regions are targeted errantly by Hairy to modify the chromatin landscape. Our findings emphasize that identification of active cis-regulatory elements must extend beyond the survey of prototypical chromatin marks. We speculate that this errant activity may provide a path for creation of new regulatory elements, facilitating the evolution of novel transcriptional circuits.
Microarray_raw_data.tar
Expression profiling analysis: Transcriptome data from four biological replicates were generated using 8x15K Customized Drosophila Genome Oligo Microarrays (Agilent). Slide image data was quantified using Agilent's Feature Extraction software.
diffReps_files.tar
Differentially changed genomic regions for histone marks were identified using the diffReps program, which uses a sliding window approach to scan the genome and find regions showing read count differences. Regions detected by diffReps were associated with genes by identifying the nearest RefSeq TSS and annotated to a genomic feature such as intergenic, intron, exon etc.
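The nearest-TSS association described above can be illustrated with a minimal Python sketch. It is not the annotation code used to produce these files: the function name `nearest_gene`, the use of the region midpoint, and the gene names in the usage comment are assumptions made for illustration only.

```python
from bisect import bisect_left


def nearest_gene(chrom, start, end, tss_by_chrom):
    """Return (gene, distance) for the TSS closest to the region midpoint.

    tss_by_chrom maps a chromosome name to a list of (position, gene)
    tuples sorted by position. Using the midpoint of the region is an
    assumption of this sketch, not a documented property of the
    annotation tools.
    """
    entries = tss_by_chrom.get(chrom, [])
    if not entries:
        return None  # no annotated TSS on this chromosome
    positions = [pos for pos, _ in entries]
    mid = (start + end) // 2
    i = bisect_left(positions, mid)
    # Only the TSS just left and just right of the midpoint can be nearest.
    candidates = entries[max(i - 1, 0):i + 1]
    pos, gene = min(candidates, key=lambda e: abs(e[0] - mid))
    return gene, abs(pos - mid)


# Hypothetical usage with made-up gene names and coordinates:
# tss = {"chr2L": [(1000, "geneA"), (5000, "geneB")]}
# nearest_gene("chr2L", 1200, 1400, tss)
```

Keeping the TSS positions sorted makes each lookup a binary search, which matters when many regions are annotated against a whole-genome TSS list.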
HOMER_peak_files.tar
Using HOMER with default settings, peaks for histone marks and FLAG-tagged Hairy protein were identified using signals from H3 ChIP and input, respectively, as background. Peaks called by HOMER were associated with genes by identifying the nearest RefSeq TSS and annotated to a genomic feature such as intergenic, intron, exon etc.
bedgraph_files.tar
ChIP-Seq experiments were visualized as custom tracks using Integrative Genomics Viewer (Broad Institute). Total uniquely mapped tags were normalized to 10 million reads to generate tracks using HOMER.
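The normalization to 10 million reads amounts to a simple rescaling of per-position tag counts. The Python sketch below shows only that arithmetic; the function name and the example numbers are hypothetical, and the tracks in this archive were generated by HOMER, not by this code.

```python
def normalize_coverage(coverage, total_unique_tags, target=10_000_000):
    """Scale raw tag counts so the library behaves as if it had `target` tags.

    coverage: raw per-interval tag counts from one experiment.
    total_unique_tags: number of uniquely mapped tags in that experiment.
    """
    if total_unique_tags <= 0:
        raise ValueError("total_unique_tags must be positive")
    scale = target / total_unique_tags
    return [count * scale for count in coverage]


# Hypothetical example: a library with 20 million unique tags is scaled by 0.5.
# normalize_coverage([2, 4], 20_000_000)
```

Scaling every experiment to the same nominal depth makes track heights comparable across samples that were sequenced to different depths.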
Machine_learning_CodeANDResults
To predict which genes fall in the repressed and not-repressed categories, we wrote Python and Java code that partitions our dataset into 10 parts and performs feature selection and 10-fold cross-validation classification using the Weka machine learning software. The folder includes the code used in this analysis and the raw results that are summarized in the main text. For further description, a readme file is provided in the WekaCode folder.
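As a rough illustration of the 10-part partitioning, the following Python sketch splits a list of items into 10 folds and yields one train/test pair per fold. It is not the code shipped in the WekaCode folder: the function name, the fixed random seed, and the unstratified round-robin assignment are all assumptions of this sketch.

```python
import random


def ten_fold_splits(items, k=10, seed=0):
    """Yield (train, test) lists for k-fold cross-validation.

    Items are shuffled once with a fixed seed and dealt round-robin into
    k folds, so fold sizes differ by at most one. No class stratification
    is performed here.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Hypothetical usage with gene identifiers as items:
# for train, test in ten_fold_splits(gene_ids):
#     ...  # select features on `train` only, then evaluate on `test`
```

Selecting features on the training portion of each fold, and not on the full dataset, keeps the held-out genes from influencing which features are chosen.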
bed_files.tar.gz part a
Raw sequencing reads were mapped to the genome as described in Materials and methods. The output bam files were converted to bed files and compressed into the archive bed_files.tar.gz, which was split into five pieces (parts a, b, c, d and e) using the split command. The pieces can be recombined into bed_files.tar.gz using the cat command.
bed_files.tar.gz_a
bed_files.tar.gz part b
Raw sequencing reads were mapped to the genome as described in Materials and methods. The output bam files were converted to bed files and compressed into the archive bed_files.tar.gz, which was split into five pieces (parts a, b, c, d and e) using the split command. The pieces can be recombined into bed_files.tar.gz using the cat command.
bed_files.tar.gz_b
bed_files.tar.gz part c
Raw sequencing reads were mapped to the genome as described in Materials and methods. The output bam files were converted to bed files and compressed into the archive bed_files.tar.gz, which was split into five pieces (parts a, b, c, d and e) using the split command. The pieces can be recombined into bed_files.tar.gz using the cat command.
bed_files.tar.gz_c
bed_files.tar.gz part d
Raw sequencing reads were mapped to the genome as described in Materials and methods. The output bam files were converted to bed files and compressed into the archive bed_files.tar.gz, which was split into five pieces (parts a, b, c, d and e) using the split command. The pieces can be recombined into bed_files.tar.gz using the cat command.
bed_files.tar.gz_d
bed_files.tar.gz part e
Raw sequencing reads were mapped to the genome as described in Materials and methods. The output bam files were converted to bed files and compressed into the archive bed_files.tar.gz, which was split into five pieces (parts a, b, c, d and e) using the split command. The pieces can be recombined into bed_files.tar.gz using the cat command.
bed_files.tar.gz_e
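For users without the cat command, the five pieces listed above can also be joined by plain byte-wise concatenation in order a through e. The Python sketch below is one way to do that; the function name `recombine` is hypothetical, and the piece names in the usage comment are taken from the file list above.

```python
import shutil


def recombine(parts, output):
    """Concatenate the given files, in order, into a single output file."""
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large pieces are not loaded into memory.
                shutil.copyfileobj(src, out)


# Usage, assuming all five pieces are in the current directory:
# parts = ["bed_files.tar.gz_" + suffix for suffix in "abcde"]
# recombine(parts, "bed_files.tar.gz")
```

The order of the pieces matters: concatenating them out of sequence produces an archive that cannot be decompressed.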