Chapter 1:

Eukaryotic protein synthesis generally initiates at a start codon defined by an AUG and its surrounding Kozak sequence context, but the quantitative importance of this context in different species is unclear. We tested this concept in two pathogenic Cryptococcus yeast species by genome-wide mapping of translation and of mRNA 5’ and 3’ ends. We observed thousands of AUG-initiated upstream open reading frames (uORFs) that are major contributors to translational repression. Use of a uORF depends on the Kozak sequence context of its start codon, and uORFs with strong contexts promote nonsense-mediated mRNA decay. Transcript leaders in Cryptococcus and other fungi are substantially longer and more AUG-dense than in Saccharomyces. Numerous Cryptococcus mRNAs encode predicted dual-localized proteins, including many aminoacyl-tRNA synthetases, in which a leaky AUG start codon is followed by an in-frame AUG in a strong Kozak context, with the two separated by a mitochondrial-targeting sequence. Analysis of other fungal species shows that such dual localization is also predicted to be common in the ascomycete mould Neurospora crassa. Kozak-controlled regulation is correlated with insertions in translation initiation factors, within fidelity-determining regions that contact the initiator tRNA. Thus, start codon context is a signal that quantitatively programs both the expression and the structures of proteins in diverse fungi.
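
To make the notions of an upstream AUG and its sequence context concrete, the following is a minimal illustrative sketch in Python. It is not the analysis pipeline used in this work. The function and field names are hypothetical, and the context window of -3 to +4 (with the A of the AUG as +1) is an assumption chosen for illustration. The sketch only lists AUGs in the transcript leader, their flanking nucleotides, and the short reading frame each one opens; it does not score context strength.

```python
from typing import List, NamedTuple

STOP_CODONS = {"UAA", "UAG", "UGA"}


class UpstreamAUG(NamedTuple):
    position: int      # 0-based index of the A of the upstream AUG
    context: str       # nucleotides -3..+4 (assumed window; truncated near the 5' end)
    frame_offset: int  # (main_start - position) % 3; 0 means in frame with the main ORF
    orf_codons: int    # codons read from the AUG before a stop or the end of the mRNA
    has_stop: bool     # True if an in-frame stop codon was found


def find_upstream_augs(mrna: str, main_start: int) -> List[UpstreamAUG]:
    """List AUGs in the transcript leader (hypothetical helper, for illustration)."""
    mrna = mrna.upper().replace("T", "U")
    if mrna[main_start:main_start + 3] != "AUG":
        raise ValueError("main_start must point at the annotated AUG")

    hits = []
    for pos in range(main_start):
        if mrna[pos:pos + 3] != "AUG":
            continue
        # Walk codon by codon from the upstream AUG until an in-frame stop.
        codons, has_stop = 0, False
        for i in range(pos, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOP_CODONS:
                has_stop = True
                break
            codons += 1
        hits.append(
            UpstreamAUG(
                position=pos,
                context=mrna[max(0, pos - 3):pos + 4],
                frame_offset=(main_start - pos) % 3,
                orf_codons=codons,
                has_stop=has_stop,
            )
        )
    return hits


# Toy transcript: a 21-nt leader containing one uORF, then the main ORF.
toy_mrna = "GGAAAAUGGCUUAAGGCCACC" + "AUGGCUUGA"
upstream = find_upstream_augs(toy_mrna, main_start=21)
```

In the toy transcript, the single upstream AUG opens a two-codon uORF that terminates before the annotated start codon. A real analysis would additionally need transcript 5’ end annotations and ribosome occupancy data, which this sketch does not model.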

Chapter 2:

The human pathogenic yeast Cryptococcus neoformans silences transposable elements using endo-siRNAs and an Argonaute, Ago1. Endo-siRNA production requires the RNA-dependent RNA polymerase, Rdp1, and two partially redundant Dicer enzymes, Dcr1 and Dcr2, but is independent of histone H3 lysine 9 methylation. We describe here an insertional mutagenesis screen for factors required to suppress the mobilization of the C. neoformans HARBINGER family DNA transposon HAR1. Validation experiments uncovered five novel genes (RDE1-5) required for HAR1 suppression and global production of suppressive endo-siRNAs. The RDE genes do not impact transcript levels, suggesting that the endo-siRNAs do not act by altering target transcript synthesis or turnover. RDE3 encodes a non-Dicer RNase III related to S. cerevisiae Rnt1, RDE4 encodes a predicted terminal nucleotidyltransferase, and RDE5 encodes no strongly predicted domains. Affinity purification-mass spectrometry studies suggest that Rde3 and Rde5 are physically associated. RDE1 encodes a G-patch protein homologous to S. cerevisiae Sqs1/Pfa1, a nucleolar protein that directly activates the essential helicase Prp43 during rRNA biogenesis. Rde1 copurifies with Rde2, another novel protein obtained in the screen, as well as with Ago1, with a homolog of Prp43, and with numerous predicted nucleolar proteins. We also describe the isolation of conditional alleles of PRP43, which are defective in RNAi. This work reveals unanticipated requirements for a non-Dicer RNase III and presumptive nucleolar factors in endo-siRNA biogenesis and in the suppression of transposon mobilization in C. neoformans.

Chapter 3:

Tools to understand how the spliceosome functions in vivo have lagged behind advances in its structural biology. We describe methods to globally profile spliceosome-bound precursors, intermediates and products at nucleotide resolution. We apply these tools to three divergent yeast species that span 600 million years of evolution. The sensitivity of the approach enables detection of novel cases of non-canonical catalysis, including interrupted, recursive and nested splicing. Employing statistical modeling to understand the quantitative relationships between RNA features and the data, we uncover independent roles for intron size, position and number in substrate progression through the two catalytic stages. These include species-specific inputs suggestive of spliceosome-transcriptome coevolution. Further investigations reveal ATP-dependent discard of numerous endogenous substrates at both the precursor and lariat-intermediate stages and connect discard to intron retention, a form of splicing regulation. Spliceosome profiling is a quantitative, generalizable global technology to investigate an RNP central to eukaryotic gene expression.

Chapter 4:

We determined that over 60 spliceosomal proteins are conserved between many fungal species and humans but were lost during the evolution of S. cerevisiae, an intron-poor yeast with unusually rigid splicing signals. We analyzed null mutations in a subset of these factors, most of which had not been investigated previously, in the intron-rich yeast Cryptococcus neoformans. We found that they govern the splicing efficiency of introns with divergent spacing between intron elements. Importantly, most of these factors also suppress usage of weak nearby cryptic/alternative splice sites. Among these, orthologs of GPATCH1 and the helicase DHX35 display correlated functional signatures and copurify with each other as well as with components of catalytically active spliceosomes, identifying a conserved G-patch/helicase pair that promotes splicing fidelity. We propose that a significant fraction of spliceosomal proteins in humans and most eukaryotes are involved in limiting splicing errors, potentially through kinetic proofreading mechanisms, thereby enabling greater intron diversity.

Data from: Quantitative global studies reveal differential translational control by start codon context across the fungal kingdom
