Data from: Protein-Chromophore interactions controlling photoisomerization in red/green cyanobacteriochromes
Rockwell, Nathan; Moreno, Marcus; Lagarias, J. Clark; Martin, Shelley (2022), Data from: Protein-Chromophore interactions controlling photoisomerization in red/green cyanobacteriochromes, Dryad, Dataset, https://doi.org/10.25338/B89K9J
Cyanobacteriochromes (CBCRs) are cyanobacterial photoreceptors distantly related to the phytochromes found in a broad range of bacteria, algae, and plants. CBCRs can control several aspects of cyanobacterial photobiology, including phototaxis, motile/sessile transitions, and complementary chromatic acclimation. CBCRs can sense a very broad range of light, with different subfamilies detecting UV, violet, blue, teal, green, yellow, orange, red, or far-red light (375-745 nm). This dataset is associated with a study of red/green CBCRs, a group in which red-absorbing dark states give rise to green-absorbing photoproducts upon light absorption and subsequent 15,16-photoisomerization of the phycocyanobilin chromophore. Interestingly, some members of this group fail to undergo this reaction. By comparing a conserved lineage of these red-inactive CBCRs to their photoactive red/green relatives, we identified three residues that determine whether photoisomerization can occur: introduction of these three substitutions is sufficient to block photoisomerization in a red/green CBCR or restore it in a red-inactive one. The engineered red-inactive variants also mimic other properties of naturally occurring red-inactive CBCRs. This study thus demonstrates that it is possible to engineer the fate of the excited-state population of a biological photoreceptor with only a few amino acid substitutions.
This work used maximum likelihood phylogenetic analysis to identify a conserved lineage of red-inactive CBCRs. Chosen proteins were then obtained through commercial gene synthesis and recombinantly expressed in E. coli cells that had been engineered to produce phycocyanobilin or other bilin chromophores. Purified proteins were then characterized using absorption, fluorescence, and circular dichroism (CD) spectroscopy. This dataset provides raw data for that analysis. Data are organized as a single gzipped tarball with an associated README file. Within the compressed archive, there are five subdirectories containing files for phylogenetic analysis, for absorption spectroscopy, for fluorescence spectroscopy, for CD spectroscopy, and for the sequences of the synthetic genes. Phylogenetic files are in CLUSTAL, PHYLIP, and Newick format (starting alignment in CLUSTAL format; input file for phylogenetic inference, PHYLIP format; output tree, Newick format). Spectroscopic data are in .csv or tab-delimited flat text. Synthetic gene sequences are in FASTA format.
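As a quick orientation to the archive layout described above, the following minimal Python sketch counts the files found under each directory of a gzipped tarball without extracting it. The function name `summarize_archive` and any archive file name passed to it are hypothetical; the actual archive and subdirectory names are given in the README, not here.

```python
import posixpath
import tarfile
from collections import Counter


def summarize_archive(archive_path):
    """Return {directory: number of regular files} for a .tar.gz archive."""
    counts = Counter()
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Skip directory entries and links; count only regular files.
            if member.isfile():
                counts[posixpath.dirname(member.name)] += 1
    return dict(counts)
```

Calling `summarize_archive` on the downloaded tarball should show one entry per subdirectory (plus any nested directories), which is a convenient way to confirm that the download is complete before unpacking.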
Absorption spectra were collected on a Cary 50 spectrophotometer and were exported from the Cary software in .csv format. Those files were converted to unix newlines on the unix command line and are deposited here in that format. Processed files used for figure presentation are also included as tab-delimited text.
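The newline conversion mentioned above was done with Unix command-line tools that are not named here. For readers who need to apply the same normalization to their own instrument exports, a rough Python equivalent is sketched below; the function names are hypothetical and this is not the procedure the authors used.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Convert Windows (CRLF) and classic Mac (CR) line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(source_path, destination_path):
    """Write a copy of source_path with Unix newlines to destination_path."""
    # Binary mode avoids any implicit newline translation by Python.
    with open(source_path, "rb") as source:
        converted = to_unix_newlines(source.read())
    with open(destination_path, "wb") as destination:
        destination.write(converted)
```

The deposited files already use Unix newlines, so this step is only relevant when comparing them against freshly exported data.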
Fluorescence spectra were collected on a QM-6/2005SE fluorimeter equipped with red-enhanced photomultiplier tubes (Photon Technology International 814 Series). Files were displayed in tabular format and copy/pasted for text export. Emission spectra were numerically integrated for estimation of fluorescence quantum yield, and the data used for quantum yield estimation are included as tab-delimited text along with processed files used for figure presentation (also as tab-delimited text).
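The integration step can be illustrated with a minimal Python sketch. It assumes the trapezoidal rule and a two-column spectrum (wavelength in nm, emission intensity); the integration scheme actually used is not stated here, and the function name `integrate_emission` is hypothetical.

```python
def integrate_emission(wavelengths_nm, intensities):
    """Integrate an emission spectrum over wavelength by the trapezoidal rule."""
    if len(wavelengths_nm) != len(intensities):
        raise ValueError("wavelength and intensity lists differ in length")
    area = 0.0
    for i in range(1, len(wavelengths_nm)):
        # Area of the trapezoid between neighbouring data points.
        width = wavelengths_nm[i] - wavelengths_nm[i - 1]
        area += 0.5 * width * (intensities[i] + intensities[i - 1])
    return area
```

The integrated area of a sample is typically compared with that of a reference measured under matched conditions; see the manuscript for the reference and the procedure used in this study.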
Circular dichroism spectra were collected on an Applied Photophysics Chirascan and were converted to tab-delimited text using the Applied Photophysics software. Those files were converted to unix newlines on the unix command line and are deposited here in that format. Processed files used for figure presentation are also included as tab-delimited text (compressed archive, .tar.gz extension).
Phylogenetic data include the three files used in the workflow for the published phylogeny. A multiple sequence alignment was constructed using MAFFT v7.450 (full command-line settings are in the manuscript). This file is deposited here in CLUSTAL format (.aln extension). For use in calculating a phylogeny, this file was combined with structural information generated using STRIDE or DSSP on the command line; this was performed as described in the main manuscript. The resulting file is deposited here in PHYLIP format (.phy extension) and can be used in PhyML-structure (full command-line settings are in the manuscript). The output phylogeny is deposited here in Newick format (.txt extension). For figure preparation, the Newick file was processed in FigTree prior to annotation, scaling, and coloring in Adobe Illustrator.
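To check which sequences appear in the deposited tree without opening FigTree, the leaf labels can be pulled from the Newick text. The sketch below is a simplified, standard-library-only approach that assumes unquoted labels without spaces or brackets; it is not a full Newick parser, and the function name `newick_leaf_names` is hypothetical.

```python
import re

# A leaf label directly follows "(" or ","; internal node labels and
# support values follow ")" and are therefore not matched.
_LEAF_PATTERN = re.compile(r"[(,]\s*([^(),:;\s]+)")


def newick_leaf_names(newick_text):
    """Return leaf labels from a Newick string, in order of appearance."""
    return _LEAF_PATTERN.findall(newick_text)
```

For example, `newick_leaf_names("((A:0.1,B:0.2)0.95:0.3,C:0.4);")` returns `['A', 'B', 'C']`. Trees with quoted labels or bracketed comments need a dedicated phylogenetics library instead.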
Note that a README.txt is associated with this submission.
Synthetic sequences are reported as the sequenced insert, including flanking restriction sites used in cloning.
U.S. Department of Energy, Award: DE-FG02-09ER16117
National Institutes of Health, Award: R35GM139598