
GUN4 appeared early in cyanobacterial evolution

Cite this dataset

Rockwell, Nathan C.; Lagarias, J. Clark (2023). GUN4 appeared early in cyanobacterial evolution [Dataset]. Dryad.


Photosynthesis relies on chlorophylls, which are synthesized via a common tetrapyrrole trunk pathway also leading to heme, vitamin B12, and other pigmented cofactors. The first committed step for chlorophyll biosynthesis is insertion of magnesium into protoporphyrin IX by magnesium chelatase. Magnesium chelatase is composed of H-, I-, and D-subunits, with the tetrapyrrole substrate binding to the H-subunit. This subunit is rapidly inactivated in the presence of substrate, light, and oxygen, so oxygenic photosynthetic organisms require mechanisms to protect magnesium chelatase from similar loss of function. An additional protein, GUN4, binds to the H-subunit and to tetrapyrroles. GUN4 has been proposed to serve this protective role via its ability to bind linear tetrapyrroles (bilins). In the current work, we probe the origins of bilin binding by GUN4 via comparative phylogenetic analysis and biochemical validation of a conserved bilin-binding motif. Based on our results, we propose that bilin-binding GUN4 proteins arose early in cyanobacterial evolution and that this early acquisition represents an ancient adaptation for maintaining chlorophyll biosynthesis in the presence of light and oxygen.


Phylogenetic data include the three files used in the workflow for inferring the maximum-likelihood phylogenies. For each analysis, a multiple sequence alignment was constructed using MAFFT v7.450 (command-line settings --genafpair --maxiterate 16 --clustalout --reorder). This file is deposited here in CLUSTAL format (.aln extension). The file was then converted to PHYLIP format with removal of gap-enriched columns (≥5%) using an in-house script. The PHYLIP format file was then used to infer a maximum-likelihood phylogeny in PhyML-3.1 with 100 bootstraps. Command-line settings for nucleic acid alignments were -m GTR -s SPR -a e -c 4 -v e -o tlr -b 100, and command-line settings for protein alignments were -m WAG -d aa -s SPR -a e -c 4 -v e -o tlr -b 100. Statistical robustness was assessed using the transfer bootstrap expectation (TBE) as calculated in Booster. For figure preparation, the Newick file output by Booster was processed in FigTree prior to annotation, scaling, and coloring in Adobe Illustrator. For each analysis, three files are deposited: the initial sequence alignment in CLUSTAL format, the input file for PhyML in PHYLIP format, and the output file from Booster in Newick format. For each Bayesian phylogeny, the input file is deposited (in NEXUS format). These input files were generated from the PhyML input files using the command-line -convert option of CLUSTAL.
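The in-house conversion script is not part of this deposit. As a rough illustration of the step between the CLUSTAL and PHYLIP files, the following Python sketch parses a CLUSTAL alignment, drops gap-enriched columns, and writes relaxed PHYLIP. It assumes that "≥5%" means columns in which at least 5% of sequences carry a gap character are removed; the function names and the relaxed (whitespace-separated) PHYLIP layout are likewise assumptions, not the authors' implementation.

```python
def parse_clustal(text):
    """Parse CLUSTAL-format text into a dict of name -> full aligned sequence."""
    seqs = {}
    for line in text.splitlines():
        # Skip blank lines, the "CLUSTAL ..." header, and conservation
        # lines (which begin with whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        name, chunk = parts[0], parts[1]
        # Interleaved blocks: append each chunk to the growing sequence.
        seqs[name] = seqs.get(name, "") + chunk
    return seqs


def drop_gappy_columns(seqs, max_gap_fraction=0.05):
    """Remove columns whose gap fraction is >= max_gap_fraction (assumed rule)."""
    names = list(seqs)
    length = len(seqs[names[0]])
    n = len(names)
    keep = [
        i for i in range(length)
        if sum(seqs[name][i] == "-" for name in names) / n < max_gap_fraction
    ]
    return {name: "".join(seqs[name][i] for i in keep) for name in names}


def to_phylip(seqs):
    """Write sequences as relaxed sequential PHYLIP text."""
    names = list(seqs)
    length = len(seqs[names[0]])
    lines = [f"{len(names)} {length}"]
    for name in names:
        lines.append(f"{name}  {seqs[name]}")
    return "\n".join(lines) + "\n"
```

With these helpers, a deposited .aln file could be read as text, passed through `parse_clustal` and `drop_gappy_columns`, and written out with `to_phylip`. Output from this sketch is not guaranteed to match the deposited PHYLIP files byte for byte.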

Absorption spectra were collected on a Cary 50 spectrophotometer and were exported from the Cary software in .csv format. Those files were converted to unix newlines on the unix command line and post-processed to remove metadata. Kaleidagraph was used to plot the spectra and prepare figures. Spectra used for figure preparation were exported from Kaleidagraph as tab-delimited text files with unix newlines, and those files are deposited here in that format.
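The newline conversion and metadata removal described above could be approximated as follows. This is a minimal sketch, not the procedure actually used: it assumes the exported .csv contains comma-separated numeric rows (wavelength, absorbance) surrounded by non-numeric header and metadata lines, and the function name `clean_spectrum_csv` is hypothetical.

```python
def clean_spectrum_csv(raw: bytes) -> str:
    """Return tab-delimited numeric rows with Unix newlines.

    Assumption: any line that is not made up entirely of two or more
    numeric fields is header or metadata and can be discarded.
    """
    text = raw.decode("utf-8", errors="replace")
    # Normalize Windows (CRLF) and classic Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    rows = []
    for line in text.split("\n"):
        # Split on commas and drop empty fields left by trailing commas.
        fields = [f.strip() for f in line.split(",")]
        fields = [f for f in fields if f]
        if len(fields) < 2:
            continue
        try:
            for f in fields:
                float(f)
        except ValueError:
            continue  # header or metadata line
        rows.append("\t".join(fields))
    return "\n".join(rows) + "\n" if rows else ""
```

Reading the file in binary mode (for example `open(path, "rb").read()`) keeps the original line endings intact until the function normalizes them.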

Fluorescence spectra were collected on a QM-6/2005SE fluorimeter equipped with red-enhanced photomultiplier tubes (Photon Technology International 814 Series). Files were displayed in tabular format and copy/pasted for text export. Kaleidagraph was then used to plot the spectra and prepare figures, with final file preparation carried out as for absorption spectra.

Usage notes

All files are reported as flat text files with unix newlines. A README.txt is associated with this submission.


United States Department of Energy, Award: DE-SC0002395