Underlying data for characterization of phycocyanobilin (PCB) biosynthesis in Galdieria sulphuraria
Data files
Apr 18, 2025 version: 1.18 MB
Aug 01, 2025 version: 1.26 MB
- Galdieria_PCB_biosynthesis_updated.tar.gz (1.25 MB)
- README.md (11.05 KB)
Dec 11, 2025 version: 4.93 MB
- Galdieria_PCB_biosynthesis_version3.tar.gz (4.92 MB)
- README.md (12 KB)
Abstract
This dataset contains data for studies of phycocyanobilin (PCB) synthesis in the red alga Galdieria sulphuraria. Red algae such as G. sulphuraria utilize phycobilisomes for light harvesting. The phycobilisomes of early-branching organisms such as G. sulphuraria or Cyanidioschyzon merolae contain PCB chromophores but not phycoerythrobilin (PEB), in contrast to the phycobilisomes of other red algae. The studies reported in this dataset examine biosynthesis of PCB in G. sulphuraria and C. merolae, starting from biliverdin IX-alpha (BV), the last known common precursor for PCB and PEB. In cyanobacteria, phages, green algae, and land plants, conversion of BV into PCB or PEB is carried out by a family of enzymes called ferredoxin-dependent bilin reductases (FDBRs). The current studies demonstrate that G. sulphuraria, but not C. merolae, requires the action of an additional isomerase to synthesize PCB.
Deposited data include phylogenetic analyses, data for in vitro characterization of recombinantly expressed FDBRs from G. sulphuraria and C. merolae, and data for fractionation of G. sulphuraria extracts and isomerase assays on the resulting enriched fraction. The phylogenetic analyses demonstrate that G. sulphuraria has two FDBRs, GsPEBA and GsPEBB, whereas C. merolae has a single FDBR, CmPCYA. CmPCYA would be expected to convert BV into PCB, whereas GsPEBA would be expected to convert BV into 15,16-dihydrobiliverdin (15,16-DHBV) and GsPEBB would then convert 15,16-DHBV into PEB rather than PCB. In vitro characterization demonstrates that all three FDBRs carry out the expected reactions, meaning that the G. sulphuraria enzymes produce PEB but the phycobilisomes contain PCB. This conundrum is resolved by demonstrating the existence of an isomerase in G. sulphuraria extracts that can convert PEB into PCB.
Description of the Data and file structure
Deposited data are in a single gzipped tarball (Galdieria_PCB_biosynthesis_version3.tar.gz) and include phylogenetic analyses, data for in vitro characterization of recombinantly expressed FDBRs from G. sulphuraria and C. merolae, and data for fractionation of G. sulphuraria extracts and isomerase assays on the resulting enriched fraction. The phylogenetic analyses demonstrate that G. sulphuraria has two FDBRs, GsPEBA and GsPEBB, whereas C. merolae has a single FDBR, CmPCYA. CmPCYA would be expected to convert BV into PCB, whereas GsPEBA would be expected to convert BV into 15,16-dihydrobiliverdin (15,16-DHBV) and GsPEBB would then convert 15,16-DHBV into PEB rather than PCB. In vitro characterization demonstrates that all three FDBRs carry out the expected reactions, meaning that the G. sulphuraria enzymes produce PEB but the phycobilisomes contain PCB. This conundrum is resolved by demonstrating the existence of an isomerase in G. sulphuraria extracts that can convert PEB into PCB.
Data are deposited as a gzipped tarball containing a series of tab-delimited text files with Unix newlines. After extraction, deposited data are in four folders:
I. FDBR_phylogeny
II. CmPCYA_in_vitro
III. GsPEBA_GsPEBB_in_vitro
IV. Galdieria_PEB_PCB_isomerase
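As a minimal sketch, the contents of the archive could be grouped by folder with Python's standard `tarfile` module before extraction. The helper name `list_archive_by_folder` is hypothetical, and the sketch makes no assumption about whether the four folders sit at the top level of the archive or under a wrapper directory: files are simply grouped by the name of their immediate parent folder.

```python
import tarfile
from collections import defaultdict
from pathlib import PurePosixPath


def list_archive_by_folder(archive_path):
    """Return {parent folder name: sorted file names} for a gzipped tarball."""
    folders = defaultdict(list)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile():
                path = PurePosixPath(member.name)
                # Group each file under its immediate parent folder.
                folders[path.parent.name].append(path.name)
    return {name: sorted(files) for name, files in folders.items()}
```

Calling `list_archive_by_folder("Galdieria_PCB_biosynthesis_version3.tar.gz")` would be expected to return one entry per folder listed above, assuming the archive has been downloaded to the working directory.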
I. FDBR_phylogeny:
Two maximum-likelihood phylogenies were inferred from the same alignment for this study. One used PhyML version 3.3; the other used IQ-TREE version 3.0.1.
Five files are deposited: the original multiple sequence alignment in CLUSTAL format, the input file for both phylogenies in PHYLIP format after removal of gap-enriched columns, and output files from the two software packages. The files are:
NTE_118.aln (multiple sequence alignment in CLUSTAL format)
NTE_118.05.phy (input file in PHYLIP format)
PML_NTE_118.05.phy_phyml_tree.txt (PhyML output file)
IQ_NTE_118.05.phy.treefile (IQ-TREE output file)
IQ_NTE_118.05.phy.log (log file from IQ-TREE, including model search results)
NTE_118.aln (multiple sequence alignment in CLUSTAL format)
This is the original alignment, generated in MAFFT (v7.450) using the E-INS-i algorithm. After a header indicating the MAFFT version, aligned sequences are presented as interleaved blocks. Each block has a segment of each sequence after the name of the sequence, with '-' serving as the gap character. As an example, here are three arbitrary sequences:
Protein_1 THIS-SEQ-IS-IMAGINARY
Protein_2 THAT-SEQ-VT-LMVGVNARF
Protein_3 SQATGTENGASPVMAGINARF
NTE_118.05.phy (input file in PHYLIP format)
For phylogenetic inference in PhyML, the CLUSTAL-format alignment was converted into the PHYLIP format with removal of gap-enriched positions. The resulting file is now sequential (with one protein sequence followed by the next). The header line is also different, indicating the number of sequences (3, in the above example) and positions or characters (18 above, after removal of gap-enriched columns). For the above example, the converted case looks like this:
3 18
Protein_1 THISSEQISIMAGINARY
Protein_2 THATSEQVTLMVGVNARF
Protein_3 SQATTENASVMAGINARF
(noting that this example is arbitrary and is unrelated to the actual sequences in this study)
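The conversion illustrated above can be sketched in a few lines of Python. This is an illustration only, not the procedure actually used for the deposited files: the function names and the `max_gap_fraction` threshold are hypothetical, and the cutoff applied to the real alignment is not stated here. A threshold of 0.5 happens to reproduce the toy example, in which each removed column is a gap in two of the three sequences.

```python
def strip_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    The threshold is an assumed value chosen for illustration.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        col for col in range(length)
        if sum(seq[col] == "-" for seq in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(seq[col] for col in keep) for name, seq in zip(names, seqs)}


def to_sequential_phylip(alignment):
    """Format an alignment as sequential PHYLIP text: header, then one line per sequence."""
    n_chars = len(next(iter(alignment.values())))
    lines = [f"{len(alignment)} {n_chars}"]
    lines += [f"{name} {seq}" for name, seq in alignment.items()]
    return "\n".join(lines) + "\n"


example = {
    "Protein_1": "THIS-SEQ-IS-IMAGINARY",
    "Protein_2": "THAT-SEQ-VT-LMVGVNARF",
    "Protein_3": "SQATGTENGASPVMAGINARF",
}
print(to_sequential_phylip(strip_gappy_columns(example)), end="")
```

Names and sequences are separated by a single space here to match the example; strict PHYLIP readers may expect names padded to a fixed width.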
PML_NTE_118.05.phy_phyml_tree.txt (PhyML output file in Newick format)
The PhyML phylogeny is presented in the Newick format, suitable for use with a broad range of tree viewers but not well suited for analysis in a text editor or word processor. One support value is presented (TBE, or transfer bootstrap expectation). TBE was calculated from 100 bootstraps within PhyML.
IQ_NTE_118.05.phy.treefile (IQ-TREE output file in Newick format)
The IQ-TREE output file is again presented in Newick format. Two support values are presented: the SH-aLRT approximate likelihood test, followed by the ultrafast bootstrap approximation (UFBoot).
IQ_NTE_118.05.phy.log (IQ-TREE log file as flat text file)
IQ-TREE was run using ModelFinder to automatically choose the best substitution model for the alignment. The IQ-TREE log file is included to show the results of that step.
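For a quick look at the support values without a tree viewer, internal-node labels can be pulled out of a Newick string with a regular expression. The sketch below is a rough text-level approach, not a full Newick parser: it assumes that support values are written as the node label directly after a closing parenthesis (for example `0.87` for a single value, or `95.3/100` for two values separated by a slash) and that sequence names are unquoted and contain no parentheses. The function name is hypothetical.

```python
import re

# Matches the label that directly follows a closing parenthesis,
# i.e. the label of an internal node.
_INTERNAL_LABEL = re.compile(r"\)([^():,;]+)")


def internal_node_supports(newick):
    """Return internal-node labels as tuples of floats, split on '/'."""
    supports = []
    for label in _INTERNAL_LABEL.findall(newick):
        try:
            supports.append(tuple(float(part) for part in label.split("/")))
        except ValueError:
            continue  # skip internal labels that are not numeric
    return supports
```

For trees of this size, a dedicated phylogenetics library is the more robust choice; the regular expression is only meant for a first inspection of the support values.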
II. CmPCYA_in_vitro:
This folder contains data from in vitro characterization of CmPCYA protein purified after recombinant expression in E. coli. Two assays were used, and data from each assay are deposited for a total of three files.
CmPCYA_Spectroscopic_timecourse.txt
This file contains a series of absorption spectra measured at different times after initiation of a CmPCYA reaction. Each spectrum (or timepoint) is one column in the file, with the first column providing wavelength information in nm.
rezeroed.CmPCYA_spectra.txt
This file contains the processed spectral data for the above file after re-zeroing and Savitzky-Golay smoothing.
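The spectral files described above share one layout (wavelength in the first column, one spectrum per remaining column), so a single reader can serve all of them. The sketch below uses only the standard library. It assumes that any non-numeric rows at the top of a file are column labels, which has not been checked against the deposited files, and the function names are hypothetical.

```python
import csv


def read_spectra(path):
    """Read a tab-delimited spectra table.

    Returns (header, wavelengths, spectra), where spectra[i] holds the
    absorbance values of data column i + 1 across all wavelengths.
    """
    header, wavelengths, columns = None, [], None
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not any(cell.strip() for cell in row):
                continue  # skip blank lines
            try:
                values = [float(cell) for cell in row]
            except ValueError:
                if wavelengths:
                    raise  # malformed row inside the numeric block
                header = row  # assumed: non-numeric rows before the data are labels
                continue
            if columns is None:
                columns = [[] for _ in values[1:]]
            wavelengths.append(values[0])
            for column, value in zip(columns, values[1:]):
                column.append(value)
    return header, wavelengths, columns or []


def absorbance_at(wavelengths, spectrum, target_nm):
    """Return the absorbance at the tabulated wavelength nearest to target_nm."""
    index = min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))
    return spectrum[index]
```

Applying `absorbance_at` to each spectrum in turn gives absorbance at one wavelength as a function of timepoint, which is one simple way to follow the reaction.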
CmPCYA_HPLC_assay.txt
This file contains data from an endpoint assay to identify reaction products via reverse-phase HPLC. After workup of the assay, reaction products were run on a C18 column along with controls. BV and PCB were detected using a diode array detector set for 650 nm. The file contains paired time/absorption values for three HPLC runs: one with CmPCYA reaction products, one with cyanobacterial PcyA (SynPcyA from Synechocystis sp. PCC6803) as a positive control for PCB formation, and one with BV substrate alone as a negative control.
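One way to work with paired time/absorption values is to split the table into one trace per run and then locate the main peak of each trace. The sketch below assumes that each run occupies two adjacent columns (time, then absorption), that any label rows have already been removed, and that shorter runs are padded with blank cells. None of these layout details has been confirmed against the deposited file, and the function names are hypothetical.

```python
def split_paired_runs(rows):
    """Split rows laid out as (time, absorbance) column pairs into runs.

    rows: list of lists of strings, e.g. from csv.reader(..., delimiter="\t").
    Blank cells are skipped, so runs may differ in length.
    """
    n_runs = max(len(row) for row in rows) // 2
    runs = [[] for _ in range(n_runs)]
    for row in rows:
        for i in range(n_runs):
            pair = row[2 * i: 2 * i + 2]
            if len(pair) == 2 and pair[0].strip() and pair[1].strip():
                runs[i].append((float(pair[0]), float(pair[1])))
    return runs


def peak_retention_time(run):
    """Return the time at which absorbance is highest in one run."""
    return max(run, key=lambda point: point[1])[0]
```

Comparing `peak_retention_time` across the reaction run and the control runs gives a first check on whether the main product co-elutes with the positive control.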
III. GsPEBA_GsPEBB_in_vitro:
This folder contains data from in vitro characterization of purified GsPEBA and GsPEBB. Each enzyme was characterized by itself using the spectroscopic and HPLC assays also used for CmPCYA. The action of the two enzymes together was also studied using the same assays.
The deposited files are:
GsPEBA_Spectroscopic_timecourse.txt
This file contains a series of absorption spectra measured at different times after initiation of a GsPEBA reaction. Each spectrum (or timepoint) is one column in the file, with the first column providing wavelength information in nm. BV was the substrate.
rezeroed.GsPEBA_Spectra.txt
This file contains the processed spectral data for the above file after re-zeroing and Savitzky-Golay smoothing.
GsPEBB_Spectroscopic_timecourse.txt
This file contains a series of absorption spectra measured at different times after initiation of a GsPEBB reaction. Each spectrum (or timepoint) is one column in the file, with the first column providing wavelength information in nm. 15,16-DHBV was the substrate.
rezeroed.GsPEBB_1516_spectra.txt
This file contains the processed spectral data for the above file after re-zeroing and Savitzky-Golay smoothing.
rezeroed.GsPEBB_BV_spectra.txt
This file contains the processed spectral data for a similar reaction with GsPEBB using BV as substrate rather than 15,16-DHBV.
GsPEBA_GsPEBB_individual_HPLC.txt
This file contains data from endpoint assays using reverse-phase HPLC. After workup, reaction products were run on a C18 column along with controls. Bilin chromophores were detected using a diode array detector. The file contains paired time/absorption values for four HPLC runs: one with GsPEBA reaction products, one with GsPEBB reaction products, and one each with 15,16-DHBV and PEB reference standards.
combined_GsPEBAB_spectroscopic_timecourse.txt
This file contains a series of absorption spectra measured at different times after initiation of a reaction using GsPEBA and GsPEBB in combination, with BV as substrate. Each spectrum (or timepoint) is one column in the file, with the first column providing wavelength information in nm.
rezeroed.combined_GsPEBAB_spectra.txt
This file contains the processed spectral data for the above file after re-zeroing and Savitzky-Golay smoothing.
combined_GsPEBAB_HPLC.txt
This file contains data from endpoint assays using GsPEBB paired with either GsPEBA or cyanobacterial PebA from Synechococcus sp. WH8020. Data are deposited as paired time/absorption values for both HPLC runs.
IV. Galdieria_PEB_PCB_isomerase:
This folder contains data for fractionation of G. sulphuraria extracts and for assaying isomerization of PEB to PCB using the resulting fraction, demonstrating the existence of an isomerase activity in G. sulphuraria.
The deposited files are:
Affinity_SEC_chromatography.txt
Spectroscopic_Isomerase_Assay.txt
HPLC_Isomerase_Assay.txt
Spectroscopic_Assay_HeatTreated.txt
HPLC_Assay_HeatTreated.txt
Spectroscopic_Assay_NoProtein.txt
HPLC_Assay_NoProtein.txt
Affinity_SEC_chromatography.txt
This file contains paired time/absorption values at 280 nm (protein absorbance) for two fractionation steps with a soluble G. sulphuraria extract: affinity chromatography on Blue Sepharose and size exclusion chromatography (SEC) on Superdex 75. Data for bovine serum albumin (BSA) on SEC are included as a size control. After SEC, the resulting fraction was assayed using spectroscopic and HPLC assays, along with controls (no protein and heat-treated protein).
Spectroscopic_Isomerase_Assay.txt
This file contains a series of absorption spectra measured at different times after initiation of the isomerase reaction. Each spectrum (or timepoint) is one column in the file, with the first column providing wavelength information in nm. PEB was the substrate.
HPLC_Isomerase_Assay.txt
This file contains data from the isomerase assay using reverse-phase HPLC. After workup, reaction products were run on a C18 column and monitored using a diode array detector set to track 560 nm (reporting the starting PEB substrate) and 650 nm (reporting the PCB product). The file contains time/A560/A650 values and can be compared to standards and controls in other folders.
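To compare how much signal elutes in each detection channel, a peak can be integrated over a chosen retention-time window with the trapezoid rule. The sketch below is a generic numerical illustration, not the analysis used in the study: the function name and the window limits are hypothetical, no baseline correction is applied, and areas at 560 nm and 650 nm are not directly comparable as amounts without extinction coefficients.

```python
def trapezoid_area(times, values, start, end):
    """Integrate values over time with the trapezoid rule.

    Only samples with start <= time <= end are used; times must be ascending.
    """
    points = [(t, v) for t, v in zip(times, values) if start <= t <= end]
    return sum(
        (t1 - t0) * (v0 + v1) / 2
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    )
```

With the time, A560, and A650 columns loaded as three lists, calling `trapezoid_area` once with the A560 values and once with the A650 values over the same window gives one area per channel. The same call on the heat-treated and no-protein control files would then show whether the 650 nm signal depends on active protein.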
Spectroscopic_Assay_HeatTreated.txt
This file contains a series of absorption spectra measured at different times after initiation of the isomerase reaction. The protein fraction was heat-treated in this experiment.
HPLC_Assay_HeatTreated.txt
This file contains data from the isomerase assay using reverse-phase HPLC. The protein fraction was heat-treated in this experiment.
Spectroscopic_Assay_NoProtein.txt
This file contains a series of absorption spectra measured at different times after initiation of the isomerase reaction. The protein fraction was replaced with a buffer control in this experiment.
HPLC_Assay_NoProtein.txt
This file contains data from the isomerase assay using reverse-phase HPLC. The protein fraction was replaced with a buffer control in this experiment.
Change Log:
Version 1 is the initial submission approved 18 April 2025.
Version 2 replaces the earlier phylogenetic analysis with two analyses of a more extensive sequence alignment including additional sequences that have been recently identified.
Version 3 adds re-processed spectroscopic files to remove an artifact in the original Savitzky-Golay smoothing procedure and adds one additional spectroscopic assay as a control to show that GsPEBB does not use biliverdin IX-alpha as a substrate.
Sharing/access Information
Links to other publicly accessible locations of the data: none
Was data derived from another source? no
If yes, list source(s): n/a
