
Tree and amino acid alignments of thioredoxin, glucose-6-phosphate dehydrogenase, and malate dehydrogenase

Cite this dataset

Lhee, Duckhyun (2021). Tree and amino acid alignments of thioredoxin, glucose-6-phosphate dehydrogenase, and malate dehydrogenase [Dataset]. Dryad.


Redox regulation in phytoplankton is critical for monitoring and stabilizing metabolic pathways under changing environmental conditions. In plastids, the thioredoxin (TRX) system is linked to photosynthetic electron transport and to the fine-tuning of metabolism in response to fluctuating light levels. Expansion of the number of redox signal transmitters and their protein targets, as seen in plants, is believed to increase cell robustness. In this study, we searched for genes related to redox regulation in the genome of the photosynthetic amoeba Paulinella micropora KR01 (hereafter, KR01). The genus Paulinella includes testate filose amoebae, in which a single clade acquired a photosynthetic organelle, the chromatophore, from an alpha-cyanobacterial donor. This independent primary endosymbiosis occurred relatively recently (~124 Ma) compared to that of Archaeplastida (> 1 Ga), making photosynthetic Paulinella a valuable model for studying the earlier stages of primary endosymbiosis. Our comparative analysis demonstrates that this lineage has evolved a thioredoxin system similar to that of other algae, relying, however, on genes with diverse phylogenetic origins (i.e., the endosymbiont, host, bacteria, red algae). One TRX of eukaryotic provenance is targeted to the chromatophore, implicating host-endosymbiont coordination of redox regulation. A chromatophore-targeted glucose-6-phosphate dehydrogenase of red algal origin suggests that Paulinella exploited the existing redox regulation system in Archaeplastida to foster integration. Our study elucidates the independent evolution of the thioredoxin system in photosynthetic Paulinella, whose parts derive from the existing genetic toolkit in diverse organisms.


Putative homologs for each target protein were identified using a BlastP search (e-value cutoff: 1e-5) against a local database, which contained taxa selected from NCBI RefSeq and the MMETSP database to provide broad taxon sampling. CD-HIT was used to remove redundant isoforms. Multiple amino acid alignments were generated using Clustal Omega with the default options. These alignments were refined manually based on conserved domains. Maximum likelihood phylogenetic analysis was done using IQ-TREE, with branch support estimated from 1,000 ultrafast bootstrap replicates. The evolutionary model was automatically selected using the model test option incorporated in IQ-TREE. Highly diverged or contaminant sequences, which exhibited a long branch or taxonomic misplacement in the tree, were removed and the analysis was repeated.
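
The homolog-selection step described above can be illustrated with a minimal Python sketch. It is not the script used to produce this dataset. It assumes the BlastP results were written in the standard 12-column tabular format (`-outfmt 6`), which the description does not state, and it treats the 1e-5 cutoff as inclusive. The function names (`filter_hits`, `subject_ids`) and the example rows are hypothetical.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(handle, max_evalue=1e-5):
    """Yield hits (as dicts) whose e-value is at or below max_evalue.

    Assumption: an inclusive cutoff; the source only gives the value 1e-5.
    """
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(BLAST_COLUMNS, row))
        if float(hit["evalue"]) <= max_evalue:
            yield hit


def subject_ids(hits):
    """Return unique subject sequence IDs in first-seen order."""
    seen = {}
    for hit in hits:
        seen.setdefault(hit["sseqid"], None)
    return list(seen)


# Hypothetical example rows; the identifiers and scores are made up.
EXAMPLE = (
    "queryTRX\tsubjA\t62.5\t104\t39\t0\t1\t104\t3\t106\t2e-40\t135\n"
    "queryTRX\tsubjB\t28.1\t57\t41\t0\t10\t66\t5\t61\t0.003\t31.2\n"
    "queryTRX\tsubjA\t45.0\t40\t22\t0\t60\t99\t70\t109\t1e-12\t55.1\n"
    "queryTRX\tsubjC\t35.2\t88\t57\t0\t5\t92\t8\t95\t1e-05\t42.0\n"
)

if __name__ == "__main__":
    kept = list(filter_hits(io.StringIO(EXAMPLE)))
    print(subject_ids(kept))
```

The resulting list of subject IDs would then be used to extract the corresponding sequences before redundancy removal and alignment. When the cutoff is already passed to BlastP itself, a post hoc filter like this one is only a consistency check.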