Phylogenetic analyses of the MopB superfamily
Data files
Mar 05, 2023 version; files total 144.21 MB
- ActB.fasta (253.54 KB)
- ActBfinal.fasta (3.65 MB)
- ActBwcodes.fasta (229.72 KB)
- AH.fasta (25.46 KB)
- AHfinal.fasta (324.11 KB)
- AHwcodes.fasta (22.90 KB)
- AioA.fasta (95.30 KB)
- AioAIdrAfinal.fasta (368.05 KB)
- AioAwcodes.fasta (88.17 KB)
- allDMSORcodenames.txt (103.10 KB)
- allDMSORswoutMopaligned.fasta (23.94 MB)
- AORswcodes.fasta (4.53 KB)
- ArxAArrAcodes.fasta (67.13 KB)
- ArxAArrAfinal.fasta (232.26 KB)
- Aspfinal.fasta (326.43 KB)
- aSreA.fasta (8.84 KB)
- aSreAwcodes.fasta (8.27 KB)
- Bact_FwdB.fasta (6.47 KB)
- BactArrAArxA.fasta (73.46 KB)
- BactArrAArxAwcodes.fasta (67.13 KB)
- BisCfinal.fasta (264.48 KB)
- bSreA.fasta (132.86 KB)
- bSreASoeAfinal.fasta (1.30 MB)
- bSreAwcodes.fasta (123.49 KB)
- DmsAfinal.fasta (1.26 MB)
- DMSORcompwoutMop.fasta (2.84 MB)
- DorATorAfinal.fasta (2.17 MB)
- FdhG.fasta (186.61 KB)
- FdhGfinal.fasta (6.30 MB)
- FdhGfullnames.fasta (212.60 KB)
- Fdhs.fasta (546.13 KB)
- Fdhsfinal.fasta (9 MB)
- FdhsFsdA.fasta (306.20 KB)
- FdhsFsdAfullnames.fasta (333.48 KB)
- FhcB.fasta (43.38 KB)
- FhcBfinal.fasta (34.21 KB)
- FhcBwcodes.fasta (37.05 KB)
- FwdB_FmdB.fasta (20.20 KB)
- FwdB.fasta (17.74 KB)
- FwdBfinal.fasta (344.72 KB)
- IdrA.fasta (24.78 KB)
- IdrAwcodes (23.21 KB)
- MopB_AORs.dat (663.32 KB)
- MopB_Mohydrox.dat (874.50 KB)
- MopB.dat (826.30 KB)
- MopBMAGs_50.fasta (1.40 MB)
- MopBMAGs_50aligned.fasta (21.38 MB)
- MopBMAGs.fasta (43 MB)
- NapA.fasta (235.24 KB)
- NapAfinal.fasta (1.49 MB)
- NapAwcodes.fasta (216.17 KB)
- NarG.fasta (372.10 KB)
- NarGfinal.fasta (5.85 MB)
- NarGwcodes.fasta (349.76 KB)
- NasC.fasta (196.03 KB)
- NasCNarBfinal.fasta (7.09 MB)
- NasCwcodes.fasta (175.32 KB)
- Nqo3.fasta (208.48 KB)
- Nqo3final.fasta (1.11 MB)
- Nqo3wcodes.fasta (191.88 KB)
- PsrAPhsAfinal.fasta (705.25 KB)
- PsrAPhsASrrA.fasta (137.72 KB)
- PsrAPhsASrrAwcodes.fasta (126.42 KB)
- README.txt (1.47 KB)
- RhLPgtLfinal.fasta (254.10 KB)
- TtrASrdAarchArrA.fasta (252.76 KB)
- TtrASrdAarchArrAwcodes.fasta (237.97 KB)
- TtrASrdAfinal.fasta (930.22 KB)
- variousAspDMSORs.fasta (69.02 KB)
- variousAspDMSORswcodes.fasta (62.49 KB)
- variousSerDMSORs.fasta (399.18 KB)
Abstract
The Dimethyl Sulfoxide Reductase (or MopB) family is a diverse assemblage of enzymes found throughout Bacteria and Archaea. Many of these enzymes are believed to have been present in the last universal common ancestor (LUCA) of all cellular lineages. However, gaps in knowledge remain regarding how MopB enzymes evolved and how this diversification of functions impacted global biogeochemical cycles through geologic time. In this study, we perform maximum likelihood phylogenetic analyses on manually curated comparative genomic and metagenomic datasets containing over 47,000 distinct MopB homologs. We demonstrate that these enzymes constitute a catalytically and mechanistically diverse superfamily defined not by the molybdenum- or tungsten-containing pterin (Mo/W-bisPGD) cofactor but by the structural fold that binds it in the protein. Our results suggest that major metabolic innovations resulted from the loss of the metal cofactor or from the gain or loss of protein domains. Phylogenetic analyses also demonstrated that formate oxidation and CO2 reduction were the ancestral functions of the superfamily, traits that have been vertically inherited from LUCA. Nearly all of the other families, which drive all other biogeochemical cycles mediated by this superfamily, originated in the bacterial domain. Thus, organisms from Bacteria have been the key drivers of catalytic and biogeochemical innovations within the superfamily. The relative ordination of MopB families and their associated catalytic activities emphasizes fundamental mechanisms of evolution in this superfamily. Further, it underscores the importance of prokaryotic adaptability in response to the transition from an anoxic to an oxidized atmosphere.
Methods
Genomic datasets were obtained using NCBI's DELTA-BLAST tool. Analyses were restricted to phyla represented in the Integrated Microbial Genomics Encyclopedia of Bacteria and Archaea. Metagenomic datasets were obtained from GTDB-Tk, and BLASTx was used to find putative MopB superfamily members in that dataset. All metagenomic dataset analyses were performed on a computing cluster.
All sequences were aligned using the online version of MAFFT. Phylogenetic analyses of the metagenomic datasets used a reduced dataset generated by the CD-HIT program with the CD-HIT 50 setting. All phylogenies and amino acid substitution model tests were done using IQ-TREE, and all phylogenies were generated on a computing cluster.
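The reduce, align, and infer steps described above could be scripted roughly as follows. This is a minimal sketch, not the authors' actual commands: it assumes the "50 setting" means a 50% sequence identity threshold, it substitutes a local `mafft --auto` call for the online MAFFT service, and it assumes an IQ-TREE 2 binary named `iqtree2`. The helper function names are hypothetical. The file names are borrowed from the file list for illustration only, and their mapping to these steps is itself an assumption. The functions only build argument lists; nothing is executed.

```python
from typing import List


def cdhit_command(in_fasta: str, out_fasta: str, identity: float = 0.5) -> List[str]:
    """Build a CD-HIT call that clusters proteins at the given identity.

    ASSUMPTION: identity=0.5 is our reading of the "CD-HIT 50 setting".
    The word size (-n) follows the ranges recommended in the CD-HIT guide.
    """
    if not 0.4 <= identity <= 1.0:
        raise ValueError("cd-hit accepts protein identity thresholds from 0.4 to 1.0")
    if identity >= 0.7:
        word_size = 5
    elif identity >= 0.6:
        word_size = 4
    elif identity >= 0.5:
        word_size = 3
    else:
        word_size = 2
    return ["cd-hit", "-i", in_fasta, "-o", out_fasta,
            "-c", str(identity), "-n", str(word_size)]


def mafft_command(in_fasta: str) -> List[str]:
    """Build a local MAFFT call; the alignment is written to stdout.

    ASSUMPTION: the source used the online MAFFT service, so --auto is a stand-in.
    """
    return ["mafft", "--auto", in_fasta]


def iqtree_command(alignment: str) -> List[str]:
    """Build an IQ-TREE 2 call: model selection (-m MFP) followed by an ML tree.

    ASSUMPTION: binary name and options are illustrative, not taken from the source.
    """
    return ["iqtree2", "-s", alignment, "-m", "MFP", "-T", "AUTO"]


if __name__ == "__main__":
    steps = [
        cdhit_command("MopBMAGs.fasta", "MopBMAGs_50.fasta"),
        mafft_command("MopBMAGs_50.fasta"),
        iqtree_command("MopBMAGs_50aligned.fasta"),
    ]
    for argv in steps:
        print(" ".join(argv))
```

Each list can be passed to `subprocess.run` once the corresponding program is installed. For MAFFT, redirect standard output to the alignment file.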
Usage notes
All data files can be viewed using a standard text editor. Sequence alignments and phylogenetic trees can be opened using any standard open-source program for sequence and tree visualization.
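Because the FASTA files are plain text, they can also be inspected programmatically. The following is a minimal sketch using only the Python standard library. The function names `read_fasta` and `summarize` are hypothetical, and the equal-length check is only a heuristic for telling an alignment from unaligned sequences, not a property documented for any particular file in this dataset.

```python
import io
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA file."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence lines may wrap over several lines
    if header is not None:
        yield header, "".join(chunks)


def summarize(handle: TextIO) -> Dict[str, object]:
    """Count sequences and report whether all of them have the same length."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    return {
        "n_sequences": len(lengths),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
        # Heuristic: rows of an alignment (gaps included) share one length.
        "is_aligned": len(set(lengths)) == 1,
    }


if __name__ == "__main__":
    # Toy in-memory example; for a real file use open("path/to/file.fasta").
    toy = io.StringIO(">seq1 example\nMK-T\nAA\n>seq2\nMKQTAA\n")
    print(summarize(toy))
```

For the larger files in the listing, iterating with `read_fasta` avoids loading every sequence into memory at once.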