
Data from: Extreme mitochondrial reduction in a novel group of free-living metamonads

Cite this dataset

Williams, Shelby (2024). Data from: Extreme mitochondrial reduction in a novel group of free-living metamonads [Dataset]. Dryad.


Metamonads are a diverse group of heterotrophic microbial eukaryotes adapted to living in hypoxic environments. All metamonads but one harbour metabolically altered ‘mitochondrion-related organelles’ (MROs) with reduced functions; however, the degree of reduction varies. Here, we generate high-quality draft genomes, transcriptomes, and predicted proteomes for five recently discovered free-living metamonads. Phylogenomic analyses placed these organisms in a group we name the ‘BaSk’ (Barthelonids+Skoliomonads) clade, a deeply branching sister group to the Fornicata, a phylum that includes parasitic and free-living flagellates. Bioinformatic analyses of gene models show that these organisms are predicted to have extremely reduced MRO proteomes in comparison to other free-living metamonads. Loss of the mitochondrial iron-sulfur cluster assembly system in some organisms in this group appears to be linked to the acquisition, in their common ancestral lineage, of a SUF-like minimal system Fe/S cluster pathway through lateral gene transfer. One of the isolates, Skoliomonas litria, appears to have lost all other known MRO pathways. No proteins were confidently assigned to the predicted MRO proteome of this organism, suggesting that the organelle has been lost. The extreme mitochondrial reduction observed within this free-living anaerobic protistan clade demonstrates that mitochondrial functions may be completely lost even in free-living organisms.

README: Extreme mitochondrial reduction in a novel group of free-living metamonads

Genomes, gene models, alignments, HMM profiles and supplementary datasets for the BaSk clade - Skoliomonas litria, Skoliomonas sp. GEMRC, Skoliomonas sp. RCL, and Barthelona sp. PCE.

Description of the data and file structure

BaSk_clade_genome_genes folder:

  • Genomes (.fsa) and gene models (.fasta and .gff3) for each species of the BaSk clade.

alignments_hmm folder:

  • alignments_hmm/phylogenomics - Final phylogenomic alignment, along with a .txt file outlining the presence/absence of each taxon for each concatenated gene. The .txt file also includes summary statistics on the amount of data present (count and % data presence) for each taxon and each gene.

  • alignments_hmm/MRO_protiens - Alignments (.fas), corresponding HMM profiles, and raw phylogenetic trees relating to searches for MRO proteins, separated into folders according to similar biochemical pathways and functions. The Anaerobic_metabolism folder contains the corresponding files for proteins relating to ATP production, hydrogen production, and Fe-S cluster biogenesis. The Chaperones folder contains the corresponding files for cpn60 and hsp70. The CIA folder contains files for proteins of the CIA pathway. The MRO_import folder contains HMM profiles that were used to search for MRO membrane-bound transport proteins. The SUFCB folder contains the FAS and TREE files.
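
The per-taxon and per-gene summary statistics described for the phylogenomics .txt file can be illustrated with a minimal Python sketch. The in-memory layout, the function name presence_summary, and the taxon and gene names are all hypothetical; the actual format of the .txt file is not specified here, so this shows only the arithmetic, not a parser for the deposited file.

```python
def presence_summary(matrix):
    """Summarise a presence/absence matrix.

    matrix: dict mapping taxon -> dict mapping gene -> bool (True if present).
    This layout is an assumption for illustration, not the dataset's format.
    Returns two dicts of (count, percent) tuples: per taxon and per gene.
    """
    genes = sorted({gene for row in matrix.values() for gene in row})

    # For each taxon: how many genes are present, and what percentage.
    per_taxon = {}
    for taxon, row in matrix.items():
        n = sum(1 for gene in genes if row.get(gene, False))
        per_taxon[taxon] = (n, 100.0 * n / len(genes))

    # For each gene: how many taxa have it, and what percentage.
    per_gene = {}
    for gene in genes:
        n = sum(1 for row in matrix.values() if row.get(gene, False))
        per_gene[gene] = (n, 100.0 * n / len(matrix))

    return per_taxon, per_gene


# Hypothetical toy data: two taxa, four genes.
example = {
    "taxon_A": {"gene1": True, "gene2": True, "gene3": False, "gene4": True},
    "taxon_B": {"gene1": True, "gene2": False, "gene3": False, "gene4": False},
}
per_taxon, per_gene = presence_summary(example)
```

In this toy example, taxon_A has 3 of 4 genes present (75%), and gene1 is present in both taxa (100%).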

supplemental_data folder:

  • Supp_Data_1_genes.xlsx - A spreadsheet that lists all genes mentioned in the manuscript (for each BaSk clade member). For each gene, the following information is listed: gene model number, nucleotide sequence, amino acid sequence, mitochondrial targeting prediction information from TargetP, MitoProt, and MitoFates, and expression level value (FPKM). Highlighted gene models are those predicted to be MRO-localized by two or more of the targeting programs. In addition, there is a list of Fe-S cluster biogenesis proteins that were searched for as part of this study.
  • Supp_Data_2_HydA_PFO.xlsx - A spreadsheet listing HydA and PFO proteins in each BaSk clade member, the top BLAST hit to the NCBI nr database for each, along with notable protein domain fusions.
  • Supp_Data_3_FPKM.xlsx - A spreadsheet listing the 100 most highly expressed gene models (by FPKM) for each BaSk clade member, annotated by top NCBI nr database hit. Highlighted gene models encode proteins that are also listed in Supp_Data_1_genes.xlsx.
  • BaSk_FPKM.xlsx - Complete list of gene expression levels (FPKM) for all gene models in each BaSk clade member.
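
The highlighting rule used in Supp_Data_1_genes.xlsx (MRO localization predicted by at least two targeting programs) amounts to a simple majority-style vote. The sketch below assumes that each program's output has already been reduced to a yes/no call; the score thresholds used for that step are not given here, and the function name is_highlighted and the example calls are hypothetical.

```python
def is_highlighted(predictions, min_agree=2):
    """Return True if at least `min_agree` programs predict MRO localization.

    predictions: dict mapping program name -> bool (True = predicted
    MRO-localized). How each program's raw score becomes a bool is an
    assumption left to the caller.
    """
    return sum(bool(call) for call in predictions.values()) >= min_agree


# Hypothetical calls for two gene models.
two_agree = {"TargetP": True, "MitoProt": True, "MitoFates": False}
one_agrees = {"TargetP": False, "MitoProt": True, "MitoFates": False}

highlight_first = is_highlighted(two_agree)    # two programs agree
highlight_second = is_highlighted(one_agrees)  # only one program agrees
```

Here the first gene model would be highlighted and the second would not.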


Natural Sciences and Engineering Research Council