Dryad

Orthogroup alignments of Clostridium spp. isolates

Data files

Jun 17, 2026 version files 35.25 MB


Abstract

De novo genome assemblies for 200+ Clostridium species isolates were generated with Shovill v.1.1.0 and annotated with Bakta v.1.8.1 using the db-light v.5 database. OrthoFinder v.3.0.1b1 assigned orthology to the annotated proteins, and a subset of the resulting orthogroups was selected for phylogenetic reconstruction. For the supermatrix, 264 single-copy orthologs were aligned with MAFFT v.7.453 and concatenated. For the gene tree/species tree reconciliation performed with ASTRAL-Pro, 900 orthogroups were aligned with MAFFT. SNP alignments were generated with snp-sites 2.5.1 from consensus genomes of the 200+ Clostridium species isolates. Consensus genomes were produced with SAMtools/BCFtools v1.4.1 after read mapping to closely related reference genomes (Hall, Alaska, and CDC_67071) with BWA-MEM v0.7.15-r1142-dirty. Polymorphisms due to recombination were identified and masked with Gubbins v2.3.4. Maximum likelihood phylogenies were generated from the total SNP and recombinant-free SNP alignments using IQ-TREE2 v2.4.0 and the TVM+F+ASC+G4 substitution model.
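The concatenation step for the single-copy ortholog supermatrix can be illustrated with a minimal Python sketch. It is not the authors' script: the function names `parse_fasta` and `concatenate_alignments`, the isolate names, and the toy sequences are hypothetical, and it assumes each aligned FASTA record is labelled with an isolate name. Filling an absent isolate with gaps is also an assumption made for illustration, since a strictly single-copy ortholog set would normally contain every isolate.

```python
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {record_id: sequence}.

    The record id is the first whitespace-delimited token of the header line.
    """
    records: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(chunks) for name, chunks in records.items()}


def concatenate_alignments(
    alignments: List[Dict[str, str]], gap: str = "-"
) -> Dict[str, str]:
    """Join per-orthogroup alignments end to end into one supermatrix.

    Isolates missing from an alignment receive a run of gap characters
    of that alignment's width (an assumption of this sketch).
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in an alignment must share one length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy input: two hypothetical orthogroup alignments over three isolates.
og1 = parse_fasta(">isolateA\nMKV-L\n>isolateB\nMKVAL\n")
og2 = parse_fasta(">isolateA\nGT\n>isolateC\nGS\n")
supermatrix = concatenate_alignments([og1, og2])

for name, seq in supermatrix.items():
    print(f">{name}\n{seq}")
```

Every row of the resulting supermatrix has the same length (the sum of the individual alignment widths), which is what downstream tree-building tools expect from a concatenated alignment.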