Data from: Phylogenomics of the airbreathing catfishes (Siluriformes: Clariidae)
Data files
Version of May 18, 2026. Total size of files: 28.87 MB
40_percent_complete_with_hDNA.zip (7.98 MB)
70_percent_complete.zip (3.36 MB)
C_gariepinus_40.tre (7.06 KB)
Ch_alvarezi_40.tre (6.91 KB)
Ch_longicaudatus_40.tre (9.80 KB)
Ch_sanghaensis_40.tre (6.91 KB)
Cl_petricola_40.tre (6.91 KB)
Clariidae_40_hDNA.tre (11.48 KB)
Clariidae_40p_hDNA_Trimmed.phylip.zip (17.32 MB)
Clariidae_70.tre (10.81 KB)
Clariidae_Astral.tre (9.14 KB)
Clariidae_FF_all.tre (7.51 KB)
Clariidae_SVD.tre (3.06 KB)
Clariidae_SWSC_70p_partition (53.87 KB)
H_krishnai_40.tre (7 KB)
README.md (4.96 KB)
Siluriformes_40.tre (4.36 KB)
Table_S1.xlsx (19.46 KB)
Table_S2.xlsx (16.06 KB)
U_zammaranoi_40.tre (6.60 KB)
X_eupogon_40.tre (6.88 KB)
Abstract
The air-breathing, or walking, catfishes of the family Clariidae are characterized by a suprabranchial organ that facilitates atmospheric respiration and by their ability to traverse significant distances over land. With 118 species in 16 genera, clariids are most diverse in Africa but also inhabit Southeast Asia and the Middle East, with invasive populations worldwide. Previous phylogenetic studies, largely based on mitochondrial markers, have produced poorly resolved and often conflicting hypotheses of relationships. As a result, a well-supported framework for intrarelationships within this large, economically important group remains lacking, representing a major gap in our understanding of catfish biodiversity and evolution. Here, we present a new phylogeny based on ultraconserved elements (UCEs), incorporating 52 species across all genera. We define Clariidae to include both Horaglanis and Heteropneustes, confirm the monophyly of the African clariids, and identify three major lineages among them. These comprise a clade of predominantly large-bodied, widespread species, and groups mostly restricted to the Congolese and Lower Guinean ecoregions, respectively. We additionally assess the potential of historical museum specimens (hDNA) for phylogenomic studies by incorporating seven formalin-fixed clariid specimens that are rare in collections, including the possibly extinct Lake Victoria deepwater Xenoclarias eupogon, the stygobitic Somalian Uegitglanis zammaranoi, and the enigmatic Asian Horaglanis krishnai. We demonstrate successful sequencing from specimens preserved 20 to 98 years ago, expanding the available genetic data for rare and historically collected taxa. Our study reveals extensive non-monophyly across the tree, including across the nominal genera Clarias, Clariallabes, Channallabes, and Gymnallabes, and underscores a need for substantial taxonomic review.
Dataset DOI: 10.5061/dryad.p5hqbzm30
Description of the data and file structure
Genomic libraries were prepared with KAPA Hyper Prep Kits (Kapa Biosystems) using SPRI magnetic beads (Rohland and Reich, 2012) for bead clean-ups. Samples were dual-indexed with 8 bp barcode sequences (Glenn et al., 2019). Libraries were then pooled into groups of eight at equimolar ratios for target enrichment. Enrichment targeted 2,708 UCE loci using a probe set designed for ostariophysan fishes (Faircloth et al., 2020). The enrichment protocol followed the MYcroarray MYBaits kit v.3.0 with hybridization times ranging from 18 to 36 hours. Pooled, enriched libraries were then sent to Genewiz (South Plainfield, NJ) for sequencing on an Illumina HiSeq 4000 with 150-cycle paired-end reads.
Files and variables
File: Clariidae_Astral.tre
Description: Multispecies coalescent (ASTRAL-III) phylogeny of the Clariidae based on 854 UCEs (439,198 bp).
File: Siluriformes_40.tre
Description: Maximum Likelihood phylogeny of 35 siluriform families showing monophyly of Clariidae, including Horaglanis and Heteropneustes. Based on a concatenated 40% complete matrix with 1,696 UCE loci and 582,116 bp.
File: Clariidae_40_hDNA.tre
Description: Maximum Likelihood phylogeny of Clariidae inferred from the concatenated 40% complete matrix (1,554 UCE loci, 1,057,331 bp), including 8 hDNA samples.
File: Clariidae_40p_hDNA_Trimmed.phylip.zip
Description: Phylip file of 40% complete matrix including 8 hDNA samples, trimmed with Spruceup.
File: 40_percent_complete_with_hDNA.zip
Description: Phylip file of 40% complete matrix including 8 hDNA samples with no trimming.
File: Clariidae_SVD.tre
Description: Multispecies coalescent phylogeny from SVDQuartets analysis. The tree was inferred using the complete concatenated matrix of 2,511 UCEs (1,720,084 bp).
File: Table_S2.xlsx
Description: List of publicly available sequence data included in phylogenetic analyses.
File: Clariidae_70.tre
Description: Maximum Likelihood phylogeny of the Clariidae based on the concatenated and partitioned 70% complete matrix of 854 UCE loci (439,198 bp).
File: 70_percent_complete.zip
Description: Phylip file of 70% complete matrix.
File: Clariidae_SWSC_70p_partition
Description: Partitioning scheme for 70% complete UCE matrix based on the sliding-window site composition partitioning method.
File: Table_S1.xlsx
Description: List of tissue samples sequenced for ultraconserved elements (UCEs). Tissues whose codes are associated with AMNH material are housed at the Ambrose Monell Cryogenic Collection (AMCC).
File: C_gariepinus_40.tre
Description: Supplementary Figure S6. Maximum Likelihood phylogeny based on 40% complete UCE matrix, including AMNH 246957 Clarias gariepinus as a single formalin-fixed sample.
File: Ch_alvarezi_40.tre
Description: Supplementary Figure S2. Maximum Likelihood phylogeny based on 40% complete UCE matrix, including MRAC A4-031-0098 Channallabes alvarezi as a single formalin-fixed sample.
File: Ch_longicaudatus_40.tre
Description: Supplementary Figure S3. Maximum Likelihood phylogeny based on 40% complete UCE matrix, including MRAC 99-105-T06 Channallabes longicaudatus as a single formalin-fixed sample.
File: Ch_sanghaensis_40.tre
Description: Supplementary Figure S4. Maximum Likelihood phylogeny based on 40% complete UCE matrix, including AMNH 227545 Channallabes sanghaensis as a single formalin-fixed sample.
File: Cl_petricola_40.tre
Description: Supplementary Figure S5. Maximum Likelihood phylogeny based on 40% complete UCE matrix, including AMNH 71855 Clariallabes petricola as a single formalin-fixed sample.
File: X_eupogon_40.tre
Description: Supplementary Figure S9. Maximum Likelihood phylogeny based on 40% complete UCE matrix, including AMNH 71860 Xenoclarias eupogon as a single formalin-fixed sample.
File: U_zammaranoi_40.tre
Description: Supplementary Figure S8. Maximum Likelihood phylogeny based on 40% complete UCE matrix, including AMNH 12342 Uegitglanis zammaranoi as a single formalin-fixed sample.
File: Clariidae_FF_all.tre
Description: Supplementary Figure S1. Maximum Likelihood phylogeny based on 40% complete UCE matrix, including all eight formalin-fixed samples without trimming.
File: H_krishnai_40.tre
Description: Maximum Likelihood phylogeny based on 40% complete UCE matrix, including ANSP 203598 Horaglanis krishnai. This tree also includes two non-siluriform outgroups from the family Sternopygidae (Ostariophysi: Gymnotiformes).
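Example: checking missing data in the alignment files
The alignments listed above are distributed as PHYLIP files, and the formalin-fixed (hDNA) samples may contribute less sequence than fresh tissue samples. The following Python sketch shows one way to summarize the proportion of missing data per taxon. It rests on assumptions not stated in this README: that the files are sequential, relaxed PHYLIP (one taxon per line, name and sequence separated by whitespace), and that "-", "?", and "N" all denote missing data. The function names and the toy alignment are illustrative only and are not part of the dataset.

```python
MISSING = set("-?Nn")  # assumption: gaps, '?', and N all count as missing data


def read_relaxed_phylip(text):
    """Parse a sequential, relaxed PHYLIP alignment given as a string.

    Assumes one taxon per line: a name, whitespace, then the full sequence.
    Interleaved PHYLIP is not handled.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ntax, nchar = (int(x) for x in lines[0].split()[:2])
    alignment = {}
    for ln in lines[1:]:
        name, seq = ln.split(None, 1)
        seq = "".join(seq.split())  # drop any whitespace inside the sequence
        if len(seq) != nchar:
            raise ValueError(f"{name}: expected {nchar} sites, found {len(seq)}")
        alignment[name] = seq
    if len(alignment) != ntax:
        raise ValueError(f"expected {ntax} taxa, found {len(alignment)}")
    return alignment


def missing_fraction(alignment):
    """Return {taxon: fraction of sites that are missing}."""
    return {
        name: sum(ch in MISSING for ch in seq) / len(seq)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    # Toy alignment with invented taxon labels, not taken from the dataset.
    demo = """3 8
taxon_A ACGTACGT
taxon_B ACGT----
taxon_C A??TAC-N
"""
    for taxon, frac in missing_fraction(read_relaxed_phylip(demo)).items():
        print(f"{taxon}\t{frac:.2f}")
```

To apply this to the real data, unzip one of the archives and pass the contents of the extracted PHYLIP file to read_relaxed_phylip. The file names inside the archives are not listed here, so check them after extraction. If the files turn out to be interleaved, a dedicated parser would be needed instead.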
Access information
Other publicly accessible locations of the data: NCBI SRA: PRJNA1440936
