Data for: Phylogenomic and population genomic analyses of ultraconserved elements reveal deep coalescence and introgression shaped diversification patterns in Lamprologine cichlids of the Congo River
Data files
Version files (May 02, 2025), 3.43 GB total:
- 1_assembled_contigs.zip (3.42 GB)
- 2_UCE_p75_loci_nexus.zip (877.88 KB)
- 3_UCE_p75_concatenated.zip (7.45 MB)
- 4_UCE_p75_pruned_random_nexus.zip (706.26 KB)
- 5_SNPs_VCF.zip (5.13 MB)
- 6_mtDNA_nexus.zip (32.11 KB)
- 7_ML_tree.zip (6.61 KB)
- 8_Gene_and_species_trees.zip (451.08 KB)
- 9_Topology_tests.zip (16.08 KB)
- README.md (6.59 KB)
Abstract
Understanding the drivers of diversification is a central goal in evolutionary biology, but can be challenging when lineages radiate quickly and/or hybridize frequently. Cichlids in the tribe Lamprologini, an exceptionally diverse clade found in the Congo basin, exemplify these issues: their evolutionary history has been difficult to untangle with previous datasets, particularly with regard to river-dwelling lineages in the genus Lamprologus. This clade notably includes the only known blind and depigmented cichlid, L. lethops. Here, we reconstructed the evolutionary, population, and biogeographic history of a clade of Lamprologus from the Congo River by sampling over 50 species of lamprologines using genomic data and providing the best species-level coverage of this fauna to date. We found that in the mid-late Pliocene, two lineages of Lake Tanganyika lamprologines independently colonized the Congo River, where they subsequently hybridized and diversified, forming the current monophyletic group of riverine Lamprologus. Our estimates for divergence time and introgression align with the region’s geological history and suggest rapid speciation in Lamprologus species from the Congo River marked by rapids-driven vicariance and water level fluctuations, repeated secondary contact, and reticulation. This complex hybrid origin, followed by a rapid series of isolation and reticulation events, illustrates the multifaceted dynamics of speciation that have shaped the rich biodiversity of this region.
Dryad DOI: https://doi.org/10.5061/dryad.8931zcs13
Data Files
1_assembled_contigs-
Assembled contigs of the samples analyzed in this study
/*.contigs.fasta
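For a quick sanity check after unzipping, each contig file can be summarized with a few lines of Python. This is a minimal sketch that assumes the files are standard FASTA (header lines starting with `>`); the function name `summarize_contigs` is hypothetical and not part of the dataset.

```python
def summarize_contigs(fasta_path):
    """Return (number of contigs, total bases, longest contig) for one FASTA file."""
    lengths = []
    current = None  # running length of the contig being read; None before the first header
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous contig, if there was one.
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return len(lengths), sum(lengths), max(lengths, default=0)
```

Calling `summarize_contigs` on each `*.contigs.fasta` file in the unzipped folder gives a per-sample overview that can be compared against Supplementary Table S2.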
-
2_UCE_p75_loci_nexus-
Alignments of individual UCE loci in Nexus format
/uce-*.nexus
-
3_UCE_p75_concatenated-
Concatenated alignment of UCE loci analyzed in this study
/lamprologus_dataset10_min75_mafft_gblocks-IQTree.charsets
/lamprologus_dataset10_min75_mafft_gblocks-PF.phy
/lamprologus_dataset10_min75_mafft_gblocks-raxml.charsets
/lamprologus_dataset10_min75_mafft_gblocks.phy
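The concatenated alignment can be inspected without phylogenetics software. The sketch below assumes the `.phy` files are sequential relaxed PHYLIP (a header line with the number of taxa and sites, then one `name sequence` line per taxon); that layout is an assumption, not something stated in this README, and the function names are hypothetical.

```python
def read_relaxed_phylip(path):
    """Read a sequential relaxed-PHYLIP alignment into {taxon: sequence}."""
    with open(path) as handle:
        header = handle.readline().split()
        n_taxa, n_sites = int(header[0]), int(header[1])
        alignment = {}
        for line in handle:
            if not line.strip():
                continue
            name, seq = line.split(None, 1)
            # Remove any internal whitespace and the trailing newline.
            alignment[name] = "".join(seq.split())
    # Check the body against the header so layout problems surface early.
    if len(alignment) != n_taxa:
        raise ValueError(f"expected {n_taxa} taxa, found {len(alignment)}")
    for name, seq in alignment.items():
        if len(seq) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, found {len(seq)}")
    return alignment


def missing_fraction(seq, missing="-?Nn"):
    """Fraction of sites in one sequence that are gaps or ambiguous."""
    return sum(ch in missing for ch in seq) / len(seq)
```

If the files turn out to be interleaved PHYLIP, the length check raises an error rather than silently returning truncated sequences.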
-
4_UCE_p75_pruned_random_nexus-
Alignments of 100 random UCE loci for the pruned data set used in the timetree analysis
/lamprologus_dataset10_mafft_gblocks_min75_rnd1_100loci-pruned
/lamprologus_dataset10_mafft_gblocks_min75_rnd2_100loci-pruned
/lamprologus_dataset10_mafft_gblocks_min75_rnd3_100loci-pruned
/lamprologus_dataset10_mafft_gblocks_min75_rnd4_100loci-pruned
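To illustrate how random locus subsets of this kind can be drawn reproducibly, here is a minimal sketch. It is not the authors' procedure: the seed, the function name `draw_locus_subsets`, and the choice to sample each subset independently (so subsets may share loci) are all assumptions made for illustration.

```python
import random


def draw_locus_subsets(locus_names, n_subsets=4, subset_size=100, seed=1):
    """Draw n_subsets random subsets of loci, each sampled without replacement."""
    rng = random.Random(seed)   # fixed seed so the draw can be repeated
    loci = sorted(locus_names)  # sort first so results do not depend on input order
    if subset_size > len(loci):
        raise ValueError("subset_size exceeds the number of available loci")
    return [sorted(rng.sample(loci, subset_size)) for _ in range(n_subsets)]
```

Within a subset no locus is repeated, but the same locus can appear in more than one subset because each draw is independent.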
-
5_SNPs_VCF-
VCF files containing all SNPs and one randomly selected SNP per UCE locus, for all samples
/ALL-lamprologus-only-PASS-Q30-25Pind50Ploc-SNPs.vcf
/ALL-lamprologus-only-PASS-Q30-25Pind50Ploc-SNPs.1rnd.vcf
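The relationship between the two files can be sketched as follows: group the records of the full VCF by locus and keep one at random from each group. This is an illustration only, assuming each UCE locus appears as its own value in the CHROM column; the function name and seed are hypothetical, and the authors' actual thinning tool is not named here.

```python
import random
from collections import defaultdict


def one_random_snp_per_locus(vcf_lines, seed=1):
    """Keep all header lines and one randomly chosen record per CHROM value."""
    rng = random.Random(seed)
    header = []
    by_locus = defaultdict(list)
    for line in vcf_lines:
        if line.startswith("#"):
            header.append(line)
        elif line.strip():
            # CHROM is the first tab-separated column of a VCF record.
            by_locus[line.split("\t", 1)[0]].append(line)
    # Loci are emitted in order of first appearance in the input.
    chosen = [rng.choice(records) for records in by_locus.values()]
    return header + chosen
```

Passing an open file handle for the full VCF as `vcf_lines` returns a list of lines that can be written straight back out as a thinned VCF.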
-
6_mtDNA_nexus-
Alignments of the mitochondrial ND2 gene for all Pseudocrenilabrinae cichlids and for the LCR and MUCR clades of Lamprologus, in Nexus format
/Pseudocrenilabrinae_ND2 Alignment_n353.nex
/Lamprologus_LCR_ND2_n80.nex
/Lamprologus_MUCR_ND2_n61.nex
-
7_ML_tree-
Bootstrap consensus of the Maximum Likelihood phylogenetic tree inferred using IQTree
/lamprologus_dataset10_min75_mafft_gblocks-IQTree.charsets.charsets.contree.names.boot
-
8_Gene_and_species_trees-
Species tree of Lamprologini inferred in ASTRAL-III, plus three gene tree files: all best trees, gene trees retaining only nodes with bootstrap support >1, and gene trees retaining only nodes with bootstrap support >10
/lamprologus_dataset_min75.Astral.sptree.tre
/lamprologus_dataset10_besttrees.trees
/lamprologus_dataset10_BS1.tre
/lamprologus_dataset10_BS10.tre
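A simple way to check the gene tree files is to count the trees and tally how many of them contain each taxon. The sketch below assumes one Newick tree per line and unquoted tip labels; both are assumptions about the file layout, and the function name `taxon_occupancy` is hypothetical.

```python
import re
from collections import Counter

# A tip label directly follows "(" or ","; internal support values follow ")"
# and are therefore not matched.
TIP_LABEL = re.compile(r"(?<=[(,])\s*([^\s(),:;][^(),:;]*)")


def taxon_occupancy(newick_lines):
    """Return (number of trees, Counter of how many trees contain each taxon)."""
    counts = Counter()
    n_trees = 0
    for line in newick_lines:
        line = line.strip()
        if not line:
            continue
        n_trees += 1
        # Use a set so a taxon is counted at most once per tree.
        counts.update({label.strip() for label in TIP_LABEL.findall(line)})
    return n_trees, counts
```

Because collapsing poorly supported nodes changes only the internal structure, the three gene tree files should give the same tree count and taxon tallies under these assumptions.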
-
9_Topology_tests-
Phylogenetic hypotheses compared in the Approximately Unbiased (AU) test
/lamprologus_dataset10_best_scheme_7hyp.trees
-
Online-Only Supplementary Files
Supplementary_Materials_Appendices.pdf
- Supplementary Materials Appendix A: Laboratory Methods and Bioinformatic Pipelines
- Figure S1. Maximum Likelihood phylogenetic hypotheses of Lamprologini based on mitochondrial gene ND2 sequences using IQ-TREE2
- Figure S2. Bayesian Inference phylogenetic hypotheses of Lamprologini based on mitochondrial gene ND2 sequences using MrBayes
- Figure S3. Local posterior probability, normalized quartet score, and number of informative gene trees for the main and alternative quartet topologies for all the nodes in the Congo River Lamprologus clade as recovered in ASTRAL-III. N1-N10 correspond to the nodes in the small inset tree at the top.
- Figure S4. Average Delta scores ± SD for all the species of riverine Lamprologus included in the SplitsTree analysis.
- Figure S5. Haplotype network of the “mainly LCR” clade samples of riverine Lamprologus based on mitochondrial gene ND2 sequences using PopArt.
- Figure S6. Haplotype network of the “mainly CUCR” clade samples of riverine Lamprologus based on mitochondrial gene ND2 sequences using PopArt.
- Figure S7. Calibrated timetree of Lamprologini based on the fixed ML topology using BEAST2. Values indicate mean age estimates, and bars represent 95% HPD intervals for each node. The estimates are the combined result of four independent analyses of ~100 UCE loci run in duplicate.
- Figure S8. Cross-entropy values inferred from sNMF for each tested value of K, used to choose the best-fitting K (lowest cross-entropy) for our data.
- Figure S9. Ancestry coefficients of riverine Lamprologus species as inferred from sNMF for the most frequent structure arrangements recovered from K = 2 through K = 10. Each bar represents an individual sample.
- Figure S10. Polytomy test results for three datasets using all available gene trees, with nodes collapsed at increasing thresholds of bootstrap support (BS): a) all nodes included, b) nodes with BS>1 included, c) nodes with BS>10 included. For each internal branch in the ASTRAL species tree, the x-axis shows its estimated length in coalescent units (CU) on a log scale, and the y-axis shows the polytomy test 1-p-value and the local posterior probability (LPP) estimated in the species tree. The dashed line indicates the significance level (1-p-value = 0.95) of the polytomy test.
- Figure S11. f-branch test results for the ASTRAL tree topology.
- Figure S12. f-branch test results for the ML tree topology.
- Figure S13. Calibrated timetree of mitochondrial ND2 gene sequences of Lamprologini using BEAST2. Values indicate mean age estimates, and bars represent 95% HPD intervals for each node. Samples in red represent the “mainly LCR” clade, and samples in blue represent the “mainly CUCR” clade.
Supplementary_Materials_Tables.xlsx
- Supplementary Table S1a. List of specimens analyzed in this study.
- Supplementary Table S1b. List of specimens obtained from GenBank used in the mitochondrial (ND2) analysis.
- Supplementary Table S2. Summary statistics of UCE sequencing results of samples analyzed in this study.
- Supplementary Table S3. Results of Approximate Unbiased (AU) topology tests.
- Supplementary Table S4. Results of ABBA-BABA tests performed for all population trios of Congo River Lamprologus based on the ASTRAL species tree topology.
- Supplementary Table S5. Results of ABBA-BABA tests performed for all population trios of Congo River Lamprologus based on the ML phylogenetic tree topology.
- Supplementary Table S6. Table summarizing the f-branch statistics from the ABBA-BABA tests based on the ASTRAL species tree and the ML phylogenetic tree topologies. Branches and values correspond to Supplementary Figures S11 and S12.
- Supplementary Table S7. Proposed taxonomy and nomenclature of lamprologines based on the phylogenomic systematic analysis in this study.
