Data from: Phylogenomic and anatomical evidence for the Late Cretaceous diversification of African characiform fishes, including a new family, under the influence of the Trans-Saharan Seaway
Data files
Nov 18, 2024 version files (26.69 MB total)
lepidarchidae-80.xml (1.26 MB)
MCC-tree-all.tre (115.66 KB)
muscle-nexus-edge-trimmed-clean-60p.phylip (17.01 MB)
muscle-nexus-edge-trimmed-clean-70p.phylip (7.08 MB)
muscle-nexus-edge-trimmed-clean-80p.nex (1.22 MB)
README.md (1.26 KB)
Abstract
Geological evidence supports the occurrence of an epicontinental Trans-Saharan Seaway bisecting the African continent during the Late Cretaceous to early Paleogene. The seaway formed a wide saltwater channel connecting the Neotethys with the South Atlantic, yet no previous study has investigated its impact on freshwater fish diversification. Phylogenomic data and time-calibrated trees indicate a Late Cretaceous signature for the appearance of three modern lineages of characiform fishes. Phylogenetic analyses using ultraconserved elements of 83 characiforms reveal that Alestidae, Hepsetus, and Lepidarchidae fam. nov. originated during the Santonian-Campanian of the Late Cretaceous (84–77.5 million years ago; Ma). Lepidarchidae consists of two monotypic taxa not previously recognized as sister species: the Niger tetra Arnoldichthys, endemic to the lower Niger and Ogun rivers of Nigeria, and the dwarf jellybean tetra Lepidarchus, from coastal rivers of Ghana, Côte d'Ivoire, Liberia, Sierra Leone, and Guinea. Microcomputed tomography (µCT) scans of 117 characiforms provide three novel morphological characters supporting Hepsetus and Lepidarchidae, four characters for monophyly of Lepidarchidae, and five for a restricted Alestidae. The Santonian-Campanian divergence indicates allopatric speciation processes influenced by the Trans-Saharan Seaway, partitioning the African ichthyofauna in a west-east orientation. The timing for African characiform cladogenesis aligns with the Cenomanian fossil record and is circa 16–23 Ma younger than the earliest characiform-like fossils from Late Cretaceous outcrops of Morocco and Sudan. This study highlights the magnitude of Cretaceous transgression events shaping the freshwater biota and gaps in our understanding of the evolutionary history and paleobiogeography of ray-finned fishes across the African continent.
https://doi.org/10.5061/dryad.kd51c5bgm
Description of the data and file structure
Data were collected from museum specimens. Sequences were obtained from captures of ultraconserved elements (UCEs) of 84 taxa using Illumina sequencing. Data were processed with PHYLUCE, and analyses were run in RAxML, ASTRAL, and BEAST.
Files and variables
File: lepidarchidae-80.xml
Description: Data file used for BEAST time-calibrated analysis.
File: MCC-tree-all.tre
Description: Maximum Clade Credibility timetree of Lepidarchidae and other characiforms.
File: muscle-nexus-edge-trimmed-clean-80p.nex
Description: 80% complete data matrix of ultraconserved elements.
File: muscle-nexus-edge-trimmed-clean-60p.phylip
Description: 60% complete data matrix of ultraconserved elements.
File: muscle-nexus-edge-trimmed-clean-70p.phylip
Description: 70% complete data matrix of ultraconserved elements.
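As a quick sanity check after download, the matrices can be inspected without a phylogenetics library. The sketch below assumes the .phylip files use the relaxed sequential layout (a header line with the number of taxa and sites, then one "name sequence" record per line). That layout is an assumption about these files, not something stated here, and the function name is illustrative.

```python
def read_relaxed_phylip(text):
    """Parse relaxed sequential PHYLIP text into (ntax, nchar, {name: sequence}).

    Assumption: one record per line, name and sequence separated by whitespace.
    Interleaved or strict 10-character-name files need a different parser.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty PHYLIP input")
    # Header: number of taxa, then number of aligned sites.
    ntax, nchar = (int(x) for x in lines[0].split()[:2])
    records = {}
    for ln in lines[1:]:
        parts = ln.split()
        name, seq = parts[0], "".join(parts[1:])
        if len(seq) != nchar:
            raise ValueError(f"{name}: expected {nchar} sites, found {len(seq)}")
        records[name] = seq
    if len(records) != ntax:
        raise ValueError(f"expected {ntax} taxa, found {len(records)}")
    return ntax, nchar, records


# Toy alignment for illustration only (not data from this dataset).
example = """3 8
taxon_a ACGT-ACG
taxon_b ACGTTACG
taxon_c AC?TTACG
"""
ntax, nchar, seqs = read_relaxed_phylip(example)
print(ntax, nchar, sorted(seqs))
```

For a real file, the same function could be called on the file contents, for example on the text read from muscle-nexus-edge-trimmed-clean-70p.phylip. A mismatch between the header and the records raises an error rather than passing silently.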
Code/software
PHYLUCE v1.5.0
RAxML v8.2.11
ASTRAL v5.6.1
BEAST v2.4.8
Total genomic DNA was extracted using the DNeasy tissue kit and quantified. Genomic libraries were enriched using the myBaits Ostariophysan 2.7Kv1 probeset with an overnight hybridization and washes at 65 °C, followed by additional quantification with a spectrofluorimetric assay. Samples were sequenced on the Illumina NovaSeq 6000 platform on a partial S4 PE150 lane, to approximately 14 Gbp in total for the captured libraries.
PHYLUCE v1.5.0 was used to perform all UCE data analysis, including removal of adapters and low-quality bases with Illumiprocessor and Trimmomatic, and assembly of FASTA contigs with Velvet v1.5.0. We searched for orthologous UCE loci using the myBaits Ostariophysan 2.7Kv1 probeset and removed potentially paralogous regions with phyluce_assembly_match_contigs_to_probes in PHYLUCE v1.5.0. We extracted the loci, aligned them with MUSCLE using an edge-trimming approach, and assembled three matrices: the 60% complete matrix (i.e., loci present in at least 51 terminals), the 70% complete matrix (i.e., loci present in at least 59 terminals), and, for time calibration, the 80% complete matrix (i.e., loci present in at least 67 terminals).
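The completeness thresholds above amount to keeping only loci sampled in a minimum number of terminals. The following is a minimal sketch of that filtering idea, not PHYLUCE's own implementation; the function name, locus names, and taxon labels are hypothetical.

```python
def loci_with_min_terminals(locus_to_taxa, min_terminals):
    """Return the sorted names of loci sampled in at least `min_terminals` taxa."""
    return sorted(
        locus
        for locus, taxa in locus_to_taxa.items()
        if len(set(taxa)) >= min_terminals
    )


# Toy occupancy table (hypothetical loci and taxa, not from this dataset).
toy = {
    "uce-1": {"t1", "t2", "t3", "t4"},
    "uce-2": {"t1", "t2"},
    "uce-3": {"t1", "t2", "t3"},
}
kept = loci_with_min_terminals(toy, min_terminals=3)
print(kept)  # ['uce-1', 'uce-3']
```

With the cutoffs given above, the same call would use min_terminals=51, 59, or 67 for the 60%, 70%, and 80% matrices, respectively.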
Maximum likelihood (ML) analyses were conducted in RAxML v8.2.11 with 1,000 non-parametric bootstrap replicates. For the coalescent-based analysis, ML gene trees were inferred for each UCE locus with RAxML-PTHREAD-SSE3, and species trees were estimated from the best-likelihood gene trees using ASTRAL v5.6.1.
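ASTRAL reads its gene trees from a single file with one Newick tree per line, so the per-locus RAxML results need to be concatenated first. The sketch below shows one way to do that, assuming the best trees follow RAxML's RAxML_bestTree.<run name> naming; the directory layout, function name, and output file name are hypothetical and not taken from the authors' workflow.

```python
from pathlib import Path
import tempfile


def collect_gene_trees(tree_dir, out_path, pattern="RAxML_bestTree.*"):
    """Concatenate per-locus Newick trees into one file, one tree per line.

    Returns the number of trees written. Empty files are skipped.
    """
    written = 0
    with open(out_path, "w") as out:
        for tree_file in sorted(Path(tree_dir).glob(pattern)):
            newick = tree_file.read_text().strip()
            if newick:
                out.write(newick + "\n")
                written += 1
    return written


# Toy demonstration in a temporary directory (hypothetical loci and trees).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "RAxML_bestTree.uce-1").write_text("((a,b),c);\n")
    (tmp / "RAxML_bestTree.uce-2").write_text("((a,c),b);\n")
    n_trees = collect_gene_trees(tmp, tmp / "gene-trees.tre")
    combined = (tmp / "gene-trees.tre").read_text()
print(n_trees)
```

The combined file could then be passed to ASTRAL with its -i (input gene trees) and -o (output species tree) options; the jar name and paths depend on the local installation.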
Time calibration analyses were performed in BEAUti and BEAST v2.4.8 using the 80% complete UCE matrix under the GTR+G+I evolutionary model and a relaxed lognormal clock model. The priors included a birth-death tree model, a diversification rate, one root constraint, and four fossil calibrations.
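Before rerunning the time calibration, it can be useful to confirm which terminals the BEAST input file contains. The sketch below lists taxon labels from a BEAST 2 style XML, assuming the alignment is stored as sequence elements with a taxon attribute; that structure is an assumption about lepidarchidae-80.xml, and the function name and toy XML are illustrative.

```python
import xml.etree.ElementTree as ET


def taxa_in_beast_xml(xml_text):
    """Return the sorted, de-duplicated taxon labels found on <sequence> elements."""
    root = ET.fromstring(xml_text)
    return sorted(
        {seq.get("taxon") for seq in root.iter("sequence") if seq.get("taxon")}
    )


# Toy XML for illustration only (not the contents of the real input file).
toy_xml = """<beast version="2.4">
  <data id="toy" name="alignment">
    <sequence id="seq_a" taxon="taxon_a" totalcount="4" value="ACGT"/>
    <sequence id="seq_b" taxon="taxon_b" totalcount="4" value="ACGA"/>
  </data>
</beast>"""
taxa = taxa_in_beast_xml(toy_xml)
print(taxa)  # ['taxon_a', 'taxon_b']
```

Comparing this list with the taxon names in the 80% complete matrix is a simple way to check that the two files describe the same set of terminals.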
Specimens were µCT scanned using a GE Phoenix v|tome|x with a 180 kV Nano Tube at resolutions ranging from 6.2 to 27.8 μm, with the beam set to 100–110 kV and 180–200 mA.