Dryad

Data from: Genomic and phenotypic delimitation of species in a temperate aquatic biodiversity hotspot

Data files

Nov 24, 2025 version files 1.78 GB

Abstract

Biologists have relied on morphological characteristics to identify, define, and formally describe species for the past 250 years. The advent of phylogenetic species concepts and the introduction of molecular data have spawned new species delimitation methods applicable to a wide range of eukaryotic lineages. However, these approaches heavily emphasize genomic data, often overlooking phenotypic traits. We present and implement a species delimitation approach that utilizes genome-wide markers from ddRAD-seq and meristic morphological traits, which have long been used to identify and delineate fish species. Our methodology employs unsupervised machine learning to analyze morphological data without a priori species assignments, allowing phenotypic patterns to emerge independently from genomic-based species delimitation. We apply our combined genomic and phenotypic methodology to the freshwater systems of Southeastern North America, a biodiversity hotspot where conservation efforts are hampered by an incomplete knowledge of species diversity. Our investigation focuses on the darter clade Allohistium, a threatened lineage comprising two described species. Through phylogenomic, population genetic, and phenotypic model comparisons, we provide evidence supporting the delimitation of a third species of Allohistium, which we formally describe. Our approach shows how unsupervised machine learning can reveal cryptic morphological diversity that might otherwise be obscured by taxonomic preconceptions. This study demonstrates that model testing using diverse lines of evidence yields a more comprehensive, data-driven hypothesis of species diversity.