Phylogenomic analysis of the dwarfgobies (Teleostei: Gobiidae: Eviota and Sueviota)
Data files
Jan 19, 2026 version files 119.72 KB
- Fig6_Eviota_164x440_R0.4_MLtree.tre (21.58 KB)
- FigS1_Eviota_160x2441.R0.2_MLtree.tre (9.30 KB)
- FigS2_Eviota_164x824_R0.3_MLtree.tre (9.30 KB)
- FigS3_Eviota_159x381_R0.4_MrBayestree.tre (77.60 KB)
- README.md (1.94 KB)
Abstract
The goby genera Eviota and Sueviota (family Gobiidae) are commonly known as dwarfgobies, and collectively the two genera are among the most abundant and diverse groups of fishes on coral reefs. Despite the diversity, abundance, and ecological importance of this group, and the large and growing number of species described to date (132 Eviota, 9 Sueviota), there is a lack of understanding of the phylogenetic relationships among the major clades of Eviota, poor knowledge of the relationships between dwarfgobies and other Gobiidae species, and no information on the placement of Sueviota. In addition, as is the case with most small reef fishes, a clear understanding of taxonomically informative phenotypic characters is also lacking. To resolve the evolutionary history of dwarfgobies, we inferred a time-calibrated phylogeny of the group using genome-wide data from 440 ddRADseq loci captured across 98 Eviota and Sueviota specimens, plus 66 specimens of other related gobies. We also assessed the distribution of 14 external and osteological morphological characters across the tree to assess which may be useful for diagnosing clades. Our results robustly established the non-monophyly of the genus Eviota, which was resolved into two separate clades, both of which were resolved within a lineage of other coral-associated genera (Gobiodon, Paragobiodon, Pleurosicya, Minisicya, and Bryaninops). One of the two clades is herein elevated to its own genus, Eviotops, a name which was previously considered synonymous with Eviota. Additionally, we established that the genus Sueviota is deeply nested within one of the Eviota clades and is herein synonymized with Eviota. We also found strong phylogenetic signal for 12 out of the 14 phenotypic traits examined, providing strong complementary support for the two recovered clades, and establishing the validity of phenotypic traits that strongly correspond with genetic groupings that should aid in future taxonomic studies for this group.
Dataset overview
To examine the effects of missing data on the support of phylogenies, we tested three values of the parameter ‘-R’, the minimum percentage of individuals across the entire alignment required to include a locus, using the ‘populations’ module in Stacks. The values chosen were R=0.2, 0.3, and 0.4, corresponding to thresholds of 80%, 70%, and 60% missing data allowable, respectively.
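The relationship between each -R value and its missing-data threshold is simple arithmetic, sketched below in Python. The function name `missing_data_percent` is hypothetical and is not part of Stacks; the sketch only assumes that -R is expressed as a proportion between 0 and 1, as in the values listed above.

```python
def missing_data_percent(r: float) -> int:
    """Return the maximum percentage of individuals allowed to lack a locus.

    `r` is the -R value expressed as a proportion (e.g. 0.4 means a locus
    must be present in at least 40% of individuals to be retained).
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("-R must be a proportion between 0 and 1")
    # round() guards against floating-point noise such as 69.99999999999999
    return round(100 * (1.0 - r))


# The three thresholds tested in this dataset
for r in (0.2, 0.3, 0.4):
    print(f"-R = {r}: up to {missing_data_percent(r)}% missing data per locus")
```

Higher -R values are therefore stricter: they retain fewer loci, but each retained locus is present in a larger share of individuals.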
We generated maximum likelihood trees under the GTR+G model for the three -R filtered datasets using RAxML-NG v1.1 (Kozlov et al. 2019) on the CIPRES (Miller et al. 2010) supercomputer portal. Node support was assessed with 100 bootstrap replicates, and trees were visualized using FigTree v1.4.4 (Rambaut 2012). The entire alignment, including invariant sites, was used for phylogenetic inference. Heterozygous sites were coded using IUPAC ambiguity codes. For the R=0.4 dataset, we also inferred a tree using Bayesian inference in MrBayes 3.2 (Ronquist et al. 2012) with the same substitution model. The MCMC analysis was run for 10 million generations, sampling every 1000 generations and discarding the first 20 percent of trees as burn-in. Convergence and mixing of MCMC chains were assessed in Tracer 1.5 (Rambaut and Drummond 2018). A 50% majority-rule consensus tree was generated from the resulting posterior distribution of trees.
Description of the data and file structure
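The IUPAC coding of heterozygous sites mentioned in the dataset overview can be illustrated with a minimal Python sketch. It uses the standard IUPAC two-base ambiguity codes; the function name `iupac_code` is hypothetical and this is not the code used to build the alignments.

```python
# Standard IUPAC ambiguity codes for two-base (heterozygous) combinations
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def iupac_code(allele1: str, allele2: str) -> str:
    """Return the single-character code for a diploid genotype at one site."""
    a, b = allele1.upper(), allele2.upper()
    if a not in "ACGT" or b not in "ACGT" or len(a) != 1 or len(b) != 1:
        raise ValueError("alleles must each be one of A, C, G, T")
    if a == b:
        # Homozygous site: keep the base itself
        return a
    # Heterozygous site: look up the ambiguity code (order does not matter)
    return IUPAC_TWO_BASE[frozenset((a, b))]


print(iupac_code("A", "G"))  # heterozygous A/G
print(iupac_code("T", "T"))  # homozygous T
```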
Files
Files for the ddRADseq phylogenetic maximum likelihood trees (1-3) obtained from the Stacks/populations datasets using the 0.2-0.4 -R filtering parameters, and Bayesian inference tree (4) using the 0.4 -R filtering:
- -R = 0.2 (FigS1_Eviota_160x2441.R0.2_MLtree.tre)
- -R = 0.3 (FigS2_Eviota_164x824_R0.3_MLtree.tre)
- -R = 0.4 (Fig6_Eviota_164x440_R0.4_MLtree.tre)
- -R = 0.4 (FigS3_Eviota_159x381_R0.4_MrBayestree.tre)
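The file names appear to encode the dataset dimensions and filtering threshold. The Python sketch below parses them under the assumption that `164x440` means 164 specimens by 440 loci (consistent with the 98 + 66 specimens and 440 loci reported in the abstract) and that `R0.4` is the -R value. The regular expression and the function name `parse_tree_filename` are hypothetical helpers, not part of the dataset.

```python
import re

# Assumed naming pattern: <figure>_Eviota_<specimens>x<loci>[._]R<value>_<method>.tre
# One file uses "." instead of "_" before the R value, so both are accepted.
PATTERN = re.compile(
    r"(?P<figure>Fig[A-Za-z0-9]+)_Eviota_"
    r"(?P<specimens>\d+)x(?P<loci>\d+)[._]"
    r"R(?P<r>\d+\.\d+)_"
    r"(?P<method>MLtree|MrBayestree)\.tre"
)


def parse_tree_filename(name: str) -> dict:
    """Extract figure label, dimensions, -R value, and method from a file name."""
    match = PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name}")
    return {
        "figure": match["figure"],
        "specimens": int(match["specimens"]),
        "loci": int(match["loci"]),
        "r": float(match["r"]),
        "method": "ML" if match["method"] == "MLtree" else "Bayesian",
    }


# File names as given in the data files listing
for fname in (
    "FigS1_Eviota_160x2441.R0.2_MLtree.tre",
    "FigS2_Eviota_164x824_R0.3_MLtree.tre",
    "Fig6_Eviota_164x440_R0.4_MLtree.tre",
    "FigS3_Eviota_159x381_R0.4_MrBayestree.tre",
):
    print(parse_tree_filename(fname))
```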
To capture genome-wide SNP data, we used ddRADseq (Peterson et al. 2012). DNA extractions were sent to the NGS labs at the University of Wisconsin for library preparation and sequencing. In summary, the ddRADseq libraries were prepared following the methodology outlined in Elshire et al. (2011), which had been used successfully for a prior Eviota phylogeny using SNP data (Gómez-Buckley et al. 2025). To assemble the ddRADseq raw data, we used the Stacks 2.4.1 pipeline (Catchen et al. 2013) on the University of Washington’s High Performance Computer Cluster. We used the R package RADstackshelpR 0.1.0 (https://github.com/DevonDeRaad/RADstackshelpR) to calculate summary statistics from the Stacks output VCF files and to determine the optimal parameters for assembling putative RAD loci de novo (Mastretta et al. 2015, Paris et al. 2017), following the same protocols as in Gómez-Buckley et al. (2025).
