Accurately delimiting species is fundamentally important for understanding species diversity and distributions and devising effective strategies to conserve biodiversity. However, species delimitation is problematic in many taxa, including ‘non-adaptive radiations’ containing morphologically cryptic lineages. Fortunately, coalescent-based species delimitation methods hold promise for objectively estimating species limits in such radiations, using multilocus genetic data. Using coalescent-based approaches, we delimit species and infer evolutionary relationships in a morphologically conserved group of Central American freshwater fishes, the Poecilia sphenops species complex. Phylogenetic analyses of multiple genetic markers (sequences of two mitochondrial DNA genes and five nuclear loci) from 10 of the 15 species and genetic lineages recognized in the group support the P. sphenops species complex as monophyletic with respect to outgroups, with eight mitochondrial ‘major-lineages’ diverged by ≥2% pairwise genetic distances. From general mixed Yule-coalescent models, we discovered (conservatively) 10 species within our concatenated mitochondrial DNA dataset, 9 of which were strongly supported by subsequent multilocus Bayesian species delimitation and species tree analyses. Results suggested that species-level diversity is underestimated or overestimated by at least ~15% in different lineages in the complex. Nonparametric statistics and coalescent simulations indicate that genealogical discordance among our gene tree results derives mainly from interspecific hybridization in the nuclear genome. However, mitochondrial DNA shows little evidence of introgression, and our species delimitation results appear robust to the effects of this process. Overall, our findings support the utility of combining multiple lines of genetic evidence and broad phylogeographical sampling to discover and validate species using coalescent-based methods.
Our study also highlights the importance of testing for hybridization versus incomplete lineage sorting, which aids inference of not only species limits but also evolutionary processes influencing genetic diversity.
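The ≥2% divergence threshold mentioned above refers to pairwise genetic distances between aligned sequences. As a rough illustration only, the sketch below computes an uncorrected p-distance (the proportion of differing sites) between two aligned sequences. The record does not state which distance measure was used, so the choice of p-distance, the function name `p_distance`, and the toy sequences are all assumptions made for this example.

```python
def p_distance(seq_a, seq_b, skip="-?N"):
    """Uncorrected p-distance: differing sites / compared sites.

    Positions where either sequence has a gap, missing-data symbol, or N
    are skipped. Other IUPAC ambiguity codes are compared literally,
    which is a simplification.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy sequences (hypothetical, not taken from the deposited alignments).
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTAT"
toy_distance = p_distance(toy_a, toy_b)
print(f"p-distance: {toy_distance:.3f}")
```

Under this measure, two lineages would meet a 2% threshold when `p_distance` returns 0.02 or more; a model-corrected distance would generally give somewhat larger values for the same pair.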
concatenated mtDNA dataset
NEXUS file containing the 'concatenated mtDNA' dataset alignment, which includes 171 mtDNA subsamples sequenced for the mitochondrial cytochrome b and/or cytochrome oxidase 1 genes.
171_concat_mtDNA.NEX
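The alignments in this record are distributed as NEXUS files. For readers who want a quick look at such a file without installing a phylogenetics library, the following is a minimal sketch of a reader for the MATRIX block. It assumes taxon names contain no spaces or quotes and that comments are not nested; the function name `read_nexus_matrix` and the toy alignment are hypothetical and do not reproduce any deposited file.

```python
import re


def read_nexus_matrix(text):
    """Return {taxon: sequence} from the MATRIX block of a NEXUS alignment."""
    # Drop bracketed NEXUS comments (nested comments are not handled).
    text = re.sub(r"\[[^\]]*\]", "", text)
    # Capture everything between the MATRIX keyword and its closing semicolon.
    match = re.search(r"\bmatrix\b(.*?);", text, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("no MATRIX block found")
    seqs = {}
    for line in match.group(1).splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank line between interleaved blocks
        name, chunk = parts[0], "".join(parts[1:])
        # Interleaved files repeat each name, so append successive chunks.
        seqs[name] = seqs.get(name, "") + chunk
    return seqs


# Toy interleaved alignment (hypothetical sample names and sequences).
demo_nexus = """#NEXUS
BEGIN DATA;
  DIMENSIONS NTAX=2 NCHAR=8;
  FORMAT DATATYPE=DNA MISSING=? GAP=- INTERLEAVE;
  MATRIX
  [first block]
  sampleA ACGT
  sampleB AC-T

  sampleA TTGA
  sampleB TT?A
  ;
END;
"""
seqs = read_nexus_matrix(demo_nexus)
for name, seq in seqs.items():
    print(name, len(seq), seq)
```

For real analyses, an established parser is the safer choice, since NEXUS allows quoted names, nested comments, and other features this sketch ignores.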
50tax_plus8out_mtDNAonly
NEXUS file containing mtDNA locus data matching samples in the 50-sample 'concatenated nDNA' dataset.
concatenated nDNA dataset - LDHA alignment
NEXUS file containing the muscle-type lactate dehydrogenase (ldh-A) sequences in the concatenated nDNA dataset.
50tax_plus8out_LDHAonly.NEX
concatenated nDNA dataset - RPS7 alignment
NEXUS file containing the sequences of nuclear ribosomal protein S7 (RPS7) introns and exons in the concatenated nDNA dataset.
50tax_plus8out_S7only.NEX
concatenated nDNA dataset - X-src alignment
NEXUS file containing the tyrosine-kinase class X-src oncogene (X-src) sequences in the concatenated nDNA dataset.
50tax_plus8out_XSRConly.NEX
concatenated nDNA dataset - X-yes alignment
NEXUS file containing the tyrosine-kinase class X-yes oncogene (X-yes) sequences in the concatenated nDNA dataset.
50tax_plus8out_XYESonly.NEX
concatenated nDNA dataset - Glyt alignment
NEXUS file containing the glycosyltransferase (Glyt) sequences in the concatenated nDNA dataset.
50tax_plus8out_Glytonly.NEX
concatenated mtDNA + nDNA dataset - mtDNA alignment
NEXUS file containing the mtDNA sequences in the concatenated mtDNA + nDNA dataset.
80tax_mtDNAonly.NEX
concatenated mtDNA + nDNA dataset - LDHA alignment
NEXUS file containing the ldh-A sequences in the concatenated mtDNA + nDNA dataset.
80tax_LDHAonly.NEX
concatenated mtDNA + nDNA dataset - RPS7 alignment
NEXUS file containing the RPS7 sequences in the concatenated mtDNA + nDNA dataset.
80tax_S7only.NEX
concatenated mtDNA + nDNA dataset - X-src alignment
NEXUS file containing the X-src sequences in the concatenated mtDNA + nDNA dataset.
80tax_XSRConly.NEX
concatenated mtDNA + nDNA dataset - X-yes alignment
NEXUS file containing the X-yes sequences in the concatenated mtDNA + nDNA dataset.
80tax_XYESonly.NEX
concatenated mtDNA + nDNA dataset - Glyt alignment
NEXUS file containing the Glyt sequences in the concatenated mtDNA + nDNA dataset.
80tax_Glytonly.NEX
concatenated mtDNA GARLI ML tree
Newick format tree file containing the single 'best' maximum-likelihood tree resulting from our analysis of the concatenated mtDNA dataset in GARLI.
concat_mtDNA_garli_ML_newick.txt
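The tree files in this record are plain-text Newick strings. A quick way to check which samples a tree contains is to list its tip labels. The sketch below does this with a regular expression, assuming unquoted labels without spaces; the function name `tip_labels` and the toy tree are hypothetical and are not taken from the deposited files.

```python
import re


def tip_labels(newick):
    """Return tip labels from a Newick string, in left-to-right order."""
    # Remove bracketed comments/annotations such as [&posterior=1.0].
    cleaned = re.sub(r"\[[^\]]*\]", "", newick)
    # A tip label directly follows "(" or ","; internal node labels and
    # support values follow ")" and are therefore not captured.
    return re.findall(r"[(,]\s*([^(),:;\s]+)", cleaned)


# Toy tree with branch lengths, one support value, and one annotation.
demo_tree = (
    "((sampleA:0.01,sampleB:0.02)0.98:0.05,"
    "(sampleC:0.03,sampleD:0.04)[&posterior=1.0]:0.02);"
)
labels = tip_labels(demo_tree)
print(len(labels), "tips:", ", ".join(labels))
```

Comparing the label lists from two tree files (for example, the 'best' and 'alternative' ML topologies) is a simple check that both were built from the same set of samples before comparing their branching order.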
concatenated mtDNA GARLI ML tree - alt topology
Newick format tree file containing the single 'best' maximum-likelihood tree resulting from an additional analysis of the concatenated mtDNA dataset in GARLI. This 'alternative' topology differs from the best ML topology presented in the paper, for example in recovering sample "172554" as the most basal member of clade 2.
concat_mtDNA_garli_ML_alt_newick.txt
concatenated mtDNA + nDNA GARLI ML tree
Newick format tree file containing the GARLI topology shown in Fig. 3A.
Fig_3A_concat_mtDNA_nDNA_ML_newick.txt
concatenated nDNA GARLI ML tree
Newick format tree file containing the GARLI topology shown in Fig. 3B.
Fig_3B_concat_nDNA_garli_ML_newick.txt
concatenated mtDNA + nDNA *BEAST species tree
Newick format tree file containing the species tree shown in whole or in part in Fig. 4 and S3B Fig.
Figs_4_and_S3B_25spp_80seq_*BEAST_species_tree_newick.txt
concatenated mtDNA BEAST tree
Newick format tree file containing the time tree resulting from our analysis of the concatenated mtDNA dataset in BEAST, which is shown in S3A Fig.
S3A_Fig_concat_mtDNA_BEAST_newick.txt
LDHA BEAST tree
Newick format tree file containing the time tree resulting from our analysis of the ldh-A dataset in BEAST, which is shown in S4 Fig.
S4_Fig_LDHA_BEAST_tree_newick.txt
RPS7 BEAST tree
Newick format tree file containing the time tree resulting from our analysis of the RPS7 dataset in BEAST, which is shown in S4 Fig.
S4_Fig_S7_BEAST_tree_newick.txt
XSRC BEAST tree
Newick format tree file containing the time tree resulting from our analysis of the X-src dataset in BEAST, which is shown in S4 Fig.
S4_Fig_XSRC_BEAST_tree_newick.txt
XYES BEAST tree
Newick format tree file containing the time tree resulting from our analysis of the X-yes dataset in BEAST, which is shown in S4 Fig.
S4_Fig_XYES_BEAST_tree_newick.txt
GLYT BEAST tree
Newick format tree file containing the time tree resulting from our analysis of the Glyt dataset in BEAST, which is shown in S4 Fig.
S4_Fig_GLYT_BEAST_tree_newick.txt