Introgression and species delimitation in the longear sunfish Lepomis megalotis (Teleostei: Percomorpha: Centrarchidae)
Data files
May 10, 2021 version files 169.04 MB
Abstract
Introgression and hybridization are major impediments to genomic-based species delimitation because many implementations of the multispecies coalescent framework assume no gene flow among species. The sunfish genus Lepomis, one of the world’s most popular groups of freshwater sport fish, has a complicated taxonomic history. The results of ddRAD phylogenomic analyses do not provide support for the current taxonomy that recognizes two species, L. megalotis and L. peltastes, in the L. megalotis complex. Instead, evidence from phylogenomics and phenotype warrants recognizing six relatively ancient evolutionary lineages in the complex. The introgressed and hybridizing populations in the L. megalotis complex are localized and appear to be the result of secondary contact or rare hybridization events between non-sister species. Segregating admixed populations from our multispecies coalescent analyses identifies six species with moderate to high genealogical divergence, whereas including admixed populations drives all but one lineage below the species threshold of genealogical divergence. Segregation of admixed individuals also helps reveal phenotypic distinctiveness among the six species in morphological traits used by ichthyologists to discover and delimit species over the last two centuries. Our protocols allow for the identification and accommodation of hybridization and introgression in species delimitation. Genomic-based species delimitation validated with multiple lines of evidence provides a path towards the discovery of new biodiversity and resolving long-standing taxonomic problems.
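The "species threshold of genealogical divergence" mentioned above can be illustrated with a short sketch. It assumes the threshold refers to the genealogical divergence index, gdi = 1 - exp(-2*tau/theta), computed from theta and tau estimates such as those in the BPP Theta-Tau files listed below, and it assumes the commonly used heuristic cutoffs of 0.2 and 0.7. The function names, cutoff defaults, and numeric values are hypothetical and are not taken from the dataset or the authors' scripts.

```python
import math


def gdi(theta: float, tau: float) -> float:
    """Genealogical divergence index: 1 - exp(-2 * tau / theta).

    theta: population size parameter of the focal lineage (must be > 0).
    tau:   divergence time of the focal lineage from its sister (must be >= 0).
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    if tau < 0:
        raise ValueError("tau must be non-negative")
    return 1.0 - math.exp(-2.0 * tau / theta)


def classify(value: float, lower: float = 0.2, upper: float = 0.7) -> str:
    """Apply heuristic gdi cutoffs (assumed here, not confirmed by the source)."""
    if value < lower:
        return "single species"
    if value > upper:
        return "distinct species"
    return "ambiguous"


# Hypothetical (theta, tau) pairs for illustration only; not values from the dataset.
examples = {
    "lineage_A": (0.004, 0.003),
    "lineage_B": (0.010, 0.0005),
}

for name, (theta, tau) in examples.items():
    value = gdi(theta, tau)
    print(f"{name}: gdi = {value:.3f} ({classify(value)})")
```

Because gdi depends only on the ratio tau/theta, gene flow that pulls the estimated divergence time toward zero lowers the index, which is consistent with the abstract's observation that including admixed populations pushes lineages below the threshold.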
BPP_Lmeg_pure.ctl
‘Control’ input for the BPP analysis of the dataset that excludes specimens showing genomic admixture.
BPP_Lmeg_pure.Imap.txt
‘Imapfile’ input for the BPP analysis of the dataset that excludes specimens showing genomic admixture.
BPP_Lmeg_pure_loci.prn
‘seqfile’ input for the BPP analysis of the dataset that excludes specimens showing genomic admixture.
BPP_Lmeg_pure_Theta-Tau.csv
Mean and 95% CI of the theta and tau parameter estimates from the BPP analysis of the dataset that excludes specimens showing genomic admixture.
BPP_Lmeg_admix.ctl
‘Control’ input for the BPP analysis of the dataset that includes specimens showing genomic admixture.
BPP_Lmeg_admix.Imap.txt
‘Imapfile’ input for the BPP analysis of the dataset that includes specimens showing genomic admixture.
BPP_Lmeg_admix_loci.prn
‘seqfile’ input for the BPP analysis of the dataset that includes specimens showing genomic admixture.
BPP_Lmeg_admix_Theta-Tau.csv
Mean and 95% CI of the theta and tau parameter estimates from the BPP analysis of the dataset that includes specimens showing genomic admixture.
MEGOZKMO_MSFS.obs
Multidimensional site frequency spectrum (MSFS) input for the Fastsimcoal2 analysis of the dataset containing specimens of Lepomis megalotis and L. sp. Ozark, plus specimens showing genomic admixture between the two species.
MEGPELMP_MSFS.obs
Multidimensional site frequency spectrum (MSFS) input for the Fastsimcoal2 analysis of the dataset containing specimens of Lepomis megalotis and L. peltastes, plus specimens showing genomic admixture between the two species.
MEGSOLMS_MSFS.obs
Multidimensional site frequency spectrum (MSFS) input for the Fastsimcoal2 analysis of the dataset containing specimens of Lepomis megalotis and L. solis, plus specimens showing genomic admixture between the two species.
MEGOZKMO.ctl
‘Control’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis and L. sp. Ozark, plus specimens showing genomic admixture between the two species.
MEGOZKMO.gphocs
‘seq-file’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis and L. sp. Ozark, plus specimens showing genomic admixture between the two species.
MEGPELMP.ctl
‘Control’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis and L. peltastes, plus specimens showing genomic admixture between the two species.
MEGPELMP.gphocs
‘seq-file’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis and L. peltastes, plus specimens showing genomic admixture between the two species.
MEGSOLMS.ctl
‘Control’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis and L. solis, plus specimens showing genomic admixture between the two species.
MEGSOLMS.gphocs
‘seq-file’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis and L. solis, plus specimens showing genomic admixture between the two species.
Kim_et_al_Lepomis_megalotis_complex_Morphology_data.csv
Table with all measured morphological traits of the Lepomis megalotis complex.
SNAPP_Lmeg_divtime.xml
Input for the divergence time analysis using SNAPP.
SNAPP_Lmeg_speciestree.xml
Input for the species-tree estimation using SNAPP.
SNMF_Lmeg_only.ugeno
Dataset for the population structure analysis with a sparse non-negative matrix factorization (snmf) algorithm.
Treemix_Lmeg.snps.hdf5
Sequence data for the TreeMix analysis.
Treemix_Lmeg_imap.txt
‘imap’ input for the TreeMix analysis.
IQtree_Lepomis_ddRAD.nex
Concatenated nexus alignment for the dataset containing individuals of all Lepomis that were examined in the phylogenetic analyses.
Kim_et_al_Supplementary_MatMeth.docx
Supplementary Materials and Methods
Supplementary Figure S1
Map presenting inland water bodies discussed in the manuscript. Numbers indicate: 1, Cuatrociénegas Basin; 2, Rio Grande; 3, Colorado River; 4, Brazos River; 5, Sabine River; 6, Mississippi River; 7, Red River; 8, Ouachita River; 9, Little River; 10, Kiamichi River; 11, Lake Saint John; 12, Yazoo River; 13, Lake Providence; 14, Arkansas River; 15, White River; 16, Fourche La Fave River; 17, Canadian River; 18, Neosho River; 19, Verdigris River; 20, Hatchie River; 21, Tennessee River; 22, Elk River; 23, Emory River; 24, Clinch River; 25, Ohio River; 26, Kentucky River; 27, Scioto River; 28, Meramec River; 29, Missouri River; 30, Osage River; 31, Illinois River (upper Mississippi River system); 32, Lake George; 33, Iowa Fairport Fish Hatchery; 34, St. Lawrence River; 35, Lake Pontchartrain; 36, Mobile River; 37, Choctawhatchee River; 38, Altamaha River.
Supplementary Figure S2
Maximum likelihood phylogeny for the Lepomis megalotis complex and its sister group, L. marginatus, inferred from IQ–TREE analysis of the concatenated ddRAD loci dataset. Bootstrap support values are shown next to nodes. Figure 2 is a simplified version of this figure.
Supplementary Figure S3
Population structures inferred from snmf analysis for the Lepomis megalotis complex (K=7, 8, 9, 11) are aligned according to IQ–TREE phylogeny.
Supplementary Figure S4
G-PhoCS analysis results for the three populations characterized by genetic admixture that have the largest geographic ranges: a) Lepomis solis x L. megalotis in the middle and upper Tennessee River; b) L. peltastes x L. megalotis in the upper Ohio River; and c) L. sp. Ozark x L. megalotis in the Osage River. The 95% highest posterior density for each parameter (theta, tau, or migration) is shown in parentheses next to the mean. Arrows indicate the direction of gene flow. The width of each box is proportional to the mean of the theta parameter estimate.
Supplementary Figure S5
Models tested in Fastsimcoal2 analyses for three populations characterized by genetic admixture with the largest geographic range: a) Lepomis solis x L. megalotis in the middle and upper Tennessee River; b) L. peltastes x L. megalotis in the upper Ohio River; and c) L. sp. Ozark x L. megalotis in the Osage River. Best-fit model for each population is highlighted with a red box.
Supplementary Table S1
List of described species that were synonymized with Lepomis megalotis prior to the present study. Current status follows the results of the present study.
Supplementary Table S2
List of specimens used in molecular analyses, GPS coordinates of sampling locations, and voucher and tissue catalog numbers. Specimen names correspond to IQ-TREE phylogeny shown in Supplementary Fig. S2. Museum catalog abbreviations: INHS, Illinois Natural History Survey; MMNS, Mississippi Museum of Natural Science; RNYFH, New York State Fish Hatchery at Randolph; SLU, Southeastern Louisiana University Vertebrate Museum; TCWC, Biodiversity Research and Teaching Collections (formerly Texas Cooperative Wildlife Collection); TNHC, Texas Natural History Collections; UF, University of Florida, Florida Museum of Natural History; UMSNH, Michoacan University of Saint Nicholas of Hidalgo; UT, University of Tennessee Etnier Ichthyological Collection; YPM, Yale Peabody Museum of Natural History; YFTC, Yale Fish Tissue Collection. Species abbreviations: Laqu, Lepomis aquilensis; Loua, Lepomis sp. Ouachita; Lsol, Lepomis solis; Lozk, Lepomis sp. Ozark; Lpel, Lepomis peltastes; Lmeg, Lepomis megalotis. NA, not available.
Supplementary Table S3
List of specimens used in morphological analyses, number of specimens (N), and GPS coordinates of sampling locations. Species identification was made according to the river basin or system, following the results of our IQ-TREE analysis of the concatenated ddRAD loci data. Some of the specimens identified as hybrids may represent genetically pure individuals of one of the parental species. Museum catalog abbreviations: ANSP, Academy of Natural Sciences of Philadelphia; AUM, Auburn University Natural History Museum; CMNFI, Canadian Museum of Nature; FMNH, Field Museum of Natural History; INHS, Illinois Natural History Survey; JFBM, Bell Museum of Natural History; KU, University of Kansas Biodiversity Institute; MMNS, Mississippi Museum of Natural Science; MPM, Milwaukee Public Museum; NYSM, New York State Museum; OSUM, Ohio State University, Museum of Biological Diversity; MRNF, Ministry of the Natural Resources and Quebecois Fauna; ROM, Royal Ontario Museum; TNHC, Texas Natural History Collections; UF, University of Florida, Florida Museum of Natural History; UMMZ, University of Michigan Museum of Zoology; USNM, National Museum of Natural History, Smithsonian Institution; UT, University of Tennessee David A. Etnier Ichthyological Collection; UWZM, University of Wisconsin Zoological Museum; YPM, Yale Peabody Museum of Natural History. NA, not available.
Supplementary Table S4
Models tested in the Fastsimcoal2 analysis for three populations characterized by genetic admixture with the largest geographic range. Parameter estimates of the highest estimated log-likelihood (MaxEstLhood) for each model are shown. Best-fit model for each population is highlighted. Model parameter abbreviations correspond to those shown in Supplementary Fig. S5. NA, not applicable.
Supplementary Table S5
Fastsimcoal2 best-fit models for three populations characterized by genetic admixture with the largest geographic range and the 95% confidence interval of each parameter from 100 parametric bootstrap replications. Model parameter abbreviations correspond to those shown in Supplementary Fig. S5 and model names correspond to those shown in Supplementary Fig. S5 and Supplementary Table S4. NA, not applicable.
Supplementary Table S6
Means (bold) and 95% confidence intervals of the means (in parentheses) for morphometric and meristic traits that vary among species in the Lepomis megalotis complex, as identified by Tukey's honestly significant difference test (p < 0.001).
Supplementary Table S7
Categorical morphological traits and the number of specimens for species in the Lepomis megalotis complex for each category.