Introgression and species delimitation in the longear sunfish Lepomis megalotis (Teleostei: Percomorpha: Centrarchidae)

Citation

Kim, Daemin; Bauer, Bruce; Near, Thomas (2021), Introgression and species delimitation in the longear sunfish Lepomis megalotis (Teleostei: Percomorpha: Centrarchidae), Dryad, Dataset, https://doi.org/10.5061/dryad.dbrv15f05

Abstract

Introgression and hybridization are major impediments to genomic-based species delimitation because many implementations of the multispecies coalescent framework assume no gene flow among species. The sunfish genus Lepomis, one of the world’s most popular groups of freshwater sport fish, has a complicated taxonomic history. The results of ddRAD phylogenomic analyses do not provide support for the current taxonomy that recognizes two species, L. megalotis and L. peltastes, in the L. megalotis complex. Instead, evidence from phylogenomics and phenotype warrants recognizing six relatively ancient evolutionary lineages in the complex. The introgressed and hybridizing populations in the L. megalotis complex are localized and appear to be the result of secondary contact or rare hybridization events between non-sister species. Segregating admixed populations from our multispecies coalescent analyses identifies six species with moderate to high genealogical divergence, whereas including admixed populations drives all but one lineage below the species threshold of genealogical divergence. Segregation of admixed individuals also helps reveal phenotypic distinctiveness among the six species in morphological traits used by ichthyologists to discover and delimit species over the last two centuries. Our protocols allow for the identification and accommodation of hybridization and introgression in species delimitation. Genomic-based species delimitation validated with multiple lines of evidence provides a path towards the discovery of new biodiversity and resolving long-standing taxonomic problems. 

Usage Notes

BPP_Lmeg_pure.ctl

‘Control’ input for the BPP analysis of the dataset that excludes specimens showing genomic admixture.

BPP_Lmeg_pure.Imap.txt

‘Imapfile’ input for the BPP analysis of the dataset that excludes specimens showing genomic admixture.

BPP_Lmeg_pure_loci.prn

‘seqfile’ input for the BPP analysis of the dataset that excludes specimens showing genomic admixture.

BPP_Lmeg_pure_Theta-Tau.csv

Mean and 95% CI of the theta and tau parameter estimates from the BPP analysis of the dataset that excludes specimens showing genomic admixture.

BPP_Lmeg_admix.ctl

‘Control’ input for the BPP analysis of the dataset that includes specimens showing genomic admixture.

BPP_Lmeg_admix.Imap.txt

‘Imapfile’ input for the BPP analysis of the dataset that includes specimens showing genomic admixture.
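
The two Imap files can be compared to list the specimens left out of the admixture-free dataset. The sketch below assumes the usual BPP Imap layout (one specimen per line, an individual label followed by its population label, separated by whitespace) and assumes that the admixture-free specimen set is a subset of the admixed one; neither point is stated in these notes. The function names and sample labels are hypothetical.

```python
def parse_imap(text):
    """Parse BPP-style Imap text into a dict {individual: population}."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:  # skip blank or incomplete lines
            continue
        mapping[fields[0]] = fields[1]
    return mapping


def dropped_specimens(admix_text, pure_text):
    """Return, sorted, the individuals in the admix map that are absent from the pure map."""
    admix = parse_imap(admix_text)
    pure = parse_imap(pure_text)
    return sorted(set(admix) - set(pure))


# Hypothetical contents; the real files would be read with open(path).read().
admix_example = "ind1 Lmeg\nind2 Lmeg\nind3 Lsol\n"
pure_example = "ind1 Lmeg\nind3 Lsol\n"
print(dropped_specimens(admix_example, pure_example))
```

With the real files, the returned labels would be the specimens treated as admixed, provided the subset assumption holds.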

BPP_Lmeg_admix_loci.prn

‘seqfile’ input for the BPP analysis of the dataset that includes specimens showing genomic admixture.

BPP_Lmeg_admix_Theta-Tau.csv

Mean and 95% CI of the theta and tau parameter estimates from the BPP analysis of the dataset that includes specimens showing genomic admixture.
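
The theta and tau estimates in the two Theta-Tau tables are the kind of input used for the genealogical divergence comparison described in the abstract. As a sketch, assuming that comparison uses the genealogical divergence index of Jackson et al. (2017), gdi = 1 - exp(-2*tau/theta), with the commonly cited heuristic cut-offs of 0.2 and 0.7, the calculation could look like this. The function names and numeric inputs are illustrative and are not values from the dataset.

```python
import math


def gdi(theta, tau):
    """Genealogical divergence index for one population.

    theta: population size parameter of the focal population
    tau:   divergence time between the focal population and its sister
    Both are in the same units (expected substitutions per site).
    """
    if theta <= 0 or tau < 0:
        raise ValueError("theta must be positive and tau non-negative")
    return 1.0 - math.exp(-2.0 * tau / theta)


def classify(value, low=0.2, high=0.7):
    """Apply the heuristic cut-offs (assumed here, not taken from these notes)."""
    if value < low:
        return "single species"
    if value > high:
        return "distinct species"
    return "ambiguous"


# Illustrative values only.
for theta, tau in [(0.01, 0.0005), (0.002, 0.001), (0.002, 0.004)]:
    value = gdi(theta, tau)
    print(f"theta={theta} tau={tau} gdi={value:.3f} -> {classify(value)}")
```

Because gdi depends on the ratio tau/theta, a larger theta estimate lowers the index for the same divergence time.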

MEGOZKMO_MSFS.obs

Multidimensional site frequency spectrum (MSFS) input for the Fastsimcoal2 analysis of the dataset containing specimens of Lepomis megalotis, L. sp. Ozark, and specimens showing genomic admixture between the two species.

MEGPELMP_MSFS.obs

Multidimensional site frequency spectrum (MSFS) input for the Fastsimcoal2 analysis of the dataset containing specimens of Lepomis megalotis, L. peltastes, and specimens showing genomic admixture between the two species.

MEGSOLMS_MSFS.obs

Multidimensional site frequency spectrum (MSFS) input for the Fastsimcoal2 analysis of the dataset containing specimens of Lepomis megalotis, L. solis, and specimens showing genomic admixture between the two species.
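
A quick consistency check on a `_MSFS.obs` file can catch truncated downloads. The sketch below assumes the conventional Fastsimcoal2 layout for these files: a first line announcing the number of observations, a second line giving the number of demes followed by each deme's sample size, and a third line holding the flattened spectrum. Under that assumption the spectrum should contain the product of (sample size + 1) over demes. The function name and the toy input are hypothetical.

```python
from math import prod


def parse_msfs(text):
    """Parse a single-observation MSFS text block.

    Returns (sample_sizes, counts) and raises ValueError if the number
    of spectrum entries does not match the declared sample sizes.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 3:
        raise ValueError("expected a header line, a sizes line, and a data line")
    sizes_fields = lines[1].split()
    n_demes = int(sizes_fields[0])
    sample_sizes = [int(x) for x in sizes_fields[1:1 + n_demes]]
    if len(sample_sizes) != n_demes:
        raise ValueError("fewer sample sizes than declared demes")
    counts = [float(x) for x in lines[2].split()]
    expected = prod(n + 1 for n in sample_sizes)
    if len(counts) != expected:
        raise ValueError(f"expected {expected} entries, found {len(counts)}")
    return sample_sizes, counts


# Toy input: two demes with 1 and 2 sampled gene copies, so (1+1)*(2+1) = 6 entries.
toy = (
    "1 observations. No. of demes and sample sizes are on next line\n"
    "2 1 2\n"
    "0 3 1 2 0 0\n"
)
sizes, counts = parse_msfs(toy)
print(sizes, len(counts))
```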

MEGOZKMO.ctl

‘Control’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis, L. sp. Ozark, and specimens showing genomic admixture between the two species.

MEGOZKMO.gphocs

‘seq-file’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis, L. sp. Ozark, and specimens showing genomic admixture between the two species.

MEGPELMP.ctl

‘Control’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis, L. peltastes, and specimens showing genomic admixture between the two species.

MEGPELMP.gphocs

‘seq-file’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis, L. peltastes, and specimens showing genomic admixture between the two species.

MEGSOLMS.ctl

‘Control’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis, L. solis, and specimens showing genomic admixture between the two species.

MEGSOLMS.gphocs

‘seq-file’ input for the G-PhoCS analysis of the dataset containing specimens of Lepomis megalotis, L. solis, and specimens showing genomic admixture between the two species.

Kim_et_al_Lepomis_megalotis_complex_Morphology_data.csv

Table with all measured morphological traits of the Lepomis megalotis complex.
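
Per-species summaries of a trait can be computed directly from the morphology table. The sketch below is a minimal example under stated assumptions: the column names `species` and `lateral_line_scales` are hypothetical (the actual headers are not listed in these notes), missing values are assumed to be blank or `NA`, and the interval uses a normal approximation (1.96 standard errors) rather than whatever method the authors applied.

```python
import csv
import io
import math
import statistics
from collections import defaultdict


def summarize(csv_text, group_col, value_col):
    """Return {group: (mean, ci_low, ci_high)} for one numeric column."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        cell = (row[value_col] or "").strip()
        if cell in ("", "NA"):  # skip missing measurements
            continue
        groups[row[group_col]].append(float(cell))
    summary = {}
    for name, values in groups.items():
        mean = statistics.mean(values)
        if len(values) > 1:
            half = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
        else:
            half = float("nan")  # an interval is undefined for a single value
        summary[name] = (mean, mean - half, mean + half)
    return summary


# Hypothetical rows; the real table would be read with open(path).read().
example = (
    "species,lateral_line_scales\n"
    "Lmeg,38\nLmeg,40\nLmeg,42\n"
    "Lpel,36\nLpel,NA\n"
)
for species, (mean, low, high) in summarize(example, "species", "lateral_line_scales").items():
    print(species, round(mean, 2), round(low, 2), round(high, 2))
```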

SNAPP_Lmeg_divtime.xml

Input for the divergence time analysis using SNAPP.

SNAPP_Lmeg_speciestree.xml

Input for the species-tree estimation using SNAPP.

SNMF_Lmeg_only.ugeno

Dataset for the population structure analysis with a sparse non-negative matrix factorization (snmf) algorithm.

Treemix_Lmeg.snps.hdf5

Sequence data for the TreeMix analysis.

Treemix_Lmeg_imap.txt

‘imap’ input for the TreeMix analysis.

IQtree_Lepomis_ddRAD.nex

Concatenated nexus alignment for the dataset containing individuals of all Lepomis that were examined in the phylogenetic analyses.

Kim_et_al_Supplementary_MatMeth.docx

Supplementary Materials and Methods

Supplementary Figure S1

Map presenting inland water bodies discussed in the manuscript. Numbers indicate: 1, Cuatrociénegas Basin; 2, Rio Grande; 3, Colorado River; 4, Brazos River; 5, Sabine River; 6, Mississippi River; 7, Red River; 8, Ouachita River; 9, Little River; 10, Kiamichi River; 11, Lake Saint John; 12, Yazoo River; 13, Lake Providence; 14, Arkansas River; 15, White River; 16, Fourche La Fave River; 17, Canadian River; 18, Neosho River; 19, Verdigris River; 20, Hatchie River; 21, Tennessee River; 22, Elk River; 23, Emory River; 24, Clinch River; 25, Ohio River; 26, Kentucky River; 27, Scioto River; 28, Meramec River; 29, Missouri River; 30, Osage River; 31, Illinois River (upper Mississippi River system); 32, Lake George; 33, Iowa Fairport Fish Hatchery; 34, St. Lawrence River; 35, Lake Pontchartrain; 36, Mobile River; 37, Choctawhatchee River; 38, Altamaha River.

Supplementary Figure S2

Maximum likelihood phylogeny for the Lepomis megalotis complex and its sister group, L. marginatus, inferred from IQ–TREE analysis of the concatenated ddRAD loci dataset. Bootstrap support values are shown next to nodes. Figure 2 is a simplified version of this figure.

Supplementary Figure S3

Population structures inferred from snmf analysis for the Lepomis megalotis complex (K=7, 8, 9, 11) are aligned according to IQ–TREE phylogeny.

Supplementary Figure S4

G-PhoCS analysis results for three populations characterized by genetic admixture with the largest geographic range: a) Lepomis solis x L. megalotis in the middle and upper Tennessee River; b) L. peltastes x L. megalotis in the upper Ohio River; and c) L. sp. Ozark x L. megalotis in the Osage River. The 95% highest posterior density for each parameter (theta, tau, or migration) is shown in parentheses next to the mean. Arrows indicate direction of gene flow. Width of boxes is proportional to the mean of the theta parameter estimate.

Supplementary Figure S5

Models tested in Fastsimcoal2 analyses for three populations characterized by genetic admixture with the largest geographic range: a) Lepomis solis x L. megalotis in the middle and upper Tennessee River; b) L. peltastes x L. megalotis in the upper Ohio River; and c) L. sp. Ozark x L. megalotis in the Osage River. Best-fit model for each population is highlighted with a red box.

Supplementary Table S1

List of described species that were synonymized with Lepomis megalotis prior to the present study. Current status follows the results of the present study.

Supplementary Table S2

List of specimens used in molecular analyses, GPS coordinates of sampling locations, and voucher and tissue catalog numbers. Specimen names correspond to IQ-TREE phylogeny shown in Supplementary Fig. S2. Museum catalog abbreviations: INHS, Illinois Natural History Survey; MMNS, Mississippi Museum of Natural Science; RNYFH, New York State Fish Hatchery at Randolph; SLU, Southeastern Louisiana University Vertebrate Museum; TCWC, Biodiversity Research and Teaching Collections (formerly Texas Cooperative Wildlife Collection); TNHC, Texas Natural History Collections; UF, University of Florida, Florida Museum of Natural History; UMSNH, Michoacan University of Saint Nicholas of Hidalgo; UT, University of Tennessee Etnier Ichthyological Collection; YPM, Yale Peabody Museum of Natural History; YFTC, Yale Fish Tissue Collection. Species abbreviations: Laqu, Lepomis aquilensis; Loua, Lepomis sp. Ouachita; Lsol, Lepomis solis; Lozk, Lepomis sp. Ozark; Lpel, Lepomis peltastes; Lmeg, Lepomis megalotis. NA, not available.

Supplementary Table S3

List of specimens used in morphological analyses, number of specimens (N), and GPS coordinates of sampling locations. Species identification was made according to the river basin or system, following the results of our IQ-TREE analysis of the concatenated ddRAD loci data. Some of the specimens identified as hybrids may represent genetically pure individuals of one of the parental species. Museum catalog abbreviations: ANSP, Academy of Natural Sciences of Philadelphia; AUM, Auburn University Natural History Museum; CMNFI, Canadian Museum of Nature; FMNH, Field Museum of Natural History; INHS, Illinois Natural History Survey; JFBM, Bell Museum of Natural History; KU, University of Kansas Biodiversity Institute; MMNS, Mississippi Museum of Natural Science; MPM, Milwaukee Public Museum; NYSM, New York State Museum; OSUM, Ohio State University, Museum of Biological Diversity; MRNF, Ministry of the Natural Resources and Quebecois Fauna; ROM, Royal Ontario Museum; TNHC, Texas Natural History Collections; UF, University of Florida, Florida Museum of Natural History; UMMZ, University of Michigan Museum of Zoology; USNM, National Museum of Natural History, Smithsonian Institution; UT, University of Tennessee David A. Etnier Ichthyological Collection; UWZM, University of Wisconsin Zoological Museum; YPM, Yale Peabody Museum of Natural History. NA, not available.

Supplementary Table S4

Models tested in the Fastsimcoal2 analysis for three populations characterized by genetic admixture with the largest geographic range. Parameter estimates of the highest estimated log-likelihood (MaxEstLhood) for each model are shown. Best-fit model for each population is highlighted. Model parameter abbreviations correspond to those shown in Supplementary Fig. S5. NA, not applicable.

Supplementary Table S5

Fastsimcoal2 best-fit models for three populations characterized by genetic admixture with the largest geographic range and the 95% confidence interval of each parameter from 100 parametric bootstrap replications. Model parameter abbreviations correspond to those shown in Supplementary Fig. S5 and model names correspond to those shown in Supplementary Fig. S5 and Supplementary Table S4. NA, not applicable.

Supplementary Table S6

Means (bold) and 95% confidence intervals of the means (in parentheses) for morphometric and meristic traits that vary among species in the Lepomis megalotis complex, as identified by Tukey's honestly significant difference test (p < 0.001).

Supplementary Table S7

Categorical morphological traits and the number of specimens for species in the Lepomis megalotis complex for each category. 

Funding

Yale Peabody Museum of Natural History
