Mitochondrial and nuclear introgression among closely related taxa can greatly complicate the process of determining their phylogenetic relationships. In the Central Highlands of North America, many fish taxa have undergone introgression; in this study, we demonstrate the existence of an unusual introgression event in the Etheostoma zonale species group. We used one mitochondrial and seven nuclear loci to determine the relationships among the taxa within the E. zonale group and their degree of differentiation. We found evidence of multiple divergent populations within each species; much of the divergence within species took place during the Pleistocene. We also found evidence of a previously unknown cryptic species in the Upper Tennessee River that diverged from the remainder of the group during the Pliocene and has undergone mitochondrial and nuclear introgression with E. zonale, in an apparent process of speciation reversal. Finally, we examined how the choice of recombination test used to remove the signal of recombination from nuclear loci affects the phylogenetic placement of this introgressed lineage in our species tree analyses.
Cytochrome b matrices
zonalecytballindividuals.nex gives the cytochrome b sequence for each individual in the study. zonalecytbmatrix.nex gives the sequence of each cytochrome b haplotype in the study, and is the matrix used to produce Figure 3.
cytochromebmatrices.zip
Nuclear Loci - Matrices with sequences for all individuals
Sequence matrices for each of the seven nuclear loci in the study. Each matrix includes the full length of each locus sequenced, prior to any removal of portions of the locus due to positive tests for recombination, and also prior to any removal of gaps. Each matrix contains the sequences of both alleles found in each individual that was sequenced at that locus. The name of each sequence is the name of the individual specimen that was sequenced.
nuclearlociallindividualsmatrices.zip
Nuclear Loci - Matrices with sequences for each distinct allele, prior to removal of gaps and tests for recombination
Sequence matrices for each of the seven nuclear loci in the study. Each matrix includes the full length of each locus sequenced, prior to any removal of portions of the locus due to positive tests for recombination, and also prior to any removal of gaps. Each matrix contains the sequence of each distinct allele found at that locus. The name of each sequence is the name of the allele that sequence represents.
nuclearlociinitialmatrices.zip
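The relationship between the all-individuals matrices and the distinct-allele matrices can be illustrated with a minimal Python sketch that collapses identical sequences into a single entry. This is not the procedure used in the study. The function name, the allele naming scheme (allele1, allele2, ...) and the example labels are hypothetical, and sequences are compared as literal strings, so ambiguity codes and gaps are not treated specially.

```python
def collapse_to_distinct_alleles(individual_seqs, prefix="allele"):
    """Collapse identical sequences into distinct alleles.

    individual_seqs maps a sequence label to its aligned sequence.
    Returns (alleles, carriers): alleles maps a hypothetical allele name
    to its sequence; carriers maps that name to the labels that carry it.
    """
    allele_of = {}  # sequence -> allele name, in order of first appearance
    carriers = {}   # allele name -> list of sequence labels
    for label, seq in individual_seqs.items():
        key = seq.upper()  # compare case-insensitively
        if key not in allele_of:
            allele_of[key] = f"{prefix}{len(allele_of) + 1}"
            carriers[allele_of[key]] = []
        carriers[allele_of[key]].append(label)
    alleles = {name: seq for seq, name in allele_of.items()}
    return alleles, carriers


# Hypothetical labels: two specimens, two sequences each.
example = {
    "specimenA_1": "ACGT",
    "specimenA_2": "ACGA",
    "specimenB_1": "acgt",
    "specimenB_2": "ACGT",
}
alleles, carriers = collapse_to_distinct_alleles(example)
```

In this toy input, three of the four sequences are identical apart from case, so the result contains two distinct alleles.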
Nuclear Loci - Matrices with sequences for each distinct allele, after removal of gaps and tests for recombination
Sequence matrices for each of the seven nuclear loci in the study. Each matrix includes only the part of the locus remaining after removing any portions that resulted in positive tests for recombination, and after removing gaps found in more than half of all alleles. Each matrix contains the sequence of each distinct allele found at that locus. The name of each sequence is the name of the allele that sequence represents. These matrices are the ones used to create the trees in Figure 4 and Figure S1. zonalenedd4lfinalmatrixsmallerportion.nex is the matrix used to produce Figure S3.
nuclearlocifinalmatrices.zip
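The gap-removal step described for the final matrices can be sketched as follows. This is a minimal Python illustration, not the authors' actual workflow. It assumes that gaps are coded as "-" and that "gaps found in more than half of all alleles" means dropping any alignment column in which more than 50% of alleles have a gap. The function name and the example alleles are hypothetical.

```python
def drop_gappy_columns(alleles, gap_chars="-", max_gap_fraction=0.5):
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction.

    alleles maps an allele name to its aligned sequence; all sequences
    must have the same length. Columns with a gap in exactly half of the
    alleles are kept, since only "more than half" triggers removal.
    """
    names = list(alleles)
    seqs = [alleles[name] for name in names]
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all sequences must have the same aligned length")
    # Indices of columns to keep, in their original order.
    keep = [
        i for i in range(length)
        if sum(seq[i] in gap_chars for seq in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(alleles[name][i] for i in keep) for name in names}


# Hypothetical four-allele alignment; the third column is a gap in 3 of 4 alleles.
example = {
    "allele1": "AC-GT",
    "allele2": "AC-GT",
    "allele3": "ACTG-",
    "allele4": "A--GT",
}
filtered = drop_gappy_columns(example)
```

Only the third column is removed here. The second and fifth columns each have a gap in one of four alleles and are retained.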
Starbeast XML input files for species tree analyses
These are the XML files used to produce, in *BEAST, the seven species trees shown in Figure 5 and Figure S2. Figure 5a was produced using the file zonalestarbeastfullintrons30.xml, Figure 5b with zonalestarbeast30.xml, Figure 5c with zonalestarbeastnuclear30.xml, Figure S2a with zonalestarbeasthalfintrons30.xml, Figure S2b with zonalestarbeastdnasp30.xml, Figure S2c with zonalestarbeastcytbcfzonale.xml, and Figure S2d with zonalestarbeastdnaspnuclear30.xml. The codes used to designate individuals in these files may be converted to the names of the individuals listed in Appendices S2, S4 and S5 using the files tableofcytochromebsequences.xlsx and tableofnuclearsequences.xlsx.
Starbeastxmlfiles.zip
Beast XML input files for EBSP analyses
The XML input files for the Extended Bayesian Skyline Plots presented in Figure 6. ebspuppertennessee.xml is the input file for Figure 6a, ebspuppertennesseenuclear.xml is the input file for Figure 6b, ebspsfholston.xml is the input file for Figure 6c, and ebspclinch.xml is the input file for Figure 6d. The codes used to designate individuals in these files may be converted to the names of each individual listed in Appendices S2, S4 and S5 using the files tableofcytochromebsequences.xlsx and tableofnuclearsequences.xlsx.
ebspxmlfiles.zip
Table of Cytochrome b Sequences
A table listing the name of each individual sequenced for cytochrome b, its collection number, tissue collection code, collection locality, haplotype, the GenBank accession number of its cytochrome b sequence, and the code for that individual's cytochrome b sequence used in the *BEAST and EBSP XML files.
tableofcytochromebsequences.xlsx
Table of Nuclear Locus Sequences
A table listing the name of each individual sequenced for any of the seven nuclear loci, its collection number, tissue collection code, collection locality, the haplotype of each nuclear locus sequenced for that individual, the GenBank accession number of each nuclear locus sequenced for that individual, and the codes for that individual's nuclear locus sequences used in the *BEAST and EBSP XML files.
tableofnuclearsequences.xlsx