Ecological divergence and the history of gene flow in the Nearctic milksnakes (Lampropeltis triangulum complex)
Data files
Nov 23, 2021 version files 13.87 MB
- Data_D1_List_of_samples_and_localities_used_for_this_research.txt
- Data_D2_Fasta_files_generated_for_Lampropeltis_triangulum_L.gentilis_and_L.elapsoides_from_ipyrad.zip
- Data_D3_VCF_file_generated_for__Lampropeltis_triangulum_L._gentilis_and_L._elapsoides_from_ipyrad_filtered.vcf
- Data_D5_Priors_.txt
- Data_D6_DAPC_TESS_assignments_4_taxa_no_names_mtDNA_Assignments.txt
- Data_D7_Vert_Net_Niche_Modeling_Data_Code.zip
Abstract
Many phylogeographic studies on species with large ranges have found genetic-geographic structure associated with changes in habitat and physical barriers preventing or reducing gene flow. These interactions with the geographic space, contemporary and historical climate, and biogeographic barriers may have complex effects on population genetic structure and speciation. While allopatric speciation at biogeographic barriers is considered the primary mechanism for generating species, more recently it has been shown that parapatric modes of divergence may be equally or even more common. With genomic data and better modeling capabilities we can more clearly define causes of speciation in relation to biogeography and migration between lineages, the location of hybrid zones with respect to the ecology of parental lineages, and differential introgression of genes between taxa. Here we examine the origins of three Nearctic milksnakes (Lampropeltis elapsoides, L. triangulum, and L. gentilis) using genomic-scale data to better understand the diversification of these taxa. Results from artificial neural networks show that a mix of a strong biogeographic barrier, environmental changes, and physical space has driven geographic structure in these taxa. These results underscore conspicuous environmental changes that occur as sister taxa diverged near the Great Plains into the forested regions of the Eastern Nearctic. This area has been recognized as a region for turnover for many vertebrate species but as we show here, the contemporary boundary does not isolate the sister species, L. triangulum and L. gentilis. These two species likely formed in the mid-Pleistocene and have remained partially reproductively isolated over much of this time, exchanging fewer than one migrant/generation with hybrid zones showing differential introgression of loci. We also show that when L. triangulum and L. gentilis are each in contact with the much older L. elapsoides, some limited gene flow has occurred. Given the strong agreement between nuclear and mtDNA genomes, along with estimates of fundamental niche, we suggest that all three lineages should continue to be recognized as unique species. Furthermore, this work emphasizes the importance of considering parapatric modes of divergence and differential allelic introgression over a complex landscape when considering mechanisms of speciation.
Methods
Electronic Supplementary Text
Text S1 – Supplementary information providing additional detail for computational methods used.
Electronic Supplementary Figures
Fig. S1 – Location of samples for grouping into numbered (and colored) populations for addressing longitudinal allele surfing.
Fig. S2 – (A) Results from DAPC showing Bayesian information criteria (BIC) against number of clusters, (B) discriminant analysis of principal components (DAPC) scatter plots, and (C) DAPC membership probabilities for each individual sample by species.
Fig. S3 – Correlation of admixture coefficients between TESS3r and SNMF for Lampropeltis gentilis, L. triangulum, and L. elapsoides.
Fig. S4 – Pie charts over a map of North America (partial) showing admixture coefficients generated from SNMF for the milksnakes examined here.
Fig. S5 – DAPC estimates of population structure considering lowest BIC over different GBS sequence assembly filtering strategies with average missing data per individual, number of individuals, and number of loci ranging from 19–48%, 129–159, and 137–3391, respectively.
Fig. S6 – TESS3r estimates of population structure considering lowest BIC over different GBS sequence assembly filtering strategies with average missing data per individual, number of individuals, and number of loci ranging from 19–48%, 129–159, and 137–3391, respectively.
Fig. S7 – Predictions of group membership given ancestral coefficients, where Lampropeltis gentilis (green), L. triangulum (blue), L. annulata (purple) and L. elapsoides (red) are supported relative to those coefficients estimated from TESS3r equally supporting L. gentilis, L. triangulum, or L. elapsoides (yellow) at the southern end of the Mississippi River in Louisiana.
Fig. S8 – The four groups of milksnakes predicted by DAPC, red = Lampropeltis triangulum, blue = L. elapsoides, green = L. gentilis, purple = L. annulata, and red dots with yellow halos showing individuals predicted as L. triangulum but with L. gentilis mtDNA.
Fig. S9 – Map across all milksnake demes showing genetic diversity estimated using EEMS.
Fig. S10 – Fit of observed data to different historical demographic models over principal component space.
Fig. S11 – Results using the 40 loci between Lampropeltis triangulum and L. elapsoides and the nine loci between L. triangulum and L. gentilis that fall below neutral cline widths, showing (A) DAPC spatial predictions for two groups, (B) discriminant function distributions of those groups, and (C) distributions of ancestral coefficients from TESS3r.
Fig. S12 – Estimation of cline width and FST for each locus for both species-pair comparisons: Lampropeltis gentilis x L. triangulum and L. triangulum x L. elapsoides.
Fig. S13 – Location of mtDNA clades and GBS groups relative to the 50% admixture clines (yellow = cline between Lampropeltis gentilis and L. triangulum; pink = cline between L. triangulum and L. elapsoides) estimated using GBS data.
Electronic Supplementary Data
Data. D1 – List of samples and localities used for this research.
Data. D2 – Fasta files generated for Lampropeltis triangulum, L. gentilis, L. annulata, and L. elapsoides from ipyrad.
Data. D3 – Filtered VCF file generated for Lampropeltis triangulum, L. gentilis, L. annulata, and L. elapsoides from ipyrad.
Data. D4 – R code for estimating lineage structure, cline widths from admixture proportions, and cline widths for loci, and for running redundancy analyses (RDA) and artificial neural networks (ANN) to examine the effect of space and ecology on genetic structure of Lampropeltis triangulum, L. gentilis, L. annulata, and L. elapsoides.
Data. D5 – Priors used for simulation of historical demographic parameters in PipeMaster for Lampropeltis triangulum, L. gentilis and L. elapsoides.
Data. D6 – Milksnake genetic assignment file results from DAPC and TESS3r.
Data. D7 – Raw VertNet (http://vertnet.org) downloads, R code, and cleaned and partitioned localities for L. triangulum, L. gentilis, and L. elapsoides used in ecological niche modeling. Note: R scripts to run this are a separate file and not zipped with these datasets.
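For readers who want to inspect the filtered VCF (Data D3) before running any analysis, the following is a minimal Python sketch that tallies the fraction of missing genotype calls per sample, the quantity summarized in Figs. S5 and S6. It is not the authors' code. It assumes a tab-delimited VCF in which GT is the first FORMAT subfield and missing alleles are written as "."; the function name and the toy records are hypothetical and exist only for illustration.

```python
import re
from typing import Dict, Iterable


def missing_fraction_per_sample(vcf_lines: Iterable[str]) -> Dict[str, float]:
    """Return, for each sample, the fraction of sites with a missing genotype."""
    samples = []   # sample names taken from the #CHROM header line
    missing = []   # per-sample count of sites with a missing call
    n_sites = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow the 9 fixed columns
            missing = [0] * len(samples)
            continue
        n_sites += 1
        for i, call in enumerate(fields[9:]):
            gt = call.split(":")[0]            # assumption: GT is listed first
            alleles = re.split(r"[/|]", gt)    # unphased "/" or phased "|"
            if "." in alleles:                 # any "." allele counts as missing
                missing[i] += 1
    if n_sites == 0:
        return {name: 0.0 for name in samples}
    return {name: count / n_sites for name, count in zip(samples, missing)}


# Hypothetical two-site, two-sample VCF used only to illustrate the function.
TOY_VCF = """##fileformat=VCFv4.0
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB
locus_1\t10\t.\tA\tG\t13\tPASS\t.\tGT\t0/1\t./.
locus_2\t25\t.\tC\tT\t13\tPASS\t.\tGT\t0/0\t1/1
"""

result = missing_fraction_per_sample(TOY_VCF.splitlines())
print(result)
```

To apply the same function to the archived file, pass an open file handle for the Data D3 VCF in place of the toy lines; whether that file matches the layout assumed here should be checked against its header first.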