The eastern subterranean termite, Reticulitermes flavipes, currently inhabits previously glaciated regions of the northeastern U.S., as well as the unglaciated southern Appalachian Mountains and surrounding areas. We hypothesized that Pleistocene climatic fluctuations have influenced the distribution of R. flavipes, and thus the evolutionary history of the species. We estimated contemporary and historical geographic distributions of R. flavipes by constructing Species Distribution Models (SDM). We also inferred the evolutionary and demographic history of the species using mitochondrial (cytochrome oxidase I and II) and nuclear (endo‐beta‐1,4‐glucanase) DNA sequence data. To do this, genetic populations were delineated using Bayesian spatial‐genetic clustering, competing hypotheses about population divergence were assessed using approximate Bayesian computation (ABC), and changes in population size were estimated using Bayesian skyline plots. SDMs identified areas in the north with suitable habitat during the transition from the Last Interglacial to the Last Glacial Maximum, as well as an expanding distribution from the mid‐Holocene to the present. Genetic analyses identified three geographically cohesive populations, corresponding with northern, central, and southern portions of the study region. Based on ABC analyses, divergence between the Northern and Southern populations was the oldest, estimated to have occurred 64.80 thousand years ago (kya), which corresponds with the timing of available habitat in the north. The Central and Northern populations diverged in the mid‐Holocene, 8.63 kya, after which the Central population continued to expand. Accordingly, phylogeographic patterns of R. flavipes in the southern Appalachians appear to have been strongly influenced by glacial‐interglacial climate change.
EnvironmentalFactors_EasternUS
Raster maps of present-day and historical environmental factors for the Eastern United States. The environmental factors are the outcome of factor analysis on 19 bioclimatic variables (Worldclim 1.4), resulting in four factors capturing: temperature range, summer temperature, dry- and wet-season precipitation. The temporal scale includes the present (1960-1990), the Mid-Holocene (~6,000 years ago), the Last Glacial Maximum (~22,000 years ago), and the Last Inter-glacial (~120,000-140,000 years ago). The spatial scale covers a large part of the Eastern United States (28-41 N, 74-91 W).
Species distribution modeling projections
Species distribution models (SDMs) were constructed using the ‘biomod2’ package in R. We first ran a rectilinear surface range envelope model, and then, from outside the area predicted as suitable habitat, we picked 20 independent sets of 100 pseudo-absence points, each of which were combined with the same 91 presence records. Four modeling algorithms were run: artificial neural networks, generalized boosted models or boosted regression trees, random forest, and maximum entropy. We used 5 cross-validation runs per algorithm, for a total of 400 runs (4 algorithms x 5 cross-validations x 20 datasets), with 5,000 iterations per run. To assess model performance, 75% of the data were used for training, with 25% set aside as "out-of-bag" test data. To maximize the accuracy of presence/absence classification, we used the True Skill Statistic (TSS = sum of sensitivity and specificity – 1), where SDMs with mean TSS above 0.2 were retained. We then used the ensemble framework to obtain a weighted average of all SDMs, where SDMs were weighted according to TSS values. Projections: present-day SDMs were based on mean climatological data spanning 1960–1990, and historical distributions were modeled for the Mid-Holocene (~6,000 years ago), the Last Glacial Maximum (~22,000 years ago), and the Last Inter-glacial (~120,000–140,000 years ago).
BIOMODoutput_projections.zip
DIYABC_files
DIYABC approximate Bayesian computation input file and settings (scenarios and parameters)
AppendicesS1-S8
Supplementary Methods: Appendix S1. Population sampling. Appendix S2. DNA isolation and genetic markers. Appendix S3. Construction of species distribution models. Appendix S4. Comparison of scenarios using approximate Bayesian computation. Supplementary Results: Appendix S5. Environmental factors used in species distribution models. Appendix S6. Genetic divergence, environment, and spatial structure. Appendix S7. Phylogeographic scenarios: error rates and parameter estimates. Appendix S8. Population size changes: standard and compound neutrality tests.
Approximate Bayesian computation - Posterior Probabilities and Error Rates
Posterior probabilities of phylogeographic scenarios in two-tier comparisons, type I (alpha, false positive) and type II (beta, false negative) error rates for "winning" scenarios (with highest posterior probabilities) in each step of each tier. The first tier has two steps (step 1: refugial scenarios; step 2: distributional shift scenarios). Tier 2 only has one step, but the winning scenarios from tier 1 are compared including or excluding a third scenario, vicariance. For the former (including vicariance), we computed type I and II error rates for all three scenarios, rather than just the winning scenario.
Posterior Probabilities - Error Rates.xlsx
BEAST_files
Input files for BEAST analyses. BEAST was run with (Ret) and without (Rf) the out-group Reticulitermes taxa.
BAPS_files
Input files for BAPS analyses, including a file that shows the mtDNA COI+COII locus haplotypes for each data point. To determine the number of geographically cohesive genetic groups of R. flavipes, we analyzed geo-referenced mtDNA sequences in BAPS v.6.0. We assessed values of K (i.e., the number of clusters) ranging from 2–20, with 10 replicate runs each.