Distinguishing between dispersal and vicariance: A novel approach using anti-tropical taxa across the fish Tree of Life

Cite this dataset

Ludt, William; Myers, Corinne (2021). Distinguishing between dispersal and vicariance: A novel approach using anti-tropical taxa across the fish Tree of Life [Dataset]. Dryad.


Aim: Anti-tropical taxa are species split by the tropics into disjunct northern and southern populations. These distributions occur throughout the Tree of Life, but the mechanisms proposed to drive this pattern are debated and generally fit into two categories: dispersal and vicariance. Here we quantitatively test the prevalence of dispersal and vicariance as plausible drivers of anti-tropical marine distributions using intra-specific anti-tropical marine fishes as a model system.
Location: Primarily Indo-Pacific.
Major Taxa Studied: Marine fishes.
Methods: To test between dispersal and vicariance in latitudinally disjunct marine fishes, we used an ecological niche modeling framework to predict the spatiotemporal suitability of tropical habitats during contemporary and glacial time periods. Three different model configurations were used per species to test for: (1) the presence of contemporary tropical suitable habitat for northern populations, (2) the same for southern populations, and (3) the presence of tropical suitable habitat during the last glacial maximum (LGM) for the entire species. These models were examined in an evolutionary context to determine if there was any phylogenetic signal in biogeographic predictions. Additionally, we tested if life history traits could account for biogeographic predictions.
Results: Our analyses resulted in 87 strongly supported models for 29 anti-tropical fishes across the fish Tree of Life (northern population model, southern population model, and full species model for each taxon). Regardless of thresholding approach, model projections consistently matched predictions of vicariance in 13 fishes and predictions of dispersal in 10 fishes. We failed to find any phylogenetic signal for anti-tropicality in general, or for dispersal and vicariant species specifically. Further, dispersal and vicariant tendencies were not found to be correlated with life history traits.
Main conclusions: These data quantitatively support both dispersal and vicariance as active mechanisms driving disjunct distributions in marine systems and suggest that they occur stochastically across the fish Tree of Life. This novel approach for examining dispersal and vicariance hypotheses supports the species-specific nature of biogeographic mechanisms structuring distributions and suggests that a “one-size-fits-all” prediction for current and future species’ responses to environmental change is unlikely to be informative.


Species and Distribution Datasets

            To test if dispersal or vicariance drives anti-tropical distributions in marine systems, we used fishes described in Randall (1981) as a model system. This dataset comprises all known cases of intra-specific anti-tropicality in fishes, whose anti-tropical distributions have diverged recently (i.e., within the Quaternary). Notably, this dataset includes ‘anti-equatorial’ species whose ranges extend into tropical latitudes, but remain disjunct across the equator. Species that may disperse as adults in deeper, cooler water were removed to constrain the analysis for the specific hypothesis tests of divergences due to glacial dispersal and vicariance. Species that recently underwent taxonomic splitting were also removed from the Randall (1981) species list because divergence times are not yet constrained to the Quaternary timeline of our hypotheses. One notable exception to this is Microcanthus strigatus, which was recently split longitudinally, but still maintains an anti-tropical distribution that was formed during the Pleistocene (Tea et al., 2019; Tea & Gill, 2020). For the remaining taxa, distribution records were gathered from the Global Biodiversity Information Facility using the package ‘rgbif’ in R (Chamberlain, Barve, Mcglinn, & Chamberlain, 2017). These data were filtered by comparing occurrences to distribution records from the literature (Allen & Erdmann, 2012; Kuiter, 1993; Randall, 2005); all questionable occurrences were removed.

Species Distribution Models 

            Species that form anti-tropical distributions through vicariance or glacial dispersal have distinct predictions regarding contemporary tropical suitable abiotic habitat (TSH), which should be present in vicariant species, and absent in glacial dispersers. Furthermore, candidate glacial dispersers are predicted to exhibit a corridor of TSH between hemispheres during the LGM across which they could disperse. This may be observed as an increase in TSH area and/or continuity during the LGM. Therefore, to test the hypotheses of vicariance versus dispersal in driving patterns of modern anti-tropicality, we estimated area and distribution of TSH for each species under contemporary and LGM climate conditions using ecological niche models (ENM). 

            ENM is a widely used tool to predict species’ suitable habitat through space and time (Elith & Leathwick, 2009; Guisan et al., 2017; Myers, Stigall, & Lieberman, 2015; Peterson et al., 2011). These models perform multivariate statistical correlations between species’ occurrences and the combinations of environments existing in the training region. In this way, ENMs attempt to estimate a species’ abiotic niche, which is defined as the suite of abiotic conditions within which a species may survive and reproduce (Peterson et al., 2011; Soberón, 2007). ENMs are constructed in an n-dimensional environmental space (e-space) reflecting the number of environmental input variables and may be translated into species distribution models by projecting e-space model predictions onto geography (g-space; Peterson et al., 2011).

            In addition to spatially explicit species occurrences, ENMs require spatially continuous environmental layers that depict environmental gradients across space. Here we used ten abiotic environmental layers gathered from the MARSPEC database for current and LGM conditions. These layers summarize the mean, range, and variance in sea surface temperature and salinity across the oceans at a 5 arc-minute (~10 km) resolution (Sbrocco & Barber, 2013; Table S1). The Maxent algorithm (Phillips, 2005) was used for all models via the Maxent GUI. Maxent has been shown to work well with presence-only occurrence data (as utilized in this study), as well as non-uniform and smaller sample sizes (Hernandez, Graham, Master, & Albert, 2006; Guisan et al., 2007; Pearson, Raxworthy, Nakamura, & Peterson, 2007; Peterson, 2001; Peterson et al., 2011). Default modeling parameters were used for all models; model clamping was turned off, and model extrapolation turned on following recommendations in Owens et al. (2013).

ENMs were evaluated using a partial ROC analysis (pROC) with 500 iterations (with replacement, using a 50% testing percentage) while calculating model significance with a normal distribution (z statistic). This threshold-independent approach minimizes evaluation bias from presence-only data (Peterson, Papes, & Soberón, 2008). Whereas traditional AUC/ROC analysis is interpreted from 0 – 1 with 0.5 representing a random model, pROC ratios range from 0 – 2 with 1.0 representing a random model (Peterson et al. 2008). Similar to ROC/AUC values, higher pROC ratios support more robust model results. However, the more important test for model evaluation in this setting is finding that the bootstrapped pROC ratio distribution is statistically different from the null model (randomness) distribution and that no bootstrap replicates show a pROC ratio ≤ 1.0 (Cobos et al. 2019; Peterson et al. 2008).
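
The two evaluation criteria described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the bootstrapped pROC ratios have already been computed elsewhere, and the function name `summarize_proc`, the one-sided z-test built on the bootstrap standard deviation, and the 0.05 significance level are assumptions made here rather than details taken from the study.

```python
from statistics import NormalDist, mean, stdev


def summarize_proc(ratios, null_ratio=1.0, alpha=0.05):
    """Summarize bootstrapped pROC ratios against the random expectation.

    Assumptions (not from the source): a one-sided z-test that treats the
    bootstrap standard deviation as the standard error, and alpha = 0.05.
    """
    spread = stdev(ratios)
    if spread == 0:
        raise ValueError("bootstrap ratios are all identical")
    # Distance of the mean ratio from the random-model value of 1.0.
    z = (mean(ratios) - null_ratio) / spread
    p_value = 1.0 - NormalDist().cdf(z)
    # Second criterion: no replicate may sit at or below the random value.
    n_random_or_worse = sum(r <= null_ratio for r in ratios)
    return {
        "mean_ratio": mean(ratios),
        "z": z,
        "p_value": p_value,
        "n_random_or_worse": n_random_or_worse,
        "supported": p_value < alpha and n_random_or_worse == 0,
    }
```

A model would count as supported under this sketch only when both conditions hold, so a single replicate at or below 1.0 is enough to reject it even if the mean ratio is high.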

Multivariate environmental similarity surfaces (MESS) were constructed for each model to detect non-analog environments where inaccurate extrapolation was likely to occur in LGM model projections (Elith, Kearney, & Phillips, 2010). MESS maps (Supplemental Figures S1-S29) were compared to ENM output to determine if non-analog environments were coincident with predicted TSH; models with TSH in potentially extrapolated regions were removed. Model response curves were also used to identify values above which models were likely to experience erroneous extrapolation. This was done for each environmental variable in each model, and results were cross-referenced with areas of predicted TSH to determine if TSH measurements were potentially influenced by model extrapolation (Owens et al. 2013).
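
For readers unfamiliar with MESS, the per-cell calculation defined by Elith et al. (2010) can be sketched as follows. The function names and the dictionary-based data layout are hypothetical, and the sketch assumes every variable takes more than one value in the reference (training) data. Negative scores mark cells where at least one variable falls outside the reference range, i.e., non-analog conditions.

```python
def variable_similarity(value, reference):
    """Similarity of one value to the reference values of a single variable."""
    lo, hi = min(reference), max(reference)
    span = hi - lo  # assumed non-zero: the variable varies in the reference
    # Percentage of reference points with a value below the one being scored.
    f = 100.0 * sum(r < value for r in reference) / len(reference)
    if f == 0:
        return 100.0 * (value - lo) / span
    if f <= 50:
        return 2.0 * f
    if f < 100:
        return 2.0 * (100.0 - f)
    return 100.0 * (hi - value) / span


def mess(cell, reference_by_variable):
    """MESS score of one cell: the minimum similarity across all variables."""
    return min(
        variable_similarity(cell[name], reference)
        for name, reference in reference_by_variable.items()
    )
```

With reference temperatures spanning 20 to 28 and a projected cell at 30, the temperature term becomes negative, so the cell is flagged regardless of how typical its salinity is.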

ENMs were used to predict anti-tropical species’ TSH under three different modeling scenarios for each species: (1) N-model: models trained only using the extent of northern populations; (2) S-model: models trained using the extent of southern populations; (3) All-model: models trained using the full species distribution (i.e., including northern and southern populations, and tropical habitats in between). Training regions for these three scenarios were created to reflect each species’ current distribution and existing information regarding dispersal potential (Barve et al. 2011). Coral and rocky reef fishes have an average range size of 9,357,000 km² (Allen 2008), and most species disperse as pelagic larvae over several days to a few months. As these conditions lead to very high dispersal potentials, training regions at both the population level (northern and southern) and whole species level were large and uniquely defined for each species. This avoids modeling bias by including areas that the species could plausibly “sample” during a long-lived larval stage, and thus improves ENM discrimination. N-model and S-model training regions were not extended into the opposite hemisphere in order to specifically test the hypotheses of tropical suitability (Supplemental Figures S30–S58). We recognize that there is no “silver bullet” method for objectively defining training region extent without substantial investigation of species-specific larval dispersal, which does not currently exist. Thus, we have used the combined information from generalized larval dispersal potential and oceanographic conditions (e.g., predominant currents) to inform training extent for each species. Although this method is more subjective than assigning a single bounding box or buffer distance to occurrence points, it is also less arbitrary as it allows for the incorporation of existing larval ecological information and follows the recommendations of Barve et al. (2011).

To test the hypothesis of vicariance, N-models and S-models for each species were projected into modern tropical zones to predict the current area and distribution of TSH. Northern and southern populations were modeled independently in order to quantify TSH without making a priori assumptions that tropical habitat was not currently suitable. A training region that encompasses the full species distribution (and therefore includes the tropics) may produce biased ENMs wherein the modeling algorithm treats the lack of tropical occurrence points as an indication of unsuitable habitat. This method does make the a priori assumption that modern tropical zones are “uninformative” for training ENMs, but we find this preferable to the assumption that modern tropical habitat is “unsuitable.” The All-model was utilized to test the hypothesis that anti-tropical species dispersed across the tropics via TSH during the LGM. In this scenario, modern tropical habitat is hypothesized to be unsuitable, thus tropical environments are informative for training the model and generating predictions of abiotic habitat preferences. All-model predictions were projected to LGM climate layers, and the area and distribution of TSH were calculated.


To quantify modern TSH, ENMs were thresholded using two independent approaches: mean predicted probability (Meanprob; Freeman & Moisen, 2008; Liu, White, & Newell, 2013), and maximum sum of sensitivity and specificity threshold (MaxSSS; Liu, Berry, Dawson, & Pearson, 2005; Liu et al., 2013; Liu, Newell, & White, 2016; Jiménez-Valverde & Lobo 2007). For each model, TSH was calculated as a percentage of total tropical habitat (as defined within the bounds of the All-training region for each species). The extent of tropical habitat was defined by areas exhibiting temperatures in which tropical hermatypic corals grow, i.e., temperature ≥ 20 °C during the coolest period of the year (Briggs, 1974; Siqueira, Oliveira-Santos, Cowman, & Floeter, 2016).
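
A minimal sketch of the two thresholds and the resulting TSH percentage is given below. Several details are assumptions made for illustration: the function names are hypothetical, the set of scores averaged for the mean threshold is left to the caller, specificity is computed against background points (as is common with presence-only data), and every grid cell is treated as having equal area.

```python
def mean_threshold(scores):
    """Mean predicted suitability over the supplied scores (assumed input set)."""
    return sum(scores) / len(scores)


def max_sss_threshold(presence_scores, background_scores):
    """Threshold that maximizes sensitivity + specificity."""
    candidates = sorted(set(presence_scores) | set(background_scores))
    best_threshold, best_sum = None, -1.0
    for t in candidates:
        sensitivity = sum(s >= t for s in presence_scores) / len(presence_scores)
        specificity = sum(s < t for s in background_scores) / len(background_scores)
        if sensitivity + specificity > best_sum:
            best_threshold, best_sum = t, sensitivity + specificity
    return best_threshold


def percent_tsh(suitability, coolest_temp, threshold, tropical_min_temp=20.0):
    """Percentage of tropical cells predicted suitable.

    A cell is tropical if its coolest-period temperature is >= 20 degrees C.
    Equal cell area is assumed; real grids would need area weighting.
    """
    tropical = [s for s, t in zip(suitability, coolest_temp) if t >= tropical_min_temp]
    if not tropical:
        raise ValueError("no tropical cells in the supplied region")
    return 100.0 * sum(s >= threshold for s in tropical) / len(tropical)
```

Because the two thresholds generally differ, running `percent_tsh` once with each gives the pair of TSH estimates that the later analyses compare.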

To test for vicariance driving anti-tropical distributions we compared N-model and S-model projections for each species using K-means clustering (Hartigan & Wong, 1979), which minimizes within-group variation. This is an objective way to identify natural clusters of species with similar quantities of contemporary TSH consistent across both northern and southern ENM population predictions. As there can be multiple ways to cluster multivariate data with equal support, this analysis was repeated 100 times to establish consistency in grouping patterns. Species with large amounts of contemporary TSH support the hypothesis of vicariance, whereas species with small contemporary TSH are candidates for the glacial dispersal hypothesis.
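
The repeated clustering step can be illustrated with the sketch below. It is not the Hartigan and Wong (1979) algorithm cited above; it uses the simpler Lloyd iteration as a stand-in, and the choice of two clusters, the random initialization, and all function names are assumptions made here. Each point is a pair of TSH percentages (N-model, S-model) for one species, and the output counts how often each distinct grouping appears across runs.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, k, rng, max_iter=100):
    """Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(pt, centers[c])) for pt in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def partition(labels):
    """Grouping of point indices that ignores arbitrary cluster numbering."""
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    return frozenset(frozenset(g) for g in groups.values())


def repeated_kmeans(points, k=2, runs=100, seed=0):
    """Count how often each grouping is recovered over repeated runs."""
    rng = random.Random(seed)
    return Counter(partition(kmeans(points, k, rng)) for _ in range(runs))
```

A grouping recovered in all, or nearly all, of the 100 runs would indicate a stable split between high-TSH and low-TSH species, whereas several groupings with similar counts would signal that the split is sensitive to initialization.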

However, to support a hypothesis of glacial dispersal, candidate glacial dispersers are also expected to show an increase in TSH and/or increased TSH continuity between hemispheres during the LGM. To quantify the former, the change in proportional TSH between the contemporary projections (both northern and southern) and the LGM projections was calculated. Notably, in defining the tropics by temperature, proportional TSH could increase between contemporary and glacial times merely as the result of a smaller tropical area during the LGM. To assess the effect of this potential bias, analyses were re-run defining the tropics by latitude, wherein there is no change in area between contemporary and LGM periods.
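
The potential bias described above is easy to see with a small worked example. The areas used here are invented purely for illustration (arbitrary units), and the function names are hypothetical.

```python
def proportional_tsh(suitable_area, tropical_area):
    """Suitable tropical habitat as a proportion of total tropical area."""
    return suitable_area / tropical_area


def tsh_change(contemporary, lgm):
    """Change in proportional TSH from the contemporary period to the LGM."""
    return lgm - contemporary


# Invented values: the suitable area stays the same in both periods.
suitable = 2.0
tropics_now = 40.0
tropics_lgm_by_temperature = 32.0  # temperature-defined tropics shrink at the LGM
tropics_lgm_by_latitude = 40.0     # latitude-defined tropics keep the same area

change_by_temperature = tsh_change(
    proportional_tsh(suitable, tropics_now),
    proportional_tsh(suitable, tropics_lgm_by_temperature),
)
change_by_latitude = tsh_change(
    proportional_tsh(suitable, tropics_now),
    proportional_tsh(suitable, tropics_lgm_by_latitude),
)
```

In this example the temperature-based definition reports an increase in proportional TSH even though the suitable area did not change, while the latitude-based definition reports none, which is why comparing the two definitions serves as a check on the result.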

In addition to examining the change in TSH between contemporary and LGM times, TSH continuity was estimated using the least cost path (LCP) function in ArcMap (ESRI, 2010), wherein corridor potential was measured as a function of distance weighted by quality of TSH across the tropics. The LCP between populations in both hemispheres was measured for all projections (i.e., N-model, S-model, and All-model projections). LCP requires a distance between starting and ending points as well as a “cost” matrix used to weight travel distance. Distance was determined using a minimum spanning polygon constructed in ArcMap (ESRI, 2010) covering the distribution of contemporary northern and southern populations. Due to the paucity of fossil data, and the uncertainty of LGM population distributions, the same minimum spanning polygons were used for contemporary and LGM LCP estimates. Cost matrices were defined by the un-thresholded model predictions (ranging from 0 – 1), such that higher model values were associated with less “cost” to traverse than low model values. Least cost paths were calculated in ArcMap, where shorter paths indicate greater continuity of suitable habitat across the tropics (i.e., lower “cost” for dispersal). Species demonstrating low modern TSH in combination with higher TSH during the LGM and/or low LCP distances across the tropics during the LGM support the hypothesis of glacial dispersal as a mechanism producing modern anti-tropicality. Models identified in the MESS or response curve analyses that exhibited extrapolation in regions with predicted TSH were removed and all above analyses were repeated.
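
The logic of the least cost path comparison can be sketched outside ArcMap with Dijkstra's algorithm on a small grid. This is only an approximation of the idea, not a reproduction of the ArcMap tool: the cost of a cell is assumed to be one minus its un-thresholded suitability, moves are restricted to the four adjacent cells, each step is charged the average cost of the two cells involved, and the function name and example grids are invented.

```python
import heapq


def least_cost(suitability, start, goal):
    """Accumulated cost of the cheapest path between two grid cells.

    Assumed cost model: cell cost = 1 - suitability; each step between
    adjacent cells costs the mean of the two cell costs.
    """
    rows, cols = len(suitability), len(suitability[0])
    cost = [[1.0 - s for s in row] for row in suitability]
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return dist
        if dist > best.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new_dist = dist + (cost[r][c] + cost[nr][nc]) / 2.0
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    heapq.heappush(queue, (new_dist, (nr, nc)))
    return float("inf")


# Invented grids: rows run north to south; the middle row is the tropics.
contemporary = [[0.9, 0.9, 0.9],
                [0.1, 0.1, 0.1],
                [0.9, 0.9, 0.9]]
lgm = [[0.9, 0.9, 0.9],
       [0.1, 0.8, 0.1],
       [0.9, 0.9, 0.9]]

cost_now = least_cost(contemporary, (0, 1), (2, 1))
cost_lgm = least_cost(lgm, (0, 1), (2, 1))
```

In the invented LGM grid a single suitable tropical cell opens a corridor, so the accumulated cost between the northern and southern cells drops relative to the contemporary grid, which is the pattern expected for a glacial disperser.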

Usage notes

Please see the readme file (ReadMe.txt) for descriptions and instructions on how to use the provided files.