Skip to main content
Dryad

Reconstructing Ecological Niche Evolution via Ancestral State Reconstruction with Uncertainty Incorporated

Cite this dataset

Owens, Hannah et al. (2021). Reconstructing Ecological Niche Evolution via Ancestral State Reconstruction with Uncertainty Incorporated [Dataset]. Dryad. https://doi.org/10.5061/dryad.c866t1g3j

Abstract

Reconstructing ecological niche evolution can provide insight into the biogeography and diversification of evolving lineages. However, comparative phylogenetic methods can infer the history of ecological niche evolution inaccurately because (1) species’ niches are often poorly characterized; and (2) phylogenetic comparative methods rely on niche summary statistics rather than full estimates of species’ environmental tolerances. Here we propose a new framework for coding ecological niches and reconstructing their evolution that explicitly acknowledges and incorporates the uncertainty introduced by incomplete niche characterization. Then, we modify existing ancestral state inference methods to leverage full estimates of environmental tolerances. We provide a worked empirical example of our method, investigating ecological niche evolution in the New World orioles (Aves: Passeriformes: Icterus spp.). Temperature and precipitation tolerances were generally broad and conserved among orioles, with niche reduction and specialization limited to a few terminal branches. Tools for performing these reconstructions are available in a new R package called nichevol.

Methods

Coding ecological niches for analysis

Coding the full information in niche ranges for analysis represents an initial challenge for such analyses. We first determine relevant analytical limits with respect to single environmental dimensions as the minimum and maximum values within the union of all accessible areas polygons for the species comprising the clade of interest. Next, we parse this range of values into equal-width bins; for each species, bins with values falling within the range of environmental conditions occupied by the species are coded as “suitable” (1). Bins with values represented within the species’ M but falling outside the range of environmental conditions occupied by the species are scored as “unsuitable” (0). In cases in which suitable niche conditions coincide with the limits of environmental conditions present in a species’ M, all more extreme values—that is, values more extreme than those manifested within M (e.g. Fig. 1)—are coded as unknown (?). This procedure allows explicit incorporation of uncertainty in our analyses. When suitability is unknown because the climatic values for the bin in question were not represented within the M of the species, but the bin in question was flanked by two suitable bins, it is also scored as “suitable,” under an assumption of a unimodal response to environmental conditions (Maguire 1973). These steps can be achieved using nichevol v.0.1.17 (Cobos et al. 2020), an R (v.3.6.1; R Core Team 2019) package we created to facilitate studies of niche evolution, including many of the analyses presented in this paper. Package documentation includes a tutorial demonstrating an analytical workflow implementing nichevol.

Binned ancestral range reconstruction demonstration

As a simple illustration of how this analysis works, we simulated data for a scenario of a shift from an ancestral warm niche to a cool niche, to demonstrate the ability of our new method to identify instances of niche evolution. First, we simulated distributions and accessible areas for 1000 species across South America—500 with a fundamental mean annual temperature niche of 24–28ºC (hereafter referred to as “cool niche species”) and 500 with a fundamental mean annual temperature niche of 25–29ºC (hereafter referred to as “warm niche species”). For each set of species, accessible areas were generated using an initial population of 10,000 random polygons using nichevol. We assumed that each species could occupy all suitable cells within its corresponding M (i.e. we ignored biotic factors). Suitable cells were identified based on a 2.5’-resolution raster of annual mean temperature data (Bio 1) from WorldClim v.1.4 (Hijmans et al. 2005) and the raster package v.3.0.0 (Hijmans 2019) in R; simulated Ms with no suitable conditions were removed (n = 1). The median suitable mean annual temperature value for each simulated species was calculated from suitable cell values within its M. We then determined the minimum and maximum mean annual temperature within the union of all simulated M polygons and parsed this range of values into equal-width, 1ºC bins using nichevol in R. Raw and annotated R code the niche simulation and coding steps, as well as input and output data, can be found in supplementary materials provided via Dryad (Code Supplement 1, Annotated Code Supplement 1, Data Package 1; DOI: 10.5061/dryad.c866t1g3j).

We generated a single, 15-taxon stochastic birth-death tree (birth rate = 1, death rate = 0) using the R package phytools v.0.6-99 (Revell, 2012) and assigned simulated species from the cool niche group to a monophyletic clade of 7 taxa. The remaining tips in the tree were assigned simulated species from the warm niche simulation group. We then used nichevol tools to perform BAR reconstructions using maximum parsimony (as implemented in castor v.1.4.3; Louca and Doebeli 2018) and maximum likelihood (as implemented in ape v.5.3; Paradis and Schliep 2018). For both algorithms, ancestral state reconstructions were performed for each bin separately, treating bin scores (including “uncertain”) as discrete characters under an equal transition rate model of evolution. Results were smoothed such that reconstructed suitable ancestral niche bins at each node were not interrupted by unsuitable bins, following the assumption of a unimodal response to environmental conditions (Maguire, 1973), and accounting for evolutionary non-independence of bins. Raw and annotated R code for these steps, as well as input and output, can be found in supplementary materials provided via Dryad (Code Supplement 2, Annotated Code Supplement 2, Data Package 1; DOI: https://doi.org/10.5061/dryad.c866t1g3j).

We note that we have kept this initial example simple for the purpose of illustration—many improvements could be made to this methodology, such as implementation of different character evolution models, Bayesian approaches in inferring ancestral character states, stochastic character mapping, and consideration of joint effects of environmental dimensions (e.g. temperature, precipitation) that are here considered independently. Furthermore, phylogenetic comparative methods are notoriously “data-hungry,” and BAR reconstructions will benefit from further detailed simulation-based examinations in the future. Our purpose here is to illustrate the crucial importance of incorporating uncertainty explicitly in the inference of abiotic ecological niche evolution patterns.

Oriole analyses

We next used BAR reconstructions to infer patterns of niche evolution in 34 species of New World orioles (genus Icterus). We used the single best ultrametric maximum likelihood phylogeny from Powell et al. (2014; their Figure 4) inferred from mitochondrial and nuclear DNA sequences. Distributional data for each species were drawn from the Global Biodiversity Information Facility (GBIF, 2018), a large portion of which were derived from eBird (Supplemental Table 1, Data Package 2; via Dryad, DOI: https://doi.org/10.5061/dryad.c866t1g3j). We removed all records lacking geographic coordinates, and inspected those remaining with respect to known ranges of species based on expert assessment by four ornithologists (authors Cooper, Hosner, and Peterson), removing records that reflected errors or outdated taxonomic arrangements. Species-specific hypotheses of areas accessible to the species (M) were developed based on the biotic attributes and biogeographic history of the clade (Barve et al. 2011; Elith et al. 2010). That is, the ornithologists inspected patterns of occurrences for each species and outlined accessible-area hypotheses based on known barriers to dispersal (i.e., oceans, high mountain ranges, the Amazon River, deserts). While this step remains subjective, it is crucial to a realistic representation of the environments that should be considered within the species’ potential distribution (Phillips et al. 2009; Barve et al. 2011).

We then used BR to score species’ niches, explicitly scoring the parts of these profiles that were not observable (i.e. at the periphery of M) as uncertain (see above). For mean annual temperature (Bio1 in WorldClim v.1.4; Hijmans et al. 2005), we used 32 equal-width, 1ºC bins (3-4º, 4-5º, … 34-35º) across the full range of temperature values represented in the union of all species’ M areas. For annual precipitation (Bio12 in WorldClim v1.4; Hijmans et al. 2005), we used 80, 10 mm-width bins to cover the range of precipitation values from 0 to 800 mm across all species’ M areas. For comparison to more traditional methods of coding species’ niches, we calculated median values for mean annual precipitation and temperature across species’ known occurrences. As with our simulated species, we characterized species’ niches using R; raw and annotated R code for analyses, as well as inputs and outputs, can be found in supplementary materials provided via Dryad (Code Supplement 3, Annotated Code Supplement 3, Data Package 2; DOI: https://doi.org/10.5061/dryad.c866t1g3j).

Finally, we inferred the evolutionary history of oriole temperature and precipitation niches using both BAR, as described above, and GLS reconstructions using the median temperature and precipitation values at species occurrences. For GLS reconstructions, we first examined the fits of Brownian motion, Ornstein-Uhlenbeck, early burst, and diffusion with linear trend models of evolution. We then performed ancestral state reconstructions using a continuous-value maximum likelihood algorithm (as implemented in reconstruct in ape) under the best-fit evolutionary model (Ornstein-Uhlenbeck) for both mean annual temperature and annual precipitation. Raw and annotated R code for analyses, as well as input and output can be found in supplementary materials provided via Dryad (Code Supplement 4, Annotated Code Supplement 4, Data Package 2; DOI: https://doi.org/10.5061/dryad.c866t1g3j).

Usage notes

Note the orioles phylogeny has not been included. It can be found via the following citation:

Powell, A. F., F. K. Barker, S. M. Lanyon, K. J. Burns, J. Klicka, and I. J. Lovette. 2014. A comprehensive species-level molecular phylogeny of the New World blackbirds (Icteridae). Molecular Phylogenetics and Evolution 71:94-112.