
Genetic data improves niche model discrimination and alters the direction and magnitude of climate change forecasts

Cite this dataset

Bothwell, Helen et al. (2020). Genetic data improves niche model discrimination and alters the direction and magnitude of climate change forecasts [Dataset]. Dryad.


Ecological niche models (ENMs) have classically operated under the simplifying assumptions that there are no barriers to gene flow, species are genetically homogeneous (i.e., no population-specific local adaptation), and all individuals share the same niche. Yet, these assumptions are violated for most broadly distributed species. Here we incorporate genetic data from the widespread riparian tree species narrowleaf cottonwood (Populus angustifolia) to examine whether including intraspecific genetic variation can alter model performance and predictions of climate change impacts. We found that (1) P. angustifolia is differentiated into six genetic groups across its range from México to Canada, and (2) different populations occupy distinct climate niches representing unique ecotypes. Comparing model discriminatory power, (3) all genetically informed ecological niche models (gENMs) outperformed the standard species-level ENM (3-14% increase in AUC; 1-23% increase in pROC). Furthermore, (4) gENMs predicted large differences among ecotypes in both the direction and magnitude of responses to climate change, and (5) revealed evidence of niche divergence, particularly for the Eastern Rocky Mountain ecotype. (6) Models also predicted progressively increasing fragmentation and decreasing overlap between ecotypes. Contact zones are often hotspots of diversity that are critical for supporting species’ capacity to respond to present and future climate change, so predicted reductions in connectivity among ecotypes are of conservation concern. We further examined the generality of our findings by comparing our model developed for a higher elevation Rocky Mountain species with a related desert riparian cottonwood, P. fremontii. Together our results suggest that incorporating intraspecific genetic information can improve model performance by addressing this important source of variance. gENMs bring an evolutionary perspective to niche modeling and provide a truly “adaptive management” approach to support conservation genetic management of species facing global change.

Usage notes

NL_MSAT_STRUCTURE_756.csv - Data for this study included 756 narrowleaf cottonwood samples collected from 36 different sampling locations across the species' range from Arizona to Alberta, Canada. Each tree was individually georeferenced by GPS; latitude and longitude were recorded in the North American Datum 1983 coordinate system (X, Y). Three sampling locations that had very small sample sizes and were in close proximity to, and shared high genetic identity (>0.9) with, neighboring collection sites were combined with those sites, so the Site column reflects the 33 sites used to assess population genetic structure. The remaining columns give diploid microsatellite allele calls for 12 loci originally described by Tuskan et al. (2004, 2006). Missing data are indicated as -9. We note that site BWR was strongly differentiated and exhibited unusually high heterozygosity. We suspect possible introgression from P. trichocarpa and did not include this site in subsequent ecological niche modeling.
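As a rough illustration of working with a file of this kind, the sketch below computes observed heterozygosity per site while skipping genotypes coded as -9. It is only a sketch under stated assumptions: the column names (Site, LocA_1, LocA_2, and so on), the two-columns-per-locus layout, and all sample values are hypothetical and are not taken from NL_MSAT_STRUCTURE_756.csv, so the header of the real file should be checked before adapting it.

```python
import csv
import io

MISSING = "-9"  # missing-data code stated in the usage notes

# Hypothetical miniature table: column names, coordinates and allele sizes
# are invented for illustration and do not come from the real file.
sample = """Site,X,Y,LocA_1,LocA_2,LocB_1,LocB_2
S1,-111.0,35.0,150,154,200,200
S1,-111.1,35.1,150,150,-9,-9
S2,-113.5,49.5,152,154,198,202
"""

def observed_heterozygosity(rows, loci):
    """Per-site fraction of scored (non-missing) genotypes that are heterozygous."""
    counts = {}  # site -> [heterozygous genotypes, scored genotypes]
    for row in rows:
        het, scored = counts.setdefault(row["Site"], [0, 0])
        for locus in loci:
            a, b = row[locus + "_1"], row[locus + "_2"]
            if MISSING in (a, b):
                continue  # skip genotypes with missing allele calls
            scored += 1
            het += int(a != b)
        counts[row["Site"]] = [het, scored]
    return {s: (h / n if n else float("nan")) for s, (h, n) in counts.items()}

rows = list(csv.DictReader(io.StringIO(sample)))
ho = observed_heterozygosity(rows, ["LocA", "LocB"])
print(ho)
```

A site with unusually high values in such a summary, as reported for BWR, would stand out against the other sites.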

Occurrence records (genetic groups + GBIF).zip - This folder contains .csv files, one for each of the population genetic groups, and an AllPops file that contains the complete list of occurrence records. Each ecotype is augmented with occurrence records from the Global Biodiversity Information Facility (GBIF) that exhibited >70% probability of belonging to a given genetic group, based on spatial population genetic maps developed in Geneland (Guillot et al. 2005). Data are in the WGS84 coordinate system.
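The >70% membership rule described above can be sketched as a simple filter. This is a minimal illustration and not the authors' workflow: the record IDs, coordinates, group labels, and probabilities are invented, and in practice the probabilities would be read from the Geneland membership maps at each record's location.

```python
THRESHOLD = 0.70  # membership probability cutoff stated in the usage notes

def assign_group(probabilities, threshold=THRESHOLD):
    """Return the best-supported group if its probability exceeds the threshold, else None."""
    group = max(probabilities, key=probabilities.get)
    return group if probabilities[group] > threshold else None

# Hypothetical occurrence records; every value here is invented for illustration.
records = [
    {"id": "rec1", "lon": -111.6, "lat": 35.2, "probs": {"pop1": 0.92, "pop2": 0.08}},
    {"id": "rec2", "lon": -110.9, "lat": 40.1, "probs": {"pop1": 0.55, "pop2": 0.45}},
    {"id": "rec3", "lon": -113.8, "lat": 49.3, "probs": {"pop1": 0.10, "pop2": 0.90}},
]

by_group = {}
for rec in records:
    group = assign_group(rec["probs"])
    if group is not None:  # ambiguous records (no group above 0.70) are dropped
        by_group.setdefault(group, []).append(rec["id"])
print(by_group)
```

Records such as rec2, which fall in a contact zone where no group exceeds the cutoff, are left out of every per-group file under this rule.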

Pruned_gENM_rasters_(continuous).zip - This folder contains continuous ecological niche model output generated in MaxEnt (Phillips et al. 2004, 2006). Raster grids are included for each of the seven P. angustifolia ecotypes (pop1-7) and a species-level model (all). Naming conventions are as follows: curr refers to baseline climate (30-year averages for 1961-1990); a1b and a2 indicate niche models based on IPCC AR4 moderate and high emissions scenarios, respectively (IPCC 2007); 2050 and 2080 refer to 30-year average climate forecasts for the periods 2040-2069 and 2070-2099, respectively. Rasters represent ecological niche model averages based on 10-fold cross-validation (i.e., each model was run 10 times, with a different random subset of 10% of occurrence records left out for testing in each run). For future 2050 and 2080 predictions, model output represents a 5-GCM ensemble mean, also derived from 10-fold cross-validation of each of the five individual GCMs for each time period (individual model means for each of the five GCMs can be found below). MaxEnt model output was pruned using the over-prediction correction algorithm of Kremen et al. (2008); this restricted model output to a biologically meaningful region encompassing potential pollen dispersal and gene flow from genetic populations. We used the following pruning parameters: a threshold equal to the 10th percentile training presence for each model (Pearson et al., 2007), a buffer of 80 grid cells, and a fading buffer 160 grid cells wide. Rasters are in the WGS84 coordinate system and at ~1km resolution.

A second folder contains binary ENM raster grids, with suitable habitat (1) based on the 10th percentile training presence threshold for each model (Pearson et al., 2007). Naming conventions, pruning methodology, and coordinate system (WGS84) are the same as described above.
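To make the link between the continuous and binary grids concrete, the sketch below applies a 10th percentile training presence threshold to a small grid. It assumes a nearest-rank percentile, which may differ slightly from the value MaxEnt itself reports, and all suitability scores are invented. The function names are hypothetical, and the pruning step of Kremen et al. (2008) is not reproduced here.

```python
def tenth_percentile_threshold(training_scores):
    """Nearest-rank 10th percentile of suitability scores at training presences."""
    ordered = sorted(training_scores)
    rank = max(1, -(-len(ordered) // 10))  # ceil(n / 10) as a 1-based rank
    return ordered[rank - 1]

def to_binary(grid, threshold):
    """Map each cell to 1 (suitable) or 0 (unsuitable); None marks no-data cells."""
    return [[None if v is None else int(v >= threshold) for v in row] for row in grid]

# Invented suitability scores at ten training presence points.
training = [0.12, 0.35, 0.41, 0.48, 0.55, 0.60, 0.67, 0.72, 0.80, 0.91]
threshold = tenth_percentile_threshold(training)

# Invented 2 x 3 continuous suitability grid with one no-data cell.
continuous = [[0.05, 0.30, None],
              [0.12, 0.90, 0.10]]
binary = to_binary(continuous, threshold)
print(threshold, binary)
```

By construction, roughly 10% of the training presences fall below the threshold, so the binary map omits the least suitable tail of known occurrences.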

Individual_GCM_mean_rasters_(continuous).zip - This folder contains raster .tif files of continuous ecological niche model output based on future climate as predicted by five different global climate models (GCMs). Whereas the files in Pruned_gENM_rasters_(continuous).zip above represent averages of the 5-GCM ensemble mean, the files included here are 10-model averages for each of the niche models generated from the five different individual GCMs. Future ENMs are based on 30-year average climate forecasts for 2070-2099. GCMs are abbreviated as: cnrm = CNRM-CM3, csiro = CSIRO-MK3.0, mpi = ECHAM5-MPI, ncar = NCAR-CCSM3, and ukmo = UKMO-HADGEM1. Other naming conventions, pruning methodology, and coordinate system (WGS84) are the same as described above.

A further folder contains raster .tif files of binary ENM output, with suitable habitat (1) based on the 10th percentile training presence threshold for each model (Pearson et al., 2007). Naming conventions, pruning methodology, and coordinate system are the same as described above. Suitable habitat rasters ending with avg_10T represent the average of 10-fold cross-validation for each of the five GCMs and each emissions scenario (A1B, A2). Files ending with summed show where predictions of the five GCMs agree or diverge. For example, a value of 1 indicates that only a single GCM predicts suitable habitat in a given location by 2080, whereas a value of 5 indicates that all five GCMs predict suitable habitat for that location. Regions of high agreement indicate low future risk and thus mark areas where conservation and restoration efforts are likely to have high long-term success.

Another folder contains raster grids of temporal corridors (i.e., climate refugia) where habitat is predicted to remain continuously suitable from the present through 2099. Naming conventions, pruning methodology, and coordinate system (WGS84) are as described above. These regions are important for conservation because they represent relatively low stress areas where trees are not expected to need to adapt or migrate, but may instead persist in situ in the face of climate change.
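The summed agreement grids and temporal corridors described above can both be expressed as cell-wise operations on binary rasters. The sketch below is an illustration under assumptions and not the procedure used to build the archived files: grids are plain nested lists with invented values, the function names are hypothetical, and a corridor is taken to be a cell that is suitable in every time period supplied.

```python
def sum_agreement(binary_grids):
    """Cell-wise count of models predicting suitable habitat (0 to number of grids)."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*binary_grids)]

def temporal_corridor(grids_by_period):
    """1 where habitat is suitable in every period supplied, else 0."""
    return [[int(all(cells)) for cells in zip(*rows)] for rows in zip(*grids_by_period)]

# Invented 2 x 2 binary predictions from five GCMs for one future period.
gcm_grids = [
    [[1, 1], [0, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [0, 1]],
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
]
agreement = sum_agreement(gcm_grids)

# Invented binary grids for the baseline, 2050 and 2080 periods.
periods = [
    [[1, 1], [1, 0]],
    [[1, 1], [0, 0]],
    [[1, 0], [0, 1]],
]
corridor = temporal_corridor(periods)
print(agreement, corridor)
```

In the agreement grid, a cell value of 5 means all five GCMs predict suitable habitat, matching the interpretation given for the summed files.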