Nonbreeding distributions of four declining Nearctic-Neotropical migrants are predicted to contract under future climate and socioeconomic scenarios

Brodie, Ryan E.1 ; Bayly, Nicholas J.2 ; González, Ana M.3 ; Hightower, Jessica4 ; Larkin, Jeffery L.5 ; Stewart, Rebecca L. M.3 ; Wilson, Scott6 ; Roth, Amber M.1

Research facility: University of Maine

Published Jul 31, 2024; Updated Nov 08, 2024 on Dryad. https://doi.org/10.5061/dryad.g1jwstr0h

Data files

Jul 31, 2024 version files (42.81 GB):

- covariates.zip (42.80 GB)
- presence_record_CSVs.zip (23.66 KB)
- ProcessestoIdentifyNear_termConservationPriorityAreas_Honduras_FocalAreas.zip (204.08 KB)
- README.md (33.64 KB)
- spatial_extents.zip (1.27 MB)

Abstract

Climate and land use/land cover change are expected to influence the stationary nonbreeding distributions of 4 Nearctic–Neotropical migrant bird species experiencing population declines: Cardellina canadensis (Canada Warbler), Setophaga cerulea (Cerulean Warbler), Vermivora chrysoptera (Golden-winged Warbler), and Hylocichla mustelina (Wood Thrush). Understanding how and where these species’ distributions shift in response to environmental drivers is critical to inform conservation planning in the Neotropics. For each species, we quantified current (2012 to 2021) and projected future (2050) suitable climatic and land use/land cover conditions as components of stationary nonbreeding distributions. Multi-source occurrence data were used in an ensemble modeling approach with covariates from 3 global coupled climate models (CMCC-ESM2, FIO-ESM-2-0, MIROC-ES2L) and 2 shared socioeconomic pathways (SSP2-RCP4.5, SSP5-RCP8.5) to predict distributions in response to varying climatic and land use/land cover conditions. Our findings suggest that distribution contraction, upslope elevational shifts in suitable conditions, and limited shifts in latitude and longitude will occur in 3 of 4 species. Cardellina canadensis and S. cerulea are expected to experience a moderate distribution contraction (7% to 29% and 19% to 43%, respectively), primarily in response to expected temperature changes. The V. chrysoptera distribution was modeled by sex, and females and males were projected to experience a major distribution contraction (56% to 79% loss in suitable conditions for females, 46% to 65% for males), accompanied by shifts in peak densities to higher elevations with minimal changes in the upper elevation limit. Expected changes in precipitation had the greatest effect on V. chrysoptera. Hylocichla mustelina experienced the smallest distribution change, consistent with the species’ flexibility in habitat selection and broader elevational range. 
We recommend defining priority areas for conservation as those where suitable conditions are expected to remain or arise in the next 25 years. For V. chrysoptera in particular, it is urgent to ensure that mid-elevation forests in Costa Rica and Honduras are adequately managed and protected.

Description of the data and file structure

The following data are available for use: covariates, spatial extents, presence record CSVs, and the Honduras focal areas used in the 'Processes to Identify Near-term Conservation Priority Areas' methods section. Covariates are available for the current (2012-2021) and future (2050) time periods. The current timeframe for Brodie et al. 2024 was designated as 2012-2021 to align with prior research and capitalize on increased user engagement with eBird. The 2050 data are split into SSP2-RCP4.5 (best-case) and SSP5-RCP8.5 (worst-case) climate and socioeconomic scenarios. The best-case scenario represents a future where climate-smart practices increase and non-renewable resource use declines. In contrast, the worst-case scenario represents a future where technological advances and increased fossil fuel extraction lead to maximum global emissions. The best- and worst-case data are each further split by three global coupled climate models (CMCC-ESM2, FIO-ESM-2-0, MIROC-ES2L).
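The two scenarios and three climate models described above give six future covariate sets. The sketch below enumerates them in Python using the folder and file names given in the folder descriptions of this README. The root folder name `covariates` (the unzipped archive) and the function name `future_covariate_sets` are assumptions for illustration, not part of the published code.

```python
from itertools import product
from pathlib import PurePosixPath

# Scenario folders and their plant functional types rasters, as named in this README.
SCENARIOS = {
    "ssp2_rcp4_5_2050": "global_SSP2_RCP45_2050.tif",
    "ssp5_rcp8_5_2050": "global_SSP5_RCP85_2050.tif",
}
CLIMATE_MODELS = ["cmcc_esm2", "fio_esm_2_0", "miroc_es2l"]
CLIMATE_SUBFOLDERS = ["bioclimatic", "max_temperature", "min_temperature", "precipitation"]


def future_covariate_sets(root="covariates"):
    """Return one entry per scenario x climate model combination.

    `root` is an assumed location of the unzipped 'covariates' folder.
    """
    sets = []
    for (scenario, pft_file), model in product(SCENARIOS.items(), CLIMATE_MODELS):
        base = PurePosixPath(root) / scenario
        sets.append({
            "scenario": scenario,
            "model": model,
            "climate_dirs": [str(base / model / sub) for sub in CLIMATE_SUBFOLDERS],
            # One plant functional types raster per scenario, reused for every model.
            "plant_functional_types": str(base / pft_file),
        })
    return sets
```

Each entry pairs one climate model's climatic folders with the scenario-level plant functional types raster, which mirrors the guidance that the raster be applied to each climate model separately.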

The following 2012-2021 and 2050 datasets are available: monthly climatic (max temperature, min temperature, precipitation; Fick and Hijmans 2017), bioclimatic (CMIP6; Fick and Hijmans 2017), topographic (Fick and Hijmans 2017, Esri Inc. 2022), and plant functional types (Chen et al. 2022). The plant functional types dataset consists of global LULC projections based on simulations of 16 plant functional types (e.g., forest, grassland, cropland) and urban expansion. Topographic datasets (i.e., elevation and slope) are held constant throughout the ensemble modeling process. Refer to Table 1 in the paper for the covariates to include in ensemble modeling for an individual species. Below are detailed descriptions of all folders and files in the data repository:

'code' folder - see the Code/Software section below for more details on files and R libraries used
'covariates' folder -
- 'current_2012_2021' folder - bioclimatic data and monthly climatic data (i.e., maximum temperature, minimum temperature, precipitation) are WorldClim version 2.1 climate data for 1970-2000 (https://www.worldclim.org/data/worldclim21.html)
  - 'bioclimatic' folder -
    - bio_1 = annual mean temperature (in degrees Celsius)
    - bio_2 = mean diurnal range [mean of monthly (max temp - min temp)] (in degrees Celsius)
    - bio_3 = isothermality ((bio_2/bio_7) *100) (dimensionless)
    - bio_4 = temperature seasonality (standard deviation * 100) (in degrees Celsius)
    - bio_5 = max temperature of warmest month (in degrees Celsius)
    - bio_6 = min temperature of coldest month (in degrees Celsius)
    - bio_7 = temperature annual range (bio_5 - bio_6) (in degrees Celsius)
    - bio_8 = mean temperature of wettest quarter (in degrees Celsius)
    - bio_9 = mean temperature of driest quarter (in degrees Celsius)
    - bio_10 = mean temperature of warmest quarter (in degrees Celsius)
    - bio_11 = mean temperature of coldest quarter (in degrees Celsius)
    - bio_12 = annual precipitation (in millimeters)
    - bio_13 = precipitation of wettest month (in millimeters)
    - bio_14 = precipitation of driest month (in millimeters)
    - bio_15 = precipitation seasonality (coefficient of variation) (fraction)
    - bio_16 = precipitation of wettest quarter (in millimeters)
    - bio_17 = precipitation of driest quarter (in millimeters)
    - bio_19 = precipitation of coldest quarter (in millimeters)
  - 'max_temperature' folder -
    - tmax_03 = average max temperature of March (in degrees Celsius)
    - tmax_10 = average max temperature of October (in degrees Celsius)
    - tmax_11 = average max temperature of November (in degrees Celsius)
  - 'min_temperature' folder -
    - tmin_03 = average min temperature of March (in degrees Celsius)
    - tmin_10 = average min temperature of October (in degrees Celsius)
    - tmin_11 = average min temperature of November (in degrees Celsius)
  - 'precipitation' folder -
    - precip_01 = total precipitation of January (in millimeters)
    - precip_02 = total precipitation of February (in millimeters)
    - precip_03 = total precipitation of March (in millimeters)
    - precip_10 = total precipitation of October (in millimeters)
    - precip_11 = total precipitation of November (in millimeters)
    - precip_12 = total precipitation of December (in millimeters)
  - plant_functional_types = from Chen et al. 2022, the 'global_LULC_2015.tif' product from https://doi.org/10.5281/zenodo.4584775
    - categorical LULC, each plant functional type corresponds to an individual pixel value (1-20, see figure 2 in Chen et al. 2022)
- 'ssp2_rcp4_5_2050' folder - contains three subfolders pertaining to the three global coupled climate models (CMCC-ESM2, FIO-ESM-2-0, MIROC-ES2L) and the 'global_SSP2_RCP45_2050.tif' plant functional types product from Chen et al. 2022 (https://doi.org/10.5281/zenodo.4584775). The plant functional types dataset should be applied to each global coupled climate model scenario separately when working with future projections.
  - 'cmcc_esm2' folder - bioclimatic data and monthly climatic data (i.e., maximum temperature, minimum temperature, precipitation) are CMIP6 (https://www.carbonbrief.org/cmip6-the-next-generation-of-climate-models-explained/) future climate data from the CMCC-ESM2 global coupled climate model (Cherchi et al. 2019) for 2041-2060. Data source: https://www.worldclim.org/data/cmip6/cmip6_clim30s.html.
    - 'bioclimatic' folder -
      - bio_1 = annual mean temperature (in degrees Celsius)
      - bio_2 = mean diurnal range [mean of monthly (max temp - min temp)] (in degrees Celsius)
      - bio_3 = isothermality ((bio_2/bio_7) *100) (dimensionless)
      - bio_4 = temperature seasonality (standard deviation * 100) (in degrees Celsius)
      - bio_5 = max temperature of warmest month (in degrees Celsius)
      - bio_6 = min temperature of coldest month (in degrees Celsius)
      - bio_7 = temperature annual range (bio_5 - bio_6) (in degrees Celsius)
      - bio_8 = mean temperature of wettest quarter (in degrees Celsius)
      - bio_9 = mean temperature of driest quarter (in degrees Celsius)
      - bio_10 = mean temperature of warmest quarter (in degrees Celsius)
      - bio_11 = mean temperature of coldest quarter (in degrees Celsius)
      - bio_12 = annual precipitation (in millimeters)
      - bio_13 = precipitation of wettest month (in millimeters)
      - bio_14 = precipitation of driest month (in millimeters)
      - bio_15 = precipitation seasonality (coefficient of variation) (fraction)
      - bio_16 = precipitation of wettest quarter (in millimeters)
      - bio_17 = precipitation of driest quarter (in millimeters)
      - bio_19 = precipitation of coldest quarter (in millimeters)
    - 'max_temperature' folder -
      - tmax_03 = average max temperature of March (in degrees Celsius)
      - tmax_10 = average max temperature of October (in degrees Celsius)
      - tmax_11 = average max temperature of November (in degrees Celsius)
    - 'min_temperature' folder -
      - tmin_03 = average min temperature of March (in degrees Celsius)
      - tmin_10 = average min temperature of October (in degrees Celsius)
      - tmin_11 = average min temperature of November (in degrees Celsius)
    - 'precipitation' folder -
      - precip_01 = total precipitation of January (in millimeters)
      - precip_02 = total precipitation of February (in millimeters)
      - precip_03 = total precipitation of March (in millimeters)
      - precip_10 = total precipitation of October (in millimeters)
      - precip_11 = total precipitation of November (in millimeters)
      - precip_12 = total precipitation of December (in millimeters)
  - 'fio_esm_2_0' folder - bioclimatic data and monthly climatic data (i.e., maximum temperature, minimum temperature, precipitation) are CMIP6 (https://www.carbonbrief.org/cmip6-the-next-generation-of-climate-models-explained/) future climate data from the FIO-ESM-2-0 global coupled climate model (Bao et al. 2020) for 2041-2060. Data source: https://www.worldclim.org/data/cmip6/cmip6_clim30s.html.
    - 'bioclimatic' folder -
      - bio_1 = annual mean temperature (in degrees Celsius)
      - bio_2 = mean diurnal range [mean of monthly (max temp - min temp)] (in degrees Celsius)
      - bio_3 = isothermality ((bio_2/bio_7) *100) (dimensionless)
      - bio_4 = temperature seasonality (standard deviation * 100) (in degrees Celsius)
      - bio_5 = max temperature of warmest month (in degrees Celsius)
      - bio_6 = min temperature of coldest month (in degrees Celsius)
      - bio_7 = temperature annual range (bio_5 - bio_6) (in degrees Celsius)
      - bio_8 = mean temperature of wettest quarter (in degrees Celsius)
      - bio_9 = mean temperature of driest quarter (in degrees Celsius)
      - bio_10 = mean temperature of warmest quarter (in degrees Celsius)
      - bio_11 = mean temperature of coldest quarter (in degrees Celsius)
      - bio_12 = annual precipitation (in millimeters)
      - bio_13 = precipitation of wettest month (in millimeters)
      - bio_14 = precipitation of driest month (in millimeters)
      - bio_15 = precipitation seasonality (coefficient of variation) (fraction)
      - bio_16 = precipitation of wettest quarter (in millimeters)
      - bio_17 = precipitation of driest quarter (in millimeters)
      - bio_19 = precipitation of coldest quarter (in millimeters)
    - 'max_temperature' folder -
      - tmax_03 = average max temperature of March (in degrees Celsius)
      - tmax_10 = average max temperature of October (in degrees Celsius)
      - tmax_11 = average max temperature of November (in degrees Celsius)
    - 'min_temperature' folder -
      - tmin_03 = average min temperature of March (in degrees Celsius)
      - tmin_10 = average min temperature of October (in degrees Celsius)
      - tmin_11 = average min temperature of November (in degrees Celsius)
    - 'precipitation' folder -
      - precip_01 = total precipitation of January (in millimeters)
      - precip_02 = total precipitation of February (in millimeters)
      - precip_03 = total precipitation of March (in millimeters)
      - precip_10 = total precipitation of October (in millimeters)
      - precip_11 = total precipitation of November (in millimeters)
      - precip_12 = total precipitation of December (in millimeters)
  - 'miroc_es2l' folder - bioclimatic data and monthly climatic data (i.e., maximum temperature, minimum temperature, precipitation) are CMIP6 (https://www.carbonbrief.org/cmip6-the-next-generation-of-climate-models-explained/) future climate data from the MIROC-ES2L global coupled climate model (Hajima et al. 2020) for 2041-2060. Data source: https://www.worldclim.org/data/cmip6/cmip6_clim30s.html.
    - 'bioclimatic' folder -
      - bio_1 = annual mean temperature (in degrees Celsius)
      - bio_2 = mean diurnal range [mean of monthly (max temp - min temp)] (in degrees Celsius)
      - bio_3 = isothermality ((bio_2/bio_7) *100) (dimensionless)
      - bio_4 = temperature seasonality (standard deviation * 100) (in degrees Celsius)
      - bio_5 = max temperature of warmest month (in degrees Celsius)
      - bio_6 = min temperature of coldest month (in degrees Celsius)
      - bio_7 = temperature annual range (bio_5 - bio_6) (in degrees Celsius)
      - bio_8 = mean temperature of wettest quarter (in degrees Celsius)
      - bio_9 = mean temperature of driest quarter (in degrees Celsius)
      - bio_10 = mean temperature of warmest quarter (in degrees Celsius)
      - bio_11 = mean temperature of coldest quarter (in degrees Celsius)
      - bio_12 = annual precipitation (in millimeters)
      - bio_13 = precipitation of wettest month (in millimeters)
      - bio_14 = precipitation of driest month (in millimeters)
      - bio_15 = precipitation seasonality (coefficient of variation) (fraction)
      - bio_16 = precipitation of wettest quarter (in millimeters)
      - bio_17 = precipitation of driest quarter (in millimeters)
      - bio_19 = precipitation of coldest quarter (in millimeters)
    - 'max_temperature' folder -
      - tmax_03 = average max temperature of March (in degrees Celsius)
      - tmax_10 = average max temperature of October (in degrees Celsius)
      - tmax_11 = average max temperature of November (in degrees Celsius)
    - 'min_temperature' folder -
      - tmin_03 = average min temperature of March (in degrees Celsius)
      - tmin_10 = average min temperature of October (in degrees Celsius)
      - tmin_11 = average min temperature of November (in degrees Celsius)
    - 'precipitation' folder -
      - precip_01 = total precipitation of January (in millimeters)
      - precip_02 = total precipitation of February (in millimeters)
      - precip_03 = total precipitation of March (in millimeters)
      - precip_10 = total precipitation of October (in millimeters)
      - precip_11 = total precipitation of November (in millimeters)
      - precip_12 = total precipitation of December (in millimeters)
- 'ssp5_rcp8_5_2050' folder - contains three subfolders pertaining to the three global coupled climate models (CMCC-ESM2, FIO-ESM-2-0, MIROC-ES2L) and the 'global_SSP5_RCP85_2050.tif' plant functional types product from Chen et al. 2022 (https://doi.org/10.5281/zenodo.4584775). The plant functional types dataset should be applied to each global coupled climate model scenario separately when working with future projections.
  - 'cmcc_esm2' folder - bioclimatic data and monthly climatic data (i.e., maximum temperature, minimum temperature, precipitation) are CMIP6 (https://www.carbonbrief.org/cmip6-the-next-generation-of-climate-models-explained/) future climate data from the CMCC-ESM2 global coupled climate model (Cherchi et al. 2019) for 2041-2060. Data source: https://www.worldclim.org/data/cmip6/cmip6_clim30s.html.
    - 'bioclimatic' folder -
      - bio_1 = annual mean temperature (in degrees Celsius)
      - bio_2 = mean diurnal range [mean of monthly (max temp - min temp)] (in degrees Celsius)
      - bio_3 = isothermality ((bio_2/bio_7) *100) (dimensionless)
      - bio_4 = temperature seasonality (standard deviation * 100) (in degrees Celsius)
      - bio_5 = max temperature of warmest month (in degrees Celsius)
      - bio_6 = min temperature of coldest month (in degrees Celsius)
      - bio_7 = temperature annual range (bio_5 - bio_6) (in degrees Celsius)
      - bio_8 = mean temperature of wettest quarter (in degrees Celsius)
      - bio_9 = mean temperature of driest quarter (in degrees Celsius)
      - bio_10 = mean temperature of warmest quarter (in degrees Celsius)
      - bio_11 = mean temperature of coldest quarter (in degrees Celsius)
      - bio_12 = annual precipitation (in millimeters)
      - bio_13 = precipitation of wettest month (in millimeters)
      - bio_14 = precipitation of driest month (in millimeters)
      - bio_15 = precipitation seasonality (coefficient of variation) (fraction)
      - bio_16 = precipitation of wettest quarter (in millimeters)
      - bio_17 = precipitation of driest quarter (in millimeters)
      - bio_19 = precipitation of coldest quarter (in millimeters)
    - 'max_temperature' folder -
      - tmax_03 = average max temperature of March (in degrees Celsius)
      - tmax_10 = average max temperature of October (in degrees Celsius)
      - tmax_11 = average max temperature of November (in degrees Celsius)
    - 'min_temperature' folder -
      - tmin_03 = average min temperature of March (in degrees Celsius)
      - tmin_10 = average min temperature of October (in degrees Celsius)
      - tmin_11 = average min temperature of November (in degrees Celsius)
    - 'precipitation' folder -
      - precip_01 = total precipitation of January (in millimeters)
      - precip_02 = total precipitation of February (in millimeters)
      - precip_03 = total precipitation of March (in millimeters)
      - precip_10 = total precipitation of October (in millimeters)
      - precip_11 = total precipitation of November (in millimeters)
      - precip_12 = total precipitation of December (in millimeters)
  - 'fio_esm_2_0' folder - bioclimatic data and monthly climatic data (i.e., maximum temperature, minimum temperature, precipitation) are CMIP6 (https://www.carbonbrief.org/cmip6-the-next-generation-of-climate-models-explained/) future climate data from the FIO-ESM-2-0 global coupled climate model (Bao et al. 2020) for 2041-2060. Data source: https://www.worldclim.org/data/cmip6/cmip6_clim30s.html.
    - 'bioclimatic' folder -
      - bio_1 = annual mean temperature (in degrees Celsius)
      - bio_2 = mean diurnal range [mean of monthly (max temp - min temp)] (in degrees Celsius)
      - bio_3 = isothermality ((bio_2/bio_7) *100) (dimensionless)
      - bio_4 = temperature seasonality (standard deviation * 100) (in degrees Celsius)
      - bio_5 = max temperature of warmest month (in degrees Celsius)
      - bio_6 = min temperature of coldest month (in degrees Celsius)
      - bio_7 = temperature annual range (bio_5 - bio_6) (in degrees Celsius)
      - bio_8 = mean temperature of wettest quarter (in degrees Celsius)
      - bio_9 = mean temperature of driest quarter (in degrees Celsius)
      - bio_10 = mean temperature of warmest quarter (in degrees Celsius)
      - bio_11 = mean temperature of coldest quarter (in degrees Celsius)
      - bio_12 = annual precipitation (in millimeters)
      - bio_13 = precipitation of wettest month (in millimeters)
      - bio_14 = precipitation of driest month (in millimeters)
      - bio_15 = precipitation seasonality (coefficient of variation) (fraction)
      - bio_16 = precipitation of wettest quarter (in millimeters)
      - bio_17 = precipitation of driest quarter (in millimeters)
      - bio_19 = precipitation of coldest quarter (in millimeters)
    - 'max_temperature' folder -
      - tmax_03 = average max temperature of March (in degrees Celsius)
      - tmax_10 = average max temperature of October (in degrees Celsius)
      - tmax_11 = average max temperature of November (in degrees Celsius)
    - 'min_temperature' folder -
      - tmin_03 = average min temperature of March (in degrees Celsius)
      - tmin_10 = average min temperature of October (in degrees Celsius)
      - tmin_11 = average min temperature of November (in degrees Celsius)
    - 'precipitation' folder -
      - precip_01 = total precipitation of January (in millimeters)
      - precip_02 = total precipitation of February (in millimeters)
      - precip_03 = total precipitation of March (in millimeters)
      - precip_10 = total precipitation of October (in millimeters)
      - precip_11 = total precipitation of November (in millimeters)
      - precip_12 = total precipitation of December (in millimeters)
  - 'miroc_es2l' folder - bioclimatic data and monthly climatic data (i.e., maximum temperature, minimum temperature, precipitation) are CMIP6 (https://www.carbonbrief.org/cmip6-the-next-generation-of-climate-models-explained/) future climate data from the MIROC-ES2L global coupled climate model (Hajima et al. 2020) for 2041-2060. Data source: https://www.worldclim.org/data/cmip6/cmip6_clim30s.html.
    - 'bioclimatic' folder -
      - bio_1 = annual mean temperature (in degrees Celsius)
      - bio_2 = mean diurnal range [mean of monthly (max temp - min temp)] (in degrees Celsius)
      - bio_3 = isothermality ((bio_2/bio_7) *100) (dimensionless)
      - bio_4 = temperature seasonality (standard deviation * 100) (in degrees Celsius)
      - bio_5 = max temperature of warmest month (in degrees Celsius)
      - bio_6 = min temperature of coldest month (in degrees Celsius)
      - bio_7 = temperature annual range (bio_5 - bio_6) (in degrees Celsius)
      - bio_8 = mean temperature of wettest quarter (in degrees Celsius)
      - bio_9 = mean temperature of driest quarter (in degrees Celsius)
      - bio_10 = mean temperature of warmest quarter (in degrees Celsius)
      - bio_11 = mean temperature of coldest quarter (in degrees Celsius)
      - bio_12 = annual precipitation (in millimeters)
      - bio_13 = precipitation of wettest month (in millimeters)
      - bio_14 = precipitation of driest month (in millimeters)
      - bio_15 = precipitation seasonality (coefficient of variation) (fraction)
      - bio_16 = precipitation of wettest quarter (in millimeters)
      - bio_17 = precipitation of driest quarter (in millimeters)
      - bio_19 = precipitation of coldest quarter (in millimeters)
    - 'max_temperature' folder -
      - tmax_03 = average max temperature of March (in degrees Celsius)
      - tmax_10 = average max temperature of October (in degrees Celsius)
      - tmax_11 = average max temperature of November (in degrees Celsius)
    - 'min_temperature' folder -
      - tmin_03 = average min temperature of March (in degrees Celsius)
      - tmin_10 = average min temperature of October (in degrees Celsius)
      - tmin_11 = average min temperature of November (in degrees Celsius)
    - 'precipitation' folder -
      - precip_01 = total precipitation of January (in millimeters)
      - precip_02 = total precipitation of February (in millimeters)
      - precip_03 = total precipitation of March (in millimeters)
      - precip_10 = total precipitation of October (in millimeters)
      - precip_11 = total precipitation of November (in millimeters)
      - precip_12 = total precipitation of December (in millimeters)
- elevation = derived from a ~1-square kilometer digital elevation model in ArcGIS Pro 3.0.0 (in meters)
- slope1 = derived from a ~1-square kilometer digital elevation model in ArcGIS Pro 3.0.0 (in degrees, rate of change of elevation)
'presence_record_CSVs' folder - presence record CSVs were developed in the 'Bird Occurrence Data' methods section and are used in the principal component analysis, ensemble modeling, and spatial autocorrelation steps. For each species, they are a combination of eBird records and species-specific georeferenced occurrence datasets (see Supplementary Material Table S1 for more information).
- Canada_Warbler_2012_2021.csv = three-column CSV dataset with 'Longitude', 'Latitude', and a four-letter alpha code (https://www.birdpop.org/docs/misc/Alpha_codes_eng.pdf) column with value = 1 for each presence record. Contains 1,586 unique Canada Warbler presence records that have been resampled to a ~1-square kilometer resolution to remove spatial bias in ensemble modeling.
- Cerulean_Warbler_2012_2021.csv = three-column CSV dataset with 'Longitude', 'Latitude', and a four-letter alpha code (https://www.birdpop.org/docs/misc/Alpha_codes_eng.pdf) column with value = 1 for each presence record. Contains 546 unique Cerulean Warbler presence records that have been resampled to a ~1-square kilometer resolution to remove spatial bias in ensemble modeling.
- Female_Golden_winged_Warbler_2012_2021.csv = three-column CSV dataset with 'Longitude', 'Latitude', and a four-letter alpha code (https://www.birdpop.org/docs/misc/Alpha_codes_eng.pdf) column with value = 1 for each presence record. Contains 192 unique female Golden-winged Warbler presence records that have been resampled to a ~1-square kilometer resolution to remove spatial bias in ensemble modeling.
- Male_Golden_winged_Warbler_2012_2021.csv = three-column CSV dataset with 'Longitude', 'Latitude', and a four-letter alpha code (https://www.birdpop.org/docs/misc/Alpha_codes_eng.pdf) column with value = 1 for each presence record. Contains 283 unique male Golden-winged Warbler presence records that have been resampled to a ~1-square kilometer resolution to remove spatial bias in ensemble modeling.
- Wood_Thrush_2012_2021.csv = three-column CSV dataset with 'Longitude', 'Latitude', and a four-letter alpha code (https://www.birdpop.org/docs/misc/Alpha_codes_eng.pdf) column with value = 1 for each presence record. Contains 3,158 unique Wood Thrush presence records that have been resampled to a ~1-square kilometer resolution to remove spatial bias in ensemble modeling.
'ProcessestoIdentifyNear_termConservationPriorityAreas_Honduras_FocalAreas' folder - Focal areas in Honduras were previously delineated for the Golden-winged Warbler nonbreeding season conservation plan (Bennett et al. 2016) and were used in Brodie et al. 2024 for the 'Processes to Identify Near-term Conservation Priority Areas' methods section. Each focal area code (e.g., HO06) corresponds to a focal area name (e.g., La Muralla). Non-.shp files below are accessory files for the .shp file and do not require an explanation for this research.
- Honduras_FocalAreas.cpg
- Honduras_FocalAreas.dbf
- Honduras_FocalAreas.prj
- Honduras_FocalAreas.sbn
- Honduras_FocalAreas.sbx
- Honduras_FocalAreas.shp = shapefile used for the 'Processes to Identify Near-term Conservation Priority Areas' methods section. The shapefile includes many attributes in the attribute table, but the following are the most important: FA_Name_Ne (name of the focal area), and FA_ID_new (Focal Area code).
- Honduras_FocalAreas.shp.xml
- Honduras_FocalAreas.shx
'spatial_extents' folder - Spatial extents are used throughout the process to crop/mask covariates and presence records and act as the environmental space for ensemble modeling. Once ensemble modeling and covariate evaluations are complete, spatial extents are no longer necessary and modeling products can be used for remaining analyses. Non-.shp files below are accessory files for the .shp file and do not require an explanation for this research.
- Canada_Warbler_spatialExtent.cpg
- Canada_Warbler_spatialExtent.dbf
- Canada_Warbler_spatialExtent.prj
- Canada_Warbler_spatialExtent.sbn
- Canada_Warbler_spatialExtent.sbx
- Canada_Warbler_spatialExtent.shp = shapefile used to crop/mask covariates and presence records and act as the environmental space for Canada Warbler ensemble modeling.
- Canada_Warbler_spatialExtent.shp.xml
- Canada_Warbler_spatialExtent.shx
- Cerulean_Warbler_spatialExtent.cpg
- Cerulean_Warbler_spatialExtent.dbf
- Cerulean_Warbler_spatialExtent.prj
- Cerulean_Warbler_spatialExtent.sbn
- Cerulean_Warbler_spatialExtent.sbx
- Cerulean_Warbler_spatialExtent.shp = shapefile used to crop/mask covariates and presence records and act as the environmental space for Cerulean Warbler ensemble modeling.
- Cerulean_Warbler_spatialExtent.shp.xml
- Cerulean_Warbler_spatialExtent.shx
- Female_Golden_winged_Warbler_spatialExtent.cpg
- Female_Golden_winged_Warbler_spatialExtent.dbf
- Female_Golden_winged_Warbler_spatialExtent.prj
- Female_Golden_winged_Warbler_spatialExtent.sbn
- Female_Golden_winged_Warbler_spatialExtent.sbx
- Female_Golden_winged_Warbler_spatialExtent.shp = shapefile used to crop/mask covariates and presence records and act as the environmental space for Female Golden-winged Warbler ensemble modeling.
- Female_Golden_winged_Warbler_spatialExtent.shp.xml
- Female_Golden_winged_Warbler_spatialExtent.shx
- Male_Golden_winged_Warbler_spatialExtent.cpg
- Male_Golden_winged_Warbler_spatialExtent.dbf
- Male_Golden_winged_Warbler_spatialExtent.prj
- Male_Golden_winged_Warbler_spatialExtent.sbn
- Male_Golden_winged_Warbler_spatialExtent.sbx
- Male_Golden_winged_Warbler_spatialExtent.shp = shapefile used to crop/mask covariates and presence records and act as the environmental space for Male Golden-winged Warbler ensemble modeling.
- Male_Golden_winged_Warbler_spatialExtent.shp.xml
- Male_Golden_winged_Warbler_spatialExtent.shx
- Wood_Thrush_spatialExtent.cpg
- Wood_Thrush_spatialExtent.dbf
- Wood_Thrush_spatialExtent.prj
- Wood_Thrush_spatialExtent.sbn
- Wood_Thrush_spatialExtent.sbx
- Wood_Thrush_spatialExtent.shp = shapefile used to crop/mask covariates and presence records and act as the environmental space for Wood Thrush ensemble modeling.
- Wood_Thrush_spatialExtent.shp.xml
- Wood_Thrush_spatialExtent.shx
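
The presence record CSVs described above share a three-column layout and were resampled to a ~1-square kilometer resolution. The Python sketch below reads that layout and keeps one record per grid cell. It is an illustration only: the 30 arc-second cell size, the "keep the first record" rule, the assumption that the third column header is the alpha code (e.g., CAWA for Canada Warbler), and the example coordinates are all hypothetical, and the authors' actual resampling procedure lives in their R code.

```python
import csv
import io
import math

CELL_DEG = 1.0 / 120.0  # 30 arc-seconds, roughly 1 km at the equator (assumed grid)


def read_presence_records(handle):
    """Read a three-column presence CSV: Longitude, Latitude, and one species-code column."""
    reader = csv.DictReader(handle)
    code_columns = [c for c in reader.fieldnames if c not in ("Longitude", "Latitude")]
    if len(code_columns) != 1:
        raise ValueError("expected exactly one species-code column")
    code = code_columns[0]
    records = []
    for row in reader:
        if int(row[code]) != 1:
            raise ValueError("every presence record should have value 1")
        records.append((float(row["Longitude"]), float(row["Latitude"])))
    return code, records


def thin_to_grid(records, cell_deg=CELL_DEG):
    """Keep the first record that falls in each grid cell (one simple thinning rule)."""
    seen = set()
    kept = []
    for lon, lat in records:
        cell = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        if cell not in seen:
            seen.add(cell)
            kept.append((lon, lat))
    return kept


# Hypothetical rows for illustration; these are not taken from the dataset.
example = io.StringIO(
    "Longitude,Latitude,CAWA\n"
    "-75.5001,5.2001,1\n"
    "-75.5002,5.2002,1\n"
    "-75.1234,5.9876,1\n"
)
code, records = read_presence_records(example)
thinned = thin_to_grid(records)
```

In this example the first two rows fall in the same cell, so only one of them is kept. The published CSVs are already resampled, so a check like this should leave their record counts unchanged if the assumed grid matches the one the authors used.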

Additional notes: 1) In bioclimatic data, bio_18 (precipitation of warmest quarter) is not included as a covariate because the principal component analysis for each individual species did not retain bio_18 for ensemble modeling. 2) All data are in geographic coordinate system WGS 84 (EPSG:4326; https://epsg.io/4326), but products from these data will be in projected coordinate system 'Pseudo-Mercator' (EPSG:3857; https://epsg.io/3857) or any projected coordinate system that is chosen. Be cognizant of this change during additional analyses in ArcGIS Pro.
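
To make the coordinate system change in note 2 concrete, the Python sketch below applies the standard spherical Pseudo-Mercator forward equations to a WGS 84 longitude/latitude pair and reports the local linear scale factor. The function names are hypothetical. In practice a GIS or projection library would do this, so treat it as a way to see what changes rather than as a replacement for those tools.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857 (WGS 84 semi-major axis)
MAX_LATITUDE = 85.06        # EPSG:3857 is only defined to roughly +/-85.06 degrees


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Convert WGS 84 degrees (EPSG:4326) to Pseudo-Mercator metres (EPSG:3857)."""
    if not -MAX_LATITUDE <= lat_deg <= MAX_LATITUDE:
        raise ValueError("latitude outside the valid Pseudo-Mercator range")
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0))
    return x, y


def mercator_scale_factor(lat_deg):
    """Linear scale distortion at a latitude; map areas are inflated by its square."""
    return 1.0 / math.cos(math.radians(lat_deg))
```

For example, at 15 degrees latitude the linear scale factor is about 1.035, so distances measured directly in EPSG:3857 map units are overstated by roughly 3.5% and areas by roughly 7%. This is one reason to choose an equal-area projection when summarizing the extent of suitable conditions.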

Sharing/Access information

Data were derived from the following sources:

monthly climatic data - Fick and Hijmans 2017
bioclimatic data - Fick and Hijmans 2017
topographic data - Fick and Hijmans 2017, Esri Inc. 2022
plant functional types data - Chen et al. 2022
spatial extents - known current stationary nonbreeding locations (Fink et al. 2022), 200-km buffer distance around presence records
presence record CSVs - eBird (accessed in January 2023), species-specific georeferenced occurrence datasets (see Supplementary Material Table S1 for more information)
Honduras focal areas - Golden-winged Warbler nonbreeding season conservation plan (Bennett et al. 2016)

Code/Software

R code files are prefixed with a number (e.g., 1_) that corresponds to the step in the methods (e.g., BirdOccurrenceData). If you are using the presence record CSVs and spatial extents for Canada Warbler, Cerulean Warbler, Golden-winged Warbler, or Wood Thrush, you may proceed directly to step 3 (i.e., 3_EnsembleModelingandProjectedDistributions) and complete steps 4-6 if interested. If you are applying these methods to a different species, proceed through steps 1-3 and complete steps 4-6 if interested.

IMPORTANT: some of the code files contain notes describing additional steps to follow outside R. Keep an eye out for these steps.

IMPORTANT: the rest of the analyses (e.g., developing 3 spatial layers for comparison of predicted occupied current ranges to predicted current suitable conditions) can be completed in ArcGIS Pro given the method descriptions provided in the paper.

The R code files include the necessary library calls, but the libraries are listed below for convenience. Brodie et al. 2024 used R v4.2.1 (2022) for analyses.

1_BirdOccurrenceData: auk, lubridate, sf, gridExtra, tidyverse, dplyr
2_ClimateandLandUseLandCoverData_PCA: biomod2, ggplot2, gridExtra, raster, sf, rgdal, ade4, factoextra, magrittr
3_EnsembleModelingandProjectedDistributions: biomod2, ggplot2, gridExtra, raster, sf, rgdal, rasterVis, dplyr, readr, tidyterra
4a_SpatialAutocorrelation_Step1: biomod2, dplyr
4b_SpatialAutocorrelation_Step2: biomod2, ape, raster, terra, rgdal, ncf, sf, dplyr
5a_DistributionShiftsDuetoChangesinSuitableConditions_LatLong: raster, sp, rgdal, tidyverse
5b_DistributionShiftsDuetoChangesinSuitableConditions_Histograms: ggplot2, cowplot, tidyverse, scales, reshape2
6_ComparisonofPredictedOccupiedCurrentRangestoPredictedCurrentSuitableConditions_OccupiedCurrentRaneDev: ebirdst, raster, dplyr, tidyverse, sf, sp, rgdal

Bird Occurrence Data

We obtained current (2012 to 2021) bird occurrence data containing only Neotropical presence records from eBird (accessed in January 2023; Sullivan et al. 2009) and supplemented with species-specific georeferenced occurrence datasets to bolster presence record sample sizes and the spatial representation of records. 2012 to 2021 was identified as the “current” timeframe to capitalize on increased user engagement with eBird and align with prior research (Hightower et al. 2023). Date ranges for the stationary nonbreeding period were defined using expert input (N. Bayly, E. Cohen, I. Davidson, A. González, J. Hightower, J. L. Larkin, E. Montenegro, D. Raybuck, A. Roth, C. Rushing, C. Stanley, R. L. M. Stewart, and S. Wilson personal communication) to assess frequency distributions of daily presence records in the current timeframe. Experts emphasized date selection 2 weeks before or after most birds initiated or completed migration through the Neotropical flyway to minimize the signal from areas used during migration (C. canadensis: November 16 to March 17, S. cerulea: October 25 to March 10, V. chrysoptera: October 28 to March 31, H. mustelina: November 5 to March 28).

eBird occurrence data were filtered in R with the auk package (Strimas-Mackey et al. 2018, R Core Team 2022) to select presence records collected using “traveling,” “stationary,” and “incidental” protocols with observer effort distances ≤2 km (Medina et al. 2023). Duplicate records, as well as outlier records from areas outside of known stationary nonbreeding locations (Fink et al. 2022), were removed. We added the species-specific datasets to filtered eBird datasets and resampled all presence records to a 1-km² resolution (Fick and Hijmans 2017). The final dataset included 5,765 unique presence records for the current timeframe (C. canadensis: n = 1,586, S. cerulea: n = 546, V. chrysoptera ♀: n = 192, V. chrysoptera ♂: n = 283, H. mustelina: n = 3,158). We partitioned V. chrysoptera records by sex as it is a sexually dimorphic species allowing for possible identification by plumage. The sexes are known to segregate by habitat and elevation resulting in conservation planning bias in favor of higher elevations for males (Bennett et al. 2019). Thus, we removed records that did not specify sex (n = 1,860).
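The resampling step can be pictured as keeping one presence record per grid cell. The sketch below is a simplified Python illustration, not the authors' R code in 1_BirdOccurrenceData: it assumes a global 30 arc-second grid anchored at (-180, 90), and the function names and example coordinates are hypothetical. The per-species sample sizes are the ones reported above.

```python
import math

CELL_DEG = 30 / 3600  # 30 arc-seconds expressed in degrees (~1 km at the equator)


def cell_index(lon, lat, cell=CELL_DEG):
    """Return the (column, row) of the grid cell that contains a point."""
    return (math.floor((lon + 180) / cell), math.floor((90 - lat) / cell))


def thin_to_grid(records):
    """Keep the first record encountered in each grid cell."""
    kept = {}
    for lon, lat in records:
        kept.setdefault(cell_index(lon, lat), (lon, lat))
    return list(kept.values())


# Hypothetical records: the first two fall in the same cell.
records = [(-75.5001, 5.2001), (-75.5002, 5.2002), (-75.60, 5.30)]
print(len(thin_to_grid(records)))

# Reported unique presence records per modeled group.
sample_sizes = {
    "C. canadensis": 1586,
    "S. cerulea": 546,
    "V. chrysoptera female": 192,
    "V. chrysoptera male": 283,
    "H. mustelina": 3158,
}
print(sum(sample_sizes.values()))  # 5765
```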

Climate and Land Use/Land Cover Data

We downloaded historical (1970 to 2000 averages) monthly climatic and bioclimatic raster datasets at a 30-arc second (~1-km²) spatial resolution from the WorldClim data repository (Fick and Hijmans 2017). Historical climate data aided predictions of current climatic and LULC conditions with documented ecological patterns (Acevedo et al. 2012). Bioclimatic covariates were selected based on literature review, expert input, principal component analysis (PCA) correlation circles, and predictor contribution percentages. PCA correlation circles and predictor contribution percentages were used to identify multicollinearities among bioclimatic covariates (Fick and Hijmans 2017, Guisan et al. 2017). We selected bioclimatic covariates for ensemble modeling (Thuiller et al. 2009, Guisan et al. 2017) that were above the expected average contribution percentage, a product of the covariate eigenvalues (Dray et al. 2023). Further, elevation and slope were derived from a ~1-km² digital elevation model (Fick and Hijmans 2017) in ArcGIS Pro 3.0.0 (Esri Inc. 2022). Global LULC projections based on simulations of 16 plant functional types (e.g., forest, grassland, and cropland) and urban expansion (see figure 2 in Chen et al. 2022) were included to simulate effects of LULC change. We used the resulting covariates in ensemble modeling to capture species responses from the current timeframe based on historical climate (C. canadensis: n = 23 covariates, S. cerulea: n = 24, ♀ V. chrysoptera: n = 23, ♂ V. chrysoptera: n = 23, H. mustelina: n = 26).
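As a rough illustration of the contribution threshold, the sketch below assumes one common convention: per-axis contribution percentages are combined with eigenvalue weights, and a covariate is retained when its combined contribution exceeds the value expected if all covariates contributed equally (100 / number of covariates). Whether this matches the exact calculation in 2_ClimateandLandUseLandCoverData_PCA is an assumption; the function names, contribution values, and eigenvalues are hypothetical and written in Python for illustration only.

```python
def weighted_contributions(contrib_by_axis, eigenvalues):
    """Combine per-axis contribution percentages, weighting each axis by its eigenvalue."""
    total = sum(eigenvalues)
    names = contrib_by_axis[0].keys()
    return {
        name: sum(axis[name] * eig for axis, eig in zip(contrib_by_axis, eigenvalues)) / total
        for name in names
    }


def select_covariates(contrib_by_axis, eigenvalues):
    """Keep covariates whose combined contribution exceeds the uniform expectation."""
    combined = weighted_contributions(contrib_by_axis, eigenvalues)
    expected = 100 / len(combined)  # expected average contribution (%)
    return sorted(name for name, value in combined.items() if value > expected)


# Hypothetical contribution percentages on two principal axes.
axis_1 = {"bio_1": 40, "bio_12": 35, "bio_4": 15, "bio_15": 10}
axis_2 = {"bio_1": 10, "bio_12": 30, "bio_4": 50, "bio_15": 10}
print(select_covariates([axis_1, axis_2], eigenvalues=[3.0, 1.0]))  # ['bio_1', 'bio_12']
```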

For future (2050) climatic and LULC conditions, we obtained climatic datasets (2041 to 2060 averages) identical to the historical dataset from WorldClim for 3 individual GCCMs: the CMCC-ESM2 (Cherchi et al. 2019), the FIO-ESM-2-0 (Bao et al. 2020), and the MIROC-ES2L (Hajima et al. 2020). For each GCCM, we used 2 SSP-RCP scenarios for 2041 to 2060, which represent independent climatic futures: SSP2-RCP4.5 and SSP5-RCP8.5 (Fick and Hijmans 2017). SSP2-RCP4.5 (hereinafter, best-case) represents a future where climate-smart practices increase and nonrenewable resource use declines (Van Vuuren et al. 2011, Riahi et al. 2017). In contrast, SSP5-RCP8.5 (hereinafter, worst-case) represents a future where technological advances and increased fossil fuel extraction lead to maximum global emissions (Van Vuuren et al. 2011, Riahi et al. 2017).

The spatial extent used to extract climate and LULC data was identical among scenarios to project current species responses onto future climates (Guisan et al. 2017, Hightower et al. 2023). To accommodate potential distribution shifts in latitude and longitude by 2050, we initially included areas in the periphery of current stationary nonbreeding locations (Fink et al. 2022) for the spatial extents of each focal species. Preliminary analyses resulted in extralimital projections of species occurrence when suitable climatic and LULC conditions occurred well outside the current distribution of each species. To limit these projections, we defined the northern and southern termini of each spatial extent with a combination of the unique presence records and known current stationary nonbreeding locations (Fink et al. 2022). We applied a spatial constraint that prevented extralimital projections of occurrence that exceeded 200 km from known occurrences, but we filled gaps in presence record coverage in areas the species are known to occupy (Fink et al. 2022). The 200-km distance was selected to accommodate reasonable dispersal distances for each species in the current timeframe (Barbet-Massin et al. 2012, Freeman et al. 2018).

Ensemble Modeling and Projected Distributions

We used an ensemble modeling framework within the R package biomod2 (Thuiller et al. 2009, Guisan et al. 2017) to model current and future projections of suitable climatic and LULC conditions for the 4 focal bird species (V. chrysoptera ♀ and ♂ separately). To address multicollinearity and biases in ecological studies (Fotheringham and Oshan 2016), we incorporated 4 successful modeling algorithms (Qiao et al. 2015, Guisan et al. 2017): generalized linear model (GLM), generalized boosting model (GBM), generalized additive model (GAM), and random forest (RF). Default settings in biomod2 were kept for GBM and RF, while settings were modified for GLM and GAM: We set the relationship between presence records and covariates to a polynomial function for GLM (Hightower et al. 2023), while the GAM modeling function was set to GAM_mgcv (Wood 2017).

Predictive performance of individual models. For each modeling algorithm plus 1 full model (i.e., a model calibrated and validated over an entire pseudo-absence dataset), we used 5 k-fold cross-validations with 70% and 30% of the occurrence records allocated for training and validation, respectively (Guisan et al. 2017). We evaluated modeling algorithm performance using the true skill statistic (TSS) and receiver operating characteristic (ROC) metrics, where TSS values > 0.6 are considered good and ROC values > 0.9 are considered excellent (Thuiller et al. 2009, Guisan et al. 2017).
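For readers unfamiliar with the true skill statistic, it is sensitivity plus specificity minus 1, computed from a confusion matrix of predicted versus observed presences and (pseudo-)absences. The Python sketch below only illustrates that definition; the function name and the confusion-matrix counts are hypothetical, and biomod2 computes this metric internally.

```python
def true_skill_statistic(tp, fn, tn, fp):
    """TSS = sensitivity + specificity - 1, ranging from -1 to 1."""
    sensitivity = tp / (tp + fn)  # proportion of presences predicted as present
    specificity = tn / (tn + fp)  # proportion of absences predicted as absent
    return sensitivity + specificity - 1


# Hypothetical confusion matrix for one held-out validation split.
tss = true_skill_statistic(tp=90, fn=10, tn=80, fp=20)
print(round(tss, 2))  # 0.7
```

A value of 0.7 would exceed the 0.6 threshold described above, whereas a model that predicts no better than chance would score near 0.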

We randomly generated pseudo-absence points in the modeling framework due to limited true-absence records in the Neotropics during the current timeframe. The numbers of pseudo-absences and presence records were roughly equal to aid in decision tree dynamics for GBM and RF (Barbet-Massin et al. 2012). Pseudo-absences were generated within a radius of 200 km from presence records, but no pseudo-absences were generated within the same 30 arc-second (~1 km²) pixel as a presence record (Hightower et al. 2023). The maximum radius of 200 km permitted the modeling algorithms to train in different climatic and LULC conditions within reasonable dispersal distances in the Neotropics (Barbet-Massin et al. 2012, Freeman et al. 2018).
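The two constraints on pseudo-absences (within 200 km of a presence record, never in a pixel that holds a presence record) can be sketched with simple rejection sampling. This Python example is an illustration under stated assumptions, not the biomod2 routine used by the authors: candidate points are drawn uniformly in longitude and latitude from a padded bounding box, distances use the haversine formula on a 6371-km sphere, and all names and coordinates are hypothetical.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0
CELL_DEG = 30 / 3600  # 30 arc-second pixel size in degrees


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cell_index(lon, lat):
    """Return the (column, row) of the pixel that contains a point."""
    return (math.floor((lon + 180) / CELL_DEG), math.floor((90 - lat) / CELL_DEG))


def sample_pseudo_absences(presences, n, max_km=200.0, seed=0):
    """Draw n points within max_km of any presence, outside presence pixels."""
    rng = random.Random(seed)
    occupied = {cell_index(lon, lat) for lon, lat in presences}
    lons = [p[0] for p in presences]
    lats = [p[1] for p in presences]
    pad = 2.0  # degrees; roughly 200 km at tropical latitudes (a simplification)
    points = []
    while len(points) < n:
        lon = rng.uniform(min(lons) - pad, max(lons) + pad)
        lat = rng.uniform(min(lats) - pad, max(lats) + pad)
        if cell_index(lon, lat) in occupied:
            continue  # same pixel as a presence record
        if any(haversine_km(lon, lat, plon, plat) <= max_km for plon, plat in presences):
            points.append((lon, lat))
    return points


# Hypothetical presence records; request as many pseudo-absences as presences.
presences = [(-87.0, 15.0), (-86.5, 14.8)]
pseudo_absences = sample_pseudo_absences(presences, n=len(presences))
print(len(pseudo_absences))  # 2
```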

Including our unique presence records, 3 pseudo-absence runs were completed for 4 modeling algorithms, 5 model runs (i.e., cross-validations), 7 distributions (i.e., 1 current + 3 GCCMs × 2 SSP-RCPs), and 4 species (with V. chrysoptera ♀ and ♂ modeled separately) for an analysis of 2,100 individual models. Individual models with ROC scores > 0.9 and TSS scores > 0.6 were included in calculations for the ensemble models to augment sensitivity (i.e., predicted presences) and specificity (i.e., predicted absences) scores (Araújo and New 2007, Thuiller et al. 2009). We did not consider erroneous models that reached iteration limits without full convergence. The removal of low-scoring (80) and erroneous (3) models ensured that ensemble models were calculated with the best projections (i.e., 2,017 individual models). Committee-averaged ensemble model outputs were selected for post-processing analyses which represented consensus and disagreement among individual models (Araújo and New 2007, Guisan et al. 2017).
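The model tally above can be checked with simple arithmetic: the total of 2,100 follows when V. chrysoptera females and males count as separate modeled groups (5 groups in total). The Python sketch below reproduces that tally and adds a small illustration of committee averaging as it is commonly described (each retained model casts a binary vote and the votes are averaged). The `committee_average` function, its thresholds, and the example probabilities are hypothetical and are not taken from the authors' code.

```python
from math import prod

design = {
    "pseudo_absence_runs": 3,
    "algorithms": 4,             # GLM, GBM, GAM, RF
    "cross_validation_runs": 5,
    "distributions": 7,          # 1 current + 3 GCCMs x 2 SSP-RCPs
    "modeled_groups": 5,         # 4 species, V. chrysoptera split by sex
}
total_models = prod(design.values())
retained_models = total_models - 80 - 3  # minus low-scoring and erroneous models
print(total_models, retained_models)  # 2100 2017


def committee_average(probabilities, thresholds):
    """Average of binary votes, one vote per retained model."""
    votes = [1 if p >= t else 0 for p, t in zip(probabilities, thresholds)]
    return sum(votes) / len(votes)


# Hypothetical predictions from four retained models for a single pixel.
print(committee_average([0.8, 0.4, 0.7, 0.55], [0.5, 0.5, 0.6, 0.6]))  # 0.5
```

A committee average of 1 or 0 indicates full agreement among models, while values near 0.5 indicate disagreement.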

Covariate Evaluation

We identified the top 4 performing covariates by calculating mean covariate importance values from the GLM, GBM, GAM, and RF modeling algorithms (Hightower et al. 2023). The analysis of 4 covariates captured variety for each species and each modeling algorithm in a succinct manner. Importance values were interpreted within each modeling algorithm only. However, for each focal bird species, recurring covariates among modeling algorithms were noted to accommodate for predictive variance and identify the most influential environmental factors for committee-averaged ensemble model outputs (Bucklin et al. 2015).

Distribution Shifts Due to Changes in Suitable Conditions

We completed post-processing analyses in ArcGIS Pro 3.0.0 to assess suitable climatic and LULC conditions in current and future scenarios for each focal species. Pixel values <0.5 in the committee-averaged ensemble model outputs were labeled as absences and were removed prior to analysis (Brown and Yoder 2015). We developed spatial products from the binary grids for suitable conditions lost, gained, and remaining between the current and future scenarios, where current suitable conditions were the reference for future suitable conditions (Hightower et al. 2023). Among the GCCMs used in this study, MIROC-ES2L (Hajima et al. 2020) was the most reported in the literature. Thus, the SSP2-RCP4.5 (best-case) MIROC-ES2L future scenario was used in the main text to highlight suitable conditions lost, remaining, and gained for each focal species.
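The lost, gained, and remaining products amount to a pixel-by-pixel comparison of two binary grids. The sketch below shows that logic in Python on tiny hypothetical grids; the authors performed these steps in ArcGIS Pro, so the function names and values here are illustrative assumptions only.

```python
def binarize(grid, threshold=0.5):
    """Label committee-averaged values below the threshold as absences (0)."""
    return [[1 if value >= threshold else 0 for value in row] for row in grid]


def classify_change(current, future):
    """Count pixels lost, gained, and remaining between two binary grids."""
    counts = {"lost": 0, "gained": 0, "remaining": 0}
    for cur_row, fut_row in zip(current, future):
        for cur, fut in zip(cur_row, fut_row):
            if cur and fut:
                counts["remaining"] += 1
            elif cur:
                counts["lost"] += 1
            elif fut:
                counts["gained"] += 1
    return counts


def percent_change(counts):
    """Change in suitable pixels relative to the current scenario (%)."""
    current_total = counts["lost"] + counts["remaining"]
    future_total = counts["gained"] + counts["remaining"]
    return 100 * (future_total - current_total) / current_total


# Hypothetical committee-averaged outputs for a 2 x 3 pixel area.
current = binarize([[0.9, 0.6, 0.2], [0.7, 0.1, 0.5]])
future = binarize([[0.8, 0.3, 0.6], [0.2, 0.0, 0.9]])
counts = classify_change(current, future)
print(counts, percent_change(counts))
```

In this toy case, 2 pixels are lost, 1 is gained, and 2 remain, which corresponds to a 25% contraction relative to the current scenario.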

Histograms corresponding to elevation, latitude, and longitude were created to describe shifts in spatial patterns for the focal species (Da Silveira et al. 2021). We used the current and future binary grids as raster masks to determine elevation ranges in ArcGIS Pro 3.0.0, which were defined by the absolute minimum and maximum elevation values among the raster masks for each species. This standardization procedure allowed us to compare pixel counts at exact elevation values within the masked areas and among current and future scenarios, where histogram values were synonymous with available suitable conditions (Da Silveira et al. 2021). Latitudinal and longitudinal histograms were developed with similar methods as the elevation histograms, except that we extracted coordinates directly from the masked elevation rasters in R.
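The standardization described above, where every scenario is tallied over one shared elevation range, can be sketched as follows. This Python example is a simplification: the paper compares pixel counts at exact elevation values, whereas the sketch groups elevations into bins of a hypothetical width, and the grids, masks, and function name are illustrative assumptions.

```python
import math


def elevation_histogram(elevation, mask, lo, hi, bin_width):
    """Count masked pixels per elevation bin over a shared [lo, hi] range."""
    n_bins = max(1, math.ceil((hi - lo) / bin_width))
    counts = [0] * n_bins
    for elev_row, mask_row in zip(elevation, mask):
        for elev, keep in zip(elev_row, mask_row):
            if keep:
                idx = min(int((elev - lo) // bin_width), n_bins - 1)
                counts[idx] += 1
    return counts


# Hypothetical elevation grid (m) and binary suitability masks.
elevation = [[100, 600, 1200], [1800, 2400, 900]]
current_mask = [[1, 1, 0], [1, 0, 1]]
future_mask = [[0, 1, 1], [1, 1, 0]]

# Shared range: absolute minimum and maximum among all masked pixels.
masked = [
    e
    for m in (current_mask, future_mask)
    for e_row, m_row in zip(elevation, m)
    for e, keep in zip(e_row, m_row)
    if keep
]
lo, hi = min(masked), max(masked)
print(elevation_histogram(elevation, current_mask, lo, hi, 500))  # [1, 2, 0, 1, 0]
print(elevation_histogram(elevation, future_mask, lo, hi, 500))   # [0, 1, 1, 1, 1]
```

Because both histograms share the same bins, an upslope shift shows up directly as counts moving toward the higher-elevation bins.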

Comparison of Predicted Occupied Current Ranges to Predicted Current Suitable Conditions

We sought to compare predicted occupied current ranges to our predicted current suitable conditions (as components of stationary nonbreeding distributions) to facilitate the identification of high-priority conservation areas. To define the occupied current ranges for the focal species, we used high-resolution (~3-km²) weekly abundance raster datasets from the ebirdst R package (Strimas-Mackey et al. 2023). Weekly abundance rasters for the focal species were selected according to the date ranges described above. We combined and resampled rasters to match the projected coordinate system and resolution of the binary grids developed previously. To minimize the influence of outliers and create a standardized raster layer, data within the 99th quantile were converted to values between 0 and 1 and any value <0.01 was set to “NA” (Gómez and Flores 2023).
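One way to read the standardization step is: cap abundance at its 99th percentile, divide by that cap so values fall between 0 and 1, and set anything below 0.01 to missing. The Python sketch below implements that interpretation with a nearest-rank percentile; the capping behavior, the function name, and the example values are assumptions for illustration, and the authors' R script (step 6) should be consulted for the exact procedure.

```python
import math


def rescale_abundance(values, quantile=0.99, floor=0.01):
    """Scale values to 0-1 using the given quantile as the cap; small values become None."""
    valid = sorted(v for v in values if v is not None)
    if not valid:
        return [None for _ in values]
    rank = max(1, math.ceil(quantile * len(valid)))  # nearest-rank percentile
    cap = valid[rank - 1]
    if cap <= 0:
        return [None for _ in values]
    scaled = []
    for v in values:
        if v is None:
            scaled.append(None)
            continue
        s = min(v, cap) / cap
        scaled.append(s if s >= floor else None)
    return scaled


# Hypothetical abundance values for six pixels (None = no data).
print(rescale_abundance([0.0, 0.002, 0.5, 1.0, 2.0, None]))
# [None, None, 0.25, 0.5, 1.0, None]
```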

Having created a raster for the occupied current range, we compared it with the predicted current suitable conditions from the ensemble model in ArcGIS Pro 3.0.0 (Botero-Delgadillo et al. 2022). Abundance values <0.1 were removed prior to analysis to ensure comparisons between the predictive methods were not overly restrictive but made without consideration of low probability use areas (Wilson et al. 2022). We developed 3 spatial layers for each species: (1) predicted current suitable conditions but currently unoccupied, (2) predicted current suitable conditions that are currently occupied, and (3) predicted current unsuitable conditions but currently occupied. Corresponding area totals were calculated as follows: (1) current suitable conditions - intersect between predictive methods, (2) intersect between predictive methods, and (3) occupied current range - intersect between predictive methods. The inclusion of probable occupied current ranges improved our confidence in developing a conceptual process to identify near-term conservation priority areas (Botero-Delgadillo et al. 2022).
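The three area totals follow directly from the intersect between the two predictive methods. The sketch below expresses that bookkeeping in Python using sets of pixel indices; the pixel sets, the 1-km² cell area, and the function name are hypothetical, and the authors carried out the equivalent overlay in ArcGIS Pro.

```python
def compare_layers(suitable, occupied, cell_area_km2=1.0):
    """Area totals (km²) for the three comparison layers."""
    both = suitable & occupied  # intersect between predictive methods
    return {
        "suitable_unoccupied": (len(suitable) - len(both)) * cell_area_km2,
        "suitable_occupied": len(both) * cell_area_km2,
        "unsuitable_occupied": (len(occupied) - len(both)) * cell_area_km2,
    }


# Hypothetical pixel indices (row, column) for each predictive method.
suitable = {(0, 0), (0, 1), (1, 1), (2, 2)}   # ensemble model, current
occupied = {(0, 1), (1, 1), (3, 3)}           # abundance-based range
print(compare_layers(suitable, occupied))
```

Here 2 km² are suitable but unoccupied, 2 km² are suitable and occupied, and 1 km² is occupied but predicted unsuitable.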

Processes to Identify Near-term Conservation Priority Areas

Honduran focal areas were previously delineated for the V. chrysoptera nonbreeding season conservation plan (Bennett et al. 2016). For V. chrysoptera ♀ and ♂, we first intersected the “predicted current suitable conditions that are currently occupied” products with individual Honduran focal areas. Then, we merged suitable conditions remaining and gained among each best-case future projection to capture spatial overlap in the near-term and intersected those products with individual Honduran focal areas. Within individual focal areas, intersected area (km²) and percent decline (from future intersected area/current intersected area) were calculated to assess the current and future validity of each boundary. We extended the mapping portion beyond Honduran focal area boundaries to consider areas for conservation expansion, where locations that satisfied the current and/or future spatial intersection could indicate opportunities for resource allocation. Thereupon, we developed a conceptual process to identify near-term conservation priority areas, dependent on the preference of a conservation practitioner to maintain suitable landscapes within currently delineated conservation focal areas or to shift conservation action to areas that are likely to be suitable in the future (Groves et al. 2012, Bax et al. 2021).
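The focal-area calculation can be sketched as a merge of the best-case projections followed by an intersection with the focal area. In the Python example below, percent decline is assumed to be 100 × (1 − future intersected area / current intersected area); that formula, the function names, and the pixel sets are illustrative assumptions rather than the authors' exact ArcGIS Pro workflow.

```python
def merged_future(projections):
    """Union of suitable pixels (remaining + gained) across projections."""
    return set().union(*projections)


def percent_decline(current_km2, future_km2):
    """Decline in intersected area relative to the current scenario (%)."""
    return 100 * (1 - future_km2 / current_km2)


# Hypothetical focal area of 8 pixels, each treated as 1 km².
focal_area = {(r, c) for r in range(2) for c in range(4)}
current_suitable_occupied = {(0, 0), (0, 1), (0, 2), (1, 0)}
best_case_projections = [          # one set per GCCM (hypothetical)
    {(0, 0), (0, 1)},
    {(0, 1), (1, 0)},
    {(0, 0), (5, 5)},              # (5, 5) lies outside the focal area
]

current_km2 = len(current_suitable_occupied & focal_area)
future_km2 = len(merged_future(best_case_projections) & focal_area)
print(current_km2, future_km2, percent_decline(current_km2, future_km2))  # 4 3 25.0
```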
