Joint evolution of the biogeography and phenology of seasonal migration
Data files
Aug 26, 2025 version, 1.63 MB total
- annual_cycle_durations_150sp_t30.csv (39.45 KB)
- annual_cycle_durations_150sp_t40.csv (39.52 KB)
- annual_cycle_durations_150sp_t60.csv (39.46 KB)
- calculate_migration_dates.R (12.55 KB)
- centroid_calculation.zip (6.79 KB)
- centroids.zip (1.42 MB)
- jetz_matched_names.tre (31.96 KB)
- README.md (8.36 KB)
- species_metadata.csv (21.42 KB)
- statistical_analyses.R (9.36 KB)
Abstract
In migratory species, the temporal phases of the annual cycle are linked to seasonally shifting geographic ranges. Here, we investigate the spatiotemporal structure of the annual cycle in a phylogenetic comparative framework by developing a method to demarcate the pacing of annual cycle stages using eBird, a massive avian occurrence dataset, and applying it to migratory passerine birds breeding in North America. The analyses focus on 150 species of migratory passerine birds breeding in North America, filtered from a starting list of 206 species based on data quality. The analyses use eBird data averaged across the years 2002-2021 to calculate daily distributional centroids of the geographic range of each species. Next, the centroids are used to estimate demarcations of the annual cycle stages for downstream analyses. Our analyses reveal a striking negative correlation between the durations of the breeding versus nonbreeding stationary periods, indicating that a tradeoff between the lengths of the two stationary periods is the primary axis of variation in annual cycle pacing. Our results further show that the duration of annual occupancy in the breeding versus stationary nonbreeding ranges predicts the geographic separation of these seasonal ranges, demonstrating that the ratio of time spent on stationary breeding versus nonbreeding locations evolves in tandem with a species’ migration distance. By contrast, the amount of time during which species undergo seasonal migration—that is, the duration of the seasonal periods when species’ geographic ranges shift latitudinally—varies relatively little across species compared to the length of the stationary periods. Our study helps untangle the complexity of seasonal distributions and schedules to reveal integrated evolution of the biogeography of the migratory cycle, its pacing, and life history tradeoffs among species.
Dataset DOI: 10.5061/dryad.18931zd8h
Description of the data and file structure
Species distributional data were downloaded from www.ebird.org for the period 1 January 2002 to 25 September 2021. Users can extend the dataset to more recent years by downloading additional data from www.ebird.org.
Files and variables
File: calculate_migration_dates.R
Description: This script reads in daily centroids of the geographic distribution for each species from the directory ./centroids. The script uses GAM functions and analysis of the inflection points of the GAMs to estimate dates demarcating the annual cycle for each species. The script also filters out species that did not pass the quality checks described in the paper and in the script. The script outputs a tabular data file of annual cycle date demarcations for use in the script statistical_analyses.R. Three versions of the data file were produced for downstream analysis, corresponding to different time windows for estimating the migratory period, as described below and in the manuscript: annual_cycle_durations_150sp_t30.csv, annual_cycle_durations_150sp_t40.csv, annual_cycle_durations_150sp_t60.csv.
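As a rough illustration of the GAM step, the sketch below smooths simulated daily latitudes with mgcv::gam and takes the days of steepest northward and southward change (inflection points of the fitted curve) as peak migration dates. The simulated data, the basis size k = 40, and all object names are assumptions made for this example; the quality filters and the rules that calculate_migration_dates.R applies to find start and end dates are not reproduced here.

```r
library(mgcv)

# Simulated daily centroid latitudes for one hypothetical species:
# winters near 10 N, breeds near 45 N, moves around day 120 and day 270.
set.seed(1)
day <- 1:365
lat_true <- 10 + 35 * (plogis((day - 120) / 8) - plogis((day - 270) / 8))
centroids <- data.frame(day = day, lat = lat_true + rnorm(365, sd = 0.5))

# Smooth latitude against ordinal date (k is an assumed basis size).
fit <- gam(lat ~ s(day, k = 40), data = centroids)

# Finite-difference slope of the fitted curve, in degrees latitude per day.
fitted_lat <- as.numeric(predict(fit, newdata = data.frame(day = day)))
slope <- diff(fitted_lat)

# The fastest northward and southward movement mark inflection points
# of the fitted latitude curve.
spring_peak <- day[which.max(slope)]
autumn_peak <- day[which.min(slope)]
print(c(spring_peak = spring_peak, autumn_peak = autumn_peak))
```

With real data, the simulated data frame would be replaced by the daily centroids loaded for one species.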
File: species_metadata.csv
Description: This file provides the initial species list of 206 species, along with initial metadata to be used in the script calculate_migration_dates.R.
Variables
- Species: Initial species list of 206 species, after removal of 22 species of short-distance migrants, from T. M. Pegan, B. M. Winger, The influence of seasonal migration on range size in temperate North American passerines. Ecography 43, 1191–1202 (2020).
- Migration_distance_km: Estimates of migration distances (in km) from the geographic centroids of range maps, from T. M. Pegan, B. M. Winger, The influence of seasonal migration on range size in temperate North American passerines. Ecography 43, 1191–1202 (2020). Note these migration distances were used for initial filtering, but downstream analyses used migration distances calculated directly from the eBird-derived centroids from this study in calculate_migration_dates.R (which are very similar).
- Mass: Estimates of mass in grams from T. M. Pegan, B. M. Winger, The influence of seasonal migration on range size in temperate North American passerines. Ecography 43, 1191–1202 (2020).
- Breeding_latitude: Estimates of breeding latitude from T. M. Pegan, B. M. Winger, The influence of seasonal migration on range size in temperate North American passerines. Ecography 43, 1191–1202 (2020). Note these latitudes were used for initial data exploration, but downstream analyses used breeding latitudes calculated directly from the eBird-derived centroids in calculate_migration_dates.R.
- Breeding_range_area_km2: Estimates of breeding range area (in km2) from T. M. Pegan, B. M. Winger, The influence of seasonal migration on range size in temperate North American passerines. Ecography 43, 1191–1202 (2020).
- Breeding.Biome: Breeding Biome categorization (see Methods).
- Winter.Biome: Winter Biome categorization (see Methods).
- Family: taxonomic family
- prebasic.moult_bmw: categorical location of prebasic molt, compiled from the literature
- molt.migration: binary variable, whether a species undergoes molt migration
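Migration distance between a breeding and a nonbreeding centroid can be approximated as a great-circle distance. The sketch below uses the haversine formula with a mean Earth radius of 6371 km; the function name and the example coordinates are hypothetical, and the distance calculation actually used for Migration_distance_km or in calculate_migration_dates.R may differ.

```r
# Great-circle distance (km) between two points given in decimal degrees.
# Assumes a spherical Earth with a mean radius of 6371 km.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical centroids: breeding at 45 N, nonbreeding at 10 N, same meridian.
dist_km <- haversine_km(45, -85, 10, -85)
print(dist_km)
```

The function is vectorized, so it can be applied to whole columns of breeding and nonbreeding coordinates at once.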
File: statistical_analyses.R
Description: script for all statistical analyses described in the paper. Reads tabular data files that are provided in this repository or produced by calculate_migration_dates.R
File: jetz_matched_names.tre
Description: phylogenetic tree required for some analyses in statistical_analyses.R, described in the Methods.
File: annual_cycle_durations_150sp_t30.csv
Description: Output of calculate_migration_dates.R, used for alternate statistical analyses presented in the supplementary materials.
File: annual_cycle_durations_150sp_t40.csv
Description: Output of calculate_migration_dates.R, used for statistical analyses presented in main manuscript.
File: annual_cycle_durations_150sp_t60.csv
Description: Output of calculate_migration_dates.R, used for alternate statistical analyses presented in the supplementary materials.
Variables (for annual_cycle_durations_150sp_t30.csv, annual_cycle_durations_150sp_t40.csv, and annual_cycle_durations_150sp_t60.csv)
- Species: The species
- s1: Ordinal date marking beginning of spring migration
- s2: Ordinal date marking peak spring migration
- s3: Ordinal date marking end of spring migration
- a1: Ordinal date marking beginning of autumn migration
- a2: Ordinal date marking peak autumn migration
- a3: Ordinal date marking end of autumn migration
- snr.90: 0.90 quantile of the signal-to-noise ratio of the GAM fits
- edf.10: 0.10 quantile of the effective degrees of freedom (edf) of the GAM fits
- breed.lat.ebd: breeding latitude, calculated from the species' centroid data
- breed.long.ebd: breeding longitude, calculated from the species' centroid data
- wint.lat.ebd: non-breeding latitude, calculated from the species' centroid data
- wint.long.ebd: non-breeding longitude, calculated from the species' centroid data
- migdist.ebd: migration distance (km), calculated from the species' centroid data
- lat.breadth: latitudinal breadth of the species' geographic range, calculated from the species' centroid data
- breed.dur: duration (days) of the breeding stationary period
- wint.dur: duration (days) of the non-breeding stationary period
- spr.dur: duration (days) of spring migration
- aut.dur: duration (days) of autumn migration
- migr.dur: duration (days) of spring+autumn migration
- Migration_distance_km: migration distance in km estimates from range maps, retained from species_metadata.csv and described with that file.
- Mass: Mass estimate in grams, retained from species_metadata.csv and described with that file.
- Breeding_latitude: Breeding latitude estimates from range maps, retained from species_metadata.csv and described with that file.
- Breeding_range_area_km2: Breeding area estimates from range maps, retained from species_metadata.csv and described with that file.
- Breeding.Biome: Breeding Biome, retained from species_metadata.csv and described with that file.
- Winter.Biome: Nonbreeding Biome, retained from species_metadata.csv and described with that file.
- Family: Taxonomic family
- prebasic.moult_bmw: Molt metadata, retained from species_metadata.csv and described with that file.
- molt.migration: Molt metadata, retained from species_metadata.csv and described with that file.
- filtered: Whether the species passed the filter checks
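The duration columns relate to the demarcation dates by simple arithmetic. The sketch below assumes that each duration is the gap between consecutive demarcation dates in a 365-day year, which is an interpretation of the variable descriptions rather than a copy of the authors' code; only the definition of migr.dur as spring plus autumn migration is stated in the list above. The species names and dates are made up.

```r
# Hypothetical demarcation dates (ordinal days) for two made-up species.
dates <- data.frame(
  Species = c("species_A", "species_B"),
  s1 = c(95, 110), s3 = c(140, 150),
  a1 = c(230, 215), a3 = c(290, 280)
)

dur <- dates
dur$spr.dur   <- dur$s3 - dur$s1           # spring migration
dur$breed.dur <- dur$a1 - dur$s3           # breeding stationary period
dur$aut.dur   <- dur$a3 - dur$a1           # autumn migration
dur$wint.dur  <- 365 - (dur$a3 - dur$s1)   # nonbreeding period, wraps the new year
dur$migr.dur  <- dur$spr.dur + dur$aut.dur # spring + autumn migration

print(dur[, c("Species", "breed.dur", "wint.dur", "spr.dur", "aut.dur", "migr.dur")])
```

Under these assumed definitions the four stage durations sum to 365 days for every species, which gives a quick consistency check to try on the provided CSV files.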
File: centroid_calculation.zip
Description: Scripts to calculate the daily geographic centroids of each species from eBird data (www.ebird.org). Script 1.ebd-hex-extract.R extracts hexagon cell IDs for eBird checklists by year and day. Script 2. species-hex-checklist-count.R counts the number of eBird checklists where a species was observed within each hexagon cell by year and day. Script 3. effort-hex-checklist-count.R counts the number of eBird checklists compiled within each hexagon cell by year and day. Script 4. species-centroids.R generates daily centroid estimates based on the eBird checklist data compiled within hexagon cells and saves an .RData file of daily centroids for each species (which are provided in centroids.zip).
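To show how species counts and effort counts per hexagon cell could combine into a daily centroid, the sketch below weights each cell center by the proportion of checklists that reported the species. This weighting scheme, the column names, and the numbers are assumptions for illustration only; the scripts in centroid_calculation.zip define the actual procedure, and a plain weighted mean of latitude and longitude ignores the curvature of the Earth.

```r
# Hypothetical counts for one species on one day across three hexagon cells.
cells <- data.frame(
  cell_lat  = c(30, 35, 40),
  cell_lon  = c(-90, -88, -85),
  n_species = c(2, 10, 8),    # checklists reporting the species
  n_effort  = c(20, 40, 80)   # all checklists compiled in the cell
)

# Detection frequency corrects raw counts for uneven observer effort.
freq <- cells$n_species / cells$n_effort

# Frequency-weighted mean of the cell centers.
centroid <- c(
  lat = weighted.mean(cells$cell_lat, freq),
  lon = weighted.mean(cells$cell_lon, freq)
)
print(centroid)
```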
File: centroids.zip
Description: The daily geographic centroids calculated from the scripts in centroid_calculation.zip and used in calculate_migration_dates.R. A separate .RData file of daily centroids is provided for each species in the study, listing latitude and longitude for each ordinal date of the year. The .RData objects can be accessed via the load() function in R, as shown in calculate_migration_dates.R.
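A minimal sketch of inspecting one of the .RData files follows. Because the object names stored inside each file are not listed here, the example first writes a stand-in file with a hypothetical object (daily_centroids) and hypothetical column names, then loads it into a separate environment so that nothing in the workspace is overwritten.

```r
# Stand-in for one species file from centroids.zip
# (object name and column names are hypothetical).
daily_centroids <- data.frame(day = 1:365, lat = 0, lon = 0)
path <- file.path(tempdir(), "example_species.RData")
save(daily_centroids, file = path)

# load() returns the names of the restored objects, which is useful
# when the contents of a file are unknown.
env <- new.env()
loaded <- load(path, envir = env)
obj <- get(loaded[1], envir = env)
str(obj)
```

For the real data, path would point to a file extracted from centroids.zip.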
Code/software
R (https://www.r-project.org/)
Access information
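Data were derived from the following sources:
- eBird (www.ebird.org)
- T. M. Pegan, B. M. Winger, The influence of seasonal migration on range size in temperate North American passerines. Ecography 43, 1191–1202 (2020).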
