Data and code from: Spatial and seasonal variation in avian dietary strategies
Data files
May 27, 2026 version files 93.15 KB
-
avian_diet_seasonality.zip
82.92 KB
-
README.md
10.23 KB
Abstract
Aim: Diet is a fundamental aspect of vertebrate life history, shaping survival, recruitment, and fitness. While spatial variation in avian dietary characteristics has been studied, seasonal dynamics at species and assemblage levels remain largely unexplored, hindering our understanding of biodiversity patterns and processes. We present the first global-scale assessment of seasonal variation in avian dietary space and its environmental drivers.
Location: Global
Time period: Contemporary
Major taxa studied: Birds
Methods: We integrated seasonal species distributions for over 10,000 bird species with the SAviTraits 1.0 database, a compilation of intra-annual variation in species-specific dietary preferences. We summarized the avian dietary space using a Log Ratio Analysis and identified the dominant components of seasonal variation in avian dietary space using a Principal Component Analysis. To assess species contributions to the seasonality of assemblage-level dietary space, we quantified species-level annual variability in dietary characteristics. Finally, we examined correlations of assemblage-level dietary variability with temperature, precipitation, and GPP seasonality as well as with predictability of seasonal changes using logistic quantile regressions.
Results: Strong seasonal variation in dietary space exists at both assemblage and species levels, and is most pronounced in temperate and boreal regions of the Northern Hemisphere. This seasonality arises from two key processes: (1) the seasonal redistribution of migratory species, which occupy distinct regions of dietary space, alters assemblage composition and thus dietary space, and (2) within-species dietary shifts. We also find that temporal variation in avian diets is linked to different environmental drivers across latitudes, with temperature seasonality playing a dominant role in northern regions and precipitation seasonality being more influential in southern regions.
Main conclusions: Viewing species’ traits as dynamic systems provides a powerful framework to capture the temporal complexity of trait-environment associations, understand factors shaping community structure, and advance conservation efforts.
Dataset DOI: 10.5061/dryad.bvq83bkmd
Description of the data and file structure
Spatiotemporal avian assemblages
We used species’ range maps from BirdLife International for 10,349 species that intersected with the SAviTraits 1.0 database (see below). To minimize false absences57, these range maps were rasterized and stacked at a 100 × 100 km resolution in the World Geodetic System 84 coordinate reference system. This process resulted in 18,609 grid cells overlaying terrestrial regions worldwide (out of a total of 80,000 global grid cells). Regions without species range boundaries, such as the interior of Antarctica or Greenland, were excluded from the analysis. Standard data manipulations were performed with the ‘dplyr’ and ‘stringr’ R packages, while simple feature and gridded geospatial data manipulations were performed with the ‘sf’60 and ‘terra’ packages, respectively. All data manipulation and analysis were conducted in R (4.4.0) through RStudio (2023.06.1).
Avian breeding data
To define temporally-varying assemblages, we used breeding, non-breeding, and migration range maps (when available), ultimately defining four temporal assemblages for each 100 km × 100 km resolution grid cell. A total of 1,528 species (15% of all species) in our dataset exhibited temporally dynamic range maps (hereafter, migratory birds), i.e., a non-breeding, breeding, and/or passage range were found for these species. For each of these species we constructed a dataset, primarily using the Cornell Lab of Ornithology Birds of the World, assigning which ranges (i.e., breeding + resident, non-breeding + resident, and/or passage + resident) were used for each month. Additionally, for species whose ranges span the Northern and Southern Hemispheres, we assigned the range used independently for each Hemisphere based on available phenological information. The remaining 8,821 species (85%) were classified as year-round residents (hereafter, resident birds), and were considered present across all grid cells overlapping their resident range throughout the entire year.
Avian diet data.
To explore spatiotemporal variation of dietary strategies, we used SAviTraits 1.0 database. SAviTraits 1.0 provides information on temporal variation in dietary characteristics for >10,000 species of birds. SAviTraits 1.0 contains information on the proportional use of the following dietary categories: (1) invertebrates, (2) endotherms (e.g., Mammalia and Aves), (3) ectotherms (e.g., Reptilia and Amphibia), (4) fishes, (5) vertebrates unknown, (6) carrion, (7) fruit and flowers, (8) nectar and pollen, (9) seed, and (10) other plant matter. We consolidated endotherms, ectotherms, fishes, and vertebrates of unknown dietary axes into a single category termed "vertebrates," as the specifics of vertebrate diet were generally unknown for most species56, which resulted in a total of seven dietary categories. SAviTraits 1.0 captures dietary information at a monthly resolution, and for each species all diet categories sum to 100 for each month. We note, however, that the dietary classifications of SAviTraits 1.0 should be regarded as representing seasonal patterns because information on diet for most species was reported in a seasonal context (e.g., breeding season, winter, dry season, etc.). Standardizing SAviTraits 1.0 to a seasonal resolution, however, presents challenges due to inconsistent definitions of seasons across species, often because their breeding and non-breeding periods fall at different times of the year. For that reason, we retained the monthly resolution of SAviTraits 1.0 for our analysis, but generally speak of seasonal, rather than monthly, variation in diet throughout the paper. We further note that SAviTraits 1.0 database cannot easily differentiate between species that truly display no temporal variation in their diet and those that might be data deficient but provides users with an estimate of the certainty in each species' dietary designation, which we incorporate in our analysis to assess the sensitivity of our results to uncertain dietary categorization.
Files and variables
File: avian_diet_seasonality.zip
Description:
All necessary data we created and scripts are included in our zip file. For data created by others, please see the information access section. The scripts located in the "scripts" folder are designed to be run sequentially and are carefully commented to describe the steps taken and outputs produced. Note, users are required to create the directory architecture for the code to run. For this project we created a dataset of all the birds which have ranges other than 'Resident' (as classified by the BirdLife International Range Maps) - this dataset identifies the BirdLife international range (i.e., 'Passage', 'Non-Breeding', 'Breeding' and 'Resident') for these birds on a monthly basis. These are contained in the 'bird_season_by_month.csv'. The 'index' column tracks the row index of the species - this can be duplicated when birds have distinct populations in separate hemispheres. The 'species_corrected' latin binomial name shows the latest taxonomic identifier, while the 'species_rast' column has the synonymous name used in the BirdLife International maps. The 'subspecies' column indicates where a subspecies with a distinct range is represented - where the cell is blank, this indicates that this column was had 'no applicable' data. The 'hemisphere' column indicates which hemisphere the row corresponds to for the handful of birds that have populations in both hemispheres that are acting independently - similarly, this column was left blank where the species in question did not have multiple phenology entries on Birds of the World for populations in different hemispheres. 'n.present', 'e.present' and 's.present' are logical columns indicating is the species is ever found in the respective Northern (30 - 90), Equatorial (30 - -30) or Southern (-30 - -60) latitudinal bands. 'Jan' - 'Dec' show each month, and what the corresponding range the species is occupying for that period.
The './scripts/' directory contains all R scripts necessary to reproduce our analysis, and should be run in numbered order. Detailed comments in each script outline the purpose of the code, and decisions regarding our methods. 00_data_preparation.R processes the species ranges maps into individual raster files ('./output/season_rasters/' - note, these are not included due to file size limitations, but can be reproduced with our script) and produces a dataframe of the species that have temporally variable ranges for us to reference individual months for their seasonal ranges.
01_monthl_matrix.R takes the rasters produced in the 00_data_preparation.R script and splits them into monthly maps based on their breeding phenology.
02_diet_pca_scores.R fits a PCA to CLR transformed diet data for the unique diets which occur among Avian species throughout the year. It then joins these back onto the full species × month dataframe produced in 01_monthly_matrix.R, so that these values can subsequently be projected onto space and time.
03_diet_geography_creation.R produces individual rasters for each species based on their diet for each month of the year by replacing their presence values with their LC scores from the LRA. It requires the output of all preceeding scripts.
04_geographic_pca.R creates assemblage level summary rasters of for each month × LC across the globe by averaging over all species rasters. Additionally, it lays the ground work for the sensitivity analysis by following the same process but excluding species with based on the certainty in their dietary traits.
05_clustering.R uses k-means clustering to find clusters of congruent patterns in seasonality in dietary characteristics.
06_species_level_diet_variability.R quantifies species-level temporal variability in diet on the basis of migrant v.s. resident classes.
07_seasonality_regression.R creates rasters of environmental seasonality, extracts data from these and then fits logistic quantile regressions of assemblage diet variability by environmental covariates of seasonality and it's predictability.
08_sar_models.R fits analogous regressions to 07_seasonality_regression.R, but includes spatial autoregressive terms in the model to account for spatial autocorrelation. These were presented in the supplementary information.
09_certainty_analysis.R conducts an uncertainty analysis of our findings by removing species with data poor diets. It follows an identical workflow to the 02_diet_pca_scores script, but removes rows based on our certainty about a given species diet.
10_plots_dietspace_assemblage_level.R, 11_plots_clustering.R, 12_plots_spatiotemp_pca_maps.R and 13_plots_species_level_variability_and_dietspace create plots dietary space at the assemblage level, the k-means clustering, spatio-temporal biplots, and species level variability respectively.
Code/software
All data manipulation and analysis were conducted in R (4.4.0) through RStudio (2023.06.1)
Access information
Other publicly accessible locations of the data:
- Murphy, S. J., Bellvé, A. M., Miyajima, R. J., Sebunia, N. A., Lynch, M. M., Jetz, W., & Jarzyna, M. A. (2023). SAviTraits 1.0: Seasonally varying dietary attributes for birds. Global Ecology and Biogeography, 32, 1690–1698. https://doi.org/10.1111/geb.13738
- BirdLife International Range Maps. https://datazone.birdlife.org/
- Karger, D., Conrad, O., Böhner, J. et al. Climatologies at high resolution for the earth’s land surface areas. Sci Data 4, 170122 (2017). https://doi.org/10.1038/sdata.2017.122
- Running, S., & Zhao, M. (2021). MODIS/Terra Gross Primary Productivity Gap-Filled 8-Day L4 Global 500m SIN Grid V061. NASA Land Processes Distributed Active Archive Center. https://doi.org/10.5067/MODIS/MOD17A2HGF.061
Spatiotemporal avian assemblages.
We used BirdLife International species’ range maps for 10,349 species that intersected with the SAviTraits 1.0 database. We rasterized and stacked range maps at a 100 × 100 km resolution in the WGS84 CRS, which resulted in 18,609 grid cells overlaying terrestrial regions worldwide. We note that while a 100 × 100 km resolution may seem relatively coarse, it represents the finest possible scale for analyses based on range maps without introducing false presences (Hurlbert & Jetz, 2007). Moreover, this resolution is consistent with other studies of global avian diversity patterns (e.g., Belmaker et al., 2012), including those focusing on avian diets (Kissling et al., 2012; Barnagaud et al., 2019). We excluded regions without species range boundaries (e.g., the interior of Antarctica, Greenland). Standard data manipulations were performed with R packages ‘dplyr’ (Wickham et al., 2023) and ‘stringr’ (Wickham, 2023); simple feature and gridded geospatial data manipulations were performed with ‘sf’ (Pebesma, 2018) and ‘terra’ (Hijmans, 2025), respectively. All analyses were conducted in R (4.4.0) through RStudio (2023.06.1) (RStudio Team, 2020).
To define temporally-varying assemblages, we used resident, breeding, non-breeding, and passage range maps (when available), defining 12 (monthly) assemblages for each grid cell. 8,821 species (85%) had only one range map (resident) and were thus classified as resident birds and considered present year-round across all grid cells overlapping their resident range. The remaining 1,528 species (15% of all species) had more than one range maps available (i.e., any combination of resident, breeding, non-breeding, and passage ranges), indicating seasonal movements (hereafter, migratory birds). For these species, we assigned specific ranges (i.e., breeding + resident, non-breeding + resident, and/or passage + resident) to each month, based on phenological information from the Cornell Lab of Ornithology Birds of the World (Billerman et al., 2022). For species spanning both hemispheres, we assigned the range used separately for each Hemisphere. To the extent that partial migration is captured in BirdLife International range maps, including distinct breeding, non-breeding, or passage ranges, it is incorporated into our monthly assemblages.
Avian diet data.
To explore dietary seasonality, we used SAviTraits 1.0 database (Murphy et al., 2023), which provides monthly-resolution information on diet for 10,672 bird species. SAviTraits contains information on the proportional use of ten dietary categories: (1) invertebrates, (2) endotherms (Mammalia and Aves), (3) ectotherms (Reptilia and Amphibia), (4) fishes, (5) vertebrates unknown, (6) carrion, (7) fruit and flowers, (8) nectar and pollen, (9) seed, and (10) other plant matter. We consolidated endotherms, ectotherms, fishes, and vertebrates unknown into a single "vertebrates" category, bringing the number of dietary categories to seven. All diet categories sum to 100 for each month. We note that although SAviTraits provides diet values at a monthly resolution, these values originate from diet descriptions that were typically reported at a seasonal level in the primary literature (e.g., “feeds mainly on seeds during the breeding season,” “consumes fruit primarily in winter,” etc.). Because the timing of seasons (breeding, wintering, dry season, etc.) varies substantially among species, Murphy et al. (2023) standardized these seasonal descriptions to a monthly format to ensure comparability across species. We retained this monthly resolution in our analysis, but generally refer to seasonal variation in diet throughout the paper. We also note that of the 10,349 species considered here, only 996 species (10%) show any temporal variability in their dietary characteristics, which may reflect either genuine dietary stability for most species or data deficiencies. While SAviTraits cannot easily differentiate between these possibilities, it provides users with estimates of certainty in each species' dietary designation, which we use to assess the sensitivity of our results. Despite these data limitations and uneven taxonomic coverage, the dietary data used here represent the most comprehensive compilation currently available for capturing intra-annual variation in avian diets.
Avian dietary space.
We summarized the avian dietary space using a Log Ratio Analysis (LRA). Using SAviTraits (Murphy at al. 2023), we first created a species-month × diet matrix, where each row represented a species-month combination, and each column represented a dietary category. This matrix had dimensions of 128,064 × 7 (10,672 species × 12 months resulted in 128,064 rows), which we collapsed into unique diets (1,044 × 7). Given the compositional nature of dietary data (rows sum to 100) and thus strong correlation among the dietary categories, a log-ratio (LR) transformation was necessary to map these data into real vector space (Greenacre, 2021). While rarely utilized in ecology, LR transformations are common in geological and biological chemistry (Greenacre, 2021). Specifically, we used the centered log-ratio (CLR) transformation, which expresses the log-ratio of a part (i.e., a single diet category) relative to the geometric mean of all parts (Eq. 1) (Aitchison, 1982). A key advantage of the CLR transformation is that the matrix can be analyzed with a reduced-dimensional component analysis to represent the equivalent analysis of all log-ratios (Greenacre, 2021).
(Eq. 1)
Where CLR is the log-ratio between a part (j) and the geometric mean of all J parts in the composition of a given sample (x). We analyzed the CLR-transformed data using LRA, a log-ratio-based equivalent of principal component analysis (PCA) (Aitchison, 1990; Aitchison & Greenacre, 2002). LRA reduced the data to orthogonal (i.e., independent) log ratio components (LCs), with the first six LCs capturing over 99% of the total log ratio variance in dietary space while preserving all original diet categories (Table S1). To avoid discarding biologically meaningful variation, we retained all six LCs (hereafter, diet axes) for further analysis. Unlike in PCA, the positions of diet categories only have meaning in their pairwise comparisons (i.e., any pair of diets can be interpreted as a log ratio change in the direction of the connection between the pair) (Greenacre, 2021). We conducted LRA using the ‘clr’ and ‘prcomp’ functions from R packages ‘compositions’ (v2.0-8) (Van den Boogaart & Tolosana-Delgado, 2008) and ‘stats’, respectively.
For each species, we mapped, at a 100 × 100 km resolution, their scores on each diet axis for each month. These maps were then stacked to create mean assemblage-level scores for each grid cell and month, resulting in 72 mean assemblage-level score maps (12 months × 6 diet axes). We note that because the assemblage-level score for each dietary axis is an average across species, it provides largely an unbiased estimate of the community’s expected value; increasing species richness thus only increases the precision of this estimate without altering its mean. Since our seasonality analyses rely exclusively on these average assemblage-level scores, species richness should not influence the observed patterns of temporal dietary variability.
Spatiotemporal variation in avian dietary space.
To identify the dominant components of seasonal variation in avian dietary space, we applied a standard PCA to the 72 assemblage-level maps. Below we briefly describe the principles of PCA in the context of spatiotemporal data analysis, as such PCA application remains rare in our field (see Jarzyna & Stagge, (2023) for more details).
We first created a 2-dimensional matrix Y[t, ij], where each row t is a month, and each column holds the average assemblage LC score for each diet axis, j (here, j=6) for each grid cell, i. Matrix Y is then subject to PCA, transforming these data into new orthogonal axes—Principal Components (PCs). The first PC (PC1) captures the largest proportion of variance in the data, followed by PC2, which captures the second largest proportion of variance, orthogonally to PC1, and so forth. In our application, PCA decomposes the original matrix Y[t, ij] into two new matrices: PC loadings, U, and PC scores, V
(Eq. 2)
where U has dimensions ij × k, representing the number of loading values across all grid cells and diet axes by the number of Principal Components, k; V has dimensions t × k, where t is the number of months. All LC scores were first centered and normalized independently for each site and metric by calculating the long-term mean, µ, and standard deviation, σ, for each column of the original Y matrix.
PC scores describe the temporal expression of each PC—here, the dominant seasonal pattern of avian dietary space—centered around the long-term mean, µ. For example, a transition of PC scores from negative in January to positive in June reflects seasonal dietary shifts associated with phenological changes. PC loadings indicate the strength and direction of the temporal pattern described by PC scores in each grid cell. Strong positive loadings reflect alignment with the temporal pattern captured by PC scores, strong negative loadings indicate that the temporal pattern captured by PC scores is expressed in the opposite direction, and loadings near zero indicate a minimal deviation from the mean. Note that the true temporal pattern of each diet axis is always a combination of all PCs, but cells with stronger loadings for a particular PC are more influenced by that PC. Though scores and loading maps can be generated for all PCs, later PCs often capture increasingly random spatial variation. In our case, the first two PCs collectively explained 80.6% of spatiotemporal variation in diet axes and were retained for further analysis.
To evaluate whether seasonality in assemblage-level dietary characteristics is driven primarily by species redistribution due to migration or by within-species variability in dietary space, we repeated the PCA separately for resident species and for migratory species. To assess the impact of species with high uncertainty in dietary designation, we repeated the PCA for subsets of species within the 0.75 and 0.50 quantiles of confidence in dietary designation (Murphy et al., 2023). PCA was conducted using function ‘prcomp’ from R package ‘stats’.
Spatial congruence in seasonality of avian dietary space.
We evaluated the spatiotemporal congruence among the six diet axes using k-means clustering, which partitions observations into k clusters based on proximity in q-dimensional space. Here, q = 12 because clustering was based on loadings for six diet axes and two PCs. The optimal number of clusters was determined using silhouette width, based on the local maximum in silhouette score. Clustering was performed using function ‘kmeansruns’ from R package ‘fpc’ (v2.2-13) (Hennig & Imports, 2015).
Species- and group-level variability in dietary space.
To assess species contributions to the seasonality of assemblage-level dietary space, we quantified species-level annual variability in dietary characteristics (spVAR). For each species, we calculated the maximum difference in LC score values on each diet axis across all months, normalized these differences to a 0-1 scale, and weighted them by the respective proportion of variation each diet axis explained in the LRA. spVAR for each species was then computed as the sum of these weighted values on each diet axis across all months. We tested for the phylogenetic signal in spVAR using Blomberg’s K, as implemented in the ‘phylosig’ function of the R package ‘phytools’. Blomberg’s K quantifies the strength of phylogenetic signal relative to expectations under a Brownian motion model of trait evolution (Blomberg et al. 2003; Münkemüller et al. 2012). Because estimating K for the full megaphylogeny (10,349 species) was computationally prohibitive, we adopted a subsampling approach. Following Barnagaud et al. (2019), we randomly drew 1,000 species from the phylogeny, computed Blomberg’s K and its associated permutation-based P-value, and repeated this procedure 1000 times. We found no evidence for phylogenetic signal in spVAR (Blomberg’s K: mean = 0.12, SE = 0.0008; P-value: mean = 0.20, SE = 0.0074), indicating that interspecific variation in seasonal dietary variability is not structured by phylogenetic relatedness. We note, however, that this result may reflect the limited prevalence of seasonal dietary variability in our dataset, as only ~10% of species exhibit detectable seasonal changes. As additional data become available and seasonal dietary variation is documented for a larger fraction of species, re-evaluating phylogenetic signal in spVAR may be warranted. We used the Cornell Lab of Ornithology Open Tree of Life Avian Phylogeny, which we extracted using function ‘extractTree’ from R package ‘clootl’ (McTavish et al. 2025).
We also calculated each species’ maximum migration distance (MigDist). Resident birds were assigned MigDist = 0. For migratory birds, MigDist was emasured as the difference between the 0.025 and 0.975 quantiles of the most northerly and southerly extents of their combined range, respectively. Species were then categorized as residents (MigDist = 0; 8,821 species), short-distance migrants (0 < MigDist < 45° latitude; 1,048 species), and long-distance migrants (MigDist ³ 45° latitude; 480 species).
Environmental drivers of assemblage-level variability in dietary space.
To investigate environmental drivers, we first developed an assemblage-level measure of diet variability (aVAR). Following the species-level procedure, we quantified the greatest difference in the monthly assemblage-level LC scores for each diet axis, normalized these to 0-1 scale, and weighted them by the variance each diet axis explained in the LRA. aVAR was then calculated as the sum of these weighted values for each month.
Because our goal was to investigate the seasonality of dietary characteristics, we focused specifically on seasonal environmental drivers, while acknowledging that other factors influencing overall diet composition have been examined elsewhere (e.g., Barnagaud et al. 2019). We note that such an approach is consistent with the broader literature on seasonality in assemblage composition (Keyser et al. 2024) and previously highlighted as potential drivers of avian migration (Dingle 2008). To determine how aVAR correlates with environmental drivers, we collected three measures of environmental seasonality. We downloaded temperature seasonality and precipitation seasonality from WorldClim 2 (Fick & Hijmans, 2017) and aggregated them to the resolution and projection of aVAR layer. To obtain seasonality in gross primary production (GPP), we aggregated three years of eight-day GPP layers from MODIS (Running & Zhao, 2021) using the Google Earth Engine (Gorelick et al., 2017). We calculated the coefficient of variation for each year and averaged these values to create a single GPP seasonality raster.
The predictability of seasonal changes—how reliably events reoccur across years (Tonkin et al., 2017)—may influence seasonal avian diversity patterns (Keyser et al. 2024). To assess this, we extracted monthly temperature and precipitation data from 1980 to 2018 (Karger et al., 2017) and eight-day GPP data from 2021 to 2023 (Running & Zhao, 2021), aggregating GPP to a monthly resolution. We quantified predictability of the seasonality of each of these variables using wavelet analysis as the average proportion of significant wavelet power at a 12-month period over the entire time series for each grid cell (Cazelles et al., 2008).
To determine how aVAR correlates with these six drivers, we fitted individual logistic quantile regressions (LQR; Koenker & Hallock, 2001) against the 0.05, 0.50 and 0.95 quantiles of aVAR using the ‘lqr’ (v6.0.0) (Koenker et al., 2018) R package. Assumption checks for independence among the residuals showed weak positive spatial autocorrelation (Moran’s I coefficients < 0.17; Table S2). LQR predictions were compared to analogous spatial autoregressive models and showed consistent trends. As LQRs are more robust to outliers, heteroscedasticity and the bounded nature of the response, we chose to present these results in the main manuscript.
