A trait-based framework for predicting foodborne pathogen risk from wild birds

Cite this dataset

Smith, Olivia et al. (2022). A trait-based framework for predicting foodborne pathogen risk from wild birds [Dataset]. Dryad.


Recent foodborne illness outbreaks have heightened pressures on growers to deter wildlife from farms, jeopardizing conservation efforts. However, it remains unclear which species, particularly birds, pose the greatest risk to food safety. Using >11,000 pathogen tests and 1,565 bird surveys covering 139 bird species from across the western U.S.A., we examined the importance of 11 traits in mediating wild bird risk to food safety. We tested whether traits associated with pathogen exposure (e.g., habitat associations, movement, and foraging strategy) and pace-of-life (clutch size and generation length) mediated foodborne pathogen prevalence and proclivities to enter farm fields and defecate on crops. Campylobacter spp. were the most prevalent enteric pathogen (8.0%), while Salmonella and Shiga-toxin producing E. coli (STEC) were rare (0.46% and 0.22% prevalence, respectively). We found that several traits related to pathogen exposure predicted pathogen prevalence. Specifically, Campylobacter and STEC-associated virulence genes were more often detected in species associated with cattle feedlots and bird feeders, respectively. Campylobacter was also more prevalent in species that consumed plants and had longer generation lengths. We found that species associated with feedlots were more likely to enter fields and defecate on crops. Our results indicated that canopy-foraging insectivores were less likely to deposit foodborne pathogens on crops, suggesting growers may be able to promote pest-eating birds and birds of conservation concern (e.g., via nest boxes) without necessarily compromising food safety. As such, promoting insectivorous birds may represent a win-win-win for bird conservation, crop production, and food safety. Collectively, our results suggest that separating crop production from livestock farming may be the best way to lower food safety risks from birds. 
More broadly, our trait-based framework suggests a path forward for co-managing wildlife conservation and food safety risks in farmland by providing a strategy for holistically evaluating the food safety risks of wild animals, including under-studied species.


Data acquisition

We compiled enteric pathogen prevalence data from studies that tested for Campylobacter spp., Salmonella spp., and/or STEC: 1) in at least 5 species of free-ranging birds (e.g., no single species studies and no captive birds), 2) using feces and/or cloacal swabs (e.g., no necropsy studies), and 3) from samples collected on farms that grow produce in the United States. We omitted trace-back studies investigating particular outbreaks because they would inflate apparent prevalence (e.g., Gardner et al. (2011), who investigated a Campylobacter outbreak in Alaska). We began by searching the reference list of the recent meta-analysis of foodborne pathogen prevalence in North American breeding birds by Smith et al. (2020c) for suitable studies, then expanded their list to include several as-of-then unpublished studies and grey literature, ultimately yielding pathogen data from 5 studies (Fig. 3A; see Appendix S1: Table S3 for included study meta-data). Three of the five studies are fully published (Rivadeneira et al. 2016, Navarro-Gonzalez et al. 2020, Smith et al. 2020a). One study was previously unpublished (Olimpi et al. sub-data (a) in this Dryad dataset). Finally, we used data from a Center for Produce Safety final grant report (Gordus et al. 2011) that is partially published in Cooley et al. (2007) and Gorski et al. (2011). Published studies focused on reporting pathogen prevalence and possible transmission across wildlife and in the environment (Cooley et al. 2007, Gorski et al. 2011), prevalence in a variety of bird species sampled across seasons (Navarro-Gonzalez et al. 2020), or identifying landscape/farm-level pathogen risk factors (Rivadeneira et al. 2016, Smith et al. 2020a).  

Studies that tested for STEC were diverse in methodology (Appendix S1: Table S3). Three studies performed the necessary steps for bacterial culture, isolation, and identification, then confirmed presumptive colonies by PCR. In contrast, two studies directly tested feces for STEC-associated virulence genes by extracting DNA and then using PCR but did not culture for bacteria first. Due to diverse methodologies, we considered samples to be positive for STEC if stx1 and/or stx2 (Shiga-toxin producing genes) were detected. Because these genes were rarely detected (0.22% of samples), we conducted additional analyses on data from the PCR-only studies, counting samples as positive if any proven or putative STEC-associated virulence gene(s) was/were detected (referred to as “STEC-associated virulence genes” for simplicity). The interest of this additional analysis resides in potential for horizontal transmission of virulence genes between E. coli strains (Bryan et al. 2015). Thus, from a broader public health perspective, it is important to know whether birds frequently carry E. coli possessing any virulence factor (sensu lato) typically found in STEC. We note that the bacteria referred to here as positive for STEC-associated virulence genes are not necessarily pathogenic to humans.
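The two positivity rules described above can be sketched as follows. This is a minimal illustration in Python rather than the code used for the analyses; the function names and the `EXAMPLE_PANEL` contents are hypothetical placeholders, since the full gene panel is not listed here.

```python
# Hypothetical placeholder panel: the real list of proven or putative
# STEC-associated virulence genes is not given in this description.
EXAMPLE_PANEL = {"stx1", "stx2", "other_virulence_gene"}

SHIGA_TOXIN_GENES = {"stx1", "stx2"}


def is_stec_positive(genes_detected):
    """Strict rule: positive if stx1 and/or stx2 was detected."""
    return bool(SHIGA_TOXIN_GENES & set(genes_detected))


def has_virulence_gene(genes_detected, panel=EXAMPLE_PANEL):
    """Broader rule: positive if any gene in the panel was detected."""
    return bool(set(panel) & set(genes_detected))
```

Under this sketch, a sample carrying only a non-Shiga-toxin panel gene is negative under the strict rule and positive under the broader one.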

Second, we combined bird point-count data collected by the authors in three prior projects to estimate bird abundance in crop fields (“crop contact rates”; Fig. 3B; see Appendix S1: Table S4 for included study meta-data; Smith et al. (2020b, 2021)). Briefly, Smith et al. (2020b) conducted 100-m radius, 10 min point-count surveys during the breeding season twice per year over two years on 52 highly diversified farms across the U.S.A. states of Washington, Oregon, and California. The other two projects (Olimpi et al. sub-data (b) and Garcia et al. sub-data published in this Dryad dataset) each conducted 50-m radius, 10 min point-count surveys during the breeding season three times per year over two years across 20 organic farms in the Central Coast of California. Five of their farms overlapped and were only counted once in analyses. We only used data from survey points conducted in crop fields, and birds were only counted as “contacts” if they were in fields (we included aerial foraging as contacts but excluded flyovers). We excluded tree fruit contacts because of the structural similarity to non-crop trees and due to the focus of included studies on leafy greens, brassicas, and strawberries.

Finally, to identify which species were most likely to defecate in crop fields, we leveraged a dataset of 1215 fecal samples collected by Smith et al. (2020a) from brassica fields and food wash/packing areas across 37 farms in Washington, Oregon, and California, U.S.A. These samples were subsequently attributed to bird species through COI gene testing. Smith et al. (2020a) determined the bird species responsible for defecating 463 of the 1215 (38.1%) fecal samples, which were traced back to 35 species (Appendix S1: Fig. S1). Here, we examined traits that predicted the number of feces identified back to a bird species from the entire bird species pool (n = 106) observed while conducting point counts at the collection locations.

Although we sought studies on foodborne pathogen prevalence in birds from any farms that grow produce throughout the United States, we only found data collected from the West Coast that met our inclusion criteria (Fig. 3). Briefly, studies in our analysis surveyed farms that spanned a gradient of types and diversity of crops grown, sizes, and landscape contexts. For example, Smith et al. (2020a,b) surveyed highly diversified farms ranging from 0.38 to 272.2 ha that spanned a wide range of landscape contexts from 0–100% seminatural. Similarly, one of the previously unpublished studies (Olimpi et al. sub-data (a) in this Dryad dataset) surveyed farms ranging from 1.3 to 100.3 ha, spanning a wide range of landscape contexts from 0–85.4% seminatural. Farms ranged from highly diversified with many crop types to strawberry monocultures. The farm and regional contexts for included studies are described in full detail in Appendix S1: Tables S3-S4.

Species traits

We took a barriers-to-spillover approach, meaning that we considered how species traits influence a series of hierarchical barriers that must align to enable transmission of foodborne pathogens from birds to crops (Fig. 1; Plowright et al. 2017). Thus, we first generated a priori hypotheses about traits that might affect foodborne pathogen prevalence in birds (Appendix S1: Table S1). We then examined how those same traits affected the “downstream layers” of contact with crops and fecal deposition in fields (Fig. 1). Our hypotheses broadly covered aspects of exposure/habitat preferences (diet guild, foraging strata, migratory strategy, sociality, high use of bird feeders, feedlot association, synanthropy, large daily movements [proxied by hand-wing index due to limited data available quantifying home range size or actual daily movement (Sheard et al. 2020)]), pace of life (clutch size and generation length), and nonnative status (i.e., nonnative birds possess several traits that may increase reservoir competence) (Waldenstrom et al. 2002, Altizer et al. 2011, Ostfeld et al. 2014, Daversa et al. 2017, Smith et al. 2020c, 2020a).

We first sought existing databases with relevant traits and were successful for diet guild, foraging strata, hand-wing index, and generation length (De Graaf et al. 1985, Wilman et al. 2014, Barnagaud et al. 2017, Bird et al. 2020, Sheard et al. 2020, Smith et al. 2020b). We did not find robust estimates for several traits of interest (migratory strategy in the study region, sociality/gregariousness, clutch size, bird feeder use, feedlot association, and synanthropy). For these traits, we generated novel databases from secondary and primary sources. We used Birds of the World Online (Billerman et al. 2020) to classify mean clutch size and migratory strategy for populations in our study region and sociality during the breeding season (i.e., when most produce is grown). We classified birds as highly associated with bird feeders if they were on one or more Project FeederWatch top 25 lists for Washington, Oregon, and California for 2016–2017 (last accessed May 2020).

To classify feedlot association (highly associated, somewhat associated, not associated), we conducted a literature review, searched Birds of the World Online (Billerman et al. 2020), and consulted eBird checklists. Because discrepancies in data availability, methods, and reporting prevented us from using specific numerical guidelines to classify species, we used the review to guide expert elicitation (see Appendix S1: Section S1 and Data S1). To quantify synanthropy, we extracted citizen-science data from eBird and filtered checklists according to “best practices” (Strimas-Mackey et al. 2020). We used the 500-m resolution MODIS MCD12Q1 v006 land cover product to calculate the proportion of anthropogenic land-cover within a 700-m radius of each checklist location (Friedl and Sulla-Menashe 2015). We then used generalized additive models to quantify species’ responses to anthropogenic land cover while accounting for nuisance variables and spatial autocorrelation (Wood 2006). We first modelled the effect of each type of natural and anthropogenic habitat on occupancy and then used our model to identify the most preferred natural habitat type. The synanthropy index was calculated as the relative log-fold increase (or decrease) of occurrence probability in anthropogenic habitat versus the natural habitat where the species was most abundant. Specifically, this quantity was calculated as the slope of a species’ response to anthropogenic land covers minus the slope of their response to their most preferred “natural” land cover (see Appendix S1: Section S1 and Data S2 for full details).
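The final subtraction step of the synanthropy index can be illustrated with a short Python sketch. It assumes the slopes have already been estimated elsewhere (the generalized additive models are not reproduced), and it assumes that the "most preferred" natural land cover is the one with the largest slope. The function name and all numbers are hypothetical.

```python
def synanthropy_index(anthropogenic_slope, natural_slopes):
    """Slope for anthropogenic cover minus slope for the preferred natural cover.

    natural_slopes maps each natural land-cover type to the species' slope.
    Assumption: the preferred natural cover is the one with the largest slope.
    """
    preferred = max(natural_slopes, key=natural_slopes.get)
    return anthropogenic_slope - natural_slopes[preferred]


# Hypothetical slopes for one species (not taken from Data S2).
example = synanthropy_index(0.8, {"forest": 0.5, "grassland": 0.2})
```

A positive value indicates a stronger response to anthropogenic cover than to the preferred natural cover, and a negative value indicates the reverse.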

Our efforts resulted in 18 traits that we then narrowed down to 11 to represent key hypotheses. We selected traits that were least correlated with metrics representing other hypotheses by examining pairwise correlations. For example, mass, wing chord, and hand-wing index may all represent dispersal ability (Sutherland et al. 2000, Sheard et al. 2020), but mass is also highly correlated with generation length (Bird et al. 2020). Therefore, we only used hand-wing index and generation length since they were the least correlated with each other. Nevertheless, some traits remained correlated after our selection (the highest Pearson’s correlation for traits used in models was 0.48 between synanthropy and bird feeder association; Appendix S1: Fig. S2).
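The pairwise screening described above can be sketched in Python as below. This is an illustration only: the helper names are hypothetical, and the trait values passed in would come from the trait tables rather than from this snippet.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def pairwise_correlations(traits):
    """Return {(trait_a, trait_b): r} for every unordered pair of traits."""
    return {
        (a, b): pearson(traits[a], traits[b])
        for a, b in combinations(sorted(traits), 2)
    }
```

Sorting the resulting dictionary by absolute correlation would surface the pairs most in need of pruning.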

Statistical analysis

We modeled pathogen prevalence (Campylobacter spp., Salmonella spp., and STEC) and carriage of STEC-associated virulence genes as a function of bird traits using generalized linear mixed-effects models (GLMMs) with a binomial error distribution and logit link function (glmmTMB package in R) (Brooks et al. 2017). Additionally, we used GLMMs to examine the impact of species traits on 1) the total number of individuals detected in crops within each point using a negative binomial distribution to account for overdispersion and 2) the total number of environmental fecal samples attributed to each species via COI gene testing, again using a negative binomial distribution to account for overdispersion. We modeled (1) above using both the relative abundance (i.e., number of individuals counted in crop fields) of species across all sites within each study as well as their relative abundances per survey point (see Appendix S1: Section S1).

We first determined the optimal random effects structure for each of the 7 response variables described above (Appendix S1: Table S5). To account for variation in methods between studies, we included “study” as a random effect in models using data from 3 or more studies and as a fixed effect in models using data from 2 studies. To account for multiple bird surveys on the same farm and multiple visits to each point-count location, we included point nested within farm in analyses of crop contact per survey point. Finally, to account for non-independence in phylogenetic relationships, we also included order, family, genus, and species as random effects. Including order, family, and genus, rather than a continuous measure of phylogenetic relatedness, allowed us to identify the taxonomic level of non-independence. Specifically, we constructed models with all combinations of order, family, genus, and species, and then used AIC to determine which taxonomic levels to include. Our final candidate model sets included a random effect of family for all response variables except Campylobacter spp. (genus only) and crop contact (both family and species).
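The set of taxonomic random-effects structures compared by AIC can be enumerated with a few lines of Python. This sketch assumes that "all combinations" includes the structure with no taxonomic random effect, which gives 16 subsets; the `(1 | level)` rendering follows the random-intercept formula syntax used by glmmTMB, and the function names are hypothetical.

```python
from itertools import combinations

TAXONOMIC_LEVELS = ("order", "family", "genus", "species")


def random_effect_sets(levels=TAXONOMIC_LEVELS):
    """Return every subset of taxonomic levels, from none to all four."""
    return [
        combo
        for size in range(len(levels) + 1)
        for combo in combinations(levels, size)
    ]


def to_formula_terms(combo):
    """Render a subset as random-intercept terms, e.g. '(1 | family)'."""
    if not combo:
        return "(none)"
    return " + ".join(f"(1 | {level})" for level in combo)
```

Each rendered structure would then be fitted with the same fixed effects, and the structure with the lowest AIC retained.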

After determining the optimal random effects structures, we constructed 44 candidate models that tested the relative importance of each of the 11 traits for the 7 response variables (Appendix S1: Table S5). The 44 models included the null model, 11 models that tested a single trait, 16 additive models that tested one exposure and one pace of life trait, and 16 models that tested for an interaction between one exposure and one pace of life trait. All models that included hand-wing index also included an intercept for aerial foraging to account for the higher hand-wing index observed in species specialized in foraging in flight (Sheard et al. 2020). Continuous physiological traits were log transformed to reduce leverage of high values.
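The arithmetic behind the 44 candidate models (1 null + 11 single-trait + 16 additive + 16 interaction) can be checked with a short Python sketch. The grouping into 8 exposure traits, 2 pace-of-life traits, and nonnative status follows the trait list given under Species traits; the column names, including `aerial_forager`, are hypothetical and are not taken from the data files.

```python
from itertools import product

# Hypothetical column names; the grouping follows the Species traits section.
EXPOSURE_TRAITS = [
    "diet_guild", "foraging_strata", "migratory_strategy", "sociality",
    "bird_feeder_use", "feedlot_association", "synanthropy", "hand_wing_index",
]
PACE_OF_LIFE_TRAITS = ["clutch_size", "generation_length"]
OTHER_TRAITS = ["nonnative_status"]


def with_aerial_term(formula):
    """Add the aerial-foraging intercept to any model using hand-wing index."""
    if "hand_wing_index" in formula:
        return formula + " + aerial_forager"
    return formula


def candidate_models():
    """Build the fixed-effects part of each candidate model as a string."""
    formulas = ["1"]  # null model
    formulas += EXPOSURE_TRAITS + PACE_OF_LIFE_TRAITS + OTHER_TRAITS
    pairs = list(product(EXPOSURE_TRAITS, PACE_OF_LIFE_TRAITS))
    formulas += [f"{exposure} + {pace}" for exposure, pace in pairs]
    formulas += [f"{exposure} * {pace}" for exposure, pace in pairs]
    return [with_aerial_term(formula) for formula in formulas]
```

Calling `candidate_models()` returns 44 distinct formula strings, five of which carry the aerial-foraging term.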

We then ranked models based on AICc and identified those that were most supported (∆AICc < 2.0) (Burnham and Anderson 2002). We assessed if variables improved model fit using likelihood ratio tests for all models with weights > 0.05. We assessed multicollinearity for candidate models using the performance package in R (Ludecke et al. 2020), and found it not to be an issue in our models (VIF < 5). We used generalized Tukey HSD tests in the multcomp package in R (Hothorn et al. 2008) to examine differences in categorical predictor variables that had high support (were included in models with < 2 ∆AICc) and improved model fit (likelihood ratio tests). We predicted pathogen prevalence and crop contact rates per species from the best-supported models using the predict() function in R, then model-averaged predictions.
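The ranking step can be sketched in Python using the standard small-sample formula AICc = 2k - 2 logLik + 2k(k + 1)/(n - k - 1) and Akaike weights proportional to exp(-∆AICc/2). This is an illustration, not the R code used for the analyses; the function names and the example log-likelihoods are hypothetical.

```python
import math


def aicc(log_lik, k, n):
    """Small-sample corrected AIC for k estimated parameters and n observations."""
    aic = 2 * k - 2 * log_lik
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def rank_models(fits, n, threshold=2.0):
    """Rank models by AICc.

    fits maps a model name to (log-likelihood, number of parameters).
    Returns rows sorted from most to least supported.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in fits.items()}
    best = min(scores.values())
    deltas = {name: score - best for name, score in scores.items()}
    total = sum(math.exp(-d / 2) for d in deltas.values())
    rows = [
        {
            "model": name,
            "aicc": scores[name],
            "delta": deltas[name],
            "weight": math.exp(-deltas[name] / 2) / total,
            "supported": deltas[name] < threshold,
        }
        for name in scores
    ]
    return sorted(rows, key=lambda row: row["delta"])


# Hypothetical log-likelihoods and parameter counts, for illustration only.
example_fits = {
    "null": (-120.0, 2),
    "feedlot": (-110.0, 4),
    "feedlot_plus_generation": (-109.5, 5),
}
example_table = rank_models(example_fits, n=200)
```

Model-averaged predictions would then weight each supported model's prediction by its Akaike weight.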

Usage notes

Data S1: Feedlot trait assignment – this file contains information on how we generated the feedlot association trait for bird species in our study. The file contains tabs pertaining to (1) source meta-data (i.e., studies we gathered feedlot association data from), (2) counts (i.e., number of individuals per species observed in feedlots during standardized counts (e.g., point count surveys) per study), (3) captures (i.e., number of individuals per species captured at feedlots by study), (4) marginally suitable literature (i.e., studies conducted in feedlots that didn't provide robust data), (5) Birds of the World (formerly Birds of North America) Online mentions of feedlot association by species, and (6) final expert elicitation (i.e., how each of three ornithologists classified species and their final classifications). Tab 7 contains full definitions of what each column represents. 

Data S2: Synanthropy trait – this file contains statistics regarding the synanthropy trait calculations. Synanthropy indices are an index of relative preference for anthropogenic habitat types (agriculture and urban) compared to natural habitat types with greatest use. "Synanthropy index" is the trait we used in our final analyses. Tab 2 contains full definitions of what each column represents. 

Data S3: Predictions and traits – this file contains pathogen prevalence and crop contact estimates for each of the 139 species in our study. It also contains their trait and taxonomic data used in analyses. Tab 2 contains full definitions of what each column represents. 

Data S4: Meta-data – this file again contains the pathogen prevalence and crop contact estimates for each of the 139 species in our study in addition to the raw pathogen prevalence and crop contact data used in our analyses from original studies. Tab 2 contains full definitions of what each column represents. 

Data S5: Data used in analyses – this file contains the data analyzed in "A trait-based framework for predicting foodborne pathogen risk from wild birds." Tabs 1-4 contain the pathogen prevalence and species traits data for each bird species for each study included in the meta-analysis. Tab 5 contains crop contact data and species traits data for each bird species for each study included in the meta-analysis. Tab 6 includes the fecal deposition and traits data for each species included in the meta-analysis. Tab 7 contains full definitions of what each column represents in each tab.


U.S. Department of Agriculture, Award: 2015-51300-24155

U.S. Department of Agriculture, Award: 2017-67019-26293

National Science Foundation, Award: CNH-1824871