Ungrazed seminatural habitats around farms benefit bird conservation without enhancing infectious disease risks
Data files
Jul 23, 2024 version files: 38.09 MB
diversity_Nmix.RDS (24.03 MB)
diversity_occ.RDS (12.64 MB)
diversity_processed.csv (215.89 KB)
pathogen_bysample.csv (532.76 KB)
pathogen_bytransect.csv (9.57 KB)
pointcount_covars.csv (26.08 KB)
pointcount.csv (612.05 KB)
README_diversity_Nmix.csv (398 B)
README_diversity_occ.csv (415 B)
README_diversityNmixprocessed.csv (2.35 KB)
README_diversityprocessed.csv (2.74 KB)
README_pathogenbysample.csv (3.38 KB)
README_pathogenbytransect.csv (1.66 KB)
README_pointcount.csv (1.61 KB)
README_pointcountcovars.csv (858 B)
README_speciestraits.csv (797 B)
README.md (1.27 KB)
species_traits.csv (6.77 KB)
Abstract
All data for the associated publication (title above) are included. Specific data include: (1) raw point count data, including birds observed and associated information; (2) point count covariates, including site descriptors; (3) species traits information for every detected species; (4) estimated occupancy data that resulted from applying occupancy models to point count data; (5) estimated abundance data that resulted from applying N-mixture models to point count data; (6) derived bird diversity data, obtained after analyzing the abundance results from N-mixture models; (7) raw pathogen occurrence and fecal density data from transects; (8) aggregated pathogen prevalence data.
Descriptions of the procedures involved are included in the methods description and in the resulting manuscript.
Description of the Data and file structure
All data are stored in CSV or RDS files. Each file has an associated README CSV that describes each column. RDS files are used for three-dimensional arrays, with cells detailing bird occurrence or abundance. Dimensions include: species, site, and posterior draw from the occupancy or N-mixture model.
Study region
Our work focused on three counties in the California Central Coast (i.e., Santa Cruz, San Benito, and Monterey Counties), one of the most productive and economically important agricultural regions in the United States, especially for fresh produce [1]. Across this region, we selected 30 organic farms as study sites, with farms defined as contiguous lands managed by a single grower or operation. Though farmers often grew many crops (see Table S1 for farm summary statistics), all study sites included lettuce. Lettuce was chosen as our focal crop because prominent foodborne disease outbreaks have been linked to leafy greens, making them a focus of food-safety regulations [2]. In addition, lettuce ranked as the most important agricultural commodity in Monterey and San Benito Counties [3,4] and the seventh most important agricultural commodity in California in 2020 (total value, production, and acreage: ~US$2.3 billion, ~3.3 million tons, and ~200,000 acres, respectively), with California leading the nation in its production (75.8% of U.S. receipts; [1]).
The Central Coast region experiences a temperate Mediterranean climate and exists as a landscape mosaic of large monoculture farms, small diversified farms, grazing lands, and other seminatural habitats (e.g., grasslands, shrublands, forest, riparian habitat, and wetlands). To study the effects of on-farm management practices and landscape context, we selected farms that independently varied in local diversification, the proportion of surrounding grazed land, and the proportion of surrounding ungrazed seminatural habitats, leveraging aerial imagery from the National Agriculture Imagery Program (NAIP, 30 m resolution). We limited our study to organic farms because organic farmers (1) are constrained in which agrochemicals can be applied and thus often rely on diversification practices such as crop rotations and preserving non-crop vegetation to maintain soil fertility and control pests, (2) are subject to intense scrutiny regarding food-safety requirements, and (3) represent a growing share of the lettuce market, with ~22% of California lettuce acreage currently in organic production, approximately half of which occurs in the Central Coast [1,5].
Bird point count surveys
We surveyed birds on each farm using 10-minute, 50 m fixed-radius point count surveys, repeated three times over consecutive days from May-July in 2019 and 2020. Each year we surveyed 20 farms. Some farms were surveyed in both years (N = 10) and others (N = 20) in only one year due to lettuce crop rotations. Point count locations were separated by at least 100 m (range: 100-1514 m, mean = 459 m; [6]). Thus, the number of point counts per farm varied by farm size (point counts: range: 3-6, mean = 5.7; point counts per 10 hectares: range: 0.1-11.4, mean = 3.1). At least half of the count locations on each farm were centered in lettuce; the remainder were located in other dominant crops (e.g., strawberry, squash, broccoli). All surveys were conducted by the same skilled observer (T. Glaser), primarily between sunrise and 10:30 am and always in the absence of rain or heavy fog. All individuals seen or heard within the survey radius were identified to species and recorded, alongside key covariates that may influence bird detectability (e.g., time of day, day of year, wind speed, temperature, presence of loud noises, etc.). We also noted the substrate (e.g., crop field, tree, fence, etc.) associated with each bird observation.
Flocking birds and species-level traits
Flocking birds could increase food-safety risks by leaving concentrated deposits of fecal contamination on farms. We thus created a binary response variable to indicate whether flocks were observed during each survey. To reflect food-safety risks, we excluded observations of birds in trees (which were less likely to interact with the crops) and auditory detections when an individual’s exact location was unknown (e.g., crop field vs. tree). We also excluded swallows because they are usually observed flying above crop fields but seldom contact crops. We then defined a flock as a group of 7 or more individuals of the same species observed during a survey.
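The flock indicator described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record keys (species, count, substrate, location_known) are hypothetical and do not correspond to columns in pointcount.csv, and the sketch assumes that each record describes one group of birds, so that a flock is any retained record with 7 or more individuals.

```python
FLOCK_MIN = 7  # threshold from the text: 7 or more individuals of one species


def survey_has_flock(observations, excluded_species=frozenset()):
    """Return True if any retained record in one survey qualifies as a flock.

    `observations` is a list of dicts with hypothetical keys:
    species, count, substrate, location_known.
    Assumption: each record is a single group of one species.
    """
    for obs in observations:
        if obs["substrate"] == "tree":
            continue  # birds in trees were excluded
        if not obs["location_known"]:
            continue  # auditory detections with unknown location were excluded
        if obs["species"] in excluded_species:
            continue  # e.g. swallows, which seldom contact crops
        if obs["count"] >= FLOCK_MIN:
            return True
    return False


# Illustrative survey (made-up records, not study data)
survey = [
    {"species": "Red-winged Blackbird", "count": 9,
     "substrate": "crop field", "location_known": True},
    {"species": "Cliff Swallow", "count": 12,
     "substrate": "crop field", "location_known": True},
    {"species": "House Finch", "count": 8,
     "substrate": "tree", "location_known": True},
]
flock_flag = survey_has_flock(survey, excluded_species={"Cliff Swallow"})
```

If flocks were instead tallied by summing all records of a species within a survey, the loop would accumulate counts per species before applying the threshold.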
We also collected several species-level traits that represent the relative food-safety risk or conservation value of each species. First, we noted whether we observed flocking behavior in each species (using the criteria listed above). Second, we assigned pathogen prevalence traits to each species using data from [7]. These traits reflect the likelihood that each species on farms in the western U.S. would test positive for Campylobacter spp., E. coli virulence genes, and Salmonella spp. Finally, we collected conservation scores for each species from the 2016 State of North America’s Birds report [8], which incorporates information on population size, distribution, and other components of vulnerability. Because this report focused on native species only, we assigned the lowest conservation score possible to non-native species.
Local farm management practices and landscape context
We quantified the level of local (on-farm) diversification associated with each 50 m radius point-count location by building a composite index from measurements of crop diversity, non-crop vegetation cover, and vegetation complexity (Supplemental methods). We also documented the total length of fencing, where birds often perch, in each point count radius. Next, we manually digitized seminatural habitat (forest, shrubland, grassland, pasture, and wetlands) from NAIP 2016 imagery within a 1 km radius of each sampling location using ArcMap 10.3.1 (ESRI, Redlands, CA, USA). To assess the effects of different types of seminatural habitat, we overlaid spatial grazeable land data from the Farmland Mapping and Monitoring Program [9] on top of our land-cover map. Grazeable land, or land where vegetation is suitable for grazing livestock in California, was dominated by grasslands and pastures. We thus further subdivided our maps into grazed seminatural habitat (areas of overlap between our seminatural habitat map and grazeable lands) versus ungrazed seminatural habitat.
Bird fecal transects and pathogen testing
We surveyed bird fecal contamination along three parallel, 20 m transects in lettuce crops on 20 farms per year from May-July of 2019 and 2020 (10 farms sampled in 2019 only, 10 in 2020 only, and 10 in both years). Transects were located at the farm edge with the most seminatural habitat, as far from a farm edge as possible (up to 500 m from the edge), and halfway in between. We recorded the number of bird feces within 20 adjacent 1 m² quadrats centered along each transect. In 2019 only, we also collected 10 fecal samples from each transect, or extended sample collection beyond the transect as needed to obtain 10 samples. We placed samples in sterile cryotubes filled with 100% ethanol, immediately froze them in a liquid nitrogen dewar, and kept samples frozen until DNA extraction. We screened bird fecal samples for E. coli virulence genes, Campylobacter spp., and Salmonella spp. using multiplex polymerase chain reactions. Although Shiga toxin-producing E. coli that carry the stx1 and/or stx2 genes are responsible for causing disease in humans, other ‘virulence genes’ can contribute to pathogenesis. E. coli virulence genes carried by birds can be transferred between bacterial strains and, when combined with Shiga toxins, can result in pathogenic E. coli strains that cause severe disease in humans ([10,11]; see Supplemental methods).
Statistical analyses
We used occupancy and N-mixture models that account for variation in detection probability to estimate species presence/absence and abundance, respectively, and to quantify changes in bird communities among sites [12–14]. Specifically, we created three types of N-mixture and occupancy models to (1) estimate the abundance/occupancy of each species at each site, (2) understand how local and landscape diversification affects species- and community-level abundance/occupancy, and (3) measure how species traits interact with diversification variables to affect abundance/occupancy (Supplemental methods). We considered community-level parameters to be statistically significant when their 95% Bayesian credible interval (BCI; the range between the 2.5th and 97.5th percentiles of the posterior distribution) did not overlap zero. In contrast, we considered species-level parameters to be statistically significant when their 90% BCI did not overlap zero, as species-level effects are estimated with lower sample sizes and thus less power [15]. We also determined whether species varied in their responses to local diversification, landscape context, and/or their interactions by examining the variation (s parameter) associated with each slope term (Supplemental methods). Responses were considered to vary significantly among species when the 90% highest posterior density interval of s did not overlap 0 (as in [15]).
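The percentile-based significance rule can be illustrated with a short sketch. It assumes the posterior draws for one parameter are available as a plain list of numbers, and all function names are hypothetical. It computes an equal-tailed interval only; the highest posterior density interval used for the s parameter is a different calculation and is not shown.

```python
def percentile(sorted_draws, p):
    """Linearly interpolated percentile (p in 0..100) of pre-sorted values."""
    k = (len(sorted_draws) - 1) * p / 100
    lower = int(k)
    upper = min(lower + 1, len(sorted_draws) - 1)
    step = sorted_draws[upper] - sorted_draws[lower]
    return sorted_draws[lower] + step * (k - lower)


def credible_interval(draws, level=0.95):
    """Equal-tailed interval, e.g. the 2.5th to 97.5th percentiles for 95%."""
    tail = (1 - level) / 2 * 100
    ordered = sorted(draws)
    return percentile(ordered, tail), percentile(ordered, 100 - tail)


def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    low, high = interval
    return low > 0 or high < 0


# Made-up posterior draws for a slope parameter (not study data)
draws = [i / 100 for i in range(1, 102)]
community_level = excludes_zero(credible_interval(draws, level=0.95))
species_level = excludes_zero(credible_interval(draws, level=0.90))
```

Under the rule in the text, the 95% level would apply to community-level parameters and the 90% level to species-level parameters.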
To quantify bird conservation metrics, we extracted the number of individuals (and occupancy state) for each species at each site across 3000 posterior iterations of the N-mixture and occupancy models. We then calculated the species richness, Shannon diversity, and total bird abundance for each point-count location and each posterior iteration. To quantify the “conservation value” of each community, we extracted posteriors from the occupancy model and then calculated the average conservation score across all species estimated to occur at each site. Finally, we calculated the median and inverse interquartile range of each metric across all posteriors (see Supplemental methods for more information).
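As a rough sketch of this step, the code below computes richness, Shannon diversity, and total abundance for one site in each posterior draw, then summarizes each metric by its median and inverse interquartile range. The nested-list layout (species, then site, then draw) mirrors the array dimensions described for the RDS files, but the function names, the quartile method, and the handling of a zero interquartile range are assumptions of this example.

```python
import math
import statistics


def site_metrics(counts):
    """Richness, Shannon diversity, and total abundance for one posterior draw.

    `counts` holds the abundance of every species at one site.
    """
    total = sum(counts)
    richness = sum(1 for n in counts if n > 0)
    if total == 0:
        return richness, 0.0, total
    shannon = -sum((n / total) * math.log(n / total) for n in counts if n > 0)
    return richness, shannon, total


def summarize(values):
    """Median and inverse interquartile range across posterior draws."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: an IQR of zero is reported as an infinite inverse IQR.
    return statistics.median(values), (1 / iqr if iqr > 0 else math.inf)


def summarize_site(abundance, site):
    """`abundance[species][site][draw]` is a nested list of abundances."""
    n_draws = len(abundance[0][site])
    per_draw = [
        site_metrics([species[site][d] for species in abundance])
        for d in range(n_draws)
    ]
    names = ("richness", "shannon", "abundance")
    return {name: summarize([m[i] for m in per_draw])
            for i, name in enumerate(names)}


# Made-up example: 3 species, 1 site, 5 posterior draws
abundance = [
    [[2, 2, 4, 0, 2]],
    [[2, 0, 4, 0, 2]],
    [[0, 0, 0, 1, 0]],
]
summary = summarize_site(abundance, site=0)
```

Each entry of `summary` is a (median, inverse IQR) pair; the inverse IQR is larger for metrics that are more tightly estimated across draws.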
We measured pathogen risk in several ways. First, we quantified the number of feces detected within each 20 m transect (i.e., fecal density). Next, we created binary responses to indicate whether each of the assayed fecal samples tested positive for any pathogen. Finally, we quantified pathogen risk as the product of the total number of feces per 20 m transect and the fraction of feces testing positive for Campylobacter spp., Salmonella spp., or any E. coli virulence gene. We divided this number by 20 to ultimately arrive at an estimate of ‘potentially pathogenic fecal density’ or the number of potentially pathogenic feces per m².
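The arithmetic behind ‘potentially pathogenic fecal density’ is illustrated below. The function and argument names are hypothetical, and returning None when no samples were assayed is an assumption of this sketch rather than something stated in the methods.

```python
TRANSECT_AREA_M2 = 20  # 20 adjacent 1 m^2 quadrats along each 20 m transect


def pathogenic_fecal_density(n_feces, n_tested, n_positive):
    """Potentially pathogenic feces per square meter for one transect.

    n_feces:    total feces counted along the transect
    n_tested:   fecal samples assayed for pathogens
    n_positive: assayed samples positive for any target pathogen
    """
    if n_tested == 0:
        return None  # assumption: prevalence is undefined without assays
    fraction_positive = n_positive / n_tested
    return n_feces * fraction_positive / TRANSECT_AREA_M2


# Made-up numbers: 40 feces counted, 10 assayed, 2 positive
density = pathogenic_fecal_density(n_feces=40, n_tested=10, n_positive=2)
```

With these illustrative inputs, 20% of 40 feces gives 8 potentially pathogenic feces per transect, or 0.4 per m².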
We used generalized linear mixed models (GLMM) to test the effects of local diversification and surrounding seminatural habitat on bird conservation and pathogen risk metrics. All models included fixed effects of local diversification, grazed and ungrazed seminatural habitat within 1 km, and two interactions between local diversification and grazed and ungrazed seminatural habitat. Fecal density and pathogen risk models also included distance from the fecal transect to the nearest non-crop edge as a fixed effect to account for spatial variation in bird activity. Pathogen prevalence, fecal density, and pathogen risk models included ‘day of year’ to account for seasonal effects that may impact pathogen exposure. Bird conservation models included the inverse of the interquartile range of richness, abundance, diversity, or conservation score across posteriors as model ‘weights.’ All models included a random intercept of farm to account for spatial dependence among observations from the same farm.
We used linear mixed models for diversity, species richness, and abundance estimated from N-mixture models; conservation score estimated from occupancy models; fecal density; and pathogen risk. We used binomial GLMMs with a log link function for the probability of flocks occurring and pathogen prevalence. We transformed some variables (fourth-root: richness, abundance; log: fecal density + 1, pathogen risk + 0.1) to meet model assumptions, scaled covariates by subtracting the mean and dividing by the standard deviation, and verified that models did not display multicollinearity (Pearson correlation coefficient < 0.6). We ran models with the glmmTMB package [16] and performed model selection with the MuMIn package [17] in R. To do so, we first identified the best-supported models within 2 AIC of the top model and then used a model averaging approach to assess variable significance within these top models [18].
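The transformations, covariate scaling, and AIC-based screening can be sketched in plain Python. This does not reproduce the glmmTMB or MuMIn workflow, which was run in R; the function names are hypothetical, the sample standard deviation is assumed for scaling, and "within 2 AIC" is interpreted here as a difference of at most 2.

```python
import math
import statistics


def fourth_root(x):
    """Transformation applied to richness and abundance."""
    return x ** 0.25


def log_offset(x, offset):
    """log(x + offset): offset 1 for fecal density, 0.1 for pathogen risk."""
    return math.log(x + offset)


def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: n - 1 denominator
    return [(v - mean) / sd for v in values]


def top_models(aic_by_model, delta=2.0):
    """Names of models within `delta` AIC units of the best model."""
    best = min(aic_by_model.values())
    kept = [name for name, aic in aic_by_model.items() if aic - best <= delta]
    return sorted(kept, key=aic_by_model.get)


# Made-up covariate values and AIC scores (not study results)
scaled = standardize([1, 2, 3])
kept = top_models({"full": 103.0, "no_interaction": 101.5, "landscape": 100.0})
```

Only the models retained by `top_models` would then enter the model-averaging step.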
Finally, we visualized and analyzed community turnover between sites by first extracting the median abundance of each species at each site across all 3000 posteriors from N-mixture models and then calculating the community dissimilarity between each pair of sites (Bray-Curtis dissimilarity). We visualized differences in community composition between sites via Non-Metric Multidimensional Scaling and then used Permutational Multivariate Analysis of Variance (PERMANOVA) with the ‘adonis’ function in the ‘vegan’ library [19], with farm as a blocking factor, to assess the influence of diversification on species turnover.
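For readers unfamiliar with the metric, Bray-Curtis dissimilarity between two sites is the sum of absolute differences in species abundances divided by the sum of all abundances at both sites. The sketch below is a plain-Python illustration with hypothetical function names and made-up abundances; it is not the vegan implementation, and raising an error for two empty sites is a choice made for this example.

```python
def bray_curtis(site_a, site_b):
    """Bray-Curtis dissimilarity between two abundance vectors.

    Both vectors list the same species in the same order.
    Returns 0 for identical communities and 1 for no shared species.
    """
    numerator = sum(abs(a - b) for a, b in zip(site_a, site_b))
    denominator = sum(a + b for a, b in zip(site_a, site_b))
    if denominator == 0:
        raise ValueError("dissimilarity is undefined when both sites are empty")
    return numerator / denominator


def pairwise_dissimilarity(sites):
    """Symmetric matrix of Bray-Curtis dissimilarities between all sites."""
    n = len(sites)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = bray_curtis(sites[i], sites[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Made-up median abundances for 3 species at 3 sites
sites = [[1, 0, 3], [1, 2, 1], [0, 4, 0]]
dissimilarity = pairwise_dissimilarity(sites)
```

A matrix of this kind is the input to both the ordination and the PERMANOVA described above.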
References
1. CDFA. 2020 California agricultural statistics report: 2019-2020.
2. California Leafy Greens Marketing Association. 2020 Commodity specific food safety guidelines for the production and harvest of lettuce and leafy greens.
3. County of Monterey Agricultural Commissioner. 2021 Monterey County crop & livestock report: Salad bowl of the world. (doi:10.1145/312379.313069)
4. San Benito County Agricultural Commissioner. 2020 San Benito County crop & livestock report.
5. Carlisle L et al. 2022 Organic farmers face persistent barriers to adopting diversification practices in California’s Central Coast. Agroecol. Sustain. Food Syst. 00, 1–28. (doi:10.1080/21683565.2022.2104420)
6. Ralph CJ, Martin TE, Geupel GR, Desante DF, Pyle P. 1993 Handbook of field methods for monitoring landbirds. Gen. Tech. Rep. PSW-GTR-144-www.
7. Smith OM et al. 2022 A trait‐based framework for predicting foodborne pathogen risk from wild birds. Ecol. Appl. 32, e2523. (doi:10.1002/eap.2523)
8. North American Bird Conservation Initiative. 2016 The State of North America’s Birds 2016.
9. California Department of Conservation. 2016 Farmland Mapping & Monitoring Program GIS Shapefiles.
10. Paton AW, Paton JC. 2002 Direct detection and characterization of shiga toxigenic Escherichia coli by multiplex PCR for stx1, stx2, eae, ehxA, and saa. J. Clin. Microbiol. 40, 271–274. (doi:10.1128/JCM.40.1.271)
11. Bryan A, Youngster I, McAdam AJ. 2015 Shiga toxin producing Escherichia coli. Clin. Lab. Med. 35, 247–272. (doi:10.1016/j.cll.2015.02.004)
12. Royle JA. 2004 N-mixture models for estimating population size from spatially replicated counts. Biometrics 60, 108–115. (doi:10.1111/j.0006-341X.2004.00142.x)
13. Ficetola GF et al. 2018 N-mixture models reliably estimate the abundance of small vertebrates. Sci. Rep. 8, 10357. (doi:10.1038/s41598-018-28432-8)
14. Kéry M. 2018 Identifiability in N-mixture models: a large-scale screening test with bird data. Ecology 99, 281–288. (doi:10.1002/ecy.2093)
15. Frishkoff LO, Karp DS. 2019 Species‐specific responses to habitat conversion across scales synergistically restructure Neotropical bird communities. Ecol. Appl. 29, e01910. (doi:10.1002/eap.1910)
16. Magnusson A, Skaug H, Nielsen A, Berg C, Kristensen K, Maechler M, Brooks M. 2016 glmmTMB: Generalized linear mixed models using template model builder.
17. Bartoń K. 2020 MuMIn: Multi-Model Inference. R package version 1.43.17.
18. Burnham KP, Anderson DR. 2002 Model selection and multimodel inference: A practical information-theoretic approach, 2nd Ed. Springer-Verlag.
19. Oksanen J et al. 2022 vegan: Community Ecology Package, R package version 2.6-2.