Patterns in bird and pollinator occupancy and richness in a mosaic of urban office parks across scales and seasons
Data files
Dec 27, 2023 version files 74.56 KB
-
data.zip
-
metadata.txt
-
README.md
Abstract
Urbanization is a leading cause of global biodiversity loss, yet cities can provide resources required by many species throughout the year. In recognition of this, cities around the world are adopting strategies to increase biodiversity. These efforts would benefit from a robust understanding of how natural and enhanced features in urbanized areas influence various taxa. We explored seasonal and spatial patterns in occupancy and taxonomic richness of birds and pollinators among office parks in Santa Clara County, California, USA, where natural features and commercial landscaping have generated variation in conditions across scales. We surveyed birds and insect pollinators, estimated multi-species occupancy and species richness, and found that spatial scale, season, and urban sensitivity were all important for understanding how communities occupied sites. Features at the landscape- and local-scale (i.e., distance to streams or baylands and tree canopy, shrub, or impervious cover, respectively) were the strongest predictors of avian occupancy in all seasons. The pollinator richness index was influenced by local tree canopy and impervious cover in spring, and distance to baylands in early and late summer. We predicted relative contributions of different spatial scales to annual bird species richness by assigning values to simulated sites representing “good” and “poor” quality, based on influential covariates returned by models. Shifting from poor to good quality conditions locally increased annual avian richness by up to 6.8 species with no predicted effect of the quality of the neighborhood. Conversely, sites of poor local- and neighborhood-scale quality in good quality landscapes were predicted to harbor 11.5 more species than sites of good local- and neighborhood-scale quality in poor quality landscapes. Finally, more urban sensitive bird species were gained at good quality sites relative to urban tolerant species, suggesting that urban natural features at the local- and landscape-scales disproportionately benefited them.
README: Patterns in bird and pollinator occupancy and richness in a mosaic of urban office parks across scales and seasons
https://doi.org/10.5061/dryad.0k6djhb68
Description of the data and file structure
Scripts repository:
- 00_shared_functions: code sourced in higher-level scripts
- 01_bird_spoccupancy: code to run avian dynamic multispecies models and model predictions using R package spOccupancy.
- 02_pollinators: code to run Poisson regressions for pollinator data.
- 01_manuscript_values.R: code that extracts general values for the manuscript, including tables and plots
- 02_spatial_autocorrelation.R: spatial autocorrelation analysis for both birds and pollinators
- 03_species_probability_bar_chart_occupancy.R: code for producing appendix figure
- 04_species_probability_bar_chart_detection: code for producing appendix figure
- 05_occupancy_summary_panel.R: code for producing main text Figure 2
- 06_best_worst_site_values.R: code for producing main text Figure 3
- 07_richness_change.R: code for summarizing predicted richness change for changes in environmental variables.
Data repository:
- model_data.RDS: bird data structured for input to the spOccupancy R package.
- pollinator_survey_data.RDS: pollinator survey data
Code/Software
Scripts were run using R and packages available through the CRAN R package repository. Scripts are numbered in order of intended use.
Methods
Study area, sites, and species
Surveys of avian and pollinator communities were conducted at 45 1-hectare sites located in 3 cities—Palo Alto, The City of Mountain View, and Sunnyvale—in the Santa Clara Valley of the south San Francisco Bay Area, California (Figure 1). This area is known colloquially as ‘Silicon Valley,’ famous for high-tech companies and start-ups. Santa Clara Valley experienced rapid urbanization in the late 20th century, with commercial and office park developments concentrated near San Francisco Bay in areas developed later than adjacent uplands due to higher groundwater levels (Grossinger et al. 2007, Beller et al. 2020). The resulting study area is a gradient mosaic of land use bounded in the north by tidal marsh mudflats and salt ponds of the Bay; moving southward through a band of semi-natural vegetation of varying quality and comprising private and protected open space, a golf course, airfield, and decommissioned landfills; and ending in the south among office parks and medium- to high-density human residential and small business areas with varying degrees of vegetative cover (Figure 1). Four primary creeks (San Francisquito, Adobe, Permanente, and Stevens creeks) and a few canals and sloughs run through the study area, all of which drain to the bay. Varying amounts of natural vegetation occur in and adjacent to the channels or on top of the bordering levees. The study area is characterized by a Mediterranean climate, with the region receiving 250 to 500 mm of precipitation annually, with warm, dry summers and cooler, wet winters during which the majority of rainfall occurs (McKee et al. 2003).
Most (82%) of the 1-ha study sites were characterized entirely by proportions of the four land cover types (Open Space Developed and/or Low, Medium, or High-Intensity Developed) developed by Homer et al. (2020). The remaining eight sites additionally included proportions of other land cover types (Cultivated Crops, Grassland/Herbaceous, Emergent Herbaceous Wetland, and/or Open Water). Sites varied by the proportions of impervious surface cover (e.g., parking lots, portions of corporate campuses) and different types of vegetated landscaping. These urban sites were embedded in neighborhoods that ranged from corporate campuses to more natural areas (e.g., city parks, urban nature preserves, golf courses, and riparian areas), and varied in their distances to the San Francisco Bay and urban streams in the wider landscape. Sites were on average 3.48 km apart (standard deviation ±2.48 km, range 0.14 to 11.82 km), and were irregularly shaped because they were defined by parcel boundaries and excluded building footprints. The 1-ha sites were selected for sampling because of the anticipated addition of native plantings in future years; thus sites during the years of this study (2015-2017) also represented the baseline conditions for a long-term monitoring study. Birds and pollinators were selected for study because native landscaping was and will be targeted with the requirements of birds and pollinators in mind; the expectation is that surveys of these taxa will continue to monitor bird and pollinator responses to the native plant enhancements over time.
Bird surveys
Avian field surveys were performed from winter 2015 through fall 2016 during four seasons relevant to the occupancy of breeding and non-breeding birds in the study area: overwinter (November 2015 to January 2016), spring migration (March to May 2016), summer breeding (May to July 2016), and fall migration (August to October 2016). Sites were surveyed 8 to 9 times per season, at approximately 1-week intervals, using 20-minute passive area searches during which all birds within the survey site were recorded (Loyn 1986, Slater 1994). This time- and area-constrained method is recommended for habitat patches of small size that will not allow for the placement of multiple independent point counts. Area searches allow for movement around a site and were originally devised for winter surveys when birds are mostly detected visually (versus during the breeding season when birds are mostly detected aurally; Twedt et al. 2008). Estimates derived from this method are comparable and sometimes higher than line transects or strip transects (Watson 2004, Roberts and Schnell 2006). Surveys were completed within 4 hours of sunrise. Due to construction activities, not all 45 sites were surveyed each season, resulting in 35 to 44 sites surveyed per season.
Pollinator surveys
Visual surveys of insect pollinators were conducted from spring to late summer of 2017. Seasons were selected to capture the seasonal variation in floral resources and pollinator activity: spring (April 27 to May 2), early summer (May 31 to June 2), and late summer (August 31). We placed one pollinator survey plot of 100 square meters within each of the larger 1-hectare sites and sampled the same plot during each visit. We selected the plot location in vegetated areas with floral resources that we assumed would be most attractive to pollinators. Thus the plots were not representative of the 1-ha plots which were often characterized by largely unvegetated areas such as parking lots and paved meeting areas, and instead were representative of vegetation patches with the highest floral resources within the 1-ha sites. Due to construction activities, not all sites were accessible during every season, and surveys were conducted 1 to 3 times per site (mean: 2.57 visits). Fifteen-minute visual surveys occurred between 10:00 and 15:00. A practiced entomologist—with experience in visually identifying most regional floral insect visitors by sight—conducted all pollinator surveys. The surveyor followed a meandering transect through the plot and recorded all encountered invertebrates located on the reproductive parts of flowers. In the interest of not depleting native pollinators, the surveyor visually identified invertebrates to the lowest feasible taxonomic level. Dipterans were identified to family, Hymenopterans were identified to genera, and Lepidopterans to species except for Hesperia sp., Hylephila sp., and Phyciodes sp.. To confirm visual identification of the most commonly encountered cryptic species, a sample individual was hand-captured and identified in the lab using a dissecting scope, an independently confirmed reference collection, and a definitive reference book by Michener (2000). For the complete list of taxa (n = 42), see Appendix 1: Table 1.
Environmental variables
Environmental variables were calculated at three spatial scales: local, neighborhood, and landscape (Appendix 1: Table 2). Local-scale variables were derived from data collected at 1-ha sites during the summer of 2016 to characterize vegetation composition, land cover, and vertical vegetation complexity. These data represent direct measures of vegetation cover and complexity in the summer. Because we did not collect vegetation data in other seasons, however, we made the assumption that differences between sites during fall, winter, and spring are represented by differences between sites during the summer. In 1-ha plots, we counted and identified (to at least genus level) all native and non-native trees (where trees are defined as woody plants with trunks ≥10 cm diameter at breast height). From these counts, we calculated the ratio of native to non-native trees. Within each 1-ha site, we used GIS to randomly select six 10 m x 5 m subplots. Within each subplot we visually estimated the percentage of ground cover for impervious surface and each native and non-native herbaceous, tree, and shrub species independently in 1% increments and 0.1% placeholders for trace quantities of a given cover type.. Vertical vegetation complexity was assessed using a 1 x 6-foot cover board, divided into six one-foot vertical bands, following Herrick et al. (2009 pp. 61–62). Measurements were taken 4.5 meters away from the observer in each cardinal direction. Bands were considered obscured if >25% was covered in vegetation. The number of obscured bands was divided by the total number of bands and averaged across all measures taken at the site. Values from the six subplots were averaged and considered representative of the 1-ha site as a whole.
We quantified neighborhood-scale variables (tree canopy and impervious cover) within 500m buffers around sites using ArcGIS (ESRI 2020). We chose this buffer distance because it is the minimum value of the range of most commonly used distance buffers (500 m to 2500 m) from which variables significantly associated with birds have been derived (Litteral and Shochat 2017). It is also a distance buffer shown to be relevant to urban insects (Turrini & Knop 2015). Canopy cover was derived from Earth Define (2013) 60-cm-resolution Tree Canopy data set, and impervious cover was derived from the 30-m-resolution 2016 National Land Cover Database (Homer et al. 2020).
We calculated landscape-scale variables to characterize the site’s position relative to water features. Distance to the San Francisco Baylands (i.e., the closest tidal bay flats or open bay from the site edge) was measured using the California Aquatic Resource Inventory (SFEI 2017a). Distance to the nearest stream or channel was calculated using the Bay Area Aquatic Resources Inventory (SFEI 2017b).
Occupancy models
We used robust-design, dynamic, multi-species occupancy models to evaluate species-level occupancy and community composition of birds across a range of urban conditions while correcting for imperfect detection (Iknayan et al. 2014). We temporally partitioned the sampling occasions in each season into three primary sampling periods each comprising triplet adjacent samples. The model assumes closure within the secondary sampling periods, but permits changes in occupancy between primary periods (Altwegg and Nichols 2019, Pollock 1982, MacKenzie et al. 2003). We modeled each season independently with single-season occupancy models rather than linked multi-season models (Dorazio et al. 2010). Since many species enter or leave the study area between seasons, a multiseason model would capture migratory behaviors for many species rather than extinction and colonization dynamics driven by site suitability. Further, the multiseason approach requires the same species list be used across all seasons, including those not available for sampling, which could introduce additional uncertainty in parameter estimation. Only species that were present in at least one site during that season were included in that season’s model.
In our study, we define the term "occupancy" as the use of a site at any point during the sampling season. A core assumption of occupancy models is that sites experience no species immigration or emigration during a season (i.e., the “closure assumption”). Violating the closure assumption can overestimate occupancy when detection probabilities are low, potentially leading to inflated species richness estimates (MacKenzie et al. 2018, 146). Given the dynamic migratory behavior of birds in spring and fall, specific subsets of species, particularly those with transient behavior, could wield relatively greater influence on covariate estimations through inflation of their estimated occupancy. To further accommodate migration of species during the course of a season beyond the use of a dynamic model with secondary sampling periods, we allowed variation of detection probabilities over time among survey periods. This approach serves to mitigate the impact of migration on our analysis (MacKenzie et al. 2018). To assess the adequacy of our models and the impacts of violating underlying assumptions, we conducted a posterior predictive check (PPC) as a goodness of fit test for all models using a ?2 fit statistic, calculated with the spOccupancy R package (Doser et al. 2022). Our analysis revealed no indication of a lack of fit in any of the seasonal models (PPC: winter = 0.39, spring = 0.41, summer = 0.42, fall = 0.40 ). If potential biases persist, we believe that these should not significantly distort comparisons between different sites or compromise our evaluation of how site characteristics interact with species behaviors, given the close proximity of our study sites, we reasonably anticipate that migratory birds' departure and arrival times will exhibit similar trends across the various locations.
Multispecies occupancy models assume that the set of species modeled is composed of species that will respond similarly, but not identically, to environmental variables (Dorazio et al. 2010, Kéry and Royle 2016, 667). The models allow for heterogeneity in species responses to environmental conditions for both detectability and occupancy, including the direction of response. We address the assumption of ecological similarity by reducing the set of species modeled to represent those with similar resource use, dependence on nearby water bodies, and similar home ranges and body size. Based on these criteria, we included passerines (Passeriformes), hummingbirds (Apodiformes), doves (Columbiformes), and woodpeckers (Piciformes) (n = 80 species; see Appendix 1: Table 3 for species list, scientific names, and the seasons in which species were detected). Nocturnal species, shorebirds, waterfowl, waterbirds, game birds, and raptors were excluded from the analysis. We assume that given the heterogeneity allowed by the model, species with different migratory patterns and species with varying urban tolerances can be modeled jointly. For the set of modeled species, we define the “community” as a set of species occurring in the same place at the same time (Ricklefs 2008).
Model predictors of avian occupancy were fit to evaluate processes at the local scale (canopy cover, proportion native trees, impervious cover, total shrub cover, native shrub cover, and vertical complexity), neighborhood (matrix canopy cover and matrix impervious cover), and landscape scale (distance to baylands, distance to stream; Appendix 1: Table 2). The pairwise Pearson correlation coefficient for all included environmental variables was <|0.6|. Detection was modeled as a function of Julian day and its quadratic, time after sunrise (in minutes), and the ambient noise at the site during the survey, measured by a phone app sound meter and external microphone (in Leq).
Field encounters of species i at site j during primary sampling period k and secondary period t, denoted as y_ijkt (y=1 if the species was encountered and y=0 otherwise), were assumed to be the result of the imperfect detection of a species’ true incidence at a site: z_ijk (z=1 if the species occurred at the site, and z=0 otherwise). The incidence of a species was modeled as conditional on the probability that a species occupies that site (ψ_ijk):z_ijk |ψ_ijk ~ Bernoulli (ψ_ijk)
Whereas, the field encounters were the probability of detecting that species during a survey, p_ijkt, given its incidence at a site:
y_ijkt |z_ijk,p_ijkt ~ Bernoulli (z_ijk p_ijkt)
Occupancy and detection probabilities were modeled as linear combinations of covariates along with an intercept using a logit-link transformation:
logit(ψ_ijk )= αΟ_i+ α_i Χ_j
logit(p_ijkt )= βΟ_i+ β_i Χ_jkt
Species-specific intercept and covariate values were modeled as normally distributed governed by community-level hyperparameters, for example:
αΟ_i= Normal(µ_(αΟ,community ),τ_(αΟ,community))
All intercepts and covariates were drawn from separate distributions and followed this same form. The community-level means and precisions were assigned the following priors:
µ_(αΟ,community )~ Normal(0,σ^2= 2.72)
τ_(αΟ,community) ~ Inverse Gamma(0.1,0.1)
which translates to a relatively uniform prior when transformed on the probability scale (0, 1). Research on prior selection has demonstrated this an appropriate specification for relatively non-informative priors for occupancy models (Broms et al. 2016, Northrup and Gerber 2018). Models were fit using Markov chain Monte Carlo methods as implemented by the spOccupancy package in R which employs Pólya-Gamma data augmentation for efficient sampling (Doser et al. 2022). 13,200 samples of the posterior were generated for each model (3 chains, 54,000 iterations, 10,000 burn-in, and a thinning of 10). Pre- and post-processing of data was done in R (R Core Team 2020). Chains were considered converged if the Gelman-Rubin statistic was <1.1 for all stochastic nodes (Gelman et al. 2014). Significance throughout the analyses is defined as values having non-overlapping 95% highest-density interval (HDI) when comparing numbers, or a 95% HDI that is non-overlapping with zero when evaluating the significance of covariate values. Species-specific responses to environmental variables were considered strong when their 75% HDIs did not overlap with zero. Code and spatially anonymized data in Iknayan et al. (published on acceptance). The community-level distributions for covariates provide inferences of the drivers of overall community response. Whereas, species-specific values can help investigate the responses of individual species that underlie the community response.
Total richness and richness of a subset of the community are presented throughout as the number of species potentially gained or lost by management actions, which likely provides a more intuitive summary of outcomes for landholders than occupancy probability values. One of the notable advantages of the Bayesian multispecies occupancy model is that it returns the estimation of every species’ presence or absence at each survey site, represented as a binary incidence matrix (z_ijk), as a fundamental output. In this matrix, zeros signify estimated absence at a site, and ones indicate estimated presence. This estimation is returned for every sample of the posterior distribution, making it straightforward to link error to the estimated presence or absence of each species. This feature enables the easy estimation of species richness within any chosen subset of the community by simply summing the incidence values from the resulting posterior distribution. Furthermore, the Bayesian framework extends this flexibility to making predictions. At any given value for a predicted variable, the model returns estimated incidence values for each species across the entire posterior distribution. As a result, errors in estimation can be directly associated with these predicted incidence values and for any incidence-derived community-level index such as richness. Throughout predictions were generated using the R function spOccupancy::predict.tMsPGOcc.
Predicting the relative contribution of spatial scales to avian richness
We made predictions about the relative contribution of local-, neighborhood-, and landscape-scales to differences in annual bird species richness by simulating data for hypothetical sites under assumed “good” and “poor” quality conditions at each scale, based on the significant covariates returned by the community occupancy models in any season. At each spatial scale, we set the significant variables having positive or negative associations with community occupancy to their maximal or minimal values, respectively, and predicted annual richness at these hypothetical sites with different combinations of “good” or “poor” conditions at each scale. Our hypothetically “good” conditions at the local-, neighborhood-, and landscape-scale were set to have the maximum canopy cover and minimum impervious cover at the local- and neighborhood-scale and the minimum distance from the Bay and streams source at the landscape scale (i.e., “good” conditions based on significant model coefficients associated with higher richness). All other environmental variables were held to their average values. Hypothetically “poor” sites were set to opposite conditions at those scales. We evaluated combinations of “good” and “poor” conditions at different scales to evaluate the impacts of different actions across scales.
Bird urban-tolerance score
To evaluate whether site characteristics benefited more urban-tolerant or more urban-sensitive avian species, we used species-level urban-tolerance scores provided in (Callaghan et al. 2021). Callaghan et al.’s analysis quantifies species-specific responses to urbanization in the Western Hemisphere by integrating citizen science observations from eBird with urbanization intensity as measured by night-time light intensity (Elvidge et al. 2017) . We averaged Callaghan et al.’s monthly, species-specific urban tolerance scores for the seasons in which a species was detected in the study area. Species scores were then classified into “urban sensitive” (0.33 quantile of the community-level distribution of scores), “urban neutral” (0.33 to 0.66 interquartile range), and “urban tolerant” (0.66 to 1.00 interquartile range). The dataset covers all but three species in our analysis: golden-crowned sparrow, cliff swallow, and willow flycatcher. Given the lack of data on these species, they were excluded from our assessment of urban tolerance. Previous work has found that continental measures of urban-tolerance align with local measures of urban tolerance (Callaghan et al. 2020).
Pollinator taxonomic richness
Pollinator surveys did not include within-season resurveys, preventing the use of occupancy models and the correction of potential variation in pollinator detectability. In light of these limitations, we assumed pooled raw pollinator taxonomic richness values from the surveys could provide an index to evaluate site characteristics’ effects on pollinator diversity. Only native or potentially native taxa were included in the analysis and nonnative pollinators were excluded from analysis: Apis sp. and Pieris rapae (Appendix 1: Table 1). Taxa with potentially invasive species were evaluated based on known ranges of invasive species (Droege 2021): Hylaeus sp., Megachile sp., Polistes sp., Vespula sp. These taxa were included in the final analysis because they occurred in low numbers and models including and excluding these taxa returned the same significant covariates and similar covariate values. The same environmental variables used in the avian multi-species occupancy model were included as predictors of pollinator richness using a Poisson linear model (GLM) with a variance of ϕ × μ, where μis the mean and ϕis the dispersion. Models were fit using the glm() function in R. A standard Poisson model was appropriate given that no overdispersion was detected in the standard errors.
Assessment of Spatial Autocorrelation
As site selection was restricted to participating landowners, we assessed potential spatial autocorrelation that could have resulted from the non-random placement of sampling locations. For birds, we analyzed residual spatial autocorrelation using methods explicitly designed for occupancy models presented by Wright et al (2019). Latent occupancy residuals for each species and site were obtained by subtracting the probability of occupancy (ψ_ijk) from the incidence (z_ijk) across the entire posterior distribution. These residuals may more appropriately termed “discrepancies” as incidence is partially latent. We assessed spatial autocorrelation for every species in each model using Moran’s I at 500 m distance bands from 0 to 5 km. Band distance was chosen following guidelines by Fletcher and Fortin (2018) to allow for a sufficient number of pairs for comparison in each distance band (>20 pairs) and capped at 5 km to avoid boundary effects. We found limited evidence of spatial autocorrelation in our avian models; only 0.34% (n = 7) of our species-season pairs in any distance class in any season had 95% HDIs non-overlapping zero (Appendix 1 Figure 4). This overlap did not show a pattern with distance or seasonality, and there was no spatial autocorrelation evident for sites separated by <1 km. For pollinators, we calculated Moran’s I using the R package pgirmess::correlog, using the Sturges method to compute the optimal number of distance classes (Giraudoux 2023). There was no significant spatial autocorrelation evident for sites <5km apart.