
Wildfire smoke affects detection of birds in Washington state

Cite this dataset

Sanderfoot, Olivia (2022). Wildfire smoke affects detection of birds in Washington state [Dataset]. Dryad.


Wildfire smoke is likely to have direct health effects on birds, as well as influence movement, vocalization, and other avian behaviors. These behavioral changes may affect whether and how birds are observed in the wild, although research on the effects of wildfire smoke on bird behavior is limited. To evaluate how wildfire smoke affects detection of birds, we combined data from eBird, an online community science program, with data from an extensive network of air quality monitors in the state of Washington over a 4-year period. We assessed how PM2.5, a marker of smoke pollution, affected the probability of observing 71 bird species during the wildfire seasons of 2015–2018 using bird observations from 62,908 eBird checklists. After accounting for habitat, weather conditions, seasonality, and survey effort, we found that PM2.5 affected the probability of observing 37% of study species. The ambient concentration of PM2.5 was negatively associated with the probability of observing 16 species and positively associated with the probability of observing 10 species, indicating that birds exhibit species-specific behavioral changes during wildfire smoke events that influence how they are observed. Our results suggest that wildfire smoke impacts the presence, availability, and/or perceptibility of birds. Impacts of smoke pollution on human observers, such as impaired visibility, may also influence detection of birds. These results provide a foundation for developing mechanistic hypotheses to explain how birds, and our studies of them, are impacted by wildfire smoke. Given the projected increase in large-scale wildfire smoke events under future climate change scenarios, understanding how birds are affected by wildfire smoke, and how air pollution may influence our ability to detect them, is an important next step to inform wildlife research and avian conservation.


This dataset was used by Sanderfoot and Gardner (2021) to investigate how smoke pollution influenced the probability of detecting birds in Washington state during the wildfire seasons of 2015–2018. This README file provides a brief overview of the dataset and a description of the variables included in our analysis. A complete description of the methods used to prepare the dataset is provided in Sanderfoot and Gardner (2021).

Bird Observations: This dataset includes data on observations of 71 species collected by the public and submitted to eBird (eBird Basic Dataset 2021). eBird supports opportunistic data collection by volunteers who submit online checklists of the species they observe, along with related information such as date, time, duration and type of survey, distance traveled, and number of observers (Sullivan et al. 2009). We included bird observations from complete checklists documented in Washington state between July 1 and September 30 in each year from 2015 to 2018. The observations were filtered using the auk package (Strimas-Mackey et al. 2018) in R (R Core Team 2020) to include only those from stationary, traveling, and area counts. We also restricted the dataset to checklists within 32 kilometers of an active fine particulate matter (PM2.5) monitor in Washington state. For additional details on the bird observations included in this dataset, please see the methods section of Sanderfoot and Gardner (2021).
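
The checklist filters described above can be sketched in a few lines of Python. This is an illustration only, not the auk workflow used to build the dataset: the protocol labels ("Stationary", "Traveling", "Area") are an assumption about how eBird names these survey types, and the example rows are hypothetical. The column names follow those listed under Usage notes.

```python
from datetime import date

# Protocol labels are an assumption about how eBird names these survey types.
ALLOWED_PROTOCOLS = {"Stationary", "Traveling", "Area"}
STUDY_YEARS = range(2015, 2019)  # 2015 through 2018


def in_study_window(obs_date: date) -> bool:
    """True if the date falls between July 1 and September 30 of a study year."""
    month_day = (obs_date.month, obs_date.day)
    return obs_date.year in STUDY_YEARS and (7, 1) <= month_day <= (9, 30)


def keep_checklist(row: dict) -> bool:
    """Apply the complete-checklist, protocol, and date filters to one checklist."""
    return bool(
        row["all_species_reported"]
        and row["protocol_type"] in ALLOWED_PROTOCOLS
        and in_study_window(row["observation_date"])
    )


# Hypothetical checklists, for illustration only.
checklists = [
    {"checklist_id": "A", "all_species_reported": True,
     "protocol_type": "Traveling", "observation_date": date(2017, 8, 15)},
    {"checklist_id": "B", "all_species_reported": True,
     "protocol_type": "Incidental", "observation_date": date(2017, 8, 15)},
    {"checklist_id": "C", "all_species_reported": False,
     "protocol_type": "Stationary", "observation_date": date(2016, 7, 4)},
    {"checklist_id": "D", "all_species_reported": True,
     "protocol_type": "Area", "observation_date": date(2018, 10, 1)},
]

kept = [row["checklist_id"] for row in checklists if keep_checklist(row)]
print(kept)
```

Only checklist "A" passes all three filters here: "B" uses an excluded protocol, "C" is not a complete checklist, and "D" falls after September 30.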

Environmental Data:

  • "particulates": daily mean concentration of PM2.5 in μg/m3

We used data from the Environmental Protection Agency (EPA) Air Quality System (AQS) (US Environmental Protection Agency 2019). We downloaded daily PM2.5 concentrations (24-hour averages) available from all ground-based PM2.5 monitors in the state of Washington for the years 2015 through 2018. We then linked each of the checklists to the daily concentration of PM2.5 at the monitoring station closest to the checklist location.
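
As a rough illustration of the nearest-monitor linkage and the 32-kilometer cutoff, the sketch below uses the haversine great-circle distance. The distance measure is an assumption (this README does not state how distances were computed), and the monitor names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
MAX_DISTANCE_KM = 32.0    # cutoff stated in this README


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_monitor(lat, lon, monitors):
    """Return (monitor id, distance in km) for the closest monitor,
    or None if that monitor is farther away than the cutoff."""
    best = min(monitors, key=lambda m: haversine_km(lat, lon, m["lat"], m["lon"]))
    dist = haversine_km(lat, lon, best["lat"], best["lon"])
    return (best["id"], dist) if dist <= MAX_DISTANCE_KM else None


# Hypothetical monitor locations, for illustration only.
monitors = [
    {"id": "monitor_west", "lat": 47.61, "lon": -122.33},
    {"id": "monitor_east", "lat": 47.66, "lon": -117.43},
]

print(nearest_monitor(47.65, -122.30, monitors))  # a few km from monitor_west
print(nearest_monitor(46.00, -120.00, monitors))  # beyond 32 km from both monitors
```

In the dataset itself, the matched monitor's 24-hour average for the checklist's observation date would then supply the "particulates" value.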

  • "air": daily mean air temperature in Kelvin
  • "apcp": daily accumulated precipitation in mm

We used data from the North American Regional Reanalysis (NARR) to determine daily mean air temperature and daily accumulated precipitation for each eBird checklist. NARR data were provided by the National Oceanic and Atmospheric Administration Physical Sciences Laboratory in Boulder, Colorado, USA, from their website. The spatial resolution of NARR data is approximately 32 kilometers. We used the ncdf4 package (Pierce 2019) in R (R Core Team 2020) to extract the weather data.

  • "land_cover": land cover class (categorical)

We assigned a land cover type to each checklist using data from the 2016 National Land Cover Database (NLCD), developed by the Multi-Resolution Land Characteristics (MRLC) Consortium and made available as a layer in ArcMap by the Environmental Systems Research Institute (Esri) (Esri 2019, 2020). Land cover classifications were grouped into nine categories: open water, perennial ice/snow, developed, barren, forest, shrubland, herbaceous, planted/cultivated, and wetlands.
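
The grouping into nine categories can be illustrated with a lookup table keyed on the standard NLCD legend codes. This is a sketch under an assumption: this README does not say whether the grouping was done from these numeric codes or how the "land_cover" column is encoded, so the mapping below only shows how the NLCD legend collapses into the nine categories named above.

```python
# Standard NLCD legend codes grouped into the nine categories named in this README.
# Whether the authors grouped classes by these numeric codes is an assumption.
_GROUPS = {
    "open water": [11],
    "perennial ice/snow": [12],
    "developed": [21, 22, 23, 24],
    "barren": [31],
    "forest": [41, 42, 43],
    "shrubland": [51, 52],
    "herbaceous": [71, 72, 73, 74],
    "planted/cultivated": [81, 82],
    "wetlands": [90, 95],
}

# Invert to a code -> category lookup.
NLCD_TO_CATEGORY = {code: name for name, codes in _GROUPS.items() for code in codes}


def land_cover_category(nlcd_code: int) -> str:
    """Return the grouped land cover category for an NLCD legend code."""
    try:
        return NLCD_TO_CATEGORY[nlcd_code]
    except KeyError:
        raise ValueError(f"Unrecognized NLCD code: {nlcd_code}") from None


print(land_cover_category(42))  # evergreen forest -> "forest"
print(land_cover_category(95))  # emergent herbaceous wetlands -> "wetlands"
```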

  • "year": year of observation date (categorical)

Works Cited:

eBird Basic Dataset. Version: EBD_relJan-2021. Cornell Lab of Ornithology, Ithaca, New York. January 2021.

Esri Inc. 2020. ArcMap (Version 10.8.1). Esri Inc., Redlands, California, USA.

Pierce, D. 2019. ncdf4: Interface to Unidata netCDF (Version 4 or Earlier) Format Data Files. R package version 1.17.

R Core Team. 2020. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Strimas-Mackey, M., E. Miller, and W. Hochachka. 2018. auk: eBird Data Extraction and Processing with AWK. R package version 0.3.0.

Sullivan, B. L., C. L. Wood, M. J. Iliff, R. E. Bonney, D. Fink, and S. Kelling. 2009. eBird: a citizen-based bird observation network in the biological sciences. Biological Conservation 142:2282-2292.

US Environmental Protection Agency. 2019. Air Quality System Data Mart [internet database]. Accessed June 27, 2019.

Usage notes

The final dataset included detection/non-detection data for 71 species from 62,908 eBird checklists, each of which represented a single survey. These checklists were submitted by 4,865 unique eBird observers and linked to air quality data from a total of 71 PM2.5 sensors.

This dataset contains 4,466,610 rows and 40 columns. Each row includes detection/non-detection data for one of 71 study species and a complete set of covariates for a single eBird checklist (i.e., survey). For example, row 43 contains detection/non-detection data for American Robins (Turdus migratorius) in eBird checklist S24832105. There are 62,910 unique eBird checklists in the data set. Note that 62,910 checklists x 71 study species = 4,466,610 rows.
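
The row count follows from the long format, in which every checklist is repeated once per study species. The sketch below checks the arithmetic and shows one way to confirm that a table has exactly one row per checklist and species pair. The toy rows, their checklist identifiers, and the helper name are hypothetical.

```python
from collections import Counter

N_CHECKLISTS = 62_910
N_SPECIES = 71
TOTAL_ROWS = N_CHECKLISTS * N_SPECIES  # 4,466,610, matching the stated row count


def is_complete_long_format(rows, n_species):
    """True if every checklist has exactly one row for each of n_species species."""
    pairs = Counter((r["checklist_id"], r["scientific_name"]) for r in rows)
    species_per_checklist = Counter(checklist_id for checklist_id, _ in pairs)
    no_duplicates = all(count == 1 for count in pairs.values())
    all_species_present = all(n == n_species for n in species_per_checklist.values())
    return no_duplicates and all_species_present


# Hypothetical toy table: 2 checklists x 2 species = 4 rows.
toy_rows = [
    {"checklist_id": "S0001", "scientific_name": "Turdus migratorius", "species_observed": True},
    {"checklist_id": "S0001", "scientific_name": "Corvus corax", "species_observed": False},
    {"checklist_id": "S0002", "scientific_name": "Turdus migratorius", "species_observed": False},
    {"checklist_id": "S0002", "scientific_name": "Corvus corax", "species_observed": False},
]

print(TOTAL_ROWS)
print(is_complete_long_format(toy_rows, n_species=2))
```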

Most columns refer to information provided by eBird when downloading the Basic Dataset, including "checklist_id", "last_edited_date", "country", "country_code", "state", "state_code", "county", "county_code", "iba_code", "bcr_code", "usfws_code", "atlas_block", "locality", "locality_id", "locality_type", "latitude", "longitude", "observation_date", "time_observations_started", "observer_id", "sampling_event_identifier", "protocol_type", "protocol_code", "project_code", "duration_minutes", "effort_distance_km", "effort_area_ha", "number_observers", "all_species_reported", "group_identifier", "trip_comments", "scientific_name", "observation_count", and "species_observed". Please see the metadata provided with eBird data downloads for details.

We added the following covariates (see above for details on each):

  •  "particulates": daily mean concentration of PM2.5 (μg/m3)
  •  "air": daily mean air temperature (K)
  •  "apcp": daily accumulated precipitation (mm)
  •  "land_cover": land cover class (categorical)
  •  "year": year of observation date (categorical)

Note that prior to our analysis, we converted the time observations started to the number of seconds past midnight, estimated the distance traveled for area checklists (see the methods section of Sanderfoot and Gardner 2021 for details), and scaled all numeric covariates. We also removed two checklists: the only eBird checklist in this dataset with a recorded concentration of PM2.5 above 300 μg/m3 and the only eBird checklist in this dataset for which no information on the duration of the survey was provided. Removing these two checklists reduces the 62,910 checklists in this dataset to the 62,908 used in the final analysis.
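
The preparation steps above can be sketched as follows. Several details are assumptions rather than facts from this README: times are taken to be "HH:MM:SS" strings, "scaled" is interpreted as centering on the mean and dividing by the sample standard deviation, and the helper names and example rows are hypothetical.

```python
import statistics


def seconds_past_midnight(hms: str) -> int:
    """Convert a time string, assumed to be 'HH:MM:SS', to seconds past midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def zscore(values):
    """Center on the mean and divide by the sample standard deviation.
    This is one common meaning of 'scaled'; the authors' exact method is not stated here."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def drop_excluded(rows, pm_limit=300.0):
    """Drop checklists with PM2.5 above pm_limit (μg/m3) or with no survey duration."""
    return [
        r for r in rows
        if r["duration_minutes"] is not None and r["particulates"] <= pm_limit
    ]


# Hypothetical checklists, for illustration only.
rows = [
    {"checklist_id": "A", "time_observations_started": "06:30:00",
     "duration_minutes": 45, "particulates": 12.4},
    {"checklist_id": "B", "time_observations_started": "18:05:10",
     "duration_minutes": None, "particulates": 8.1},
    {"checklist_id": "C", "time_observations_started": "09:00:00",
     "duration_minutes": 30, "particulates": 310.0},
    {"checklist_id": "D", "time_observations_started": "07:15:00",
     "duration_minutes": 60, "particulates": 55.0},
]

kept_rows = drop_excluded(rows)
start_seconds = [seconds_past_midnight(r["time_observations_started"]) for r in kept_rows]
scaled_pm = zscore([r["particulates"] for r in kept_rows])
print([r["checklist_id"] for r in kept_rows], start_seconds, scaled_pm)
```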


National Science Foundation, Award: DGE-1762114

McIntire-Stennis Cooperative Forestry Research Program, Award: Grant No. NI19MSCFRXXXG035/Project Accession No. 1020586
