
Data from: Performance of unmarked abundance models with data from machine-learning classification of passive acoustic recordings

Cite this dataset

Fiss, Cameron et al. (2024). Data from: Performance of unmarked abundance models with data from machine-learning classification of passive acoustic recordings [Dataset]. Dryad.


The ability to conduct cost-effective wildlife monitoring at scale is rapidly increasing due to the availability of inexpensive autonomous recording units (ARUs) and automated species recognition, presenting a variety of advantages over human-based surveys. However, estimating abundance with such data collection techniques remains challenging because most abundance models require data that are difficult for low-cost monaural ARUs to gather (e.g., counts of individuals, distance to individuals), especially when using the output of automated species recognition. Statistical models that do not require counting individuals or measuring distances to them, used in combination with low-cost ARUs, provide a promising way of obtaining abundance estimates for large-scale wildlife monitoring projects but remain untested. We present a case study using avian field data collected in forests of Pennsylvania during the spring of 2020 and 2021 using both traditional point counts and passive acoustic monitoring at the same locations. We tested the ability of the Royle-Nichols and time-to-detection models to estimate abundance of two species from detection histories generated by applying a machine-learning classifier to ARU-gathered data. We compared abundance estimates from these models to estimates from the same models fit using point-count data and to two additional models appropriate for point counts, the N-mixture model and distance models. We found that the Royle-Nichols and time-to-detection models can be used with ARU data to produce abundance estimates similar to those generated by a point-count-based study but with greater precision. ARU-based models produced confidence or credible intervals that were on average 31.9% (± 11.9 SE) smaller than their point-count counterparts. Our findings were consistent across two species with differing relative abundance and habitat use patterns.
The higher precision of models fit using ARU data is likely due to higher cumulative detection probability, which itself may be the result of greater survey effort: ARUs and machine-learning classifiers sample significantly more time for focal species at any given point. Our results provide preliminary support for the use of ARUs in abundance-based study applications, and thus may afford researchers a better understanding of habitat quality and population trends, while allowing them to take more informed conservation actions and make more informed recommendations.

README: Data and code for ARU and point-count abundance estimates

 Description of the data and file structure

The point_count_data.csv file contains Wood Thrush (WOTH) and Cerulean Warbler (CERW) point-count data. Column names follow the pattern "species code, underscore, data type, visit number". For example, WOTH_count1 contains the number of Wood Thrush counted during visit 1. Each row is a site.
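The naming pattern can be unpacked programmatically. The following is a minimal Python sketch, assuming column names look like WOTH_count1 (a four-letter species code, an underscore, a data-type label, then a visit number). The `parse_column` helper is hypothetical, and the README confirms only the `count` data type, so any other label is an assumption.

```python
import re

# Assumed pattern: <species code>_<data type><visit number>, e.g. "WOTH_count1".
COLUMN_PATTERN = re.compile(
    r"^(?P<species>[A-Z]{4})_(?P<dtype>[A-Za-z_]+?)(?P<visit>\d+)$"
)

def parse_column(name):
    """Split a column name into (species, data type, visit); None if it does not match."""
    match = COLUMN_PATTERN.match(name)
    if match is None:
        return None
    return match["species"], match["dtype"], int(match["visit"])

print(parse_column("WOTH_count1"))
print(parse_column("CERW_count3"))
print(parse_column("site"))  # a non-matching name yields None
```

Columns that do not fit the pattern (for example, a site identifier) return `None`, so they can be filtered out before reshaping the counts by species and visit.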

CERW_det_hist_2020-2021.csv and WOTH_det_hist_2020-2021.csv contain ARU-derived detection histories for Cerulean Warbler (CERW) and Wood Thrush (WOTH). The "det1", "det2", "det3", etc. columns indicate whether the species was detected on day 1, 2, 3, etc. of the survey window. The "ttd" column indicates the day on which the species was first detected within the 10-day window.
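The relationship between the daily detection columns and "ttd" can be sketched in Python. This assumes detections are coded 1 and non-detections 0, and that days are numbered from 1. The README does not state how sites with no detection are coded in "ttd", so the sketch returns `None` for them. The function name and the example history are hypothetical.

```python
def first_detection_day(det_history):
    """Return the 1-based day of first detection, or None if never detected.

    det_history: daily values (det1, det2, ...), assumed 1 = detected, 0 = not.
    """
    for day, detected in enumerate(det_history, start=1):
        if detected == 1:
            return day
    return None

# Hypothetical 10-day history in which the species is first detected on day 3.
example = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]
print(first_detection_day(example))   # 3
print(first_detection_day([0] * 10))  # None
```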


The .R file contains code for four different models (time-to-detection, N-mixture, distance, Royle-Nichols), each of which occurs in four chunks of code (each chunk pertains to a different species and data type; 1: Wood Thrush point counts, 2: Cerulean Warbler point counts, 3: Wood Thrush ARUs, 4: Cerulean Warbler ARUs). The time-to-detection categorical abundance model, written in the JAGS language (JAGS software must be installed), is defined at the top of the code and called within each chunk. Within each chunk, abundance estimates are derived from the "best" (lowest-AIC) model in a comparison of models with different detection covariates. Abundance estimates are obtained by back-transforming beta coefficients. C-hat (a lack-of-fit ratio) was estimated in each chunk and used to inflate confidence/credible intervals.
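How a back-transformed estimate and a c-hat-inflated interval fit together can be sketched as follows. This is an illustration in Python, not the authors' R code, and it rests on assumptions the README does not confirm: a log link for abundance, a Wald interval built on the link scale, and the common quasi-likelihood adjustment of multiplying the standard error by the square root of c-hat when c-hat exceeds 1. The function name and all numbers are made up.

```python
import math

def abundance_interval(beta, se, c_hat, z=1.96):
    """Back-transform a log-scale coefficient to abundance with a c-hat-inflated interval.

    Assumes a log link, so abundance = exp(beta), and a Wald interval on the link scale.
    """
    # Inflate the standard error only when c-hat indicates overdispersion (c-hat > 1).
    inflated_se = se * math.sqrt(max(c_hat, 1.0))
    lower = math.exp(beta - z * inflated_se)
    upper = math.exp(beta + z * inflated_se)
    return math.exp(beta), lower, upper

# Made-up coefficient, standard error, and c-hat for illustration only.
estimate, lower, upper = abundance_interval(beta=0.8, se=0.15, c_hat=1.4)
print(f"abundance = {estimate:.2f} (interval {lower:.2f} to {upper:.2f})")
```

Building the interval on the link scale and exponentiating the endpoints keeps the lower bound positive, which a symmetric interval on the abundance scale would not guarantee.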


National Fish and Wildlife Foundation

Gordon and Betty Moore Foundation

National Science Foundation