
Data from: N-mixture models estimate abundance reliably: a field test on Marsh Tit using time-for-space substitution

Cite this dataset

Neubauer, Grzegorz; Wolska, Alicja; Rowiński, Patryk; Wesołowski, Tomasz (2021). Data from: N-mixture models estimate abundance reliably: a field test on Marsh Tit using time-for-space substitution [Dataset]. Dryad.


Imperfect detection is common in field studies of animal abundance, including those on birds, and can be corrected for in various ways. The binomial N-mixture (hereafter binmix) model developed for this task is widely used in ecological studies owing to its simplicity: it requires only replicated counts as input. However, it may overestimate abundance and be sensitive to even small violations of its assumptions. We used a 33-year dataset on the Marsh Tit, Poecile palustris, a sedentary forest passerine, from Białowieża Forest, Poland, to validate inference from binmix models by comparing model-estimated abundances with the true number of breeding pairs within the plots, determined by an exhaustive population study. The abundance estimates, derived from six springtime (April-May) counts of males on each plot in each year, were highly reliable: for 116 out of 132 year-plot estimates (88%), the 95% confidence interval included the true number of pairs. Over- and underestimations were thus rare and similarly frequent (9 and 7 cases, respectively), with a tendency to overestimate at low densities and underestimate at high densities. Marsh Tits sing rarely, but the frequency of countersinging increases with abundance, leading to non-independence in detections. When accounted for in a submodel for detection, the per-survey number of countersinging events positively affected detection probability but only weakly affected abundance estimates. Simulations further demonstrate that this pattern, overestimation at low densities and underestimation at high densities, may be a systematic bias of the binmix model even when density-dependent detection is absent. While the behaviour of binmix models in specific situations requires more study, we conclude that these models are a valid tool for estimating abundance reliably when intensive population monitoring is not feasible.
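For readers unfamiliar with the model, the following is a minimal sketch of the binomial N-mixture likelihood, not the authors' R code. It assumes a Poisson distribution for abundance with mean lam, a constant per-individual detection probability p, and an upper summation bound K; all function names and the grid-search fit are hypothetical choices made for illustration only.

```python
import math
import random


def binom_pmf(y, n, p):
    """Probability of counting y individuals when n are present."""
    if y > n:
        return 0.0
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)


def poisson_pmf(n, lam):
    """Probability that true abundance equals n under a Poisson(lam) model."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def site_likelihood(counts, lam, p, K=100):
    """Marginal likelihood of the repeated counts from one site.

    The unknown abundance N is summed out from max(counts) to K;
    K is a truncation bound assumed for this sketch.
    """
    total = 0.0
    for n in range(max(counts), K + 1):
        detection = 1.0
        for y in counts:
            detection *= binom_pmf(y, n, p)
        total += poisson_pmf(n, lam) * detection
    return total


def neg_log_likelihood(data, lam, p, K=100):
    """Negative log-likelihood summed over sites (one list of counts per site)."""
    return -sum(math.log(site_likelihood(c, lam, p, K)) for c in data)


def fit_grid(data, lam_grid, p_grid, K=100):
    """Crude grid search for the (lam, p) pair with the lowest NLL."""
    best = None
    for lam in lam_grid:
        for p in p_grid:
            nll = neg_log_likelihood(data, lam, p, K)
            if best is None or nll < best[0]:
                best = (nll, lam, p)
    return best[1], best[2]


def poisson_draw(lam, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate(n_sites, n_visits, lam, p, rng):
    """Simulate repeated counts: N ~ Poisson(lam), each count ~ Binomial(N, p)."""
    data = []
    for _ in range(n_sites):
        n_true = poisson_draw(lam, rng)
        data.append([sum(rng.random() < p for _ in range(n_true))
                     for _ in range(n_visits)])
    return data


if __name__ == "__main__":
    rng = random.Random(1)
    # Simulated counts only; these are not the Marsh Tit data.
    data = simulate(n_sites=30, n_visits=6, lam=5.0, p=0.4, rng=rng)
    lam_grid = [0.5 * i for i in range(2, 21)]   # 1.0 to 10.0
    p_grid = [0.1 * i for i in range(1, 10)]     # 0.1 to 0.9
    lam_hat, p_hat = fit_grid(data, lam_grid, p_grid, K=40)
    print("estimated lam:", lam_hat, "estimated p:", round(p_hat, 1))
```

The grid search stands in for a proper numerical optimiser and is kept only to make the sketch dependency-free. In practice the likelihood would be maximised with dedicated software, and alternative abundance distributions and detection covariates such as the countersinging counts could be added.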


The readme file contains an explanation of each of the variables in the datasets. Detailed information on field methods used to collect these data can be found in the associated manuscript referenced above.

Usage notes

The dataset contains three data files, which are read when the R code is executed, and three R code files.

The data files include:

1. results of counts of singing Marsh Tit Poecile palustris males in Białowieża National Park, Poland, 1987-2019 (file 'pl_nc_v5_1987-2019.txt');

2. numbers of countersinging males and countersinging cases recorded during surveys (file 'countersinging.txt');

3. true numbers of breeding pairs on the plots where the counts were performed (file 'True_state.txt').

The R code files needed to repeat the analyses presented in the paper include:

1. 'Rcode_model_selection_P_NB_ZIP.R', which performs model selection (kept separate from the main code in item 2 because of its long computing time);

2. 'Rcode_binmix.R', which reads the relevant data, fits the N-mixture models and generalized linear mixed models, and draws the main figures in the paper;

3. 'Rcode_Simulations.r', which performs the simulations reported in the paper and draws the corresponding figures.

All code files are extensively commented and simply structured to ease understanding. Note that running the whole of code file 2 (without the bootstrapped goodness-of-fit test) takes c. 20 minutes on a 2.6 GHz i5 processor, and running the model selection code may take even longer.