
Multi-species occupancy models as robust estimators of community richness

Cite this dataset

Tingley, Morgan W.; Nadeau, Christopher; Sandor, Manette (2020). Multi-species occupancy models as robust estimators of community richness [Dataset]. Dryad.


1. Understanding patterns of diversity is central to ecology and conservation, yet estimates of diversity are often biased by imperfect detection. In recent years, multi-species occupancy models (MSOM) have been developed as a statistical tool to account for species-specific heterogeneity in detection while estimating true measures of diversity. Although the power of these models has been tested in various ways, their ability to estimate gamma diversity – or true community size, N – is a largely unrecognized feature that needs rigorous evaluation.

2. We use both simulations and an empirical dataset to evaluate the bias, precision, accuracy, and coverage of estimates of N from MSOM compared to the widely applied iChao2 non-parametric estimator. We simulated 5,600 datasets across 7 scenarios of varying average occupancy and detectability covariates, as well as varying numbers of sites, replicates, and true community size. Additionally, we use a real dataset of surveys over 9 years (where species accumulation has asymptoted, indicating true N), to estimate N from each annual survey.

3. Simulations showed that both MSOM and iChao2 estimators are generally accurate (i.e., unbiased and precise) except under non-ideal scenarios where mean species occupancy is low. In such scenarios, MSOM frequently overestimated N. Across all scenarios, MSOM estimates were less certain than iChao2 estimates, whereas iChao2 estimates were over-confident and showed poor coverage. Results from the real dataset largely confirmed the simulation findings, with MSOM estimates showing greater accuracy and coverage than iChao2.

4. Community ecologists have a wide choice of analytical methods, and both iChao2 and MSOM estimates of N are substantially preferable to raw species counts. The simplicity of non-parametric estimators has obvious advantages, but our results show that in many cases, MSOM may provide superior estimates that also account more accurately for uncertainty. Both methods can show strong bias when average occupancy is very low, and practitioners should exercise caution when using estimates derived from either method under such conditions.
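
To illustrate why non-parametric estimators are considered simple, here is a minimal Python sketch of the classic bias-corrected Chao2 estimator. It is not the iChao2 estimator evaluated in the study (iChao2 adds further correction terms), and it is not taken from the archived R code. The function name and the example counts are hypothetical.

```python
def chao2(incidence_counts, n_sites):
    """Classic bias-corrected Chao2 estimate of species richness.

    incidence_counts: for each species, the number of sites at which it
                      was detected at least once (hypothetical input format).
    n_sites:          total number of sampled sites (m).
    """
    detected = [c for c in incidence_counts if c > 0]
    s_obs = len(detected)                       # observed richness
    q1 = sum(1 for c in detected if c == 1)     # "uniques": seen at exactly one site
    q2 = sum(1 for c in detected if c == 2)     # "duplicates": seen at exactly two sites
    correction = (n_sites - 1) / n_sites
    # The bias-corrected form stays defined even when q2 == 0.
    return s_obs + correction * q1 * (q1 - 1) / (2 * (q2 + 1))


# Illustrative counts only: 8 observed species across 10 sites.
example_counts = [1, 1, 1, 2, 2, 5, 8, 10]
estimate = chao2(example_counts, n_sites=10)
```

The estimate uses only the number of species seen at exactly one or two sites, so it needs no model fitting. The trade-off is that it does not model species-specific detection probabilities as an MSOM does.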


The dataset is a mix of empirical data on bird communities in burned forest in California and simulated data used to estimate community size.

Usage notes

Simulations include one R code file ("Code_Simulation.R") and one JAGS model code file ("Simulation_JAGS_model.txt"). The R code reproduces all results and figures as presented in the manuscript.
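
For readers unfamiliar with this kind of simulation, the following Python sketch shows one way to generate a single detection dataset under imperfect detection. It is an independent illustration, not a translation of "Code_Simulation.R". The function names, the per-species probability lists, and the seed are assumptions, and the archived code may parameterize occupancy and detection differently (for example, with covariates).

```python
import random


def simulate_detections(psi, p, n_sites, n_reps, seed=0):
    """Simulate detection counts for a community under imperfect detection.

    psi: list of per-species occupancy probabilities (hypothetical values).
    p:   list of per-species detection probabilities, same length as psi.
    Returns y, where y[i][j] is the number of the n_reps visits to site j
    on which species i was detected.
    """
    rng = random.Random(seed)
    y = []
    for psi_i, p_i in zip(psi, p):
        row = []
        for _ in range(n_sites):
            occupied = rng.random() < psi_i          # true occupancy state
            if occupied:
                # Each replicate visit detects the species with probability p_i.
                detections = sum(rng.random() < p_i for _ in range(n_reps))
            else:
                detections = 0                       # no false positives
            row.append(detections)
        y.append(row)
    return y


def observed_richness(y):
    """Count species detected at least once anywhere (the raw species count)."""
    return sum(1 for row in y if any(row))


# Illustrative community: 20 species, low occupancy and detection.
y = simulate_detections(psi=[0.2] * 20, p=[0.3] * 20, n_sites=30, n_reps=3)
raw_count = observed_richness(y)
```

Because some species are never detected, the raw count can fall below the true community size, which is the gap that both estimators try to close.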

Empirical analysis includes two R code files, one raw data file ("Data_EmpiricalRaw.Rdata"), and nine model-result files (e.g., "CommunityResults_2010.Rdata"), one for each year of data 2010–2018. The first R code file ("Code_EmpiricalFit.R") takes one year of data and fits a multi-species occupancy model with data augmentation, extracts information on the true community size, and creates a model-result file for that year (as provided in this archive). The second R code file ("Code_EmpiricalPlot.R") takes all nine years of model results and recreates the figures and tables as presented in the manuscript.
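
The data augmentation step mentioned above can be sketched as follows. In the usual formulation, the observed detection matrix is padded with all-zero rows for hypothetical undetected species, the model estimates a membership indicator for every row, and community size N is the sum of those indicators. This Python sketch is an assumption-laden illustration of that idea only. It does not fit a model, and the function names and the number of augmented rows are hypothetical rather than taken from "Code_EmpiricalFit.R".

```python
def augment(y_obs, n_aug):
    """Pad a species-by-site detection matrix with all-zero rows.

    y_obs: list of rows, one per observed species, each holding
           per-site detection counts.
    n_aug: number of hypothetical undetected species to add.
    """
    n_sites = len(y_obs[0])
    zero_rows = [[0] * n_sites for _ in range(n_aug)]
    return [list(row) for row in y_obs] + zero_rows


def community_size(w):
    """Community size N for one posterior draw of membership indicators w.

    w[i] is 1 if row i of the augmented matrix belongs to the real
    community and 0 otherwise. Observed species always have w[i] == 1.
    """
    return sum(w)


# Two observed species at three sites, padded with four all-zero rows.
y_aug = augment([[1, 0, 2], [0, 3, 0]], n_aug=4)

# A made-up indicator draw: both observed species plus one undetected species.
n_draw = community_size([1, 1, 0, 1, 0, 0])
```

Repeating the sum over many posterior draws of the indicators would give a posterior distribution for N, from which a point estimate and credible interval can be read.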