Estimating the number of species in a community is important for assessments of biodiversity. Previous species richness estimators are mainly based on non-parametric approaches. Although parametric asymptotic models have been applied, they have received limited attention due to specific limitations. Here, we introduce parametric models that fit the probability-based rarefied species richness curve, allowing us to estimate the ‘Total Expected Species’ (TES) in a community based on species’ abundance data. We develop two approaches to calculate TES (termed ‘TESa’ and ‘TESb’), based on two slightly different mathematical assumptions regarding Expected Species (ES) models. We provide R functions to calculate both estimates and their standard deviations, and to visualize the estimation. We test the performance of TESa, TESb and their average (TESab) on simulated and empirical data, and compare their bias, precision and accuracy with those of two commonly used non-parametric species richness estimators: the bias-corrected (bc-)Chao1 and the Abundance-based Coverage Estimator (ACE). Simulations reveal that in small samples, TESa tends to over-estimate and TESb to under-estimate overall species richness. TESab performs well in bias, precision and accuracy when compared to the (bc-)Chao1 and ACE estimators. Results from empirical data show that the variance of the TES estimates is comparable to that of (bc-)Chao1 and ACE. Our study demonstrates that rarefaction theory in combination with parametric approximation models provides a valuable new approach to estimating the species richness of incompletely sampled communities. Robust estimates are likely to be obtained where the observed number of species is greater than half of the TES estimate. When the ratio of TESa to the observed richness is >> 2, we suggest the use of TESb or TESab. Although more comprehensive comparisons with other estimators are suggested, we encourage researchers to consider the TES approach in their biodiversity studies as a complement to existing estimators.

The submission contains no empirical data. The file contains only R scripts that define the functions ES(), TES() and plot.TES().

# ES() calculates the Expected Species (ES)

# TES() calculates the Total Expected Species based on ESa, ESb and their average value

# plot.TES() provides the fitted curve for TES

# The argument x for ES() and TES() is a data vector giving the number of individuals of each species

# The argument TES_output for plot.TES() is the output from TES()

# The argument m for ES() is the sample size parameter, i.e. the number of individuals randomly drawn from the sample. The default is m=1, but it can be changed according to the user's requirements. For ESa, m cannot be larger than the sample size

# The argument method selects the calculation approach for Expected Species. Two options are available, "a" and "b", which calculate ESa and ESb respectively; the default is "a"

# The argument knots specifies the number of separate sample sizes of increasing value used for the calculation of ES between 1 and the endpoint, which by default is set to knots=40

# ES() returns a value of Expected Species
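
The role of m and method can be illustrated with a minimal sketch. It is not the ES() function shipped with the submission. It assumes that ESa corresponds to rarefaction without replacement (the classical hypergeometric form) and ESb to rarefaction with replacement; both formulas and the name es_sketch are assumptions made for illustration only.

```r
# Hypothetical helper for illustration; NOT the ES() function of the submission.
# x: number of individuals of each species; m: number of individuals drawn.
es_sketch <- function(x, m = 1, method = c("a", "b")) {
  method <- match.arg(method)
  x <- x[x > 0]            # ignore species with zero individuals
  n <- sum(x)              # total number of individuals in the sample
  if (method == "a") {
    # Assumed form for ESa: sampling without replacement,
    # so m cannot exceed the sample size n.
    if (m > n) stop("for method 'a', m cannot be larger than the sample size")
    # Probability that a species is absent from the m drawn individuals,
    # computed on the log scale to avoid overflow of the binomial coefficients.
    p_absent <- exp(lchoose(n - x, m) - lchoose(n, m))
  } else {
    # Assumed form for ESb: sampling with replacement, so m may exceed n.
    p_absent <- (1 - x / n)^m
  }
  # Expected number of species = sum of per-species presence probabilities
  sum(1 - p_absent)
}

x <- c(10, 5, 3, 1, 1)             # made-up abundance vector, 20 individuals
es_sketch(x, m = 10, method = "a")
es_sketch(x, m = 10, method = "b")
es_sketch(x, m = 40, method = "b") # allowed only under the "b" assumption
```

Under these assumptions both variants return 1 at the default m=1, and the "a" variant returns the observed number of species when m equals the sample size.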

# TES() returns a list containing a summary table of the estimated values and their standard deviations for TESa, TESb and TESab, together with the model used in the estimation of TES, either 'logistic' or 'Weibull'
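
The general idea of evaluating ES at a set of knots and reading total richness off a fitted asymptote can be sketched as follows. This is not the TES() function of the submission: the abundance vector is made up, the rarefaction formula is the assumed without-replacement form, and the Weibull-type curve, its parameterization and the least-squares fit via optim() are all assumptions chosen for illustration. The actual scripts may fit a different response, use a different optimizer, and also compute standard deviations, which this sketch does not.

```r
# Hypothetical sketch; NOT the TES() function of the submission.
# Expected species in m individuals drawn without replacement (assumed form).
es_a <- function(x, m) {
  n <- sum(x)
  sum(1 - exp(lchoose(n - x, m) - lchoose(n, m)))
}

x <- c(50, 30, 20, 10, 5, 5, 2, 2, 1, 1, 1, 1)  # made-up abundance vector
n <- sum(x)                                     # sample size (128 individuals)
knots <- 40

# Increasing sample sizes between 1 and the endpoint (here the full sample)
m <- unique(round(seq(1, n, length.out = knots)))
es <- vapply(m, function(k) es_a(x, k), numeric(1))

# Weibull-type saturating curve (assumed parameterization).
# Parameters are optimized on the log scale so that they stay positive.
weibull_curve <- function(p, m) {
  exp(p[1]) * (1 - exp(-(m / exp(p[2]))^exp(p[3])))
}

# Residual sum of squares between the rarefaction curve and the model
rss <- function(p) sum((es - weibull_curve(p, m))^2)

start <- c(log(1.5 * length(x)), log(n), 0)     # rough starting values
fit <- optim(start, rss)                        # Nelder-Mead least squares

asymptote <- exp(fit$par[1])  # richness the fitted curve approaches as m grows
asymptote
```

The asymptote plays the role of a total richness estimate under this particular model. Fitting on the log scale is a design choice of the sketch that avoids negative or zero parameter values during optimization.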

Estimating total species richness: fitting rarefaction by asymptotic approximation
