Dryad

Data from: Extending null scenarios with Faddy distributions in a probabilistic randomization protocol for presence-absence data

Data files

May 13, 2021 version files 324 B

Abstract

1. The analysis of species occurrences at discrete locations relies on statistical methods that test whether a random process can explain an observed pattern of presences and absences (1-0). Various statistical methods have contributed to the development of null model analysis of (1-0) data in community ecology using randomization tests. Frequentist techniques that assume probability distributions under the null scenarios have also been proposed, as in the work by Navarro and Manly (2009) (NM), a protocol that has been applied to the analysis of plant and microbial communities and of chemical hazards.

2. The NM method assumes that presences-absences are governed by independent Bernoulli random variables, and that a non-observable, non-negative random variable ("quasi-abundance") is associated with each species at each location. The quasi-abundance is presumed to follow one of three possible distributions (Poisson, Binomial or Negative Binomial) and to be log-linearly related to the qualitative effects of species and location. By connecting the probability of occurrence of each species at each location with the "best" quasi-abundance distribution (chosen by profile deviance), it is possible to estimate that probability by generalized linear modelling; the estimated probabilities are used, in turn, to generate random matrices via parametric bootstrap. The question now is whether just three distributions are enough to support an "optimal" null model.
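
The chain described above, from quasi-abundance to occurrence probability to random matrices, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the Poisson case only, assumes that a presence corresponds to a quasi-abundance greater than zero (so that p = 1 - exp(-mu) with log(mu) = species effect + location effect), and uses hypothetical effect values in place of estimates from a fitted generalized linear model. All function names are illustrative.

```python
import math
import random


def occurrence_probability(species_effect, location_effect):
    """Occurrence probability under an assumed Poisson quasi-abundance.

    Assumption: log(mu) = species_effect + location_effect, and a presence
    means quasi-abundance > 0, so p = 1 - P(quasi-abundance = 0) = 1 - exp(-mu).
    """
    mu = math.exp(species_effect + location_effect)
    return 1.0 - math.exp(-mu)


def probability_matrix(species_effects, location_effects):
    """Species-by-location matrix of occurrence probabilities."""
    return [
        [occurrence_probability(s, l) for l in location_effects]
        for s in species_effects
    ]


def simulate_null_matrix(probs, rng):
    """One parametric-bootstrap draw: independent Bernoulli(p_ij) per cell."""
    return [[1 if rng.random() < p else 0 for p in row] for row in probs]


# Hypothetical effects (in practice these would be GLM estimates).
species_effects = [-0.5, 0.0, 0.8]
location_effects = [-1.0, 0.2, 0.5, 1.0]

probs = probability_matrix(species_effects, location_effects)
rng = random.Random(42)  # fixed seed so the sketch is reproducible
null_matrices = [simulate_null_matrix(probs, rng) for _ in range(1000)]
```

Under these assumptions the relation between p and the linear predictor is the complementary log-log link rather than the logistic one; a test statistic computed on each simulated matrix would then be compared with its value on the observed matrix.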

3. We provide the theoretical formulation of the original NM protocol for null model analysis, and then expand the set of quasi-abundance distributions, using extended Poisson processes (Faddy 1997), to allow general distributions of over-dispersed and under-dispersed discrete random variables. The method is illustrated using presence-absence data of island lizard communities.
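
To make the idea of an extended Poisson process concrete, the sketch below computes the count distribution of a pure birth process observed over unit time. It assumes birth rates of the form lambda_n = a(b + n)^c, the parameterization usually attributed to Faddy (1997) but not stated in this abstract, so it should be read as an assumption. The values of a, b and c are hypothetical, and the numerical scheme (uniformization on a truncated state space) is a convenience chosen here, not necessarily what the authors used.

```python
import math


def faddy_rates(a, b, c, n_max):
    """Assumed birth rates lambda_n = a * (b + n)**c for n = 0..n_max-1.

    The final state gets rate 0 (absorbing) so that probability mass is
    conserved on the truncated state space {0, ..., n_max}.
    """
    return [a * (b + n) ** c for n in range(n_max)] + [0.0]


def birth_process_pmf(rates, t=1.0):
    """P(N(t) = n) for a pure birth process, computed by uniformization."""
    big = max(rates)  # uniformization rate; requires at least one positive rate
    n_states = len(rates)
    n_terms = int(big * t + 10.0 * math.sqrt(big * t) + 50)
    state = [1.0] + [0.0] * (n_states - 1)  # process starts at N(0) = 0
    weight = math.exp(-big * t)             # Poisson weight for k = 0 jumps
    pmf = [weight * s for s in state]
    for k in range(1, n_terms + 1):
        nxt = [0.0] * n_states
        for n, mass in enumerate(state):
            if mass == 0.0:
                continue
            move = rates[n] / big           # probability of a real birth
            nxt[n] += mass * (1.0 - move)
            if n + 1 < n_states:
                nxt[n + 1] += mass * move
        state = nxt
        weight *= big * t / k               # Poisson weight for k jumps
        pmf = [p + weight * s for p, s in zip(pmf, state)]
    return pmf


def mean_and_variance(pmf):
    mean = sum(n * p for n, p in enumerate(pmf))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(pmf))
    return mean, var


# Hypothetical parameters: c = 0 is the Poisson case, c > 0 gives increasing
# rates (over-dispersion), c < 0 gives decreasing rates (under-dispersion).
a, b, n_max = 2.0, 1.0, 100
pmf_poisson = birth_process_pmf(faddy_rates(a, b, 0.0, n_max))
pmf_over = birth_process_pmf(faddy_rates(a, b, 0.5, n_max))
pmf_under = birth_process_pmf(faddy_rates(a, b, -1.0, n_max))

# Occurrence probability if presence means quasi-abundance > 0 (assumption).
p_occurrence_over = 1.0 - pmf_over[0]
```

Under this parameterization the probability of a zero count is exp(-a b^c), so an occurrence probability of the form 1 - exp(-a b^c) follows directly once presence is equated with a positive quasi-abundance.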

4. For the binomial case and Faddy distributions, nonlinear constrained optimization algorithms are needed to obtain maximum likelihood estimates; thus, the null-model selection process faces challenging numerical problems (non-convergence to the global optimum). In addition, the process may end up suggesting that the best fitted probabilities for the generation of null matrices are those obtained from links different from the canonical logistic link. This property of the NM protocol should not be ignored, as an improper choice of the null matrix universe may impact the outcome of randomization tests.