Seeing shapes in clouds: the fallacy of deriving ecological hypotheses from statistical distributions
Cite this dataset
Warren, Robert; Bradford, Mark; Costa, James (2022). Seeing shapes in clouds: the fallacy of deriving ecological hypotheses from statistical distributions [Dataset]. Dryad. https://doi.org/10.5061/dryad.ngf1vhhx7
The explanations behind observations of global patterning in species diversity pre-date the field of ecology itself. The generation of new species-area theories, in particular, far outpaces their falsification, resulting in a centuries-old accumulation of species diversity theories. We use historical assessment and new data analysis to argue that one of the earliest recognized and most consistent patterns in species diversity is not strictly an ecological phenomenon and that, when ecological mechanism is invoked, the potential mechanisms are too numerous for tractable hypothesis falsification. We provide a historical parallel: the normal distribution was once treated as a pattern that presupposed a biological mechanism rather than as a statistical distribution that can be generated by biological and non-biological forces. Similarly, power law distributions are ubiquitous in aggregated data, such as the species-area relationship. That nearly identical broad-scale aggregation patterns are observed for both ecological and non-ecological data as a function of area suggests that these broad-scale patterns reflect a statistical distribution that, in itself, cannot be used to discern between or among ecological and non-ecological mechanisms. We argue that by seeking processes in such a ubiquitous pattern, ecologists may read ecological mechanism into statistical patterns, and we suggest that falsifying broad-scale diversity distribution hypotheses should be a greater priority than generating or parameterizing new ones.
We used worldwide national boundaries to aggregate data because they are standard boundaries that are comparable across disparate data sets and because publicly available species and non-species data are commonly aggregated by country. We chose species data sets where abundances were available by country (amphibians, ants, birds, freshwater fish, human diseases, Lepidopteran species, mammals, mosquitoes, non-natives, rare, reptiles, trees, crops). We chose non-species data sets that parallel species, that is, where individuals are aggregated by their group membership (airlines, cheeses, cities, colleges, hospitals, imports, industries, journals, livestock, newspaper companies, political parties). For example, the number of newspapers printed in a country would not be parallel, but the number of newspaper companies printing newspapers is parallel to the number of species containing individual organisms. We used several data sources where ecological and non-ecological data were aggregated by the same country boundaries:
National Science Foundation
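The comparison described above, counts of groups per country as a function of country area, can be illustrated with a log-log power law fit of the form N = c * A^z. The following is a minimal sketch only, not the authors' analysis code: the function name fit_power_law and the example numbers are hypothetical, and the numbers are synthetic values rather than values from this dataset.

```python
import math


def fit_power_law(areas, counts):
    """Fit counts = c * area**z by least squares on log10-transformed values.

    Returns (c, z). Pairs with a non-positive area or count are skipped
    because their logarithm is undefined.
    """
    pairs = [
        (math.log10(a), math.log10(n))
        for a, n in zip(areas, counts)
        if a > 0 and n > 0
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two positive (area, count) pairs")

    xs, ys = zip(*pairs)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("areas must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    z = sxy / sxx                 # slope on the log-log scale (exponent)
    log_c = mean_y - z * mean_x   # intercept on the log-log scale
    return 10 ** log_c, z


# Synthetic illustration (NOT data from this dataset): four hypothetical
# countries whose group counts follow N = 2 * A**0.25 exactly.
example_areas = [1e3, 1e4, 1e5, 1e6]          # country area, arbitrary units
example_counts = [2 * a ** 0.25 for a in example_areas]

c, z = fit_power_law(example_areas, example_counts)
print(f"c = {c:.3f}, z = {z:.3f}")
```

Applying the same fit to a species group (for example, birds per country) and to a non-species group (for example, newspaper companies per country) gives exponents that can be compared directly. A similar exponent in both cases would be consistent with the argument that the pattern alone does not identify an ecological mechanism.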