Dryad

Data from: Across space and time: a review of sampling and analytical biases in fossil data across macroecological scales

Abstract

Quantitative studies of fossil data have proven critical to a number of major macroevolutionary and macroecological discoveries, such as the ‘Big 5’ mass extinctions of the Phanerozoic. The development and easy accessibility of major meta-data sources such as the Paleobiology Database and Geobiodiversity Database have also spurred the widespread application of these data to testing ecological hypotheses at finer spatiotemporal and phylogenetic scales. However, preservational/taphonomic biases, sampling/collecting biases, taxonomic issues, and analytical choices can limit the degree of interpretative resolution possible, and can even mask biological ‘signal’ with error- or bias-introduced ‘noise’. The degree to which these factors can impact analytical interpretations is not well documented in comparison to the scale of use of these data sources. Here, we review the many forms of systematic error that can creep into a paleoecological study, from the stage of data collection to the interpretation of analytical results, and provide two case studies based upon re-analysis of previously published datasets to illustrate the varying impacts of such biases. The first case study focuses on the Cambrian Burgess Shale, and the second on the Belly River Group, with both representing highly sampled, taphonomically characterized, and spatiotemporally constrained datasets developed through multiple years of sustained field collecting. In the former, we illustrate the impacts of collecting bias through quantitative comparisons of collected vs. discarded specimens over multiple field seasons, showing how this data loss affects ecological reconstructions and analysis.
In the latter case study, we review the impact of preservational biases, the approaches to their quantification and mitigation, where these approaches have led to misinterpretations in the past, and the differences in ecological resolution that result from occurrence vs. abundance approaches in macroecological analysis. Lastly, we synthesize these case studies with our review of past approaches to propose a series of recommendations for future paleoecological and macroecological studies, emphasizing the continued importance of high-quality primary data and the ongoing need for a first-principles approach to address existing issues of missing data.