Data from: The increasing disconnection of primary biodiversity data from specimens: how does it happen and how to handle it?
Troudet, Julien, Sorbonne University
Vignes-Lebbe, Régine, Sorbonne University
Grandcolas, Philippe, Sorbonne University
Legendre, Frédéric, Sorbonne University
Published Jun 05, 2018 on Dryad.
Cite this dataset
Troudet, Julien; Vignes-Lebbe, Régine; Grandcolas, Philippe; Legendre, Frédéric (2018). Data from: The increasing disconnection of primary biodiversity data from specimens: how does it happen and how to handle it? [Dataset]. Dryad. https://doi.org/10.5061/dryad.gr883k8
Abstract: Primary biodiversity data represent the fundamental elements of any study in systematics and evolution. They are, however, no longer gathered as they used to be, and the mass-production of observation-based occurrences is overtaking the collection of specimen-based occurrences. Although this change in practice is a major upheaval with significant consequences for the study of biodiversity, it remains understudied and has not yet attracted the attention it deserves. Analyzing 536 million occurrences from the Global Biodiversity Information Facility (GBIF) mediated data, we show that this spectacular change affects the 24 eukaryote taxonomic classes we targeted: from 1970 to 2016 the proportion of occurrences marked as traceable to tangible material (i.e. specimen-based occurrences) fell from 68 to 18 %; moreover, most of those specimen-based occurrences cannot be readily traced back to a specimen because the necessary information is missing. The ethical, practical or legal reasons responsible for this shift are known, and this situation appears unlikely to be reversed. Still, we urge scholars to acknowledge this dramatic change, embrace it and actively deal with it. Specifically, we emphasize why specimen-based occurrences must be gathered, as a guarantee that evolutionary studies can be repeated and that rich and diverse investigations can be conducted. When voucher specimens are impossible to secure, they must be replaced with observation-based occurrences combined with ancillary data (e.g. pictures, recordings, samples, DNA sequences). Ancillary data are instrumental for the usefulness of biodiversity occurrences and we show that, despite improving technologies to collate them, they remain rarely shared. The consequences of such a change are not yet clear, but we advocate collecting material evidence or ancillary data to ensure that recently collected primary biodiversity data do not become partly obsolete when their reliability is in doubt.
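The yearly proportion of specimen-based occurrences described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes each occurrence is a record with a `year` and a Darwin Core `basisOfRecord` value, and the set of values treated as specimen-based (`SPECIMEN_BASES`) is an assumption made for illustration, as are the function name and the invented toy records.

```python
from collections import Counter

# Assumption: which basisOfRecord values count as "specimen-based".
# The classification actually used for the dataset may differ.
SPECIMEN_BASES = {"PRESERVED_SPECIMEN", "FOSSIL_SPECIMEN", "LIVING_SPECIMEN"}


def specimen_share_by_year(records):
    """Return {year: fraction of occurrences that are specimen-based}."""
    totals, specimens = Counter(), Counter()
    for rec in records:
        year = rec["year"]
        totals[year] += 1
        if rec["basisOfRecord"] in SPECIMEN_BASES:
            specimens[year] += 1
    return {year: specimens[year] / totals[year] for year in sorted(totals)}


# Invented toy records, not taken from the dataset.
toy_records = [
    {"year": 1970, "basisOfRecord": "PRESERVED_SPECIMEN"},
    {"year": 1970, "basisOfRecord": "PRESERVED_SPECIMEN"},
    {"year": 1970, "basisOfRecord": "HUMAN_OBSERVATION"},
    {"year": 2016, "basisOfRecord": "PRESERVED_SPECIMEN"},
    {"year": 2016, "basisOfRecord": "HUMAN_OBSERVATION"},
    {"year": 2016, "basisOfRecord": "HUMAN_OBSERVATION"},
    {"year": 2016, "basisOfRecord": "HUMAN_OBSERVATION"},
]
shares = specimen_share_by_year(toy_records)
```

With real data, the records would come from a GBIF occurrence download rather than a hand-written list.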
Figure S1: Accumulation curve of collected species occurrences from 1900 to today. The plot shows the cumulative number of occurrences available between 1900 and 2016.
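As an illustration of what an accumulation curve computes, the following Python sketch builds a running total of occurrences per year. It is a sketch under stated assumptions, not the procedure used to produce Figure S1: the function name, its parameters, and the toy list of years are hypothetical.

```python
from collections import Counter
from itertools import accumulate


def accumulation_curve(years, start=1900, end=2016):
    """Return {year: cumulative number of occurrences from `start` up to year}."""
    # Occurrences outside [start, end] are ignored (an assumption of this sketch).
    counts = Counter(y for y in years if start <= y <= end)
    span = range(start, end + 1)
    return dict(zip(span, accumulate(counts[y] for y in span)))


# Invented collection years, one entry per occurrence.
curve = accumulation_curve([1900, 1900, 1950, 2016, 2016, 2020])
```

Years without any occurrence simply repeat the previous total, so the curve never decreases.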
Figure S2: Average completeness of the GBIF mediated data per year does not change over time. The blue line represents the average proportion of columns filled in the DarwinCore format. The blue area represents the standard deviation of this value. The average completeness of the data does not change much over the years and is never above 25 %.
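The completeness measure in Figure S2 (proportion of columns filled, averaged per year, with its standard deviation) can be sketched in Python as follows. This is a hedged illustration rather than the authors' method: it assumes a column counts as filled when its value is neither missing nor blank, it uses the population standard deviation, and the function names, column list, and toy records are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def row_completeness(record, columns):
    """Fraction of `columns` that hold a non-blank value in `record`."""
    filled = 0
    for col in columns:
        value = record.get(col)
        if value is not None and str(value).strip() != "":
            filled += 1
    return filled / len(columns)


def completeness_by_year(records, columns, year_field="year"):
    """Return {year: (mean completeness, population standard deviation)}."""
    by_year = defaultdict(list)
    for rec in records:
        year = rec.get(year_field)
        if year is None or str(year).strip() == "":
            continue  # records without a year cannot be placed on the time axis
        by_year[int(year)].append(row_completeness(rec, columns))
    return {y: (mean(v), pstdev(v)) for y, v in sorted(by_year.items())}


# Invented records using a handful of Darwin Core term names as columns.
columns = ["year", "basisOfRecord", "catalogNumber", "recordedBy"]
toy_rows = [
    {"year": "1970", "basisOfRecord": "PRESERVED_SPECIMEN", "catalogNumber": "A-1", "recordedBy": ""},
    {"year": "1970", "basisOfRecord": "HUMAN_OBSERVATION", "catalogNumber": "", "recordedBy": ""},
    {"year": "2016", "basisOfRecord": "HUMAN_OBSERVATION"},
]
summary = completeness_by_year(toy_rows, columns)
```

Whether the sample or the population standard deviation is appropriate depends on how the shaded area in the figure was defined, which the caption does not state.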