Accounting for imperfect detection in data from museums and herbaria when modeling species distributions: Combining and contrasting data-level versus model-level bias correction
Data files
Jun 22, 2021 version files 15.79 MB
- 01_Download_Data.R (2.83 KB)
- 02_Process_Data.Rmd (39.91 KB)
- 03_Assemble_Simple_Model_Folds.R (3.27 KB)
- 04A_Crossvalidation_FullModel.R (25.85 KB)
- 04B_Crossvalidation_SimpleModel.R (14.63 KB)
- 04B_Crossvalidation_Summary_AllSpecies.R (7.94 KB)
- 05_Figures.R (61.19 KB)
- correctedDates2.csv (6.85 KB)
- flagged_coordinate_data.csv (1.27 MB)
- incorrectYearsCollectors.csv (8.27 KB)
- listCollectors_refined.csv (12.96 MB)
- MANIND_collectors_of_species.RData (14.82 KB)
- MANIND_surv_cov.RData (80.48 KB)
- MANIND_surv_cov3.RData (98.65 KB)
- METTOX_collectors_of_species.RData (14.89 KB)
- METTOX_surv_cov.RData (80.78 KB)
- METTOX_surv_cov3.RData (98.94 KB)
- README.md (5.28 KB)
- RHUCOP_collectors_of_species.RData (14.98 KB)
- RHUCOP_surv_cov.RData (81.71 KB)
- RHUCOP_surv_cov3.RData (99.88 KB)
- SCHTER_collectors_of_species.RData (15.03 KB)
- SCHTER_surv_cov.RData (82.08 KB)
- SCHTER_surv_cov3.RData (100.26 KB)
- TOXPUB_collectors_of_species.RData (14.81 KB)
- TOXPUB_surv_cov.RData (80.51 KB)
- TOXPUB_surv_cov3.RData (98.68 KB)
- TOXRAD_collectors_of_species.RData (15 KB)
- TOXRAD_surv_cov.RData (81.79 KB)
- TOXRAD_surv_cov3.RData (99.96 KB)
- TOXVER_collectors_of_species.RData (14.81 KB)
- TOXVER_surv_cov.RData (80.52 KB)
- TOXVER_surv_cov3.RData (98.69 KB)
- unit_cov2.RData (11.96 KB)
Abstract
The digitization of museum collections and an explosion in citizen science initiatives have resulted in a wealth of data that can be useful for understanding the global distribution of biodiversity, provided that the well-documented biases inherent in unstructured opportunistic data are accounted for. Although occupancy models have traditionally been used to model imperfect detection in structured data from systematic wildlife surveys, they also provide a framework for modeling the imperfect collection process that produces digital specimen data. In this study, we explore methods for adapting occupancy models for use with biased opportunistic occurrence data from museum specimens and citizen science platforms, using seven species of Anacardiaceae in Florida as a case study. We explored two methods of incorporating information about collection effort to inform our uncertainty around species presence: (1) filtering the data to exclude collectors unlikely to collect the focal species and (2) incorporating collection covariates (collection type, time of collection, and history of previous detections) into a model of collection probability. We found that the best models incorporated both the background data filtration step and the collector covariates. Month, method of collection, and whether a collector had previously collected the focal species were important predictors of collection probability. Standardizing the metadata associated with data collection will improve efforts to model the spatial distributions of a variety of species.
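For readers unfamiliar with the framework, the structure described in the abstract can be sketched as a generic single-season occupancy model. This is an illustrative formulation under assumed notation, not necessarily the exact specification used in the scripts: the indices i (spatial unit) and j (collecting event), the symbols psi, p, beta, and alpha, and the coding of the three collection covariates are all assumptions introduced here.

```latex
% z_i: latent presence of the focal species in spatial unit i
% y_{ij}: 1 if collecting event j in unit i yielded the focal species, else 0
\begin{align}
  z_i &\sim \mathrm{Bernoulli}(\psi_i),
    & \operatorname{logit}(\psi_i) &= \mathbf{x}_i^{\top}\boldsymbol{\beta}, \\
  y_{ij} \mid z_i &\sim \mathrm{Bernoulli}(z_i \, p_{ij}),
    & \operatorname{logit}(p_{ij}) &= \alpha_0
      + \alpha^{\text{month}}_{m(i,j)}
      + \alpha^{\text{type}}_{t(i,j)}
      + \alpha^{\text{prev}} \, h_{ij}.
\end{align}
% x_i: unit-level covariates; m(i,j): month of the event; t(i,j): collection type;
% h_{ij}: 1 if the collector had previously collected the focal species, else 0
```

In this sketch, the data-level correction (filtering out collectors unlikely to collect the focal species) determines which events j enter the likelihood, while the model-level correction enters through the covariates on p.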
R code for downloading data, cleaning data, and running occupancy models.
README.md contains an overview of the R scripts.