Integrating data from different taxonomic resolutions to better estimate community alpha diversity
Cite this dataset
Adjei, Kwaku et al. (2024). Integrating data from different taxonomic resolutions to better estimate community alpha diversity [Dataset]. Dryad. https://doi.org/10.5061/dryad.34tmpg4s0
Abstract
Integrated distribution models (IDMs), in which datasets with different properties are analysed together, are becoming widely used to model species distributions and abundance in space and time. To date, the IDM literature has focused on technical and statistical issues, such as the precision of parameter estimates and mitigation of biases arising from unstructured data sources. However, IDMs have an unrealised potential to estimate ecological properties that could not be properly derived from the source datasets if analysed separately. We present a model that estimates community alpha diversity metrics by integrating one species-level dataset of presence-absence records with a co-located dataset of group-level counts (i.e. lacking information about species identity). We illustrate the ability of community IDMs to capture the true alpha diversity through simulation studies and apply the model to data from the UK Pollinator Monitoring Scheme, to describe spatial variation in the diversity of solitary bees, bumblebees and hoverflies. The simulation and case studies showed that the proposed IDM produced more precise estimates of the community diversity than the single models, and the analysis of the real dataset further showed that the alpha diversity estimates from the IDM were averages of the single models. Our findings also revealed that IDMs had a higher prediction accuracy for all the insect groups in most cases, with this performance linked to the information provided by a data source into the IDM.
README: Integrating data from different taxonomic resolutions to better estimate community alpha diversity
This repository hosts the dataset used for the paper "Integrating data from different taxonomic resolutions to better estimate community alpha diversity". The scripts for the analysis can be accessed at https://zenodo.org/badge/latestdoi/424611928. This dataset is a subset of the UK Pollinator Monitoring Scheme (PoMS). See section 2.1 of the main paper for details of the PoMS survey.
Dataset
FIT_counts_with_effort.csv
This file contains data from the Flower-Insect Timed Count (FIT Count) survey. It records the site identification (site ID), observation date (date), counts for each insect group (counts) and the number of surveys, which is a measure of the sampling effort (n_surveys).
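Because the number of surveys varies between visits, raw counts are easier to compare once they are scaled by effort. The following is a minimal sketch of that scaling, not the method used in the paper. It assumes a single `counts` column and a site column spelled `site_id`; the exact header spellings in the file are not given here, and the sample rows are invented for illustration.

```python
import csv
import io

# Hypothetical rows mimicking FIT_counts_with_effort.csv.
# Assumption: headers are site_id, date, counts, n_surveys.
SAMPLE_FIT = """site_id,date,counts,n_surveys
A1,2018-06-01,12,2
A1,2018-07-01,5,1
B2,2018-06-03,0,3
"""


def counts_per_survey(rows):
    """Return (site_id, date, counts / n_surveys) for each row with effort > 0."""
    out = []
    for row in rows:
        n_surveys = int(row["n_surveys"])
        if n_surveys <= 0:
            continue  # no recorded effort, so the rate is undefined
        out.append((row["site_id"], row["date"], int(row["counts"]) / n_surveys))
    return out


rates = counts_per_survey(csv.DictReader(io.StringIO(SAMPLE_FIT)))
```

To run this on the real file, replace `io.StringIO(SAMPLE_FIT)` with an open file handle and adjust the column names to match the actual header.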
pantraps.csv
This file contains data from the pan trap survey. The data are collected from five pan traps. The file records the site identification (site ID), observation date (date), the taxonomic group of the species (taxon_group), the species identification (taxon_aggregated), the number of traps in which the species was present (n_traps_present) and the number of traps used in the survey at that site (n_traps).
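As a rough illustration of how these columns relate to alpha diversity, the sketch below counts the species detected in at least one trap per site, date and taxonomic group. This is a naive observed richness, not the model-based estimate from the paper. The `site_id` header spelling and the sample rows are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows mimicking pantraps.csv.
# Assumption: the site column is spelled site_id.
SAMPLE_PAN = """site_id,date,taxon_group,taxon_aggregated,n_traps_present,n_traps
A1,2018-06-01,bumblebees,Bombus terrestris,3,5
A1,2018-06-01,bumblebees,Bombus lapidarius,2,5
A1,2018-06-01,bumblebees,Bombus pascuorum,0,5
A1,2018-06-01,hoverflies,Episyrphus balteatus,1,5
B2,2018-06-03,bumblebees,Bombus terrestris,2,4
"""


def observed_richness(rows):
    """Count distinct species detected in at least one trap, per
    (site_id, date, taxon_group). Keys with no detections are absent."""
    seen = defaultdict(set)
    for row in rows:
        if int(row["n_traps_present"]) > 0:
            key = (row["site_id"], row["date"], row["taxon_group"])
            seen[key].add(row["taxon_aggregated"])
    return {key: len(species) for key, species in seen.items()}


richness = observed_richness(csv.DictReader(io.StringIO(SAMPLE_PAN)))
```

The ratio `n_traps_present / n_traps` can be derived from the same rows if a per-species detection proportion is needed.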
gr_ref.csv
This file contains the coordinate reference for the site identification (Site ID) used in FIT_counts_with_effort.csv and pantraps.csv. It records the coordinates (EASTING and NORTHING) and the region of each site (region).
all_surveys.csv
By the design of the PoMS survey, both the pan trap survey and the 10-minute FIT Count survey should be conducted by the same observer on the same sampling visit. However, owing to challenges during the sampling visits, some visits did not include both surveys. This file lists each site (Site ID), the sampling date (date) and an indication of whether the pan trap and FIT Count surveys were conducted (NA indicates that no survey was conducted).
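Analyses that integrate the two data sources may need to know which visits have both surveys. The sketch below shows one way to select them. The column names `site_id`, `pan_trap` and `fit_count` are hypothetical, as is the way a conducted survey is encoded; only the use of NA for a missing survey comes from the description above.

```python
import csv
import io

# Hypothetical rows mimicking all_surveys.csv.
# Assumptions: headers site_id, date, pan_trap, fit_count; any value
# other than "NA" means the survey was conducted.
SAMPLE_VISITS = """site_id,date,pan_trap,fit_count
A1,2018-06-01,1,1
A1,2018-07-01,1,NA
B2,2018-06-03,NA,1
"""


def paired_visits(rows, missing="NA"):
    """Return (site_id, date) for visits where both surveys were conducted."""
    return [
        (row["site_id"], row["date"])
        for row in rows
        if row["pan_trap"] != missing and row["fit_count"] != missing
    ]


paired = paired_visits(csv.DictReader(io.StringIO(SAMPLE_VISITS)))
```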
species_lookup.csv
The file contains the species list for each insect group.
Funding
Natural Environment Research Council, Award: NE/R016429/1