Community science enhances modelled bee distributions in a tropical Asian city
Data files
Jan 28, 2024 version files 1.25 MB
Abstract
Bees and the ecosystem services they provide are vital to urban ecosystems, but little is understood about their distributions, particularly in the Asian tropics. This is largely due to taxonomic impediments and limited inventorying, monitoring, and digitization of occurrence records. While expert collections (EC) are demonstrably insufficient by themselves as a data source to model and understand bee distributions, the boom of community science (CS) in urban areas provides an untapped opportunity to learn about bee distributions within our cities. We used CS observations in combination with EC observations to model the distribution of bees in Singapore, a small tropical city-state in Southeast Asia. To address the restricted spatial context, we performed multiple bias corrections and show that species distribution models performed well when estimating the distribution of habitat specialists with distinct range limits detectable within Singapore. We successfully modelled 37 bee species, where model statistics improved for 23 species upon the incorporation of CS observations. Nine species had insufficient EC observations to obtain acceptable models, but could be modelled with the incorporation of CS observations. This is the first study to combine both EC and CS observations to map and model the occurrences of tropical Asian bee species for a highly urbanised region at such fine resolution. Our results suggest that urban landscapes with impervious surfaces and higher temperatures are less suitable for bee species, and such findings can be used to advise the management of urban landscapes to optimise the diversity of bee pollinators and other organisms.
README: Community science enhances modelled bee distributions in a tropical Asian city
Description of the data and file structure
Observations were obtained from both community science (CS) and expert-collected (EC) databases (Table 1). EC observations were defined as those collected by individuals possessing specialised knowledge and experience in the study of bees, including formally trained students in entomology and researchers associated with the National University of Singapore Insect Diversity Laboratory (PI Ascher), using targeted and standardised sampling methods (Ascher et al., 2019) supplemented by opportunistic sampling. This dataset was the primary basis for recent conservation assessments (Ascher et al., 2022). CS sources were defined as observations collected in large part by non-experts or non-formally trained individuals through open access repositories, such as iNaturalist (Robinson et al., 2020), and social media, specifically the Facebook Group “The Bees and Wasps of Singapore” (https://www.facebook.com/groups/1450495321695805). Anyone can submit observations to iNaturalist, where they are then publicly available for other users to identify. By crowdsourcing identifications, with validation by two identifiers required to achieve Research Grade status, iNaturalist promotes knowledge-building and information sharing to the benefit of various stakeholders: the public, scientists and resource managers. Although non-experts may misidentify specimens, validation by one or more expert users making comprehensive checks across taxa and areas, including of observations that are already Research Grade, can help to mitigate this. To assemble our dataset, CS images from Singapore submitted by more than 500 observers were validated by multiple experts on Asian bees, including the same experts (especially JSA and ZWWS with help from colleagues; see above) who also collected many of, and identified nearly all of, the recent EC specimens. The latter were consulted directly to confirm the most difficult identifications made from images.
Observations were curated to include only data from 2007 until 2022 to match the land use data from Gaw et al. (2019), and observations were inspected to remove data points with clearly misrepresented localities. 'NA's in cells represent the lack of data at the time of collection. So as not to adversely impact conservation efforts or introduce unintended risk to the threatened/vulnerable species associated with our dataset, the precision of the geographic coordinates has been generalized to 0.1 decimal degrees by reducing the number of decimal places.
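To illustrate what generalizing coordinates to 0.1 decimal degrees means in practice, here is a minimal Python sketch. It assumes the generalization was done by rounding to one decimal place; the description above does not say whether rounding or truncation was used, and the function name and the example coordinates are hypothetical, not taken from the data files. Missing values are passed through unchanged.

```python
def generalize_coordinate(value, decimals=1):
    """Reduce a coordinate in decimal degrees to `decimals` decimal places.

    Assumption: rounding is used (truncation would be an alternative).
    A missing value (None) is returned unchanged.
    """
    if value is None:
        return None
    return round(float(value), decimals)


def generalize_point(latitude, longitude):
    """Generalize a (latitude, longitude) pair to 0.1 decimal degrees."""
    return generalize_coordinate(latitude), generalize_coordinate(longitude)


# Made-up point within Singapore, for illustration only.
print(generalize_point(1.3521, 103.8198))
```

At Singapore's latitude, 0.1 decimal degrees corresponds to roughly 11 km, so the generalized coordinates indicate only the broad area of an observation rather than its exact location.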
Methods
Observations were obtained from both CS and EC databases (Table 1). EC observations were defined as those collected by individuals possessing specialised knowledge and experience in the study of bees, including formally trained students in entomology and researchers associated with the National University of Singapore Insect Diversity Laboratory (PI Ascher), using targeted and standardised sampling methods (Ascher et al., 2019) supplemented by opportunistic sampling. This dataset was the primary basis for recent conservation assessments (Ascher et al., 2022). CS sources were defined as observations collected in large part by non-experts or non-formally trained individuals through open access repositories, such as iNaturalist (Robinson et al., 2020), and social media, specifically the Facebook Group “The Bees and Wasps of Singapore” (https://www.facebook.com/groups/1450495321695805). Anyone can submit observations to iNaturalist, where they are then publicly available for other users to identify. By crowdsourcing identifications, with validation by two identifiers required to achieve Research Grade status, iNaturalist promotes knowledge-building and information sharing to the benefit of various stakeholders: the public, scientists and resource managers. Although non-experts may misidentify specimens, validation by one or more expert users making comprehensive checks across taxa and areas, including of observations that are already Research Grade, can help to mitigate this. To assemble our dataset, CS images from Singapore submitted by more than 500 observers were validated by multiple experts on Asian bees, including the same experts (especially JSA and ZWWS with help from colleagues; see above) who also collected many of, and identified nearly all of, the recent EC specimens. The latter were consulted directly to confirm the most difficult identifications made from images.
Observations were curated to include only data from 2007 until 2022 to match the land use data from Gaw et al. (2019), and observations were inspected to remove data points with clearly misrepresented localities. We then combined the CS and EC datasets into a third dataset (hereafter referred to as ‘combined’) to examine the feasibility of pooling the two sources for greater observation coverage and sampling effort. Blanks in cells represent the lack of data at the time of collection.
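The curation and pooling steps described above can be sketched in Python as follows. This is an illustrative sketch only, not the authors' actual workflow: the record fields (`species`, `date`), the `source` label, and the example records are hypothetical, and the 2007 to 2022 window is assumed to be inclusive at both ends. The manual inspection of localities is not represented.

```python
from datetime import date


def within_study_period(obs_date, first_year=2007, last_year=2022):
    """Return True if the observation year falls in the study window.

    Assumption: both boundary years are included.
    """
    return first_year <= obs_date.year <= last_year


def build_combined(cs_records, ec_records):
    """Filter both datasets by date and pool them into one list.

    Each retained record gets a 'source' label so that CS and EC
    observations stay distinguishable in the combined dataset.
    """
    combined = []
    for source, records in (("CS", cs_records), ("EC", ec_records)):
        for record in records:
            if within_study_period(record["date"]):
                combined.append({**record, "source": source})
    return combined


# Made-up records for illustration only.
cs = [
    {"species": "species A", "date": date(2019, 5, 3)},
    {"species": "species B", "date": date(2005, 1, 12)},  # before 2007: dropped
]
ec = [
    {"species": "species A", "date": date(2012, 8, 21)},
]

combined = build_combined(cs, ec)
print(len(combined), [r["source"] for r in combined])
```

Keeping a source label on every pooled record makes it possible to rerun models on the CS-only, EC-only, and combined datasets from a single table.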