Wildwatch Kenya expert verified data

Egna, Nicole1; O'Connor, David1; Stacy-Dawes, Jenna1; Tobler, Mathias1; Pilfold, Nicholas1; Neilsen, Kristin1; Simmons, Brooke2; Davis, Elizabeth1; Bowler, Mark3; Fennessy, Julian4; Glikman, Jenny Anne1; Larpei, Lexson5; Lekalgitele, Jesus6; Lekupanai, Ruth6; Lekushan, Johnson6; Lemingani, Lekuran6; Lemirgishan, Joseph6; Lenaipa, Daniel6; Lenyakopiro, Jonathan6; Lesipiti, Ranis Lenalakiti6; Lororua, Masenge6; Muneza, Arthur4; Rabhayo, Sebastian6; Masiaine, Symon1; Ruppert, Kirstie1; Owen, Megan1

Published Aug 25, 2021 on Dryad. https://doi.org/10.5061/dryad.n02v6wwv9

Data files

Aug 25, 2021 version files (29.22 MB total):

ACT_Expert_Verified_Dataset.csv (407.52 KB)
WWK_Expert_Verified_Dataset.csv (26.21 MB)
WWK_Extended_Classification_Set.csv (2.61 MB)

Abstract

Scientists increasingly rely on volunteer citizen scientists to classify images captured by motion-activated trail-cameras. The rising popularity of citizen science reflects its potential to engage the public in conservation science and to accelerate processing of the large volume of images generated by trail-cameras. While image classification accuracy by citizen scientists can vary across species, the influence of other factors on accuracy is poorly understood. Inaccuracy diminishes the value of citizen-science-derived data and prompts the need for specific best-practice protocols to decrease error. We compare accuracy among three programs that use crowdsourced citizen scientists to process images online: Snapshot Serengeti, Wildwatch Kenya, and AmazonCam Tambopata. We hypothesized that habitat type and camera settings would influence accuracy. To evaluate these factors, we circulated each photo to multiple volunteers.

All volunteer classifications were aggregated to a single best answer for each photo using a plurality algorithm. Subsequently, a subset of these images underwent expert review, and the expert answers were compared to the citizen scientist results. Classification errors were categorized by the nature of the error (e.g., false species or false empty) and by the reason for the false classification (e.g., misidentification). Our results show that Snapshot Serengeti had the highest accuracy (97.9%), followed by AmazonCam Tambopata (93.5%), then Wildwatch Kenya (83.4%). Error type was influenced by habitat, with false empty images more prevalent in open-grassy habitat (27%) than in woodlands (10%). For medium to large animal surveys across all habitat types, our results suggest that to significantly improve accuracy in crowdsourced projects, researchers should use a trail-camera setup protocol with a burst of three consecutive photos and a short field of view, and should determine camera sensitivity settings based on in situ testing. Accuracy comparisons such as this study can improve the reliability of future citizen science projects and, in turn, encourage increased use of such data.

Methods

WWK Expert Verified Dataset and ACT Expert Verified Dataset:

Photos from the Wildwatch Kenya and AmazonCam Tambopata citizen science platforms were classified by experts to produce expert-verified datasets of “Expert Answers” (EA). For each photo reviewed, the NEA ("Non-expert Answer") was compared to the EA. Here the NEA represents the aggregated answer across all citizen science classifications of that photo, reporting the species that received a majority of the votes. The column "perc_spec" represents the agreement among citizen scientists on the content of the photo. When the NEA and the EA disagreed, the photo was labeled as ‘false species’ if the NEA falsely identified the species present, or ‘false empty’ if the NEA falsely reported that there was no species in the image. The date and time, the location, the site, the expert who reviewed the photo, and the URL hosting the image are also listed.
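
The aggregation and labeling described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the "empty" label, and the alphabetical tie-breaking rule are hypothetical, and the share computed here is only one plausible reading of what "perc_spec" records.

```python
from collections import Counter

EMPTY = "empty"  # assumed label for "no animal in the image"


def aggregate_votes(votes):
    """Collapse the volunteer votes for one photo into (NEA, agreement share).

    The NEA is the label with the most votes. Ties are broken
    alphabetically, an arbitrary choice made here for reproducibility.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    top = max(counts.values())
    nea = min(label for label, n in counts.items() if n == top)
    return nea, top / len(votes)


def error_type(nea, ea):
    """Label the outcome of comparing the NEA with the expert answer (EA)."""
    if nea == ea:
        return "agree"
    if nea == EMPTY:
        return "false empty"
    # Assumption: every other disagreement counts as a false species,
    # including a species reported for an image the expert found empty.
    return "false species"


votes = ["zebra", "empty", "empty", "empty", "zebra"]
nea, share = aggregate_votes(votes)
print(nea, share, error_type(nea, "zebra"))  # empty 0.6 false empty
```

In this made-up example three of five volunteers saw nothing, so the aggregated answer is "empty" with 60% agreement, and an expert answer of "zebra" makes the photo a false empty.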

WWK Extended Classification Set:

To investigate WWK’s lower overall accuracy compared to Snapshot Serengeti (SS) and ACT, and its abundance of false empties compared to ACT, a separate analysis was conducted on a subset of 21,530 WWK images. This subset comprised the images that had at least one citizen scientist classification of a reticulated giraffe, a zebra (Equus quagga or E. grevyi), an elephant (Loxodonta africana), a gazelle, an impala, or a dik-dik, and that contained only one species. An expert reviewed the images in the Extended Classification Set and determined which actually contained a giraffe, a zebra, an elephant, a gazelle, an impala, or a dik-dik. The aggregated NEA for each of those images was then compared to the EA to determine whether the two agreed.
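
The comparison described above amounts to an agreement rate between NEA and EA, which can be broken down by the species the expert identified. The sketch below is illustrative only: the keys "nea" and "ea" are hypothetical and may not match the column names in WWK_Extended_Classification_Set.csv, and the sample rows are invented for demonstration.

```python
from collections import defaultdict


def agreement_by_species(rows):
    """Return {expert species: fraction of photos where NEA equals EA}."""
    totals = defaultdict(int)
    agreed = defaultdict(int)
    for row in rows:
        totals[row["ea"]] += 1
        if row["nea"] == row["ea"]:
            agreed[row["ea"]] += 1
    return {species: agreed[species] / totals[species] for species in totals}


# Invented rows, one per photo; not taken from the dataset.
rows = [
    {"nea": "giraffe", "ea": "giraffe"},
    {"nea": "empty", "ea": "giraffe"},
    {"nea": "zebra", "ea": "zebra"},
    {"nea": "impala", "ea": "gazelle"},
]
print(agreement_by_species(rows))
# {'giraffe': 0.5, 'zebra': 1.0, 'gazelle': 0.0}
```

Grouping by the expert answer rather than the volunteer answer keeps the denominator tied to what was actually in the photo, so a low rate for a species points to images of that species that volunteers missed or mislabeled.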

For images in the ACT Expert Verified Dataset and the WWK Extended Classification Set where the NEA and EA disagreed, an expert conducted an additional review to determine the most likely reason for the disagreement: distance (the animal was far in the background), night time (the image was too dark to determine the species), partial view (only part of the animal was captured in the frame), close up (the animal was too close to the camera), hidden (vegetation or another obstacle impeded the view of the animal), or misidentification (the species was confused with another species).
