Analysis, camera, and image files for: ecoEye, embedded vision camera for biodiversity monitoring
Data files
Sep 26, 2024 version files 54.67 MB
- A_-_Durian_bats.zip (5.03 MB)
- B_-_bats_and_insects.zip (1.55 MB)
- C_-_board_insects.zip (4.03 MB)
- D_-_rapeseed_bees.zip (10.08 MB)
- E_-_waterfowl.zip (3.71 MB)
- F_-_dianthus.zip (30.24 MB)
- README.md (22.96 KB)
Abstract
We need comprehensive information to manage and protect biodiversity in the face of global environmental challenges, and artificial intelligence is needed to generate that information from vast amounts of biodiversity data. Currently, vision-based monitoring methods are heterogeneous; they poorly cover spatial and temporal dimensions, overly depend on humans, and are not reactive enough for adaptive management.
To mitigate these issues, we present a portable, modular, affordable, and low-power device with embedded vision for biodiversity monitoring of a wide range of terrestrial taxa. Our camera uses interchangeable lenses to resolve barely visible and remote targets, as well as customisable algorithms for blob detection, region-of-interest classification, and object detection to identify them. We showcase our system in six use cases from the ethology, landscape ecology, agronomy, pollination ecology, conservation biology, and phenology disciplines.
Using the same devices with different automated setups, we discovered bats feeding on Durian tree flowers, monitored flying bats and their insect prey, identified nocturnal insect pests in paddy fields, detected bees visiting rapeseed crop flowers, triggered real-time alerts for waterfowl, and tracked flower phenology over months. We measured classification accuracies (i.e., F1-scores) between 55% and 95% in our field surveys and used them to standardise observations over highly resolved time scales.
Our cameras are amenable to situations where automated vision-based monitoring is required off the grid, in natural and agricultural ecosystems, and in particular for quantifying species interactions. Embedded vision devices such as this will help address global biodiversity challenges and facilitate a technology-aided agricultural systems transformation.
https://doi.org/10.5061/dryad.1ns1rn90j
Description of the data and file structure
Each ZIP file corresponds to and is named after one of the use cases described in the manuscript (A to F). It contains:
- “analysis” subfolder: the CSV files and R scripts ("A-F - analysis[...].R" and "A-F - precision and recall[...].R") needed to reproduce the results and figures in R. The only manual input the scripts require is the path to the folder where the ZIP was extracted. CSV files are described below.
- “camera” subfolder: MicroPython scripts ("main.py", "ecofunctions.py", etc.) and, optionally, a TensorFlow model ("model.tflite") and labels file ("labels.txt") to replicate the camera operation. These files should be placed at the root of the SD card of the ecoEye camera.
- “samples” subfolder: 10 representative images obtained during the field deployments
Files and variables
File: README.md
Description: Description of all the data CSV files contained in the ZIP files.
File: A_-_ALL_detections.csv
Description: Comma-separated file containing detections logged by the camera over all deployment dates. The date column was added manually when merging the data from the different dates. For visually inspected images, new rows were inserted manually to represent false negative detections; their detection ID has a suffixed lowercase letter.
Variables
- date: Y/M/D format
- detection_id: incremental integer for the detections, restarts on each date. Discontinuous due to removed CNN object detections.
- clear_bat: whether a bat is clearly visible (1) or not (blank)
- flower_id: the ID of the flower on which the bat is feeding. NA when bats are flying
- picture_id: incremental integer for the pictures, restarts on each date
- blob_pixels: number of contiguous pixels that surpassed the detection threshold
- blob_elongation: value between 0 and 1 representing how long (not round) the blob is
- blob_corner1_x: x coordinate of the top left blob corner
- blob_corner1_y: y coordinate of the top left blob corner
- blob_corner2_x: x coordinate of the top right blob corner
- blob_corner2_y: y coordinate of the top right blob corner
- blob_corner3_x: x coordinate of the bottom right blob corner
- blob_corner3_y: y coordinate of the bottom right blob corner
- blob_corner4_x: x coordinate of the bottom left blob corner
- blob_corner4_y: y coordinate of the bottom left blob corner
- blob_l_mode: mode of the luminance channel in the Lab color space blob histogram
- blob_l_min: minimum of the luminance channel in the Lab color space blob histogram
- blob_l_max: maximum of the luminance channel in the Lab color space blob histogram
- blob_a_mode: mode of the a channel in the Lab color space blob histogram
- blob_a_min: minimum of the a channel in the Lab color space blob histogram
- blob_a_max: maximum of the a channel in the Lab color space blob histogram
- blob_b_mode: mode of the b channel in the Lab color space blob histogram
- blob_b_min: minimum of the b channel in the Lab color space blob histogram
- blob_b_max: maximum of the b channel in the Lab color space blob histogram
- comments: manually written comments upon visual inspection
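As a minimal sketch of how the manually inserted false-negative rows could be separated from camera-logged detections, the Python function below splits a detection ID into its integer part and a flag for the lowercase suffix. The function name and the exact ID pattern (digits followed by at most one lowercase letter, e.g. "12a") are assumptions for illustration and should be checked against the actual CSV values.

```python
import re

# Assumption: camera-logged IDs are purely numeric ("12"), while manually
# inserted false-negative rows carry one lowercase suffix letter ("12a").
_ID_PATTERN = re.compile(r"^(\d+)([a-z]?)$")


def parse_detection_id(raw):
    """Return (integer part, True if the row is a manual false negative)."""
    match = _ID_PATTERN.match(str(raw).strip())
    if match is None:
        raise ValueError(f"unexpected detection_id: {raw!r}")
    number, suffix = match.groups()
    return int(number), suffix != ""
```

For example, `parse_detection_id("12a")` yields the integer 12 together with a flag marking the row as manually inserted.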
File: A_-_ALL_images.csv
Description: Comma-separated file containing pictures information logged by the camera over all deployment dates.
Variables
- date: Y/M/D format
- picture_id: incremental integer for the pictures, restarts on each date
- date_time: date and time in YYYY-M-D-H-M-S format
- exposure_us: exposure time in microseconds
- gain_dB: digital gain applied by the image sensor to brighten the image (in decibels)
- frames_per_second: picture-taking rate of the camera
- image_type: whether the saved image was used as a reference to subtract the current image from
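The date_time column uses a hyphen-separated layout without zero padding (YYYY-M-D-H-M-S). A minimal Python sketch for turning such a string into a datetime object follows; the function name is hypothetical, and the sketch assumes all six fields are always present.

```python
from datetime import datetime


def parse_camera_timestamp(text):
    """Parse a 'YYYY-M-D-H-M-S' string (no zero padding) into a datetime."""
    parts = [int(piece) for piece in text.strip().split("-")]
    if len(parts) != 6:
        raise ValueError(f"expected six fields, got {len(parts)}: {text!r}")
    year, month, day, hour, minute, second = parts
    return datetime(year, month, day, hour, minute, second)
```

Splitting on the hyphen and converting each field to an integer avoids relying on how a given date-format parser treats unpadded fields.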
File: B_-_ALL_detections.csv
Description: Comma-separated file containing detections logged by the camera over all deployment dates. The date column was added manually when merging the data from the different dates. For visually inspected images, new rows were inserted manually to represent false negative detections; their detection ID has a suffixed lowercase letter.
Variables
- location: position of the camera
- date: M/D/Y format
- detection_id: incremental integer for the detections, restarts on each date.
- picture_id: incremental integer for the pictures, restarts on each date
- blob_pixels: number of contiguous pixels that surpassed the detection threshold
- blob_elongation: value between 0 and 1 representing how long (not round) the blob is
- blob_corner1_x: x coordinate of the top left blob corner
- blob_corner1_y: y coordinate of the top left blob corner
- blob_corner2_x: x coordinate of the top right blob corner
- blob_corner2_y: y coordinate of the top right blob corner
- blob_corner3_x: x coordinate of the bottom right blob corner
- blob_corner3_y: y coordinate of the bottom right blob corner
- blob_corner4_x: x coordinate of the bottom left blob corner
- blob_corner4_y: y coordinate of the bottom left blob corner
- blob_l_mode: mode of the luminance channel in the Lab color space blob histogram
- blob_l_min: minimum of the luminance channel in the Lab color space blob histogram
- blob_l_max: maximum of the luminance channel in the Lab color space blob histogram
- blob_a_mode: mode of the a channel in the Lab color space blob histogram
- blob_a_min: minimum of the a channel in the Lab color space blob histogram
- blob_a_max: maximum of the a channel in the Lab color space blob histogram
- blob_b_mode: mode of the b channel in the Lab color space blob histogram
- blob_b_min: minimum of the b channel in the Lab color space blob histogram
- blob_b_max: maximum of the b channel in the Lab color space blob histogram
- image_labels: semicolon-separated classification labels
- image_confidences: semicolon-separated classification confidence scores
- image_x: X coordinate of the top left corner of the blob bounding rectangle (ROI) that was classified
- image_y: Y coordinate of the top left corner of the blob bounding rectangle (ROI) that was classified
- image_width: width in pixels of the blob bounding rectangle (ROI) that was classified
- image_height: height in pixels of the blob bounding rectangle (ROI) that was classified
- target: actually detected target, as determined by visual inspection
- unclear: whether the target type is unclear (1) or clear (blank)
- comments: manually written comments upon visual inspection
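Because image_labels and image_confidences are semicolon-separated, each row has to be split before the top-scoring class can be read. The Python sketch below illustrates one way to do this. It assumes that the two fields list classes in the same order, which this description does not state explicitly; the function name and the label names in the usage note are hypothetical.

```python
def best_label(image_labels, image_confidences):
    """Return (label, score) with the highest score, or None for empty fields."""
    labels = [item.strip() for item in image_labels.split(";") if item.strip()]
    scores = [float(item) for item in image_confidences.split(";") if item.strip()]
    if len(labels) != len(scores):
        raise ValueError("labels and confidences differ in length")
    if not labels:
        return None
    # Assumption: the i-th confidence belongs to the i-th label.
    return max(zip(labels, scores), key=lambda pair: pair[1])
```

With the made-up inputs `"bat;insect;background"` and `"0.1;0.7;0.2"`, the function returns the pair for "insect".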
File: B_-_ALL_images.csv
Description: Comma-separated file containing pictures information logged by the camera over all deployment dates.
Variables
- location: position of the camera
- picture_id: incremental integer for the pictures, restarts on each date
- date_time: date and time in YYYY-M-D-H-M-S format.
- exposure_us: exposure time in microseconds
- gain_dB: digital gain applied by the image sensor to brighten the image (in decibels)
- frames_per_second: picture-taking rate of the camera
- image_type: whether the saved image was used as a reference to subtract the current image from
File: B_-_detection_statistics_mean.csv
Description: Comma-separated file generated by “B - precision-recall bats and insects.R” script containing performance information for the ROI classification.
Variables
- image_label: classification label
- Best confidence threshold: threshold at which the accuracy (i.e., F1-score) is maximal
- Detections: how many detections are generated at this threshold
- False positives: number of false positive detections
- False negatives: number of false negative detections
- True positives: number of true positive detections
- Recall: proportion of the real events (as determined by visual inspection) that are true positives
- Precision: proportion of the detections that are true positives
- max F1 score: maximum accuracy for the given confidence threshold
- Real events: number of times the target with the given label has been detected by visual inspection
- Correction factor: factor to apply to the number of detections to find the real events
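The relationships between these columns can be sketched in Python as follows. This is an illustration under stated assumptions, not the R script shipped in the ZIP: it assumes that Detections equals true positives plus false positives, that Real events equals true positives plus false negatives, and that the correction factor is Real events divided by Detections. The function name is hypothetical.

```python
def detection_metrics(true_pos, false_pos, false_neg):
    """Derive precision, recall, F1 and a correction factor from raw counts."""
    detections = true_pos + false_pos      # assumption: all retained detections
    real_events = true_pos + false_neg     # assumption: events seen by inspection
    precision = true_pos / detections if detections else 0.0
    recall = true_pos / real_events if real_events else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    # Assumption: multiplying detections by this factor estimates real events.
    correction = real_events / detections if detections else float("nan")
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "correction_factor": correction,
    }
```

Under these assumptions the correction factor equals precision divided by recall, so a classifier that misses more events than it falsely reports has a factor above 1.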
File: C_-_ALL_detections.csv
Description: Comma-separated file containing detections logged by the camera over all deployment dates. The date column was added manually when merging the data from the different dates. For visually inspected images, new rows were inserted manually to represent false negative detections; their detection ID has a suffixed lowercase letter.
Variables
- date: D/M/Y format
- camera: ID of the device
- detection_id: incremental integer for the detections, restarts on each date.
- picture_id: incremental integer for the pictures, restarts on each date
- blob_pixels: number of contiguous pixels that surpassed the detection threshold
- blob_elongation: value between 0 and 1 representing how long (not round) the blob is
- blob_corner1_x: x coordinate of the top left blob corner
- blob_corner1_y: y coordinate of the top left blob corner
- blob_corner2_x: x coordinate of the top right blob corner
- blob_corner2_y: y coordinate of the top right blob corner
- blob_corner3_x: x coordinate of the bottom right blob corner
- blob_corner3_y: y coordinate of the bottom right blob corner
- blob_corner4_x: x coordinate of the bottom left blob corner
- blob_corner4_y: y coordinate of the bottom left blob corner
- blob_l_mode: mode of the luminance channel in the Lab color space blob histogram
- blob_l_min: minimum of the luminance channel in the Lab color space blob histogram
- blob_l_max: maximum of the luminance channel in the Lab color space blob histogram
- blob_a_mode: mode of the a channel in the Lab color space blob histogram
- blob_a_min: minimum of the a channel in the Lab color space blob histogram
- blob_a_max: maximum of the a channel in the Lab color space blob histogram
- blob_b_mode: mode of the b channel in the Lab color space blob histogram
- blob_b_min: minimum of the b channel in the Lab color space blob histogram
- blob_b_max: maximum of the b channel in the Lab color space blob histogram
- columns “board” until “hymenoptera2”: classification confidence scores of the corresponding classes/labels
- image_x: X coordinate of the top left corner of the blob bounding rectangle (ROI) that was classified
- image_y: Y coordinate of the top left corner of the blob bounding rectangle (ROI) that was classified
- image_width: width in pixels of the blob bounding rectangle (ROI) that was classified
- image_height: height in pixels of the blob bounding rectangle (ROI) that was classified
- target: actually detected target, as determined by visual inspection
- unclear: whether the target type is unclear (1) or clear (blank)
- comments: manually written comments upon visual inspection
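The four blob corner columns can be reduced to an axis-aligned rectangle, which is convenient for cropping or plotting. The Python sketch below shows that computation. It is illustrative only and makes no claim that the result matches the image_x, image_y, image_width, and image_height values logged by the camera; the function name and the coordinates in the usage note are hypothetical.

```python
def bounding_rect(corners):
    """Return (x, y, width, height) enclosing a list of (x, y) corner points."""
    if not corners:
        raise ValueError("at least one corner is required")
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    left, top = min(xs), min(ys)
    # Width and height are measured between the extreme coordinates.
    return left, top, max(xs) - left, max(ys) - top
```

For instance, the made-up corners (10, 20), (50, 22), (48, 60), and (8, 58) give a rectangle starting at (8, 20) that is 42 pixels wide and 40 pixels high.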
File: C_-_ALL_images.csv
Description: Comma-separated file containing pictures information logged by the camera over all deployment dates.
Variables
- camera: ID of the device
- picture_id: incremental integer for the pictures, restarts on each date
- date: date in D/M/Y format
- time: time in H:M:S format
- exposure_us: exposure time in microseconds
- gain_dB: digital gain applied by the image sensor to brighten the image (in decibels)
- frames_per_second: picture-taking rate of the camera
- image_type: whether the saved image was used as a reference to subtract the current image from
- confidence: maximum confidence score of any class detected in the image, if above wireless transfer score threshold
- wifi_connected: whether the WLAN module has successfully connected to the 4G modem. Blank if not attempted.
- data_transfer: whether the confidence score and label were successfully transferred. Blank if not attempted.
- file_transfer: whether the image file was successfully transferred. Blank if not attempted.
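Slash-separated dates are ordered differently across the use cases (Y/M/D in A, M/D/Y in B, D/M/Y in C), so the field order has to be stated explicitly when merging files. A minimal Python sketch follows; the function name and the order codes are hypothetical, and a four-digit year is assumed.

```python
from datetime import date

# Positions of (year, month, day) within the split string for each layout.
_FIELD_ORDER = {
    "YMD": (0, 1, 2),
    "MDY": (2, 0, 1),
    "DMY": (2, 1, 0),
}


def parse_slash_date(text, order):
    """Parse a slash-separated date given its field order ('YMD', 'MDY', 'DMY')."""
    parts = [int(piece) for piece in text.strip().split("/")]
    if len(parts) != 3:
        raise ValueError(f"expected three fields: {text!r}")
    year_i, month_i, day_i = _FIELD_ORDER[order]
    return date(parts[year_i], parts[month_i], parts[day_i])
```

Passing the order per file avoids silently swapping day and month for dates such as 5/4/2023.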
File: C_-_detection_statistics mean.csv
Description: Comma-separated file generated by “C - precision-recall rice boards.R” script containing performance information for the ROI classification.
Variables
- image_label: classification label
- Best confidence threshold: threshold at which the accuracy (i.e., F1-score) is maximal
- Detections: how many detections are generated at this threshold
- False positives: number of false positive detections
- False negatives: number of false negative detections
- True positives: number of true positive detections
- Recall: proportion of the real events (as determined by visual inspection) that are true positives
- Precision: proportion of the detections that are true positives
- max F1 score: maximum accuracy for the given confidence threshold
- Real events: number of times the target with the given label has been detected by visual inspection
- Correction factor: factor to apply to the number of detections to find the real events
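The "Best confidence threshold" column can be read as the outcome of a threshold sweep: for each candidate threshold, only detections at or above it are kept, and the threshold with the highest F1-score is retained. The Python sketch below illustrates that idea. It is not the R script from the ZIP, and its function name, input layout, and tie-breaking rule (the first threshold wins) are assumptions.

```python
def best_f1_threshold(scored, real_events, thresholds):
    """Return (threshold, f1) maximising F1 over the candidate thresholds.

    scored      -- list of (confidence, is_true_target) pairs, one per detection
    real_events -- number of true events found by visual inspection
    """
    best_threshold, best_f1 = None, -1.0
    for threshold in thresholds:
        kept = [is_target for conf, is_target in scored if conf >= threshold]
        true_pos = sum(kept)
        false_pos = len(kept) - true_pos
        false_neg = real_events - true_pos   # events missed at this threshold
        denom = 2 * true_pos + false_pos + false_neg
        f1 = 2 * true_pos / denom if denom else 0.0
        if f1 > best_f1:                     # strict: first best threshold wins
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1
```

Raising the threshold removes false positives but also turns low-confidence true detections into false negatives, which is why the F1-score peaks at an intermediate value.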
File: D_-_ALL_detections.csv
Description: Comma-separated file containing detections logged by the camera over all deployment dates. The date column was added manually when merging the data from the different dates. For visually inspected images, new rows were inserted manually to represent false negative detections; their detection ID has a suffixed lowercase letter.
Variables
- date_time: date and time in YYYY-M-D-H-M-S format
- cage_camera: ID of the device and cage where it was set up
- cage: ID of the cage where the device was set up
- camera: ID of the device
- detection_id: incremental integer for the detections, restarts on each date.
- picture_id: incremental integer for the pictures, restarts on each date
- image_labels: classification label
- image_confidences: classification confidence score
- image_x: X coordinate of the top left corner of the blob bounding rectangle (ROI) that was classified
- image_y: Y coordinate of the top left corner of the blob bounding rectangle (ROI) that was classified
- image_width: width in pixels of the blob bounding rectangle (ROI) that was classified
- image_height: height in pixels of the blob bounding rectangle (ROI) that was classified
- target: whether the label was detected (1) or not (0), as determined by visual inspection
- unclear: whether the target type is unclear (1) or clear (blank)
- comments: manually written comments upon visual inspection
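When computing performance statistics, rows flagged as unclear may need to be set aside. The Python sketch below reads CSV text with the standard csv module and drops rows whose unclear column equals 1. Whether the published analysis excludes such rows is not stated here, so this is only an illustration; the function name and the sample rows in the usage note are hypothetical.

```python
import csv
import io


def load_clear_detections(csv_text):
    """Return detection rows (as dicts) whose 'unclear' column is not '1'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # A blank 'unclear' value means the target type was clear.
    return [row for row in reader if (row.get("unclear") or "").strip() != "1"]
```

A usage example with made-up rows:

```python
sample = (
    "detection_id,image_labels,image_confidences,target,unclear\n"
    "1,bee,0.91,1,\n"
    "2,bee,0.40,0,1\n"
)
clear_rows = load_clear_detections(sample)  # keeps only detection 1
```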
File: D_-_ALL_images.csv
Description: Comma-separated file containing pictures information logged by the camera over all deployment dates.
Variables
- cage: ID of the cage where the device was set up
- camera: ID of the device
- picture_id: incremental integer for the pictures, restarts on each date
- date_time: date and time in YYYY-M-D-H-M-S format
- exposure_us: exposure time in microseconds
- gain_dB: digital gain applied by the image sensor to brighten the image (in decibels)
- frames_per_second: picture-taking rate of the camera
- comments: manual comments
- day_monitoring: manual comments on whether the corresponding day was completely sampled by the camera
File: D_-_detection_statistics group.csv
Description: Comma-separated file generated by “D - precision-recall rapeseed bees.R” script containing performance information for the ROI classification.
Variables
- image_label: classification label
- Best confidence threshold: threshold at which the accuracy (i.e., F1-score) is maximal
- Detections: how many detections are generated at this threshold
- False positives: number of false positive detections
- False negatives: number of false negative detections
- True positives: number of true positive detections
- Recall: proportion of the real events (as determined by visual inspection) that are true positives
- Precision: proportion of the detections that are true positives
- max F1 score: maximum accuracy for the given confidence threshold
- Real events: number of times the target with the given label has been detected by visual inspection
- Correction factor: factor to apply to the number of detections to find the real events
File: D_-_weather_data_timeanddate_customdata.csv
Description: Weather data from local Chinese meteorological service station (from timeanddate.com)
Variables
- Date: date in Y-M-D format
- hour: time in H:M:S format
- air_temperature_C: temperature of the air in degrees Celsius
- weather: sky conditions
- wind_km_h: wind speed in kilometers per hour
- humidity_percent: percent humidity
- barometer: pressure in Pascals
File: E_-_ALL_detections.csv
Description: Comma-separated file containing detections logged by the camera over all deployment dates. The date column was added manually when merging the data from the different dates. For visually inspected images, new rows were inserted manually to represent false negative detections; their detection ID has a suffixed lowercase letter.
Variables
- camera: ID of the device
- deployment_date: date of deployment in Y-M-D format
- detection_id: incremental integer for the detections, restarts on each date.
- picture_id: incremental integer for the pictures, restarts on each date
- image_labels: classification label
- image_confidences: classification confidence score
- image_x: X coordinate of the top left corner of the blob bounding rectangle (ROI) that was classified
- image_y: Y coordinate of the top left corner of the blob bounding rectangle (ROI) that was classified
- image_width: width in pixels of the blob bounding rectangle (ROI) that was classified
- image_height: height in pixels of the blob bounding rectangle (ROI) that was classified
- target: actually detected target, as determined by visual inspection
- unclear: whether the target type is unclear (1) or clear (blank)
- comments: manually written comments upon visual inspection
File: E_-_ALL_images.csv
Description: Comma-separated file containing pictures information logged by the camera over all deployment dates.
Variables
- camera: ID of the device
- picture_id: incremental integer for the pictures, restarts on each date
- date_time: date and time in YYYY-M-D-H-M-S format
- exposure_us: exposure time in microseconds
- gain_dB: digital gain applied by the image sensor to brighten the image (in decibels)
- frames_per_second: picture-taking rate of the camera
- image_confidence: maximum confidence score of any class detected in the image, if above wireless transfer score threshold
- wifi_connected: whether the WLAN module has successfully connected to the 4G modem. Blank if not attempted.
- data_transfer: whether the confidence score and label were successfully transferred. Blank if not attempted.
- file_transfer: whether the image file was successfully transferred. Blank if not attempted.
File: E_-_detection_statistics_mean.csv
Description: Comma-separated file generated by “E - precision-recall waterbirds.R” script containing performance information for the ROI classification.
Variables
- image_label: classification label
- Best confidence threshold: threshold at which the accuracy (i.e., F1-score) is maximal
- Detections: how many detections are generated at this threshold
- False positives: number of false positive detections
- False negatives: number of false negative detections
- True positives: number of true positive detections
- Recall: proportion of the real events (as determined by visual inspection) that are true positives
- Precision: proportion of the detections that are true positives
- max F1 score: maximum accuracy for the given confidence threshold
- Real events: number of times the target with the given label has been detected by visual inspection
- Correction factor: factor to apply to the number of detections to find the real events
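The correction factor is described as the value to apply to the number of detections to find the real events. A minimal Python sketch of applying it to detection counts per time bin follows. It assumes the factor is multiplicative and constant across bins; the function name and the bin keys in the usage note are hypothetical.

```python
def corrected_counts(detections_per_bin, correction_factor):
    """Scale raw detection counts per bin into estimated real-event counts."""
    # Assumption: estimated real events = detections * correction factor.
    return {
        bin_key: count * correction_factor
        for bin_key, count in detections_per_bin.items()
    }
```

For example, with a factor of 1.5, made-up counts of 10 and 4 detections on two dates become estimates of 15 and 6 real events.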
File: F_-_ALL_detections.csv
Description: Comma-separated file containing detections logged by the camera over all deployment dates. The date column was added manually when merging the data from the different dates. For visually inspected images, new rows were inserted manually to represent false negative detections; their detection ID has a suffixed lowercase letter.
Variables
- detection_id: incremental integer for the detections, restarts on each date.
- camera: ID of the device
- deployment: date of deployment in Y-M-D format
- picture_id: incremental integer for the pictures, restarts on each date
- image_labels: classification label
- image_confidences: classification confidence score
- image_x: X coordinate of the top left corner of the blob bounding rectangle (ROI) that was classified
- image_y: Y coordinate of the top left corner of the blob bounding rectangle (ROI) that was classified
- image_width: width in pixels of the blob bounding rectangle (ROI) that was classified
- image_height: height in pixels of the blob bounding rectangle (ROI) that was classified
- target: whether the label was detected (1) or not (0), as determined by visual inspection
- unclear: whether the target type is unclear (1) or clear (blank)
- comments: manually written comments upon visual inspection
File: F_-_ALL_images.csv
Description: Comma-separated file containing pictures information logged by the camera over all deployment dates.
Variables
- camera: ID of the device
- deployment: incremental integer marking deployments of the same fixed camera, separated by data retrieval
- picture_id: incremental integer for the pictures, restarts on each date
- date_time: date and time in YYYY-M-D-H-M-S format
- exposure_us: exposure time in microseconds
- gain_dB: digital gain applied by the image sensor to brighten the image (in decibels)
- frames_per_second: picture-taking rate of the camera
File: F_-_detection_statistics group.csv
Description:
Variables
- image_label: classification label
- Best confidence threshold: threshold at which the accuracy (i.e., F1-score) is maximal
- Detections: how many detections are generated at this threshold
- False positives: number of false positive detections
- False negatives: number of false negative detections
- True positives: number of true positive detections
- Recall: proportion of the real events (as determined by visual inspection) that are true positives
- Precision: proportion of the detections that are true positives
- max F1 score: maximum accuracy for the given confidence threshold
- Real events: number of times the target with the given label has been detected by visual inspection
- Correction factor: factor to apply to the number of detections to find the real events
Code/software
R version 4.4.1 (2024-06-14) -- "Race for Your Life"
packages:
- cowplot 1.1.3
- data.table 1.16
- DHARMa 0.4.6
- ggeffects 1.7.1
- ggplot2 3.5.1
- ggrepel 0.9.6
- lme4 1.1-35.3
- scales 1.3.0
- stringr 1.5.1
Please refer to the paper's Materials & Methods section and its Supplementary Information.
