Data from: Automating field based floral surveys with machine learning
Data files
Sep 19, 2024 version files 15.48 GB
- external_val.7z (2.61 KB)
- image.7z (15.23 GB)
- model.7z (208.62 MB)
- predict.7z (45.75 MB)
- README.md (5.98 KB)
Abstract
The abundance and diversity of flowering plant species are important indicators of pollinator habitat quality, but traditional field-based surveying techniques are time-intensive. Therefore, they are often biased due to under-sampling and are difficult to scale. Aerial photography was collected across ten sites located in and around Rouge National Urban Park, Toronto, Canada using a consumer-grade drone. A convolutional neural network (CNN) was trained to semantically segment, or identify and categorize, pixel clusters which represent flowers in the collected aerial imagery. Specifically, flowers of the dominant taxa found in the depauperate fall flowering plant community were surveyed. This included yellow flowering Solidago spp., white Symphyotrichum ericoides/lanceolatum and purple Symphyotrichum novae-angliae. The CNN was trained using 930 m² of manually annotated data, approximately 1% of the mapped landscape. The trained CNN was tested on 20% of the manually annotated data concealed during training. In addition, it was externally validated by comparing the predicted drone-derived floral abundance metrics (i.e., floral area (m²) and the number of floral segments) to the field-based count of floral units estimated for thirty-four 4 m² plots. The CNN returned accurate multi-classification when evaluated against the testing data. It obtained a precision score of 0.769, a recall of 0.849 and an F1 score of 0.807. The automated floral abundance counting yielded estimates that were strongly correlated with field-based manual counting. In addition, flower segmentation using the trained CNN was time efficient. On average, it took roughly the same amount of time to segment the flowers occurring in an entire drone scene as it took to complete the abundance count of a single quadrat. However, the training process, particularly manual data annotation, was the most time-consuming component of the study.
Overall, the analysis provided valuable insights into automated flower classification and abundance estimation using drone imagery and machine learning. The results demonstrate that these tools can be used to provide accurate and scalable estimates of pollinator habitat quality. Further research should consider diverse wildflower systems to develop the generalizability of the methods.
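As a quick consistency check on the scores quoted in the abstract, the F1 score is the harmonic mean of precision and recall. The short Python sketch below recomputes it from the two reported values. The function name `f1_score` is illustrative only and is not part of the code released with the dataset.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
reported_f1 = f1_score(0.769, 0.849)
print(f"{reported_f1:.3f}")  # 0.807
```

The result agrees with the reported F1 score of 0.807 to three decimal places.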
https://doi.org/10.5061/dryad.nvx0k6f1t
This repository contains:
(1) Orthorectified drone images of the ten study sites located in Rouge National Urban Park, Ontario, Canada. The imagery was collected at low altitude (7 m, 15 m, or 30 m) in September 2021 with a DJI Phantom 4 Pro V2.
(2) Floral classification maps predicted by the trained convolutional neural network (CNN) that is described in the paper. The CNN was trained to perform multi-classification of the three flower taxa that dominate the fall flowering landscape in the region.
(3) The trained CNN model that performs the multi-classification on input drone imagery.
(4) A tabulation of the plot-level floral surveys that were used to ground-truth the CNN model.
The code is provided as supporting data for
Sookhan N, Sookhan S, Grewal D, MacIvor JS. 2024. Automating field-based floral surveys with machine learning. Ecological Solutions and Evidence.
Description of the data and file structure
Files and variables
File: image.7z
Description: Drone orthomosaic rasters of each study site in GeoTIFF format. Compressed folder contains 13 files.
- 20210907_siteCF_h15m.7z: Compressed drone orthomosaic of site CF collected at 15 m altitude above ground level.
- 20210907_siteCF_h30m.7z: Compressed drone orthomosaic of site CF collected at 30 m altitude above ground level.
- 20210913_siteA_h7m.7z: Compressed drone orthomosaic of site A collected at 7 m altitude above ground level.
- 20210913_siteA_h15m.7z: Compressed drone orthomosaic of site A collected at 15 m altitude above ground level.
- 20210913_siteA_h30m.7z: Compressed drone orthomosaic of sites A, 1, and 2 collected at 30 m altitude above ground level.
- 20210913_site2_h15m.7z: Compressed drone orthomosaic of site 2 collected at 15 m altitude above ground level.
- 20210906_site5_h7m.7z: Compressed drone orthomosaic of site 5 collected at 7 m altitude above ground level.
- 20210907_site6_h15m.7z: Compressed drone orthomosaic of site 6 collected at 15 m altitude above ground level.
- 20210907_site6_h30m.7z: Compressed drone orthomosaic of site 6 collected at 30 m altitude above ground level.
- 20210905_site8_h7m.7z: Compressed drone orthomosaic of site 8 collected at 7 m altitude above ground level.
- 20210905_site9_h7m.7z: Compressed drone orthomosaic of site 9 collected at 7 m altitude above ground level.
- 20210905_site10_h7m.7z: Compressed drone orthomosaic of site 10 collected at 7 m altitude above ground level.
- 20210907_site15_h15m.7z: Compressed drone orthomosaic of site 15 collected at 15 m altitude above ground level.
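The archive names listed above appear to follow the pattern `<YYYYMMDD>_site<ID>_h<altitude>m.7z`. The Python sketch below splits a name into its parts, which can help when batch-processing scenes. The pattern is inferred from the listing rather than documented by the authors, reading the leading date as the flight date is an assumption, and `SceneInfo` and `parse_scene_name` are hypothetical names.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime

# Pattern inferred from the file listing: <YYYYMMDD>_site<ID>_h<altitude>m.7z
_NAME_RE = re.compile(r"^(\d{8})_site([A-Za-z0-9]+)_h(\d+)m\.7z$")


@dataclass(frozen=True)
class SceneInfo:
    flight_date: date   # assumed to be the acquisition date
    site_id: str        # e.g. "CF", "A", "10"
    altitude_m: int     # flight altitude above ground level, in metres


def parse_scene_name(filename: str) -> SceneInfo:
    """Split an archive name into date, site ID, and altitude."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename!r}")
    ymd, site_id, altitude = match.groups()
    return SceneInfo(datetime.strptime(ymd, "%Y%m%d").date(), site_id, int(altitude))


info = parse_scene_name("20210913_siteA_h15m.7z")
print(info.flight_date, info.site_id, info.altitude_m)  # 2021-09-13 A 15
```

Names that do not match the inferred pattern raise a `ValueError` rather than being silently skipped.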
File: predict.7z
Description: Floral classification maps predicted by the trained convolutional neural network for each study site in 8-bit GeoTIFF format. Compressed folder contains 13 files.
- 20210907_siteCF_h15m.7z: Compressed floral classification map of site CF collected at 15 m altitude above ground level.
- 20210907_siteCF_h30m.7z: Compressed floral classification map of site CF collected at 30 m altitude above ground level.
- 20210913_siteA_h7m.7z: Compressed floral classification map of site A collected at 7 m altitude above ground level.
- 20210913_siteA_h15m.7z: Compressed floral classification map of site A collected at 15 m altitude above ground level.
- 20210913_siteA_h30m.7z: Compressed floral classification map of sites A, 1, and 2 collected at 30 m altitude above ground level.
- 20210913_site2_h15m.7z: Compressed floral classification map of site 2 collected at 15 m altitude above ground level.
- 20210906_site5_h7m.7z: Compressed floral classification map of site 5 collected at 7 m altitude above ground level.
- 20210907_site6_h15m.7z: Compressed floral classification map of site 6 collected at 15 m altitude above ground level.
- 20210907_site6_h30m.7z: Compressed floral classification map of site 6 collected at 30 m altitude above ground level.
- 20210905_site8_h7m.7z: Compressed floral classification map of site 8 collected at 7 m altitude above ground level.
- 20210905_site9_h7m.7z: Compressed floral classification map of site 9 collected at 7 m altitude above ground level.
- 20210905_site10_h7m.7z: Compressed floral classification map of site 10 collected at 7 m altitude above ground level.
- 20210907_site15_h15m.7z: Compressed floral classification map of site 15 collected at 15 m altitude above ground level.
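The two drone-derived abundance metrics named in the abstract, floral area and the number of floral segments, can be illustrated on a toy class grid. The Python sketch below counts the pixels of one class and groups them into 4-connected segments. It is an illustration under stated assumptions, not the authors' procedure: the class codes (0 for background, 1 and 2 for taxa), the pixel size, and the choice of 4-connectivity are all hypothetical, and reading the actual GeoTIFF would require a raster library that is not shown here.

```python
from collections import deque


def floral_metrics(class_map, target_class, pixel_size_m):
    """Return (area_m2, n_segments) for one class in a 2-D grid of class codes."""
    rows = len(class_map)
    cols = len(class_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    n_pixels = 0
    n_segments = 0
    for r in range(rows):
        for c in range(cols):
            if class_map[r][c] != target_class or seen[r][c]:
                continue
            # Unvisited pixel of the target class: start a new segment.
            n_segments += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                n_pixels += 1
                # Visit the four edge-sharing neighbours (4-connectivity).
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and class_map[ny][nx] == target_class):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return n_pixels * pixel_size_m ** 2, n_segments


# Toy grid with hypothetical class codes; not taken from the dataset.
toy = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [2, 2, 0, 1],
]
print(floral_metrics(toy, target_class=1, pixel_size_m=0.5))  # (1.25, 2)
```

Class 1 covers five pixels in two separate clusters, so with a hypothetical 0.5 m pixel the sketch reports 1.25 m² in 2 segments.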
File: model.7z
Description: The trained TensorFlow model in HDF5 format.
File: external_val.7z
Description: Floral counts completed at the plot level that were used to externally validate the automated drone-derived floral abundance predicted by the trained convolutional model. In CSV format.
- date: Date (YYYY-MM-DD) that plot was surveyed.
- time_start: Time (HH:MM) survey commenced.
- time_end: Time (HH:MM) survey was completed.
- site_id: Site ID of plot.
- plot_id: Plot ID of plot.
- total.area: Drone-derived floral abundance measure. CNN predicted floral cover.
- prop.area: Drone-derived floral abundance measure. CNN predicted relative floral cover (relative to plot area).
- fu: Field-derived floral abundance measure. Field count of floral units.
- shoot: Field-derived floral abundance measure. Field count of number of flowering shoots.
For each of the drone-derived and field-derived floral abundance measures, a separate column is provided for each of the species detected in the study. The species are Solidago spp. (SOSP), Achillea millefolium (ACME), Symphyotrichum ericoides (SYER), Symphyotrichum novae-angliae (SYNO), Symphyotrichum lanceolatum (SYLA), Lotus corniculatus (LOCO), Trifolium repens (TRRE), and Medicago lupulina (MELU).
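Because each abundance measure is split into one column per species, it can be convenient to collect the columns belonging to a single measure before summing or comparing them. The Python sketch below does this with the standard `csv` module. The exact column naming in the released CSV is not stated here, so the `measure_CODE` headers in the toy input (for example `fu_SOSP`) are an assumption, the toy values are invented purely for illustration, and `columns_for_measure` is a hypothetical helper.

```python
import csv
import io

# Species codes as listed in the description above.
SPECIES_CODES = ("SOSP", "ACME", "SYER", "SYNO", "SYLA", "LOCO", "TRRE", "MELU")


def columns_for_measure(fieldnames, measure):
    """Map species code -> column name for one abundance measure.

    Assumes a column starts with the measure name and ends with the
    species code; adjust to the real header if it differs.
    """
    found = {}
    for name in fieldnames:
        if not name.startswith(measure):
            continue
        for code in SPECIES_CODES:
            if name.upper().endswith(code):
                found[code] = name
    return found


# Toy input with an assumed header layout and made-up values.
toy_csv = io.StringIO(
    "site_id,plot_id,total.area_SOSP,fu_SOSP,fu_SYNO\n"
    "A,1,0.5,12,3\n"
)
reader = csv.DictReader(toy_csv)
rows = list(reader)

fu_cols = columns_for_measure(reader.fieldnames, "fu")
print(fu_cols)  # {'SOSP': 'fu_SOSP', 'SYNO': 'fu_SYNO'}

# Total field-counted floral units in the first toy plot, across species.
total_fu = sum(float(rows[0][col]) for col in fu_cols.values())
print(total_fu)  # 15.0
```

To use the sketch on the real file, replace `toy_csv` with an open file handle for the extracted CSV and check the header against the assumed naming first.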
Orthorectified imagery of the study sites was constructed using data from a drone image acquisition program completed in Rouge National Urban Park, Ontario, Canada during the late summer of 2021. These data represent typical late-season flowering landscapes of remnant habitat patches found in Southern Ontario, Canada.
The major flowering plant groups (i.e., Solidago spp. and Symphyotrichum spp.) were automatically mapped using the convolutional neural network model trained in this study.