UAV cotton flower counting dataset
Data files
Feb 05, 2025 version files (2.65 GB total)
- aerial_dataset.zip (2.57 GB)
- aerial_flower_counting-feature-plot-count-analysis.zip (26.68 MB)
- Auto-Count_Results.xlsx (188.93 KB)
- IHF_NILs_Genotype_Information.xlsx (55.12 KB)
- README.md (3.13 KB)
- yolov8m_flower_active_9.pt (52.02 MB)
Abstract
Many perennial plants make important contributions to agroeconomies and agroecosystems, but have complex architecture and/or long flowering duration that hinders measurement and selection. Iteratively tracking productivity over a long flowering/fruiting season may permit the identification of genetic factors conferring different reproductive strategies that might be successful in different environments, ranging from rapid early maturation that avoids stresses, to late maturation that utilizes the full seasonal duration to maximize productivity. In cotton, a perennial plant that is generally cultivated as an annual crop, we apply aerial imagery and deep learning methods to novel and stable genetic stocks, identifying genetic factors influencing the duration and rate of fruiting. While these factors may have different relationships with crop productivity and quality in different environments, their determination adds potentially important information to breeding decisions. With transfer learning of the deep learning models, this approach could be applied widely, potentially improving gains from selection in diverse perennial shrubs and trees essential to sustainable agricultural intensification.
README: UAV Cotton Flower Counting Dataset
https://doi.org/10.5061/dryad.5qfttdzhb
Description of the data and file structure
These data were collected with a UAV at a cotton breeding field in Watkinsville, Georgia in 2021. The field was scanned twice weekly, and the data were analyzed using an automated pipeline:
- The raw images were stitched together into an orthophoto
- Individual plot-level crops were extracted
- An object detector was used to detect flowers in the plot images
- The flower counts were analyzed in order to produce various phenotyping metrics
This dataset contains the raw, annotated images used to train the object detector. In addition to the data we collected in 2021, it includes UAV images from previous years (2016, 2018) and some plot-level images collected from a tractor. All images contain bounding box annotations for each flower.
We have also included a model trained on this dataset, and the raw flower counts extracted by this model on our complete 2021 dataset.
Files and variables
File: aerial_dataset.zip
Description: The annotated flower detection dataset, in YOLO format. The sub-folders contain data from different sessions:
- flower0x: These sessions contain ground data collected using a tractor-mounted platform. They were used to bootstrap the active learning process.
- active_x: These contain the data added in each subsequent round of active learning. They are drawn from the entire set of data collected in 2021.
- YYYY-MM-DD: These contain data collected during previous seasons. They are used for validation.
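As a rough illustration of how the annotations might be read, the sketch below assumes the common YOLO label layout: one text file per image, one line per box, written as `class x_center y_center width height` with coordinates normalized to the image size. The exact folder layout inside the archive is not described here, and the label line and image size in the example are made up.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    """A bounding box in pixel coordinates (corner format)."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_line(line: str, image_width: int, image_height: int) -> PixelBox:
    """Convert one normalized YOLO label line to pixel corner coordinates."""
    class_id, x_center, y_center, width, height = line.split()
    box_w = float(width) * image_width
    box_h = float(height) * image_height
    center_x = float(x_center) * image_width
    center_y = float(y_center) * image_height
    return PixelBox(
        int(class_id),
        center_x - box_w / 2,
        center_y - box_h / 2,
        center_x + box_w / 2,
        center_y + box_h / 2,
    )


def count_boxes(label_text: str) -> int:
    """Count annotated objects in a label file (one non-empty line per box)."""
    return sum(1 for line in label_text.splitlines() if line.strip())


# Made-up label line and image size, for illustration only.
box = parse_yolo_line("0 0.5 0.5 0.25 0.5", image_width=1000, image_height=500)
```

Because every box in this dataset marks a flower, `count_boxes` applied to a label file would give the annotated flower count for that image, under the layout assumption above.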
File: IHF_NILs_Genotype_Information.xlsx
Description: List of which plot numbers in the field were planted with which genotype.
Variables
- SN: Index
- Genotype: Unique identifier for the genotype. Green highlights denote check genotypes, which are not NILs but instead standard genotypes that we planted for comparison.
- 2021 Identifier #: The plot number in the field where this genotype was planted
- Population: The population that this genotype belongs to
File: Auto-Count_Results.xlsx
Description: Per-plot flower counts extracted by our trained model on the data we collected in 2021.
Variables
- Plot: The plot number in the field that the counts are for
- 2021-XX-XX: The automatic flower count for this plot on this day.
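The wide layout described above (one row per plot, one column per session date) can be turned into a per-plot time series with a few lines of code. The following is a minimal sketch that assumes each row has already been read into a dictionary and that the date columns are ISO-formatted strings; a spreadsheet reader may return the headers as date objects instead, in which case the parsing step would need adjusting. The counts shown are made up, and the summary values are illustrative only, not the phenotyping metrics used in the study.

```python
from datetime import date


def plot_time_series(row: dict) -> list[tuple[date, int]]:
    """Return (date, count) pairs for one plot, sorted by date.

    Every key other than "Plot" is assumed to be an ISO date string.
    Empty cells (None) are skipped.
    """
    series = [
        (date.fromisoformat(key), int(value))
        for key, value in row.items()
        if key != "Plot" and value is not None
    ]
    return sorted(series)


def summarize(row: dict) -> dict:
    """Illustrative summary: summed counts and the session with the most flowers."""
    series = plot_time_series(row)
    peak_date, peak_count = max(series, key=lambda pair: pair[1])
    return {
        "plot": row["Plot"],
        "summed_count": sum(count for _, count in series),
        "peak_date": peak_date,
        "peak_count": peak_count,
    }


# Made-up row, for illustration only.
example_row = {"Plot": 101, "2021-08-09": 3, "2021-08-12": 7, "2021-08-16": 5}
summary = summarize(example_row)
```

Note that `summed_count` adds up the counts on the sampled days only; it is not an estimate of the total number of flowers a plot produced over the season.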
File: yolov8m_flower_active_9.pt
Description: YOLOv8 model weights trained on the provided flower detection dataset.
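One possible way to use these weights is through the `ultralytics` Python package, which loads YOLOv8 checkpoints. The sketch below is an assumption about usage, not the pipeline from the study: the confidence threshold of 0.25 is simply the package default, the image path is hypothetical, and every predicted box is counted as a flower.

```python
def load_detector(weights_path: str = "yolov8m_flower_active_9.pt"):
    """Load the YOLOv8 weights with the ultralytics package (must be installed)."""
    # Imported lazily so count_flowers() below has no hard dependency.
    from ultralytics import YOLO

    return YOLO(weights_path)


def count_flowers(model, image_path: str, confidence: float = 0.25) -> int:
    """Run detection on one image and return the number of predicted boxes.

    Assumes every predicted box is a flower; the threshold is an
    illustrative default, not the value used to produce the published counts.
    """
    results = model.predict(image_path, conf=confidence, verbose=False)
    return len(results[0].boxes)


# Example usage (hypothetical image path):
# model = load_detector()
# print(count_flowers(model, "plot_image.jpg"))
```

Counts produced this way may differ from those in Auto-Count_Results.xlsx if the threshold or image preprocessing differs from what was used in the original analysis.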
File: aerial_flower_counting-feature-plot-count-analysis.zip
Description: Archive of our Git repository, containing the analysis software that we used.
Code/software
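The code we used for data analysis is available on GitHub. An archive of the repository is included in this dataset as aerial_flower_counting-feature-plot-count-analysis.zip.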
Access information
Other publicly accessible locations of the data:
- None
Data was derived from the following sources:
- None
Methods
Data were collected twice a week from 2021-08-09 through 2021-11-05. Some sessions were skipped due to inclement weather, resulting in a total of 23 sessions. Data collection was halted after the first overnight freeze, after which most of the plants showed a significant drop in the number of flowers produced. Images of the field were collected using a Matrice 100 drone (DJI, Shenzhen, China) fitted with a custom mount and equipped with a Lumix G7 camera (Panasonic Corporation of North America, Newark, N.J., USA) and a 17 mm lens. The drone was flown at a height of 15 meters, resulting in a ground sampling distance (GSD) of 0.23 cm/px. In a few cases, technical issues with the Matrice 100 data required the substitution of equivalent data from a DJI Phantom 4 Pro v2 drone. The images were geo-referenced using six ground control points distributed throughout the field, with their positions measured using a real-time kinematic (RTK) GPS.