Data from: Deployment and analysis of instance segmentation algorithm for in-field yield estimation of sweet potatoes
Data files (Jan 16, 2026 version; 8.75 GB total)
- Dryad_SweetAPPS_Data.zip (1.49 GB)
- Image_Datasets.zip (6.94 GB)
- model_final.zip (327.70 MB)
- README.md (9.11 KB)
Abstract
Shape estimation of sweetpotato (SP) storage roots is inherently challenging due to their varied size and shape characteristics. Even measuring “simple” metrics, such as length and diameter, requires significant time investment either directly in-field or afterward using automated graders. We present the results of a model that can perform grading and provide yield estimates directly in the field faster than manual measurements. Detectron2, a library of deep-learning object detection algorithms, was used to implement Mask R-CNN, an instance segmentation model. This model was deployed for in-field grade estimation of SP roots and evaluated against an optical sorter. Roots from various clones, imaged with a cellphone during trials between 2019 and 2020, were used in the model’s training and validation to fine-tune a model to detect SP roots. Our results showed that the model (Average Precision = 74.1) could distinguish individual roots under varying environmental conditions, including differences in lighting and soil characteristics. Root mean square errors (RMSE) for length, diameter, and weight from the model, compared to a commercial optical sorter, were 0.66 cm, 1.22 cm, and 74.73 g, respectively, while the RMSE of root counts per plot was 5.27 roots, with R^2 = 0.8. This phenotyping strategy has the potential to enable rapid yield estimates in the field without the need for sophisticated and costly sorters and may be more readily deployed in environments with limited access to these resources or facilities.
Dataset DOI: 10.5061/dryad.wh70rxx0z
Description of the data and file structure
This archive contains data presented in our paper, including the masks and imagery used to train our Mask R-CNN sweetpotato instance segmentation model, and data used to validate it (plot-level dataset, one-to-one dataset, and the commercial optical sorter baseline dataset).
Files and variables
File: Dryad_SweetAPPS_Data.zip
Description: Contains the contents of the one-to-one dataset, plot-level dataset, and the sorter baseline data. Decompressing this file will create the following directories:
- 3D_set: Monte-Carlo simulation data. Contains a Google Colab file, “process_data.ipynb”, that was used to generate the figures and analyses used for the free-space, plane, and roller data. Monte-Carlo datasets that were used are contained in the folders “free_space”, “plane”, and “roller” and are .txt files that are formatted as comma-separated value (CSV) data.
- Cellphone_Images: One-to-one dataset. Contains folders, numbered 1-30, that contain the images used for the one-to-one dataset. Each folder contains two images: one of the roots with the labels face-down, and another with the labels face-up, so that each sweetpotato can be associated with its corresponding measurement on the Exeter (commercial) sorter. Additionally, each folder contains a file "pair_dict.xlsx" with a column "region", giving the Mask R-CNN region index for each sweetpotato, plus one column per image (e.g., for the folder "10" the file contains columns "region", "IMG_7172", and "IMG_7175"). Each region has an index that refers back to the corresponding entry in the Exeter one-to-one dataset (see the Exeter folder below), allowing each region in each image to be matched (paired) to the Exeter dataset on an instance-by-instance basis.
- Exeter: One-to-one dataset. Contains CSV and Excel (.xlsx) spreadsheets with the individual sorter measurements for length, width, and weight. Note that length and width are measured in 1/100ths of an inch (e.g., 200 would be 2 inches). A folder “Images” contains the two-view images from the Exeter sorter. The file manual.xlsx contains:
- plotname: ID number associated with each individual instance. Note: manual.xlsx contains this number and is used by the script, while manual.csv contains the original image names for traceability back to the original image files.
- estimateddiameter: Diameter in 1/100ths of an inch.
- estimatedweight: Weight in 10ths of ounces (e.g., 358 would be 35.8 oz).
- estimatedlength: Length in 1/100ths of an inch.
- Mask_RCNN_Calibrated: One-to-one dataset. Contains the Mask R-CNN processed data for the images in the folder “Cellphone_Images,” which was used to establish the length, width, and weight estimates for the one-to-one dataset. This folder contains many sub-folders named after images from the “Cellphone_Images” folders (e.g., IMG_7150). Within each sub-folder are a qualitative image from Mask R-CNN showing the outline of each detected mask, a zip file containing all the individual binary masks (numbered by region), and a CSV file (e.g., csv_IMG_7150) with columns:
- region. An integer denoting which region (mask ID number) each row of metrics is associated with. This number matches the number in the “masks_IMG_XXXX.zip” file and the numbers in the qualitative mask detection image (e.g., “masked_IMG_XXXX.jpg”).
- area (px^2). Area in square pixels.
- width (px). Width in pixels.
- length (px). Length in pixels.
- volume (px^3). Volume in cubic pixels from the ellipsoid model.
- area (in^2). Area in square inches.
- width (in). Width in inches.
- length (in). Length in inches.
- volume (in^3). Cubic inches estimated from ellipsoid model.
- solidity, strict_solidity. Computer vision shape metrics.
- Enrique_weight (g). Weight estimate produced by the empirical model presented in a separate manuscript (10.1016/j.atech.2024.100469).
- processed_data: Plot-level dataset. This folder contains a series of folders named after images. The folder names and data are formatted exactly as in “Mask_RCNN_Calibrated” above and represent Mask R-CNN outputs for each image taken for the plot-level data assessment.
- test: Plot-level dataset. This folder contains the full-resolution images for the plot-level dataset in “processed_data” above.
- Kylie_Complete.xlsx: This file contains the observed Exeter length, width, and weights as compared to manually measured length, width, and weight for the one-to-one Exeter dataset.
- 2020Sweetapps roots.csv: Contains plot-level Exeter data with the following columns:
- plotname: A unique ID for the plot in the field.
- estimateddiameter, estimatedweight, and estimatedlength: defined the same as for manual.xlsx above.
- exeter_stats.py: Code to parse the Exeter data's file names.
- SweetAPPS_MaskRCNN_DataValidation.ipynb and SweetAPPS_MaskRCNN_DataValidation.py: See below, runs the comparison and plotting / analysis code.
- run_del_datedfolder.py: See below, runs inference.
- Readme.docx: This information.
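For reference, the Exeter units above can be converted to metric, and an ellipsoid volume computed from length and width, along the following lines. This is a minimal sketch: the conversion factors come from the unit definitions in this README, but the function names and the prolate-spheroid form of the “ellipsoid model” are assumptions for illustration, not the archive’s actual code.

```python
import math

# Unit-conversion helpers for the Exeter columns described above.
# Function names are illustrative and do not appear in the archive's scripts.

CM_PER_IN = 2.54
G_PER_OZ = 28.349523125

def exeter_length_to_cm(hundredths_of_inch):
    """Exeter length/diameter is in 1/100ths of an inch (200 -> 2 in -> 5.08 cm)."""
    return hundredths_of_inch * 0.01 * CM_PER_IN

def exeter_weight_to_g(tenths_of_oz):
    """Exeter weight is in 10ths of an ounce (358 -> 35.8 oz)."""
    return tenths_of_oz * 0.1 * G_PER_OZ

def ellipsoid_volume(length, width):
    """Prolate-spheroid volume: V = (4/3) * pi * (L/2) * (W/2)^2.
    One common choice of "ellipsoid model"; check the archive's scripts
    for the exact formula they use."""
    return (4.0 / 3.0) * math.pi * (length / 2.0) * (width / 2.0) ** 2
```

The same conversions apply to the plot-level “2020Sweetapps roots.csv” columns, which share the Exeter unit definitions.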
File: Image_Datasets.zip
Description: Contains the labels and cellphone imagery used to train the Mask R-CNN model.
- SP_dataset: contains two folders:
- annotations: labels for individual sweetpotato regions in JSON format, as produced by the VGG Image Annotator (VIA) tool. Documentation on this format is provided on their GitHub: https://github.com/ox-vgg/via
- Images: contains four folders:
- all: All data in one place. These are cellphone images that were used to train the Mask R-CNN model and can be associated with the annotations in the annotations folder.
- test: 20% testing split used for model development and assessment.
- train: 60% training split used for model development and assessment.
- validate: 20% validation split used for model development and assessment.
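Each VIA project JSON stores the annotated roots as polygons under a “regions” key. A minimal sketch of reading a VIA file and extracting the polygon vertices follows; the field names match the VIA export format, but this helper is not code shipped with the archive, and VIA versions differ in whether “regions” is a list or a dict:

```python
import json

def load_via_polygons(via_json_path):
    """Parse a VIA project JSON and return {filename: [(xs, ys), ...]}.

    Handles "regions" stored either as a list (VIA 2.x) or a dict (VIA 1.x).
    """
    with open(via_json_path) as f:
        project = json.load(f)
    polygons = {}
    for entry in project.values():
        regions = entry.get("regions", [])
        if isinstance(regions, dict):  # VIA 1.x keys regions by index
            regions = list(regions.values())
        polys = []
        for region in regions:
            shape = region["shape_attributes"]
            if shape.get("name") == "polygon":
                polys.append((shape["all_points_x"], shape["all_points_y"]))
        polygons[entry["filename"]] = polys
    return polygons
```

Parsed polygons in this form can then be converted into the record dicts that detectron2 expects for training, as described in the detectron2 documentation.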
File: model_final.zip
Description: Contains the trained Mask R-CNN model weights (coefficients), for use with detectron2. This is a single file with the extension “.pth”. Once the detectron2 package is installed, point the configuration at this file instead of the default weights. This would be initialized as follows:
```python
# Initialize configuration for Mask R-CNN
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file("/home/user/Downloads/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.DEVICE = "cpu"  # change to "cuda" if an NVIDIA GPU is present
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # single class: sweetpotato root
cfg.MODEL.WEIGHTS = r'/home/user/Downloads/detectron2/model_final.pth'
```
Code/software
Code is present in "SweetAPPS_MaskRCNN_DataValidation". The code was tested in Google Colab using the “SweetAPPS_MaskRCNN_DataValidation.ipynb” file. As an alternative, users can set up a Python environment and run “SweetAPPS_MaskRCNN_DataValidation.py” after creating the environment with the following Anaconda commands:
```shell
conda create -n sweetapps_analysis_env python=3.10 pandas matplotlib numpy scipy scikit-learn opencv -c conda-forge
conda activate sweetapps_analysis_env
```
However, the following commands assume the use of Google Colab:
- Import the data into a Google Drive folder. For this example, we assume creation of a folder “SweetAPPS_Colab” under your “My Drive”. Note that the folder locations in the Colab code will have to be adjusted if yours differ.
- Go to “Add Workspace Apps” and install “Colaboratory”. Once complete, select “Make Google Colaboratory the default app for files it can open.”
- Right-click “SweetAPPS_MaskRCNN_DataValidation.ipynb” in Google Drive, select “Open With”, and choose “Google Colaboratory”. This will launch a new tab with the notebook opened.
- Once running, execute the cells in order to generate the figures or to access/export the paper’s data. Upon executing the first cell, you will need to grant Google Colab permission to access your Google Drive by selecting “Connect to Google Drive” when prompted with “Permit this notebook to access your Google Drive files?”
Note: an example script, “run_del_datedfolder.py”, is also included. This example loads the model weights for inference. Users are directed to detectron2’s GitHub (https://github.com/facebookresearch/detectron2) to configure and run this code. In this example, images that appear in the folder specified by “input_path” (in the example, “/home/user/Downloads/detectron2/MaskRCNN/Input”) are processed by Mask R-CNN and placed into a new folder under “/home/user/Downloads/detectron2/MaskRCNN/Output”. The output folder name matches the file name that was placed in “Input”. This setup can enable a real-time flow of images to be processed by the model.
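The folder-watching flow just described (poll an input folder, run the model on each new image, write results to a per-image output folder) can be sketched as follows. This is not the actual contents of “run_del_datedfolder.py”: the Mask R-CNN call is replaced by a stub, and the paths and function names are illustrative only.

```python
import os
import shutil
import time

def process_image(image_path, out_dir):
    """Stub standing in for Mask R-CNN inference. The real script would run
    the detectron2 predictor and write masks and CSV metrics into out_dir;
    here we just copy the image to show where outputs would land."""
    shutil.copy(image_path, os.path.join(out_dir, os.path.basename(image_path)))

def watch_folder(input_path, output_path, poll_seconds=2.0, max_polls=None):
    """Poll input_path; for each new image, create an output folder named
    after the file (matching the behavior described above) and process it."""
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for name in sorted(os.listdir(input_path)):
            if name in seen or not name.lower().endswith((".jpg", ".jpeg", ".png")):
                continue
            stem = os.path.splitext(name)[0]
            out_dir = os.path.join(output_path, stem)  # output folder matches file name
            os.makedirs(out_dir, exist_ok=True)
            process_image(os.path.join(input_path, name), out_dir)
            seen.add(name)
        polls += 1
        time.sleep(poll_seconds)
```

In a real deployment, `process_image` would wrap the detectron2 predictor configured with the model weights from model_final.zip, and the loop would run indefinitely (`max_polls=None`).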
