Differences in apple fruit shape are independent of fruit size
Data files
Jul 09, 2025 version, files total 605.15 MB
- all_binaries.zip (111.98 MB)
- apple_metadata.csv (434.21 KB)
- colour_corrected.zip (492.73 MB)
- README.md (2.42 KB)
Abstract
This dataset comprises over 12,000 high-resolution images of apples collected from Canada’s Apple Biodiversity Collection, housed at Agriculture and Agri-Food Canada’s Kentville Research and Development Centre. Each apple was photographed from both the top and side views, capturing more than 500 genetically unique accessions. For each image, three versions are included: the original, a colour-corrected version, and a binary (segmented) version to support various analytical applications.
Using a pseudo-landmarking approach, we determined that the primary source of morphological variation among the apples is related to shape rather than size. To support reproducibility and further analysis, the dataset also includes an initial metadata file that serves as a foundation for conducting morphological assessments.
Dataset DOI: 10.5061/dryad.xgxd254tg
Description of the data and file structure
This dataset contains over 12,000 images of apples collected from Canada's Apple Biodiversity Collection, located at Agriculture and Agri-Food Canada's Kentville Research and Development Centre. Images of both the top and side of the fruit were taken from over 500 genetically unique accessions, and the dataset includes the original, colour-corrected, and binary versions of each image.
The uploaded fruit images were taken in 2016 from Canada’s Apple Biodiversity Collection, located at Agriculture and Agri-Food Canada’s (AAFC) Kentville Research and Development Centre (KRDC) in Nova Scotia, Canada. A comprehensive description of the design, maintenance, and harvest of Canada’s Apple Biodiversity Collection can be found in Watts et al. (2021) Plants People Planet. Images were colour corrected using the included colour card and then converted into binary scans using MATLAB® (The MathWorks Inc., 2018) following the methods described by Li et al. (2022) Methods Mol Biol. A complete description of the data collection and image processing can be found in DeViller et al. (2025) The Plant Phenome Journal.
Files and variables
File: all_binaries.zip
Description: contains subfolders of all binary images
- binary_side_domestica: contains side images of accessions that are domestic (used in DeViller et al., 2025)
- binary_side_not_domestica: contains side images of accessions that are not domestic
- binary_top: contains top images of all accessions
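As a quick check after downloading, the number of images in each subfolder can be tallied without unpacking the archive. The following is a minimal sketch, not part of the published workflow: the helper name `count_images_by_subfolder` is hypothetical, the file names in the demonstration are made up, and the exact internal layout of all_binaries.zip (for example, whether the subfolders sit under a top-level folder) is an assumption.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_images_by_subfolder(zip_source):
    """Count archive members per immediate parent folder (hypothetical helper)."""
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            # Zip member names always use forward slashes
            counts[PurePosixPath(info.filename).parent.name] += 1
    return dict(counts)


# Demonstration on an in-memory archive with made-up file names;
# the real archive would be opened with count_images_by_subfolder("all_binaries.zip")
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("binary_top/example_1.jpg", b"")
    archive.writestr("binary_top/example_2.jpg", b"")
    archive.writestr("binary_side_domestica/example_3.jpg", b"")
buffer.seek(0)

subfolder_counts = count_images_by_subfolder(buffer)
print(subfolder_counts)
```

Grouping by the immediate parent folder keeps the tally correct whether or not the archive wraps the three subfolders in an extra top-level directory.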
File: apple_metadata.csv
Description: original metadata file needed to reproduce morphological analysis with domestic side images.
Variables
- file: file name of image (e.g., BW = binary, 1010 = tree (nursery_id), 3 = individual)
- dataset: from the apple collection
- px.cm: pixels-per-cm value calculated from the average of 10 images
- nursery_id: tree identifier
- apple_id: accession identifier
- node: NA
- top_x: landmark coordinate for top point (x value)
- top_y: landmark coordinate for top point (y value)
- bottom_x: landmark coordinate for bottom point (x value)
- bottom_y: landmark coordinate for bottom point (y value)
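The landmark and scale columns listed above can be combined to express the top-to-bottom landmark distance in centimetres. The sketch below is illustrative only and is not the analysis from DeViller et al. (2025): it assumes the landmark coordinates are in pixels and that px.cm applies to the same image, the helper name `landmark_distance_cm` is hypothetical, and the sample rows are made up rather than taken from apple_metadata.csv.

```python
import csv
import io
import math


def landmark_distance_cm(row):
    """Top-to-bottom landmark distance in cm, or None if a value is missing."""
    try:
        dx = float(row["top_x"]) - float(row["bottom_x"])
        dy = float(row["top_y"]) - float(row["bottom_y"])
        scale = float(row["px.cm"])  # pixels per cm
    except (KeyError, ValueError):
        return None  # e.g. a column is absent or holds "NA"
    return math.hypot(dx, dy) / scale


# Made-up rows for illustration; the real file would be read with
# open("apple_metadata.csv", newline="") in place of the StringIO object
sample = io.StringIO(
    "file,px.cm,top_x,top_y,bottom_x,bottom_y\n"
    "example_a.jpg,100,500,200,500,900\n"
    "example_b.jpg,100,300,100,600,500\n"
    "example_c.jpg,100,NA,NA,NA,NA\n"
)

distances = {row["file"]: landmark_distance_cm(row) for row in csv.DictReader(sample)}
print(distances)
```

Dividing the pixel distance by px.cm puts fruit from different images on a common physical scale, which is what allows size to be considered separately from shape.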
File: colour_corrected.zip
Description: all images after being colour corrected for colour analysis