Labeled RGB and depth images for cattle body condition score prediction
Data files
Jul 03, 2023 version files 34.52 GB
- README.md (3.92 KB)
- Total_sorted_DGE_images.zip (178.62 MB)
- Unfiltered_bag_files.zip (34.34 GB)
Abstract
This dataset contains a collection of preclassified Criollo cow RGB+depth videos, as well as processed depth, grayscale, and edge images. The dataset was previously used to train convolutional neural networks and vision transformers to estimate body condition scores of cattle, and will be useful to other researchers in need of a high-quality visual dataset that incorporates depth for three-dimensional representations of cattle. The availability of compatible software and image processing packages makes this dataset broadly applicable to other areas of both agricultural and machine learning research. Several features of this dataset make it a good test of machine learning algorithms, such as low sample uniqueness and a slightly subjective metric.
This dataset is composed of videos saved as Rosbag files recorded with an Intel RealSense D435i RGB+depth camera. Each video is unfiltered and unprocessed, and contains RGB and depth channels. These videos are easily processed using the pyrealsense2 library, created by Intel. Filter parameter effects can be observed in real time via the Intel RealSense Viewer software. Most videos are recorded at a resolution of 640x480, with several at 848x480. This inconsistency in resolution is easily rectified in postprocessing. During recording, roughly a third of the cows were allowed to stand in a lane, and the rest were allowed to walk through the lane unimpeded. Cows that stood in the lane are denoted as "Cow_##" whereas cattle that walked are "Cow_##_####".
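The naming convention above can be used to separate standing and walking recordings automatically. The following is a minimal sketch, assuming that each "#" stands for a digit and that the label is the file name without its extension; the function name `classify_recording` and the returned field names are hypothetical and are not part of the dataset.

```python
import re
from pathlib import PurePath

# Assumption: "#" in "Cow_##" and "Cow_##_####" stands for a digit.
# The number of digits is not enforced, since the exact width is not documented here.
_LABEL = re.compile(r"^Cow_(\d+)(?:_(\d+))?$")


def classify_recording(name):
    """Return (first_number, behavior) for a recording label, or None if it does not match.

    behavior is "standing" for labels of the form Cow_## and
    "walking" for labels of the form Cow_##_####.
    """
    stem = PurePath(name).stem  # drop a trailing extension such as ".bag"
    match = _LABEL.match(stem)
    if match is None:
        return None
    first_number, suffix = match.groups()
    behavior = "standing" if suffix is None else "walking"
    # Keep the number as a string so that leading zeros are preserved.
    return first_number, behavior
```

For example, `classify_recording("Cow_07.bag")` yields a standing recording and `classify_recording("Cow_12_0034.bag")` yields a walking one, while unrelated files such as `README.md` return `None`.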
The easiest way to manipulate each .bag file is to use the pyrealsense2 library, as Intel seems to have used its own conventions. Each depth image generated from the video is 16 bits deep, and special care should be taken when assigning data types so as not to truncate each value. Many image formats cannot store 16-bit images, and some machine learning libraries will reject an image that is forced to have 16 bits of depth. Therefore, we encourage users to normalize each depth image before converting to an 8-bit data type so as to minimize data corruption.
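One way to carry out the normalization described above is per-image min-max scaling with NumPy. This is a sketch under stated assumptions, not necessarily the procedure used in the article: the function name `normalize_depth_to_uint8` is hypothetical, and scaling each image by its own minimum and maximum is only one possible choice (a fixed depth range shared by all images is another).

```python
import numpy as np


def normalize_depth_to_uint8(depth16):
    """Rescale a 16-bit depth image to the 0-255 range and return it as uint8.

    Assumption: per-image min-max scaling. Relative depth within an image is
    preserved, but absolute distances are no longer comparable across images.
    """
    depth = np.asarray(depth16, dtype=np.float64)
    lo, hi = depth.min(), depth.max()
    if hi == lo:
        # A constant image carries no depth contrast; avoid dividing by zero.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - lo) / (hi - lo) * 255.0
    return np.rint(scaled).astype(np.uint8)


# Casting directly, e.g. depth16.astype(np.uint8), wraps values modulo 256
# instead of rescaling them, which is the kind of corruption to avoid.
```

Normalizing in floating point before the final cast keeps the full 16-bit range in play until the last step, so the only loss is the reduction to 256 levels.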
In the article, each image was assumed to be unique, and this created some unwanted effects. If the intended application requires unique images, use as few images as possible from each cow. Images generated from videos should be temporally distant from each other so as to maximize uniqueness.
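A simple way to keep sampled images temporally distant is to take a small number of evenly spaced frame indices from each video. The sketch below assumes the total frame count of a video is already known; the function name `evenly_spaced_indices` is hypothetical, and even spacing is only one reasonable strategy.

```python
def evenly_spaced_indices(n_frames, n_samples):
    """Return up to n_samples frame indices spread evenly over [0, n_frames - 1]."""
    if n_frames <= 0 or n_samples <= 0:
        return []
    # Never request more samples than there are frames.
    n_samples = min(n_samples, n_frames)
    if n_samples == 1:
        # A single sample is taken from the middle of the video.
        return [n_frames // 2]
    # Integer arithmetic keeps the first and last frames and avoids rounding surprises.
    return [(i * (n_frames - 1)) // (n_samples - 1) for i in range(n_samples)]
```

For a 100-frame video, `evenly_spaced_indices(100, 5)` selects frames 0, 24, 49, 74, and 99, which maximizes the time gap between consecutive samples for that sample count.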