Data from: Automatic segmentation of early Triassic vertebrate fossil CT scans: Reducing human annotation time through deep learning
Data files
Sep 10, 2024 version files 7.85 GB
- QMF60282_Data.zip (7.85 GB)
- README.md (2.05 KB)
Abstract
This is an image dataset used in deep learning studies for the automated segmentation of fossil x-ray CT images (see related publication). The dataset consists of CT slices and training slices for Queensland Museum Specimen QMF60282 discovered in Early Triassic rocks of Queensland, Australia.
Please note that the datasets included herein are sufficient to be used for taxonomical, morphological and taphonomical studies, and are part of ongoing active research. We therefore request that you ask for consent from either the corresponding author, Espen M. Knutsen (espen.knutsen@qm.qld.gov.au), or the Queensland Museum geoscience collection staff before using this data for such work.
README: Data from: Automatic segmentation of early Triassic vertebrate fossil CT scans: Reducing human annotation time through deep learning
https://doi.org/10.5061/dryad.mw6m9064n
This is an image dataset used in Deep Learning studies (see related publication), consisting of CT slices and training slices of a Triassic fossil (QMF60282) from Queensland, Australia, held at the Queensland Museum. The specimen was CT scanned at the Imaging and Medical Beam Line (IMBL) at the Australian Synchrotron in 2020, producing a stack of 2159 image slices measuring 2560 × 2560 pixels, with a voxel size of 10 μm. Across the visible extent of the specimen within the CT image stack, every 200th slice was manually segmented for Regions of Interest (ROIs). Air and rock matrix are shown in black, while fossil material is shown in white.
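As a rough guide to what these figures imply, the short sketch below converts the stated stack dimensions into a physical extent and an uncompressed memory footprint. It assumes the voxels are isotropic (10 μm along all three axes), and the bit depth is not stated in this README, so the bytes-per-pixel values are illustrative assumptions only.

```python
# Scan geometry as stated in the README.
N_SLICES = 2159
WIDTH_PX = 2560
HEIGHT_PX = 2560
VOXEL_UM = 10  # voxel edge length in micrometres; isotropy is an assumption


def extent_mm(n_voxels, voxel_um=VOXEL_UM):
    """Physical length in millimetres covered by n_voxels along one axis."""
    return n_voxels * voxel_um / 1000


def stack_bytes(bytes_per_pixel):
    """Uncompressed size of the full stack for an assumed bit depth."""
    return N_SLICES * WIDTH_PX * HEIGHT_PX * bytes_per_pixel


print("in-plane extent (mm):", extent_mm(WIDTH_PX), "x", extent_mm(HEIGHT_PX))
print("stack depth (mm):", extent_mm(N_SLICES))
# Bit depth is not given in the README: 1 and 2 bytes are hypothetical cases.
for bpp in (1, 2):
    print(f"{bpp} byte(s) per pixel: {stack_bytes(bpp) / 1e9:.1f} GB uncompressed")
```

Under these assumptions each slice covers 25.6 mm by 25.6 mm and the stack spans about 21.6 mm, which is worth keeping in mind before attempting to load the whole stack into memory at once.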
Description of the data and file structure
The image datasets are uploaded as a single ZIP file containing two folders: "QMF60282_CT_Slices" holds the CT x-ray image slices, and "QMF60282_Input_Slices" holds the input training slices.
The CT image slices consist of 2159 files in TIFF format, which can be opened individually or as an image stack in ImageJ or other image/CT viewing software.
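For users who prefer a scripted workflow over ImageJ, the following is a minimal sketch of how the slice files might be listed in stack order and read one at a time. The file naming inside the folder is not documented here, so the natural-sort step is a precaution rather than a known requirement. The helper names `list_slices` and `load_slice` are hypothetical, and `tifffile` is a third-party package chosen for illustration, not a tool named by the authors.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that orders embedded numbers numerically (2 before 10)."""
    parts = re.split(r"(\d+)", path.name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


def list_slices(folder):
    """Return all TIFF files in `folder`, sorted in natural order."""
    files = [
        p for p in Path(folder).iterdir()
        if p.suffix.lower() in {".tif", ".tiff"}
    ]
    return sorted(files, key=natural_key)


def load_slice(path):
    """Read one slice as a NumPy array (requires the tifffile package)."""
    import tifffile  # imported lazily so listing works without it installed

    return tifffile.imread(str(path))
```

After unzipping, `list_slices("QMF60282_CT_Slices")` would be expected to return 2159 paths, and `load_slice` can then be applied to any of them. Reading slices individually avoids holding the full stack in memory.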
The manually segmented training data consists of 18 files in TIFF format, which can be opened individually in ImageJ or other image software.
Usage notes
For usage in Deep Learning, please refer to the published article associated with this supplementary material.
Please note that the datasets included herein are sufficient to be used for taxonomical, morphological and taphonomical studies, and are part of ongoing active research. We therefore request that you ask for consent from either the corresponding author, Espen M. Knutsen (espen.knutsen@qm.qld.gov.au), or the Queensland Museum geoscience collection staff before using this data for such work.
Methods
The specimen (QMF60282) was CT scanned at the Imaging and Medical Beam Line (IMBL) at the Australian Synchrotron in 2020, producing a stack of 2159 x-ray image slices measuring 2560 × 2560 pixels, with a voxel size of 10 μm. Across the visible extent of the specimen within the CT image stack, every 200th slice was manually segmented for Regions of Interest (ROIs), resulting in 9 training slices. Air and rock matrix were coloured black, while fossil material was coloured white. These 9 slices were used to train our Deep Learning model, which was then applied to the entire CT image stack of 2159 x-ray slices, producing a model-predicted, ROI-segmented image stack. The predicted segmentations were used as a template to produce a further 9 training slices, resulting in a final training dataset consisting of 18 slices, or every 100th slice across the entire CT stack.
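The two-round sampling scheme can be illustrated with a short sketch. The actual first and last slice indices of the visible specimen are not given in this README, so the values 100 and 1800 below are hypothetical, chosen only so that the counts match the 9 + 9 = 18 slices described. Placing the second round exactly midway between the first-round slices is likewise an inference from the "every 100th slice" statement.

```python
def sampling_rounds(first, last, step=200):
    """Return slice indices for round one, round two, and both combined.

    Round one takes every `step`-th slice within [first, last].
    Round two adds the slices midway between them, halving the spacing.
    """
    half = step // 2
    round_one = list(range(first, last + 1, step))
    round_two = [i + half for i in round_one if i + half <= last]
    return round_one, round_two, sorted(round_one + round_two)


# Hypothetical visible extent of the specimen within the 2159-slice stack.
round_one, round_two, combined = sampling_rounds(first=100, last=1800)

print(len(round_one), len(round_two), len(combined))  # 9 9 18
print(combined[:4])  # [100, 200, 300, 400]
```

With these assumed bounds, the first round yields 9 manually segmented slices at a spacing of 200, and the second round fills the gaps so that the final 18 training slices sit 100 slices apart.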