3D micro-CT image of cichlid fish samples for genetic analysis
Data files
Aug 31, 2023 version files (16.75 GB total):
- D25-E10_180613-processed-AP.zip (731.81 MB)
- D25-E10_180613-processed-DV.zip (710.71 MB)
- D25-E10_180613-processed-LR.zip (716.96 MB)
- D25-E10_raw_AP.zip (14.59 GB)
- README.md (2.75 KB)
Abstract
The number of Genome-Wide Association Studies (GWAS) has been growing rapidly in recent years due to developments in genotyping and sequencing platforms. When applied to quantitative traits, these and other statistical genetics approaches require large amounts of consistently and accurately measured phenotypes. Using the data shared here, we introduce a computational toolbox based on deep convolutional neural networks that we have developed to phenotype quantitative traits describing morphology from micro-CT-scan image datasets. We illustrate the use of this Deep Learning Phenotyper (DLP) on a sample set of craniofacial CT scans of 118 samples from two very closely related species of Lake Malawi cichlid fish, Maylandia zebra and Cynotilapia zebroides. We show that the pipeline constructed and implemented here is capable of measuring morphological skeletal phenotypes with high accuracy. We also demonstrate how this pipeline can be integrated with existing GWAS frameworks to identify candidate association loci. We believe the methods we present here will be valuable for groups studying quantitative morphological traits not only in fishes, but in other vertebrates using CT scan datasets. The data shared here is an example dataset containing both the primary CT data for a sample (as an Anterior-Posterior image stack), and the denoised and compressed image stacks in all three orientations (AP, Left-Right, and Dorsal-Ventral) which are generated by the initial preprocessing steps described in the paper.
README: ZebraAfra
The data provided here is an example image stack for a Maylandia zebra fish sample. Both the raw CT scan projections and the computed tomography images are provided as image stacks in the Anterior-Posterior projection. These raw images were pre-processed using the pipeline described in our paper, and three sets of image stacks were generated, one for each standard projection: Anterior-Posterior (AP), Left-Right (LR), and Dorsal-Ventral (DV).
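One way to picture how a single scanned volume yields three image stacks is to index it along each of its three axes in turn. The sketch below is illustrative only and is not the pipeline from the paper: the `reslice` helper and the axis order (AP, DV, LR) are assumptions made for this example, and the denoising and compression steps are not shown. It uses nested lists to stay dependency-free. For real stacks an array library would be the practical choice.

```python
# Hypothetical axis convention for this sketch: volume[ap][dv][lr].
AP_AXIS, DV_AXIS, LR_AXIS = 0, 1, 2


def reslice(volume, axis):
    """Return the list of 2D slices taken along `axis` of a nested-list volume."""
    n0, n1, n2 = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:
        # Slices along the first axis are the stored planes themselves (copied).
        return [[row[:] for row in plane] for plane in volume]
    if axis == 1:
        # Fix the second index; each slice has shape (n0, n2).
        return [[[volume[i][j][k] for k in range(n2)] for i in range(n0)]
                for j in range(n1)]
    if axis == 2:
        # Fix the third index; each slice has shape (n0, n1).
        return [[[volume[i][j][k] for j in range(n1)] for i in range(n0)]
                for k in range(n2)]
    raise ValueError("axis must be 0, 1 or 2")


# Tiny synthetic volume with 2 x 3 x 4 voxels; values encode their own indices.
volume = [[[100 * i + 10 * j + k for k in range(4)] for j in range(3)]
          for i in range(2)]
print(len(reslice(volume, AP_AXIS)),
      len(reslice(volume, DV_AXIS)),
      len(reslice(volume, LR_AXIS)))
```

The number of slices in each stack equals the size of the volume along the corresponding axis, which is why the three processed stacks can contain different numbers of images.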
Description of the Data and file structure
The main directory contains the following directories, within which the processed and raw data files are stored:
- Raw images
- CentreSlice
- Sinogram - Contains the sinogram images used for back-projection
- D25-E10 Skull.ct2dprofile.xml - Contains the specifications of the CT scanner such as the voxel sizes, offset of scanner and X-ray source, etc.
- D25-E10 Skull.xtek2dct - Contains the specifications of the CT scanner such as the voxel sizes, offset of scanner and X-ray source, etc.
- D25-E10 Skull_01
- [vg-project] D25-E10 Skull - Contains the VGP files that specify the structure of the data when images are opened with different software
- D25-E10 Skull_0000.tif - D25-E10 Skull_1888.tif - All the computed tomography images generated through back-calculation of X-ray absorption.
- D25-E10 Skull.vgl - The VGL file that specifies the structure of the data when images are opened with different software
- D25-E10 Skull_0001.tif - D25-E10 Skull_1081.tif - All the raw projections that were collected by the CT scanner through rotations around the fish.
- D25-E10 Skull.xtekct - Contains the specifications of the CT scanner such as the voxel sizes, offset of scanner and X-ray source, etc.
- _ctdata.txt - This file specifies the number of projections, and the angle between them, that the scanner used to generate the raw projections.
- D25-E10 Skull.ctprofile.xml - Contains the specifications of the CT scanner such as the voxel sizes, offset of scanner and X-ray source, etc.
- CentreSlice
- D25-E10_180613-processed-AP
- D25-E10 Skull_0000.jpg - D25-E10 Skull_1888.jpg - All the compressed, de-noised, and processed images for the same fish along the Anterior-Posterior axis.
- D25-E10_180613-processed-LR
- 0001.jpg - 1239.jpg - All the compressed, de-noised, and processed images for the same fish along the Left-Right axis.
- D25-E10_180613-processed-DV
- 0001.jpg - 2000.jpg - All the compressed, de-noised, and processed images for the same fish along the Dorsal-Ventral axis.
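As a quick sanity check after downloading, the slices in a processed archive can be listed in order and counted against the file ranges given above. The following is a minimal sketch, not part of the DLP pipeline: the helper names are hypothetical, the expected counts assume the listed ranges have no gaps, and the code assumes each slice file name ends in its numeric index followed by `.jpg`.

```python
import re
import zipfile

# Slice counts implied by the ranges in this README, assuming no gaps:
# AP 0000-1888, LR 0001-1239, DV 0001-2000.
EXPECTED_SLICES = {"AP": 1889, "LR": 1239, "DV": 2000}

# Matches the trailing number of names such as "D25-E10 Skull_0000.jpg" or "0001.jpg".
_INDEX = re.compile(r"(\d+)\.jpe?g$", re.IGNORECASE)


def slice_index(name):
    """Return the trailing integer index of a slice file name, or None."""
    match = _INDEX.search(name)
    return int(match.group(1)) if match else None


def ordered_slices(archive):
    """List the JPEG members of a zip archive, sorted by slice index."""
    with zipfile.ZipFile(archive) as zf:
        names = [n for n in zf.namelist() if slice_index(n) is not None]
    return sorted(names, key=slice_index)


def check_stack(archive, orientation):
    """Return True if the archive holds the expected number of slices."""
    return len(ordered_slices(archive)) == EXPECTED_SLICES[orientation]
```

For example, `check_stack("D25-E10_180613-processed-LR.zip", "LR")` would report whether the Left-Right archive contains 1239 slice images. Sorting by the parsed index rather than by the raw name keeps the order correct even if file names are not zero-padded consistently.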
Sharing/Access Information
This example data set is publicly available for anyone to download, view, and use to run the DLP pipeline we developed.
Methods
The data was collected through three field trips to Lake Malawi, where fish samples were collected from four different locations: 1) Thumbi West, 2) Nkhata Bay, 3) Otter Point, and 4) Chiofu. The collected fish were then brought to the University of Cambridge, where they were fixed in formaldehyde and ethanol, and scanned at the Cambridge Biotomography Centre using a Nikon XT 225 ST micro-CT scanner.