Telecentric wide-field reflected light microscopic dataset
Data files (Feb 22, 2024 version; 2.29 GB):
- Dataset.zip
- README.md
Abstract
Multi-class segmentation of unlabelled living cells in time-lapse light microscopy images is challenging due to the temporal behaviour and changes in cell life cycles and the complexity of images of this kind. Deep-learning-based methods have achieved promising outcomes and remarkable success in single- and multi-class medical and microscopy image segmentation. The main objective of this study is to develop a hybrid deep-learning-based categorical segmentation and classification method for living HeLa cells in reflected light microscopy images.
A symmetric simple U-Net and three asymmetric hybrid convolutional neural networks (VGG19-U-Net, Inception-U-Net, and ResNet34-U-Net) were proposed and mutually compared to find the most suitable architecture for multi-class segmentation of our dataset. The inception module in the Inception-U-Net contains kernels of different sizes within the same layer to extract all feature descriptors. The series of residual blocks with skip connections at each level of the ResNet34-U-Net alleviates the vanishing-gradient problem and improves the generalisation ability.
The m-IoU scores of multi-class segmentation for our dataset reached 0.7062, 0.7178, 0.7907, and 0.8067 for the simple U-Net, VGG19-U-Net, Inception-U-Net, and ResNet34-U-Net, respectively. For each class and for the mean across all classes, the most accurate multi-class semantic segmentation was achieved by the ResNet34-U-Net architecture (evaluated with the m-IoU and Dice metrics).
README: Telecentric wide-field reflected light microscopic dataset
This Telecentric bright-field_README.txt file was generated on 2023-04-20 by Ali Ghaznavi.
GENERAL INFORMATION
Symmetry Breaking in the U-Net: Hybrid Deep-Learning Multi-Class Segmentation of HeLa Cells in Reflected Light Microscopy Images
Author Information
First-author, Corresponding author
Name: MSc. Ali Ghaznavi
Institution: Institute of Complex Systems, University of South Bohemia in České Budějovice, Zámek 136, 373 33, Nové Hrady, Czech Republic
Email: ghaznavi@frov.jcu.cz
Co-author
Name: Dr. Renata Rychtáriková
Institution: Institute of Complex Systems, University of South Bohemia in České Budějovice, Zámek 136, 373 33, Nové Hrady, Czech Republic
Email: rrychtarikova@frov.jcu.cz
Co-author
Name: Dr. Petr Císař
Institution: Institute of Signal and Image Processing, University of South Bohemia in České Budějovice, Zámek 136, 373 33, Nové Hrady, Czech Republic
Email: cisar@frov.jcu.cz
Co-author
Name: MSc. Mohammadmehdi Ziaei
Institution: Institute of Signal and Image Processing, University of South Bohemia in České Budějovice, Zámek 136, 373 33, Nové Hrady, Czech Republic
Email: ziaei@frov.jcu.cz
Co-author
Name: Prof. Dalibor Štys
Institution: Institute of Complex Systems, University of South Bohemia in České Budějovice, Zámek 136, 373 33, Nové Hrady, Czech Republic
Email: stys@frov.jcu.cz
Date of data collection: 2022
Geographic location of data collection: Nové Hrady, Czech Republic
Funding sources that supported the collection of the data: Ministry of Education, Youth and Sports of the Czech Republic – project CENAKVA (LM2018099), European Regional Development Fund in the frame of the project ImageHeadstart (ATCZ215) in the Interreg V-A Austria–Czech Republic programme, and GAJU 114/2022/Z.
Recommended citation for this dataset: Ghaznavi A., Rychtáriková R., Císař P., Ziaei M., Štys D. Telecentric wide-field reflected light microscopic dataset (2022), Dryad, Dataset.
DATA & FILE OVERVIEW
1- Description of HeLa dataset
The human cervical epithelioid carcinoma line HeLa was chosen as the testing cell line for the microscopy image segmentation task. HeLa was chosen because it is the oldest, immortal, and most widely used model cell line. HeLa is cultivated in almost all tissue and cell laboratories worldwide and is utilised in many fields of medical research, such as research on carcinoma or testing of material biocompatibility.
The human HeLa cell line (European Collection of Cell Cultures, Cat. No. 93021013) was cultivated to a low optical density overnight at 37°C, 5% CO2, and 90% relative humidity. The nutrient solution consisted of Dulbecco’s modified Eagle medium (87.7%) with high glucose (1 g L−1), fetal bovine serum (10%), antibiotics and antimycotics (1%), L-glutamine (1%), and gentamicin (0.3%; all purchased from Biowest, Nuaillé, France). The HeLa cells were maintained in a Petri dish with a cover-glass bottom and lid at a temperature of 37°C.
2- Description of wide-field microscope
Time-lapse image series of living human HeLa cells in the glass Petri dish were captured using a high-resolution wide-field light microscope (designed by the Institute of Complex Systems, Nové Hrady, CZ; built by Optax, Prague, CZ, and ImageCode, Brloh, CZ, in 2021).
The microscope has a simple optical path. The sample is illuminated by a Schott VisiLED S80-25 LED Brightfield Ringlight. The light reflected from the sample passes through a TO4.5/43.4-48-F-WN telecentric measurement objective (Vision & Control GmbH, Suhl, Germany) to an Arducam AR1820HS 1/2.3-inch 10-bit RGB camera with a chip of 4912 × 3684 pixel resolution.
All these experiments were performed in time-lapse mode to observe the cells’ behaviour over time.
3- Data preparation steps
3-1 Calibration of the images from all time-lapse experiments to remove image background inhomogeneities and noise (more information in section 2-2 of DOI: https://doi.org/10.3390/sym16020227).
3-2 Conversion of the raw image representations into 8-bit colour (RGB) images with a quarter of the original raw resolution by merging quadruplets of Bayer mask pixels (see the sketch after this list).
3-3 Application of a means denoising method to suppress the background noise in the reconstructed RGB images while preserving the texture details.
3-4 Cropping of the image series to a 1024 × 1024 pixel size.
Steps 3-1 to 3-4 yielded 650 images from different time-lapse experiments on HeLa cells.
3-5 Multi-class manual labelling of the cells in the images using the Apeer platform to create Ground-Truth (GT) multi-class masks with dimensions of 1024 × 1024 pixels.
3-6 Division of the 650 labelled images into training (80%), testing (20%), and validation (20% of the training set) sets for the different U-Net-based networks (see the split sketch below).
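A minimal sketch of steps 3-2 and 3-4 is given below. It assumes a standard RGGB Bayer layout and a raw frame already loaded as a 2-D NumPy array; the actual raw format and Bayer order of the Arducam AR1820HS may differ.

```python
import numpy as np

def bayer_quadruplets_to_rgb(raw):
    """Merge each 2x2 Bayer quadruplet into one RGB pixel (quarter resolution, step 3-2)."""
    r  = raw[0::2, 0::2]                      # red sites (assuming RGGB layout)
    g1 = raw[0::2, 1::2]                      # first green site
    g2 = raw[1::2, 0::2]                      # second green site
    b  = raw[1::2, 1::2]                      # blue sites
    g  = (g1.astype(np.float32) + g2) / 2.0   # average the two green samples
    rgb = np.stack([r, g, b], axis=-1)
    # rescale 10-bit data (0..1023) to 8-bit (0..255)
    return np.clip(rgb / 1023.0 * 255.0, 0, 255).astype(np.uint8)

def crop_center(img, size=1024):
    """Crop a centred size x size window (step 3-4)."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

# usage (raw_frame is an assumed 3684 x 4912 10-bit array):
# rgb_1024 = crop_center(bayer_quadruplets_to_rgb(raw_frame))
```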
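The 80/20/20 split in step 3-6 could be reproduced, for example, with scikit-learn's train_test_split; the authors' exact tooling is not stated in this README, so the snippet below is only illustrative (images and masks are assumed to be paired lists).

```python
from sklearn.model_selection import train_test_split

# images, masks: assumed paired lists of 1024 x 1024 RGB images and GT masks
x_trainval, x_test, y_trainval, y_test = train_test_split(
    images, masks, test_size=0.20, random_state=42)           # 20% held out for testing
x_train, x_val, y_train, y_val = train_test_split(
    x_trainval, y_trainval, test_size=0.20, random_state=42)  # 20% of the training set for validation
```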
4- METHODOLOGICAL INFORMATION
The U-Net is a semantic segmentation method built on the fully convolutional network (FCN) architecture. The FCN is a typical encoder–decoder convolutional network. The U-Net adds several feature channels (skip connections) to combine shallow and deep features.
The first architecture we applied to our dataset was a simple U-Net; it reached a multi-class segmentation Mean-IoU score of 0.7062 (a minimal sketch of this kind of architecture follows).
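For illustration only, a heavily reduced two-level U-Net in Keras with a softmax output for multi-class masks is sketched below; the depth, filter counts, and num_classes are placeholders rather than the configuration used in the paper.

```python
from tensorflow.keras import layers, Model

def tiny_unet(input_shape=(512, 512, 3), num_classes=4):  # num_classes is a placeholder
    inputs = layers.Input(input_shape)
    # encoder
    c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
    c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(c1)
    p1 = layers.MaxPooling2D()(c1)
    # bottleneck
    c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)
    c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(c2)
    # decoder with a skip connection (the "feature channels" mentioned above)
    u1 = layers.Conv2DTranspose(16, 2, strides=2, padding="same")(c2)
    u1 = layers.concatenate([u1, c1])
    c3 = layers.Conv2D(16, 3, padding="same", activation="relu")(u1)
    outputs = layers.Conv2D(num_classes, 1, activation="softmax")(c3)
    return Model(inputs, outputs)

model = tiny_unet()
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```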
In the second step, we applied the VGG19-U-Net architecture to improve the multi-class segmentation result; the Mean-IoU score for our dataset reached 0.7178.
In the third step, we applied the Inception-U-Net architecture to classify the cells into the right classes more accurately; the Mean-IoU score for our dataset reached 0.7907.
In the final step, we applied the ResNet34-U-Net architecture to overcome the vanishing-gradient problem and achieve the most accurate categorical segmentation result. It improved the segmentation result slightly, to a Mean-IoU score of 0.8067 (see the sketch below).
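Backboned hybrids such as the VGG19-U-Net and ResNet34-U-Net can be assembled, for instance, with the open-source segmentation_models package (https://github.com/qubvel/segmentation_models); whether the authors used this exact package is not stated here, so the snippet is a sketch only.

```python
import os
os.environ["SM_FRAMEWORK"] = "tf.keras"  # select the tf.keras backend before importing
import segmentation_models as sm

num_classes = 4  # placeholder: set to the actual number of mask classes
model = sm.Unet("resnet34",                 # or "vgg19", "inceptionv3", ...
                input_shape=(512, 512, 3),
                classes=num_classes,
                activation="softmax",
                encoder_weights="imagenet") # pretrained encoder, residual blocks included
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=[sm.metrics.iou_score])
```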
(More information in section 2-3 of DOI: https://doi.org/10.3390/sym16020227.)
5- Training Models
5-1 The computation was implemented in Python 3.9. The deep-learning framework was Keras with a TensorFlow backend.
5-2 The whole method, including the deep-learning framework, was transferred to and executed on a Google Colab Pro+ account with P100 and T4 GPUs, 24 GB of RAM, and 2 vCPUs.
5-3 The primary dataset consists of 650 images (see section 3).
5-4 Since the network architectures work with a specific input image size, all datasets were resized to 512 × 512 pixels (see the sketch below).
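A possible resizing step is sketched below (the exact routine used by the authors is not stated); nearest-neighbour interpolation is chosen for the masks so that class labels are not blended.

```python
import cv2

# img and mask are assumed 1024 x 1024 arrays (image and its GT mask)
img_512  = cv2.resize(img,  (512, 512), interpolation=cv2.INTER_AREA)     # smooth downscaling
mask_512 = cv2.resize(mask, (512, 512), interpolation=cv2.INTER_NEAREST)  # keep integer labels
```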
DATA-SPECIFIC INFORMATION FOR: Folder x_train
The x_train folder contains images selected randomly from the whole dataset and used as the Train set to train the models.
DATA-SPECIFIC INFORMATION FOR: Folder y_train
The y_train folder contains the manually created Ground-Truth masks (described in section 3-5) corresponding to the Train set images.
DATA-SPECIFIC INFORMATION FOR: Folder x_validate
The x_validate folder contains images selected randomly from the whole dataset and used as the Validation set to validate the models during training.
DATA-SPECIFIC INFORMATION FOR: Folder y_validate
The y_validate folder contains the manually created Ground-Truth masks (described in section 3-5) corresponding to the Validation set images.
DATA-SPECIFIC INFORMATION FOR: Folder x_test
The x_test folder contains images selected randomly from the whole dataset and used as the Test set to test the models' performance after training and to evaluate it with appropriate metrics.
DATA-SPECIFIC INFORMATION FOR: Folder y_test
The y_test folder contains the manually created Ground-Truth masks (described in section 3-5) corresponding to the Test set images, used to evaluate the models' performance with appropriate metrics.
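The six folders pair up as in the illustrative loader below; it assumes 8-bit image files with matching sorted names in the x_* and y_* folders, which this README does not specify.

```python
import os
import cv2
import numpy as np

def load_folder(path):
    """Read all images in a folder in sorted order into one array."""
    files = sorted(os.listdir(path))
    return np.array([cv2.imread(os.path.join(path, f), cv2.IMREAD_UNCHANGED)
                     for f in files])

x_train, y_train = load_folder("x_train"),    load_folder("y_train")
x_val,   y_val   = load_folder("x_validate"), load_folder("y_validate")
x_test,  y_test  = load_folder("x_test"),     load_folder("y_test")
```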
6- Evaluation metrics (more information in section 2-5 of DOI: https://doi.org/10.3390/sym16020227)
Overall pixel accuracy (Acc): the percentage of image pixels belonging to correctly segmented cells
Precision (Pre): the proportion of cell pixels in the segmentation result that match the GT
Recall (Recl): the proportion of cell pixels in the GT that are correctly identified by the segmentation
F1-score or Dice similarity coefficient (m-Dice): states how well the predicted segmented region matches the GT in location and level of detail, considering each class's false alarms and missed values
Jaccard similarity index or Intersection over Union (m-IoU): the overlap between the prediction and the GT, computed as the ratio of the intersection area to the union area of the predicted and GT segmentations (see the sketch below)
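The per-class IoU and Dice definitions above can be written compactly as in the sketch below (the paper's exact evaluation code may differ); both are computed from integer-labelled prediction and GT masks and averaged over classes to give m-IoU and m-Dice.

```python
import numpy as np

def iou_and_dice(pred, gt, num_classes):
    """Mean IoU and mean Dice over classes for integer-labelled masks."""
    ious, dices = [], []
    for c in range(num_classes):
        p, g = (pred == c), (gt == c)
        inter = np.logical_and(p, g).sum()       # overlap area
        union = np.logical_or(p, g).sum()        # union area
        total = p.sum() + g.sum()
        ious.append(inter / union if union else 1.0)      # IoU = |P ∩ G| / |P ∪ G|
        dices.append(2 * inter / total if total else 1.0) # Dice = 2|P ∩ G| / (|P| + |G|)
    return np.mean(ious), np.mean(dices)         # m-IoU, m-Dice
```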
7- Results (more information in section 3 of DOI: https://doi.org/10.3390/sym16020227)
| Architecture     | Acc    | m-IoU  | m-Dice |
|------------------|--------|--------|--------|
| Simple U-Net     | 0.9869 | 0.7062 | 0.8104 |
| VGG19-U-Net      | 0.9865 | 0.7178 | 0.8218 |
| Inception-U-Net  | 0.9904 | 0.7907 | 0.8762 |
| ResNet34-U-Net   | 0.9909 | 0.8067 | 0.8873 |