Data from: A reusable pipeline for large-scale fiber segmentation on unidirectional fiber beds using fully convolutional neural networks
Fioravante de Siqueira, Alexandre; van der Walt, Stéfan; Mayumi Ushizima, Daniela (2021), Data from: A reusable pipeline for large-scale fiber segmentation on unidirectional fiber beds using fully convolutional neural networks, Dryad, Dataset, https://doi.org/10.6078/D1069R
Fiber-reinforced ceramic-matrix composites are advanced materials resistant to high temperatures, with application to aerospace engineering. Their analysis depends on the detection of embedded fibers, with semi-supervised techniques usually employed to separate fibers within the fiber beds. Here we present an open computational pipeline to detect fibers in ex-situ X-ray computed tomography fiber beds. To separate the fibers in these samples, we tested four different architectures of fully convolutional neural networks. When comparing our neural network approach to a semi-supervised one, we obtained Dice and Matthews coefficients greater than 92.28 ± 9.65%, reaching up to 98.42 ± 0.03%, showing that the network results are close to the human-supervised ones in these fiber beds, in some cases separating fibers that human-curated algorithms could not find. The software we generated in this project is open source, released under a permissive license, and can be adapted and re-used in other domains. Here you will find the data resulting from this study.
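For readers who want to compare a segmentation from this dataset against a reference, the two coefficients named above can be computed from the confusion counts of two binary masks. The following is a minimal sketch using the standard definitions, not the code used in the study; the function names and the conventions chosen for empty masks are assumptions made for illustration.

```python
import math

import numpy as np


def confusion_counts(pred, ref):
    """Return (tp, fp, fn, tn) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    tp = int(np.count_nonzero(pred & ref))
    fp = int(np.count_nonzero(pred & ~ref))
    fn = int(np.count_nonzero(~pred & ref))
    tn = int(np.count_nonzero(~pred & ~ref))
    return tp, fp, fn, tn


def dice(pred, ref):
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, ref)
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if denom == 0 else 2 * tp / denom


def matthews(pred, ref):
    """Matthews correlation coefficient from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(pred, ref)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0 when the coefficient is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy masks: one true positive, one false positive, two true negatives.
pred = np.array([[1, 1, 0, 0]])
ref = np.array([[1, 0, 0, 0]])
print(f"Dice: {dice(pred, ref):.4f}, Matthews: {matthews(pred, ref):.4f}")
```

Both functions return values in fractional form; multiply by 100 to compare with the percentages quoted above.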
We implemented four architectures (two-dimensional U-net and Tiramisu, and their three-dimensional versions) to process images of fiber beds provided by Larson et al. We used supervised algorithms: they rely on labeled data to learn which regions are of interest; in our case, fibers within microtomographies of fiber beds. All CNN algorithms were implemented using TensorFlow and Keras on a computer with two Intel Xeon Gold 6134 processors and two Nvidia GeForce RTX 2080 graphical processing units. Each GPU has 10 GB of RAM. To train the neural networks to recognize the fibers, we used slices from two different samples: “232p3 wet” and “232p3 cured”, registered according to the wet sample. Larson et al. provided the fiber segmentation for these samples, which we used as labels in the training. The training and validation procedures processed 350 and 149 images from each sample, respectively; a total of 998 images. Each image from the original samples has a width and height of 2560 × 2560 pixels. To feed the two-dimensional networks, we padded the images with 16 pixels, of value zero, on each side in each dimension, giving 2592 × 2592 pixels. Then, each image was cut into tiles of 288 × 288 pixels, taken every 256 pixels, so that adjacent tiles overlap by 32 pixels. These overlapping regions, which are removed again after processing, avoid artifacts on the borders of processed tiles. Therefore, each input slice generated 10 × 10 = 100 images of 288 × 288 pixels, for a total of 70,000 images in the training set (700 slices) and 29,800 in the validation set (298 slices).
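The padding and tiling arithmetic described above can be checked with a short NumPy sketch. This is an illustration under stated assumptions rather than the pipeline's own code: the function names `tile_slice` and `untile` are hypothetical, the 16-pixel padding is assumed to be applied on every side, and removing the overlap is assumed to mean cropping 16 pixels from each tile border before reassembly.

```python
import numpy as np

PAD = 16    # zero padding added on every side of the slice
TILE = 288  # tile edge length, in pixels
STEP = 256  # distance between the origins of neighboring tiles


def tile_slice(image, pad=PAD, tile=TILE, step=STEP):
    """Zero-pad a 2D slice and cut it into overlapping square tiles."""
    padded = np.pad(image, pad, mode="constant", constant_values=0)
    rows = range(0, padded.shape[0] - tile + 1, step)
    cols = range(0, padded.shape[1] - tile + 1, step)
    return np.stack([padded[r:r + tile, c:c + tile]
                     for r in rows for c in cols])


def untile(tiles, shape, pad=PAD, step=STEP):
    """Crop the overlapping borders and reassemble the original slice."""
    out = np.empty(shape, dtype=tiles.dtype)
    n_cols = shape[1] // step
    for i, tile in enumerate(tiles):
        r, c = divmod(i, n_cols)
        out[r * step:(r + 1) * step,
            c * step:(c + 1) * step] = tile[pad:-pad, pad:-pad]
    return out


# A random stand-in for one 2560 x 2560 tomography slice.
rng = np.random.default_rng(0)
slice_2d = rng.integers(0, 256, size=(2560, 2560), dtype=np.uint8)

tiles = tile_slice(slice_2d)
restored = untile(tiles, slice_2d.shape)
print(tiles.shape)                         # (100, 288, 288)
print(np.array_equal(restored, slice_2d))  # True
```

Because the cropped tile core (288 − 2 × 16 = 256 pixels) equals the step, the cores sit side by side without gaps, which is why the round trip reproduces the input exactly.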
We used twelve different datasets from Larson et al. (2019) in our study. We kept the same folder identifiers used in their original data, for fast cross-reference. The filenames for each processed sample follow the structure `<NETWORK>-<Larson's sample folder>.zip`, where `<NETWORK>` can be `tiramisu`, `tiramisu_3d`, `unet`, or `unet_3d`. For example, results for the sample 232p3, wet, obtained with the 2D U-net network are given in the file `unet-rec20160318_191511_232p3_2cm_cont__4097im_1500ms_ML17keV_6.h5.zip`.
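When scripting downloads or cross-referencing against Larson's folders, the naming scheme above can be built and parsed programmatically. The sketch below assumes only what the scheme states; the helper names `result_filename` and `parse_result_filename` are hypothetical, and the parser relies on the observation that none of the four network names contains a hyphen.

```python
NETWORKS = ("tiramisu", "tiramisu_3d", "unet", "unet_3d")


def result_filename(network, sample_folder):
    """Build '<NETWORK>-<Larson's sample folder>.zip'."""
    if network not in NETWORKS:
        raise ValueError(f"unknown network: {network!r}")
    return f"{network}-{sample_folder}.zip"


def parse_result_filename(filename):
    """Split a result filename into (network, sample_folder)."""
    if not filename.endswith(".zip"):
        raise ValueError(f"not a .zip file: {filename!r}")
    # The network name has no hyphen, so the first one is the separator.
    network, sep, sample_folder = filename[:-len(".zip")].partition("-")
    if not sep or network not in NETWORKS:
        raise ValueError(f"unexpected filename: {filename!r}")
    return network, sample_folder


sample = "rec20160318_191511_232p3_2cm_cont__4097im_1500ms_ML17keV_6.h5"
name = result_filename("unet", sample)
print(name)
print(parse_result_filename(name))
```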
The file `coefficients.zip` contains:
1. the training coefficients for each network, where filenames follow the structure `larson_<NETWORK>.hdf5`;
2. the accuracy, loss, validation accuracy and validation loss we obtained during our training process, where filenames follow the structure `larson_<NETWORK>.hdf5-<MEASURE>.csv` and `<MEASURE>` can be `accuracy`, `loss`, `val_accuracy`, or `val_loss`, respectively;
3. the output of the training and prediction steps in our study, where filenames follow the structure `output.train_<NETWORK>.txt` and `output.predict_<NETWORK>.txt`, respectively.
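To confirm that a downloaded copy of `coefficients.zip` is complete, the filename patterns listed above can be expanded per network and compared with the archive contents. This is a sketch under assumptions: the helper names are hypothetical, the archive's internal folder layout is not documented here, so entries are matched by base name only, and the presence of every file for all four networks is inferred from the patterns rather than stated.

```python
import posixpath
import zipfile

NETWORKS = ("tiramisu", "tiramisu_3d", "unet", "unet_3d")
MEASURES = ("accuracy", "loss", "val_accuracy", "val_loss")


def expected_names(network):
    """List the filenames the naming patterns imply for one network."""
    weights = f"larson_{network}.hdf5"
    histories = [f"{weights}-{measure}.csv" for measure in MEASURES]
    logs = [f"output.train_{network}.txt", f"output.predict_{network}.txt"]
    return [weights, *histories, *logs]


def missing_from_archive(archive, networks=NETWORKS):
    """Return the expected filenames that are absent from a zip archive.

    `archive` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        # Compare base names, since the internal layout is an unknown.
        present = {posixpath.basename(entry) for entry in zf.namelist()}
    return sorted(name
                  for network in networks
                  for name in expected_names(network)
                  if name not in present)


print(expected_names("unet"))
```

Calling `missing_from_archive("coefficients.zip")` on a local copy would return an empty list if every implied file is present.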
Gordon and Betty Moore Foundation, Award: GBMF3834
Alfred P. Sloan Foundation, Award: 2013-10-27