Segmented high-resolution transmission electron microscopy images of nanoparticles
Data files
Jul 31, 2023 version files 33.69 GB
- 2021_08_26_5nm_Au_nanoparticles_on_C.zip
- 2021_09_02_10nm_Au_nanoparticles_on_UTC.zip
- 2021_10_06_20nm_Au_nanoparticles_on_UTC.zip
- 2021_10_06_2p2nm_Au_nanoparticles_on_UTC.zip
- 2021_10_06_5nm_Au_nanoparticles_on_UTC.zip
- 2021_12_15_5nm_Au_nanoparticles_on_5nm_SiN.zip
- 2022_02_02_5nm_Ag_nanoparticles_on_UTC.zip
- 2022_06_15_5nm_Au_nanoparticle_on_UTC_noise_series.zip
- 2022_10_25_5nm_Au_nanoparticles_on_UTC.zip
- 2022_11_09_5nm_Au_nanoparticles_on_UTC.zip
- 2023_01_23_5nm_CdSe_nanoparticles_on_UTC.zip
- Ag_5nm_330kx_423e_Std_UTC_FFCorr_Team05_Images.h5
- Ag_5nm_330kx_423e_Std_UTC_FFCorr_Team05_Labels.h5
- Au_10nm_330kx_425e_Std_UTC_FFCorr_Team05_Images.h5
- Au_10nm_330kx_425e_Std_UTC_FFCorr_Team05_Labels.h5
- Au_2p2nm_330kx_423e_Std_UTC_FFCorr_Team05_Images.h5
- Au_2p2nm_330kx_423e_Std_UTC_FFCorr_Team05_Labels.h5
- Au_5nm_160kx_425e_Std_UTC_FFCorr_Team05_Images.h5
- Au_5nm_160kx_425e_Std_UTC_FFCorr_Team05_Labels.h5
- Au_5nm_205kx_420e_Std_UTC_FFCorr_Team05_Images.h5
- Au_5nm_205kx_420e_Std_UTC_FFCorr_Team05_Labels.h5
- Au_5nm_260kx_450e_Std_UTC_FFCorr_Team05_Images.h5
- Au_5nm_260kx_450e_Std_UTC_FFCorr_Team05_Labels.h5
- Au_5nm_330kx_421e_UTC_Team05_Std_Sess1109FFCorr_Images.h5
- Au_5nm_330kx_421e_UTC_Team05_Std_Sess1109FFCorr_Labels.h5
- Au_5nm_330kx_423e_UTC_Team05_Std_Sess0821FFCorr_Images.h5
- Au_5nm_330kx_423e_UTC_Team05_Std_Sess0821FFCorr_Labels.h5
- Au_5nm_330kx_423e_UTC_Team05_Std_Sess1006FFCorr_Images.h5
- Au_5nm_330kx_423e_UTC_Team05_Std_Sess1006FFCorr_Labels.h5
- Au_5nm_330kx_425e_UTC_Team05_Std_Sess1025FFCorr_Images.h5
- Au_5nm_330kx_425e_UTC_Team05_Std_Sess1025FFCorr_Labels.h5
- Au_5nm_330kx_80e_Std_UTC_FFCorr_Team05_Images.h5
- Au_5nm_330kx_80e_Std_UTC_FFCorr_Team05_Labels.h5
- Au_5nm_330kx_884e_Std_UTC_FFCorr_Team05_Images.h5
- Au_5nm_330kx_884e_Std_UTC_FFCorr_Team05_Labels.h5
- CdSe_5nm_330kx_421e_Std_UTC_FFCorr_Team05_Images.h5
- CdSe_5nm_330kx_421e_Std_UTC_FFCorr_Team05_Labels.h5
- Processed_datasets_metadata.csv
- raw_data_metadata.csv
- README.md
Abstract
A collection of high-resolution transmission electron microscopy (HRTEM) images of crystalline nanoparticles on amorphous substrates and their corresponding segmentation maps. It includes 407 raw camera images with segmentation maps, as well as 13 curated datasets created from this larger repository.
Images cover a variety of microscope magnifications (0.02-0.042 nm/pixel), electron dosages (80-884 e/Å²), nanoparticle diameters (2.2-20 nm), nanoparticle materials (Au, Ag, CdSe), and substrates (ultrathin C, SiN), and were acquired over multiple microscope sessions. Segmentation maps were created by a single human labeler. Digital images are provided to aid in viewing data.
Each dataset has been curated by metadata characteristics, including acquisition date, microscope magnification, electron dosage, nanoparticle material, and nanoparticle diameter. The datasets were processed from the raw image data into a format suited to machine learning training; processing included flat-field correction, pixel value standardization, and image patching.
Methods
TEM samples were created by dropcasting commercially purchased nanoparticles on either an ultrathin C or SiN grid. HRTEM images of nanoparticles were collected on a TEAM 0.5 aberration-corrected transmission electron microscope with a OneView (Gatan) camera at full resolution (4096x4096 pixels). Segmentation labels were created by a single human labeler using LabelBox. Digital images of the data (under a variety of color mapping protocols) were created and uploaded to LabelBox, where they were then labeled.
To create datasets, images were first selected using metadata criteria (i.e. microscope parameters, nanoparticle characteristics), and then processed into a dataset. Processing included, in order: 1) removal of x-ray artifacts, 2) flat-field correction, 3) image standardization, 4) division into smaller patches, and 5) removal of majority-background patches. X-ray artifacts were removed by replacing outlier pixels, defined as those more than a threshold (1500 counts) above the image mean, with the average of their surrounding pixels. For flat-field correction, the uneven illumination was estimated using iterative weighted linear regression to a 2D Bezier basis (n=2, m=2) and divided out. Images were then individually standardized (mean=0, std=1). The original 4096x4096 pixel images and corresponding labels were then divided into 64 patches of 512x512 pixels each, and patches that consisted mostly of substrate were removed. For more details, see our code here.
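The processing order described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code: all function names are hypothetical, the outlier replacement uses a 3x3 neighborhood, the flat-field step is a single unweighted least-squares fit to a degree-2 Bernstein (Bezier) basis rather than the iterative weighted regression used for the datasets, and the `min_particle_fraction` cutoff is a placeholder because the text does not state the actual criterion for "mostly substrate".

```python
from math import comb

import numpy as np


def remove_xray_artifacts(image, threshold=1500.0):
    """Replace pixels more than `threshold` counts above the image mean
    with the mean of their non-outlier 3x3 neighbours (assumed window)."""
    img = image.astype(np.float64)
    outliers = img > img.mean() + threshold
    cleaned = img.copy()
    padded = np.pad(img, 1, mode="reflect")
    padded_mask = np.pad(outliers, 1, mode="reflect")
    for r, c in zip(*np.nonzero(outliers)):
        window = padded[r:r + 3, c:c + 3]
        good = ~padded_mask[r:r + 3, c:c + 3]
        # Fall back to the image median if every neighbour is an outlier.
        cleaned[r, c] = window[good].mean() if good.any() else np.median(img)
    return cleaned


def bernstein_basis(n_points, degree=2):
    """Bernstein polynomials of the given degree sampled on [0, 1]."""
    u = np.linspace(0.0, 1.0, n_points)
    return np.stack(
        [comb(degree, i) * u**i * (1 - u) ** (degree - i) for i in range(degree + 1)],
        axis=1,
    )


def flat_field_correct(image, degree=2, stride=8):
    """Simplified flat-field correction: one unweighted least-squares fit of a
    2D Bernstein surface on a subsampled grid, then divide it out."""
    by = bernstein_basis(image.shape[0], degree)
    bx = bernstein_basis(image.shape[1], degree)
    sy, sx = by[::stride], bx[::stride]
    design = np.einsum("yi,xj->yxij", sy, sx).reshape(sy.shape[0] * sx.shape[0], -1)
    target = image[::stride, ::stride].ravel()
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    background = np.einsum("yi,xj,ij->yx", by, bx, coeffs.reshape(degree + 1, degree + 1))
    return image / background


def standardize(image):
    """Shift and scale a single image to mean 0 and standard deviation 1."""
    return (image - image.mean()) / image.std()


def split_into_patches(array, patch_size=512):
    """Split a 2D array into non-overlapping square patches (row-major order)."""
    rows, cols = array.shape[0] // patch_size, array.shape[1] // patch_size
    trimmed = array[: rows * patch_size, : cols * patch_size]
    return (
        trimmed.reshape(rows, patch_size, cols, patch_size)
        .swapaxes(1, 2)
        .reshape(rows * cols, patch_size, patch_size)
    )


def drop_background_patches(image_patches, label_patches, min_particle_fraction=0.01):
    """Keep patches whose labelled (non-zero) fraction reaches the cutoff.
    The cutoff value is a placeholder, not the one used for the datasets."""
    fraction = (label_patches > 0).mean(axis=(1, 2))
    keep = fraction >= min_particle_fraction
    return image_patches[keep], label_patches[keep]
```

Applied to one 4096x4096 image, the steps would be chained as `remove_xray_artifacts`, `flat_field_correct`, `standardize`, then `split_into_patches` on both the image and its label map, followed by `drop_background_patches`.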
Usage notes
Raw image data are saved in the dm3 format, which can be opened with Digital Micrograph (also known as Gatan Microscopy Suite), ImageJ/Fiji, or the ncempy Python package. The raw data are stored in folders, where each folder generally corresponds to a single microscope session. raw_data_metadata.csv provides the metadata and locations for all of the image files.
Datasets and their corresponding labels are stored as h5 files, which can be opened using the h5py Python package. Images are stored under the key 'images' and segmentation labels are stored under the key 'labels'. Processed_datasets_metadata.csv provides the metadata for the processed datasets and their associated segmentation label files. For an example of opening and using these datasets, see our code here.
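As a minimal sketch of the ncempy route, a raw dm3 file could be read as below. The helper name `load_dm3` is hypothetical and the file path must be taken from raw_data_metadata.csv; ncempy is imported inside the function so the snippet can be loaded even where the package is not installed.

```python
def load_dm3(path):
    """Read one raw dm3 image and return (data, pixel_size, pixel_unit).

    Hypothetical helper; relies on ncempy's dm reader, which returns a
    dictionary containing the image array and its pixel calibration.
    """
    from ncempy.io import dm  # requires the ncempy package

    result = dm.dmReader(path)
    return result["data"], result["pixelSize"], result["pixelUnit"]


# Example call (placeholder path, not an actual file name from the archive):
# data, pixel_size, pixel_unit = load_dm3("path/to/image.dm3")
```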
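A minimal sketch of loading one processed dataset with h5py is shown below. It assumes that the 'images' key lives in the `*_Images.h5` file, the 'labels' key in the matching `*_Labels.h5` file, and that the first axis of each array indexes patches; the function name `load_dataset` is hypothetical.

```python
import h5py
import numpy as np


def load_dataset(images_path, labels_path):
    """Load image patches and segmentation labels into NumPy arrays."""
    with h5py.File(images_path, "r") as f:
        images = f["images"][:]
    with h5py.File(labels_path, "r") as f:
        labels = f["labels"][:]
    # Assumption: one label map per image patch, matched by position.
    if images.shape[0] != labels.shape[0]:
        raise ValueError("images and labels contain different numbers of patches")
    return np.asarray(images), np.asarray(labels)


# Example call using a file pair from the list above:
# images, labels = load_dataset(
#     "Au_5nm_330kx_80e_Std_UTC_FFCorr_Team05_Images.h5",
#     "Au_5nm_330kx_80e_Std_UTC_FFCorr_Team05_Labels.h5",
# )
```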