Data from: phaser: A unified and extensible framework for fast electron ptychography
Data files
Jan 21, 2026 version files 2.96 GB
- phaser_data.zip (2.96 GB)
- README.md (7.13 KB)
Abstract
Electron ptychography is a groundbreaking technique for the advanced characterization of materials. Here, we present phaser, an open-source Python package that provides a unified interface to both conventional and gradient-descent-based ptychographic algorithms. This record provides the data required to reproduce the benchmarks and results in our paper. It contains raw experimental and simulated 4D-STEM datasets, as well as reconstruction plan files and reconstruction outputs.
This repository contains the data required to reproduce the results in the manuscript.
To run the reconstructions and view the reconstructed data, download and install phaser (https://github.com/hexane360/phaser).
Detailed contents by subfolder
fig4_benchmark
Contains scripts and raw data for performance benchmarking. Benchmark results are recorded as newline-delimited JavaScript object notation (NDJSON) files, with the following columns:
- engine: Which engine was used ('lsqml', 'grad', 'epie')
- backend: Which computational backend was used ('jax', 'cupy', 'torch' (PtyRAD), or 'matlab' (fold_slice))
- sim_size: Size of diffraction patterns in reciprocal space
- n_positions: # of scan positions in dataset
- n_slices: # of object slices
- grouping: Reconstruction grouping/batch size
- device: GPU device reconstruction was performed on
- code: Version of code used, e.g. 'v4' or 'fold_slice'. For PtyRAD, "compile" indicates the torch.compile feature was used.
- iter_times: List of iteration times in seconds
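The benchmark records can be summarized with standard-library Python. The following is a minimal sketch, not the analysis used in the paper: it assumes each line of a benchmark file is one JSON object with the fields listed above and that iter_times is a flat list of numbers. The function names and the skip_first warm-up parameter are illustrative choices.

```python
import json
from collections import defaultdict
from statistics import median


def load_ndjson(path):
    """Read one JSON object per non-empty line of an NDJSON file."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def median_iter_time(records, skip_first=1):
    """Median iteration time in seconds for each (engine, backend) pair.

    skip_first drops leading warm-up iterations (for example, JIT
    compilation). This is an illustrative assumption, not necessarily
    the procedure used for the published benchmarks.
    """
    grouped = defaultdict(list)
    for rec in records:
        times = rec["iter_times"][skip_first:]
        grouped[(rec["engine"], rec["backend"])].extend(times)
    # Pairs with no remaining timings are omitted from the result.
    return {key: median(times) for key, times in grouped.items() if times}
```

Calling median_iter_time(load_ndjson(path)) on one of the benchmark files returns a dictionary keyed by (engine, backend). Grouping by other fields, such as device or n_slices, works the same way.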
Scripts for the fold_slice benchmarks are in the fold_slice directory.
The JSON files in the fold_slice directory contain reconstruction parameters, the run_foldslice.sh file runs benchmarks, and the *.log files contain reconstruction output logs (including iteration times).
These files are processed by process_logs_mos2.py and process_logs_si.py to collect the benchmark outputs.
Scripts for the PtyRAD benchmarks are in the ptyrad directory.
The YAML files contain reconstruction parameters, the run_benchmarks.py script runs benchmarks,
and the *_log.txt files contain reconstruction output logs (including iteration times). The process_logs.py script collects the benchmark outputs. Additionally, the system_info.txt file contains system information as recorded by PtyRAD.
fig5_comparison
Contains scripts and reconstructed data for the engine comparison in fig. 5.
Contains the following files and folders:
- *_study.py: Hyperoptimization scripts to find optimal reconstruction parameters for each engine. Data is automatically compared to ground_truth_PrScO3_300kV.tif
- (epie|lsqml|grad)_trial####.json: Reconstruction plan file with optimized reconstruction parameters
- (epie|lsqml|grad)_trial### folder: Reconstruction outputs from optimized reconstruction
  - iter1000.h5: Final reconstruction state
  - iter1000_error.png: Comparison of ground truth to reconstructed data using optimal reconstruction parameters.
  - object_phase_sum_iter1000.tiff: Reconstructed object phase sum
  - probe_iter1000.tiff: Reconstructed real-space probe (separated into modes)
  - probe_recip_iter1000.tiff: Reconstructed reciprocal-space probe (separated into modes)
  - scan_iter1000.svg: Reconstructed scan positions
- PSO_foldslice.json: Reconstruction parameters for fold_slice
- foldslice folder: Reconstruction outputs from fold_slice:
  - Niter1000.mat: Final fold_slice output
  - iter1000.h5: Final output converted to phaser format
  - iter1000_error.png: Comparison of ground truth to reconstructed data
  - foldslice.log: Log file from fold_slice
ground_truth_PrScO3_300kV.tif is the mean, thermally averaged phase for PrScO3 from simulation, using Kirkland parameterization and isotropic Debye-Waller factors. The image is floating point, in units of rad/Å. The pixel sampling in angstroms is recorded in the TIFF metadata.
fig6_exp
Contains reconstruction plan files and outputs for selected experimental datasets. Additionally contains ground_truth_BTO_300kV.tif and ground_truth_Si_300kV.tif, which are simulated ground truths used as a real-space error metric for the Si and BTO datasets.
Contains the following files and folders:
- prsco3.yaml, bto.yaml, and si.yaml: Reconstruction plans for each dataset
- prsco3, bto, and si folders: Reconstruction outputs for each dataset, with the following files:
  - iter500.h5: Final reconstruction state
  - iter500_error.png: Comparison of ground truth to reconstructed data
  - probe_iter500.tiff: Reconstructed probe modes
  - probe_recip_iter500.tiff: Reconstructed probe modes (in reciprocal space)
  - scan_iter500.svg: Reconstructed scan positions
- ground_truth_(Si|BTO)_300kV.tif: Simulated ground truths for Si and BTO datasets
fig7_regularization
Regularization results are stored in results.ndjson with the following columns:
- name: Name of data point
- path: Original path to reconstructed data
- study: Study # (1-5)
- study_i: # of data point in study
- iters: List of iterations real-space errors were measured at
- errors: Real-space RMS errors recorded at those iterations
- fourier_iters: List of iterations loss was measured at
- fourier_errors: Losses measured at those iterations
- eps, obj_l1, obj_l2, obj_tikh, layers_tikh: Values of regularization parameters
- plan: String containing reconstruction plan file used at that datapoint
Additionally, si_center.yaml contains the center point of the search, i.e., the reconstruction with default values for each regularization parameter.
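As one way to explore results.ndjson, the sketch below picks, for each study, the data point with the lowest final real-space error. It assumes that errors is ordered like iters, so that its last entry belongs to the final measured iteration. The function names are illustrative and not part of phaser.

```python
import json


def parse_ndjson(text):
    """Parse NDJSON text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def best_per_study(records):
    """Return {study: (name, final_error)} with the lowest final error.

    Assumes the last entry of 'errors' is the error at the final
    measured iteration; data points with no recorded errors are skipped.
    """
    best = {}
    for rec in records:
        if not rec["errors"]:
            continue
        final_error = rec["errors"][-1]
        study = rec["study"]
        if study not in best or final_error < best[study][1]:
            best[study] = (rec["name"], final_error)
    return best
```

Reading the file contents into a string and passing it through parse_ndjson and then best_per_study gives one (name, final_error) pair per study. The regularization values for that data point (eps, obj_l1, and so on) can then be looked up in the matching record.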
fig8_depth
Contains reconstruction plan file (si_depth_20_layers_1.00e+01.yaml) and output (iter305.h5) for the simulated Si dataset with Sn interstitials.
raw_data
This contains two raw datasets. The first is an experimental BaTiO3 dataset, courtesy of Dr. Jingrui Wei.
The second is a simulated Si dataset containing Sn interstitials at specified depths.
This dataset was simulated with pyMultislicer (https://github.com/LeBeauGroup/pyMultislicer).
Details of the simulation parameters are found in the methods section of the main paper.
Contains the following files and folders:
- BTO
  - scan_x90_y90.raw: Raw data in EMPAD file format (packed float32 values)
  - acq10_of15nm.xml: Raw EMPAD metadata (XML)
  - acq10_of15nm_calib.json: Processed metadata file describing dataset parameters. Metadata schema is located in the phaser source code under phaser/io/empad.py
- Si_110_Sn_300kV_conv25_defocus10_tds
  - Si_110_Sn_300kV_conv25_defocus10_tds_280.34_dstep<dstep>.json: Processed metadata file for detector diffraction pixel size <dstep> (in mrad/px).
  - Si_110_Sn_300kV_conv25_defocus10_tds_280.34_dstep<dstep>_x80_y80_4DSTEM.raw: Raw simulated data in EMPAD file format for detector diffraction pixel size <dstep> (in mrad/px).
  - Si_110_Sn_300kV_conv25_defocus10_tds_280.34_bf_sum.tif: Simulated bright-field STEM image (5 mrad outer collection angle).
  - Si_110_Sn_300kV_conv25_defocus10_tds_280.34_pacbed.tif: Simulated position-averaged convergent beam electron diffraction (PACBED) pattern
  - Si_110_Sn_300kV_conv25_defocus10_tds_scan.npy: Simulated scan positions, numpy .npy file format
  - Si_110_Sn_300kV_conv25_defocus10_tds_scan.mat: Simulated scan positions, MATLAB .mat file format
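The .raw files are meant to be read by phaser through their metadata files. For quick inspection outside phaser, packed float32 data can also be read with NumPy. The sketch below rests on assumptions that are not stated in this README: little-endian values, frames stored one scan position after another in C order, and a default frame of 130 x 128 pixels of which the first 128 rows hold diffraction data (a layout commonly used for EMPAD files). The function name and its parameters are illustrative.

```python
import numpy as np


def load_packed_float32(path, scan_shape, frame_shape=(130, 128), crop_rows=128):
    """Load a headerless file of packed float32 values as a 4D array.

    ASSUMPTIONS (not taken from the README): values are little-endian,
    frames are stored scan position by scan position in C order, each
    frame has frame_shape pixels, and only the first crop_rows rows of
    a frame hold diffraction data. Check the accompanying metadata
    before relying on these defaults.
    """
    n_expected = int(np.prod(scan_shape)) * int(np.prod(frame_shape))
    data = np.fromfile(path, dtype="<f4")
    if data.size != n_expected:
        raise ValueError(f"expected {n_expected} values, found {data.size}")
    data = data.reshape(*scan_shape, *frame_shape)
    # Drop any trailing rows of each frame that are not diffraction data.
    return data[..., :crop_rows, :]
```

The size check fails loudly if the assumed shapes do not match the file. The file names suggest scan sizes (for example, x90_y90), but the scan dimensions, axis order, and frame size should be confirmed against the metadata files listed above.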
