Data from: Label-free timing analysis of SiPM-based modularized detectors with physics-constrained deep learning

Ai, Pengcheng1; Xiao, Le1; Deng, Zhi2; Wang, Yi2; Sun, Xiangming1; Huang, Guangming1; Wang, Dong1; Li, Yulei2; Ran, Xinchi2

Published Oct 25, 2023 on Dryad. https://doi.org/10.5061/dryad.qv9s4mwkj

Data files

Oct 25, 2023 version files 1 GB

README.md (8.88 KB)
temp.zip (1 GB)

Abstract

Pulse timing is an important topic in nuclear instrumentation, with far-reaching applications from high energy physics to radiation imaging. While high-speed analog-to-digital converters are becoming increasingly capable and accessible, their potential uses and merits in nuclear detector signal processing remain uncertain, partly because the associated timing algorithms are not fully understood or utilized.

In the paper "Label-free timing analysis of SiPM-based modularized detectors with physics-constrained deep learning", we propose a novel deep learning method for timing analysis of modularized detectors that does not require explicitly labelled event data. By taking advantage of the intrinsic time correlations, we form a label-free loss function with a specially designed regularizer to supervise the training of neural networks towards a meaningful and accurate mapping function. We mathematically demonstrate the existence of the optimal function desired by the method, and give a systematic algorithm for training and calibrating the model. The proposed method is validated on two experimental datasets that use silicon photomultipliers (SiPMs) as the main transducers:

- In the toy experiment, we collect data from a pair of SiPM sensors sharing a common laser source. The neural network model achieves a single-channel time resolution of 8.8 ps and exhibits robustness against concept drift in the dataset.
- In the electromagnetic calorimeter experiment, we collect data from an eight-channel calorimeter module. Several neural network models (Fully-Connected, Convolutional Neural Network and Long Short-Term Memory) are tested to show their conformance to the underlying physical constraint and to compare their performance with traditional methods.

Overall, the proposed method works well under both ideal and noisy experimental conditions and recovers the time information from waveform samples successfully and precisely. The dataset in this repository serves as a basis for similar research on the timing performance of SiPM-based nuclear detectors, and on the application of neural networks to typical signals of nuclear radiation detectors.

Introduction

This repository holds the computer code and raw data needed to reproduce the results in the paper "Label-free timing analysis of SiPM-based modularized detectors with physics-constrained deep learning".

In the paper "Label-free timing analysis of SiPM-based modularized detectors with physics-constrained deep learning", we propose a novel deep learning method for timing analysis of modularized detectors that does not require explicitly labelled event data. By taking advantage of the intrinsic time correlations, we form a label-free loss function with a specially designed regularizer to supervise the training of neural networks towards a meaningful and accurate mapping function. We mathematically demonstrate the existence of the optimal function desired by the method, and give a systematic algorithm for training and calibrating the model. The proposed method is validated on two experimental datasets that use silicon photomultipliers (SiPMs) as the main transducers:

- In the toy experiment, we collect data from a pair of SiPM sensors sharing a common laser source. The neural network model achieves a single-channel time resolution of 8.8 ps and exhibits robustness against concept drift in the dataset.
- In the electromagnetic calorimeter (ECAL) experiment, we collect data from an eight-channel calorimeter module. Several neural network models (Fully-Connected (FC), Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM)) are tested to show their conformance to the underlying physical constraint and to compare their performance with traditional methods.

Description of the data and file structure

After downloading the data and software files, unzip temp.zip and PhyECAL.zip, then place the folder temp under the root folder of PhyECAL so that the code can locate the data.
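The unpacking step can also be scripted. The following is a minimal sketch, assuming that PhyECAL.zip unpacks to a top-level folder named PhyECAL and temp.zip to a top-level folder named temp; the function name `assemble_project` is hypothetical and not part of the repository.

```python
import shutil
import zipfile
from pathlib import Path


def assemble_project(download_dir, code_zip="PhyECAL.zip", data_zip="temp.zip"):
    """Unpack both archives and move the temp folder under the PhyECAL root."""
    download_dir = Path(download_dir)
    for name in (code_zip, data_zip):
        with zipfile.ZipFile(download_dir / name) as archive:
            archive.extractall(download_dir)

    # Assumed layout: the archives unpack to "PhyECAL/" and "temp/".
    root = download_dir / "PhyECAL"
    extracted = download_dir / "temp"
    target = root / "temp"
    if not root.is_dir() or not extracted.is_dir():
        raise FileNotFoundError(
            "unexpected archive layout; move the temp folder into the code root by hand"
        )
    if target.exists():
        raise FileExistsError(f"{target} already exists; refusing to overwrite it")
    shutil.move(str(extracted), str(target))
    return root
```

Calling `assemble_project(".")` in the download folder would return the path of the project root. If the archives unpack differently, the function raises an error instead of guessing.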

root directory: contains the main routine scripts to train neural networks (NNs), and README.md (this file).

./s_toy_routine.py: Python script to train NNs on the toy experiment.
./s_basic_routine.py: Python script to train NNs on the ECAL experiment.
./README.md: This file.

./conf/ directory: configuration files for main routine scripts.

laser_in2048_[cluster]_[frequency]_2ch_internal.yaml: Configuration files for the toy experiment. Use the optional [cluster] field to select data, and the optional [frequency] field to preprocess data with a low-pass filter.
ecal_[network]_in800_8ch_internal.yaml: Configuration files for the ECAL experiment. Use [network] to select the model trained on the dataset (leave it empty for CNN).

./src/ directory: Python source code invoked by the main routine scripts.

./src/algorithm/slewing_correction.py: The slewing correction algorithm for timing.
./src/impl/[name]_bind_model.py: Implementation of neural network models.
./src/build_model.py: Basic model building routines (only CNN supported).
./src/build_model_v2.py: Advanced model building routines (CNN, LSTM and FC supported).
./src/data_provider.py: Wrapper to provide data objects.
./src/quan_aux.py: Quantization auxiliary classes.
./src/model_generate.py: Some model routines (save model, export model and save evaluation results).
./src/util.py: Utility functions.

./temp/ directory (data file): raw data (in .npz format), NN model weights and run results.

./temp/laser_in2048_2ch/export_[cluster]/: Exported model weights (with or without clustering) for the toy experiment.
./temp/laser_in2048_2ch/model_[cluster]/: Saved models (with or without clustering) in TensorFlow-compatible format for the toy experiment.
./temp/laser_in2048_2ch/result_[cluster]/: Saved evaluation results (with or without clustering) for the toy experiment.
./temp/laser_in2048_2ch/laser_in2048_2ch-s_toy_data.npz: Raw waveform data file for the toy experiment.
./temp/laser_in2048_2ch/cluster_index.npz: Clustering indexes for the toy experiment.
./temp/laser_in2048_2ch/nn_data_var_chk.npz: Data checkpoint to keep indexes of training set, validation set and test set for the toy experiment.
./temp/ecal_in800_8ch/export/: Exported model weights for the ECAL experiment.
./temp/ecal_in800_8ch/model/: Saved models in TensorFlow-compatible format for the ECAL experiment.
./temp/ecal_in800_8ch/result/: Saved evaluation results for the ECAL experiment.
./temp/ecal_in800_8ch/wave_raw.npz: Raw waveform data file for the ECAL experiment.
./temp/ecal_in800_8ch/nn_data_var_chk.npz: Data checkpoint to keep indexes of training set, validation set and test set for the ECAL experiment.
./temp/plot_compare_res_cache.npz: Cache file to save source data of figures.
./temp/test_on_linear_cut_selected_index.npz: Selected indexes for linear cut of figures.

./test/ directory: Python scripts to draw figures in the manuscript.

./test/plot_analyze_fit.py: Analysis of the timing results for the toy experiment.
./test/plot_compare_res.py: Comparison of the NN and the traditional method with different low-pass filters for the toy experiment.
./test/plot_s_toy_detail.py: Detailed description of the NN algorithm for the toy experiment.
./test/plot_s_basic_calibre.py: Calibration plots for the ECAL experiment.
./test/plot_s_basic_on_thresh.py: Performance of NNs and traditional methods on different Pearson thresholds for the ECAL experiment.

How to acquire raw data

1. Make sure you have Python and NumPy installed on your computer.
2. In the root directory (mentioned above), run python or python3 to start the Python interpreter.
3. The following commands give access to the raw data of the toy experiment:
   - import numpy as np
   - content = np.load("./temp/laser_in2048_2ch/laser_in2048_2ch-s_toy_data.npz")
   - content["inputs"] # data used to train neural networks, with shape (10024, 2, 4000) representing (#examples, #channels, #sampling points)
   - content["targets"] # original data with baseline cancellation, but without re-sampling
   - content["labels"] # placeholder data, not used
4. The following commands give access to the raw data of the ECAL experiment:
   - import numpy as np
   - content = np.load("./temp/ecal_in800_8ch/wave_raw.npz")
   - content["data"] # raw data collected by the ECAL detector, with shape (16178, 8, 1000) representing (#examples, #channels, #sampling points)
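To check what a data file contains before working with it, the arrays can be listed together with their shapes and data types. This is a minimal sketch; the helper name `summarize_npz` is hypothetical and not part of the repository.

```python
import numpy as np


def summarize_npz(path):
    """Return {array name: (shape, dtype)} for every array stored in an .npz file."""
    summary = {}
    with np.load(path) as content:
        for name in content.files:
            array = content[name]  # each array is read from disk on access
            summary[name] = (array.shape, str(array.dtype))
    return summary
```

For example, `summarize_npz("./temp/ecal_in800_8ch/wave_raw.npz")` should list the "data" array described above. Because every array is read once, the call may take a while on the larger files.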

Sharing/Access information

None.

Code/Software

Prerequisites

The program has been tested with the following setup:

python==3.9.5
tf-nightly-gpu==2.7.0.dev20210730
keras-nightly==2.7.0.dev2021073000
tensorflow-model-optimization==0.6.0
numpy==1.19.5
scipy==1.6.2
matplotlib==3.3.4
pandas==1.3.0
pyyaml==5.4.1

Newer versions may also work.
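To see how a local environment differs from the tested one, the installed package versions can be compared with the list above. This is a minimal sketch using only the Python standard library; the names `TESTED_VERSIONS` and `compare_environment` are hypothetical, and the Python version itself is not checked.

```python
from importlib import metadata

# Versions copied from the tested setup listed above.
TESTED_VERSIONS = {
    "tf-nightly-gpu": "2.7.0.dev20210730",
    "keras-nightly": "2.7.0.dev2021073000",
    "tensorflow-model-optimization": "0.6.0",
    "numpy": "1.19.5",
    "scipy": "1.6.2",
    "matplotlib": "3.3.4",
    "pandas": "1.3.0",
    "pyyaml": "5.4.1",
}


def compare_environment(tested=TESTED_VERSIONS):
    """Return {package: (tested version, installed version or None)}."""
    report = {}
    for package, wanted in tested.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # not installed under this distribution name
        report[package] = (wanted, installed)
    return report
```

A package reported as `None` is not necessarily missing: a stable TensorFlow release, for instance, is installed under a different distribution name than the nightly build listed here.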

Basic usage

Generate figures in the paper

All figures of test results can be generated by Python scripts in the ./test/ directory. For example:

python test/plot_s_basic_calibre.py

will generate calibration plots for NN models in the ECAL experiment.

Re-train models (Overwrite)

To re-train models in the toy experiment:

python s_toy_routine.py --config_file [configuration file]

The configuration files for the toy experiment are located in the conf directory and start with laser_in2048.

To re-train models in the ECAL experiment:

python s_basic_routine.py --config_file [configuration file]

The configuration files for the ECAL experiment are located in the conf directory and start with ecal.
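The mapping between configuration files and routine scripts described above can be sketched as a small helper that lists one training command per configuration file. The names `ROUTINES` and `training_commands` are hypothetical; only the script names, the `--config_file` option and the file name prefixes come from this README.

```python
from pathlib import Path

# File name prefix of a configuration -> main routine script that consumes it.
ROUTINES = {
    "laser_in2048": "s_toy_routine.py",  # toy experiment
    "ecal": "s_basic_routine.py",        # ECAL experiment
}


def training_commands(conf_dir="conf"):
    """Build one training command per .yaml configuration file in conf_dir."""
    commands = []
    for config in sorted(Path(conf_dir).glob("*.yaml")):
        for prefix, script in ROUTINES.items():
            if config.name.startswith(prefix):
                commands.append(["python", script, "--config_file", config.as_posix()])
    return commands
```

Each returned list can be passed to `subprocess.run` from the project root. Note that re-training with an unmodified configuration overwrites the saved models, so it is worth printing the commands and reviewing them first.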

Create your own configuration

To create a new configuration file, follow the steps below:

1. Copy an existing configuration file (for example: conf/laser_in2048_cluster_2p_2ch_internal.yaml).
2. Change save_prefix (under supp) in the new configuration file to a new name (required if you do not want to overwrite existing outputs).
3. Change other values in the new configuration file as needed.
4. Run the new configuration file with the main routine in the project root (for example: python s_toy_routine.py --config_file [your new configuration file]).
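The copy-and-rename steps above can be sketched with the standard library alone. This sketch assumes that the configuration is block-style YAML with exactly one line of the form `save_prefix: value`; the function name `derive_config` is hypothetical. Editing the text directly, instead of re-serializing the YAML, keeps the rest of the file unchanged, although a trailing comment on the save_prefix line would be dropped.

```python
import re
from pathlib import Path


def derive_config(source, destination, new_prefix):
    """Copy a configuration file, replacing the value of its save_prefix key."""
    source, destination = Path(source), Path(destination)
    if destination.exists():
        raise FileExistsError(f"{destination} already exists")
    text = source.read_text(encoding="utf-8")
    # Assumption: a single "save_prefix: value" line in block-style YAML.
    pattern = re.compile(r"^([ \t]*save_prefix[ \t]*:[ \t]*).*$", flags=re.MULTILINE)
    if len(pattern.findall(text)) != 1:
        raise ValueError("expected exactly one save_prefix line; edit the copy by hand")
    updated = pattern.sub(lambda match: match.group(1) + new_prefix, text)
    destination.write_text(updated, encoding="utf-8")
    return destination
```

The new file can then be passed to the main routine with `--config_file`. The remaining values still have to be adjusted by hand.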
