Demo dataset for: SPACEc, a streamlined, interactive Python workflow for multiplexed image processing and analysis

Tan, Yuqi; Kempchen, Tim

Published Jul 08, 2024 on Dryad. https://doi.org/10.5061/dryad.brv15dvj1

Data files

Jul 08, 2024 version files 414.46 MB

example_data.zip
414.46 MB
README.md
2.28 KB

Abstract

Multiplexed imaging technologies provide insights into complex tissue architectures. However, challenges arise from software fragmentation with cumbersome data handoffs, inefficiencies in processing large images (8 to 40 gigabytes per image), and limited spatial analysis capabilities. To efficiently analyze multiplexed imaging data, we developed SPACEc, a scalable end-to-end Python solution that handles image extraction, cell segmentation, and data preprocessing and incorporates machine-learning-enabled, multi-scale spatial analysis, all operated through a user-friendly, interactive interface.

The demonstration dataset was derived from a previous analysis and contains TMA cores from a human tonsil and a tonsillitis sample, acquired with the Akoya PhenoCycler Fusion platform. The dataset can be used to test the workflow, to set it up on a user’s system, or to become familiar with the pipeline.

Descriptions

tonsil_TMA.tif

Example TIFF file used as input for tissue extraction and segmentation. The image shows two TMA cores: a human tonsil sample (right) and a tonsillitis sample (left). The image was acquired on the Akoya PhenoCycler Fusion platform. The tissue was stained with a panel of 58 oligo-tagged antibodies and DAPI.

channelnames.txt

Text file that contains the channel names for the tif file. This file is used as input for image segmentation along with the image.

adata_nn_demo_tonsil.h5ad

An AnnData object saved as an h5ad file. The data were generated with SPACEc: the example image was segmented using Mesmer, and intensities were normalized using Z-score normalization. Data were filtered as outlined in our example workflow (see GitHub). The object holds the normalized intensity data for each cell in the X slot and all metadata, including the cell centroid coordinates, in the adata.obs slot. It also contains previously annotated data that serve as a reference for SVM-based cell-type annotation.
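The Z-score normalization mentioned above can be illustrated with a minimal sketch. It assumes that normalization is applied per marker (column) across all cells (rows); this is an assumption for illustration, and the exact procedure used by SPACEc may differ. The function name and the toy values are hypothetical.

```python
import numpy as np


def zscore_per_marker(intensities):
    """Z-score each marker (column) across all cells (rows).

    Assumption: one row per cell, one column per marker.
    """
    intensities = np.asarray(intensities, dtype=float)
    mean = intensities.mean(axis=0)
    std = intensities.std(axis=0)
    # Avoid division by zero for markers with constant intensity.
    std = np.where(std == 0, 1.0, std)
    return (intensities - mean) / std


# Hypothetical toy matrix: 4 cells x 2 markers (values are made up).
raw = np.array([[10.0, 200.0], [20.0, 400.0], [30.0, 600.0], [40.0, 800.0]])
normalized = zscore_per_marker(raw)
```

After this transformation each marker has a mean of zero and unit standard deviation across cells, which puts markers with very different raw intensity ranges on a comparable scale.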

Description of the data and file structure

The folder structure is shown below:

❖    Data
➢   raw
   ■      tonsil_tma.tiff
   ■      channelNames.txt
➢   processed
   ■      adata_nn_demo_tonsil.h5ad

TIFF files are most commonly visualized with FIJI (ImageJ); QuPath and Python can also open large, stacked TIFF files. The h5ad file can be opened in Python with the Scanpy package, which provides user-friendly functions for working with single-cell data in the AnnData format. Details can be found at https://github.com/yuqiyuqitan/SPACEc.
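When working with the image in Python, it helps to map marker names to image planes. The sketch below assumes that the channel-name text file lists one channel name per line, in the same order as the planes of the TIFF stack; this layout is an assumption and should be checked against the actual file. The helper names and the demo marker list are hypothetical.

```python
import tempfile
from pathlib import Path


def read_channel_names(path):
    """Return the non-empty lines of a channel-name file, in file order."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def channel_index(names, marker):
    """Return the 0-based plane index of a marker name."""
    try:
        return names.index(marker)
    except ValueError:
        raise KeyError(f"marker {marker!r} not found in channel names") from None


# Demo with a made-up three-channel file (not the real panel).
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "channelnames_demo.txt"
    demo_file.write_text("DAPI\nCD3\nCD20\n", encoding="utf-8")
    names = read_channel_names(demo_file)

cd20_plane = channel_index(names, "CD20")
```

With the image stack loaded as an array (for example with the tifffile package), the returned index could be used to select the matching plane, provided the channel axis comes first.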

To open the .h5ad file in Python:

# in a terminal
pip install scanpy
python
# within the Python interpreter
import scanpy as sc
# the file name must be passed as a quoted string
adata = sc.read("adata_nn_demo_tonsil.h5ad")

Code/Software

The Anndata object was created using SPACEc 0.0.8. Further information on how to use the example dataset and demonstration notebooks can be found here: https://spacec.readthedocs.io/en/stable/readme.html

Tissue samples:

Tonsil cores were extracted from a larger multi-tumor tissue microarray (TMA), which included a total of 66 unique tissues (51 malignant and semi-malignant tissues, as well as 15 non-malignant tissues). Representative tissue regions were annotated on corresponding hematoxylin and eosin (H&E)-stained sections by a board-certified surgical pathologist (S.Z.). The annotations were used to generate the 66 cores, each 1 mm in diameter. FFPE tissue blocks were retrieved from the tissue archives of the Institute of Pathology, University Medical Center Mainz, Germany, and the Department of Dermatology, University Medical Center Mainz, Germany. The multi-tumor TMA block was sectioned at 3 µm thickness onto SuperFrost Plus microscopy slides before being processed for CODEX multiplex imaging as previously described.

CODEX multiplexed imaging and processing

To run the CODEX machine, the slide was taken from the storage buffer and placed in PBS for 10 minutes to equilibrate. After drying the PBS with a tissue, a flow cell was sealed onto the tissue slide. The assembled slide and flow cell were then placed in PhenoCycler Buffer made from 10X PhenoCycler Buffer & Additive for at least 10 minutes before starting the experiment. A 96-well reporter plate was prepared with each reporter corresponding to the correct barcoded antibody for each cycle, with up to 3 reporters per cycle per well. The fluorescent reporters were mixed with 1X PhenoCycler Buffer, Additive, nuclear-staining reagent, and assay reagent according to the manufacturer's instructions. With the reporter plate and the assembled slide and flow cell placed into the CODEX machine, the automated multiplexed imaging experiment was initiated. Each imaging cycle included steps for reporter binding, imaging of three fluorescent channels, and reporter stripping to prepare for the next cycle and set of markers. This was repeated until all markers were imaged. After the experiment, a .qptiff image file containing the individual antibody channels and the DAPI channel was obtained. Image stitching, drift compensation, deconvolution, and cycle concatenation were performed within the Akoya PhenoCycler software.

The raw imaging data (qptiff, 377.442 nm per pixel for 20x CODEX) were first examined with QuPath software (https://qupath.github.io/) to inspect staining quality. Markers that produced unexpected patterns or low signal-to-noise ratios were excluded from further analysis. A custom CODEX analysis pipeline was used to process all acquired CODEX data (scripts available upon request). Because SPACEc requires tiff input, the qptiff files were converted into tiff files for tissue detection (watershed algorithm) and cell segmentation.
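The stated pixel size can be used to convert between pixel units and physical distances, for example when interpreting cell centroid coordinates. The following is a minimal sketch; the function names are hypothetical, and it assumes that coordinates are given in pixels of the 20x image at the pixel size quoted above.

```python
NM_PER_PIXEL = 377.442  # pixel size stated for 20x CODEX
NM_PER_UM = 1_000.0


def pixels_to_um(pixels, nm_per_pixel=NM_PER_PIXEL):
    """Convert a length in pixels to micrometers."""
    return pixels * nm_per_pixel / NM_PER_UM


def um_to_pixels(um, nm_per_pixel=NM_PER_PIXEL):
    """Convert a length in micrometers to pixels."""
    return um * NM_PER_UM / nm_per_pixel


# Nominal 1 mm core diameter expressed in pixels (about 2649 pixels).
core_diameter_px = um_to_pixels(1_000.0)
```

Under these assumptions, a single TMA core spans roughly 2,650 pixels in each direction, which gives a sense of the expected size of each core in the example image.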