Spatial profiling of benign and malignant melanocytic tumors via RNA-SMI (CosMx)
Data files
May 03, 2024 version files 7.80 GB
- FOVs_for_Slide_4.csv
- FOVs_for_Slides_1-3.csv
- InSituType_package_materials.zip
- processed_metadata_Slide_4.csv
- processed_metadata_Slides_1-3.csv
- README.md
- Slide_1.zip
- Slide_2.zip
- Slide_3.zip
- Slide_4.zip
Abstract
Melanoma clinical outcomes emerge from incompletely understood genetic mechanisms operating within the tumor and its microenvironment. Here, we utilized single-cell RNA-based spatial molecular imaging (RNA-SMI) in patient-derived archival tumors to reveal clinically relevant markers of malignancy progression and prognosis. We examined spatial gene expression of 203,472 cells inside benign and malignant melanocytic neoplasms, including melanocytic nevi, primary invasive and metastatic melanomas. Algorithmic cell clustering paired with intratumoral comparative 2D-analyses visualized synergistic, spatial gene signatures linking cellular proliferation, metabolism, and malignancy, validated by protein expression. Metastatic niches included upregulation of CDK2 and FABP5, which independently predicted poor clinical outcome in 473 melanoma patients via Cox regression analysis. More generally, our work demonstrates a framework for applying single-cell RNA-SMI technology toward identifying gene regulatory landscapes pertinent to cancer progression and patient survival.
README: Spatial Profiling of Benign and Malignant Melanocytic Tumors via RNA-SMI (CosMx)
https://doi.org/10.5061/dryad.ksn02v7b1
Description of the data and file structure
"Slide_1.zip", "Slide_2.zip", "Slide_3.zip", "Slide_4.zip":
Slide_1.zip, Slide_2.zip, and Slide_3.zip contain the RNA-SMI data for Slides 1-3, which examined 203,472 cells among ten melanocytic tumors (tumor identity per slide is outlined in fig. S1; the distribution of microscopic fields of view per slide is outlined in fig. S2). Slide_4.zip contains the RNA-SMI data for Slide 4, which examined 84,312 cells, including the nevus-melanoma mixed tumor featured in fig. 4 as well as four other melanocytic tumors (one melanoma, one cutaneous metastasis, and two nevi).
Specifically, each .zip folder contains the following files and folders:
Files:
Transcript file (tx_file.csv), which contains columns for:
- fov (Field Of View [FOV] where the transcript is located)
- cell_ID (unique identifier of a cell within a FOV; together, the “fov” and “cell_ID” columns define a unique identifier for each cell in the entire sample. Note, transcripts without an assigned cell have a value of 0)
- x_global_px (see “x_local_px” description below; the global position is the relative transcript position within the large sample reference frame)
- y_global_px (as “x_global_px” but for y dimension)
- x_local_px (x position of the transcript within the FOV, measured in pixels. Note, to convert to microns, multiply the pixel value by 0.18 um per pixel)
- y_local_px (as “x_local_px” but for y dimension)
- target (HUGO gene symbol of the target)
- CellComp (subcellular compartment, i.e., nuclear, membrane, or cytoplasmic, in which the transcript was detected, as assigned by the cell segmentation algorithm; note, “0” denotes extracellular, i.e., a transcript without an assigned cell, as mentioned above)
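The column descriptions above can be sketched in code as follows. This is a minimal illustration using only the Python standard library, not part of the published analysis. The key format `f{fov}_c{cell_ID}`, the function names, and the demo rows (including the `CellComp` value) are assumptions made for illustration; only the column names and the 0.18 um per pixel factor come from this README.

```python
import csv
import io

UM_PER_PX = 0.18  # microns per pixel, as stated in this README


def px_to_um(value_px):
    """Convert a pixel coordinate or length to microns."""
    return float(value_px) * UM_PER_PX


def cell_key(fov, cell_id):
    """Combine fov and cell_ID into one sample-wide identifier.

    The "f{fov}_c{cell_ID}" format is a hypothetical convention.
    Returns None for cell_ID 0 (transcript not assigned to a cell).
    """
    if int(cell_id) == 0:
        return None
    return f"f{int(fov)}_c{int(cell_id)}"


def read_transcripts(handle):
    """Yield one dict per transcript with a cell key and micron coordinates."""
    for row in csv.DictReader(handle):
        yield {
            "cell": cell_key(row["fov"], row["cell_ID"]),
            "target": row["target"],
            "x_local_um": px_to_um(row["x_local_px"]),
            "y_local_um": px_to_um(row["y_local_px"]),
        }


# Tiny in-memory stand-in for tx_file.csv (all values are made up).
demo = io.StringIO(
    "fov,cell_ID,x_global_px,y_global_px,x_local_px,y_local_px,target,CellComp\n"
    "1,12,1000,2000,100,200,CDK2,Nuclear\n"
    "1,0,1500,2500,600,700,FABP5,0\n"
)
rows = list(read_transcripts(demo))
```

For a real slide, the same generator could be fed an open file handle for tx_file.csv instead of the in-memory demo.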
Cell polygons file (polygons.csv), which contains simple polygon descriptions of cell boundaries in columns:
- fov (Field Of View [FOV] where the cell is located)
- cell_ID (unique identifier of a cell within a FOV; together, the “fov” and “cell_ID” columns define a unique identifier for each cell in the entire sample)
- x_local_px (x position within the FOV, measured in pixels; note, to convert to microns, multiply the pixel value by 0.18 um per pixel)
- y_local_px (as “x_local_px” but for y dimension)
- x_global_px (relative x position of the FOV, measured in pixels. Note, to convert to microns, multiply the pixel value by 0.18 um per pixel)
- y_global_px (as “x_global_px” but for y dimension)
Cell expression file (exprMat_file.csv), which contains gene expression counts per cell per gene in columns:
- fov (Field Of View [FOV] where the cell is located)
- cell_ID (unique identifier of a cell within a FOV; together, the “fov” and “cell_ID” columns define a unique identifier for each cell in the entire sample; note, transcripts without an assigned cell have a value of 0)
- Alphabetical columns of gene targets (number of transcripts detected per gene target per cell)
- Columns of negative probes (probes that do not match any sequence within the transcriptome, which can be used to assess background levels)
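One simple way to use the negative probe columns as a background estimate is to average their counts over all segmented cells. The sketch below is an illustration only, not the method used in the study. The `NegPrb` column prefix, the function name, and the demo counts are assumptions; check the actual header of exprMat_file.csv before relying on the prefix.

```python
import csv
import io


def mean_negative_count(handle, neg_prefix="NegPrb"):
    """Mean count per negative probe per cell, over rows assigned to a cell.

    neg_prefix is an assumed naming convention for negative probe columns.
    """
    reader = csv.DictReader(handle)
    neg_cols = [c for c in reader.fieldnames if c.startswith(neg_prefix)]
    if not neg_cols:
        raise ValueError(f"no columns start with {neg_prefix!r}")
    total = 0.0
    n_cells = 0
    for row in reader:
        if int(row["cell_ID"]) == 0:  # counts not assigned to any cell
            continue
        total += sum(float(row[c]) for c in neg_cols)
        n_cells += 1
    if n_cells == 0:
        raise ValueError("no rows with a nonzero cell_ID")
    return total / (n_cells * len(neg_cols))


# Tiny in-memory stand-in for exprMat_file.csv (all values are made up).
demo = io.StringIO(
    "fov,cell_ID,CDK2,FABP5,NegPrb1,NegPrb2\n"
    "1,0,7,9,3,5\n"
    "1,1,4,10,0,1\n"
    "1,2,2,6,1,0\n"
)
background = mean_negative_count(demo)
```

Gene counts that sit near this per-probe average are hard to distinguish from background, so the value can serve as a rough reference when filtering lowly expressed targets.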
Cell metadata file (metadata_file.csv), which contains metadata per cell in columns:
- fov (Field Of View [FOV] where the cell is located)
- cell_ID (unique identifier of a cell within a FOV; together, the “fov” and “cell_ID” columns define a unique identifier for each cell in the entire sample; note, transcripts without an assigned cell have a value of 0)
- Area (total pixels per given cell)
- Aspect ratio (width divided by height)
- CenterX_global_px (see “CenterX_local_px” description below; global positions describe the relative position of the cell center within the large sample reference frame)
- CenterY_global_px (as “CenterX_global_px” but for y dimension)
- CenterX_local_px (x position of the cell center within the FOV in pixels; the pixel edge length is 180 nm. Thus, to convert to microns, multiply the pixel value by 0.18 um per pixel)
- CenterY_local_px (as “CenterX_local_px” but for y dimension)
- Width (maximum cell length in x dimension in pixels)
- Height (maximum cell length in y dimension in pixels)
- Mean.MembraneStain (mean fluorescence intensity of a given cell’s membrane stain, i.e., CD298)
- Max.MembraneStain (max fluorescence intensity of a given cell’s membrane stain, i.e., CD298)
- Mean.S100b.PMEL17 (mean fluorescence intensity of a given cell’s S100B/PMEL17 stain)
- Max.S100b.PMEL17 (max fluorescence intensity of a given cell’s S100B/PMEL17 stain)
- Mean.CD45 (mean fluorescence intensity of a given cell’s CD45 stain)
- Max.CD45 (max fluorescence intensity of a given cell’s CD45 stain)
- Mean.CD3 (mean fluorescence intensity of a given cell’s CD3 stain)
- Max.CD3 (max fluorescence intensity of a given cell’s CD3 stain)
- Mean.DAPI (mean fluorescence intensity of a given cell’s DAPI stain)
- Max.DAPI (max fluorescence intensity of a given cell’s DAPI stain)
FOV Positions File (fov_positions_file.csv), which provides each FOV location within the total structure of the sample in columns:
- fov (Field Of View [FOV])
- x_global_px (relative x position of the FOV in pixels; to convert to microns, multiply the pixel value by 0.18 um per pixel; NB, all FOVs are 5472 x 3648 pixels)
- y_global_px (as “x_global_px” but for y dimension)
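The physical size of one FOV follows directly from the pixel dimensions and the pixel size given above. The short sketch below works through that arithmetic. It assumes that 5472 is the x (width) dimension and 3648 the y (height) dimension; the function names are illustrative.

```python
UM_PER_PX = 0.18       # microns per pixel, as stated in this README
FOV_WIDTH_PX = 5472    # assumed to be the x dimension
FOV_HEIGHT_PX = 3648   # assumed to be the y dimension


def fov_size_um():
    """Return (width, height) of one FOV in microns."""
    return FOV_WIDTH_PX * UM_PER_PX, FOV_HEIGHT_PX * UM_PER_PX


def fov_area_mm2():
    """Return the area of one FOV in square millimeters."""
    width_um, height_um = fov_size_um()
    return (width_um * height_um) / 1e6  # 1 mm^2 = 1e6 um^2


width_um, height_um = fov_size_um()
```

Under these assumptions each FOV spans about 985 x 657 microns, or roughly 0.65 square millimeters.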
Folders:
“CellOverlay”: Contains JPG images of each FOV showing cell boundaries from cell segmentation (shown as cyan lines) as well as DAPI stain in white.
“CellComposite”: Contains JPG images of each FOV showing the immunofluorescence from the IHC markers and DAPI used in the SMI experiment.
“CompartmentLabels”: Contains TIF images that display the subcellular compartment definitions for each FOV determined during cell segmentation. Each compartment type is given a unique number and all pixels determined to be within that compartment type have an intensity value matching that number.
“RawMorphologyImages”: Raw TIF image files for each FOV used for cell segmentation and to produce the CellComposite JPG images.
“CellLabels”: TIF files for each FOV that display the cell definitions determined during cell segmentation. Each cell identified is given a unique number (cell_ID) and all pixels determined to be within that cell have an intensity value matching that number. These cell_ID values are shared in the Transcript, Cell Expression, and Cell Metadata files. A pixel not assigned to a cell has a “0” value.
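Because every pixel in a CellLabels image stores the cell_ID it belongs to, counting pixels per value gives each cell's area. The sketch below shows the idea on a tiny made-up label grid using only the standard library; it is an illustration, not the segmentation pipeline itself. Loading the TIF into rows of integers is left to an image reader of your choice, and an array library would be much faster on full-size images. The function name and demo grid are hypothetical.

```python
from collections import Counter
from itertools import chain

UM_PER_PX = 0.18  # microns per pixel, as stated in this README


def cell_areas_px(label_rows):
    """Count pixels per cell_ID in a 2D label image given as rows of integers.

    Pixels with value 0 (not assigned to a cell) are excluded.
    """
    counts = Counter(chain.from_iterable(label_rows))
    counts.pop(0, None)
    return dict(counts)


# Made-up 3 x 4 label image: two cells (IDs 1 and 2) on a background of 0.
demo_labels = [
    [0, 1, 1, 0],
    [2, 1, 1, 0],
    [2, 2, 0, 0],
]
areas = cell_areas_px(demo_labels)

# Area scales with the square of the pixel edge length.
areas_um2 = {cid: n * UM_PER_PX ** 2 for cid, n in areas.items()}
```

Note that areas convert with the squared factor (0.18 x 0.18 = 0.0324 square microns per pixel), unlike the coordinates, which convert with the factor itself.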
Other files contained in this upload include:
- "processed_metadata_Slides_1-3.csv" and "processed_metadata_Slide_4.csv": These files contain processed data and cell-level analyses of Slides 1-3 and Slide 4, respectively, using the column headers described for the “tx_file.csv”, “polygons.csv”, “exprMat_file.csv”, “metadata_file.csv”, and “fov_positions_file.csv” files above.
- "FOVs_for_Slides_1-3.csv": Contains lesion descriptions of each FOV in Slides 1-3.
- "FOVs_for_Slide_4.csv": Contains lesion descriptions of each FOV in Slide 4.
- "InSituType_package_materials.zip": This self-contained .zip folder contains the InSituType package that was used during the "Semi-supervised cell type annotation" data analysis; this package was available on GitHub at https://github.com/Nanostring-Biostats/InSituType/tree/main as of April 10th, 2024.
Code/Software
Files are compatible with open-source packages such as Seurat in R or Squidpy in Python. More information about how to employ these open-source packages can be found at:
Seurat: https://satijalab.org/seurat/
Squidpy: https://squidpy.readthedocs.io/en/stable/notebooks/tutorials/tutorial_nanostring.html
Methods
NanoString® CosMx™ RNA Spatial Molecular Imaging (SMI)
The protocol used for NanoString® CosMx™ RNA Spatial Molecular Imaging (SMI) was based on the method previously described by He et al. (High-plex imaging of RNA and proteins at subcellular resolution in fixed tissue by spatial molecular imaging. Nat Biotechnol 40, 1794-1806. 2022). 5-μm formalin-fixed, paraffin-embedded (FFPE) tissue sections were mounted on VWR Superfrost Plus Micro slides (cat# 48311-703) and baked at 60°C overnight to improve tissue-slide adherence. The slides were prepared for in-situ hybridization (ISH) by heat-induced epitope retrieval (HIER) at 100°C for 15 min using ER1 epitope retrieval buffer (Leica Biosystems product, citrate-based, pH 6.0). Following HIER, the tissues were digested with 3 µg/ml Proteinase K diluted in ACD Protease Plus (Advanced Cell Diagnostics, Inc.) at 40°C for 30 min. Slides were washed twice with diethyl pyrocarbonate (DEPC)-treated water (DEPC H2O) and incubated in 0.0005% diluted fiducials (Bangs Laboratory, Inc.) in 2X SSCT (2X saline sodium citrate, 0.001% Tween-20) solution for 5 min at room temperature in the dark. Excess fiducials were rinsed from the slides with 1X phosphate buffered saline (PBS) and tissue sections were fixed with 10% neutral buffered formalin (NBF) for 5 min at room temperature. Fixed samples were rinsed twice with Tris-glycine buffer (0.1M glycine, 0.1M Tris-base in DEPC H2O) and once with 1X PBS for 5 min each before blocking with 100 mM N-succinimidyl (acetylthio) acetate (NHS-acetate, ThermoFisher) in NHS-acetate buffer (0.1M NaP, 0.1% Tween, pH 8, in DEPC H2O) for 15 min at room temperature. The sections were then rinsed with 2X saline sodium citrate (SSC) for 5 min and an Adhesive SecureSeal Hybridization Chamber (Grace Bio-Labs) was placed over the tissue.
NanoString® ISH probes were prepared by incubating at 95°C for 2 min and then placing on ice, and the ISH probe mix (1 nM 980 plex ISH probes, 1 nM custom probes, 10 nM Attenuation probes, 1X Buffer R, 0.1 U/μL SUPERase•In™ [ThermoFisher] in DEPC H2O) was pipetted into the hybridization chamber. The genes targeted for each analysis are listed in Data S3 and Data S7. The hybridization chamber was sealed to prevent evaporation, and hybridization was performed at 37°C overnight. Tissue sections were rinsed of excess probes in 2X SSCT for 1 min and washed twice in 50% formamide (VWR) in 2X SSC at 37°C for 25 min, then twice with 2X SSC for 2 min at room temperature, and blocked with 100 mM NHS-acetate in the dark for 15 min. A custom-made flow cell was affixed to the slide in preparation for loading onto the CosMx SMI instrument.
RNA target readout was performed as described in He et al. (High-plex imaging of RNA and proteins at subcellular resolution in fixed tissue by spatial molecular imaging. Nat Biotechnol 40, 1794-1806. 2022). Briefly, the assembled flow cell was loaded onto the CosMx SMI instrument and Reporter Wash Buffer was flowed to remove air bubbles. A preview scan of the entire flow cell was taken, and diagnostic areas of each tumor were targeted using 0.7 x 0.9 mm fields of view (FOVs, 21 to 25 per slide as listed in Fig. S2, Data S4) to match regions of interest identified by H&E staining of an adjacent serial section. RNA readout began by flowing 100 μl of Reporter Pool 1 into the flow cell and incubation for 15 min. Reporter Wash Buffer (1 mL) was flowed to wash unbound reporter probes, and Imaging Buffer was added to the flow cell for imaging. Nine Z-stack images (0.8 μm step size) for each FOV were acquired. Photocleavable linkers on the fluorophores of the reporter probes were released by UV illumination and washed with Strip Wash buffer. The fluidic and imaging procedure was repeated for the 16 reporter pools, and the 16 rounds of reporter hybridization-imaging were repeated multiple times to increase RNA detection sensitivity.
After RNA readout, tissue samples were incubated with a 4-fluorophore-conjugated antibody cocktail against CD298/B2M (488 nm), S100b/PMEL17 (532 nm), CD45 (594 nm), and CD3 (647 nm) proteins and DAPI stain in the CosMx SMI instrument for 2 h. After unbound antibodies and DAPI stain were washed with Reporter Wash Buffer, Imaging Buffer was added to the flow cell and nine Z-stack images for the 5 channels (4 antibodies and DAPI) were captured.
Cell segmentation of CosMx SMI data
CosMx data underwent cell segmentation according to methods described previously (He et al., High-plex imaging of RNA and proteins at subcellular resolution in fixed tissue by spatial molecular imaging. Nat Biotechnol 40, 1794-1806. 2022). This segmentation process employed a machine learning algorithm (C. Stringer, T. Wang, M. Michaelos, M. Pachitariu, Cellpose: a generalist algorithm for cellular segmentation. Nat Methods 18, 100-106. 2021; M. Pachitariu, C. Stringer, Cellpose 2.0: how to train your own model. Nat Methods 19, 1634-1641. 2022) that uses z-stack images of immunostaining and DAPI to delineate cell boundaries and subsequently assign transcripts to specific cell locations and subcellular compartments. The transcript profile of each cell was created by combining each target transcript's location with the cell boundaries established during segmentation.