Data from: Enhanced combinatorial analysis of tumor cell-ECM interactions using design-of-experiment optimized microarrays
Data files
May 14, 2026 version files 172.52 GB
- Anova_Comparisons.Rmd (9.93 KB)
- Attachment.zip (31.45 GB)
- EdU_intx_plots.Rmd (28.26 KB)
- Finding_Rep_Images.Rmd (9.14 KB)
- Fluorescent_Fibronectin.zip (18.28 GB)
- Models_to_Run_for_Figures_2_to_4.Rmd (5.87 KB)
- Models_to_Run.Rmd (38.46 KB)
- Phenotypic_Responses.zip (98.59 GB)
- Plotting_the_Data.Rmd (66.02 KB)
- Proliferation_and_Survival.zip (24.20 GB)
- README.md (8.51 KB)
- Report_2025-05-12_HCPC.Rmd (2.51 KB)
- Report_2025-05-12_PCA.Rmd (9.81 KB)
- Setting_up_the_Data.Rmd (27.03 KB)
- Survival_Plots.Rmd (29.95 KB)
Abstract
The dysregulated and fibrotic tumor microenvironment of hepatocellular carcinoma (HCC) delays diagnosis and presents many complex signals that drive disease progression. To better recapitulate this microenvironment, we have enhanced our established protein microarray platform by integrating the design of experiments (DoE) methodology with high-throughput cell microarray screening. This innovative approach systematically interrogates the intricate roles of matrix stiffness (spanning healthy and fibrotic conditions), extracellular matrix (ECM) composition, and protein concentration, while simultaneously examining their interdependent interactions. By leveraging DoE principles, we were able to explore 117 unique microenvironments on a single microscope slide, ultimately generating a comprehensive dataset of 234 different microenvironments without compromising statistical rigor. Our enhanced screening system enabled the identification of unique microenvironmental interactions critically significant in dictating cellular responses, including adhesion, survival, proliferation, epithelial-to-mesenchymal transition, and drug resistance markers. Utilizing advanced statistical techniques such as linear models and principal component analysis, we characterized phenotypic clusters defined by precise microenvironmental cues. This work presents a robust, high-throughput microarray screening system that comprehensively explores the contributions of 9 physiologically relevant extracellular matrix proteins and matrix stiffness in modulating cellular behavior and disease progression through a methodologically sophisticated and statistically sound approach.
Dataset DOI: 10.5061/dryad.x3ffbg7xc
Description of the data and file structure
The data were collected through experiments on protein microarrays. Cells were seeded on these microarrays, imaged on an Axioscan slide scanner, and then processed and analyzed with Cell Profiler. All downstream analysis was performed in R.
Files and variables
Variables: Most columns in the Cell Profiler outputs are generated automatically by the software; see the Cell Profiler manuals for their interpretation.
For "Metadata_" columns:
- Column - column of island in grid
- Experiment - biological replicate
- Part - technical replicate on slide
- Row - row of island in grid
- Slide - technical replicates in biological replicate
- Stiffness - stiffness of polyacrylamide gel on slide
- Treatment - Sorafenib Treatment
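As a minimal sketch of how these metadata columns can be used, the R snippet below groups a Cell Profiler style table by island and counts the objects on each one. The full column names (`Metadata_Row`, `Metadata_Stiffness`, and so on) are assumed from the "Metadata_" prefix, and the toy values, including the "soft"/"stiff" labels, are placeholders rather than values taken from the deposited files.

```r
# Toy stand-in for a Cell Profiler per-object table (one row per detected object).
# Column names are ASSUMED from the "Metadata_" prefix; values are placeholders.
cp <- data.frame(
  Metadata_Experiment = c(1, 1, 1, 2, 2),
  Metadata_Slide      = c(1, 1, 2, 1, 1),
  Metadata_Part       = c(1, 1, 1, 2, 2),
  Metadata_Row        = c(3, 3, 5, 1, 1),
  Metadata_Column     = c(2, 2, 4, 7, 7),
  Metadata_Stiffness  = c("soft", "soft", "stiff", "soft", "soft"),
  Metadata_Treatment  = c("DMSO", "DMSO", "Sorafenib", "DMSO", "DMSO"),
  AreaShape_Area      = c(210, 190, 305, 250, 260)
)

# All metadata columns together identify one island on one slide.
island_keys <- grep("^Metadata_", names(cp), value = TRUE)

# Count objects (for example nuclei) per island.
cells_per_island <- aggregate(
  list(n_cells = cp$AreaShape_Area),
  by  = cp[island_keys],
  FUN = length
)
print(cells_per_island)
```

With real data, the data frame would instead come from `read.csv()` on one of the Cell Profiler output files.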
File: Attachment.zip
Description: Contains the CZI images and raw Cell Profiler output for the attachment (cell number) experiments.
Subfolders:
BAA e2 CZIs.zip - A zip file containing all CZI images from the first experimental replicate for attachment (cell number) assessment as well as the raw Cell Profiler output.
BAA e3 CZIs.zip - A zip file containing all CZI images from the second experimental replicate for attachment (cell number) assessment as well as the raw Cell Profiler output.
BAA e4 CZIs.zip - A zip file containing all CZI images from the third experimental replicate for attachment (cell number) assessment as well as the raw Cell Profiler output.
File: Fluorescent_Fibronectin.zip
Description: Contains the exported images (TIFFs) and analysis of fluorescent fibronectin intensity quantification for Fig S1.
Subfolders:
FFN e1 Images.zip - TIFF images of the first experiment of fluorescent fibronectin experiments.
FFN e2 Images.zip - TIFF images of the second experiment of fluorescent fibronectin experiments.
FFN e4 Images.zip - TIFF images of the third experiment of fluorescent fibronectin experiments.
FFN_Combined_Analysis.zip - Contains csv files and R files for analysis of fluorescent fibronectin intensity.
CSV Files: Data outputs from ImageJ analysis of fluorescent quantification:
"All_Data_e1.csv", "All_Data_e2.csv", "AllDatae2.csv", "AllDatae3.csv" - Files from ImageJ quantification
"All_Fridge_w_FOC.csv" - Combined data from all experiments for analysis
R Files: "FFN_Combined.Rproj", "FFN_Compiled Math.nb.html", "FFN_Combined Math.Rmd"
Variables:
- Color: Channel measured (red or green)
- Stiffness: stiffness of polyacrylamide slide
- Treatment: Percentage (ratio) of fluorescent red protein used
- Slide: Technical replicate
- Column: island location in grid (column is technical replicates of ratio)
- Mean Intensity: AIU of fluorescent intensity for channel measured
File: Proliferation_and_Survival.zip
Description: Contains the CZI images and raw Cell Profiler Output for the proliferation (EdU) and drug survival (Sorafenib) experiments.
Subfolders:
Cell Profiler Output - CSV files output by Cell Profiler: "EdU_Nuclei.csv" (green channel measurement for EdU), "EdUPositiveNuclei.csv" (cells on each island that the program identified as EdU positive), and "blue_objects_take2.csv" (DAPI channel data for cells on each island).
3DP CZIs.zip - Zip file of all CZIs of all islands in these experiments identified by experiment, slide, scene, and island location.
File: Phenotypic_Responses.zip
Description: Contains the CZI images and raw Cell Profiler Output for the phenotypic responses (E-cadherin, Vimentin, and PXR) experiments.
Subfolders:
Cell Profiler Output - Contains csv files with the Cell Profiler output for each fluorescent channel: "blue_objects" (DAPI), "green_objects" (Vimentin), "orange_objects" (E-cadherin), and "red_objects" (PXR).
DREMT CZIs-selected.zip - Zip file of all CZIs of all islands for phenotypic response experiments.
File: Report_2025-05-12_HCPC.Rmd
Description: It performs PCA-based hierarchical clustering (HCPC) to group individuals into 4 clusters and describe each cluster’s biological characteristics using variable contributions and visualizations.
File: Report_2025-05-12_PCA.Rmd
Description: The hierarchical clustering output from Facto_Shiny for the PCA performed. It uses PCA to reduce dimensionality, interpret key variability (63.83%), and identify patterns and clusters among biological variables and samples using HCPC.
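The PCA followed by hierarchical clustering described for the two report files can be approximated in base R. This is only a sketch on simulated data: the reports themselves were generated through Facto_Shiny, whose HCPC routine may include steps not reproduced here, and the marker names and the choice of two components are assumptions made for illustration.

```r
set.seed(1)

# Simulated stand-in: 40 samples x 5 markers (NOT the deposited data).
x <- matrix(rnorm(40 * 5), nrow = 40,
            dimnames = list(NULL, paste0("marker", 1:5)))

# PCA on centred, unit-variance variables.
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Proportion of variance carried by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cat("Variance explained by PC1 + PC2:", sum(var_explained[1:2]), "\n")

# Ward hierarchical clustering on the first two PCA scores, cut into 4 clusters.
hc <- hclust(dist(pca$x[, 1:2]), method = "ward.D2")
clusters <- cutree(hc, k = 4)
print(table(clusters))
```

Clustering on a few leading components rather than on the raw variables is a common way to reduce noise before cutting the tree.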
File: Models_to_Run.Rmd
Description: The linear model fittings for the different responses in this paper. This script builds mixed-effects models on low-dispersion DMSO data to identify significant biomolecular factors and interactions influencing cell markers, validating that the models are stable enough for reliable prediction.
File: Plotting_the_Data.Rmd
Description: Plotting of the phenotypic response data. This script visualizes complex relationships in the dataset (ECM composition, stiffness, ratios, and treatments) using boxplots, interaction plots, heatmaps, and correlation analyses. It helps interpret how different microenvironment factors influence cell phenotypes (PXR, Ecad, Vimentin, survival) and validates statistical findings through graphical exploration.
File: Setting_up_the_Data.Rmd
Description: Pre-processing of the data prior to plotting. This script processes raw multi-channel cell imaging data by cleaning, merging, normalizing (via quantile normalization), and assigning experimental conditions like ECM composition, stiffness, and treatment. It then aggregates the data at multiple levels (image, slide, experiment) and encodes precise ECM mixture ratios and replicates to create a structured dataset ready for downstream statistical analysis and modeling.
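For readers unfamiliar with quantile normalization, a minimal base-R sketch follows. It is a generic textbook version and not necessarily the routine used in Setting_up_the_Data.Rmd; the function name `quantile_normalize` and the layout of the toy matrix (islands in rows, slides in columns) are assumptions.

```r
# Quantile-normalise the columns of a numeric matrix (no missing values).
# After normalisation every column has the same set of values.
quantile_normalize <- function(m) {
  ranks  <- apply(m, 2, rank, ties.method = "first")  # rank within each column
  sorted <- apply(m, 2, sort)                         # sort each column
  target <- rowMeans(sorted)                          # mean value at each rank
  out <- apply(ranks, 2, function(r) target[r])       # replace ranks by means
  dimnames(out) <- dimnames(m)
  out
}

# Toy intensities: rows are islands, columns are slides (hypothetical layout).
m <- cbind(slide1 = c(5, 2, 3, 4),
           slide2 = c(4, 1, 4, 2),
           slide3 = c(3, 4, 6, 8))

qn <- quantile_normalize(m)
print(qn)
```

Ties are broken by order of appearance here (`ties.method = "first"`); other implementations average tied ranks instead, which gives slightly different values for tied entries.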
File: Anova_Comparisons.Rmd
Description: Creating a visual representation of the ANOVA outputs for each linear model. This R Markdown script compares ANOVA significance results across conditions (e.g., Day 0 vs Day 2, survival, proliferation) by merging datasets, converting significance levels into numeric values, and reshaping data for analysis.
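One way to convert significance levels into numeric values and reshape them for comparison is sketched below. The p-values, term names, and the 0 to 3 scoring scheme are hypothetical and chosen only for illustration; the actual coding used in Anova_Comparisons.Rmd may differ.

```r
# Hypothetical p-values for the same model terms under two conditions.
anova_tbl <- data.frame(
  term = c("Stiffness", "ECM", "Stiffness:ECM"),
  day0 = c(0.0004, 0.03, 0.20),
  day2 = c(0.008, 0.0009, 0.06)
)

# Ordinal score: 3 if p < 0.001, 2 if p < 0.01, 1 if p < 0.05, otherwise 0.
sig_score <- function(p) {
  3L - findInterval(p, c(0.001, 0.01, 0.05))
}

scores <- data.frame(
  term = anova_tbl$term,
  day0 = sig_score(anova_tbl$day0),
  day2 = sig_score(anova_tbl$day2)
)

# Reshape to long format: one row per term and condition.
long <- reshape(scores, direction = "long",
                varying = c("day0", "day2"), v.names = "score",
                timevar = "condition", times = c("day0", "day2"),
                idvar = "term")
print(long)
```

The long table can then be passed to a plotting function to draw a term-by-condition significance grid.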
File: EdU_intx_plots.Rmd
Description: Plots exploring the interactions between EdU conditions. This script generates detailed visualizations (interaction plots, boxplots, heatmaps, and scatter plots) to explore how ECM composition, stiffness, ratios, and drug treatments affect cell proliferation (EdU positivity). It highlights significant main effects and interactions (e.g., Stiffness×Ratio, Treatment×ECM) and supports statistical findings by visually identifying trends and biologically meaningful patterns.
File: Finding_Rep_Images.Rmd
Description: Statistical tests and plots to identify candidate conditions for representative images. This script identifies statistically significant ECM, stiffness, and treatment conditions and generates targeted plots to select representative experimental images that best illustrate key biological differences.
File: Models_to_Run_for_Figures_2_to_4.Rmd
Description: Linear models for attachment, survival, and proliferation data. This script builds statistical mixed-effects models (linear and generalized) to quantify how ECM components, stiffness, treatment, and their interactions influence cell proliferation (EdU) and attachment/survival. It evaluates significance using ANOVA, calculates model performance metrics, and accounts for experimental variability using nested random effects (experiment, block, replicate).
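A linear mixed-effects model with nested random effects, of the general kind described here, can be sketched with the `nlme` package that ships with standard R installations. Everything in this example is an assumption made for illustration: the data are simulated, the factor names (`stiffness`, `ecm`), the nesting of slide within experiment, and the effect sizes are invented, and the deposited script may use a different package or model formula.

```r
library(nlme)
set.seed(42)

# Simulated design: 3 experiments x 4 slides x 10 islands (NOT the real data).
d <- expand.grid(island = 1:10, slide = 1:4, experiment = 1:3)
d$stiffness <- factor(ifelse(d$slide <= 2, "soft", "stiff"))
d$ecm       <- factor(ifelse(d$island %% 2 == 0, "A", "B"))

# Random shifts per experiment and per slide within experiment.
exp_eff   <- rnorm(3, sd = 0.5)[d$experiment]
slide_id  <- interaction(d$experiment, d$slide)
slide_eff <- rnorm(nlevels(slide_id), sd = 0.3)[as.integer(slide_id)]

d$response <- 1 + 0.8 * (d$stiffness == "stiff") + 0.4 * (d$ecm == "B") +
  exp_eff + slide_eff + rnorm(nrow(d), sd = 0.5)

d$experiment <- factor(d$experiment)
d$slide      <- factor(d$slide)

# Fixed effects: stiffness, ECM, and their interaction.
# Random intercepts: slide nested within experiment.
fit <- lme(response ~ stiffness * ecm,
           random = ~ 1 | experiment / slide,
           data = d)

print(anova(fit))
```

The `experiment / slide` term tells the model that slides sharing a number in different experiments are distinct units, which is what "nested" means in this context.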
File: Survival_Plots.Rmd
Description: Plots of the survival data. This script visualizes cell survival (FOC) across ECM compositions, stiffness, and drug treatments using boxplots, scatter plots, heatmaps, and interaction plots. It highlights how microenvironment factors influence survival outcomes and identifies conditions that decrease, maintain, or enhance cell viability relative to control.
Code/software
All code is written in R as R Markdown (.Rmd) files. Each script includes a section that loads the packages required to run it.
Access information
Other publicly accessible locations of the data:
