Code for: Multi-modal screening for synergistic neuroprotection of mild extremely preterm brain injury: Cell counting code repository
Data files (Aug 14, 2025 version, 107.37 KB total)
- cell_counter_repo.zip (100.21 KB)
- README.md (7.16 KB)
Abstract
Preterm brain injury affects both white and grey matter, including altered cortical development and gyrification, with associated neurodevelopmental sequelae such as cerebral palsy and learning deficits. The preterm brain also displays regionally heterogeneous responses to both injury and treatment, supporting the need for drug combinations to provide global neuroprotection. We developed an extremely preterm-equivalent organotypic whole hemisphere (OWH) slice culture injury model using the gyrencephalic ferret brain to probe treatment mechanisms of promising therapeutic agents and their combination. Regional and global responses to injury and treatment were assessed by cell death quantification, machine learning-augmented morphological microglia assessments, and digital transcriptomics. Using two promising therapeutic agents, azithromycin (Az) and erythropoietin (Epo), we show minimal neuroprotection by either therapy alone, but evidence of synergistic neuroprotection by Az*Epo both globally and regionally. This effect of Az*Epo involved emergent augmentation of transcriptomic responses to injury related to neurogenesis and neuroplasticity and downregulation of transcripts involved in cytokine production, inflammation, and cell death. This study supports the use of the ferret OWH slice culture model to provide a powerful high-throughput platform to examine combinations of therapeutics for extremely preterm brain injury.
This repository supports our associated publication by Jin et al. in Bioengineering & Translational Medicine by supplying the code used to process histology images. The codebase identifies nuclei in DAPI-stained histology slides, measures geometry- and intensity-based properties, and then develops and applies a random forest model to classify nuclei as pyknotic or non-pyknotic based on those properties.
The program processes microscopy images to count cells and identify which are pyknotic. This is internal/development-grade source code, provided as-is primarily for reference purposes. However, we hope it will help other labs establish an automated method of analyzing histology images.
Please cite our work if you use this code. DOI: 10.5061/dryad.r4xgxd2qv
Installation Instructions
This program is written entirely in Python, version 3.10.11. Please ensure Python 3.10 is installed. The program was developed on macOS (Sequoia 15.3.2) and has also been tested on Ubuntu Linux 22.04. It should also run on Windows with appropriate configuration.
These directions are intended to help novice programmers get started. First, decompress cell_counter_repo.zip, which will create a directory called cell_counter_repo. In a terminal, navigate to this directory, likely by invoking cd cell_counter_repo. Then execute the commands below, which are written for Linux/macOS; adjustments may be needed for Windows and other operating systems.
- Create a virtual environment: `python3.10 -m venv .venv`
- Activate the virtual environment: `source .venv/bin/activate` (Windows PowerShell: `.\.venv\Scripts\Activate.ps1`)
- Upgrade pip and install the dependencies: `pip install --upgrade pip && pip install -r requirements.txt`
Congratulations, you have installed cell_counter on your computer!
Using cell_counter
Microscopy images may be viewed by running the command `python image_viewer.py [FILE PATHS]`
Cell counts can be performed with the command `python cell_counter.py [-OPTIONS] [SOURCE] [OUTPUT DIRECTORY]`
Usage instructions and helpful information
Pre-processing steps are used to refine each image. These steps first reduce image artifacts and then create a mask of the image, indicating which regions may contain cells. We recommend using "image_viewer.py" to ensure the current pre-processing steps are sufficient for your application. Pre-processing steps may be modified within the code.
By default, cell candidates are detected in the pre-processed, masked image using the Laplacian of Gaussian method. This searches the provided image for roughly circular "blobs" and labels each as a cell candidate. The center location and the "sigma" value of the Gaussian that best fits each blob (cell/nucleus) are recorded. This method can tolerate overlapping cells, although the extent of overlap permitted may be tuned for your specific application. Subsequent filtering steps remove false positives.
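To illustrate the idea, here is a minimal, self-contained sketch of scale-normalized Laplacian-of-Gaussian blob detection built on SciPy. This is an illustration of the technique only, not the repository's actual implementation; the sigma range and threshold are arbitrary choices for the toy image.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

def detect_blobs(image, sigmas=(2, 4, 6), threshold=0.05):
    """Toy LoG detector: compute the response at each sigma, keep local maxima."""
    # Scale-normalized LoG response; bright blobs give a negative LoG, so negate.
    stack = np.stack([-sigma**2 * gaussian_laplace(image.astype(float), sigma)
                      for sigma in sigmas])
    # A blob is a local maximum (across space and scale) above the threshold.
    peaks = (stack == maximum_filter(stack, size=3)) & (stack > threshold)
    coords = np.argwhere(peaks)  # rows are (scale_index, row, col)
    return [(r, c, sigmas[s]) for s, r, c in coords]

# Synthetic image with one bright Gaussian "nucleus" at (32, 32).
yy, xx = np.mgrid[0:64, 0:64]
img = np.exp(-((yy - 32)**2 + (xx - 32)**2) / (2 * 4.0**2))
blobs = detect_blobs(img)
print(blobs)  # one blob near (32, 32), best fit at sigma = 4
```

The best-fitting sigma reflects the blob's size, which is why the repository records it alongside each candidate's center location.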
By default, a predefined cluster analysis is available to classify DAPI-stained cell nuclei as pyknotic or non-pyknotic. This will likely need to be tuned or changed depending on your specific application. Cells may be classified by explicitly setting thresholds or by providing your own data to develop a new cluster-based model.
Even if nuclei are not accurately labeled as pyknotic/non-pyknotic (and this labeling matters for your application), properties for all detected cells can be saved to a spreadsheet after analyzing a batch of images. These properties are intended to aid in developing your own classification scheme.
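As a sketch of the explicit-threshold option, the snippet below flags small, bright nuclei in a toy table. The column names and cutoff values here are hypothetical illustrations, not the actual properties.csv schema; pyknotic nuclei are typically small with condensed, brightly stained chromatin, which motivates the two cutoffs.

```python
import pandas as pd

# Hypothetical cutoffs -- tune these against your own annotated data.
AREA_MAX = 40.0        # px^2, illustrative size cutoff
INTENSITY_MIN = 0.7    # normalized mean intensity, illustrative cutoff

def classify_by_threshold(df):
    """Flag a nucleus as pyknotic when its area is small and intensity is high."""
    return (df["area"] < AREA_MAX) & (df["mean_intensity"] > INTENSITY_MIN)

# Toy stand-in for a saved properties spreadsheet.
cells = pd.DataFrame({
    "area": [25.0, 80.0, 30.0],
    "mean_intensity": [0.9, 0.5, 0.6],
})
cells["classified_pyknotic"] = classify_by_threshold(cells)
print(cells["classified_pyknotic"].tolist())  # [True, False, False]
```

In practice you would read the saved spreadsheet with `pd.read_csv` and inspect the property distributions before picking thresholds.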
Developing a new classifier
Users are encouraged to use a pretrained classifier if possible and appropriate for their situation.
If a new classifier must be trained, a representative sample of images (we recommend at least 50-100) should be manually annotated as described below.
Steps for developing a new classifier:
- Set paths and run `python cell_counter.py` without a random forest classifier to count cells and save their properties. Classifications will be very inaccurate, but the classifications do not impact cell property calculations. The output file `properties.csv` will be important for the next step.
- Set paths/constants and run `python train_rdf_model.py`, providing paths to folders containing identically named `.nd2` and `.png` raw microscopy and annotated image files, respectively. This will create a new `properties_updated.csv` file where the "known_pyknotic" column has a value of 0/1 for cells in images that were annotated. A random forest classifier will be trained. Stop here if the random forest classifier already performs sufficiently well - set the principal component thresholds in `train_rdf_model.py` to None to only use the random forest classifier.
- To potentially improve the classifier's performance, set paths and run `python fit_pca.py` to perform principal component analysis on the `properties_updated.csv` file. This will create a graph of the first 2 principal components and save a file called `pca_transformation.xlsx` with data to reproduce the same transformation with other sets of images.
- Use the data in `pca_transformation.xlsx` to identify an acceptable domain in the PCA-derived latent space for all cells. Currently, the code is set up to only consider the first two principal components. A subdomain should then be defined where cells are likely to be pyknotic. In the `cell_counter.py` file, define the domain of "pyknotic candidates" in this latent space.
- Rerun `python cell_counter.py`. The cells classified as pyknotic, especially in `properties.csv`, reflect the previously defined "pyknotic candidates" that will undergo further consideration by the random forest classifier.
- Make a copy of `properties.csv` called `training_properties.csv`. Without changing the order of the `training_properties.csv` file, remove all rows where "classified_pyknotic" (last column) is "FALSE". We recommend creating a column of row indices, sorting by the "classified_pyknotic" column and then by original row index to delete all non-candidates in one large block, and then deleting the column of row indices you created.
- Now, run `python train_rdf_model.py` to train a new random forest model that specifically classifies the previously defined pyknotic candidates as pyknotic or non-pyknotic. All other cells are known to be non-pyknotic, helping to improve the model's performance.
- For future classifications, set `cell_counter.py` to reference the same `pca_transformation.xlsx` file you previously made, use the same principal component thresholds as before, and reference the new random forest classifier `.pkl` file developed in the previous step. You now have a new classification scheme.
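The manual spreadsheet step above (copying properties.csv and deleting non-candidate rows) can also be done with a few lines of pandas, which filters candidate rows without disturbing their order. The columns shown are a hypothetical subset of the real properties.csv.

```python
import pandas as pd

# Hypothetical miniature properties table; the real properties.csv has
# many more columns, with "classified_pyknotic" as the last one.
props = pd.DataFrame({
    "cell_id": [1, 2, 3, 4],
    "area": [22.0, 75.0, 28.0, 60.0],
    "classified_pyknotic": [True, False, True, False],
})

# Keep only the pyknotic candidates; boolean indexing preserves row order,
# mirroring the manual copy-and-delete procedure on training_properties.csv.
training = props[props["classified_pyknotic"]]
print(training["cell_id"].tolist())  # [1, 3]
```

Calling `training.to_csv("training_properties.csv", index=False)` would then write the filtered file used for retraining.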
Manually annotating pyknotic/target nuclei
Target nuclei may be annotated in the free software ImageJ using the following protocol.
- Use multi-point tool to identify pyknotic cells. Click as close to the center of the cell as possible.
- Double click on the multi-point tool icon to open its configurations.
- Type: dot
- Color: yellow
- Size: small
- Label points: off
- Image > Overlay > Add Selection
- Image > Overlay > Flatten
- File > Save As > PNG
Save the .png files in a different folder. Aside from the extension, ensure the corresponding .png and .nd2 files have the same name.
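A quick way to verify the pairing is to compare file stems. This standard-library sketch reports any .nd2 file that lacks an identically named .png annotation; the directory names are placeholders.

```python
import tempfile
from pathlib import Path

def unmatched_stems(raw_dir, annotated_dir):
    """Return .nd2 file stems that lack an identically named .png annotation."""
    raw = {p.stem for p in Path(raw_dir).glob("*.nd2")}
    annotated = {p.stem for p in Path(annotated_dir).glob("*.png")}
    return sorted(raw - annotated)

# Demo with throwaway directories standing in for your raw/annotated folders.
with tempfile.TemporaryDirectory() as raw, tempfile.TemporaryDirectory() as ann:
    for name in ("slice_01.nd2", "slice_02.nd2"):
        (Path(raw) / name).touch()
    (Path(ann) / "slice_01.png").touch()
    missing = unmatched_stems(raw, ann)
    print(missing)  # ['slice_02']
```

Running a check like this before `train_rdf_model.py` catches naming mismatches early, since the training script matches raw and annotated images by filename.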
