Data and code from: Microfluidic nanomagnetically isolated neuron- and astrocyte-derived extracellular vesicles to differentiate Lewy body and Alzheimer’s disease

Yang, Stephanie1; Lin, Andrew1; Shen, Hanfei1; Pappalardo, Laura2; Spychalski, Griffin1; Rosario, Jean1; Forsberg, Leah3; Grant, Kiera3; Buser, Joshua4; Savica, Rodolfo3; O'Bryant, Sid5; Jones, David3; Dickson, Dennis3; Reichard, R. Ross3; Nguyen, Aivi3; Meaney, David1; Boeve, Bradley3; Ross, Owen3; McLean, Pamela3; Issadore, David1

Published Mar 23, 2026 on Dryad. https://doi.org/10.5061/dryad.612jm64j4

Data files

Mar 23, 2026 version files 4.37 MB

Abstract

Identifying plasma-based biomarkers that can accurately differentiate Lewy body disease (LBD) from Alzheimer’s disease (AD) remains a major challenge. Extracellular vesicles (EVs), which carry molecular cargo from their parent cells and can cross the blood-brain barrier, offer a new path forward. We developed the multiplexed Track-Etch magnetic NanoPOre (mTENPO) platform, a highly parallelized microfluidic technology for cell-specific EV isolation, and demonstrated independent enrichment of GluR2+ (neuron-derived) and GLAST+ (astrocyte-derived) EVs from the antemortem plasma of 137 autopsy-confirmed LBD, AD, mixed pathology, and control subjects. By integrating miRNA sequencing of GluR2+ and GLAST+ EV cargo with plasma measurements of Aβ40, Aβ42, tau, p-Tau181, and p-Tau231, we identified a multimodal 15-feature panel that more comprehensively reflects brain pathology than conventional biomarkers. Using 10-fold cross-validation to mitigate overfitting, the panel achieved an accuracy of 0.95 and an area under the curve of 0.96 for distinguishing LBD versus AD.

Dataset DOI: 10.5061/dryad.612jm64j4

Description of the data and file structure

We analyzed plasma from N = 137 pathologically confirmed Lewy body disease (LBD), Alzheimer's disease (AD), combined AD/LBD, AD with amygdala Lewy bodies (AD/ALB), and control subjects. We enriched neuron-derived (GluR2+) and astrocyte-derived (GLAST+) EVs using our lab's multiplexed Track-Etch magnetic NanoPOre (mTENPO) sorting platform, then analyzed their miRNA cargo by next-generation sequencing. We also performed gold-standard digital enzyme-linked immunosorbent assay (ELISA) to measure plasma Aβ40, Aβ42, tau, p-Tau181, and p-Tau231 levels. Using GluR2+ EV miRNA, GLAST+ EV miRNA, and plasma protein expression, we applied feature selection and 10-fold cross-validation to find biomarker panels that could differentiate LBD from AD. The raw data and the code used for feature selection are included in this submission.

Files and variables

File: Sequencing_SIMOA_Data.xlsx

Description: This file contains the GluR2+ EV and GLAST+ EV miRNA data [in counts per million (CPM)-normalized reads], as well as the plasma protein measurements (in pg/mL), for all patients analyzed in this study. The plasma protein measurements are listed in the final 6 rows of the Excel sheet. The part of each sample ID before the dash is the sequencing batch number. Each column corresponds to either the GluR2 pulldown (ID ending in an odd number) or the GLAST pulldown (ID ending in an even number) for a patient. For example, 1-1 and 1-2 are the GluR2 and GLAST pulldowns from 2 aliquots of the same sample, 1-3 and 1-4 are from the same sample as each other, and so on.
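The ID convention above can be sketched in Python as follows. This is a minimal illustration, not part of the submitted code: the function names are hypothetical, and it assumes every ID has the form "batch-number" with both parts numeric and exactly one dash.

```python
def parse_sample_id(sample_id):
    """Split an ID such as '1-3' into (batch, index, marker).

    Assumption: IDs look like '<batch>-<index>' with numeric parts.
    Odd index = GluR2 pulldown, even index = GLAST pulldown.
    """
    batch_text, index_text = sample_id.strip().split("-")
    index = int(index_text)
    marker = "GluR2" if index % 2 == 1 else "GLAST"
    return int(batch_text), index, marker


def pair_pulldowns(sample_ids):
    """Group IDs into (GluR2 ID, GLAST ID) pairs from the same plasma sample."""
    by_key = {}
    for sample_id in sample_ids:
        batch, index, marker = parse_sample_id(sample_id)
        # Odd index n and even index n + 1 map to the same pair key.
        pair_key = (batch, (index + 1) // 2)
        by_key.setdefault(pair_key, {})[marker] = sample_id
    # A missing pulldown shows up as None in the pair.
    return [
        (group.get("GluR2"), group.get("GLAST"))
        for _, group in sorted(by_key.items())
    ]


pairs = pair_pulldowns(["1-1", "1-2", "1-3", "1-4"])
print(pairs)
```

Pairing the columns this way makes it straightforward to line up the GluR2 and GLAST measurements for each patient before combining them into one feature table.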

File: Isotype_qPCR_GluR2_Data.xlsx

Description: To confirm the specificity of the GluR2 and GLAST antibodies in mTENPO EV capture, quantitative polymerase chain reaction (qPCR) was performed on EV miRNA from mTENPO pulldowns using GluR2 or GLAST capture antibody or their appropriate isotype control antibodies. This file provides the quantification cycle (Cq) for a panel of central nervous system (CNS) EV-associated miRNAs for the GluR2 vs. isotype mTENPO pulldown comparison.

File: Isotype_qPCR_GLAST_Data.xlsx

Description: This file provides the Cq for a panel of CNS EV-associated miRNAs for the GLAST vs. isotype mTENPO pulldown comparison, using the same method as specified above for Isotype_qPCR_GluR2_Data.xlsx.
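One common way to read Cq values from a capture vs. isotype comparison is the difference in Cq (ΔCq), where a lower Cq in the capture pulldown indicates more template. The sketch below is an illustration only and is not necessarily the analysis used in this study. It assumes perfect doubling per cycle (efficiency of 2), and the function name and the example numbers are hypothetical rather than taken from the data files.

```python
def fold_enrichment(cq_capture, cq_isotype, efficiency=2.0):
    """Estimate fold enrichment of a miRNA in the capture pulldown.

    Assumption: the amplicon multiplies by `efficiency` each cycle, so a
    Cq difference of d cycles corresponds to an efficiency**d fold change.
    """
    delta_cq = cq_isotype - cq_capture
    return efficiency ** delta_cq


# Illustrative values only (not from the dataset): the capture pulldown
# crosses threshold 3 cycles earlier than the isotype control.
print(fold_enrichment(cq_capture=25.0, cq_isotype=28.0))  # 8.0
```

A value well above 1 would suggest that the capture antibody pulls down more of that miRNA than the isotype control does.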

File: LASSO_Panels.ipynb

Description: This notebook is run in Python and outputs a list of least absolute shrinkage and selection operator (LASSO)-selected markers for panel sizes up to 20 markers. The input should be a .csv file containing only the mTENPO-isolated EV miRNA and protein expression data for patients with the pathologies selected for your comparison, laid out as follows: column 1 = patient IDs; column 2 = pathology; columns 3 to end = marker expression data. Copy the text output from the last block of code into a .txt file named “[group1]_[group2]_panels.txt”, which the downstream MATLAB code needs in order to analyze the LASSO output.
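The input layout and output naming convention described above can be sketched as follows. This is a hedged illustration using only the Python standard library, not code from LASSO_Panels.ipynb. The function names, the presence of a header row, and the example marker names are assumptions.

```python
import csv
import io


def load_lasso_input(csv_text):
    """Split CSV text into patient IDs, pathologies, marker names, and values.

    Assumed layout: a header row, then column 1 = patient ID,
    column 2 = pathology, columns 3 to end = marker expression.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    marker_names = header[2:]
    patient_ids = [row[0] for row in body]
    pathologies = [row[1] for row in body]
    values = [[float(cell) for cell in row[2:]] for row in body]
    return patient_ids, pathologies, marker_names, values


def panels_filename(group1, group2):
    """Build the text-file name expected by the downstream MATLAB scripts."""
    return f"{group1}_{group2}_panels.txt"


# Hypothetical two-patient table, for illustration only.
DEMO_CSV = "patient_id,pathology,marker_a,marker_b\nP1,LBD,1.5,20\nP2,AD,0.0,35.2\n"
ids, labels, markers, values = load_lasso_input(DEMO_CSV)
print(markers, labels)
print(panels_filename("LBD", "AD"))  # LBD_AD_panels.txt
```

Checking the layout before running the notebook helps catch extra columns, such as demographic fields, that would otherwise be treated as markers.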

File: LASSO_reformatPythonOutput.m

Description: This MATLAB script reformats the output from the LASSO Python code for ease of data handling in LASSO_ensembleLearning.m. Before running it, save the block of text output by LASSO_Panels.ipynb as a text file.

File: LASSO_ensembleLearning.m

Description: This MATLAB script uses disease cohort data to evaluate the performance of the LASSO-selected panels and to generate area under the curve (AUC) and accuracy measurements for the combination of markers within each panel. The ensemble model performance is evaluated using 10-fold cross-validation repeated 5 times. Run LASSO_reformatPythonOutput.m before this script, and place this script in the same folder as the Sequencing_SIMOA_Data.xlsx file associated with this study.
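To make the evaluation scheme concrete, the sketch below generates the train/test splits for 10-fold cross-validation repeated 5 times, giving 50 splits in total. It is written in Python for illustration and is not a translation of LASSO_ensembleLearning.m. The function name, the random seed, the sample count of 40, and the use of plain (unstratified) shuffling are assumptions.

```python
import random


def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        # Reshuffle the samples so each repeat uses a different partition.
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(n_folds):
            # Every n_folds-th sample, starting at `fold`, is held out.
            test_idx = order[fold::n_folds]
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield repeat, fold, train_idx, test_idx


splits = list(repeated_kfold_indices(n_samples=40))
print(len(splits))  # 50
```

Within each repeat, every sample is held out exactly once, so averaging AUC and accuracy over all splits reduces the dependence of the estimate on any single partition.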

Code/software

To open the data and code provided in this submission, you will need Microsoft Excel, Jupyter Notebook (v7), and MATLAB R2024a. LASSO_Panels.ipynb uses the following Python packages: numpy, pandas, and scikit-learn. LASSO_reformatPythonOutput.m and LASSO_ensembleLearning.m are run in MATLAB and require the Statistics and Machine Learning Toolbox.

Human subjects data

All participants or their proxies provided informed consent for this analysis. Individual identifiers and demographic information have been removed to protect patient privacy.