Proteomic profiling of serum extracellular vesicles from uranium-exposed miners

Oh, Mijung 1; Lim, Eunju 1; Zychowski, Katherine 1; Luo, Li 1

Research facility: University of New Mexico

Published Oct 06, 2025 on Dryad. https://doi.org/10.5061/dryad.931zcrjxd

Data files

Oct 06, 2025 version: files total 399.62 KB

Abstract

This dataset contains quantitative proteomic data from serum-derived extracellular vesicles (EVs) isolated from 29 former uranium miners. Small and large EVs were separated by differential ultracentrifugation and analyzed using liquid chromatography-tandem mass spectrometry (LC-MS/MS). The dataset includes raw and normalized protein intensities for each EV subtype, as well as metadata on mining tenure and age. These data were used to examine associations between mining tenure and the molecular profiles of serum-derived EVs. EV proteomics may serve as a sensitive tool for assessing the long-term health effects of uranium exposure.

Dataset DOI: 10.5061/dryad.931zcrjxd

Description of the data and file structure

Serum-derived EV Proteomics Analysis in Uranium Miners

This repository contains data and documentation for the proteomic analysis of serum-derived extracellular vesicles (EVs) from uranium miners, focusing on the relationship between mining tenure and EV protein profiles.

1. Project Overview

This study investigates how uranium mining exposure affects the protein content of serum-derived small and large EVs. Miners were classified into short tenure (ST) and long tenure (LT) groups based on their mining history, and proteomic differences were assessed accordingly.

2. Source Files

This dataset consists of 5 individual CSV files.

File 1: Miner_Info.csv (Participant metadata)
File 2: small_EVs_COMPLETE.csv (Raw protein intensities for small EVs)
File 3: small_EVs_normalized.csv (Normalized protein intensities for small EVs)
File 4: large_EVs_COMPLETE.csv (Raw protein intensities for large EVs)
File 5: large_EVs_normalized.csv (Normalized protein intensities for large EVs)

3. Sample Metadata (File: `Miner_Info.csv`)

Contains miner-level metadata and phenotype classification used in downstream analysis.

Column Name	Description
Sample ID	Unique identifier for each participant (e.g., OLC_10002)
Mining Tenure	Total years of uranium mining experience
Tenure Classification	Group label based on tenure: ST (1–9 years), LT (10–40 years)
Age	Age in years at time of sample collection
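
The tenure grouping in the table above can be sketched as a small function. This is only an illustration: the name `classify_tenure` is hypothetical, and treating fractional values below 10 as ST is an assumption, since the source states whole-year ranges only.

```python
def classify_tenure(years):
    """Return 'ST' for 1 to 9 years, 'LT' for 10 to 40 years, else None.

    Assumption: fractional tenures below 10 count as ST; the README
    documents whole-year ranges only.
    """
    if 1 <= years < 10:
        return "ST"
    if 10 <= years <= 40:
        return "LT"
    return None  # outside the documented ranges


print(classify_tenure(9), classify_tenure(10))
```

Values outside both documented ranges return `None` rather than being forced into a group.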

4. Proteomics Data Files

Each file below contains quantified protein intensities per sample for either small or large EVs, in either raw or normalized form.

4.1 File: `small_EVs_COMPLETE.csv`

Type: Raw protein intensity
Rows: Each sample (by Sample ID)
Columns: Proteins (e.g., Alpha-2-macroglobulin [OS=Homo sapiens], etc.)

4.2 File: `small_EVs_normalized.csv`

Type: Normalized intensities of small EV proteins
Normalization method: Total peptide amount

4.3 File: `large_EVs_COMPLETE.csv`

Type: Raw protein intensity from large EVs

4.4 File: `large_EVs_normalized.csv`

Type: Normalized large EV protein intensities

Note: Protein names are annotated in UniProt format (e.g., "[OS=Homo sapiens]").
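
As a minimal sketch of reading the wide layout described above (one row per sample, one column per protein), the following uses only the Python standard library. The inline excerpt is invented for illustration: `OLC_10003`, the albumin column, and all intensity values are hypothetical, and reading blank cells as missing values is an assumption about how the real files encode them.

```python
import csv
import io

# Hypothetical two-sample excerpt; the real files have one column per protein.
SAMPLE_CSV = """Sample ID,Alpha-2-macroglobulin [OS=Homo sapiens],Albumin [OS=Homo sapiens]
OLC_10002,1200.5,300
OLC_10003,,450.25
"""


def read_intensities(handle, id_column="Sample ID"):
    """Map each sample ID to {protein name: intensity, or None if blank}."""
    table = {}
    for row in csv.DictReader(handle):
        sample_id = row.pop(id_column)
        table[sample_id] = {
            protein: float(value) if value not in ("", None) else None
            for protein, value in row.items()
        }
    return table


intensities = read_intensities(io.StringIO(SAMPLE_CSV))
print(sorted(intensities))
```

For a real file, the same function could be called on `open("small_EVs_COMPLETE.csv", newline="")`, assuming the identifier column header is exactly `Sample ID`.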

5. Data Processing Workflow

Software and Database

Search Software: Thermo Proteome Discoverer v2.5
Database: UniProt Homo sapiens (reviewed; downloaded Oct 26, 2023)

Search Parameters

Enzyme: Trypsin (allowing 2 missed cleavages)
Static Modifications: Carbamidomethyl (C)
Variable Modifications: Oxidation (M), Acetyl (N-term)
Precursor Tolerance: 10 ppm
Fragment Tolerance: 0.02 Da
FDR Threshold: 1%
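
To make the precursor tolerance concrete: a ppm tolerance scales with the mass being matched, whereas the fragment tolerance in Da is a fixed width. The sketch below shows the arithmetic only; the function name `ppm_window` and the example value of 1000.0 are hypothetical and do not come from the search software.

```python
def ppm_window(mass, ppm):
    """Return the (low, high) bounds of a +/- ppm tolerance around a mass."""
    delta = mass * ppm / 1e6  # 10 ppm of 1000.0 is 0.01
    return mass - delta, mass + delta


low, high = ppm_window(1000.0, 10)
print(low, high)
```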

6. Quantification and Normalization

Quantification Method: Intensity-based precursor abundance
Normalization: Total peptide amount normalization for intra-sample correction
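
The idea behind total-amount normalization can be sketched as follows: each sample is rescaled so that its summed intensity matches a common target, here the largest sample total. This is an illustration under stated assumptions, not the software's implementation. It works on protein-level values, whereas the search software normalizes on peptide-level abundances, so it will not reproduce the normalized files exactly. The function name and the toy values are hypothetical.

```python
def normalize_total_amount(samples):
    """Scale each sample so its summed intensity equals the largest sample total.

    `samples` maps sample ID -> {protein: intensity or None}.
    Missing values (None) are ignored in the totals and left as None.
    """
    totals = {
        sid: sum(v for v in proteins.values() if v is not None)
        for sid, proteins in samples.items()
    }
    target = max(totals.values())
    normalized = {}
    for sid, proteins in samples.items():
        # Guard against an all-missing or all-zero sample.
        factor = target / totals[sid] if totals[sid] > 0 else 1.0
        normalized[sid] = {
            protein: (value * factor if value is not None else None)
            for protein, value in proteins.items()
        }
    return normalized


raw = {
    "A": {"P1": 100.0, "P2": 300.0},  # total 400
    "B": {"P1": 50.0, "P2": 150.0},   # total 200, so scaled by 2
}
norm = normalize_total_amount(raw)
print(norm["B"])
```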

7. Output Summary

File Name	Description
`Miner_Info.csv`	Sample ID and phenotype annotations (age, tenure)
`small_EVs_COMPLETE.csv`	Raw intensities of small EV proteins
`small_EVs_normalized.csv`	Normalized small EV protein data
`large_EVs_COMPLETE.csv`	Raw intensities of large EV proteins
`large_EVs_normalized.csv`	Normalized large EV protein data
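
One way the metadata and intensity files listed above might be combined is to group samples by tenure class and compare per-protein means. The sketch below is a hypothetical illustration, not the authors' statistical analysis: the sample IDs, protein name, and values are invented, and `group_means` is an assumed helper name.

```python
import math


def group_means(intensities, classes):
    """Return {(group, protein): mean intensity}, skipping missing values."""
    sums, counts = {}, {}
    for sample_id, proteins in intensities.items():
        group = classes.get(sample_id)
        if group is None:
            continue  # sample has no tenure classification
        for protein, value in proteins.items():
            if value is None:
                continue
            key = (group, protein)
            sums[key] = sums.get(key, 0.0) + value
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical toy data: two ST samples and one LT sample.
classes = {"S1": "ST", "S2": "ST", "S3": "LT"}
data = {"S1": {"P1": 100.0}, "S2": {"P1": 300.0}, "S3": {"P1": 800.0}}

means = group_means(data, classes)
log2_fc = math.log2(means[("LT", "P1")] / means[("ST", "P1")])
print(log2_fc)
```

A log2 ratio keeps increases and decreases symmetric around zero, which is why it is commonly used for intensity comparisons.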

8. Contact

For questions, please contact:

Name: Dr. Katherine Zychowski
Affiliation: University of New Mexico
Email: kzychowski@salud.unm.edu

9. Human Subjects Data

All human subjects' data included in this dataset have been de-identified in compliance with applicable legal and ethical standards. Explicit informed consent was obtained from all study participants for the public sharing of their de-identified data. Participant metadata includes only non-identifiable variables such as age (in years) and mining tenure (in years). The 'Race/Ethnicity' and 'Smoking History' columns were removed from the dataset to comply with Dryad's data anonymization policy.

10. Files and Variables

File 1: `Miner_Info.csv`

Description: Participant metadata including anonymized demographic information (age, mining tenure) and tenure classification.
Variables: Sample ID, Mining Tenure, Tenure Classification, Age.

File 2: `small_EVs_COMPLETE.csv`

Description: Raw, un-normalized protein intensities from small extracellular vesicles (S-EVs) for each participant.
Variables: Sample ID, Protein abundances by UniProt accession.

File 3: `small_EVs_normalized.csv`

Description: Normalized protein intensities from small extracellular vesicles (S-EVs) for each participant.
Variables: Sample ID, Normalized protein abundances by UniProt accession.

File 4: `large_EVs_COMPLETE.csv`

Description: Raw, un-normalized protein intensities from large extracellular vesicles (L-EVs) for each participant.
Variables: Sample ID, Protein abundances by UniProt accession.

File 5: `large_EVs_normalized.csv`

Description: Normalized protein intensities from large extracellular vesicles (L-EVs) for each participant.
Variables: Sample ID, Normalized protein abundances by UniProt accession.

Human subjects data

All human subjects data included in this dataset have been fully de-identified in compliance with applicable legal and ethical standards. Explicit informed consent was obtained from all study participants for the collection, analysis, and public sharing of their de-identified data. No personally identifiable information (PII), including names, contact information, or identifiable health data, is included in the dataset.

De-identification procedures involved removing all direct identifiers. To comply with Dryad's data anonymization policy and minimize the risk of re-identification, the 'Race/Ethnicity' and 'Smoking History' variables were removed from the public dataset. The remaining participant metadata includes only non-identifiable variables such as age (in years) and mining tenure (in years). All samples were assigned anonymized study ID codes with no link to personal identifiers.

The dataset is compliant with Dryad’s human subjects data policy and is appropriate for open-access publication under the CC0 license.