Identifying wastewater chemicals in coastal aerosols
Data files
Version of May 06, 2025; files total 87.86 GB
- Air_Mass_origin_assignments.csv (599 B)
- Calibration_Curves.csv (3.51 KB)
- Compound_Concentrations_and_Metadata.csv (123.50 KB)
- Compound_Intensities_and_Concentrations.csv (197.97 KB)
- raw.zip (87.86 GB)
- README.md (5.59 KB)
Abstract
This dataset accompanies our Science Advances manuscript investigating the aerosolization of wastewater-derived contaminants from the Tijuana River along the U.S.–Mexico border. It includes raw and processed data used to assess the spatial distribution and atmospheric transport of contaminants from coastal waters into the air. Files include: (1) data tables with air mass origin assignments and quantified contaminant concentrations; (2) raw mass spectrometry data files; and (3) links to external repositories hosting pre-processed MS and MS/MS data, as well as Global Natural Products Social Molecular Networking (GNPS) visualizations. These data support the spatial and chemical analyses presented in the manuscript and are intended to facilitate transparency, reproducibility, and further research on coastal aerosol pollution.
Dataset DOI: 10.5061/dryad.ksn02v7gp
Description of the data and file structure
This dataset accompanies our Science Advances manuscript investigating the aerosolization of wastewater-derived contaminants from the Tijuana River along the U.S.–Mexico border. It includes raw and processed data used to assess the spatial distribution and atmospheric transport of contaminants from coastal waters into the air. Files include data tables in .csv format containing (1) calibration curve data, (2) mass spectral intensities and concentrations for field samples, (3) concentrations with sample metadata, and (4) air mass origin assignments, as well as (5) raw mass spectrometry data files that can be viewed or processed in mzmine or similar externally available software. These data support the spatial and chemical analyses presented in the manuscript and are intended to facilitate transparency, reproducibility, and further research on coastal aerosol pollution.
Files and variables
File: Calibration_Curves.csv
Description: Table of the amounts spiked for the external calibration curve, with the associated compound intensities.
Variables
- "Spike amount" refers to the amount in nanograms per sample (ng/sample) spiked onto the sample media used for calibration.
- Variables for 12 compounds are labelled by compound name. The data underneath each compound name are mass spectral intensity data in units of ions.
File: Compound_Intensities_and_Concentrations.csv
Description: Table of compound intensities and concentrations for each data file.
Variables
- "File" shows the file name.
- Intensity variables for 12 compounds are labelled by compound name; where listed, all compound intensities are provided in units of ions.
- Concentration variables for the same 12 compounds are labelled by compound name; where listed, they are provided in units of "ppt" (parts per trillion) for water samples (i.e., "DOM" in file name) or "pg m^-3" (picograms per cubic meter) for aerosol samples (i.e., "SSA" in file name).
File: Compound_Concentrations_and_Metadata.csv
Description: Table of file metadata and compound concentrations for each data file.
Variables
- "File" shows the file name based on the attributes provided below.
- "ATTRIBUTE_Date" shows the date in YYYYMMDD format.
- "ATTRIBUTE_Sample_Type" shows the sample type. "DOM" refers to water samples; "SSA" refers to aerosol samples.
- "ATTRIBUTE_Location" shows the sample location. "SIO" is Scripps Institution of Oceanography. "BF" is Borderfield State Park. "IB" is Imperial Beach pier. "SS" is Silver Strand State Park. "Holl" is the Tijuana River at Hollister bridge. "Blank" is a laboratory blank.
- "ATTRIBUTE_Sample" shows whether the file is a field sample or a blank. "Sample" refers to field samples. "Blank" refers to field sample aerosol filter blanks. "PPL_Blank" refers to water sample blanks.
- Variables for 12 compounds are labelled by compound name, and all data are provided in units of either parts per trillion ("ppt") for water samples (i.e., "DOM" in file name) or picograms per cubic meter ("pg m^-3") for aerosol samples (i.e., "SSA" in file name).
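The unit of each concentration value depends on the sample type, so it can help to attach the unit explicitly when reading the table. The following is a minimal sketch using only the Python standard library. The column names come from the variable list above; the function name `parse_metadata_rows` and the example rows are hypothetical and not part of the dataset.

```python
import csv
import io
from datetime import date, datetime

# Units documented above for each value of ATTRIBUTE_Sample_Type.
UNITS_BY_SAMPLE_TYPE = {"DOM": "ppt", "SSA": "pg m^-3"}


def parse_metadata_rows(csv_text):
    """Return one dict per row, adding a parsed date and the concentration unit."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # ATTRIBUTE_Date is documented as YYYYMMDD.
        row["date"] = datetime.strptime(row["ATTRIBUTE_Date"], "%Y%m%d").date()
        # Unknown sample types get None rather than a guessed unit.
        row["unit"] = UNITS_BY_SAMPLE_TYPE.get(row["ATTRIBUTE_Sample_Type"])
        rows.append(row)
    return rows


# Invented rows for illustration only; real values come from the .csv file.
example = (
    "File,ATTRIBUTE_Date,ATTRIBUTE_Sample_Type\n"
    "example_ssa,20230115,SSA\n"
    "example_dom,20230116,DOM\n"
)
for parsed in parse_metadata_rows(example):
    print(parsed["File"], parsed["date"], parsed["unit"])
```

To use it on the real table, pass the text of "Compound_Concentrations_and_Metadata.csv" read from disk instead of the example string.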
File: Air_Mass_origin_assignments.csv
Description: Table of air mass origin and precipitation labels, assigned from wind direction, back trajectories, and precipitation.
Variables
- "Date" shows the date of sample collection in DD-Month format.
- "Air Mass origin - Southern sites" refers to the air mass origin for the southern sampling sites (Borderfield, Silver Strand, Imperial Beach). "Land" refers to air masses that originated primarily over the land. "Sea" refers to air masses that originated primarily over the sea. "Mixed" refers to air masses that originated from a mixture of both land and sea origin.
- "Air Mass origin - SIO" refers to the air mass origin for the SIO sampling site. "Land" refers to air masses that originated primarily over the land. "Sea" refers to air masses that originated primarily over the sea. "Mixed" refers to air masses that originated from a mixture of both land and sea origin.
- "Precip" indicates whether precipitation occurred during the 24-hour collection window. "dry" means no precipitation occurred. "wet" means precipitation occurred.
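As a quick check of how the labels are distributed, the collection days can be tallied by air mass origin and precipitation label. This is a sketch only: the column names are taken from the variable list above, while the function name `tally_origin_by_precip` and the example rows are hypothetical.

```python
import csv
import io
from collections import Counter


def tally_origin_by_precip(csv_text, origin_column="Air Mass origin - Southern sites"):
    """Count collection days for each (air mass origin, precipitation) pair."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row[origin_column].strip(), row["Precip"].strip())
        counts[key] += 1
    return counts


# Invented rows for illustration only; real values come from the .csv file.
example = (
    "Date,Air Mass origin - Southern sites,Air Mass origin - SIO,Precip\n"
    "01-Jan,Land,Sea,dry\n"
    "02-Jan,Sea,Sea,wet\n"
    "03-Jan,Land,Mixed,dry\n"
)
for (origin, precip), n_days in sorted(tally_origin_by_precip(example).items()):
    print(origin, precip, n_days)
```

Passing `origin_column="Air Mass origin - SIO"` gives the same tally for the SIO site.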
File: raw.zip
Description: Compressed raw data files. Subfolders separate positive from negative ionization mode data files, and separate quality control (QC) and blank files (labelled as blank-QC-raw) from sample and standard files (labelled as raw). File names are the same as those listed in "Compound_Intensities_and_Concentrations.csv" and "Compound_Concentrations_and_Metadata.csv".
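Because the archive is about 88 GB, it may be convenient to inspect its layout before extracting anything. The sketch below uses Python's standard `zipfile` module, which reads only the archive index when listing entries. The exact folder layout inside raw.zip is not spelled out here, so the function simply groups ".raw" entries by whatever parent folder they sit in; the function name `count_raw_files_by_folder` is hypothetical.

```python
import zipfile
from collections import Counter


def count_raw_files_by_folder(zip_path):
    """Count .raw entries per parent folder without extracting the archive."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".raw"):
                # Entries at the top level of the archive get an empty folder name.
                folder = name.rsplit("/", 1)[0] if "/" in name else ""
                counts[folder] += 1
    return counts


if __name__ == "__main__":
    # Assumes raw.zip has been downloaded to the working directory.
    for folder, n_files in sorted(count_raw_files_by_folder("raw.zip").items()):
        print(folder, n_files)
```

The resulting counts can be compared against the number of rows in the two concentration tables, since the file names are stated to match.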
Code/software
mzmine or similar software may be used to view ".raw" mass spectral files. The ion intensities provided in "Compound_Intensities_and_Concentrations.csv" were converted to concentrations in Microsoft Excel using the calibration curve data provided in "Calibration_Curves.csv". Microsoft Excel or similar data analysis programs may be used to regenerate the calibration curves, to calculate concentrations, or to visualize the other datasets in this repository that are provided in ".csv" format. All figures in the manuscript and supporting information files were produced in Igor Pro from the provided data tables.
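For readers who prefer a script to a spreadsheet, the calibration step can be sketched as follows. The regression model used in the original Excel workflow is not stated here, so this example assumes an ordinary (unweighted) least-squares line with a free intercept; the function names and the numbers in the example are hypothetical. The result is an amount in ng/sample, the unit of "Spike amount"; converting it to "ppt" or "pg m^-3" would additionally require the sampled water or air volume, which is not covered by this sketch.

```python
def fit_line(x_values, y_values):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def intensity_to_amount(intensity, slope, intercept):
    """Invert the calibration line to estimate the amount (ng/sample)."""
    return (intensity - intercept) / slope


# Invented calibration points for illustration only.
spike_ng = [0.0, 1.0, 2.0, 4.0]              # "Spike amount" column
intensity = [100.0, 300.0, 500.0, 900.0]     # one compound's intensity column
slope, intercept = fit_line(spike_ng, intensity)
print(intensity_to_amount(700.0, slope, intercept))
```

In practice, `spike_ng` would be the "Spike amount" column and `intensity` one compound column from "Calibration_Curves.csv", with the fit repeated for each of the 12 compounds.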
Access information
Other publicly accessible locations of the data:
Data was derived from the following sources:
- Ultra-high resolution hybrid linear ion trap-Orbitrap mass spectrometer (Orbitrap Elite)
