Flow virometry for water-quality assessment: Protocol optimization for a model virus and automation of data analysis
Data files
Jan 04, 2023 version files 1.12 GB
Abstract
Flow virometry (FVM) can support advanced water treatment and reuse by delivering near real-time information about viral water quality. Maximizing the potential of FVM in these applications, however, requires protocols that facilitate data validation and interlaboratory comparison, as well as approaches to protocol design that extend the suite of viruses FVM can feasibly and efficiently monitor. In the npj Clean Water article “Flow virometry for water-quality assessment: Protocol optimization for a model virus and automation of data analysis,” we address these needs by first optimizing a sample-preparation protocol for a model virus (T4 bacteriophage) using a fractional factorial experimental design. We then compare manual and algorithmic methods of analyzing complex flow cytometry (FCM) data collected by applying the optimized protocol to (i) a clean solution spiked with a variety of biological and non-biological viral surrogates [mixed-target experiment], and (ii) tertiary treated wastewater effluent spiked with T4 bacteriophage and two sizes of fluorescent polystyrene beads [environmental spike experiment]. This repository contains the FCM data used to develop the optimized protocol and to test the two analytical methods.
Methods
All data were collected by analyzing a 10-µL volume of the sample in question using the 488 nm (blue) solid-state laser, the lowest possible instrument flow rate (5 µL/min), and a FITC = 800 threshold on a NovoCyte 2070V Flow Cytometer coupled with a NovoSampler Pro autosampler (Agilent). Green fluorescence (FITC) intensity was collected at 530 ± 30 nm; forward and side scatter (FSC and SSC) intensities were collected as well. For the optimization experiments, 10 µL of an unstained control was run after each sample. The instrument was flushed between each sample and control by running 150 µL of 1x NovoClean solution (Agilent) followed by 150 µL of Milli-Q (MQ) water through the sample injection probe (SIP) at the highest instrument flow rate (120 µL/min). Instrument performance was verified by running the instrument’s built-in quality control (QC) test at least monthly. The FCM data were exported directly to .fcs files, the standard format for flow cytometry/virometry data. All of the raw .fcs files used for the optimization experiments, mixed-target experiments, and environmental spike experiments are provided in this repository. For the mixed-target and environmental spike experiments, these .fcs files were then manually gated and exported to .csv files for use in downstream, algorithmically assisted analysis. Each of these .csv files is provided in this repository as well.
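For readers who want a quick look at an .fcs file before opening it in dedicated software, the following is a minimal Python sketch of reading the fixed-width file header and the keyword/value pairs of the TEXT segment. It is not part of the authors' workflow. The function name `parse_fcs_header`, the synthetic demo bytes, and the channel labels `FSC-H` and `FITC-H` are illustrative assumptions, and the sketch does not handle escaped delimiters inside values or files whose segment offsets are stored only in TEXT keywords. A maintained FCS reader is the better choice for real analysis.

```python
def parse_fcs_header(raw: bytes) -> dict:
    """Read the 58-byte FCS header and the TEXT-segment keywords (sketch only)."""
    version = raw[0:6].decode("ascii")
    if not version.startswith("FCS"):
        raise ValueError("not an FCS file")

    def offset(lo: int, hi: int) -> int:
        # Segment offsets are right-justified ASCII integers, 8 characters each.
        return int(raw[lo:hi].decode("ascii").strip())

    text_start, text_end = offset(10, 18), offset(18, 26)
    data_start, data_end = offset(26, 34), offset(34, 42)

    # The first character of the TEXT segment is the keyword/value delimiter.
    # Decoded leniently here; escaped (doubled) delimiters are NOT handled.
    text = raw[text_start:text_end + 1].decode("latin-1")
    delimiter = text[0]
    fields = text[1:].split(delimiter)
    if fields and fields[-1] == "":
        fields.pop()  # drop the empty piece after the closing delimiter
    keywords = dict(zip(fields[0::2], fields[1::2]))

    return {
        "version": version,
        "data_span": (data_start, data_end),
        "keywords": keywords,
    }


# Synthetic stand-in for a real file: hypothetical keywords and channel names.
demo_text = "|$TOT|3|$PAR|2|$P1N|FSC-H|$P2N|FITC-H|"
demo_header = (
    "FCS3.1" + " " * 4
    + f"{58:>8}" + f"{58 + len(demo_text) - 1:>8}"  # TEXT start and end
    + f"{0:>8}" * 4                                  # DATA and ANALYSIS offsets
)
demo_info = parse_fcs_header((demo_header + demo_text).encode("ascii"))
```

With a real file, the same function could be applied to the bytes returned by `open(path, "rb").read()`; the `$TOT` and `$PnN` keywords then give the event count and channel names recorded by the instrument.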
Usage notes
For our analysis, we used the following tools: FlowJo™ 10 software (Becton, Dickinson and Company) to manually open and analyze the .fcs files, and to convert these files to .csv files where needed; RStudio (version 2021.9.1.372) for downstream analysis of results and data obtained using FlowJo; MATLAB® software (version R2021a; MathWorks) to perform additional downstream analysis; and Excel (version 16.68; Microsoft) to manually inspect the .csv files. A list of free and open-source alternative programs that can be used to analyze the .fcs files contained in this repository can be found at https://floreada.io/flow-cytometry-software. A variety of free alternative programs (e.g., Google Sheets) can be used to analyze the .csv files contained in this repository.
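The exported .csv files can also be inspected with a few lines of script instead of a spreadsheet. The sketch below uses only the Python standard library and computes the event count, mean intensity, and coefficient of variation for one channel. The column name `FITC-H`, the function name `summarize_channel`, and the three demo values are assumptions for illustration; the actual column headers should be checked in the files themselves.

```python
import csv
import io
from statistics import fmean, stdev


def summarize_channel(handle, column: str) -> dict:
    """Return event count, mean, and CV (%) for one column of a gated .csv export."""
    values = [float(row[column]) for row in csv.DictReader(handle)]
    mean = fmean(values)
    # Sample standard deviation (n - 1 denominator); this is an assumption,
    # since the dataset description does not state which form was used.
    cv_percent = 100.0 * stdev(values) / mean
    return {"count": len(values), "mean": mean, "cv_percent": cv_percent}


# Made-up demo data with a hypothetical column layout.
demo_csv = io.StringIO("FSC-H,FITC-H\n1200,100\n1500,200\n1100,300\n")
demo_summary = summarize_channel(demo_csv, "FITC-H")
```

For a file on disk, `handle` would be the object returned by `open(path, newline="")`.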
For the optimization experiments, subsequent direct analysis of these data (manual gating and calculation of the number, mean fluorescence intensity, and coefficient of variation of all gated particles) was performed using FlowJo™ 10 software (Becton, Dickinson and Company). The FrF2 package in RStudio (version 2021.9.1.372) was then used to quantify the main and two-way interaction effects of each factor tested in the optimization. Documentation for this package is available at https://www.rdocumentation.org/packages/FrF2/versions/2.1/topics/FrF2-package.
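To illustrate what a main effect means in a fractional factorial design, the following Python sketch builds a small two-level half-fraction design and estimates main effects as the difference between the mean response at the high and low level of each factor. It does not reproduce the FrF2 package or the design used in the companion article: the three generic factors `A`, `B`, `C`, the generator C = A·B, and the response values are all invented for illustration.

```python
from itertools import product
from statistics import fmean


def half_fraction_design() -> list:
    """Four runs of a 2^(3-1) design: full factorial in A and B, with C = A*B."""
    return [{"A": a, "B": b, "C": a * b} for a, b in product((-1, 1), repeat=2)]


def main_effects(design: list, responses: list) -> dict:
    """Main effect = mean response at level +1 minus mean response at level -1."""
    effects = {}
    for factor in design[0]:
        high = [y for run, y in zip(design, responses) if run[factor] == 1]
        low = [y for run, y in zip(design, responses) if run[factor] == -1]
        effects[factor] = fmean(high) - fmean(low)
    return effects


demo_design = half_fraction_design()
# Invented responses, listed in the same order as the runs in demo_design.
demo_responses = [7.0, 9.0, 13.0, 11.0]
demo_effects = main_effects(demo_design, demo_responses)
```

In a design this small, the estimate for `C` cannot be separated from the A×B interaction because the two are aliased by construction. Larger fractions are chosen so that main effects and the two-way interactions of interest can be estimated separately.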
For the mixed-target and environmental spike experiments, subsequent direct analysis of these data through manual gating was performed using the same FlowJo™ 10 software, which was then used to export the gated data to .csv files. A log transformation was applied to these data, after which features were standardized by centering and rescaling to standard deviation 1. RStudio (version 2021.9.1.372) was used to apply the OPTICS implementation available in the dbscan package (Hahsler et al. 2019). MATLAB® software (version R2021a; MathWorks) was then used to inspect reachability plots of the OPTICS-ordered data for manual extraction. Finally, we applied the opticskxi package available in R (Charlton 2019) for automated extraction, with parameters described in the companion article to this dataset.
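The preprocessing step that precedes clustering (log transformation, then centering and rescaling each feature to standard deviation 1) can be sketched in plain Python as follows. The base-10 logarithm, the use of the sample standard deviation, the function name `log_standardize`, and the demo values are assumptions, since the description above does not specify them. The clustering itself (OPTICS) is not reproduced here.

```python
import math
from statistics import fmean, stdev


def log_standardize(rows: list) -> list:
    """log10-transform each feature, then center to mean 0 and scale to SD 1.

    Assumes every value is strictly positive and that each feature (column)
    has at least two distinct values, so the standard deviation is non-zero.
    """
    logged = [[math.log10(value) for value in row] for row in rows]
    columns = list(zip(*logged))
    means = [fmean(column) for column in columns]
    sds = [stdev(column) for column in columns]  # sample SD (n - 1 denominator)
    return [
        [(value - m) / s for value, m, s in zip(row, means, sds)]
        for row in logged
    ]


# Made-up events: each row is one particle, each column one feature.
demo_events = [[10.0, 100.0], [100.0, 1000.0], [1000.0, 10000.0]]
demo_scaled = log_standardize(demo_events)
```

Standardizing after the log transform keeps any single channel from dominating the distance calculations that density-based clustering methods such as OPTICS rely on.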