Vibrio pectenicida strain FHCF-3 is a causative agent of sea star wasting disease
Data files
Dec 17, 2025 version files 1.19 MB
- Fig2_ExtDFig1_2_datacode.zip (1.13 MB)
- Fig3_Data_Code.zip (18.17 KB)
- Fig4_Data_Code.zip (4.49 KB)
- README.md (17.86 KB)
- Script_16S_QIIME2_decontam.sh (9.92 KB)
- Script_Metatranscriptomics.sh (6.24 KB)
Abstract
More than ten years following the onset of the sea star wasting disease (SSWD) epidemic, affecting over 20 asteroid species from Mexico to Alaska, the causative agent has been elusive. SSWD killed billions of the most susceptible species, sunflower sea stars (Pycnopodia helianthoides), initiating a trophic cascade involving unchecked urchin population growth and widespread loss of kelp forests. Identifying the causative agent underpins the development of recovery strategies. Here, we induced disease and subsequent mortality in exposure experiments using tissue extracts, coelomic fluid, and effluent water from wasting sunflower sea stars, with no mortality in controls. Deep sequencing of diseased sea star coelomic fluid samples from experiments and field outbreaks revealed a dominant proportion of reads assigned to the causative agent. Fulfilling Koch’s postulates, the pathogen, cultured from the coelomic fluid of a diseased sunflower sea star, caused disease and mortality in exposed sunflower sea stars, demonstrating that it is a causative agent of SSWD. This discovery will enable recovery efforts for sea stars and the ecosystems affected by their decline by facilitating culture-based experimental research and broad-scale screening for pathogen presence and abundance in the laboratory and field.
Dryad DOI: https://doi.org/10.5061/dryad.5mkkwh7g9
Title: Vibrio pectenicida strain FHCF-3 is a causative agent of sea star wasting disease
Authors: Melanie B. Prentice1,2*, Grace Crandall3, Amy M. Chan1, Katherine M. Davis1,2†, Paul K. Hershberger4, Jan F. Finke1,2, Jason Hodin5, Andrew McCracken6, Colleen T. E. Kellogg2, Rute B. G. Clemente-Carvalho2, Carolyn Prentice2, Kevin X. Zhong1, Drew Harvell5,7, Curtis A. Suttle1,8,9,10, Alyssa-Lois M. Gehman2, 10*
Affiliations:
1 Department of Earth, Ocean and Atmospheric Sciences, The University of British Columbia, Vancouver, BC, Canada.
2 Hakai Institute; Campbell River, BC, Canada.
3 School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA, USA.
4 U.S. Geological Survey, Western Fisheries Research Center, Marrowstone Marine Field Station; Nordland, WA, USA.
5 Friday Harbor Laboratories, University of Washington; Friday Harbor, WA, USA.
6 Department of Biology, University of Vermont; Burlington, VT, USA.
7 Department of Ecology and Evolutionary Biology, Cornell University; Ithaca, NY, USA.
8 Department of Microbiology & Immunology, The University of British Columbia; Vancouver, BC, Canada.
9 Department of Botany, The University of British Columbia, Vancouver, BC, Canada.
10 Institute for the Oceans and Fisheries, The University of British Columbia; Vancouver, BC, Canada.
†Current address: Washington Department of Fish and Wildlife, Port Townsend, USA.
*Corresponding authors. E-mail: melanie.prentice@ubc.ca, alyssamina@gmail.com
Name: Alyssa-Lois M. Gehman
Institution: The Hakai Institute & The University of British Columbia
Email: alyssamina@gmail.com, alyssa.gehman@hakai.org
Name: Melanie Prentice
Institution: The University of British Columbia & The Hakai Institute
Email: melanie.prentice@ubc.ca, melanie.prentice@hakai.org
Dataset Overview
This dataset contains the data and code required to replicate analyses for metatranscriptomic and 16S rRNA gene amplicon datasets.
We also provide R Code and associated data required to reproduce all figures associated with this manuscript.
Dates of Data Collection
A. Metatranscriptomic dataset: 2022
B. 16S rRNA gene amplicon datasets: 2022-2023
Data Spatial Scope
Genetic datasets were generated from samples collected from controlled exposure experiments on wild sunflower sea stars (Pycnopodia helianthoides) collected in Washington, USA.
Genetic data were also generated from samples collected from wild populations of sunflower sea stars in British Columbia, Canada.
Funding
The Nature Conservancy of California (DH, A-LMG)
The Tula Foundation (A-LMG, CAS)
Natural Sciences and Engineering Research Council of Canada Discovery Grant RPGIN-2020-06515 (CAS)
Canada Foundation for Innovation and British Columbia Knowledge Development Fund Infrastructure award #25412 (CAS)
The University of British Columbia
The U.S. Geological Survey, Biological Threats Research Program, Ecosystems Mission Area (PKH)
Quantitative and Evolutionary STEM Traineeship NRT-1735316 (AM)
Ethics Approval
Wild P. helianthoides used in experiments were collected under scientific collection permits issued by the Washington Department of Fish and Wildlife (WDFW) (permit no. HARVELL 21-1172R, 22-175, 23-087, 24-053).
Captive-bred P. helianthoides were transported to the USGS Marrowstone Marine Field Station under WDFW shellfish transfer permits (permit no. 23-1249, 24-1249).
Samples from wild P. helianthoides collected in British Columbia, Canada, were collected under scientific collection permit XMCFR 18 2023 issued by Fisheries and Oceans Canada.
Although research with sea stars is not explicitly regulated within the United States, we followed the ethical principle of reduction in all experiments, using the minimum number of required individuals per experiment to retain statistical power.
Facilities housing the animals involved were accredited by the Association for Assessment and Accreditation of Laboratory Animal Care International (AAALAC) and inspected regularly by the Institutional Animal Care and Use Committee (IACUC) at the U.S. Geological Survey, Western Fisheries Research Center.
Sharing/Access information
Metatranscriptomic and 16S rRNA gene sequence datasets are archived in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (BioProject Number PRJNA1195080).
The whole genome of V. pectenicida strain FHCF-3 is available from the NCBI GenBank Repository (Accession Number JBLZMR000000000), with raw sequence reads archived in the NCBI Sequence Read Archive (BioProject Number PRJNA1232168).
The complete 16S rRNA gene sequences of V. pectenicida strain FHCF-3 are deposited in the NCBI GenBank Repository (Accessions PQ700178, PQ763222-PQ763229).
Fig2_ExtDFig1_2_datacode.zip
Content Description
Contains:
- Fig2_ExtDFig1_2_datacode.Rproj file for launching the project in R
- Fig2_ExtDFig1_2_data_share.Rmd file with script used to generate figures
- Fig2_data_21Mar25.csv file containing the data required to generate the Fig. 2 figure
Column B (experiment_number): name or number of the experiment
Column C (tmt): treatment type - type of exposure
Column D (years): year of experiment
Column E (response): categories of response being plotted, where "mean_start_twist_day" is the average number of days to stars starting to twist their arms, "mean_start_drop_day" is the average number of days before stars dropped an arm, and "mean_mort_days" is the average number of days to mortality
Column F (exposure_days): the average value for each response in days since exposure
Column G (response_type): labels for each response for the legend
Column H (sdresponse): standard deviation of the mean values
Column I (sd): the value for plotting (sdresponse)
Column J (facet): a variable indicating which panel of the figure to plot the data in
- Extended_Data_Fig1_data.csv file containing the data required to generate the Extended Data Fig. 1 figure
Column B (col.loc): treatment group names, where "lab" is lab-raised sea stars and "wild" is sea stars collected from the wild
Column C (count): sample size for each group
Column D (response): categories of response being plotted, where "mean_start_twist_day" is the average number of days to stars starting to twist their arms, "mean_start_drop_day" is the average number of days before stars dropped an arm, and "mean_mort_days" is the average number of days to mortality
Column E (exposure_days): the average value for each response in days since exposure
Column F (response_type): labels for each response for the legend
Column G (sdresponse): standard deviation of the mean values
Column H (sd): the value for plotting (sdresponse)
- Extended_Data_Fig2_data.csv file containing the data required to generate the Extended Data Fig. 2 figure
Column B (star_ID): individual star ID
Column C (exp_number): experiment number
Column D (armtwist): number of arms twisting at time of observation
Column E (exp_date): number of days post-exposure (inoculation) that observation was made
Column F (control_cat): treatment group, where "exposed" are inoculated with live inocula, and "control" are inoculated with heat-killed inocula
Column G (inoc_type): type of inocula used for exposure, where "homogenate" is tissue homogenate, "CF" is coelomic fluid, "water" is water (which is not injection but immersion exposure), and "culture" is Vibrio pectenicida bacterial culture
Column H (star_arm_count): total number of arms each individual has
Column I (movement): whether or not movement was observed since the last observation, where "NA" is no data, "1" is movement, and "0" is no movement
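The exposure_days and sdresponse columns described above are per-group means and standard deviations of days to each response. As a minimal sketch of how such summary values could be derived from individual-level observations, the Python snippet below uses only the standard library. The authors' own code is in R; the function name, the toy values, and the use of the sample (n - 1) standard deviation are assumptions made for illustration, not details taken from the archived scripts.

```python
from statistics import mean, stdev


def summarise_days(days):
    """Return (mean, sample standard deviation) for a list of day counts."""
    if len(days) < 2:
        raise ValueError("need at least two observations for a sample SD")
    return mean(days), stdev(days)


# Made-up days-to-response values for one treatment group (not real data).
toy_days = {
    "mean_start_twist_day": [6, 8, 10],
    "mean_start_drop_day": [9, 11, 13],
    "mean_mort_days": [12, 14, 16],
}

# One (mean, sd) pair per response category, mirroring the long-format CSV.
summary = {response: summarise_days(days) for response, days in toy_days.items()}

for response, (avg, sd) in summary.items():
    print(f"{response}: {avg:.1f} +/- {sd:.1f} days")
```

Each key corresponds to one value of the response column, so every (mean, sd) pair maps to one row of the figure data for a given treatment.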
Fig3_Data_Code.zip
Content Description
Contains:
- RScript_Fig3_Metatranscriptomics_PCoA.R R script for generating the figure
- Data_Normalized_Count_Matrix.csv data file containing the data required to generate the figure
Row headers (down column A): sample IDs
Column headers (along row 1): sequence annotations (i.e., family, genus or species IDs of microbial taxa identified in the metatranscriptomic sequencing dataset)
- Vectors.csv data file generated during the analysis; this file was filtered by r2 value to create a new file that retains only the significant vectors used for plotting
Column A is species IDs
Column D is the r2 value used to sort and select species vectors to annotate on the plot
- Vectors_Sig_ForPlotting.csv file generated during the analysis; this file contains the significant vectors for plotting
The subset of vectors from the Vectors.csv file to annotate on the plot (r2 > 0.9)
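The filtering step that turns Vectors.csv into Vectors_Sig_ForPlotting.csv can be sketched as follows. This is an illustrative Python version using only the standard library, not the authors' R code. The header names (species, axis1, axis2, r2) and the toy rows are hypothetical; only the column positions (species IDs in column A, r2 in column D) and the r2 > 0.9 cutoff come from the description above.

```python
import csv
import io

R2_THRESHOLD = 0.9  # cutoff stated in the README (r2 > 0.9)


def filter_significant_vectors(rows, r2_key="r2", threshold=R2_THRESHOLD):
    """Keep rows whose r2 strictly exceeds the threshold, highest r2 first."""
    kept = [row for row in rows if float(row[r2_key]) > threshold]
    return sorted(kept, key=lambda row: float(row[r2_key]), reverse=True)


# Toy stand-in for Vectors.csv; header names and values are made up.
toy_csv = io.StringIO(
    "species,axis1,axis2,r2\n"
    "taxon_a,0.80,0.10,0.95\n"
    "taxon_b,0.20,0.30,0.40\n"
    "taxon_c,-0.70,0.60,0.91\n"
)

rows = list(csv.DictReader(toy_csv))
significant = filter_significant_vectors(rows)

for row in significant:
    print(row["species"], row["r2"])
```

Sorting by r2 before selecting mirrors the "sort and select" wording above; a strict inequality is used because the README gives the cutoff as r2 > 0.9.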
Fig4_Data_Code.zip
Content Description
Contains:
- Fig4_Data&Code.Rmd R script for generating the figure
- data_2022E2_CFSamples_Vpec_RelativeAbundance.csv data file containing the data required to generate the figure panel A
Column A (Sample ID): the sample ID
Column B (Time): the time point that the sample was collected during the experiment, where "pre-inoculation" samples were collected before sea stars were exposed to SSWD (i.e., before injection with inoculum) and "post-inoculation" samples were collected post-exposure (injection); "inoculum" is a sample of the inoculum used for the exposure
Column C (Experiment Treatment): experimental treatment group, where "exposed" individuals were injected with coelomic fluid collected from a wasting sunflower sea star, and "control" individuals were injected with heat-killed coelomic fluid
Column D (Prop.Vp.Reads): relative abundance of 16S rRNA amplicon sequencing reads annotated as Vibrio pectenicida in each individual's dataset
- data_Library4Field_CFSamples_Vpec_RelativeAbundance.csv data file containing the data required to generate the figure panel B
Column A (Sample.ID): the sample ID
Column B (Site): the name of the site where sampling occurred
Column C (October.Disease.Signs): disease signs observed at the site in October, where "naive" indicates no disease observed at the site, and both "field exposed" and "wasting" indicate disease observed at the site
Column D (Time): the month the sample was collected
Column E (Disease.Signs): disease signs observed at the time of sample collection, where "naive" is no disease signs observed in the individual being sampled or any other individuals at the site, "field exposed" is no disease signs observed in the individual being sampled but disease signs observed in other individuals at the site, and "wasting" is disease signs observed in the individual being sampled
Column F (Prop.Vp.Reads): relative abundance of 16S rRNA amplicon sequencing reads annotated as Vibrio pectenicida in each individual's dataset
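The Prop.Vp.Reads columns hold a relative abundance, that is, the share of a sample's reads annotated as Vibrio pectenicida. The sketch below shows one way such a proportion and a per-group mean could be computed in Python. It assumes the denominator is the sample's total read count after filtering, which the README does not state; the function name and the toy counts are hypothetical, and the authors' own analysis is in R and QIIME2.

```python
from collections import defaultdict
from statistics import mean


def relative_abundance(target_reads, total_reads):
    """Proportion of a sample's reads assigned to the target taxon."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return target_reads / total_reads


# Hypothetical samples: (sample ID, disease signs, target reads, total reads).
toy_samples = [
    ("S1", "naive", 0, 1000),
    ("S2", "field exposed", 50, 1000),
    ("S3", "wasting", 600, 1000),
    ("S4", "wasting", 800, 1000),
]

# Collect per-sample proportions under each disease-sign category.
by_group = defaultdict(list)
for sample_id, signs, vp_reads, total in toy_samples:
    by_group[signs].append(relative_abundance(vp_reads, total))

group_means = {signs: mean(values) for signs, values in by_group.items()}
print(group_means)
```

Grouping by the Disease.Signs categories is shown only to connect the proportion to the panel B groupings; the figure itself may plot individual samples rather than group means.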
Script_16S_QIIME2_decontam.sh
Content Description
This script contains an overview of the code used to run the analysis of 16S rRNA gene amplicon datasets.
This analysis was run using QIIME2, with an intermediate section requiring the use of R.
This script was run interactively (i.e., line by line) rather than submitted as a bash script, as there were multiple instances where intermediate files needed to be examined to inform parameter selection and filtering.
Within the script, there is a section of code that is run in R. This can be done either within the command shell or by downloading the relevant files, running the analysis in R/RStudio, and re-importing the resulting files to continue the QIIME2 analysis.
Script_Metatranscriptomics.sh
Content Description
This script contains an overview of the code used to run the analysis of metatranscriptomic data.
This script provides a complete overview of the analytical pipeline used, but it was executed in bash in sections, as independent scripts corresponding to different stages of analysis.
Individual stages within the script require that variables (e.g., paths to directories/files) be set. An explanation of all variables is provided in the script so the user can modify as necessary.
References
B. Buchfink, K. Reuter, H-G. Drost, Sensitive protein alignments at tree-of-life scale using DIAMOND. Nature Methods 18, 366-368 (2021).
E. Bolyen, J. R. Rideout, M. R. Dillon, N. A. Bokulich, C. C. Abnet, G. A. Al-Ghalith, H. Alexander, E. J. Alm, M. Arumugam, F. Asnicar, Y. Bai, J. E. Bisanz, K. Bittinger, A. Brejnrod, C. J. Brislawn, C. T. Brown, B. J. Callahan, A. M. Caraballo-Rodríguez, J. Chase, E. K. Cope, R. Da Silva, C. Diener, P. C. Dorrestein, G. M. Douglas, D. M. Durall, C. Duvallet, C. F. Edwardson, M. Ernst, M. Estaki, J. Fouquier, J. M. Gauglitz, S. M. Gibbons, D. L. Gibson, A. Gonzalez, K. Gorlick, J. Guo, B. Hillmann, S. Holmes, H. Holste, C. Huttenhower, G. A. Huttley, S. Janssen, A. K. Jarmusch, L. Jiang, B. D. Kaehler, K. B. Kang, C. R. Keefe, P. Keim, S. T. Kelley, D. Knights, I. Koester, T. Kosciolek, J. Kreps, M. G. I. Langille, J. Lee, R. Ley, Y-X. Liu, E. Loftfield, C. Lozupone, M. Maher, C. Marotz, B. D. Martin, D. McDonald, L. J. McIver, A. V. Melnik, J. L. Metcalf, S. C. Morgan, J. T. Morton, A. T. Naimey, J. A. Navas-Molina, L. F. Nothias, S. B. Orchanian, T. Pearson, S. L. People, D. Petras, M. L. Preuss, E. Pruesse, L. B. Rasmussen, A. Rivers, M. S. Robeson II, P. Rosenthal, N. Segata, M. Schaffer, A. Shiffer, R. Sinha, S. J. Song, J. R. Spear, A. D. Swafford, L. R. Thompson, P. J. Torres, P. Trinh, A. Tripathi, P. J. Turnbaugh, S. Ul-Hasan, J. J. J. van der Hooft, F. Vargas, Y. Vázquez-Baeza, E. Vogtmann, M. von Hippel, W. Walters, Y. Wan, M. Wang, J. Warren, K. C. Weber, C. H. D. Williamson, A. D. Willis, Z. Z. Xu, J. R. Zaneveld, Y. Zhang, Q. Zhu, R. Knight, J. G. Caporaso, Reproducible, interactive, scalable, and extensible microbiome data science using QIIME 2. Nature Biotechnology 37, 852-857 (2019).
N. A. Bokulich, B. D. Kaehler, J. R. Rideout, M. Dillon, E. Bolyen, R. Knight, G. A. Huttley, J. G. Caporaso, Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin. Microbiome 6, 90 (2018).
D. McDonald, Y. Jiang, M. Balaban, K. Cantrell, Q. Zhu, A. Gonzalez, J. T. Morton, G. Nicolaou, D. H. Parks, S. M. Karst, M. Albertsen, P. Hugenholtz, T. DeSantis, S. J. Song, A. Bartko, A. S. Havulinna, P. Jousilahti, S. Cheng, M. Inouye, T. Niiranen, M. Jain, V. Salomaa, L. Lahti, S. Mirarab, R. Knight, Greengenes2 unifies microbial data in a single reference tree. Nature Biotechnology 42, 715-718 (2024).
H. Lin, S. Das Peddada, Analysis of compositions of microbiomes with bias correction. Nature Communications 11, 3514 (2020).
D. Lüdecke, sjPlot: Data Visualization for Statistics in Social Science, R package version 2.8.17 (2024); https://CRAN.R-project.org/package=sjPlot.
A. M. Bolger, M. Lohse, B. Usadel, Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114-2120 (2014).
S. Andrews, FastQC: A quality control tool for high throughput sequence data (2010). Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc.
P. Ewels, M. Magnusson, S. Lundin, M. Käller, MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32, 3047-3048 (2016).
H. Li, Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997v1 [q-bio.GN] (2013).
E. Bushmanova, D. Antipov, A. Lapidus, A. D. Prjibelski, rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data. Gigascience 8, giz100 (2019).
D. H. Huson, S. Beier, I. Flade, A. Górska, M. El-Hadidi, S. Mitra, H-J. Ruscheweyh, R. Tappu, MEGAN Community Edition: Interactive exploration and analysis of large-scale microbiome sequencing data. PLOS Computational Biology 12, e1004957 (2016).
R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing (2024); https://www.R-project.org/.
J. Oksanen, G. Simpson, F. Blanchet, R. Kindt, P. Legendre, P. Minchin, R. O'Hara, P. Solymos, M. Stevens, E. Szoecs, H. Wagner, M. Barbour, M. Bedward, B. Bolker, D. Borcard, G. Carvalho, M. Chirico, M. De Caceres, S. Durand, H. Evangelista, R. FitzJohn, M. Friendly, B. Furneaux, G. Hannigan, M. Hill, L. Lahti, D. McGlinn, M. Ouellette, E. Ribeiro Cunha, T. Smith, A. Stier, C. Ter Braak, J. Weedon, vegan: Community Ecology Package, R package version 2.7-0 (2024); https://github.com/vegandevs/vegan.
H. Wickham, ggplot2: Elegant Graphics for Data Analysis, R package version 3.5.1 (2016); https://ggplot2.tidyverse.org.
M. Martin, Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10 (2011).
B. J. Callahan, P. J. McMurdie, M. J. Rosen, A. W. Han, A. J. A. Johnson, S. P. Holmes, DADA2: high-resolution sample inference from Illumina amplicon data. Nature Methods 13, 581-583 (2016).
N. M. Davis, D. Proctor, S. P. Holmes, D. A. Relman, B. J. Callahan, Simple statistical identification and removal of contaminant sequences in marker-gene and metagenomics data. Microbiome 6, 226 (2018).
M. E. Brooks, K. Kristensen, K. J. van Benthem, A. Magnusson, C. W. Berg, A. Nielsen, H. J. Skaug, M. Mächler, B. M. Bolker, glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. R J. 9, 378-400 (2017).
