Effects of environmental translocation and host characteristics on skin microbiomes of sun-basking fish

Cite this dataset

Berggren, Hanna et al. (2023). Effects of environmental translocation and host characteristics on skin microbiomes of sun-basking fish [Dataset]. Dryad. https://doi.org/10.5061/dryad.w9ghx3fw7

Abstract

Variation in the composition of skin-associated microbiomes has been attributed to host species, geographic location, and habitat, but the role of intraspecific phenotypic variation among host individuals remains elusive. We explored if and how host environment and different phenotypic traits were associated with microbiome composition. We conducted repeated sampling of dorsal and ventral skin microbiomes of carp individuals (Cyprinus carpio) before and after translocation from laboratory conditions to a semi-natural environment. Both alpha and beta diversity of skin-associated microbiomes increased substantially within and among individuals following translocation, particularly on dorsal body sites. The variation in microbiome composition among hosts was significantly associated with body site, sun-basking, habitat switch, and growth, but not temperature gain while basking, sex, personality, or colour morph. We suggest that the overall increase in the alpha and beta diversity estimates among hosts was induced by individuals expressing greater variation in behaviours, and thus in exposure to potential colonizers, in the pond environment compared to the laboratory. Our results exemplify how biological diversity at one level of organisation (phenotypic variation among and within fish host individuals) together with the external environment impacts biological diversity at a higher hierarchical level of organisation (richness and composition of fish-associated microbial communities).

README

This README file was generated on 2023-11-13 by Hanna Berggren.

GENERAL INFORMATION

  1. Title of Dataset: "Effects of environmental translocation and host characteristics on skin microbiomes of sun-basking fish"

  2. Author Information
    A. Principal Investigator Contact Information
    Name: Anders Forsman
    Institution: Linnaeus University
    Address: Kalmar, Sweden
    Email: anders.forsman@lnu.se

    B. Associate or Co-investigator Contact Information
    Name: Hanna Berggren
    Institution: Linnaeus University
    Address: Kalmar, Sweden
    Email: hanna.e.berggren@gmail.com

We explored if and how host environment and different phenotypic traits were associated with microbiome composition. We conducted repeated sampling of dorsal and ventral skin microbiomes of carp individuals (Cyprinus carpio) before and after translocation from laboratory conditions to a semi-natural environment. We collected microbiome samples from dorsal and ventral body sites for each individual (n = 27) and environment (laboratory versus pond) representing 108 samples in total.
Sequencing libraries were prepared using the primer pair 341F and 805R, which targets the V3-V4 hypervariable regions of the bacterial 16S rRNA gene, and were sequenced on the Illumina MiSeq platform (Illumina, USA) using the MiSeq Reagent Kit v3 with 600 cycles (2×300 bp).

SHARING/ACCESS INFORMATION

  1. Licenses/restrictions placed on the data: CC0 1.0 Universal (CC0 1.0) Public Domain

  2. Links to publications that cite or use the data:

Berggren H, Nordahl O, Yildirim Y, Larsson P, Tibblin P, Forsman A. (2023). Effects of environmental translocation and host characteristics on skin microbiomes of sun-basking fish. Proceedings of the Royal Society B.

  3. Links to other publicly accessible locations of the data (raw sequence data):
     https://www.ncbi.nlm.nih.gov/bioproject/PRJNA714685
     https://www.ncbi.nlm.nih.gov/bioproject/PRJNA673155

  4. Links/relationships to ancillary data sets: None

  5. Was data derived from another source? No
    A. If yes, list source(s): NA

  6. Recommended citation for this dataset:

Berggren H, Nordahl O, Yildirim Y, Larsson P, Tibblin P, Forsman A. 2023 Data from: Effects of environmental translocation and host characteristics on skin microbiomes of sun-basking fish. Dryad Digital Repository (doi: https://doi.org/10.5061/dryad.w9ghx3fw7).

DATA & FILE OVERVIEW
The data are organised in two steps. Step 1 is raw data processing: the raw data originating from qiime2 v2018.8 (i.e., feature tables and taxonomy files) are processed by, for instance, removing sequences present in negative controls and separating sequences from different libraries into the correct files for further analysis. For this workflow, please see raw_data_processing.Rmd. The files resulting from this first step are provided on Dryad, so it is not necessary to run it before going through the statistical analyses and data transformations described in Sunbasking_carp_2022.Rmd.

Step 2 covers the statistical analyses and data transformations. The data provided can be used for all the data handling described in Sunbasking_carp_2022.Rmd. However, some derived variables, such as estimated richness, Shannon diversity and distance to centroid, are already included in the metadata file. Some of the transformed data (i.e., the centred log-ratio values) are not provided because of their very large file size. Please note that because these data transformations are based on probability estimates, the exact same file cannot be regenerated, which might lead to slightly different test statistics.

  1. File List:

Step 1:
A) raw_data_processing.Rmd
B) P11965-feature-table.tsv.zip
C) P11965_q2_taxonomy.zip
D) P12860-feature-table.tsv.zip
E) P12860_q2_taxonomy.zip
F) P13155-feature-table.tsv.zip
G) P13155_q2_taxonomy.zip
H) P13156-feature-table.tsv.zip
I) P13156_q2_taxonomy.zip
J) metadata-merged

Step 2:
K) Sunbasking_carp_2022.Rmd
L) metadata_carp_2021_03_14.tsv
M) carp_counts.tsv.zip
N) literature_search.txt
O) taxonomy_filtered.tsv.zip

LEfSe Results:
P) Supplementary Data File 1.txt
Q) Supplementary Datafile 2.txt

  2. Relationship between files, if important: None

  3. Additional related data collected that was not included in the current data package: None

  4. Are there multiple versions of the dataset? No
    A. If yes, name of file(s) that was updated: NA
    i. Why was the file updated? NA
    ii. When was the file updated? NA

#########################################################################

DATA-SPECIFIC INFORMATION FOR STEP 1: A-J
Raw data processing: preprocessing of the raw data and annotated taxonomy originating from qiime2 v2018.8 (i.e., feature tables), such as removing sequences present in negative controls, taxonomic filtering, and splitting samples into the correct files for further analysis. Please see raw_data_processing.Rmd. The files resulting from this step are provided, so it is not necessary to run it before going through the statistical analysis described in Sunbasking_carp_2022.Rmd.

DATA-SPECIFIC INFORMATION FOR: raw_data_processing.Rmd

R Markdown file containing code for removing sequences present in negative controls, taxonomic filtering, and splitting samples into the correct files for further analysis.
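
As a rough illustration of the first of these steps (this is not the code in raw_data_processing.Rmd, which is written in R), the Python sketch below drops every ASV that occurs in any negative-control sample from a long-format feature table. The function name, the toy sample ids, and the way negative controls are identified (a set of sample ids supplied by the caller) are assumptions made for this example only.

```python
def remove_control_asvs(rows, control_samples):
    """Drop ASVs seen in any negative control, and the controls themselves.

    rows: iterable of (seqid, sample, count) tuples (long-format feature table).
    control_samples: set of sample ids treated as negative controls. How the
    controls are labelled in the real data is not stated in this README, so
    the caller is assumed to know them.
    """
    rows = list(rows)
    contaminated = {seqid for seqid, sample, count in rows
                    if sample in control_samples and count > 0}
    return [(seqid, sample, count) for seqid, sample, count in rows
            if seqid not in contaminated and sample not in control_samples]


# Toy data: asv2 occurs in the negative control, so it is removed everywhere.
table = [
    ("asv1", "fish_01", 120),
    ("asv2", "fish_01", 15),
    ("asv2", "neg_ctrl", 4),
    ("asv3", "fish_02", 60),
]
cleaned = remove_control_asvs(table, {"neg_ctrl"})
print(cleaned)
```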

#########################################################################

DATA-SPECIFIC INFORMATION FOR: P11965-feature-table.tsv

  1. Number of variables: 3

  2. Number of cases/rows: 26988

  3. Variable List:

* seqid: unique ASV id originating from qiime2

* sample: sample id containing library information and sample number

* count: number of times the specific ASV occurred in the specific sample

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used: None

#########################################################################

DATA-SPECIFIC INFORMATION FOR: P11965_q2_taxonomy.zip

  1. Number of variables: 9

  2. Number of cases/rows: 9634

  3. Variable List:

* asv: unique ASV id originating from qiime2

* domain: taxonomic rank

* phylum: taxonomic rank

* class: taxonomic rank

* order: taxonomic rank

* family: taxonomic rank

* genus: taxonomic rank

* species: taxonomic rank

* Confidence: confidence level for assigned genus

  4. Missing data codes: NA (data not available)

  5. Specialized formats or other abbreviations used: None

############################################################

DATA-SPECIFIC INFORMATION FOR: P12860-feature-table.tsv

  1. Number of variables: 3

  2. Number of cases/rows: 26076

  3. Variable List:

* seqid: unique ASV id originating from qiime2

* sample: sample id containing library information and sample number

* count: number of times the specific ASV occurred in the specific sample

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used: None

#########################################################################

DATA-SPECIFIC INFORMATION FOR: P12860_q2_taxonomy.zip

  1. Number of variables: 9

  2. Number of cases/rows: 9675

  3. Variable List:

* asv: unique ASV id originating from qiime2

* domain: taxonomic rank

* phylum: taxonomic rank

* class: taxonomic rank

* order: taxonomic rank

* family: taxonomic rank

* genus: taxonomic rank

* species: taxonomic rank

* Confidence: confidence level for assigned genus

  4. Missing data codes: NA (data not available)

  5. Specialized formats or other abbreviations used: None

##########################################################

DATA-SPECIFIC INFORMATION FOR: P13155-feature-table.tsv

  1. Number of variables: 3

  2. Number of cases/rows: 52846

  3. Variable List:

* seqid: unique ASV id originating from qiime2

* sample: sample id containing library information and sample number

* count: number of times the specific ASV occurred in the specific sample

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used: None

#########################################################################

DATA-SPECIFIC INFORMATION FOR: P13155_q2_taxonomy.zip

  1. Number of variables: 9

  2. Number of cases/rows: 13170

  3. Variable List:

* asv: unique ASV id originating from qiime2

* domain: taxonomic rank

* phylum: taxonomic rank

* class: taxonomic rank

* order: taxonomic rank

* family: taxonomic rank

* genus: taxonomic rank

* species: taxonomic rank

* Confidence: confidence level for assigned genus

  4. Missing data codes: NA (data not available)

  5. Specialized formats or other abbreviations used: None

#########################################################

DATA-SPECIFIC INFORMATION FOR: P13156-feature-table.tsv

  1. Number of variables: 3

  2. Number of cases/rows: 44418

  3. Variable List:

* seqid: unique ASV id originating from qiime2

* sample: sample id containing library information and sample number

* count: number of times the specific ASV occurred in the specific sample

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used: None

#########################################################################

DATA-SPECIFIC INFORMATION FOR: P13156_q2_taxonomy.zip

  1. Number of variables: 9

  2. Number of cases/rows: 15486

  3. Variable List:

* asv: unique ASV id originating from qiime2

* domain: taxonomic rank

* phylum: taxonomic rank

* class: taxonomic rank

* order: taxonomic rank

* family: taxonomic rank

* genus: taxonomic rank

* species: taxonomic rank

* Confidence: confidence level for assigned genus

  4. Missing data codes: NA (data not available)

  5. Specialized formats or other abbreviations used: None

###########################################################

DATA-SPECIFIC INFORMATION FOR: metadata-merged.txt

  1. Number of variables: 6

  2. Number of cases/rows: 384

  3. Variable List:

* sample: sample id containing library information and sample number

* flowcell: library name

* sample_type: sample originated from water or fish

* study: indicates which study species was used

* date_extraction: Date of DNA extraction

* kit_used: samples with same number were extracted using the same kit-package.

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used: None

###########################################################

DATA-SPECIFIC INFORMATION FOR STEP 2: K-O

K) Sunbasking_carp_2022.Rmd
L) metadata_carp_2021_03_14.tsv
M) carp_counts.tsv.zip
N) literature_search.txt
O) taxonomy_filtered.tsv.zip

DATA-SPECIFIC INFORMATION FOR: Sunbasking_carp_2022.Rmd

R-markdown with the code for statistical analysis.

#########################################################################

DATA-SPECIFIC INFORMATION FOR: metadata_carp_2021_03_14.tsv

  1. Number of variables: 36

  2. Number of cases/rows: 116

  3. Variable List:

* sample: sample id containing library information and sample number

* est: estimated richness from breakaway package

* flowcell: sequence library name

* samplename: original sample name

* sample_type: sample originated from water or fish

* study: indicates which study species was used

* sampling_location: whether the sample was collected when the fish were in the laboratory or in the field (pond)

* Individ: The individual ID number, which was also used in Nordahl O, Tibblin P, Koch-Schmidt P, Berggren H, Larsson P, Forsman A. Sun-basking fish benefit from body temperatures that are higher than ambient water. Proceedings of the Royal Society B-Biological Sciences. 2018;285(1879).

* occasion: experimental period (pre or post translocation)

* body_cite.c: abbreviated exact body position: VL= ventral left, VR = Ventral right, DL= dorsal left, DR= dorsal right

* body_site: whether sample was taken on dorsal or ventral body sites

* tube_number: number of the tube the sample was stored in

* date_extraction: Date of DNA extraction

* kit_used: samples with same number were extracted using the same kit-package.

* cm_release: length of fish in cm at time of release into pond

* cm_recap: length of fish in cm at time of recapture from pond

* gr_recap: weight of fish in gram at time of recapture from pond

* gr_release: weight of fish in gram at time of release into pond

* Date of release into the pond: as described in variable name

* Date of recapture: as described in variable name

* PIT_tag_number: ID number on the individual fish PIT-tag

* DST_number: ID number on the individual fish DST-tag

* Colour: which colour morph the fish belongs to

* Sex: sex of the fish

* Weight: growth in gram during period in pond (i.e., weight gain calculated from gr_recap - gr_release)

* Bold_Shy: value of bold-shy continuum

* Thermal_switch: We extracted the maximum and minimum in body temperature of each individual during each hour of the last week in the pond and calculated a delta value (max-min) in °C that was summed up for the whole period.

* Basking: accumulated time of dormancy in the surface layer (measured in minutes) during periods of time categorized as basking

* Mean_basktemp: mean of the temperature excess acquired during periods of sun-basking for each individual compared to the water measured in °C

* Vertical_switch: We extracted the maximum and minimum in water depth of each individual during each hour of the last week in the pond and calculated a delta value (max-min) in cm that was summed up for the whole period.

* groups: grouping sample according to experimental period (occasion) and body site (dorsal/ventral)

* distance_to_centroid_occasion: the distance to centroid, computed from the Euclidean distance matrix, for each individual sample within its group of samples taken either pre- or post-translocation.

* distance_to_centroid_occasion_abundant_ASVs: same as distance_to_centroid_occasion, but for abundant ASVs only

* shannon: Shannon diversity index for each sample.

* distance_to_centroid_occasion_rare: same as distance_to_centroid_occasion, but for rare ASVs only

* color_morph: which colour morph the fish belongs to

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used:

F = female, M = male
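
The descriptions of Thermal_switch and Vertical_switch amount to the same calculation: take the range (max minus min) within each hour and sum those ranges over the period. The Python sketch below shows that arithmetic on invented logger readings. The function name, the timestamps and the values are hypothetical, and restricting the input to the last week in the pond is assumed to have been done beforehand.

```python
from collections import defaultdict
from datetime import datetime


def summed_hourly_range(readings):
    """Sum the hourly (max - min) ranges of a logged variable.

    readings: iterable of (timestamp, value) pairs for one individual, where
    timestamp is a datetime and value is e.g. body temperature in degrees C
    (Thermal_switch) or water depth in cm (Vertical_switch).
    """
    by_hour = defaultdict(list)
    for timestamp, value in readings:
        # Group readings by calendar hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        by_hour[hour].append(value)
    return sum(max(values) - min(values) for values in by_hour.values())


# Toy example: two hours of made-up body temperatures.
log = [
    (datetime(2020, 7, 1, 10, 5), 18.0),
    (datetime(2020, 7, 1, 10, 35), 20.5),
    (datetime(2020, 7, 1, 11, 10), 21.0),
    (datetime(2020, 7, 1, 11, 50), 19.0),
]
thermal_switch = summed_hourly_range(log)
print(thermal_switch)  # 2.5 + 2.0 = 4.5
```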

#########################################################################

DATA-SPECIFIC INFORMATION FOR: carp_counts.tsv.zip

  1. Number of variables: 3

  2. Number of cases/rows: 42671

  3. Variable List:

* sample: sample id containing library information and sample number

* seqid: unique ASV id originating from qiime2

* count: number of times the specific ASV occurred in the specific sample

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used: None
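
Because the counts are stored in long format (one row per sample and ASV), most analyses start by pivoting them into a sample-by-ASV table. The Python sketch below shows one way to do that. It assumes the unzipped file is tab-separated with a header row naming the three variables listed above; the toy ids and the function name are invented for the example.

```python
import csv
import io
from collections import defaultdict


def long_to_wide(tsv_text):
    """Pivot a long (sample, seqid, count) table into {sample: {seqid: count}}."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    wide = defaultdict(dict)
    for row in reader:
        wide[row["sample"]][row["seqid"]] = int(row["count"])
    return dict(wide)


# Toy stand-in for the unzipped carp_counts.tsv (ids are invented).
toy_tsv = "sample\tseqid\tcount\ns1\tasvA\t10\ns1\tasvB\t3\ns2\tasvA\t7\n"
counts = long_to_wide(toy_tsv)
print(counts)
```

ASVs absent from a sample are simply missing from that sample's dictionary and can be treated as zero counts.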

################################################################

DATA-SPECIFIC INFORMATION FOR: literature_search.txt

  1. Number of variables: 5

  2. Number of cases/rows: 287

  3. Variable List:

* Year: year of publication

* records: number of publications

* body_site: whether the publication studied the skin or gut microbiome

* where: aquatic or terrestrial habitat

* groups: variables "body_site" and "where" combined

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used: None

####################################################################

DATA-SPECIFIC INFORMATION FOR: taxonomy_filtered.tsv.zip

  1. Number of variables: 9

  2. Number of cases/rows: 40175

  3. Variable List:

* seqid: unique ASV id originating from qiime2

* domain: taxonomic rank

* phylum: taxonomic rank

* class: taxonomic rank

* order: taxonomic rank

* family: taxonomic rank

* genus: taxonomic rank

* species: taxonomic rank

* Confidence: confidence level for assigned genus

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used: None

#########################################################################

LEfSe Results:
DATA-SPECIFIC INFORMATION FOR: Supplementary Data File 1.txt

  1. Number of variables: 4

  2. Number of cases/rows: 58

  3. Variable List:

* taxonomy: taxonomic annotation

* total_score: total LDA score for the specific biomarker

* occasion: group of comparison (pre or post translocation)

* enriched_score: enriched LDA score

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used: None

##########################################

DATA-SPECIFIC INFORMATION FOR: Supplementary Data File 2.txt

  1. Number of variables: 4

  2. Number of cases/rows: 58

  3. Variable List:

* taxonomy: taxonomic annotation

* LDA_score_total: total LDA score for the specific biomarker

* body_site: group of comparison (dorsal or ventral)

* score_body_site: enriched LDA score

  4. Missing data codes: None

  5. Specialized formats or other abbreviations used: None

Methods

Estimation of alpha and beta diversity

All statistical analyses were performed in RStudio v1.3.1093 (53) with R v3.6.0. We included three alpha diversity estimates. The observed number of ASVs (100% identical ‘amplicon sequence variants’) was used to illustrate the partitioning of ASVs according to sample type and environment. Statistical analysis of richness was based on estimates generated with default settings in the breakaway function (package breakaway v4.6.11). To incorporate an alpha diversity measurement that takes abundance and evenness into account, we used the Shannon-Weaver diversity index, estimated from data subsampled to the smallest sample size (3775 reads per sample) using the diversity function in the vegan package (v2.5-6).
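
To make the Shannon step concrete, here is a minimal Python sketch of subsampling a sample to a fixed depth and then computing H = -Σ p ln p. It is not the vegan code used in the analysis. The function names and toy counts are invented, the natural logarithm is used (the default of vegan's diversity function), and subsampling without replacement is an assumption, since the subsampling routine is not named above.

```python
import math
import random


def rarefy(counts, depth, rng):
    """Subsample a {asv: count} dict to `depth` reads without replacement."""
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer reads than the requested depth")
    sub = {}
    for asv in rng.sample(pool, depth):
        sub[asv] = sub.get(asv, 0) + 1
    return sub


def shannon(counts):
    """Shannon index H = -sum(p * ln p), natural logarithm."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)


rng = random.Random(1)  # fixed seed so the subsample is repeatable
sample = {"asvA": 50, "asvB": 30, "asvC": 20}
sub = rarefy(sample, 40, rng)  # the study used a depth of 3775 reads
print(sum(sub.values()), round(shannon(sub), 3))

even = shannon({"asvA": 25, "asvB": 25, "asvC": 25, "asvD": 25})
print(round(even, 4))  # ln(4), the maximum for four equally common ASVs
```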

Apart from the subsampling prior to the Shannon index, raw data were used throughout the analyses. However, to explore the contribution of rare ASVs, we conducted a filtering step for comparison with the results based on raw data. To this end, we followed the method described in Bokulich et al.: ASVs with a total count <10 within each sample and total abundance <0.01% across all samples were considered “rare”. This step decreased the total number of ASVs found in fish skin microbiomes from 16,881 to 1,883, representing 11% of the total number of ASVs. The results based on data subsets are reported in Tables S2 and S3.
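
The rare-ASV rule can be read in more than one way. The Python sketch below implements one reading of it, for illustration only: an ASV is called rare if its count is below 10 in every sample and its share of all reads is below 0.01%. The function name, parameter names and toy counts are hypothetical, and the authors' own implementation is the one in Sunbasking_carp_2022.Rmd.

```python
def split_rare(wide, min_count=10, min_fraction=0.0001):
    """Return (abundant, rare) sets of ASV ids.

    wide: {sample: {asv: count}}.
    An ASV is called rare here if its count is below `min_count` in every
    sample AND its share of all reads is below `min_fraction` (0.01%).
    This is one reading of the rule quoted above, not the authors' code.
    """
    totals, max_in_sample = {}, {}
    for sample_counts in wide.values():
        for asv, n in sample_counts.items():
            totals[asv] = totals.get(asv, 0) + n
            max_in_sample[asv] = max(max_in_sample.get(asv, 0), n)
    grand_total = sum(totals.values())
    rare = {asv for asv in totals
            if max_in_sample[asv] < min_count
            and totals[asv] / grand_total < min_fraction}
    return set(totals) - rare, rare


# Toy counts: asvB never reaches 10 reads and is a tiny share of all reads.
toy = {
    "s1": {"asvA": 60000, "asvB": 3, "asvC": 12},
    "s2": {"asvA": 40000, "asvB": 2},
}
abundant, rare = split_rare(toy)
print(sorted(abundant), sorted(rare))
```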

For beta diversity, the data were transformed using the centred log-ratio (clr), allowing them to be used as input for linear regressions. Distances to the group centroid for samples from each of the two environments were estimated from a Euclidean distance matrix on clr values, using the function betadisper (type = centroid) in the vegan package.
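
As a simplified sketch of these two steps (not the R workflow itself), the Python code below computes clr values for each sample and then the Euclidean distance from each sample to the mean vector of its group. For Euclidean distances this should correspond to the distance to the group centroid. Zeros are handled here with a fixed pseudocount, which is an assumption of the sketch and not necessarily the zero handling used in the analysis; the function names and counts are invented.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centred log-ratio of one sample: log(x) minus the mean of log(x).

    The pseudocount replaces zeros and is an assumption of this sketch.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def distances_to_centroid(vectors):
    """Euclidean distance from each vector to the group mean vector."""
    n, dims = len(vectors), len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / n for i in range(dims)]
    return [math.dist(v, centroid) for v in vectors]


# Toy group of three samples over the same four ASVs (counts are invented).
group = [[10, 0, 5, 85], [20, 3, 7, 70], [15, 1, 4, 80]]
clr_values = [clr(sample) for sample in group]
dists = distances_to_centroid(clr_values)
print([round(d, 3) for d in dists])
```

Each clr vector sums to zero by construction, which is a quick sanity check when applying the transformation to real data.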

Funding

Swedish Research Council for Environment Agricultural Sciences and Spatial Planning, Award: 2017-00346

Swedish Research Council for Environment Agricultural Sciences and Spatial Planning, Award: 2018-00605