Alternative proteins remodel the microbiome and support growth and development in zebrafish
Data files
Aug 01, 2025 version files (621.22 KB total)
- Amino_Acid_Data.csv (5.14 KB)
- Biological_Replicate_Data.csv (8.30 KB)
- Microbiome_Data.zip (117.03 KB)
- Muscle_Cell_Area_Data.csv (279.29 KB)
- Nuclei_Data.csv (881 B)
- README.md (10.23 KB)
- Standard_Length_Data.csv (599 B)
- Survivorship_Data.csv (5.70 KB)
- Technical_Replicate_Data.csv (187.60 KB)
- Total_Length_Data.csv (6.44 KB)
Abstract
Protein intake is indispensable to growing animals. As populations and resource demands expand, animal protein-based diets cannot be sustained. In order to validate a move completely away from these diets, we fed a model vertebrate alternative proteins during development. Diets were based on pea, milk and whey, as well as fishmeal (control). All diets supported growth to some degree, with the exception of those high in milk and whey proteins. The most promising diets were associated with the upregulation of genes associated with insulin sensitivity and fat storage. The microbiome reflected the dietary changes, with a shift occurring from Fusobacteriota to Proteobacteria. Correlations were also noted with health, as Cetobacterium positively affected development, while Aeromonas did the opposite. Our findings may help direct the selection of specific animal protein-free diets that can be introduced early into development to maximize vertebrate health.
Dataset DOI: 10.5061/dryad.v15dv427m
Description of the data and file structure
This file set contains the data required to replicate the analyses in "alternative proteins support somatic and muscular development while remodeling the microbiome in zebrafish". It includes survivorship, growth measurements, gene expression values, and gut microbiota data from zebrafish fed alternative protein diets, including pea-, milk-, and wheat-based formulations, over a ~6-month period.
Files and variables
File: Amino_Acid_Data.csv
Description: Amino acid profiles of the dietary treatments compared to recommended amounts.
Variables
- Diet: Diet treatment group (FM, Fishmeal; P1, Pea-1; P2, Pea-2; M, Milk; M/P2, Milk/Pea-2; W/P2, Wheat/Pea-2).
- Percent: Proportion of the specific amino acid within the total protein content of the diet (expressed as %).
- AminoAcid: Standard abbreviation for the amino acid.
File: Biological_Replicate_Data.csv
Description: qPCR gene expression results from biological replicates of zebrafish fed alternative protein diets. Each row represents the mean expression data for a target gene in a specific diet group (biological group), based on pooled technical replicates. Expression levels were normalized to the endogenous control gene E1a (eef1a1l1). Note that ΔΔCt = 0 and RQ = 1 for the FM (reference) group by definition, and missing RQ values indicate control genes or normalization baselines. Note that this file includes intentional empty cells created by the software.
Variables
- BiogroupName: Name of the dietary treatment group (FM, Fishmeal; P1, Pea-1; P2, Pea-2; M, Milk; M/P2, Milk/Pea-2; W/P2, Wheat/Pea-2).
- Target: Name of the gene being measured.
- Omitted: Indicates whether any technical replicates were omitted due to QC issues (True/False/mixed).
- TechReplicates: Number of technical replicates used in the calculation.
- RQ: Relative Quantification (fold change compared to FM control).
- RQMin/RQMax: Lower and upper confidence bounds for RQ.
- CTMean: Mean cycle threshold (Ct) value across technical replicates.
- ΔCTMean: Difference in Ct between the target gene and endogenous control.
- ΔCTSE: Standard error of the ΔCt.
- ΔΔCT: Normalized expression difference compared to FM control.
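The relationship between the ΔCTMean, ΔΔCT, and RQ columns can be sketched as follows. This is a minimal illustration that assumes the standard 2^(-ΔΔCt) method with FM as the reference group; the function name and the ΔCt values are hypothetical and are not taken from the dataset.

```python
def relative_quantification(delta_ct_group, delta_ct_reference):
    """Return (ddct, rq) for one target gene.

    Assumes the standard 2^(-ddCt) method: ddCt is the group's dCt minus
    the reference (FM) dCt, and RQ is 2 raised to the negative ddCt.
    """
    ddct = delta_ct_group - delta_ct_reference
    rq = 2.0 ** (-ddct)
    return ddct, rq


# Hypothetical dCt means, for illustration only.
ddct_fm, rq_fm = relative_quantification(5.0, 5.0)  # reference vs itself
ddct_p1, rq_p1 = relative_quantification(4.0, 5.0)  # one cycle earlier than FM
```

Under this assumption the reference group always yields ΔΔCt = 0 and RQ = 1, which matches the note in the file description.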
File: Microbiome_Data.zip
Description: This archive contains 16S rRNA sequencing data from the gut microbiome of zebrafish fed alternative protein diets. The data include alpha and beta diversity metrics, principal coordinate analyses (PCoA), and taxonomic summaries across multiple levels. The study compares microbiome outcomes across six diet treatment groups (FM, Fishmeal; P1, Pea-1; P2, Pea-2; M, Milk; M/P2, Milk/Pea-2; W/P2, Wheat/Pea-2).
File Contents:
1) alpha_shannon.csv
Description: Shannon diversity index (alpha diversity) per sample. It summarizes microbial richness and evenness for each sample across groups.
Structure:
Row: individual zebrafish sample
Columns:
- Group: Diet treatment group.
- Samples: Unique sample ID (e.g., S00OL_XXXX).
- Variable: Always "Shannon" (alpha diversity metric used).
- Value: Calculated Shannon diversity index for that sample.
- se: Standard error (not available in this dataset).
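For readers who want to see how a Shannon value of the kind stored in the Value column is obtained, here is a minimal sketch. It assumes the natural logarithm and per-sample OTU counts as input; the log base and any rarefaction step used by the original pipeline are not stated in this README, so recomputed values may differ from those in the file.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    h = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            h -= p * math.log(p)
    return h


# Hypothetical counts: four equally abundant OTUs, and a single-OTU sample.
even_sample = shannon_index([10, 10, 10, 10])
single_taxon = shannon_index([25, 0, 0])
```

A perfectly even community of n taxa gives ln(n), and a sample dominated by a single taxon gives 0.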
2) beta_pcoa_score.csv
Description: Principal coordinate analysis (PCoA) scores from beta diversity analysis. Each row corresponds to one sample.
Structure:
Row: individual zebrafish sample.
Columns:
- Axis.1: First principal coordinate (PCoA axis 1).
- Axis.2: Second principal coordinate (PCoA axis 2).
- MBI.Sample.ID: Internal microbiome ID used in sequencing analysis.
- Your.Sample.ID: Experimental sample ID.
- Sampling.Date: Date when sample was collected.
- Sample.Type: Tissue type (viscera).
- Group: Diet treatment group.
- Reads: Total sequencing read count.
- Shannon: Shannon index value (alpha diversity).
3) CLR_input.data.csv
Description: Centered log-ratio (CLR) transformed OTU abundance data used for multivariate analyses.
Structure:
- Row: individual zebrafish sample
- First column A (unnamed): Sample IDs (e.g., S00OL_XXXX).
- Columns B onwards: CLR transformed abundances of individual OTUs (e.g., Otu0002, Otu0003,...). Each column represents one OTU feature.
- Values: Log-ratio transformed abundance values for each OTU, standardized across all samples.
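The centered log-ratio transform itself can be sketched as below. This is an illustration only: the pseudocount used to handle zero counts is an assumption, since the README does not say how zeros were treated before the transform.

```python
import math


def clr(values, pseudocount=0.5):
    """Centered log-ratio: log of each value minus the mean log of the sample.

    The pseudocount (an assumed value) avoids log(0) for absent OTUs.
    """
    logs = [math.log(v + pseudocount) for v in values]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


# Hypothetical OTU counts for one sample.
transformed = clr([120, 30, 0, 5])
```

CLR values within one sample sum to zero, which is a quick sanity check when working with CLR_input.data.csv row by row.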
4) GroupM_OTU_raw.result.csv, GroupM_P2_OTU_raw.result.csv, GroupP1_raw.result.csv, GroupP2_OTU_raw.result.csv, GroupW_P2_OTU_raw.result.csv.
Description: Differential abundance results for individual OTUs in the diet groups, generated from statistical analysis using DESeq2 comparing OTU-level microbiome profiles across treatment groups.
Structure:
Row: one OTU (e.g., Otu0001, Otu0002, etc.), with statistical test results evaluating whether that OTU is significantly differentially abundant in the given diet group (GroupX) relative to others.
Columns:
- GroupX.baseMean: Mean normalized abundance of the OTU across all samples.
- GroupX.log2FoldChange: Log2 fold change in abundance between GroupX and reference group.
- GroupX.lfcSE: Standard error of the log2 fold change.
- GroupX.stat: Test statistic.
- GroupX.pvalue: Raw p-value for the test.
- GroupX.padj: Adjusted p-value.
- GroupX.reject: Significance flag (TRUE is significant at FDR threshold).
- GroupX.df: Degrees of freedom used.
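The step from GroupX.pvalue to GroupX.padj can be illustrated with a Benjamini-Hochberg adjustment. This is a sketch under an assumption: Benjamini-Hochberg is the default adjustment in DESeq2, but the README does not state which method or FDR threshold was used, and DESeq2's independent filtering is not reproduced here.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four OTUs.
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
alpha = 0.05  # assumed FDR threshold
reject = [p <= alpha for p in padj]
```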
5) proportions_Class.csv, proportions_Family.csv, proportions_Genus.csv, proportions_Order.csv, proportions_Phylum.csv
Description: These files contain the taxonomic composition of the zebrafish gut microbiome at the Phylum, Class, Order, Family, and Genus levels, expressed as proportional abundances.
Structure:
Rows: Individual zebrafish gut samples
Columns: Bacterial taxonomy.
Cell values: Proportion of reads assigned to each bacterial taxon in each sample (range 0 to 1).
6) relative_abundance_OTUs_input.data.csv
Description: This file contains the relative abundance data of OTUs for each zebrafish gut sample. The abundance values are normalized to proportions (range from 0 to 1), reflecting the fraction of total reads assigned to each OTU per sample.
Structure:
Rows: individual zebrafish samples.
Columns: OTUs.
Cell values: Proportional abundance of each OTU in each sample.
File: Muscle_Cell_Area_Data.csv
Description: This dataset contains individual skeletal muscle fiber cross-sectional area measurements from adult zebrafish fed various protein diets for 145 days.
Variables
- Tank: Numeric identifier for each experimental tank.
- Diet: Diet treatment group (FM, Fishmeal; P1, Pea-1; P2, Pea-2; M, Milk; M/P2, Milk/Pea-2; W/P2, Wheat/Pea-2).
- Fish: Identifier for each individual fish sampled.
- Sex: Biological sex of each fish (F = female, M = male).
- Cell #: Index number for each measured muscle fiber from a muscle section.
- Area: Cross-sectional area of the muscle cells (µm²).
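Since this file stores one row per muscle fiber, a common first step is to summarize fibers per fish before comparing diets. The sketch below assumes the CSV header matches the variable names listed above and that fish are uniquely identified by the combination of Diet, Tank, and Fish; the rows shown are invented for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_area_per_fish(handle):
    """Average fiber area (µm²) for each (Diet, Tank, Fish) combination."""
    areas = defaultdict(list)
    for record in csv.DictReader(handle):
        key = (record["Diet"], record["Tank"], record["Fish"])
        areas[key].append(float(record["Area"]))
    return {key: mean(values) for key, values in areas.items()}


# Hypothetical rows standing in for Muscle_Cell_Area_Data.csv.
sample = io.StringIO(
    "Tank,Diet,Fish,Sex,Cell #,Area\n"
    "1,FM,1,F,1,100.0\n"
    "1,FM,1,F,2,200.0\n"
    "2,P1,1,M,1,50.0\n"
)
per_fish = mean_area_per_fish(sample)
```

Averaging per fish first keeps individual fibers from being treated as independent observations in downstream comparisons.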
File: Nuclei_Data.csv
Description: Nuclei per muscle section in adult zebrafish fed alternative proteins for 145 days.
Variables
- FM: Fishmeal
- P1: Pea-1
- P2: Pea-2
- M: Milk
- M/P2: Milk/Pea-2
- W/P2: Wheat/Pea-2
File: Standard_Length_Data.csv
Description: Standard body length (mm) of zebrafish following 145 days of dietary treatment on alternative protein diets.
Variables
- Tank: Numeric identifier for each experimental tank.
- Diet: Diet treatment group (FM, Fishmeal; P1, Pea-1; P2, Pea-2; M, Milk; M/P2, Milk/Pea-2; W/P2, Wheat/Pea-2).
- Length: Mean standard body length of fish in each tank (mm).
File: Survivorship_Data.csv
Description: Survivorship was recorded at multiple timepoints to evaluate how different dietary treatments influence mortality across development.
Variables
- Days: Number of days since the start of dietary treatment (day 0 marks the beginning of alternative diet feeding).
- Treatment: Diet treatment group (FM, Fishmeal; P1, Pea-1; P2, Pea-2; M, Milk; M/P2, Milk/Pea-2; W/P2, Wheat/Pea-2).
- Survive: Percent survival of fish in each treatment group at the indicated time point.
File: Total_Length_Data.csv
Description: Total body length (mm) of zebrafish fed alternative protein diets.
Variables
- Tank: Numeric identifier for each experimental tank.
- Diet: Diet treatment group (FM, Fishmeal; P1, Pea-1; P2, Pea-2; M, Milk; M/P2, Milk/Pea-2; W/P2, Wheat/Pea-2).
- Day: Number of days on the diet at the time of measurement (alternative diet feeding began at 30 dpf).
- Length: Mean total body length of fish in each tank (mm).
File: Technical_Replicate_Data.csv
Description: qPCR gene expression results from zebrafish fed alternative protein diets. Data were generated using SYBR green chemistry. Target genes were normalized against the endogenous control gene E1a (eef1a1l1). Reference group is FM (fishmeal diet). Each row represents a technical replicate for a specific gene in a fish from a given diet group. Data are grouped by sample name and target gene, and include relative quantification (RQ) values, as well as error and Ct metrics. Note that each sample is derived from one individual fish, and that this file includes intentional empty cells created by the software.
Variables
- Sample Name: Diet identifier for each biological replicate (FM, Fishmeal; P1, Pea-1; P2, Pea-2; M, Milk; M/P2, Milk/Pea-2; W/P2, Wheat/Pea-2).
- Target Name: Gene measured by qPCR.
- Omitted: Indicates whether the replicate was omitted (true/false).
- RQ: Relative quantification value (fold change vs reference group FM).
- RQ Min/RQ Max: Confidence interval bounds for RQ.
- Ct Mean: Average cycle threshold (Ct) value from technical replicates.
- ΔCt Mean: Difference between Ct of target and endogenous control.
- ΔCt SE: Standard error of ΔCt.
- ΔΔCt: Normalized expression difference between treatment and reference group.
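Because the qPCR files contain intentional empty cells, numeric columns need care when read outside a spreadsheet. The sketch below reads such a table with Python's standard `csv` module and converts empty cells to `None`. It assumes the CSV header uses the variable names listed above; the gene name `geneA` and all values are hypothetical.

```python
import csv
import io


def read_rows(handle, numeric_columns):
    """Read CSV rows, converting listed columns to float or None if empty."""
    rows = []
    for record in csv.DictReader(handle):
        for column in numeric_columns:
            raw = (record.get(column) or "").strip()
            record[column] = float(raw) if raw else None
        rows.append(record)
    return rows


# Hypothetical rows standing in for Technical_Replicate_Data.csv.
sample = io.StringIO(
    "Sample Name,Target Name,Omitted,RQ,Ct Mean\n"
    "FM,geneA,false,1.0,21.5\n"
    "FM,E1a,false,,18.2\n"
)
rows = read_rows(sample, ["RQ", "Ct Mean"])
missing_rq = [r["Target Name"] for r in rows if r["RQ"] is None]
```

Keeping empty cells as `None` rather than 0 avoids treating a missing RQ, such as one for the endogenous control, as a measured value.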
Code/software
Microsoft Excel or other spreadsheet software
Raw microbiome data (.fastq) can be analyzed using QIIME2, DADA2, or other microbiome analysis pipelines.