Data from: A zebrafish model to elucidate the impact of host genes on the microbiota
Data files
Jan 24, 2024 version files 6.01 MB
- ASV_table_dada2.tsv
- ASV_table_decontam.tsv
- ASV_taxonomy_table_dada2.tsv
- ASV_taxonomy_table_decontam.tsv
- ASVs.fa
- LULU_curated_ASV_table.txt
- LULU_curated_taxonomy_table.csv
- metadata_complete.csv
- README.md
Abstract
Every host species and organism provides a unique environmental niche, contributing to the overall diversity of microbial ecosystems, from the intestine of an animal to the oceans and forests of our planet. The study of host-microbiota interactions has long focused on the well-established effects the microbiota has on its host. In contrast, little attention has been given to the role of the host in these intricate interactions. However, understanding the role of the host may well be essential to understanding the complexity of the relationship between the host and its microbiota. In this study, we present a model for elucidating the effects of host genes on the microbiota and how such genetic effects may shape host-associated microbiota. We demonstrate a hologenomic approach implementing the CRISPR/Cas system in the zebrafish model to combine the effects of a host gene with 16S metabarcoding and metabolomics data. We show that knocking out the gene coding for the rate-limiting enzyme in melanogenesis, tyrosinase (tyr), correlates with changes in the intestinal microbiota of zebrafish and with differences in the abundance of specific metabolites, illustrating the value of our model for studying the impact of host genes on the composition and function of the intestinal microbiota.
README: A zebrafish model to elucidate the impact of host genes on the microbiota
https://doi.org/10.5061/dryad.s1rn8pkfv
The dataset comprises metadata, files, and tables related to 16S V3-V4 sequencing and processing of zebrafish-related samples, tank water samples, and controls. The fish were derived from a CRISPR gene-editing protocol for knocking out a pigmentation gene (tyr).
Description of the data and file structure
The data is composed of ASV tables and taxonomy tables generated from processing 16S V3-V4 amplicons and associated metadata.
- Metadata
- ASV table and taxonomy table generated from dada2 including a fasta file of the ASVs (https://benjjneb.github.io/dada2/index.html)
- ASV table and taxonomy table generated by decontaminating the dada2 output with the R package decontam (https://benjjneb.github.io/decontam/vignettes/decontam_intro.html)
- ASV table and taxonomy table generated by curating the decontaminated ASVs with the R package LULU (https://github.com/tobiasgf/lulu)
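The three processing stages listed above (dada2, then decontam, then LULU) should each retain a subset of the ASVs from the stage before. A minimal sketch of how a user might check this is given below. It assumes that the first field of every data row is the ASV name and that the first line is a header; the function names are hypothetical and are not part of the authors' code.

```python
def asv_names(table_text, sep="\t"):
    """Return the set of row names (first field) from a delimited table.

    Assumes the first non-empty line is a header and is skipped.
    """
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    return {ln.split(sep)[0].strip('"') for ln in lines[1:]}


def removed_between(earlier, later):
    """ASVs present in the earlier stage but absent from the later one."""
    return earlier - later


def is_nested(stages):
    """True if every stage's ASV set is a subset of the stage before it."""
    return all(later <= earlier for earlier, later in zip(stages, stages[1:]))


# Hypothetical usage with the files in this dataset (the separator of the
# LULU .txt file is an assumption and may need adjusting):
#   dada2 = asv_names(open("ASV_table_dada2.tsv").read())
#   decontam = asv_names(open("ASV_table_decontam.tsv").read())
#   lulu = asv_names(open("LULU_curated_ASV_table.txt").read())
#   print(is_nested([dada2, decontam, lulu]))
```

If the row counts given in this README equal the number of ASVs per table, one would expect 4125 - 3892 = 233 ASVs to be dropped by decontam and 3892 - 812 = 3080 to be merged or dropped by LULU.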
File list:
metadata_complete.csv
ASV_table_dada2.tsv
ASV_taxonomy_table_dada2.tsv
ASVs.fa
ASV_taxonomy_table_decontam.tsv
ASV_table_decontam.tsv
LULU_curated_ASV_table.txt
LULU_curated_taxonomy_table.csv
Data specific information for: metadata_complete.csv
- Number of variables: 17
- Number of samples/cases/rows: 182
- Variable list:
  * Sample_ID: Name of the forward fastq file
  * demultiplexing_ID: Name used for bioinformatic pre-processing of samples
  * lab_sample_name: Sample name used at collection of samples
  * Thesis_name: Shortened name for controls, for better readability in output graphs. "NA" applies to samples not assigned a shortened name
  * Sample_type: The type of sample collected. A single "NA" denotes an unspecified sample
  * Fish_type: Denotes the pigmentation phenotype. "NA" denotes samples where assigning a pigmentation phenotype was not applicable
  * Qubit: DNA concentration (ng/µl) after DNA extraction, estimated with a Qubit fluorometer
  * Sample_or_control: Denotes whether the sample is a true sample or a control sample
  * experi: The samples relevant here are denoted "Tyr". Samples denoted "IRF8" were discarded during analyses and are included only for transparency. "NA" denotes control samples
  * Eviro: Denotes whether or not a sample is an environmental sample
  * Batch: Denotes which sequencing batch the sample belongs to
  * Tank_s: The number/name of the tank where the sample was acquired. "NA" denotes samples not acquired from a tank
  * Sex_approx: Approximation of fish sex. "NA" denotes samples that are not fish, or fish samples where the fish was not sexed or sexing was not feasible
  * striped_spotted: Whether the zebrafish had a striped or spotted phenotype. "NA" denotes zebrafish not phenotyped as striped or spotted, or samples not acquired from a fish
  * Blank_Type: Denotes the type of control sample. "NA" denotes samples that are not control samples
  * Water condition: Describes the relative amount of fecal matter in the tank water at the time of sampling. "NA" denotes samples not derived from a tank
  * Individ: Sample names given to some individual zebrafish at the time of sampling. "NA" denotes samples not assigned an individual name or samples not involving an individual fish
- Missing data codes: Denoted under each variable above where applicable
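Because "NA" is used as the missing-data code throughout the metadata, it helps to convert it to a proper null value on load. The sketch below uses only the Python standard library; the column name `experi` and the value "Tyr" come from the variable list above, while the function names and the choice to also treat empty cells as missing are assumptions.

```python
import csv
import io


def read_metadata(handle, na_values=("NA", "")):
    """Read the metadata CSV into a list of dicts.

    Cells equal to one of `na_values` become None. Treating empty cells
    as missing is an assumption, not something the README states.
    """
    rows = []
    for row in csv.DictReader(handle):
        rows.append({k: (None if v in na_values else v) for k, v in row.items()})
    return rows


def tyr_samples(rows):
    """Keep rows whose `experi` value is "Tyr".

    This drops the "IRF8" rows and the controls (whose `experi` is NA).
    """
    return [r for r in rows if r.get("experi") == "Tyr"]


# Hypothetical usage:
#   with open("metadata_complete.csv", newline="") as fh:
#       tyr = tyr_samples(read_metadata(fh))
```

Note that filtering on `experi` alone removes the control samples as well, so a user who needs the controls (for example, to rerun decontam) would select them separately.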
Data specific information for: ASV_table_dada2.tsv
- Number of variables: 182
- Number of samples/cases/rows: 4125
- Data specifications:
  * Row names are ASV names
  * Column names are sample IDs
  * Values in cells denote ASV counts
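As an illustration of the layout described above (ASVs as rows, samples as columns, counts in the cells), the following sketch reads such a table and sums the reads per sample. It is a hedged example, not the authors' code: the function names are hypothetical, and the handling of a header that is one field shorter than the data rows is an assumption about how the table may have been written from R.

```python
import csv
import io


def read_count_table(handle, sep="\t"):
    """Read an ASV-by-sample count table into {asv: {sample: count}}."""
    rows = [r for r in csv.reader(handle, delimiter=sep) if r]
    header, body = rows[0], rows[1:]
    # Assumption: R's write.table may omit the top-left corner cell,
    # which leaves the header one field shorter than the data rows.
    if body and len(header) == len(body[0]) - 1:
        samples = header
    else:
        samples = header[1:]
    return {
        r[0]: {s: int(float(v)) for s, v in zip(samples, r[1:])}
        for r in body
    }


def sample_depths(table):
    """Total read count per sample, summed over all ASVs."""
    depths = {}
    for counts in table.values():
        for sample, n in counts.items():
            depths[sample] = depths.get(sample, 0) + n
    return depths


# Hypothetical usage:
#   with open("ASV_table_dada2.tsv", newline="") as fh:
#       depths = sample_depths(read_count_table(fh))
```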
Data specific information for: ASV_taxonomy_table_dada2.tsv
- Number of variables: 6
- Number of samples/cases/rows: 4125
- Data specifications:
  * Row names are ASV names
  * Column names are taxonomic ranks
  * Values in cells denote the assigned taxonomic affiliation of each ASV
- Missing data code: "NA" - unable to assign taxonomy to the ASV at that taxonomic rank
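Since "NA" marks ranks at which no taxonomy could be assigned, a common need is to find the most specific rank that was assigned for each ASV. The sketch below shows one way to do this. The six rank names are an assumption (the README only states that there are six rank columns), as is the convention that every rank below the first "NA" is also unassigned.

```python
# Assumed rank names and order; check them against the table header.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus"]


def deepest_assignment(lineage, ranks=RANKS, na="NA"):
    """Return (rank, taxon) for the most specific assigned rank.

    `lineage` is the list of cell values for one ASV, ordered from the
    broadest rank to the most specific. Scanning stops at the first
    missing value. Returns None if nothing is assigned.
    """
    best = None
    for rank, taxon in zip(ranks, lineage):
        if taxon == na or taxon == "":
            break
        best = (rank, taxon)
    return best


# Hypothetical usage for one row of the taxonomy table:
#   deepest_assignment(["Bacteria", "Proteobacteria", "NA", "NA", "NA", "NA"])
```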
Data specific information for: ASVs.fa
FASTA file of all ASVs generated.
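For users who want to link ASV names in the tables to their sequences, a minimal FASTA reader is sketched below. It assumes that each header line starts with ">" followed by the ASV name and that sequences may span several lines; the function name is hypothetical.

```python
def read_fasta(handle):
    """Parse FASTA text into {name: sequence}.

    The name is the first whitespace-delimited token after ">".
    """
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {k: "".join(parts) for k, parts in records.items()}


# Hypothetical usage:
#   with open("ASVs.fa") as fh:
#       seqs = read_fasta(fh)
#   print(len(seqs))  # number of ASV sequences
```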
Data specific information for: ASV_taxonomy_table_decontam.tsv
- Number of variables: 6
- Number of samples/cases/rows: 3892
- Data specifications:
  * Row names are ASV names
  * Column names are taxonomic ranks
  * Values in cells denote the assigned taxonomic affiliation of each ASV
- Missing data code: "NA" - unable to assign taxonomy to the ASV at that taxonomic rank
Data specific information for: ASV_table_decontam.tsv
- Number of variables: 183
- Number of samples/cases/rows: 3892
- Data specifications:
  * Row names are ASV names
  * Column names are sample IDs
  * Values in cells denote ASV counts
Data specific information for: LULU_curated_ASV_table.txt
- Number of variables: 182
- Number of samples/cases/rows: 812
- Data specifications:
  * Row names are ASV names
  * Column names are sample IDs
  * Values in cells denote ASV counts
Data specific information for: LULU_curated_taxonomy_table.csv
- Number of variables: 6
- Number of samples/cases/rows: 812
- Data specifications:
  * Row names are ASV names
  * Column names are taxonomic ranks
  * Values in cells denote the assigned taxonomic affiliation of each ASV
- Missing data code: "NA" - unable to assign taxonomy to the ASV at that taxonomic rank
The data are structured so that users can access the tables from every filtering and curation step and can apply their own filtering and curation choices if they wish.
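As one example of such a user-defined choice, the sketch below applies a simple abundance and prevalence filter to a count table held as a nested dictionary. The thresholds and the function name are purely illustrative assumptions; they are not the criteria used in the study.

```python
def filter_asvs(table, min_total=10, min_prevalence=2):
    """Keep ASVs with enough total reads and enough non-zero samples.

    `table` maps ASV name -> {sample ID: count}. The default thresholds
    are arbitrary illustrations, not values taken from the study.
    """
    kept = {}
    for asv, counts in table.items():
        total = sum(counts.values())
        prevalence = sum(1 for n in counts.values() if n > 0)
        if total >= min_total and prevalence >= min_prevalence:
            kept[asv] = counts
    return kept


# Hypothetical usage on a small in-memory table:
#   table = {"ASV_1": {"S1": 10, "S2": 5}, "ASV_2": {"S1": 20, "S2": 0}}
#   filtered = filter_asvs(table)
```

Starting from the dada2 or decontam tables rather than the LULU-curated table lets the user decide which curation steps to apply before a filter of this kind.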
Sharing/Access information
Raw reads are available at the European Nucleotide Archive (ENA):
Code/Software
The code used for the study is publicly available on GitHub: https://github.com/eirikurandri/ZEBRAFISH-AND-CRISPR-CAS-A-MODEL-TO-ELUCIDATE-HOST-GENETIC-EFFECTS-ON-THE-MICROBIOTA/tree/main