Gene expression comparisons between captive and wild shrew brains reveal captivity effects
Data files
Oct 12, 2023 version files 13.11 MB
- Cortex_counts.txt (2.37 MB)
- Cortex_results.csv (2 MB)
- Cortex_significant_list.txt (30.51 KB)
- Downregulated_intersection.txt (3.51 KB)
- GO_pathways_down.txt (8.18 KB)
- GO_pathways_up.txt (1.87 KB)
- Hippocampus_counts.txt (2.33 MB)
- Hippocampus_results.csv (1.98 MB)
- Hippocampus_significant_list.txt (17.34 KB)
- NADHgenes.xlsx (17 KB)
- Olf_bulb__significant_list.txt (19.11 KB)
- Olf_bulb_counts.txt (2.35 MB)
- Olf_bulb_results.csv (1.98 MB)
- README.md (6.22 KB)
- Samples_cortex.txt (196 B)
- Samples_hippoc.txt (236 B)
- Samples_olf_bulb.txt (204 B)
- Upregulated_intersection.txt (1.51 KB)
Feb 29, 2024 version files 7.35 MB
- Cortex_captivity.R (18.98 KB)
- Cortex_counts.txt (2.37 MB)
- Cortex_pathways_down.txt (8.06 KB)
- Cortex_pathways_up.txt (1.51 KB)
- Cortex_results.csv (80.02 KB)
- Cortex_significant_gene_list.txt (5.67 KB)
- Downregulated_genes_cortex.txt (2.44 KB)
- Downregulated_genes_hippoc.txt (1.39 KB)
- Downregulated_genes_olf_bulb.txt (938 B)
- Hippocampus_captivity.R (23.56 KB)
- Hippocampus_counts.txt (2.33 MB)
- Hippocampus_pathways_down.txt (5.62 KB)
- Hippocampus_pathways_up.txt (1.40 KB)
- Hippocampus_results.csv (41.12 KB)
- Hippocampus_significant_gene_list.txt (2.96 KB)
- NADH_genes_2024.xlsx (24.66 KB)
- Olf_bulb_captivity.R (18.88 KB)
- Olf_bulb_counts.txt (2.35 MB)
- Olf_bulb_pathways_down.txt (4.41 KB)
- Olf_bulb_pathways_up.txt (1.29 KB)
- Olf_bulb_results.csv (36.78 KB)
- Olfactory_bulb_significant_gene_list.txt (2.73 KB)
- README_2024.md (6.44 KB)
- README.md (6.77 KB)
- samples_cortex.txt (285 B)
- samples_hippoc.txt (325 B)
- samples_olf_bulb.txt (295 B)
- Upregulated_genes_cortex.txt (1.42 KB)
- Upregulated_genes_hippoc.txt (508 B)
- Upregulated_genes_olf_bulb.txt (676 B)
Jun 07, 2024 version files 11.40 MB
- 2024_2Cortex_captivity.R (19.32 KB)
- 2024_2Hippocampus_captivity.R (25.84 KB)
- 2024_2Olfbulb_captivity.R (19.78 KB)
- Captivity_BioinfoTable_Full.xlsx (14.32 KB)
- Cortex_counts.txt (2.37 MB)
- Cortex_pathways_down.txt (5.85 KB)
- Cortex_pathways_up.txt (944 B)
- Cortex_results_all.csv (1.61 MB)
- Cortex_significant_gene_list.txt (3.78 KB)
- Downregulated_genes_cortex.txt (1.59 KB)
- Downregulated_genes_hippoc.txt (731 B)
- Downregulated_genes_olf_bulb.txt (627 B)
- Hippoc_results_all.csv (1.56 MB)
- Hippocampus_counts.txt (1.80 MB)
- Hippocampus_pathways_down.txt (4.21 KB)
- Hippocampus_pathways_up.txt (1.44 KB)
- Hippocampus_significant_gene_list.txt (1.63 KB)
- NADH_genes_2024.xlsx (19.94 KB)
- Olf_bulb_counts.txt (2.35 MB)
- Olf_bulb_pathways_down.txt (3.54 KB)
- Olf_bulb_pathways_up.txt (1.56 KB)
- Olf_bulb_results_all.csv (1.58 MB)
- Olfactory_bulb_significant_gene_list.txt (2.63 KB)
- README.md (7.35 KB)
- samples_cortex.txt (285 B)
- samples_hippoc.txt (325 B)
- samples_olf_bulb.txt (295 B)
- Upregulated_genes_cortex.txt (905 B)
- Upregulated_genes_hippoc.txt (239 B)
- Upregulated_genes_olf_bulb.txt (1.09 KB)
Abstract
Compared to their free-ranging counterparts, wild animals in captivity experience different conditions with lasting physiological and behavioral effects. Although shifts in gene expression are expected to occur upstream of these phenotypes, we found no previous gene expression comparisons of captive vs. free-ranging mammals. We assessed gene expression profiles of three brain regions (cortex, olfactory bulb, and hippocampus) of wild shrews (Sorex araneus) compared to shrews kept in captivity for two months and undertook sample drop-out to examine robustness given limited sample sizes. Consistent with captivity effects, we found hundreds of differentially expressed genes in all three brain regions, 104 overlapping across all three, that enriched pathways associated with neurodegenerative disease, oxidative phosphorylation, and genes encoding ribosomal proteins. In the shrew, transcriptomic changes detected under captivity resemble responses in several human pathologies, including major depressive disorder and neurodegeneration. While interpretations of individual genes are tempered by small sample sizes, we propose captivity influences brain gene expression and function and can confound analyses of natural processes in wild individuals under captive conditions.
README: Large captivity effect based on gene expression comparisons between captive and wild shrew brains
https://doi.org/10.5061/dryad.qz612jmng
Scripts and data to reproduce the RNA-seq analysis of expression changes due to captivity in shrews, specifically in the cortex, hippocampus, and olfactory bulb. Analyzing these regions separately helps to identify region-specific changes in the brain, as well as effects on the whole brain.
Description of the data and file structure
First, we assess RNA-seq data quality, filtering low-quality reads and trimming adapters with fastp. RNA-seq sample lists (samples_hippoc.txt, samples_olf_bulb.txt, samples_cortex.txt) are described below.
- Rows = Sample IDs. Note: Samples named HFS* correspond to wild shrews, and those named Eto* correspond to captive shrews, as shown in the file.
- Columns = Organ, experimental group (wild or captive), sex, toothrow length (cm)
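As a quick sanity check of the group labels, the naming convention above can be used to count wild and captive samples in a sample list. This is a minimal sketch, not part of the deposited scripts: the function name `count_groups` is hypothetical, and it assumes the sample ID is the first whitespace-delimited field of each row (optionally quoted).

```shell
# Count wild (HFS*) and captive (Eto*) samples in a sample list.
# Assumption: the sample ID is the first whitespace-delimited field on each row.
count_groups() {
  awk '
    {
      id = $1
      gsub(/"/, "", id)            # drop quotes if the IDs were written quoted
      if (id ~ /^HFS/) wild++      # HFS* = wild shrews
      if (id ~ /^Eto/) captive++   # Eto* = captive shrews
    }
    END { printf "wild=%d captive=%d\n", wild, captive }
  ' "$1"
}
```

For example, `count_groups samples_cortex.txt` would print one line of the form `wild=N captive=M`, which can be compared against the experimental group column.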
Then, we map the filtered reads to the reference transcriptome and quantify transcript abundance with kallisto. Quality metrics for these steps can be found in Captivity_BioinfoTable_Full.xlsx. NAs represent data that are not applicable to the row.
- Rows = Samples
- Columns = Region, Condition, Capture (Date), Euthanized (Date), Weight Caught (g), Weight Euthanized, Sex, Toothrow (cm), RIN, Reads pre filtered, Reads post filtered, Reads pre filtered (pairwise), Reads post filtered (pairwise), Reads removed, Reads removed (pairwise), K processed, K aligned, and mapping percent (%).
Gene count files for each tissue type (Olf_bulb_counts.txt, Hippocampus_counts.txt, Cortex_counts.txt) are described below.
Gene Counts (counts)
- Rows = Genes
- Columns = Samples
After this, we normalize data and analyze differential expression between conditions using DESeq2. Below are the results for the comparisons between wild and captive individuals (Olf_bulb_results_all.csv, Hippoc_results_all.csv, Cortex_results_all.csv). NAs in p-values represent genes for which differential expression was not tested. These genes were filtered out because all samples in the row have zero counts, because the row contains a sample with an extreme count outlier, or because the row was removed by DESeq2's automatic independent filtering.
DESeq2 Results
- Rows = Genes
- Columns = means, log-fold changes, p-values
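To see how many genes carry an NA p-value in one of the results files, a short check such as the following can help. It is a sketch under stated assumptions rather than part of the original workflow: the function name `count_na_pvalues` is hypothetical, the file is assumed to be comma-separated with a header row, the p-value column is assumed to use DESeq2's default name `pvalue`, and no field is assumed to contain a comma.

```shell
# Report how many genes have a tested vs. NA p-value in a DESeq2 results CSV.
# Assumptions: comma-separated, header row present, column named "pvalue"
# (the DESeq2 default), and no commas inside any field.
count_na_pvalues() {
  awk -F',' '
    NR == 1 {
      # Locate the p-value column by its header name.
      for (i = 1; i <= NF; i++) {
        h = $i
        gsub(/"/, "", h)
        if (h == "pvalue") col = i
      }
      if (!col) { print "no pvalue column found" > "/dev/stderr"; exit 1 }
      next
    }
    {
      v = $col
      gsub(/"/, "", v)
      total++
      if (v == "NA") na++
    }
    END { if (col) printf "tested=%d na=%d\n", total - na, na }
  ' "$1"
}
```

Calling `count_na_pvalues Cortex_results_all.csv` prints the number of genes with and without a p-value. If the column has a different name in the deposited files, change the header string in the function.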
Additionally, here are the significant results (p<0.05) for each tissue (Olfactory_bulb_significant_gene_list.txt, Hippocampus_significant_gene_list.txt, Cortex_significant_gene_list.txt), as well as those genes sorted into upregulated and downregulated lists without loci (Downregulated_genes_cortex.txt, Downregulated_genes_hippoc.txt, Downregulated_genes_olf_bulb.txt, Upregulated_genes_cortex.txt, Upregulated_genes_hippoc.txt and Upregulated_genes_olf_bulb.txt), which were used for pathway analyses.
- Rows = List of significant genes
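The split into significant upregulated and downregulated genes can be approximated directly from a results CSV. The sketch below is illustrative only and is not the authors' R code: `split_significant` is a hypothetical name, the column names `padj`, `pvalue`, and `log2FoldChange` are DESeq2 defaults assumed to be present in the header, and whether a positive log-fold change means "up in captivity" depends on the reference level used in the contrast.

```shell
# Print gene IDs passing a significance cutoff, tagged by direction of change.
# Usage: split_significant results.csv [p-value column] [cutoff]
# Assumptions: comma-separated file, gene ID in the first field, DESeq2 default
# column names, and no commas inside any field.
split_significant() {
  awk -F',' -v pcol="${2:-padj}" -v cutoff="${3:-0.05}" '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        h = $i
        gsub(/"/, "", h)
        if (h == pcol) p = i
        if (h == "log2FoldChange") lfc = i
      }
      next
    }
    p && lfc {
      if ($p == "NA" || $lfc == "NA") next   # untested or filtered genes
      gene = $1
      gsub(/"/, "", gene)
      if ($p + 0 < cutoff + 0)
        print gene "\t" (($lfc + 0 > 0) ? "up" : "down")
    }
  ' "$1"
}
```

For instance, `split_significant Cortex_results_all.csv padj 0.05 | grep -c 'up$'` would count genes with a positive log-fold change under that cutoff. Pass `pvalue` as the second argument to filter on unadjusted p-values instead.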
With the resultant genes we also perform a pathway enrichment analysis using DAVID Functional Enrichment Tools, to test whether they enrich KEGG pathways (Cortex_pathways_down.txt, Cortex_pathways_up.txt, Olf_bulb_pathways_down.txt, Olf_bulb_pathways_up.txt, Hippocampus_pathways_down.txt, Hippocampus_pathways_up.txt).
- Rows = Pathway
- Columns = genes tested, total hits, percent hits, enrichment value, p-values, adjusted p-values
Finally, NADH_genes_2024.xlsx is an Excel file in which the percentage of complex I NADH:ubiquinone oxidoreductase subunit genes in 12 of the enriched pathways was calculated. NAs represent blank cells, as no more complex I NADH:ubiquinone oxidoreductase subunit genes were found in the pathway.
- Rows = Gene symbols in pathways, Number of NADH:ubiquinone oxidoreductase subunit genes (total), NADH:ubiquinone oxidoreductase subunit genes (%)
- Columns = Pathways
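The percentage in this table is a simple proportion: the number of complex I subunit genes in a pathway divided by the total number of genes listed for that pathway. A rough way to recompute it from a plain-text list of gene symbols is sketched below. The function name `nduf_percent` is hypothetical, the input is assumed to hold one gene symbol per line, and matching on the `NDUF` symbol prefix is an assumption that covers nuclear-encoded subunits only (mitochondrially encoded MT-ND genes would not be counted); the spreadsheet may have been compiled differently.

```shell
# Percentage of gene symbols in a list that match the NDUF* naming pattern.
# Assumptions: one gene symbol per line; NDUF* identifies complex I
# NADH:ubiquinone oxidoreductase subunit genes (nuclear-encoded only).
nduf_percent() {
  awk '
    NF {
      total++
      if (toupper($1) ~ /^NDUF/) hits++
    }
    END {
      if (total) printf "%d/%d (%.1f%%)\n", hits, total, 100 * hits / total
    }
  ' "$1"
}
```

Given a file with the symbols found in one enriched pathway, `nduf_percent pathway_genes.txt` prints the count, the total, and the percentage (the file name here is a placeholder).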
Code/Software
Required packages
Kallisto version 0.46.1
R version 4.2.0
DESeq2 version 1.36.0
Database for Annotation, Visualization and Integrated Discovery (DAVID) (2022)
RNA-seq analyses first align reads to a reference and then quantify them. The reference and original unfiltered reads can be downloaded as described below. However, these steps can be skipped when reproducing the analysis, as count data have been provided above.
The reference (sorAra2; GCF_000181275.1) can be downloaded directly from NCBI, or with the code provided.
mkdir -p ./data/ref/
wget -P ./data/ref/ https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/181/275/GCF_000181275.2_SorAra2.0/GCF_000181275.2_SorAra2.0_genomic.gff.gz
wget -P ./data/ref/ https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/181/275/GCF_000181275.2_SorAra2.0/GCF_000181275.2_SorAra2.0_rna.fna.gz
gunzip ./data/ref/GCF_000181275.2_SorAra2.0_rna.fna.gz
RNA-seq data from this project can be found on NCBI SRA. These can be downloaded manually, or using the get_rawseq.sh script with the help of sratoolkit (https://github.com/ncbi/sra-tools). Note that all scripts use relative paths, which may need to be updated depending on your file structure.
bash get_rawseq.sh
Quality control, filtering, trimming
These scripts can be skipped if reproducing from counts. Trim adapters from downloaded reads and remove low-quality reads using fastp with default settings. You will need to install fastp in your environment (https://github.com/OpenGene/fastp).
bash fastp.sh
Mapping and quantification
Reads that have passed quality control are then mapped to the reference transcriptome and quantified with kallisto. This method pseudoaligns reads to the transcriptome and does not map them directly to the genome (https://pachterlab.github.io/kallisto/about).
bash kallisto.sh
Note: This will create new transcript abundances separate from the ones used in this analysis. They should be the same, just stored in a different location. Change names in the code below if required.
Analyses
Each analysis was conducted using the R code below for each tissue type. For best results, run the scripts in RStudio, because matrices and figures are not set to print out, to avoid overwriting results. Additionally, some intersection tables require data from the other scripts, so 2024_2Cortex_captivity.R and 2024_2Olfbulb_captivity.R should be run before 2024_2Hippocampus_captivity.R, because the hippocampus script includes code for plots that need outputs from the two other scripts.
2024_2Cortex_captivity.R
2024_2Olfbulb_captivity.R
2024_2Hippocampus_captivity.R
DAVID Geneset Enrichment
The DAVID enrichment analyses were run online at the link below. Ideally this step would be scripted; however, because of conflicts between packages and R versions, it was not. https://david.ncifcrf.gov/summary.jsp