Visual opsin gene expression evolution in the adaptive radiation of cichlid fishes of Lake Tanganyika
Data files
Version of Aug 03, 2023; files total 111.09 MB
- 00_Parse_GTF_biotypes.sh (888 B)
- 01_Trimmomatic.sh (2.65 KB)
- 010_VisualPalettes.R (6.31 KB)
- 011_Correlations.R (4.05 KB)
- 012_Cocorrelations.R (3.89 KB)
- 02_FastQC_MultiQC.sh (1.31 KB)
- 03_STAR_genome_index.sh (1.05 KB)
- 04_STAR_mapping_RH2As.sh (2 KB)
- 04_STAR_mapping.sh (1.85 KB)
- 05_MappedReadPairsToRH2As.sh (7.26 KB)
- 06_1_HTSeqCount.sh (1.89 KB)
- 06_2_HTSeqCount_RH2As.sh (2.43 KB)
- 07_HTSeqCount_results.R (20.45 KB)
- 08_HTSeqCount_results_WeightedSpeciesMean_opsins.R (7.26 KB)
- 09_DESeq2_results.R (9.26 KB)
- GCF_001858045.2_O_niloticus_UMD_NMBU_genomic_biotypes.txt (1.54 MB)
- GCF_001858045.2_O_niloticus_UMD_NMBU_genomic_geneID.gtf_FeatureCount_annotation_exons.txt (28.28 MB)
- HTSeqCount_ALL_exons_Individuals_withRH2As.txt (78.57 MB)
- HTSeqCount_exons_Barplot_PerTribe_RodCones_PE_Count.pdf (459.95 KB)
- HTSeqCount_exons_Barplot_PerTribe_RodCones_PE_TPM.pdf (462.40 KB)
- HTSeqCount_exons_PCA_Individual_opsins.pdf (771.76 KB)
- HTSeqCount_exons_PCA_Individual_proteins_lncRNAs.pdf (783.97 KB)
- Opsins_median_readcoverage_CDS_GW_PerTribe.pdf (139.58 KB)
- Orenil_opsins.txt (202 B)
- Parse_GTF_biotypes.py (1.11 KB)
- README_Dryad.md (4.92 KB)
- RNAseq_SpeciesTree.tre (5.88 KB)
Abstract
Tuning the visual sensory system to the ambient light is essential for survival in many animal species. This is often achieved through duplication, functional diversification, and/or differential expression of visual opsin genes. Here, we examined 753 new retinal transcriptomes from 112 species of cichlid fishes from Lake Tanganyika to unravel adaptive changes in gene expression at the macro-evolutionary and ecosystem level of one of the largest vertebrate adaptive radiations. We found that, across the radiation, all seven cone opsins – but not the rhodopsin – rank among the most differentially expressed genes in the retina, together with other vision-, circadian-rhythm-, and haemoglobin-related genes. We propose two new visual palettes characteristic of very shallow- and deep-water living species, respectively, and show that visual system adaptations along two major ecological axes, macro-habitat and diet, occur primarily via gene expression variation in a subset of cone opsin genes.
In this study, we sequenced 753 new retinal transcriptomes of 112 cichlid species from African Lake Tanganyika to (i) identify adaptive changes in the expression of rod and cone visual opsin genes, (ii) define and reconstruct the evolution of visual palettes on the basis of cone opsin expression levels, and (iii) examine rod and cone opsin expression levels in relation to macro-habitat, diet, and relative eye size.
Sampling of fish eyes was performed between 2014 and 2020 at 44 locations at Lake Tanganyika covering the entire north-south axis of this approximately 670 km long lake. For each specimen, the retina of a single eye was dissected, homogenised (FastPrep-24; MP Biomedicals), and the total RNA was extracted using the Direct-zol RNA kit (Zymo) according to the manufacturer’s protocol. Individual libraries were constructed using the Illumina TruSeq stranded protocol including RiboZero Gold rRNA depletion (Illumina) and sequenced on an Illumina NovaSeq 6000 in PE 100-bp mode. Library construction and sequencing were conducted at the Genomics Facility Basel, which is jointly operated by the University of Basel and the Department of Biosystems Science and Engineering (D-BSSE) of ETH Zurich.
Quality filtering and adapter removal of Illumina strand-specific paired-end sequences were performed using Trimmomatic (v. 0.39) with a 4-bp window size, a required window quality of 15, and 80 bp as minimum read length. Cleaned reads were mapped against the Nile tilapia reference genome (Oreochromis niloticus; RefSeq accession GCF_001858045.2, female), which is phylogenetically equidistant to all members of the cichlid adaptive radiation in Lake Tanganyika (that is, all species in our dataset except O. tanganicae and Tylochromis polylepis), using STAR (v. 2.7.3a) with --outFilterMultimapNmax 1 --outFilterMatchNminOverLread 0.4 --outFilterScoreMinOverLread 0.4.
To obtain a reasonable estimate of mapped reads to exonic features, we filtered out mapped singletons and then assigned and counted read pairs within exons using the HTSeq-count script from the HTSeq framework (v. 0.11.2). Due to the high gene sequence similarity between RH2Aa and RH2Ab (percent sequence identity of exons between 94.81% and 99.40%), we retrieved all read pairs exactly mapping to both features and corrected the read count by assigning the remaining read pairs to RH2Aa and RH2Ab according to their original ratio.
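The RH2Aa/RH2Ab correction described above can be illustrated with a minimal Python sketch. It assumes one possible reading of the procedure: read pairs that cannot be attributed to a single paralog are shared out in proportion to the counts that each paralog already holds. The function name `split_shared_pairs`, its arguments, and the even split used when both counts are zero are hypothetical and do not come from the deposited scripts, which remain the reference for the actual procedure.

```python
def split_shared_pairs(unique_a, unique_b, shared):
    """Share `shared` read pairs between two paralogs (e.g. RH2Aa, RH2Ab)
    in proportion to their existing counts.

    Assumption: this proportional rule is an illustrative reading of the
    text, not the authors' verified implementation.
    """
    total = unique_a + unique_b
    if total == 0:
        # No ratio can be computed; fall back to an even split (assumption).
        return unique_a + shared / 2, unique_b + shared / 2
    frac_a = unique_a / total
    corrected_a = unique_a + shared * frac_a
    corrected_b = unique_b + shared * (1 - frac_a)
    return corrected_a, corrected_b


# Hypothetical numbers: 300 and 100 attributable pairs, 40 pairs to share.
print(split_shared_pairs(300, 100, 40))  # (330.0, 110.0)
```

The ratio between the two paralogs is unchanged by the correction (3:1 before and after in the example), and the total number of read pairs is conserved.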
The total number of exonic features retrieved by HTSeq-count was 41’945 (out of 42’622 annotated genes). We filtered the read count dataset to retain both protein-coding RNAs and long non-coding RNAs (lncRNAs). This resulted in 38’228 RNAs, from which we excluded 9’226 lowly expressed genes (five or fewer counts in fewer than three samples). The final read count dataset used for the subsequent analyses included 29’002 RNAs, of which 25’335 were protein-coding RNAs (out of 29’532 exonic features) and 3’667 were lncRNAs (out of 8’696 exonic features).
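A low-expression filter of this kind can be sketched in Python as follows. The sketch assumes one common reading of the criterion: a gene is retained only if more than five read pairs were counted in at least three samples. The function names, the thresholds exposed as parameters, and the toy gene identifiers are hypothetical; the exact rule applied to the dataset is defined in the R scripts listed under Data files.

```python
def is_lowly_expressed(counts, min_count=5, min_samples=3):
    """Return True if fewer than `min_samples` samples exceed `min_count`.

    Assumption: this interprets the filter as "keep genes with more than
    five counts in at least three samples".
    """
    n_above = sum(1 for c in counts if c > min_count)
    return n_above < min_samples


def filter_genes(table):
    """Keep only the genes in {gene_id: [counts per sample]} that pass."""
    return {gene: counts for gene, counts in table.items()
            if not is_lowly_expressed(counts)}


# Toy count table with hypothetical gene identifiers and four samples.
toy = {
    "geneA": [0, 6, 7, 9],    # three samples above 5: kept
    "geneB": [5, 5, 5, 100],  # only one sample above 5: removed
    "geneC": [0, 0, 0, 0],    # never expressed: removed
}
print(sorted(filter_genes(toy)))  # ['geneA']
```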
A GitHub repository is available for further information: https://github.com/Ninet93/RNASeq_Ricci_et_al