
Coregonus spp. opsin amplicon sequence alignments

Cite this dataset

Eaton, Katherine; Krabbenhoft, Trevor (2020). Coregonus spp. opsin amplicon sequence alignments [Dataset]. Dryad.


Local adaptation can drive diversification of closely related species across environmental gradients and promote convergence of distantly related taxa that experience similar conditions. We examined a potential case of adaptation to novel visual environments in a species flock (Great Lakes salmonids, genus Coregonus) using a new amplicon genotyping protocol on the Oxford Nanopore Flongle. Five visual opsin genes were sequenced for individuals of C. artedi, C. hoyi, C. kiyi, and C. zenithicus. Comparisons revealed species-specific differences in a key spectral tuning amino acid in rhodopsin (Tyr261Phe substitution), suggesting local adaptation of C. kiyi to the blue-shifted depths of Lake Superior. Ancestral state reconstruction demonstrates that parallel evolution and “toggling” at this amino acid residue have occurred several times across the fish tree of life, resulting in identical changes to the visual systems of distantly related taxa across replicated environmental gradients. Our results suggest that ecological differences and local adaptation to distinct visual environments are strong drivers of both evolutionary parallelism and diversification.


Visual opsin genes from 18 Coregonus artedi, 19 C. hoyi, 21 C. kiyi, and 16 C. zenithicus were PCR amplified and sequenced using the Oxford Nanopore Flongle. Reads were aligned to version 1 of the Coregonus sp. 'balchen' assembly (De-Kayne et al. 2020) using bwa mem. Raw sequence data is deposited in SRA, under BioProject #PRJNA664981.

Raw Sanger sequencing data are also included here, in .ab1 format. To verify the results of our nanopore sequencing, we Sanger sequenced a ~700 bp segment of rhodopsin for each of 13 samples. These sequence files (forward and reverse for each sample) are named in the format (SAMPLE ID)_RH_A_A_(Fwd/Rev).ab1.
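The naming scheme above can be used to pair forward and reverse traces by sample. The following is a minimal sketch, not part of the deposited dataset: the function names are hypothetical, and the example file names assume that Sanger sample IDs look like the BAM sample IDs (e.g. CA01), which this record does not state.

```python
import re

# Pattern built from the stated scheme: (SAMPLE ID)_RH_A_A_(Fwd/Rev).ab1
AB1_NAME = re.compile(r"^(?P<sample>.+)_RH_A_A_(?P<direction>Fwd|Rev)\.ab1$")


def parse_ab1_name(filename):
    """Return (sample_id, direction) or None if the name does not match."""
    match = AB1_NAME.match(filename)
    if match is None:
        return None
    return match.group("sample"), match.group("direction")


def pair_reads(filenames):
    """Group trace files by sample ID: {sample: {"Fwd": name, "Rev": name}}."""
    pairs = {}
    for name in filenames:
        parsed = parse_ab1_name(name)
        if parsed is None:
            continue  # skip files that do not follow the naming scheme
        sample, direction = parsed
        pairs.setdefault(sample, {})[direction] = name
    return pairs


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    example = ["CA01_RH_A_A_Fwd.ab1", "CA01_RH_A_A_Rev.ab1", "notes.txt"]
    print(pair_reads(example))
```

A sample whose entry lacks either "Fwd" or "Rev" is missing one of its two trace files.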

Usage notes

Read alignments are in BAM format and are named by sample ID, which is also listed on NCBI under BioProject #PRJNA664981 (e.g. CA01.bam is a Coregonus artedi sample). BAM index files (*.bam.bai) are also included.
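Because each BAM file is named by sample ID and ships with a matching *.bam.bai index, a download can be checked for completeness with a few lines of standard-library Python. This is a rough sketch under the assumption that all files sit in one directory; the function name and directory layout are hypothetical.

```python
from pathlib import Path


def bam_samples(directory):
    """Map each sample ID (BAM file stem) to whether its .bam.bai index exists."""
    result = {}
    for bam in sorted(Path(directory).glob("*.bam")):
        index = bam.with_name(bam.name + ".bai")  # e.g. CA01.bam -> CA01.bam.bai
        result[bam.stem] = index.exists()
    return result


if __name__ == "__main__":
    # "." is a placeholder for wherever the dataset was unpacked.
    for sample, has_index in bam_samples(".").items():
        print(sample, "indexed" if has_index else "index missing")
```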

A VCF file containing called SNPs for all samples is also included (FINAL_filtered_opsins.vcf). This file contains called SNPs for 80 samples, 74 of which were used in the present study (these 74 sample names correspond to the names of the BAM files included in this repository). The file has been filtered to remove genotype calls made at <100x coverage, the threshold below which genotyping was determined to be unreliable.
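To illustrate what a per-genotype depth filter of this kind does, here is a minimal sketch in plain Python. It is not the authors' pipeline: it assumes each record carries a per-sample DP value in the FORMAT column and that calls are diploid (masked as "./."), neither of which is stated in this record. The function name is hypothetical.

```python
MIN_DEPTH = 100  # threshold stated in the usage notes


def mask_low_depth(vcf_line, min_depth=MIN_DEPTH):
    """Set GT to missing for every sample whose DP is below min_depth."""
    if vcf_line.startswith("#"):
        return vcf_line  # header lines pass through unchanged
    ending = "\n" if vcf_line.endswith("\n") else ""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return vcf_line  # no genotype columns
    keys = fields[8].split(":")  # FORMAT column, e.g. GT:DP
    if "GT" not in keys or "DP" not in keys:
        return vcf_line  # assumption violated: leave the record alone
    gt_i, dp_i = keys.index("GT"), keys.index("DP")
    for i in range(9, len(fields)):
        values = fields[i].split(":")
        depth = values[dp_i] if dp_i < len(values) else "."
        # Missing or non-numeric depth is treated as below threshold.
        if not depth.isdigit() or int(depth) < min_depth:
            values[gt_i] = "./."
            fields[i] = ":".join(values)
    return "\t".join(fields) + ending


if __name__ == "__main__":
    # Made-up record for illustration; not taken from the dataset.
    record = "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:250\t1/1:40\n"
    print(mask_low_depth(record), end="")
```

In the made-up record, the first sample (depth 250) keeps its call and the second (depth 40) is masked.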


Great Lakes Fishery Commission, Award: 2018_KRA_44073