Asellus aquaticus genome and VCF file from samples obtained on Gotland, Sweden
Data files
May 19, 2021 version files 2.41 GB
Abstract
Organisms well suited for the study of ecotype formation have wide distribution ranges, where they adapt to multiple drastically different habitats repeatedly over space and time. Here we study such ecotypes in a crustacean model, Asellus aquaticus, a commonly occurring isopod found in freshwater habitats as diverse as streams, caves and lakes. Previous studies focusing on cave versus surface ecotypes have documented depigmentation, eye loss and elongated antennae in populations from several southern European cave systems. Likewise, surveys across multiple Swedish lakes have identified dark-pigmented "reed" and light-pigmented "stonewort" ecotypes, which can be found within the same lake. In this study, we sequenced the first draft genome of A. aquaticus and subsequently used it to map reads and call variants in surface stream, cave and two lake ecotypes. In addition, the draft genome was combined with a RADseq approach to perform a QTL mapping study using a laboratory-bred F2 and F4 cave x surface intercross. We identified genomic regions associated with body pigmentation, antennae length and body size. Furthermore, we compared genome-wide differentiation between natural populations and found several genes potentially associated with these habitats. The assessment of the cave QTL regions in the light-dark comparison of lake populations suggests that the regions associated with cave adaptation are also involved in genomic differentiation in the lake ecotypes. These results demonstrate how troglomorphic adaptations can be used as a model for related ecotype formation.
Methods
An Asellus aquaticus population from Lummelunda Cave on Gotland, Sweden was collected in 2014 and was reared in the laboratory at Linköping University until 2021. Genomic DNA from a single female specimen from this inbred population was extracted using a standard salt-based DNA extraction method. 10x Genomics linked-read libraries were prepared from the genomic DNA (outsourced to SciLife Labs., Sweden) and sequenced on a single S4 lane of an Illumina NovaSeq 6000 machine. The genome was assembled using the Supernova software and subsequently scaffolded using paired-end WGS data from two other Asellus aquaticus samples collected in the same area. Important note: as part of our study, we screened for contamination only those scaffolds that we analysed. This raw genome build is therefore likely to contain contaminant sequences. Keep this in mind and apply decontamination pipelines if necessary.
The VCF file includes over 6 million SNPs and was derived from natural populations (LU-Lummelunda Upstream, LC-Lummelunda Cave, LD-Lummelunda Downstream, HR-Horsan Reed, HS-Horsan Stony bottom) and from F2 and F4 populations derived from a cave x surface intercross. Most of the SNPs in the file are poorly represented among the samples; only around 1000-2000 SNPs are better represented.
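
Because most SNPs are poorly represented among the samples, users may want to restrict analyses to better-represented sites. The following is a minimal sketch, not the authors' pipeline: it assumes that "represented" can be approximated by the fraction of samples with a called genotype (the GT field) at each site. The function names and the 0.8 threshold used in the note below are illustrative assumptions, not values taken from the study.

```python
import gzip
from typing import Iterable, Iterator


def open_vcf(path: str):
    """Open a plain-text or gzip-compressed VCF file for reading as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def call_rate(record: str) -> float:
    """Return the fraction of samples with a fully called genotype.

    Assumes a standard tab-separated VCF data line: eight fixed columns,
    a FORMAT column, then one column per sample.
    """
    fields = record.rstrip("\n").split("\t")
    samples = fields[9:]
    if not samples:
        return 0.0
    format_keys = fields[8].split(":")
    if "GT" not in format_keys:
        return 0.0
    gt_index = format_keys.index("GT")
    called = 0
    for sample in samples:
        parts = sample.split(":")
        genotype = parts[gt_index] if gt_index < len(parts) else "."
        # A genotype containing "." (e.g. "./." or "./1") counts as missing.
        if "." not in genotype:
            called += 1
    return called / len(samples)


def filter_by_call_rate(lines: Iterable[str], min_rate: float) -> Iterator[str]:
    """Yield header lines unchanged and data lines whose call rate >= min_rate."""
    for line in lines:
        if line.startswith("#"):
            yield line
        elif line.strip() and call_rate(line) >= min_rate:
            yield line
```

To apply it, pass the handle returned by `open_vcf(path)` (where `path` is wherever the downloaded VCF was saved) to `filter_by_call_rate(handle, 0.8)` and write the yielded lines to a new file. The filter streams line by line, so the 6 million records never need to be held in memory at once.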