Marine populations of the threespine stickleback (Gasterosteus aculeatus) have repeatedly colonized and rapidly adapted to freshwater habitats, providing a powerful system to map the genetic architecture of evolved traits. Here, we developed and applied a binned genotyping-by-sequencing (GBS) method to build dense genome-wide linkage maps of sticklebacks using two large marine by freshwater F2 crosses of more than 350 fish each. The resulting linkage maps significantly improve the genome assembly by anchoring 78 new scaffolds to chromosomes, reorienting 40 scaffolds, and rearranging scaffolds in 4 locations. In the revised genome assembly, 94.6% of the assembly was anchored to a chromosome. To assess linkage map quality, we mapped quantitative trait loci (QTL) controlling lateral plate number, which mapped as expected to a 200-kb genomic region containing Ectodysplasin, as well as a chromosome 7 QTL overlapping a previously identified modifier QTL. Finally, we mapped eight QTL controlling convergently evolved reductions in gill raker length in the two crosses, which revealed that this classic adaptive trait has a surprisingly modular and nonparallel genetic basis.
README
Summary of all files in this Dryad package
FileS4 NewScaffoldOrder.csv
Revised scaffold order for each chromosome (consensus of FTC and BEPA). Revised coordinates (based on this study) and original assembly coordinates are presented. Orientations are defined relative to original genome assembly. The orientation of some scaffolds was not detected in this study. These scaffolds are labeled as having 'unknown' orientation; their orientation was not altered relative to their orientation in the original genome assembly. Chromosome 'M' is the mitochondrial genome sequence, which was not analyzed in this study but is replicated in the revised genome assembly.
FileS5 revisedAssemblyUnmasked.fa.zip
Fasta file containing revised genome assembly based on consensus scaffold order and orientation as described in File S4 in the Glazer et al. manuscript. File is zipped.
FileS6 revisedAssemblyMasked.fa.zip
Repeat-masked fasta file containing revised genome assembly based on consensus scaffold order and orientation as described in File S4 in the Glazer et al. manuscript. The repeat-masked fasta file is derived from the repeat-masked version of the original genome assembly, which was masked with RepeatMasker. File is zipped.
FileS7 ensGene_revised.gtf
Revised .gtf file of Ensembl gene predictions. Coordinates of gene predictions were converted to the revised assembly coordinates. All Ensembl-predicted genes were included, except ENSGACT00000019430, which spans two scaffolds (11 and 79) that are not adjacent in the revised genome assembly. File is zipped.
ScafKeyForNewFasta.csv
Key for scaffold order in fasta files
SampleList.csv
List of all samples and their barcodes from the GBS-genotyped F2 crosses.
convertCoordinate.R
This R function converts between the 'old' and 'new' stickleback assembly coordinate systems. The 'old' coordinate system is that of the assembly described in the Jones et al. 2012 stickleback genome paper; the 'new' coordinate system is that of the revised assembly from this study. The function requires access to the FileS4 NewScaffoldOrder.csv file. It takes 4 inputs (chr, pos, direction, and scafFile) and returns a list of [chromosome, position]. See README or comments in convertCoordinate.R for further details.
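The idea behind this kind of scaffold-based coordinate conversion can be illustrated with a minimal Python sketch. This is not the code in convertCoordinate.R and it does not read FileS4 NewScaffoldOrder.csv. The field names (old_chr, old_start, and so on), the direction values "old2new" and "new2old", and the example scaffolds are all hypothetical and chosen only for illustration. The sketch assumes each scaffold occupies one interval in each coordinate system and that scaffolds of 'unknown' orientation keep their original orientation, as described for File S4.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Scaffold:
    # Hypothetical layout; the real column names in FileS4 may differ.
    name: str
    old_chr: str
    old_start: int
    old_end: int
    new_chr: str
    new_start: int
    new_end: int
    orientation: str  # "forward", "reverse", or "unknown"


def convert_coordinate(chr_name: str, pos: int, direction: str,
                       scaffolds: List[Scaffold]) -> Optional[Tuple[str, int]]:
    """Map a 1-based position between two assemblies via its scaffold.

    direction is "old2new" or "new2old" (hypothetical values).
    Returns (chromosome, position), or None if no scaffold contains pos.
    """
    if direction not in ("old2new", "new2old"):
        raise ValueError("direction must be 'old2new' or 'new2old'")
    for s in scaffolds:
        if direction == "old2new":
            src = (s.old_chr, s.old_start, s.old_end)
            dst = (s.new_chr, s.new_start, s.new_end)
        else:
            src = (s.new_chr, s.new_start, s.new_end)
            dst = (s.old_chr, s.old_start, s.old_end)
        if chr_name == src[0] and src[1] <= pos <= src[2]:
            offset = pos - src[1]
            if s.orientation == "reverse":
                # A flipped scaffold is read from its far end.
                return dst[0], dst[2] - offset
            # "forward" and "unknown" both keep the original orientation.
            return dst[0], dst[1] + offset
    return None


# Made-up scaffolds for illustration only (not taken from File S4).
example_scaffolds = [
    Scaffold("scafA", "chrI", 1, 100, "chrI", 1, 100, "forward"),
    Scaffold("scafB", "chrUn", 1, 50, "chrI", 201, 250, "reverse"),
]

print(convert_coordinate("chrUn", 1, "old2new", example_scaffolds))
print(convert_coordinate("chrI", 250, "new2old", example_scaffolds))
```

In this sketch a reversed scaffold maps its first base to the last base of its new interval, and because the flip is symmetric, converting back returns the starting position. For real analyses, use convertCoordinate.R with the scaffold table supplied in this package.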