Automated improvement of stickleback reference genome assemblies with Lep-Anchor software
Kivikoski, Mikko; Rastas, Pasi; Löytynoja, Ari; Merilä, Juha (2021), Automated improvement of stickleback reference genome assemblies with Lep-Anchor software, Dryad, Dataset, https://doi.org/10.5061/dryad.bzkh1896t
We describe an integrative approach to improve the contiguity and haploidy of a reference genome assembly and demonstrate its impact with practical examples. With two novel features of the Lep-Anchor software and a combination of dense linkage maps, overlap detection and bridging long reads, we generated an improved assembly of the nine-spined stickleback (Pungitius pungitius) reference genome. We were able to remove a significant number of haplotypic contigs, detect more genetic variation and improve the contiguity of the genome, especially that of the X chromosome. However, improved scaffolding cannot correct for mosaicism of erroneously assembled contigs, as demonstrated by a de novo assembly of a 1.7 Mbp inversion. Qualitatively similar gains were obtained with the genome of the three-spined stickleback (Gasterosteus aculeatus). The utility of genome-wide sequencing data in biological research depends heavily on the quality of the reference genome. Although reference genomes have improved, it is evident that the assemblies could still be refined, especially in non-model study organisms.
This submission contains a snapshot of computer code and instructions for automated improvement of reference genome assemblies and scripts for reproducing the published analyses. An up-to-date version is available at https://github.com/mikkokivikoski/NSP_V7.
Academy of Finland, Award: 129662
Academy of Finland, Award: 134728
Academy of Finland, Award: 218343
Academy of Finland, Award: 322681