Data from: Major improvements to the Heliconius melpomene genome assembly used to confirm 10 chromosome fusion events in 6 million years of butterfly evolution
Data files
Jan 21, 2017 version files 9.64 GB
- Eueides.Hmel2.vcf.gz
- Genome.tar.gz
- GitHub.tar.gz
- Hmel_cross.Hmel_haplotype_scaffolds.vcf.gz
- Hmel_cross.Hmel1-1_primaryScaffolds.vcf.gz
- Hmel_cross.linkage_map.clean.db.gz
- Hmel_cross.linkage_map.db.gz
- Hmel1-2_haploid.fa.gz
- Hmel1-2_pacbio_merge.fa.gz
- Hmel1-2.fa.gz
- pacbio_falcon_assembly.fa.gz
- pacbio_haploid.fa.gz
- readdepth_gc_1kb.windows.Hmel2.adjusted.tsv.gz
- README.txt
Abstract
The Heliconius butterflies are a widely studied adaptive radiation of 46 species spread across Central and South America, several of which are known to hybridize in the wild. Here, we present a substantially improved assembly of the Heliconius melpomene genome, developed using novel methods that should be applicable to improving other genome assemblies produced using short-read sequencing. First, we whole-genome-sequenced a pedigree to produce a linkage map incorporating 99% of the genome. Second, we incorporated haplotype scaffolds extensively to produce a more complete haploid version of the draft genome. Third, we incorporated ∼20x coverage of Pacific Biosciences sequencing, and scaffolded the haploid genome using an assembly of this long-read sequence. These improvements result in a genome of 795 scaffolds, 275 Mb in length, with an N50 length of 2.1 Mb, an N50 number of 34, and with 99% of the genome placed, and 84% anchored on chromosomes. We use the new genome assembly to confirm that the Heliconius genome underwent 10 chromosome fusions since the split with its sister genus Eueides, over a period of about 6 million years.
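The abstract reports an N50 length and an N50 number for the assembly. As a minimal sketch of how these two statistics are conventionally defined, the Python below computes them from a list of scaffold lengths. The function name `n50_stats` and the toy lengths are hypothetical and chosen for illustration only; this is not the authors' pipeline, and the values are not taken from the dataset.

```python
def n50_stats(lengths):
    """Return (N50 length, N50 number) for a collection of scaffold lengths.

    N50 length: the length of the scaffold at which the running total,
    taken over scaffolds sorted longest first, reaches half the assembly size.
    N50 number: how many scaffolds were needed to reach that point.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy, made-up scaffold lengths (not from the Heliconius assembly).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50_length, n50_number = n50_stats(toy_lengths)
print(n50_length, n50_number)  # prints "70 2"
```

In the toy example the total is 300, so the two longest scaffolds (80 + 70 = 150) already cover half of it. Read the same way, the figures in the abstract say that the 34 longest scaffolds, each at least 2.1 Mb, together cover at least half of the 275 Mb assembly.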