Genome assembly and annotation of the Dark-branded Bushbrown butterfly Mycalesis mineus (Nymphalidae: Satyrinae)
Data files
Feb 26, 2024 version files (562.53 MB total)
Mmineus_v0.6.fa - 504.51 MB
Mmineus_v0.6.gff3 - 58.02 MB
Mminues_Assembly_scripts.txt - 3.26 KB
README.md - 1.08 KB
Abstract
We report a high-quality draft genome assembly of the Dark-branded Bushbrown, Mycalesis mineus, a member of the Satyrinae subfamily of nymphalid butterflies. This species is emerging as a promising model organism for investigating the evolution and development of phenotypic plasticity. Using 45.99 Gb of long-read data (read N50 = 11.11 Kb), we assembled a 497.4 Mb genome for M. mineus. The assembly is highly contiguous and nearly complete (96.8% of BUSCO lepidopteran genes were complete and single-copy). Repetitive elements make up 38.71% of the genome, which includes 20,967 predicted protein-coding genes. The assembly was super-scaffolded into 28 pseudo-chromosomes using the chromosome-level genome of a closely related species, Bicyclus anynana, as a template. This genomic resource will advance both ongoing and future research on this model organism.
README: Mycalesis mineus genome assembly
This submission contains the genome assembly and annotation files for M. mineus.
Mmineus_v0.6.fa - assembly
Using 45.99 Gb of long-read data (read N50 = 11.11 Kb), we assembled a 497.4 Mb genome for Mycalesis mineus. The assembly is highly contiguous and nearly complete (96.8% of BUSCO lepidopteran genes were complete and single-copy). Repetitive elements make up 38.71% of the genome. The assembly was super-scaffolded into 28 pseudo-chromosomes using the chromosome-level genome of a closely related species, Bicyclus anynana, as a template. The current version of the assembly, v0.6, has 342 scaffolds with a scaffold N50 of 17.8 Mb.
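The scaffold count, total length, and N50 quoted above can be re-derived from the FASTA file. The following is a minimal sketch, not the authors' script: the function names are hypothetical, and the toy records stand in for the real scaffolds.

```python
from typing import Dict, Iterable, List


def sequence_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of every record in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that records of length >= L hold at least half of all bases."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0  # empty input


def assembly_summary(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Summarise scaffold count, total bases, and N50."""
    lengths = sequence_lengths(fasta_lines)
    return {"scaffolds": len(lengths), "total_bp": sum(lengths), "n50_bp": n50(lengths)}


# Toy input (hypothetical records of 10, 8, and 4 bases).
toy = [">s1", "ACGTACGTAC", ">s2", "ACGTAC", "GT", ">s3", "ACGT"]
print(assembly_summary(toy))  # {'scaffolds': 3, 'total_bp': 22, 'n50_bp': 8}
```

To apply it to the deposited assembly, pass an open file handle instead of the toy list, for example `with open("Mmineus_v0.6.fa") as handle: print(assembly_summary(handle))`. The file is read line by line, so the full sequence is never held in memory.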
Mmineus_v0.6.gff3 - annotation
The genome was annotated using the BRAKER2 automated pipeline, which predicted 18,360 genes with 20,967 transcripts. The Mmineus_v0.6.gff3 file contains the gene and transcript annotations for the Mmineus_v0.6.fa assembly.
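Gene and transcript totals like those above can be checked by tallying the feature-type column (column 3) of the GFF3 file. This is a minimal sketch with a hypothetical function name. It assumes genes are labelled `gene` and transcripts `mRNA`, which may differ in the deposited file (some pipelines write `transcript` instead), and the toy records are invented for illustration.

```python
from collections import Counter
from typing import Iterable


def count_feature_types(gff3_lines: Iterable[str]) -> Counter:
    """Tally the feature type (column 3) of every record in a GFF3 file."""
    counts: Counter = Counter()
    for line in gff3_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # an optional embedded sequence section follows; stop here
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 record
        counts[fields[2]] += 1
    return counts


# Toy annotation: two genes, the first with two transcripts (hypothetical IDs).
toy = [
    "##gff-version 3",
    "chr1\tAUGUSTUS\tgene\t1\t900\t.\t+\t.\tID=g1",
    "chr1\tAUGUSTUS\tmRNA\t1\t900\t.\t+\t.\tID=g1.t1;Parent=g1",
    "chr1\tAUGUSTUS\tmRNA\t1\t700\t.\t+\t.\tID=g1.t2;Parent=g1",
    "chr1\tAUGUSTUS\tgene\t1500\t2400\t.\t-\t.\tID=g2",
    "chr1\tAUGUSTUS\tmRNA\t1500\t2400\t.\t-\t.\tID=g2.t1;Parent=g2",
]
counts = count_feature_types(toy)
print(counts["gene"], counts["mRNA"])  # 2 3
```

Printing the whole `Counter` for the real file first is a quick way to see which feature-type labels it actually uses before comparing against the reported totals.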
Mminues_Assembly_scripts.txt - assembly scripts
This file contains the commands and steps used to assemble and annotate the genome.
Methods
High molecular weight (HMW) DNA was extracted from a female individual. The Oxford Nanopore library was prepared with the Ligation Sequencing Kit for gDNA (SQK-LSK114), following the manufacturer's protocol. Sequencing was performed on a FLO-PRO002 flow cell on a PromethION instrument, and base calling was performed with Guppy 6.5.7. Two initial genome assemblies were produced independently, one with the Flye assembler (Kolmogorov et al., 2019) and one with the wtdbg2 assembler (Ruan & Li, 2020). Flye was run with default settings except that the number of polishing iterations was set to 1; wtdbg2 was run with its default settings. For both assemblers, the expected genome size was set to 500 Mb, based on the genome size of the closely related species Bicyclus anynana. The two assemblies were merged using quickmerge (Chakraborty et al., 2016), with the wtdbg2 assembly as the reference and the Flye assembly as the query. Purge_haplotigs was used to remove haplotigs, that is, redundant contigs arising from heterozygosity. Misassemblies were then broken with ragtag_correct, using the B. anynana genome as a reference. Finally, the assembly was scaffolded to chromosome level with Chromosome_Scaffolder, again using the B. anynana genome as the reference.
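The exact commands are recorded in Mminues_Assembly_scripts.txt. Purely as an illustration, the two initial assembly runs might be composed as below. This is a sketch under stated assumptions, not the authors' commands: the file names, output locations, thread count, and the Flye read-type flag (`--nano-raw`) are assumptions, and no wtdbg2 preset is passed because the text only says defaults were used. The genome size (500 Mb) and the single Flye polishing iteration come from the description above.

```python
import shlex
from typing import List


def flye_command(reads: str, out_dir: str, genome_size: str = "500m",
                 iterations: int = 1, threads: int = 16,
                 read_type: str = "--nano-raw") -> List[str]:
    """Build a Flye invocation; read_type and threads are assumptions."""
    return [
        "flye", read_type, reads,
        "--genome-size", genome_size,    # 500 Mb, as stated in the methods
        "--iterations", str(iterations), # one polishing iteration, as stated
        "--out-dir", out_dir,
        "--threads", str(threads),
    ]


def wtdbg2_commands(reads: str, prefix: str, genome_size: str = "500m",
                    threads: int = 16) -> List[List[str]]:
    """Build the wtdbg2 layout step and the wtpoa-cns consensus step."""
    assemble = ["wtdbg2", "-g", genome_size, "-i", reads,
                "-t", str(threads), "-fo", prefix]
    consensus = ["wtpoa-cns", "-t", str(threads),
                 "-i", f"{prefix}.ctg.lay.gz", "-fo", f"{prefix}.ctg.fa"]
    return [assemble, consensus]


# Hypothetical input and output names, for illustration only.
commands = [flye_command("reads.fastq.gz", "flye_out")]
commands += wtdbg2_commands("reads.fastq.gz", "wtdbg2_out")
for cmd in commands:
    print(shlex.join(cmd))
```

The commands are built as argument lists and only printed, so nothing is executed. Passing one of the lists to `subprocess.run(cmd, check=True)` would run it, provided the tools are installed.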