Data from: Chromosome-level reference genome assembly and gene editing of the dead-leaf butterfly Kallima inachus

Cite this dataset

Yang, Jie et al. (2020). Data from: Chromosome-level reference genome assembly and gene editing of the dead-leaf butterfly Kallima inachus [Dataset]. Dryad.


The leaf resemblance of Kallima (Nymphalidae) butterflies is an important ecological adaptive mechanism that increases survival. However, the genetic mechanism underlying this ecological adaptation remains unclear owing to a dearth of genomic information. Herein, we revealed the karyotype (n = 31) of the dead-leaf butterfly Kallima inachus, assembled its high-quality chromosome-level reference genome (568.92 Mb; contig N50: 19.20 Mb), and identified its Z and candidate W chromosomes. To our knowledge, this is the first study to report on these aspects of this species. In the assembled genome, 15,309 protein-coding genes were annotated, and repeat elements accounted for 49.86% of the sequence. Phylogenetic analysis showed that K. inachus diverged from Melitaea cinxia (no leaf resemblance), both of which are in Nymphalinae, around 40 million years ago. Demographic analysis indicated that the effective population size of K. inachus decreased during the last interglacial period in the Pleistocene. Adults in which the pigmentation gene ebony was knocked out using CRISPR/Cas9 showed wing phenotypes in which the orange dorsal region and the entire ventral surface darkened, suggesting that ebony plays a vital role in the ecological adaptation of dead-leaf butterflies. Our results provide important genome resources for investigating the genetic mechanism underlying protective resemblance in dead-leaf butterflies and insights into the molecular basis of protective coloration.


Pupae of K. inachus from Jingdong County, Puer City, Yunnan Province, China, were collected and reared until eclosion in garden conditions (25-27 °C, 80% relative humidity, 16 h/8 h light/darkness). Around 100 female and male adults were kept in a greenhouse of 20 m × 20 m × 3 m (length × width × height) with spoiled fruits as adult food and with Strobilanthes as egg-laying and larval host plants. Fifth-instar larvae were collected as samples for karyotype analysis and for Hi-C sequencing, while female adults were used for Illumina sequencing (genome survey) and PacBio sequencing (de novo genome assembly). Newly laid eggs were used for the gene-editing experiment. In addition, one male collected from Yunnan Province was sequenced on the Illumina platform to identify the W chromosome.

Genomic DNA was isolated from the head and thorax tissues of a female individual using the Trelief™ Animal Genomic DNA Kit (TsingKe, China). Paired-end libraries (insert size: 350 bp) were generated using the NEBNext® Ultra DNA Library Prep Kit for Illumina and sequenced on the HiSeq4000 platform at Novogene (Tianjin, China). Raw reads from the female in which more than 90% of bases had quality below Q20 were filtered out, and the remaining reads were used to estimate the genome size from 17-mer frequencies using kmerfreq (Liu et al., 2013). These female Illumina reads were also used to correct base-level errors in the de novo assembled genome. In addition, genomic DNA was isolated from the thorax tissues of two male individuals using a phenol-chloroform DNA extraction protocol, and paired-end libraries (insert size: 350 bp) were generated using the KAPA® Hyper Prep Kit and sequenced on the Illumina HiSeq Xten platform at Novogene (Tianjin, China). These male reads were used to validate the identification of the W chromosome. For PacBio SMRT long-read sequencing, genomic DNA from another female adult was isolated to construct one 20 kb library (NextOmics, China). DNA templates and enzyme complexes were transferred to the zero-mode waveguide for SMRT sequencing on a PacBio Sequel instrument at the Genome Center of Nextomics (Wuhan, China).
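The read filtering and k-mer-based genome size estimate described above can be illustrated with a minimal Python sketch. It is not the kmerfreq implementation: it assumes Phred+33 quality encoding, takes the depth shared by the most distinct k-mers as the main peak, and divides the total k-mer count by that peak depth. All function names and the min_depth cutoff are hypothetical and chosen for illustration only.

```python
from collections import Counter


def low_quality_fraction(quality_string, phred_offset=33, min_quality=20):
    """Return the fraction of bases with a Phred score below min_quality."""
    if not quality_string:
        return 0.0
    low = sum(1 for ch in quality_string if ord(ch) - phred_offset < min_quality)
    return low / len(quality_string)


def keep_read(quality_string, max_low_fraction=0.9):
    """Keep a read unless more than max_low_fraction of its bases are below Q20."""
    return low_quality_fraction(quality_string) <= max_low_fraction


def count_kmers(reads, k=17):
    """Count every k-mer in the reads, skipping k-mers that contain N."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def estimate_genome_size(kmer_counts, min_depth=2):
    """Estimate genome size as (total k-mers) / (depth of the main peak).

    K-mers seen fewer than min_depth times are treated as likely
    sequencing errors and ignored (an assumption of this sketch).
    """
    histogram = Counter(kmer_counts.values())  # depth -> number of distinct k-mers
    usable = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth
```

On real data the k-mer table for a genome of this size would not fit comfortably in a Python dictionary, so dedicated k-mer counters are used in practice; the sketch only shows the arithmetic behind the estimate.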

Sample treatment (four fifth-instar larvae) and library construction for Hi-C sequencing followed previous protocols (Belton et al., 2012; Erez et al., 2009; Ma et al., 2015; Nagano et al., 2015) with some improvements. Firstly, the isolated cells from sliced tissues were cross-linked, lysed and digested with the restriction enzyme DpnII overnight. Secondly, the cohesive ends were blunted, reversed and marked with biotin-14-dATP. Thirdly, DNA was purified by removing biotin from unligated ends, then sheared to fragments of 200-300 bp with a Covaris M220 (Covaris, Woburn, MA). After DNA size selection with AMPure XP beads, point ligation junctions were pulled down with Dynabeads® MyOne™ Streptavidin C1 (Thermo Fisher). Finally, the Hi-C library was sequenced on the Illumina NovaSeq sequencing platform at Novogene (Tianjin, China).
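As an illustration of the digestion step, the following Python sketch performs an in-silico DpnII digest of a sequence. DpnII recognizes GATC and cuts immediately 5' of the G, so each fragment after the first begins with GATC. The function names are hypothetical, and the sketch models only the top strand of a linear sequence; it is not part of the protocol used here.

```python
def dpnii_cut_positions(sequence, site="GATC"):
    """Return 0-based positions where DpnII cuts (just before each GATC)."""
    seq = sequence.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest(sequence, site="GATC"):
    """Split a linear sequence into the fragments produced by a complete digest."""
    cuts = dpnii_cut_positions(sequence, site)
    bounds = [0] + cuts + [len(sequence)]
    # Skip empty fragments, which arise when the sequence starts with the site.
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    for fragment in digest("AAGATCTTGATCC"):
        print(len(fragment), fragment)
```

Applied to an assembled genome, the same function gives the expected distribution of restriction fragment lengths, which is one way to check how densely Hi-C contacts can be resolved.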

Usage notes

Kin_Hic.fasta: the chromosome-level genome assembly of Kallima inachus

12species_ortholog_align_cds.fa: aligned coding sequences (CDS) of orthologous genes from 12 species
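A minimal Python sketch for a first look at either FASTA file is given below. It reads sequence lengths and computes an N50 using only the standard library. The function names are hypothetical. Because Kin_Hic.fasta holds chromosome-level sequences, an N50 computed from it would reflect scaffold rather than contig lengths, so it is not expected to match the contig N50 quoted in the abstract.

```python
import io


def read_fasta_lengths(handle):
    """Return {sequence name: length} for a FASTA file opened in text mode."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            # Use the first word of the header as the sequence name.
            name = parts[0] if parts else "unnamed_%d" % (len(lengths) + 1)
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def n50(lengths):
    """Return the length L such that sequences of length >= L cover half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


# Toy in-memory FASTA standing in for a real file.
toy = io.StringIO(">chr1 example\nACGT\nAC\n>chr2\nGG\n")
toy_lengths = read_fasta_lengths(toy)
```

For the real data, the same functions can be used as `with open("Kin_Hic.fasta") as fh: lengths = read_fasta_lengths(fh)` followed by `n50(lengths.values())`, assuming the file sits in the working directory.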