Dryad

Chromosome-level genome of the peach fruit moth Carposina sasakii (Lepidoptera: Carposinidae) provides a resource for evolutionary studies on moths

Cite this dataset

Wei, Shu-Jun; Cao, Li-Jun (2020). Chromosome-level genome of the peach fruit moth Carposina sasakii (Lepidoptera: Carposinidae) provides a resource for evolutionary studies on moths [Dataset]. Dryad. https://doi.org/10.5061/dryad.m0cfxpp1j

Abstract

Here we provide the scripts and parameters used for genome assembly and annotation, together with manually annotated gene sets from the genome of the peach fruit moth (PFM), Carposina sasakii Matsumura (Lepidoptera: Carposinidae, superfamily Copromorphoidea), and from the genomes of related species. The gene sets comprise the circadian genes period (PER), timeless (TIM), Clock (CLK), cycle (CYC) and cryptochrome (CRY); five detoxification gene families, namely cytochrome P450 monooxygenases (P450s), glutathione S-transferases (GSTs), carboxyl/cholinesterases (CCEs), UDP-glycosyltransferases (UGTs) and ATP-binding cassette (ABC) transporters; and the ionotropic receptor (IR), odorant receptor (OR), odorant-binding protein (OBP) and gustatory receptor (GR) genes.

Methods

Manual annotation of circadian genes

We manually annotated five well-studied circadian genes, period (PER), timeless (TIM), Clock (CLK), cycle (CYC) and cryptochrome (CRY), using BLAST v2.2.31 (Altschul et al. 1990). Reference protein sequences of insect circadian genes were obtained from the UniProt database. Conserved domains within the proteins were annotated against the Conserved Domain Database (Lu et al. 2020). Circadian genes of the other 15 insect species were annotated in the same way. For a domain shared by three of the genes (CLK, PER and CYC), a neighbor-joining tree was constructed using MEGA7 (Kumar et al. 2016) with 500 bootstrap replicates.
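As a rough illustration of the BLAST step, the sketch below parses tabular BLAST output and keeps the highest-scoring hit for each reference protein. It assumes the search was run with the standard 12-column tabular format (-outfmt 6), which is not stated in the text. The identifiers (PER_ref, scaffold_1 and so on) and the function names are hypothetical, and the manual curation that followed is not reproduced here.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class BlastHit(NamedTuple):
    query: str
    subject: str
    evalue: float
    bitscore: float


def parse_tabular(lines: Iterable[str]) -> Iterator[BlastHit]:
    """Parse BLAST tabular output (assumed: -outfmt 6, default 12 columns)."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        # Columns 11 and 12 of the default layout are evalue and bitscore.
        yield BlastHit(cols[0], cols[1], float(cols[10]), float(cols[11]))


def best_hit_per_query(hits: Iterable[BlastHit]) -> Dict[str, BlastHit]:
    """Keep the hit with the highest bit score for each query sequence."""
    best: Dict[str, BlastHit] = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


# Hypothetical example rows; none of these values come from the dataset.
example_rows = [
    "PER_ref\tscaffold_1\t62.5\t400\t150\t3\t1\t400\t1000\t2200\t1e-120\t410.0",
    "PER_ref\tscaffold_7\t35.0\t120\t78\t2\t10\t130\t500\t860\t2e-10\t61.2",
    "TIM_ref\tscaffold_3\t58.1\t350\t146\t1\t5\t355\t90\t1140\t3e-98\t330.5",
]

best_hits = best_hit_per_query(parse_tabular(example_rows))
for query, hit in sorted(best_hits.items()):
    print(query, hit.subject, hit.evalue, hit.bitscore)
```

Selecting by bit score is only one reasonable convention. A manual annotation would normally inspect several candidate loci for each gene rather than rely on the top hit alone.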

Manual annotation of detoxification gene families

We manually annotated five detoxification gene families: cytochrome P450 monooxygenases (P450s), glutathione S-transferases (GSTs), carboxyl/cholinesterases (CCEs), UDP-glycosyltransferases (UGTs) and ATP-binding cassette (ABC) transporters. We used the bioinformatic pipeline BITACORA (Vizueta et al. 2019) in full mode to run HMMER v3.3 (Finn et al. 2011) and BLAST v2.2.31 (Altschul et al. 1990) searches. Hits were filtered with the default E-value cut-off of 10e-5. The P450 HMM was downloaded from Pfam v32.0 (El-Gebali et al. 2018), while the HMMs for the other detoxification gene families were built with HMMER v3.3 (Finn et al. 2011). Orthologs from Bombyx mori and Drosophila melanogaster were used as evidence. The annotated genes were then filtered manually based on gene length and the presence of conserved domains; genes shorter than 80 amino acids were removed. Orthologs were aligned with the G-INS-I algorithm implemented in MAFFT v7.450 (Katoh & Standley 2013). A neighbor-joining tree was constructed for each gene family using MEGA7 (Kumar et al. 2016) with 500 bootstrap replicates.
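The two numeric filters described above (E-value cut-off and minimum length of 80 amino acids) can be sketched as follows. This is a minimal illustration, not the BITACORA implementation. The candidate identifiers, the function name and the inclusive treatment of both thresholds are assumptions. The E-value constant is copied as written in the text, which Python evaluates as 1e-4; whether 1e-5 was meant is not resolved here. The conserved-domain check is not modelled.

```python
from typing import Dict, List

MIN_LENGTH_AA = 80      # from the text: genes shorter than 80 amino acids were removed
E_VALUE_CUTOFF = 10e-5  # copied as written in the text; Python reads this as 1e-4


def filter_candidates(
    proteins: Dict[str, str],
    evalues: Dict[str, float],
    min_length: int = MIN_LENGTH_AA,
    max_evalue: float = E_VALUE_CUTOFF,
) -> List[str]:
    """Return IDs of candidates that pass both the length and E-value filters."""
    kept = []
    for gene_id, sequence in proteins.items():
        # Ignore a trailing stop-codon symbol when measuring protein length.
        length = len(sequence.rstrip("*"))
        if length < min_length:
            continue
        # Candidates without a recorded E-value are treated as failing (assumption).
        if evalues.get(gene_id, float("inf")) > max_evalue:
            continue
        kept.append(gene_id)
    return kept


# Hypothetical candidates; sequences are placeholders, not real proteins.
proteins = {
    "cand_1": "M" * 120,
    "cand_2": "M" * 79,        # too short
    "cand_3": "M" * 80 + "*",  # exactly 80 residues after trimming '*'
    "cand_4": "M" * 200,       # long enough, but weak hit
}
evalues = {"cand_1": 1e-30, "cand_2": 1e-40, "cand_3": 1e-6, "cand_4": 0.5}

kept = filter_candidates(proteins, evalues)
print(kept)
```

Both thresholds are exposed as parameters so that a different reading of the cut-off can be tested without changing the function.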

Usage notes

Sequences are provided in FASTA format, organized by gene.
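For readers who want to load the sequences without extra dependencies, a minimal FASTA reader might look like the sketch below. The header names and the file name in the comment are hypothetical, since the naming scheme used in the deposited files is not described here.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA-formatted lines into a {sequence_id: sequence} dictionary.

    The sequence ID is the first whitespace-delimited token after '>'.
    """
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if name is not None:
        records[name] = "".join(chunks)
    return records


# Hypothetical FASTA content; not taken from the dataset.
example_text = """>gene_A some description
MKTLLVAG
HHQW
>gene_B
MSSPQ
"""

records = read_fasta(example_text.splitlines())
for seq_id, seq in records.items():
    print(seq_id, len(seq))

# With a file on disk (hypothetical file name):
# with open("some_gene.fasta") as handle:
#     records = read_fasta(handle)
```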

Funding