Subsociality and wood-eating (xylophagy), two features observed in the cockroach genus Cryptocercus, the sister group of all termites, are understood as key drivers in the evolution of eusociality in Blattodea (cockroaches and termites). We present and analyse two new high-quality genomes from this genus, C. punctulatus from North America and C. meridianus from Southeast Asia, to explore the evolutionary transitions to xylophagy and subsociality within Blattodea. Our analyses reveal evidence of relaxed selection in both Cryptocercus and termites, indicating that a reduction in effective population size may have occurred in their subsocial ancestors. These findings challenge the expected positive correlation between dN/dS ratios and social complexity, as Cryptocercus exhibits elevated dN/dS values that may exceed those of eusocial termites. Additionally, we identify positive selection on mitochondrial ribosomal proteins and components of the NADH dehydrogenase complex, suggesting significant evolutionary changes in energy production. Future studies incorporating additional genomic data from diverse Blattodea species are essential to elucidate the molecular mechanisms driving transitions to xylophagy and eusociality.

Description of the data and file structure

Genomic DNA was extracted from single snap-frozen legs of Cryptocercus punctulatus individuals collected in Virginia and reared in the laboratory. The tissue was homogenised on ice using a TissueRuptor and lysed with buffer CT and Proteinase K. RNA contamination was removed by RNase A treatment. High-molecular-weight DNA was captured using Circulomics Nanodisks and eluted in elution buffer. The extraction followed the Circulomics Insect Big DNA kit protocol (v0.20a). The resulting DNA was of high quality and high molecular weight, with fragment sizes of approximately 170 kb. Quality and size distribution were confirmed using an Agilent Femto Pulse system.

Genome Annotations:

cryptocercus_punctulatus.scaffolded.gff - Cryptocercus punctulatus annotations based on the genome that is stored on NCBI within the project PRJNA1188519

Annotations of IRs with BitaCora:

*_IRs.fa: protein sequences of Ionotropic Receptors

IR_db.*: sequence database used as input for BitaCora

Enriched GO terms, results of TopGo analyses:

*_BP_table.txt (from CAFE results)

Significant_GOterms_selection.xlsx (from selection analyses)

Files and variables

File: cryptocercus_punctulatus.scaffolded.gff

Description: genome annotations for C. punctulatus
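As a quick sanity check after download, the annotation file can be summarised by feature type. The sketch below assumes the file follows the standard nine-column, tab-separated GFF layout, with the feature type in column 3. The function name `count_feature_types` and the example records are invented for illustration and are not taken from the dataset.

```python
from collections import Counter


def count_feature_types(gff_lines):
    """Count GFF feature types (column 3), skipping comments and blank lines."""
    counts = Counter()
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # header/comment lines carry no feature
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a complete nine-column GFF record
        counts[fields[2]] += 1
    return counts


# Invented records for illustration only (not real annotations)
example = [
    "##gff-version 3",
    "scaffold_1\tannot\tgene\t100\t900\t.\t+\t.\tID=g1",
    "scaffold_1\tannot\tmRNA\t100\t900\t.\t+\t.\tID=g1.t1;Parent=g1",
    "scaffold_1\tannot\tCDS\t100\t400\t.\t+\t0\tParent=g1.t1",
    "scaffold_1\tannot\tCDS\t600\t900\t.\t+\t2\tParent=g1.t1",
]
example_counts = count_feature_types(example)
for feature_type, n in sorted(example_counts.items()):
    print(f"{feature_type}\t{n}")
```

To run it on the real file, pass an open file handle, for example `count_feature_types(open("cryptocercus_punctulatus.scaffolded.gff"))`.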

The following FASTA files, named xx_IRs.fa, contain protein sequences of annotated Ionotropic Receptors (IRs):

File: pame_IRs.fa

Description: IR sequences of P. americana

File: rspe_IRs.fa

Description: IR sequences of R. speratus

File: cmer_IRs.fa

Description: IR sequences of C. meridianus

File: znev_IRs.fa

Description: IR sequences of Z. nevadensis

File: csec_IRs.fa

Description: IR sequences of C. secundus

File: elan_IRs.fa

Description: IR sequences of E. langierum

File: dpun_IRs.fa

Description: IR sequences of D. punctata

File: cpun_IRs.fa

Description: IR sequences of C. punctulatus

File: focc_IRs.fa

Description: IR sequences of F. occidentalis

File: mnat_IRs.fa

Description: IR sequences of M. natalensis

File: cfor_IRs.fa

Description: IR sequences of C. formosanus

File: bger_IRs.fa

Description: IR sequences of B. germanica
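To compare IR repertoires across species, the number and mean length of sequences in each xx_IRs.fa file can be tallied. The following is a minimal sketch using only the Python standard library. It assumes ordinary FASTA formatting (a `>` header line followed by one or more sequence lines); the helper names and the two example records are hypothetical.

```python
def read_fasta(lines):
    """Return {sequence_id: sequence} from an iterable of FASTA lines."""
    chunks = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""  # ID = first word of the header
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(value) for key, value in chunks.items()}


def summarise(records):
    """Return (number of sequences, mean sequence length)."""
    lengths = [len(seq) for seq in records.values()]
    mean = sum(lengths) / len(lengths) if lengths else 0.0
    return len(lengths), mean


# Invented protein fragments for illustration only
example = [
    ">cpun_IR_example1 hypothetical record",
    "MKTLLV",
    "AAG",
    ">cpun_IR_example2",
    "MSTNPKPQ",
]
records = read_fasta(example)
n_seqs, mean_len = summarise(records)
print(f"sequences: {n_seqs}\tmean length: {mean_len:.1f}")
```

For the real data, the same functions can be applied to each file in turn, for example `summarise(read_fasta(open("cpun_IRs.fa")))`.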

File: CryptocercusRootContractions_BP_table.txt

Description: TopGo results for contractions at the Cryptocercus root

Variables

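GO.ID: GO term ID
Term: GO term description
Annotated: Number of genes annotated with the GO term
Significant: Number of genes of interest annotated with the GO term
Expected: Number of genes of interest expected by chance
pvalue: p-value of the enrichment test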

File: CryptocercusRootExpansions_BP_table.txt

Description: TopGo results for expansions at the Cryptocercus root

Variables

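GO.ID: GO term ID
Term: GO term description
Annotated: Number of genes annotated with the GO term
Significant: Number of genes of interest annotated with the GO term
Expected: Number of genes of interest expected by chance
pvalue: p-value of the enrichment test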

File: CryptocercusBranchesContractions_BP_table.txt

Description: TopGo results for contractions on all Cryptocercus branches

Variables

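GO.ID: GO term ID
Term: GO term description
Annotated: Number of genes annotated with the GO term
Significant: Number of genes of interest annotated with the GO term
Expected: Number of genes of interest expected by chance
pvalue: p-value of the enrichment test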

File: CryptocercusBranchesExpansions_BP_table.txt

Description: TopGo results for expansions on all Cryptocercus branches

Variables

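GO.ID: GO term ID
Term: GO term description
Annotated: Number of genes annotated with the GO term
Significant: Number of genes of interest annotated with the GO term
Expected: Number of genes of interest expected by chance
pvalue: p-value of the enrichment test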

File: IR_db.msa

Description: IR database alignment file

File: IR_db.fasta

Description: IR database fasta file

File: IR_db.hmm

Description: HMMER output, specifically a Hidden Markov Model (HMM) database related to ionotropic receptors (IR).

File: Significant_GOterms_selection.csv

Description: TopGo results for selection analyses

Variables

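GO.ID: GO term ID
Term: GO term description
Annotated: Number of genes annotated with the GO term
Significant: Number of genes of interest annotated with the GO term
Expected: Number of genes of interest expected by chance
pvalue: p-value of the enrichment test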
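The TopGo result tables all share the same six columns, so one small reader can filter any of them. The sketch below rests on assumptions that should be checked against the files: a header row with the column names listed above, tab-separated fields in the *_BP_table.txt files (and commas in the .csv file), and p-values that may be written as a bound such as "< 1e-30". The function names and the example rows are invented for illustration.

```python
import csv
import io


def parse_pvalue(text):
    """Convert a p-value string to float, tolerating a leading '<'."""
    return float(text.replace("<", "").strip())


def significant_terms(handle, alpha=0.05, delimiter="\t"):
    """Return (GO.ID, Term, p-value) tuples with p < alpha, smallest first."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    hits = []
    for row in reader:
        p = parse_pvalue(row["pvalue"])
        if p < alpha:
            hits.append((row["GO.ID"], row["Term"], p))
    return sorted(hits, key=lambda hit: hit[2])


# Invented counts and p-values for illustration only
example = io.StringIO(
    "GO.ID\tTerm\tAnnotated\tSignificant\tExpected\tpvalue\n"
    "GO:0006412\ttranslation\t120\t15\t4.2\t0.0004\n"
    "GO:0007165\tsignal transduction\t300\t9\t10.5\t0.71\n"
    "GO:0008152\tmetabolic process\t900\t60\t31.5\t< 1e-30\n"
)
hits = significant_terms(example)
for go_id, term, p in hits:
    print(f"{go_id}\t{term}\t{p:g}")
```

For the comma-separated file, the same function would be called with `delimiter=","`.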

Code/software

Python script for calculating CpG O/E

Access information

Other publicly accessible locations of the data:

NCBI - PRJNA1188519

Data was derived from the following sources:

C. punctulatus genome is novel
C. meridianus genome here: https://doi.org/10.1101/2025.01.20.633303

Code/Software

Cpg.py: This Python script calculates the CpG observed/expected (CpG O/E) ratio for each gene sequence in a FASTA file. It reads the input file from the command line and stores each gene’s DNA sequence in a dictionary. For every sequence, it counts the number of cytosines (C), guanines (G), and CpG dinucleotides (CG). The CpG O/E value is computed as the observed number of CGs divided by the expected number based on C and G frequencies in the sequence. If no CpG sites are present, the value is set to zero. Finally, the script outputs each gene name and its CpG O/E ratio in a tab-separated format.
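The calculation described above can be sketched as follows. This is an illustrative reimplementation, not the deposited Cpg.py. It assumes the commonly used normalisation CpG O/E = (number of CG × sequence length) / (number of C × number of G), which is one way to derive the expected count from C and G frequencies; the deposited script may differ in detail. The function names are hypothetical.

```python
import sys


def read_fasta(lines):
    """Store each gene's DNA sequence in a dictionary keyed by its header ID."""
    chunks = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(value) for key, value in chunks.items()}


def cpg_oe(seq):
    """Return CpG observed/expected; 0.0 when the sequence has no CpG site."""
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    n_cg = seq.count("CG")
    if n_cg == 0:
        return 0.0
    # Assumed formula: expected CG count = (C count * G count) / length
    return (n_cg * len(seq)) / (n_c * n_g)


def main(path):
    """Print gene name and CpG O/E, tab-separated, for each FASTA record."""
    with open(path) as handle:
        genes = read_fasta(handle)
    for name, seq in genes.items():
        print(f"{name}\t{cpg_oe(seq)}")


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Saved under a file name of your choice, the sketch takes the FASTA path as its only command-line argument and writes the table to standard output.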

Data from: Cryptocercus genomes expand knowledge of adaptations to xylophagy and termite sociality
