Data from: Evolution of five environmentally responsive gene families in a pine-feeding sawfly, Neodiprion lecontei (Hymenoptera: Diprionidae)
Data files
Aug 25, 2023 version files 2.74 MB
- AppendixB_tables.xlsx (57.53 KB)
- CYP450_intact_350AAmin_AmDmPbNlNv_mafftGinsi.fasta (652.54 KB)
- GR_intact_350AAmin_AmArCcDmHsNlNv_mafftLinsi.fasta (306.54 KB)
- HisnavicinFullAlignment_AmDmHsNlNv.fasta (113.79 KB)
- Nlec_manual_gene_annotation_sequences.xlsx (178.37 KB)
- OBP_intact_100AAmin_AmCcDmHsNlNv_mafftLinsi.fasta (182.91 KB)
- OR_intact_350AAmin_AmArCcDmHsNlNv_mafftLinsi.fasta (1.25 MB)
- README.md (1.27 KB)
Abstract
A central goal in evolutionary biology is to determine the predictability of adaptive genetic changes. Despite many documented cases of convergent evolution at individual loci, little is known about the repeatability of gene family expansions and contractions. To address this void, we examined gene family evolution in the redheaded pine sawfly Neodiprion lecontei, a non-eusocial hymenopteran and exemplar of a pine-specialized lineage evolved from angiosperm-feeding ancestors. After assembling and annotating a draft genome, we manually annotated multiple gene families with chemosensory, detoxification, or immunity functions before characterizing their genomic distributions and molecular evolution. We find evidence of recent expansions of bitter gustatory receptor (GR), clan 3 cytochrome P450 (CYP3), olfactory receptor (OR), and antimicrobial peptide (AMP) subfamilies, with strong evidence of positive selection among paralogs in a clade of gustatory receptors possibly involved in the detection of bitter compounds. In contrast, these gene families had little evidence of recent contraction via pseudogenization. Overall, our results are consistent with the hypothesis that in response to novel selection pressures, gene families that mediate ecological interactions may expand and contract predictably. Testing this hypothesis will require the comparative analysis of high-quality annotation data from phylogenetically and ecologically diverse insect species and functionally diverse gene families. To this end, increasing sampling in under-sampled hymenopteran lineages and environmentally responsive gene families and standardizing manual annotation methods should be prioritized.
Raw alignment files for select Hymenoptera gene family phylogenies
For each gene family, a multi-species amino acid phylogeny was constructed with manually curated annotations from N. lecontei, select Hymenoptera, and D. melanogaster. Intact sequences were size filtered (≥350 amino acids for GR, OR, and CYP; ≥100 amino acids for Hisnavicin and OBP); pseudogenes and partial annotations were excluded. The selected sequences were aligned with MAFFT (v7.305b) (Katoh et al. 2002) (parameters: --maxiterate 1000 --localpair).
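The size filter and alignment step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `parse_fasta`, `filter_by_length`, and `mafft_command` are hypothetical, and the sketch assumes a plain FASTA input in which `-` is the only gap character. The MAFFT command is only assembled as an argument list, not executed.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dict."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


def filter_by_length(records, min_len):
    """Keep sequences with at least min_len residues (gap characters ignored)."""
    return {
        h: s for h, s in records.items()
        if len(s.replace("-", "")) >= min_len
    }


def mafft_command(in_path):
    """Build a MAFFT argument list using the parameters quoted in the text."""
    return ["mafft", "--maxiterate", "1000", "--localpair", in_path]
```

Under these assumptions, `filter_by_length(records, 350)` would correspond to the threshold used for the GR, OR, and CYP families, and `filter_by_length(records, 100)` to the threshold used for Hisnavicin and OBP.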
Manual gene annotation sequences
The olfactory and gustatory receptor genes were annotated following Robertson et al. (2003, 2006). Briefly, manually curated chemoreceptor genes from select Hymenoptera were used as TBLASTN (v2.2.19) (Altschul et al. 1990) queries against the N. lecontei draft genome (parameters: -e 100000 -F F). Gene models were manually built in TextWrangler (v5.5) (Bare Bones Software) and new gene models were iteratively added to TBLASTN searches until new chemoreceptors were no longer found.
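The iterative search described above is, in outline, a fixed-point loop: search with the current queries, add any new gene models, and repeat until a round finds nothing new. The sketch below shows only that control flow. The `search` callable is a hypothetical stand-in for a TBLASTN run followed by manual gene-model building, which cannot be automated in a few lines, and the sketch assumes `search` is deterministic, so that re-querying only the newly added models gives the same result as re-running every query.

```python
def iterative_search(seed_queries, search):
    """Expand a query set until a search round yields no new items.

    seed_queries: iterable of initial query identifiers.
    search: hypothetical callable mapping one query to an iterable of hits.
    Returns the full set of known items and the number of rounds run.
    """
    known = set(seed_queries)
    frontier = set(seed_queries)  # queries not yet searched with
    rounds = 0
    while frontier:
        hits = set()
        for query in frontier:
            hits.update(search(query))
        # Only hits not seen before become queries for the next round.
        frontier = hits - known
        known |= frontier
        rounds += 1
    return known, rounds
```

With a toy `search` in which query `A` finds `B`, `B` finds `C`, and `C` finds nothing, the loop stops after the third round, once searching with `C` produces no new items.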
Odorant binding protein genes were annotated with custom scripts that identified automated Maker gene annotations with the classic/6C, Plus-C, Minus-C, or atypical odorant binding protein motif (Xu et al. 2009).
Cytochrome P450 genes were annotated with a search of 52 insect CYP genes against the N. lecontei genome assembly (E-value cutoff 1e3). Scaffolds with hits were then searched against 8782 known insect CYPs. Candidate N. lecontei CYP sequences identified in these searches were manually curated by comparison to their best BLAST hits.
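The step of collecting scaffolds with hits can be sketched as below. The source does not state the BLAST output format that was used, so this example assumes standard 12-column tab-separated output (subject ID in the second column, E-value in the eleventh). The function name `scaffolds_with_hits` is hypothetical, and the cutoff is passed in by the caller rather than fixed in the code.

```python
def scaffolds_with_hits(tabular_text, evalue_cutoff):
    """Return the set of subject (scaffold) IDs with a hit at or below the cutoff.

    Assumes 12-column tab-separated BLAST output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    scaffolds = set()
    for line in tabular_text.splitlines():
        # Skip blank lines and comment lines written by some output modes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        sseqid = fields[1]
        evalue = float(fields[10])
        if evalue <= evalue_cutoff:
            scaffolds.add(sseqid)
    return scaffolds
```

Calling `scaffolds_with_hits(text, 1e3)` would apply the cutoff as it is written in the description above; the resulting scaffold set would then be the input to the second search against known insect CYPs.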
Antimicrobial protein genes (including Hisnavicin) were annotated with BLAST queries from three representative hymenopterans. BLASTP searches were performed against the annotated proteins and TBLASTN searches were performed against the assembled genome.