Data from: Complex evolution of insect insulin receptors and homologous decoy receptors, and functional significance of their multiplicity
Smýkal, Vlastimil et al. (2020), Data from: Complex evolution of insect insulin receptors and homologous decoy receptors, and functional significance of their multiplicity, Dryad, Dataset, https://doi.org/10.5061/dryad.4xgxd255n
Evidence is accumulating that the functional plasticity of insulin and insulin-like growth factor signaling in insects could stem, among other sources, from the multiplicity of insulin receptors (InRs). Their multiple variants may be implicated in the control of insect polyphenism, such as wing or caste polyphenism. Here, we present a comprehensive phylogenetic analysis of insect InR sequences in 118 species from 23 orders and investigate the role of three InRs identified in the linden bug, Pyrrhocoris apterus, in wing polymorphism control. We identified two gene clusters (Clusters I and II) resulting from an ancestral duplication in a late ancestor of winged insects. The two clusters remained conserved in most lineages, with further duplications or losses in only some of them. One remarkable yet neglected feature of InR evolution is the loss of the tyrosine kinase catalytic domain, giving rise to decoys of InR in both clusters. Within Cluster I, we confirmed the presence of the secreted decoy of insulin receptor in all studied Muscomorpha. More importantly, we described a new tyrosine kinase-less gene (DR2) in Cluster II, conserved in apical Holometabola for ∼300 My. We differentially silenced the three P. apterus InRs and confirmed their participation in wing polymorphism control. The pattern of Cluster I and Cluster II InR impact on wing development that we observed differed from that postulated in planthoppers. This suggests an independent establishment of insulin/insulin-like growth factor signaling control over wing development, leading to idiosyncrasies in the co-option of multiple InRs in polyphenism control in different taxa.
The dataset consists of full sequences and sequence alignments of insulin receptor (InR) proteins of insects and other animals, mined from public databases and supplemented with newly assembled genomic and transcriptomic data.
GenBank was searched with BLAST for InRs and InR homologs in taxon-specific searches through the NCBI web interface. In addition, draft genomes of Blattella germanica, Ephemera danica and Homalodisca vitripennis were downloaded from the database of the i5k genome sequencing initiative (http://i5k.github.io/arthropod_genomes_at_ncbi) and searched using the BLAST algorithm in the Geneious program (Biomatters, Ltd.). For Lepisma saccharina, transcriptomes were assembled de novo with Trinity from reads in the Sequence Read Archive (SRA) at NCBI (https://www.ncbi.nlm.nih.gov/sra), and the resulting database was subsequently searched with BLAST in Geneious. Embiratermes neotenicus, Prorhinotermes simplex and Pyrrhocoris apterus InRs were retrieved from in-house transcriptomic databases assembled from Illumina reads with Trinity. These transcriptomes will be published elsewhere. All retrieved protein sequences were screened for protein domains using InterProScan (http://www.ebi.ac.uk/interpro/search/sequence-search); transmembrane domains were predicted with TMHMM Server v. 2.0 (http://www.cbs.dtu.dk/services/TMHMM/). FastTree and RAxML phylogenetic analyses of ClustalW alignments (both in Geneious), together with manual scoring for the presence of characteristic domain modules, were used to classify a protein as a candidate InR or decoy of insulin receptor; the decoys differ from InRs only by the absence of the tyrosine kinase domain. Protein sequences were aligned using MAFFT v7.308 with the E-INS-i multiple alignment method and the BLOSUM80 scoring matrix. Poorly aligned regions were identified visually and removed, or adjusted using trimAl (http://trimal.cgenomics.org/) with the 'gappyout' parameter.
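The final alignment and trimming steps described above could be approximated on the command line roughly as follows. This is a minimal sketch, not the authors' actual procedure: it assumes standalone `mafft` and `trimal` executables on the PATH, the file names are hypothetical, and the visual removal of poorly aligned regions is not reproduced. The MAFFT options shown are the documented command-line equivalent of E-INS-i (`--genafpair --maxiterate 1000 --ep 0`) with `--bl 80` selecting BLOSUM80.

```python
import shutil
import subprocess


def build_alignment_commands(raw_fasta, aligned_fasta, trimmed_fasta):
    """Return (mafft_cmd, trimal_cmd) as argument lists; nothing is executed."""
    # E-INS-i = generalized affine gap pairwise alignment plus iterative refinement.
    # --bl 80 selects the BLOSUM80 scoring matrix.
    mafft_cmd = [
        "mafft", "--genafpair", "--maxiterate", "1000", "--ep", "0",
        "--bl", "80", raw_fasta,
    ]
    # trimAl's automated 'gappyout' mode trims columns based on gap distribution.
    trimal_cmd = [
        "trimal", "-in", aligned_fasta, "-out", trimmed_fasta, "-gappyout",
    ]
    return mafft_cmd, trimal_cmd


def run_pipeline(raw_fasta, aligned_fasta, trimmed_fasta):
    """Run MAFFT then trimAl; raises if either tool is missing or fails."""
    mafft_cmd, trimal_cmd = build_alignment_commands(
        raw_fasta, aligned_fasta, trimmed_fasta
    )
    for tool in ("mafft", "trimal"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    # MAFFT writes the alignment to standard output, so redirect it to a file.
    with open(aligned_fasta, "w") as out:
        subprocess.run(mafft_cmd, stdout=out, check=True)
    subprocess.run(trimal_cmd, check=True)


# Hypothetical file names, shown only to illustrate the call:
# run_pipeline("inr_proteins.fasta", "inr_einsi.fasta", "inr_einsi.gappy.fasta")
```

Building the commands separately from running them makes it possible to inspect or log the exact invocation before any external tool is called.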
The dataset contains the following files:
Fasta full seq.docx = full protein sequences of InRs of insects
InR_AllPhyla-SI-alignment.fasta = alignment of 67 animal InR sequences used for a phylogenetic comparison across Animalia (fig. S1)
205einsi.gappy--length1301.fasta = alignment of 205 insect InR sequences used for a phylogenetic comparison across Insecta (fig. 1, S2-S4)
Blattodea-SI-alignment.fasta = alignment of InR sequences used for a detailed phylogenetic comparison in Blattodea (fig. S5)
Orthoptera-SI-alignment = alignment of InR sequences used for a detailed phylogenetic comparison in Orthoptera (fig. S6, S7)
Hymenoptera-SI-alignment = alignment of InR sequences used for a detailed phylogenetic comparison in Hymenoptera (fig. S11)
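A quick way to sanity-check any of the alignment files listed above is to count the sequences and confirm that all of them have the same aligned length. The sketch below is a dependency-free illustration on a toy alignment; the helper names and the toy sequences are hypothetical, and it assumes the files are plain FASTA text.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def summarize_alignment(records):
    """Return (number of sequences, set of distinct sequence lengths)."""
    return len(records), {len(seq) for _, seq in records}


# Toy alignment with made-up sequences, used only to demonstrate the helpers.
toy = """>seqA
MK-L
>seqB
MKAL
""".splitlines()

n_sequences, lengths = summarize_alignment(read_fasta(toy))
# A valid alignment has exactly one distinct length across all sequences.
is_aligned = len(lengths) == 1
```

To check a real file, the same helpers can be applied to `open(path)` instead of the toy lines. Going by its name and description, 205einsi.gappy--length1301.fasta would presumably yield 205 sequences of length 1301, but that expectation is inferred from the file name and has not been verified here.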
European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme, Award: 726049
Grantová Agentura České Republiky, Award: 18-21200S
Grantová Agentura České Republiky, Award: 17-01003S
Ústav Organické Chemie a Biochemie, Akademie Věd České Republiky, Award: RVO 61388963
Grantová Agentura České Republiky, Award: 15-23681S